Articles About Natural Language Processing

Issue #132 – Tokenization strategies for Korean MT tasks

27 May 2021, in Model Improvement, The Neural MT Weekly. Author: Dr. Jingyi Han, Machine Translation Scientist @ Iconic. Introduction: Asian languages have always been challenging for machine translation (MT) due to their markedly different grammar and writing systems. As is well known, Chinese and Japanese require dedicated segmenters because these languages place no spaces between words. As for Korean, even though the words are separated by […]
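The excerpt cuts off, but as a rough illustration of one common strategy for segmenting Korean, here is a minimal sketch using SentencePiece subword (BPE) tokenization; the corpus path, vocabulary size, and settings are illustrative assumptions, not the article's actual recipe.

```python
# Minimal sketch: training a subword (BPE) tokenizer for Korean text
# with SentencePiece. Corpus path and vocab size are illustrative only.
import sentencepiece as spm

# Train a BPE model on a plain-text Korean corpus (one sentence per line).
spm.SentencePieceTrainer.train(
    input="korean_corpus.txt",   # hypothetical corpus file
    model_prefix="ko_bpe",
    vocab_size=16000,
    model_type="bpe",
    character_coverage=0.9995,   # high coverage for Hangul syllable blocks
)

sp = spm.SentencePieceProcessor(model_file="ko_bpe.model")
print(sp.encode("나는 학교에 간다", out_type=str))
# e.g. ['▁나는', '▁학교', '에', '▁간다'] (actual pieces depend on the corpus)
```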

Read more

Natural Language Processing Step by Step Guide

This article was published as part of the Data Science Blogathon. Overview: gain a basic understanding of Natural Language Processing, learn the main techniques used to implement NLP, and understand how to use NLP for text mining. Prerequisite: you must have a basic knowledge of Python. Every piece of data carries meaning in its context, and text data in particular is being generated constantly in many formats, such as reviews, SMS messages, and emails. The […]
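As a taste of the techniques such a guide typically walks through, here is a minimal, hedged preprocessing sketch with NLTK (tokenization, lowercasing, stop-word removal); the guide's own steps and tooling may differ.

```python
# Minimal sketch of a typical NLP preprocessing pipeline with NLTK;
# the guide's own steps may differ.
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)

text = "The product arrived late, but the quality was great!"
tokens = word_tokenize(text.lower())          # tokenize and lowercase
stop_words = set(stopwords.words("english"))
cleaned = [t for t in tokens if t.isalpha() and t not in stop_words]
print(cleaned)  # ['product', 'arrived', 'late', 'quality', 'great']
```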

Read more

Top 8 Python Libraries For Natural Language Processing (NLP) in 2021

This article was published as a part of the Data Science Blogathon. Introduction: Natural language processing (NLP) is a field at the intersection of data science and Artificial Intelligence (AI) that, at its core, is about teaching machines to understand human language and extract meaning from text. This is also why AI is often an essential part of NLP projects. So why do so many companies care about NLP? Basically, in light […]
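spaCy is one library that commonly appears in such roundups; as a hedged sketch of "extracting meaning from text", here is a minimal named-entity extraction example (the article's own list and code may differ).

```python
# Minimal sketch using spaCy, one of the libraries commonly covered in
# such roundups; the article's own examples may differ.
import spacy

# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is hiring NLP engineers in Seoul for $120,000 a year.")

for ent in doc.ents:
    print(ent.text, ent.label_)
# e.g. Apple ORG / Seoul GPE / $120,000 MONEY (exact output depends on the model)
```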

Read more

BERT for Natural Language Inference simplified in Pytorch!

This article was published as a part of the Data Science Blogathon. Introduction to BERT: BERT stands for Bidirectional Encoder Representations from Transformers. It was introduced in 2018 by Google researchers. BERT achieved state-of-the-art performance on most NLP tasks at the time and drew the attention of the data science community worldwide. It is extensively used today by data science practitioners for various NLP tasks. Details about the inner workings of the BERT model can be found here. Introduction to […]
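For readers who want to try natural language inference immediately, here is a minimal sketch using the Hugging Face transformers library with an off-the-shelf MNLI checkpoint; the article itself likely builds and fine-tunes the PyTorch model more explicitly.

```python
# Minimal sketch of BERT-style natural language inference with the
# Hugging Face transformers library, using a pre-trained MNLI checkpoint.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-large-mnli"  # an off-the-shelf NLI checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "A man is playing a guitar on stage."
hypothesis = "A person is performing music."
inputs = tokenizer(premise, hypothesis, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
probs = logits.softmax(dim=-1).squeeze()
for label, p in zip(["contradiction", "neutral", "entailment"], probs):
    print(f"{label}: {p:.3f}")
```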

Read more

A Time-Domain Convolutional Recurrent Network for Packet Loss Concealment

Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong […]

Read more

Language Modelling as a Multi-Task Problem

April 18, 2021 By: Lucas Weber, Jaap Jumelet, Elia Bruni, Dieuwke Hupkes Abstract In this paper, we propose to study language modeling as a multi-task problem, bringing together three strands of research: multi-task learning, linguistics, and interpretability. Based on hypotheses derived from linguistic theory, we investigate whether language models adhere to learning principles of multi-task learning during training. We showcase the idea by analysing the generalization behavior of language models during learning of the linguistic concept of Negative Polarity Items […]
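As a rough illustration of the kind of probe involved, the sketch below compares a pretrained language model's loss on a minimal pair with a licensed versus an unlicensed Negative Polarity Item; the paper's actual analysis tracks such behaviour over the course of training and is considerably more involved.

```python
# Minimal sketch of one way to probe whether a language model has
# learned Negative Polarity Item (NPI) licensing: compare average loss
# on a grammatical vs. ungrammatical minimal pair.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_loss(sentence: str) -> float:
    """Average per-token negative log-likelihood under the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return out.loss.item()

good = "Nobody has ever seen such a thing."   # licensed NPI ("ever")
bad = "Somebody has ever seen such a thing."  # unlicensed NPI
print(sentence_loss(good), sentence_loss(bad))
# A model that has learned NPI licensing should assign lower loss
# to the licensed sentence.
```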

Read more

Co-evolution of language and agents in referential games

Abstract Referential games offer a grounded learning environment for neural agents which accounts for the fact that language is functionally used to communicate. However, they do not take into account a second constraint considered to be fundamental for the shape of human language: that it must be learnable by new language learners. Cogswell et al. (2019) introduced cultural transmission within referential games through a changing population of agents to constrain the emerging language to be learnable. However, the resulting languages […]
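For intuition, here is a heavily simplified referential-game skeleton in PyTorch: a sender emits a one-symbol message about a target object, and a receiver must identify that target among distractors. The paper's agents, message space, and population dynamics are far richer than this sketch.

```python
# Toy referential game: sender sees the target object and emits one
# discrete symbol; receiver picks the target out of a candidate set.
# Purely illustrative; all sizes and settings are assumptions.
import torch
import torch.nn as nn

N_FEATURES, VOCAB, N_CANDIDATES = 8, 10, 4

sender = nn.Linear(N_FEATURES, VOCAB)    # object -> message logits
receiver = nn.Linear(VOCAB, N_FEATURES)  # message -> object query
opt = torch.optim.Adam([*sender.parameters(), *receiver.parameters()], lr=1e-3)

for step in range(1000):
    candidates = torch.randn(N_CANDIDATES, N_FEATURES)
    target = torch.randint(N_CANDIDATES, (1,))
    # Sender samples a discrete symbol; Gumbel-softmax keeps it differentiable.
    msg = nn.functional.gumbel_softmax(sender(candidates[target]), hard=True)
    # Receiver scores each candidate against the decoded message.
    scores = candidates @ receiver(msg).squeeze(0)
    loss = nn.functional.cross_entropy(scores.unsqueeze(0), target)
    opt.zero_grad(); loss.backward(); opt.step()
```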

Read more

Quality Estimation without Human-labeled Data

April 21, 2021 By: Yi-Lin Tuan, Ahmed El-Kishky, Adithya Renduchintala, Vishrav Chaudhary, Francisco Guzmán, Lucia Specia Abstract Quality estimation aims to measure the quality of translated content without access to a reference translation. This is crucial for machine translation systems in real-world scenarios where high-quality translation is needed. While many approaches exist for quality estimation, they are based on supervised machine learning requiring costly human-labeled data. As an alternative, we propose a technique that does not rely on examples […]
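The authors' technique is not reproduced here; as a hedged illustration of one generic label-free quality signal, the sketch below scores a translation by the average log-probability an off-the-shelf MT model assigns to it under forced decoding.

```python
# A label-free quality proxy (NOT the paper's method): the average
# log-probability an MT model assigns to a candidate translation of the
# source, obtained via forced decoding.
import torch
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-de-en"
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name).eval()

def qe_score(source: str, translation: str) -> float:
    """Higher (less negative) = translation more probable given source."""
    enc = tokenizer(source, return_tensors="pt")
    labels = tokenizer(text_target=translation, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(**enc, labels=labels).loss  # mean NLL per target token
    return -loss.item()

print(qe_score("Der Hund schläft.", "The dog is sleeping."))
print(qe_score("Der Hund schläft.", "The cat is dancing."))  # should score lower
```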

Read more

MLQA: Evaluating Cross-lingual Extractive Question Answering

Abstract Question answering (QA) models have shown rapid progress enabled by the availability of large, high-quality benchmark datasets. Such annotated datasets are difficult and costly to collect, and rarely exist in languages other than English, making building QA systems that work well in other languages challenging. In order to develop such systems, it is crucial to invest in high quality multilingual evaluation benchmarks to measure progress. We present MLQA, a multi-way aligned extractive QA evaluation benchmark intended to spur research […]
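For context, extractive QA benchmarks in the SQuAD family are typically scored with token-overlap F1; the minimal sketch below shows the core computation (MLQA's official evaluation adds language-specific answer normalization not reproduced here).

```python
# Token-overlap F1 as used in SQuAD-style extractive QA evaluation.
from collections import Counter

def f1_score(prediction: str, reference: str) -> float:
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(f1_score("the eiffel tower", "eiffel tower"))  # 0.8
```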

Read more

Machine Learning Automation using EvalML Library

This article was published as a part of the Data Science Blogathon. Introduction: Machine Learning is one of the fastest-growing technologies of the modern era, with new innovations in ML and AI appearing every day. Newcomers to the field used to find it difficult to build accurate machine learning models, but AutoML libraries now help beginners create an accurate model with […]
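A minimal sketch of the EvalML AutoML workflow such an article typically demonstrates; the demo dataset and settings here are illustrative.

```python
# Minimal EvalML AutoML workflow on a built-in demo dataset.
import evalml
from evalml.automl import AutoMLSearch

# EvalML ships demo datasets; breast cancer is a binary classification task.
X, y = evalml.demos.load_breast_cancer()
X_train, X_test, y_train, y_test = evalml.preprocessing.split_data(
    X, y, problem_type="binary"
)

automl = AutoMLSearch(X_train=X_train, y_train=y_train, problem_type="binary")
automl.search()                # tries and ranks candidate pipelines

best = automl.best_pipeline
print(automl.rankings.head())  # leaderboard of evaluated pipelines
print(best.score(X_test, y_test, objectives=["log loss binary"]))
```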

Read more