Articles About Natural Language Processing

TextCaps: a Dataset for Image Captioning with Reading Comprehension

Abstract Image descriptions can help visually impaired people quickly understand image content. While significant progress has been made in automatic image description and optical character recognition, current approaches are unable to include written text in their descriptions, although text is omnipresent in human environments and frequently critical to understanding our surroundings. To study how to comprehend text in the context of an image, we collect a novel dataset, TextCaps, with 145k captions for 28k images. Our dataset challenges a […]

Spatially Aware Multimodal Transformers for TextVQA

August 23, 2020 By: Yash Kant, Dhruv Batra, Peter Anderson, Alexander Schwing, Devi Parikh, Jiasen Lu, Harsh Agrawal Abstract Textual cues are essential for everyday tasks like buying groceries and using public transport. To develop this assistive technology, we study the TextVQA task, i.e., reasoning about text in images to answer a question. Existing approaches are limited in their use of spatial relations and rely on fully-connected transformer-based architectures to implicitly learn the spatial structure of a scene. In contrast, […]

Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline

Abstract Prior work in visual dialog has focused on training deep neural models on VisDial in isolation. Instead, we present an approach to leverage pretraining on related vision-language datasets before transferring to visual dialog. We adapt the recently proposed ViLBERT model for multi-turn visually-grounded conversations. Our model is pretrained on the Conceptual Captions and Visual Question Answering datasets, and finetuned on VisDial. Our best single model outperforms prior published work by > 1% absolute on NDCG and MRR. Next, we […]

Information Extraction of Clinical Trial Eligibility Criteria

Abstract Clinical trials predicate subject eligibility on a diversity of criteria ranging from patient demographics to food allergies. Trials post their requirements as semantically complex, unstructured free-text. Formalizing trial criteria to a computer-interpretable syntax would facilitate eligibility determination. In this paper, we investigate an information extraction (IE) approach for grounding criteria from trials in ClinicalTrials.gov to a shared knowledge base. We frame the problem as a novel knowledge base population task, and implement a solution combining machine learning and context […]

Unsupervised Quality Estimation for Neural Machine Translation

August 31, 2020 By: Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco (Paco) Guzman, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia Abstract Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user of the quality of the MT output at test time. Existing approaches require large amounts of expert-annotated data, computation, and time for training. As an alternative, we devise an unsupervised approach […]
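One widely used unsupervised QE signal is the MT system's own token log-probabilities, aggregated into a sentence-level confidence score. The sketch below is illustrative only; the function name and length normalization are assumptions, and the paper's actual estimators may differ.

```python
def sentence_confidence(token_logprobs):
    """Hypothetical unsupervised QE score: the length-normalized
    sum of the MT model's own token log-probabilities for one
    hypothesis. Higher (closer to 0) suggests higher confidence.
    Illustrative sketch only, not the paper's exact estimator."""
    if not token_logprobs:
        raise ValueError("empty hypothesis")
    return sum(token_logprobs) / len(token_logprobs)
```

Length normalization matters here: without it, longer outputs would always look less confident simply because they accumulate more negative log-probabilities.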

Lemotif: An Affective Visual Journal Using Deep Neural Networks

Abstract We present Lemotif, an integrated natural language processing and image generation system that uses machine learning to (1) parse a text-based input journal entry describing the user’s day for salient themes and emotions and (2) visualize the detected themes and emotions in creative and appealing image motifs. Synthesizing approaches from artificial intelligence and psychology, Lemotif acts as an affective visual journal, encouraging users to regularly write and reflect on their daily experiences through visual reinforcement. By making patterns in […]

Weak-Attention Suppression For Transformer Based Speech Recognition

Abstract Transformers, originally proposed for natural language processing (NLP) tasks, have recently achieved great success in automatic speech recognition (ASR). However, unlike text units, adjacent acoustic units (i.e., frames) are highly correlated, and long-distance dependencies between them are weak. This suggests that ASR will likely benefit from sparse and localized attention. In this paper, we propose Weak-Attention Suppression (WAS), a method that dynamically induces sparsity in attention probabilities. We demonstrate that WAS leads to consistent Word Error Rate (WER) improvement […]
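The core idea of dynamically induced sparsity can be sketched as follows: for each query, attention probabilities that fall below a data-dependent threshold are zeroed out and the remaining mass is renormalized. The threshold here (per-query mean minus a scaled standard deviation) and the `gamma` default are assumptions for illustration, not necessarily the paper's exact formulation.

```python
import numpy as np

def weak_attention_suppression(attn, gamma=0.5):
    """Sketch of weak-attention suppression.

    attn:  (queries, keys) attention probabilities; each row sums to 1.
    gamma: scale on the per-query std used to form the threshold
           (hypothetical default, chosen for illustration).

    Probabilities below mean - gamma * std for their query row are
    set to zero, and the surviving entries are renormalized.
    """
    attn = np.asarray(attn, dtype=float)
    mean = attn.mean(axis=-1, keepdims=True)
    std = attn.std(axis=-1, keepdims=True)
    threshold = mean - gamma * std
    suppressed = np.where(attn < threshold, 0.0, attn)
    return suppressed / suppressed.sum(axis=-1, keepdims=True)
```

Because the threshold depends on each query's own attention statistics, rows with a few dominant keys get pruned aggressively while flatter rows are left mostly intact.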

Issue #103 – LEGAL-BERT: The Muppets straight out of Law School

16 Oct 2020 Author: Akshai Ramesh, Machine Translation Scientist @ Iconic Introduction BERT (Bidirectional Encoder Representations from Transformers) is a large-scale pre-trained autoencoding language model that has made a substantial contribution to natural language processing (NLP) and has been studied as a potentially promising way to further improve neural machine translation (NMT). “Given that BERT is based on a similar approach to neural MT in Transformers, there’s considerable interest and […]

Quick Guide: Steps To Perform Text Data Cleaning in Python

Introduction Twitter has become an indispensable channel for brand management. It has compelled brands to become more responsive to their customers; on the other hand, the damage a single misstep causes can't be undone. The 140-character tweet has become a powerful tool for customers/users to convey messages directly to brands. For companies, these tweets carry a wealth of information: sentiment, engagement, reviews, and feedback on product features. However, mining these tweets isn't easy. Why? Because before you mine this data, you need […]
