Knowledge Graph – A Powerful Data Science Technique to Mine Information from Text (with Python code)

Overview Knowledge graphs are one of the most fascinating concepts in data science. Learn how to build a knowledge graph to mine information from Wikipedia pages. You will be working hands-on in Python to build a knowledge graph using the popular spaCy library. Introduction Lionel Messi needs no introduction. Even folks who don’t follow football have heard about the brilliance of one of the greatest players to have graced the sport. Here’s his Wikipedia page: Quite a lot of […]


Quick Introduction to Bag-of-Words (BoW) and TF-IDF for Creating Features from Text

The Challenge of Making Machines Understand Text “Language is a wonderful medium of communication” You and I would have understood that sentence in a fraction of a second. But machines simply cannot process text data in raw form. They need us to break down the text into a numerical format that’s easily readable by the machine (the idea behind Natural Language Processing!). This is where the concepts of Bag-of-Words (BoW) and TF-IDF come into play. Both BoW and TF-IDF are […]
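To make the idea of turning text into numbers concrete, here is a minimal sketch of Bag-of-Words counts and TF-IDF weights in plain Python. It is an illustration under stated assumptions, not the article's own code: the helper names (`tokenize`, `bag_of_words`, `tf_idf`), the toy documents, and the unsmoothed `log(N / df)` form of IDF are all choices made for this example. Libraries such as scikit-learn use smoothed variants by default, so their numbers will differ.

```python
import math
import string
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip ASCII punctuation (hypothetical helper)."""
    words = (w.strip(string.punctuation).lower() for w in text.split())
    return [w for w in words if w]


def bag_of_words(docs):
    """Return a sorted vocabulary and one raw count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight each count by term frequency times log(N / document frequency)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each vocabulary word.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weights = []
    for row in counts:
        total = sum(row)
        weights.append(
            [(c / total) * idf[j] if total else 0.0 for j, c in enumerate(row)]
        )
    return vocab, weights


# Toy corpus, assumed for illustration only.
docs = [
    "Language is a wonderful medium of communication",
    "Machines cannot process raw text",
    "Text is converted into numbers",
]

vocab, bow_vectors = bag_of_words(docs)
_, tfidf_vectors = tf_idf(docs)
```

In this sketch, a word that occurs in only one document (such as "language") receives a higher TF-IDF weight in that document than a word shared across documents (such as "is"), which is the intuition behind down-weighting common terms.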


Transfer Learning for NLP: Fine-Tuning BERT for Text Classification

Introduction With the advancement in deep learning, neural network architectures like recurrent neural networks (RNN and LSTM) and convolutional neural networks (CNN) have shown a decent improvement in performance in solving several Natural Language Processing (NLP) tasks like text classification, language modeling, machine translation, etc. However, this performance of deep learning models in NLP pales in comparison to the performance of deep learning in Computer Vision. One of the main reasons for this slow progress could be the lack of […]


Elon Musk AI Text Generator with LSTMs in TensorFlow 2

Introduction Elon Musk has become an internet sensation over the past couple of years, thanks to his views about the future, his funny personality, and his passion for technology. By now everyone knows him, either as that electric car guy or as that guy who builds flamethrowers. He is mostly active on Twitter, where he shares everything, even memes! He inspires a lot of young people in the IT industry, and I wanted to do a fun little project, where I […]


TTS Skins: Speaker Conversion via ASR

Abstract We present a fully convolutional wav-to-wav network for converting between speakers’ voices, without relying on text. Our network is based on an encoder-decoder architecture, where the encoder is pre-trained for the task of Automatic Speech Recognition, and a multi-speaker waveform decoder is trained to reconstruct the original signal in an autoregressive manner. We train the network on narrated audiobooks, and demonstrate multi-voice TTS in those voices, by converting the voice of a TTS robot. To finish reading, please visit […]


Entropy Minimization In Emergent Languages

Abstract There is growing interest in studying the languages that emerge when neural agents are jointly trained to solve tasks requiring communication through a discrete channel. We investigate here the information-theoretic complexity of such languages, focusing on the basic two-agent, one-exchange setup. We find that, under common training procedures, the emergent languages are subject to an entropy minimization pressure that has also been detected in human language, whereby the mutual information between the communicating agent’s inputs and the messages is […]


TextCaps: a Dataset for Image Captioning with Reading Comprehension

Abstract Image descriptions can help visually impaired people to quickly understand the image content. While we made significant progress in automatically describing images and optical character recognition, current approaches are unable to include written text in their descriptions, although text is omnipresent in human environments and frequently critical to understand our surroundings. To study how to comprehend text in the context of an image we collect a novel dataset, TextCaps, with 145k captions for 28k images. Our dataset challenges a […]


Spatially Aware Multimodal Transformers for TextVQA

August 23, 2020 By: Yash Kant, Dhruv Batra, Peter Anderson, Alexander Schwing, Devi Parikh, Jiasen Lu, Harsh Agrawal Abstract Textual cues are essential for everyday tasks like buying groceries and using public transport. To develop this assistive technology, we study the TextVQA task, i.e., reasoning about text in images to answer a question. Existing approaches are limited in their use of spatial relations and rely on fully-connected transformer-based architectures to implicitly learn the spatial structure of a scene. In contrast, […]


Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline

Abstract Prior work in visual dialog has focused on training deep neural models on VisDial in isolation. Instead, we present an approach to leverage pretraining on related vision-language datasets before transferring to visual dialog. We adapt the recently proposed ViLBERT model for multi-turn visually-grounded conversations. Our model is pretrained on the Conceptual Captions and Visual Question Answering datasets, and finetuned on VisDial. Our best single model outperforms prior published work by > 1% absolute on NDCG and MRR. Next, we […]
