Elon Musk AI Text Generator with LSTMs in Tensorflow 2

Introduction Elon Musk has become an internet sensation over the past couple of years, thanks to his views about the future, his funny personality, and his passion for technology. By now everyone knows him, either as that electric car guy or as that guy who builds flamethrowers. He is mostly active on Twitter, where he shares everything, even memes! He inspires a lot of young people in the IT industry, and I wanted to do a fun little project, where I […]

TTS Skins: Speaker Conversion via ASR

Abstract We present a fully convolutional wav-to-wav network for converting between speakers’ voices, without relying on text. Our network is based on an encoder-decoder architecture, where the encoder is pre-trained for the task of Automatic Speech Recognition, and a multi-speaker waveform decoder is trained to reconstruct the original signal in an autoregressive manner. We train the network on narrated audiobooks, and demonstrate multi-voice TTS in those voices, by converting the voice of a TTS robot. To finish reading, please visit […]

Entropy Minimization In Emergent Languages

Abstract There is growing interest in studying the languages that emerge when neural agents are jointly trained to solve tasks requiring communication through a discrete channel. We investigate here the information-theoretic complexity of such languages, focusing on the basic two-agent, one-exchange setup. We find that, under common training procedures, the emergent languages are subject to an entropy minimization pressure that has also been detected in human language, whereby the mutual information between the communicating agent’s inputs and the messages is […]

TextCaps: a Dataset for Image Captioning with Reading Comprehension

Abstract Image descriptions can help visually impaired people to quickly understand the image content. While we made significant progress in automatically describing images and optical character recognition, current approaches are unable to include written text in their descriptions, although text is omnipresent in human environments and frequently critical to understand our surroundings. To study how to comprehend text in the context of an image we collect a novel dataset, TextCaps, with 145k captions for 28k images. Our dataset challenges a […]

Spatially Aware Multimodal Transformers for TextVQA

August 23, 2020 By: Yash Kant, Dhruv Batra, Peter Anderson, Alexander Schwing, Devi Parikh, Jiasen Lu, Harsh Agrawal

Abstract Textual cues are essential for everyday tasks like buying groceries and using public transport. To develop this assistive technology, we study the TextVQA task, i.e., reasoning about text in images to answer a question. Existing approaches are limited in their use of spatial relations and rely on fully-connected transformer-based architectures to implicitly learn the spatial structure of a scene. In contrast, […]

Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline

Abstract Prior work in visual dialog has focused on training deep neural models on VisDial in isolation. Instead, we present an approach to leverage pretraining on related vision-language datasets before transferring to visual dialog. We adapt the recently proposed ViLBERT model for multi-turn visually-grounded conversations. Our model is pretrained on the Conceptual Captions and Visual Question Answering datasets, and finetuned on VisDial. Our best single model outperforms prior published work by > 1% absolute on NDCG and MRR. Next, we […]

Information Extraction of Clinical Trial Eligibility Criteria

Abstract Clinical trials predicate subject eligibility on a diversity of criteria ranging from patient demographics to food allergies. Trials post their requirements as semantically complex, unstructured free-text. Formalizing trial criteria to a computer-interpretable syntax would facilitate eligibility determination. In this paper, we investigate an information extraction (IE) approach for grounding criteria from trials in ClinicalTrials.gov to a shared knowledge base. We frame the problem as a novel knowledge base population task, and implement a solution combining machine learning and context […]

Unsupervised Quality Estimation for Neural Machine Translation

August 31, 2020 By: Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco (Paco) Guzman, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia

Abstract Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it is aimed to inform the user on the quality of the MT output at test time. Existing approaches require large amounts of expert annotated data, computation and time for training. As an alternative, we devise an unsupervised approach […]

Lemotif: An Affective Visual Journal Using Deep Neural Networks

Abstract We present Lemotif, an integrated natural language processing and image generation system that uses machine learning to (1) parse a text-based input journal entry describing the user’s day for salient themes and emotions and (2) visualize the detected themes and emotions in creative and appealing image motifs. Synthesizing approaches from artificial intelligence and psychology, Lemotif acts as an affective visual journal, encouraging users to regularly write and reflect on their daily experiences through visual reinforcement. By making patterns in […]
