Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis

Prosody modeling is an essential component in modern text-to-speech (TTS) frameworks. By explicitly providing prosody features to the TTS model, the style of synthesized utterances can be controlled… However, predicting natural and reasonable prosody at inference time is challenging. In this work, we analyzed the behavior of non-autoregressive TTS models under different prosody-modeling settings and proposed a hierarchical architecture in which the prediction of phoneme-level prosody features is conditioned on the word-level prosody features. The proposed method outperforms other […]

Learning Inter-Modal Correspondence and Phenotypes from Multi-Modal Electronic Health Records

Non-negative tensor factorization has been shown to be a practical solution for automatically discovering phenotypes from electronic health records (EHR) with minimal human supervision. Such methods generally require an input tensor describing the inter-modal interactions to be pre-established; however, the correspondence between different modalities (e.g., correspondence between medications and diagnoses) can often be missing in practice… Although heuristic methods can be applied to estimate them, they inevitably introduce errors and lead to sub-optimal phenotype quality. This is particularly important for patients […]

Atrial Fibrillation Detection and ECG Classification based on CNN-BiLSTM

It is challenging to visually detect heart disease from electrocardiographic (ECG) signals. Implementing an automated ECG signal detection system can help diagnose arrhythmia and improve diagnostic accuracy… In this paper, we proposed, implemented, and compared an automated system using two different frameworks that combine a convolutional neural network (CNN) and long short-term memory (LSTM) for classifying normal sinus signals, atrial fibrillation, and other noisy signals. The dataset we used is from the MIT-BIH Arrhythmia […]

Biomedical Named Entity Recognition at Scale

Named entity recognition (NER) is a widely applicable natural language processing task and building block of question answering, topic modeling, information retrieval, etc. In the medical domain, NER plays a crucial role by extracting meaningful chunks from clinical notes and reports, which are then fed to downstream tasks like assertion status detection, entity resolution, relation extraction, and de-identification… Reimplementing a Bi-LSTM-CNN-Char deep learning architecture on top of Apache Spark, we present a single trainable NER model that obtains new state-of-the-art […]

Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning

Previous studies on image classification have mainly focused on the performance of the networks, not on real-time operation or model compression. We propose a Gaussian Deep Recurrent visual Attention Model (GDRAM), a reinforcement-learning-based lightweight deep neural network for large-scale image classification that outperforms the conventional convolutional neural network (CNN), which uses the entire image as input… Highly inspired by the biological visual recognition process, our model mimics the stochastic location of the retina with a Gaussian distribution. We […]

RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

We propose a real-time intermediate flow estimation algorithm (RIFE) for video frame interpolation (VFI). Most existing methods first estimate the bi-directional optical flows, and then linearly combine them to approximate intermediate flows, leading to artifacts around motion boundaries… We design an intermediate flow model named IFNet that can directly estimate the intermediate flows from coarse to fine. We then warp the input frames according to the estimated intermediate flows and employ a fusion process to compute final results. Based on […]

Same Object, Different Grasps: Data and Semantic Knowledge for Task-Oriented Grasping

Despite the enormous progress and generalization in robotic grasping in recent years, existing methods have yet to scale and generalize task-oriented grasping to the same extent. This is largely due to the scale of the datasets both in terms of the number of objects and tasks studied… We address these concerns with the TaskGrasp dataset which is more diverse both in terms of objects and tasks, and an order of magnitude larger than previous datasets. The dataset contains 250K task-oriented […]

Fine-Grained Sentiment Analysis of Smartphone Review

How to conduct fine-grained sentiment analysis: Approaches and Tools. Data collection and preparation: For data collection, we scraped the top 100 smartphone reviews from Amazon using Python, Selenium, and the BeautifulSoup library. If you don’t know how to use Python with the BeautifulSoup and requests libraries for web scraping, here is a quick tutorial. Selenium Python bindings provide a simple API to write functional/acceptance tests using Selenium WebDriver. Let’s begin coding
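The parsing step of such a scraper can be sketched as follows. This is a minimal illustration, not the article's code: it uses Python's standard-library `html.parser` as a dependency-free stand-in for BeautifulSoup, it parses an inline HTML string instead of fetching a live page with Selenium, and the markup and the `review-text` class name are hypothetical rather than Amazon's actual page structure.

```python
from html.parser import HTMLParser

# Hypothetical markup: the "review-text" class is an assumption made for
# this sketch and does not reflect any real site's page structure.
SAMPLE_HTML = """
<div class="review"><span class="review-text">Battery lasts two days.</span></div>
<div class="review"><span class="review-text">Camera is blurry in low light.</span></div>
"""


class ReviewExtractor(HTMLParser):
    """Collect the text inside every <span class="review-text"> element."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._in_review = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "span" and ("class", "review-text") in attrs:
            self._in_review = True
            self.reviews.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_review = False

    def handle_data(self, data):
        # Append text only while inside a review span.
        if self._in_review:
            self.reviews[-1] += data


def extract_reviews(html):
    """Return the stripped text of each review found in the HTML string."""
    parser = ReviewExtractor()
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.reviews]


reviews = extract_reviews(SAMPLE_HTML)
print(reviews)
```

In a real pipeline the HTML string would come from the browser session or an HTTP response, and the extracted review texts would then be passed on to the sentiment-analysis step.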

Hugging Face – 🤗Hugging Face Newsletter Issue #1 – Aug 20th 2020

News: 🤗 Welcome to the Hugging Face Newsletter! 🤗 Every few weeks, we’ll be updating you on the latest happenings at Hugging Face. Make sure to subscribe and share with all NLP lovers to get the latest updates on releases, readings, research, and more! Have an idea for the newsletter? Email newsletter@huggingface.co. 🚀 Model Hub Highlights 🚀 Open-Source Machine Translation: Did you know that you can translate between many languages with open-source 🤗 Transformers and great models

Hugging Face – 🤗Hugging Face Newsletter Issue #2 – Sep 11th 2020

News: Transformers gets a new release: v3.1.0. This new version is the first PyPI release to feature: the PEGASUS models, the current state of the art in summarization; DPR, for open-domain Q&A research; and mBART, a multilingual encoder-decoder model trained using the BART objective. Alongside the three new models, we are also releasing a long-awaited feature: “named outputs”. By passing return_dict=True, model outputs can now be accessed as named values as well as by
