Articles About Natural Language Processing

Customer Sentiments Analysis of Pepsi and Coca-Cola using Twitter Data in R

This article was published as a part of the Data Science Blogathon. Introduction: Coca-Cola and PepsiCo are well-established names in the soft drink industry, both in the Fortune 500. The two companies own a wide spectrum of product lines in a highly competitive market, maintain a fierce rivalry with each other, and constantly compete for market share in almost all of their product verticals. We will analyze the sentiment of customers of these two companies with the help of 5000 […]
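The core idea behind this kind of tweet-level sentiment analysis can be sketched with a simple lexicon-based scorer. The article itself works in R on real Twitter data; the tiny word lists and sample tweets below are invented purely for illustration.

```python
# Minimal lexicon-based sentiment scorer: count positive minus negative
# words per tweet. The lexicons here are toy examples, not a real resource.
POSITIVE = {"love", "great", "refreshing", "best", "good"}
NEGATIVE = {"hate", "flat", "worst", "bad", "awful"}

def sentiment_score(tweet: str) -> int:
    """Return (#positive words - #negative words) for one tweet."""
    words = tweet.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

tweets = [
    "I love the refreshing taste",      # score +2
    "worst soda ever, totally flat",    # score -2
]
scores = [sentiment_score(t) for t in tweets]
```

Real pipelines replace the hand-made lexicon with a curated one (or a trained classifier) and add tokenization that handles hashtags, mentions, and punctuation.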

Read more

Issue #117 – Subword Segmentation and a Single Bridge Language Affect Zero-Shot Neural Machine Translation

11 Feb 2021. Author: Dr. Jingyi Han, Machine Translation Scientist @ Iconic. Introduction: Nowadays, zero-shot machine translation is receiving more and more attention because of the high cost of building new engines for different language directions. The underlying principle of this strategy is to build a single model that learns to translate between different language pairs without direct training on those combinations. Following the […]

Read more

Hugging Face – Issue 7 – Feb 9th 2021

News: New Year, New Website! Our vision for the future of machine learning is one step closer to reality thanks to the 1,000+ researchers & open-source contributors, the thousands of companies & the fantastic Hugging Face team! Last month, we announced the launch of the latest version of huggingface.co and we couldn’t be more proud. 🔥 Play live with >10-billion-parameter models for tasks including translation, NER, zero-shot classification, and

Read more

Introduction to Hugging Face’s Transformers v4.3.0 and its First Automatic Speech Recognition Model – Wav2Vec2

Overview: Hugging Face has released Transformers v4.3.0, which introduces the first Automatic Speech Recognition model to the library: Wav2Vec2. Using one hour of labeled data, Wav2Vec2 outperforms the previous state of the art on the 100-hour subset while using 100 times less labeled data. Using just ten minutes of labeled data and pre-training on 53k hours of unlabeled data, Wav2Vec2 achieves 4.8/8.2 WER. Understand the Wav2Vec2 implementation, via the transformers library, for audio-to-text generation. Introduction: Transformers has been […]

Read more

Summarising Historical Text in Modern Languages

de №11 Story: The work in the local arsenal has been slackening for a long time now, and since the Persians have been so badly beaten by the Russians, nothing at all is heard any more of war preparations in the Turkish provinces. The Porte had not believed that Russia would dispatch so strong a force to the shores of the Caspian Sea, nor that the war with the Persians would so soon take such a decisive turn. All the war news that we now receive from the Turkish provinces […]

Read more

Spark NLP: Natural Language Understanding at Scale

Abstract: Spark NLP is a Natural Language Processing (NLP) library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. Spark NLP comes with 1100+ pretrained pipelines and models in 192+ languages. It supports nearly all common NLP tasks, and its modules can be used seamlessly in a cluster. Downloaded more than 2.7 million times and experiencing 9x growth since January […]

Read more

Attention Can Reflect Syntactic Structure (If You Let It)

Abstract: Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism. However, much of this work has focused almost exclusively on English, a language with rigid word order and little inflectional morphology. In this study, we present decoding experiments for multilingual BERT across 18 languages in order to test the generalizability of the claim that dependency syntax is reflected in attention patterns. We […]
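A common probe in this line of work treats, for each token, its highest-attention target (excluding itself) as the predicted syntactic head, and scores the result against gold heads with unlabeled attachment score (UAS). A minimal sketch under that assumption, with an invented toy attention matrix standing in for one of multilingual BERT's heads:

```python
# Decode one head per token from an attention matrix, then score UAS.
def decode_heads(attn):
    """attn[i][j]: attention from token i to token j -> predicted head index per token."""
    heads = []
    for i, row in enumerate(attn):
        j = max((j for j in range(len(row)) if j != i), key=lambda j: row[j])
        heads.append(j)
    return heads

def uas(pred, gold):
    """Fraction of tokens whose predicted head matches the gold head."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)

# Toy sentence "the cat sleeps": gold heads the->cat, cat->sleeps;
# "sleeps" is the root and is left unscored here.
attn = [
    [0.1, 0.7, 0.2],
    [0.3, 0.1, 0.6],
    [0.5, 0.4, 0.1],
]
pred = decode_heads(attn)
score = uas(pred[:2], [1, 2])
```

Actual probes of this kind typically search over all attention heads (and layers) and report the best-scoring head per dependency relation; this sketch shows only the per-token argmax decoding step.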

Read more

“Laughing at you or with you”: The Role of Sarcasm in Shaping the Disagreement Space


Read more

Syntactic Nuclei in Dependency Parsing – A Multilingual Exploration

In the previous sections, we have shown how syntactic nuclei can be identified in the UD annotation and how transition-based parsers can be made sensitive to these structures in their internal representations through the use of nucleus composition. We now proceed to a set of experiments investigating the impact of nucleus composition on a diverse selection of languages. 5.1 Experimental Settings: We use UUParser (de Lhoneux et al., 2017; Smith

Read more

Does injecting linguistic structure into language models lead to better alignment with brain recordings?

Figure 1 shows a high-level outline of our experimental design, which aims to establish whether injecting structure derived from a variety of syntacto-semantic formalisms into neural language model representations can lead to better correspondence with human brain activation data. We utilize fMRI recordings of human subjects reading a set of texts. Representations of these texts are then derived from the activations of the language models. Following Gauthier and Levy (

Read more