Articles About Natural Language Processing

Performing Sentiment Analysis Using Twitter Data!

"Data is water, purifying to make it edible is a role of Data Analyst" – Kashish Rastogi. In this blog, we are going to clean Twitter text data and visualize it. Table of Contents: Problem Statement; Data Description; Cleaning text with NLP; Finding if the text has: with spaCy; Cleaning text with preprocessor library; Analysis of the sentiment of data; Data visualization. I am taking the Twitter data which is available here on […]


Training BERT Text Classifier on Tensor Processing Unit (TPU)

Training Hugging Face's most famous model on TPU for sentiment analysis of Tunisian Arabizi social media text. Introduction: Arabic speakers usually express themselves in their local dialect on social media, so Tunisians use Tunisian Arabizi, which consists of Arabic written in the Latin alphabet. Sentiment analysis relies on cultural knowledge and word sense with contextual information. We will be using both the Arabizi dialect and sentiment analysis to solve the problem in this project. The competition is hosted on Zindi which […]


Dialogue in the Wild: Learning from a Deployed Role-Playing Game with Humans and Bots

Abstract: Much of NLP research has focused on crowdsourced static datasets and the supervised learning paradigm of training once and then evaluating test performance. As argued in de Vries et al. (2020), crowdsourced data has the issues of lack of naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow for a model to learn from its experiences of using language (Silver et al., 2013). In contrast, one might hope for machine learning systems that […]


Why must text data be pre-processed?

This article was published as a part of the Data Science Blogathon. Introduction: Language is a structured medium we humans use to communicate with each other. Language can be in the form of speech or text. “Blah blah”, “Meh”, “zzzz…” Yup, we can understand these words. But the question is, “Can computers understand these?” Nope, machines can’t understand these. In fact, machines can’t understand any text data at all, be it the word “blah” or the word “machine”. They only understand numbers. […]


Beyond Offline Mapping: Learning Cross-lingual Word Embeddings through Context Anchoring

July 31, 2021. By: Aitor Ormazabal, Mikel Artetxe, Aitor Soroa, Gorka Labaka, Eneko Agirre. Abstract: Recent research on cross-lingual word embeddings has been dominated by unsupervised mapping approaches that align monolingual embeddings. Such methods critically rely on those embeddings having a similar structure, but it was recently shown that the separate training in different languages causes departures from this assumption. In this paper, we propose an alternative approach that does not have this limitation, while requiring a weak seed dictionary […]


Part 16: Step by Step Guide to Master NLP – Topic Modelling using LSA

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article, we covered a basic topic modeling technique named Non-Negative Matrix Factorization. Continuing from that part, we now turn to another topic modeling technique named Latent Semantic Analysis. So, in this article, we will take a deep dive into a topic modeling technique named Latent Semantic Analysis […]


Part 20: Step by Step Guide to Master NLP – Information Retrieval

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article, we completed our discussion of topic modeling techniques. Now, in this article, we will discuss an important application of NLP: Information Retrieval. We will cover the basic concepts of Information Retrieval along with some of the models that are used in it. NOTE: […]


Bag-of-words vs TFIDF vectorization – A Hands-on Tutorial

This article was published as a part of the Data Science Blogathon. Whenever we apply any algorithm to textual data, we need to convert the text to a numeric form. Hence, there arises a need for some pre-processing techniques that can convert our text to numbers. Both bag-of-words (BOW) and TFIDF are pre-processing techniques that can generate a numeric form from an input text. Bag-of-Words: The bag-of-words model converts text into fixed-length vectors by counting how many times each word appears. […]
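As a rough illustration of the counting idea described in this excerpt, here is a minimal sketch in plain Python. It is not the tutorial's own code: the helper names `build_vocabulary` and `bag_of_words`, the whitespace tokenization, and the two sample sentences are all assumptions made for the example.

```python
from collections import Counter


def build_vocabulary(documents):
    """Collect the distinct lowercase tokens across all documents, sorted."""
    # Assumption: tokens are obtained by splitting on whitespace only.
    return sorted({token for doc in documents for token in doc.lower().split()})


def bag_of_words(document, vocabulary):
    """Return a fixed-length vector of token counts, one slot per vocabulary term."""
    counts = Counter(document.lower().split())
    # Counter returns 0 for terms that do not occur in this document.
    return [counts[term] for term in vocabulary]


# Hypothetical sample corpus, for illustration only.
docs = ["the cat sat", "the cat saw the dog"]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocab) for doc in docs]

print(vocab)    # ['cat', 'dog', 'sat', 'saw', 'the']
print(vectors)  # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Every document maps to a vector whose length equals the vocabulary size, so texts of different lengths become directly comparable, at the cost of discarding word order.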


Spam Detection – An application of Deep Learning

This article was published as a part of the Data Science Blogathon. What every big tech company wants is the security and safety of its customers. By detecting spam in emails and messages, they want to secure their network and enhance the trust of their customers. The official messaging app of Apple and the official chatting app of Google, i.e. Gmail, are unbeatable examples of such applications where the process of spam detection and filtering works well to protect users […]


Getting Started with Natural Language Processing using Python

This article was published as a part of the Data Science Blogathon. Why NLP? Natural Language Processing has always been a key tenet of Artificial Intelligence (AI). With the increase in the adoption of AI, systems to automate sophisticated tasks are being built. Some examples are described below. Diagnosing a rare form of cancer – At the University of Tokyo’s Institute of Medical Science, doctors used artificial intelligence to successfully diagnose a rare type of leukemia. The doctors used an AI […]
