Articles About Natural Language Processing

NLP Essentials: Removing Stopwords and Performing Text Normalization using NLTK and spaCy in Python

Overview: Learn how to remove stopwords and perform text normalization in Python – an essential Natural Language Processing (NLP) read. We will explore the different methods to remove stopwords as well as talk about text normalization techniques like stemming and lemmatization. Put your theory into practice by performing stopwords removal and text normalization in Python using the popular NLTK, spaCy and Gensim libraries. Introduction: Don’t you love how wonderfully diverse Natural Language Processing (NLP) is? Things we never imagined […]

An Exhaustive Guide to Detecting and Fighting Neural Fake News using NLP

Overview: Neural fake news (fake news generated by AI) can be a huge issue for our society. This article discusses different Natural Language Processing methods to develop robust defense against Neural Fake News, including using the GPT-2 detector model and Grover (AllenNLP). Every data science professional should be aware of what neural fake news is and how to combat it. Introduction: Fake news is a major concern in our society right now. It has gone hand-in-hand with the rise […]

What is Tokenization in NLP? Here’s All You Need To Know

Highlights: Tokenization is a key (and mandatory) aspect of working with text data. We’ll discuss the various nuances of tokenization, including how to handle Out-of-Vocabulary words (OOV). Introduction: Language is a thing of beauty. But mastering a new language from scratch is quite a daunting prospect. If you’ve ever picked up a language that wasn’t your mother tongue, you’ll relate to this! There are so many layers to peel off and syntaxes to consider – it’s quite a challenge. […]

A Comprehensive Step-by-Step Guide to Become an Industry-Ready Data Science Professional

Introduction to Artificial Intelligence and Machine Learning: Artificial Intelligence (AI) and its sub-field Machine Learning (ML) have taken the world by storm. From face recognition cameras and smart personal assistants to self-driven cars, we are moving towards a world enhanced by these emerging technologies. It’s the most exciting time to be in this career field! The global Artificial Intelligence market is expected to grow to $400 billion by the year 2025. From startups to big organizations, all want to join […]

Issue #106 – Informative Manual Evaluation of Machine Translation Output

05 Nov 20. Author: Méabh Sloane, MT Researcher @ Iconic. Introduction: With regard to manual evaluation of machine translation (MT) output, there is a continuous search for balance between the time and effort that manual evaluation requires and the significant results it achieves. As MT technology continues to improve and evolve, the need for human evaluation increases, an element often disregarded due to its demanding nature. This need is heightened by […]

NAACL 2019 Highlights

Update 19.04.20: Added a translation of this post in Spanish. This post discusses highlights of the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019). You can find past highlights of conferences here. The conference accepted 424 papers (which you can find here) and had 1575 participants (see the opening session slides for more details). These are the topics that stuck out for me most: Transfer learning The room at the Transfer Learning […]

Who is the world cheering for? 2014 FIFA WC winner predicted using Twitter feed (in R)

Sports are filled with emotions! The cheering of the audience and reactions to events on various media channels are some of the factors that make a huge impact on the minds of the players. If people support you, your chances of winning are greatly enhanced. A live example of this is the record of the Indian cricket team playing in India and abroad: its win rate in India is approximately twice its win rate abroad. Football is again a game driven largely by emotions. […]

Kaggle Solution: What’s Cooking? (Text Mining Competition)

Introduction: Tutorial on Text Mining, XGBoost and Ensemble Modeling in R. I came across the What’s Cooking competition on Kaggle last week. At first, I was intrigued by its name. I checked it and realized that this competition was about to finish. My bad! It was a text mining competition. This competition was live for 103 days and ended on 20th December 2015. Still, I decided to test my skills. I downloaded the data set, built a model and managed to get a score of […]

Measuring Audience Sentiments about Movies using Twitter and Text Analytics

Introduction: The practice of using analytics to measure a movie’s success is not a new phenomenon. Most of these predictive models are based on structured data with input variables such as Cost of Production, Genre of the Movie, Actor, Director, Production House, Marketing expenditure, number of distribution platforms, etc. However, with the advent of social media platforms, young demographics, digital media and the increasing adoption of platforms like Twitter, Facebook, etc. to express views and opinions, social media has become a […]

Introduction to Computational Linguistics and Dependency Trees in data science

Introduction: In recent years, the amalgam of deep learning fundamentals with Natural Language Processing techniques has shown a great improvement in information mining tasks on unstructured text data. The models are now able to recognize natural language and speech at levels comparable to humans. Despite such improvements, discrepancies in the results still exist, as sometimes the information is coded very deep in the syntaxes and syntactic structures of the corpus. Example – Problem with Neural Networks: For example, a conversation […]
