Articles About Natural Language Processing

Hugging Face Releases New NLP ‘Tokenizers’ Library Version (v0.8.0)

Hugging Face is at the forefront of a lot of updates in the NLP space. They have released one groundbreaking NLP library after another in the last few years. Honestly, I have learned and improved my own NLP skills a lot thanks to the work open-sourced by Hugging Face. And today, they’ve released another big update: a brand new version of their popular Tokenizers library. A Quick Introduction to Tokenization: So, what is tokenization? Tokenization is a crucial […]

Handling Imbalanced Data – Machine Learning, Computer Vision and NLP

This article was published as a part of the Data Science Blogathon. Introduction: In the real world, the data we gather will be heavily imbalanced most of the time. So, what is an imbalanced dataset? It is one whose training samples are not equally distributed across the target classes. For instance, in a personal loan classification problem, it is far easier to get ‘not approved’ data than ‘approved’ data. As a result, the model is more […]

Framework to build a niche dictionary for text mining

Having the right dictionary is at the heart of any text mining analysis. A dictionary for text mining can be compared to a map when travelling in a new city. The more precise and accurate the map you use, the faster you reach the destination. On the other hand, a wrong or incomplete map can end up confusing the traveler. Using a dictionary helps us convert unstructured text into structured data. The more precise dictionary you have for the analysis, the more accurate […]

Tapping Twitter Sentiments: A Complete Case-Study on 2015 Chennai Floods

Introduction: We did this case study as part of our capstone project at Great Lakes Institute of Management, Chennai. After we presented this study, we got an overwhelming response from our professors and mentors. Later, they encouraged us to share our work to help others learn something new. We’ve been following Analytics Vidhya for a while now. As everyone knows, it’s probably the largest engine for sharing analytics knowledge. We tried and got lucky in connecting with their content team. So, […]

An Intuitive Understanding of Word Embeddings: From Count Vectors to Word2Vec

Introduction: Before we start, have a look at the examples below. You open Google and search for a news article on the ongoing Champions Trophy and get hundreds of search results about it in return. Nate Silver analysed millions of tweets and correctly predicted the results of 49 out of 50 states in the 2008 U.S. presidential election. You type a sentence in English into Google Translate and get an equivalent Chinese translation. So what do the above examples have […]

The Ultimate Learning Path to Becoming a Data Scientist in 2018

Introduction: So you’ve taken the plunge. You want to become a data scientist. But where to begin? There are far too many resources out there. How do you decide the starting point? Did you miss out on topics you should have studied? Which are the best resources to learn? Don’t worry, we have you covered! Analytics Vidhya’s learning path for 2016 saw 250,000+ views. In 2017, we went even further and saw an incredible 500,000+ views! So this year, we […]

An Introductory Guide to Understand how ANNs Conceptualize New Ideas (using Embedding)

Introduction: Here’s something you don’t hear every day: everything we perceive is just a best-case probabilistic prediction by our brain, based on our past encounters and knowledge gained through other mediums. This might sound extremely counterintuitive because we have always imagined that our brain mostly gives us deterministic answers. We’ll do a small experiment to showcase this logic. Take a look at the below image: Q1. Do you see a human? Q2. Can you identify the person? […]

Tutorial on Text Classification (NLP) using ULMFiT and fastai Library in Python

Introduction: Natural Language Processing (NLP) needs no introduction in today’s world. It’s one of the most important fields of study and research, and has seen a phenomenal rise in interest in the last decade. The basics of NLP are widely known and easy to grasp. But things start to get tricky when the text data becomes huge and unstructured. That’s where deep learning becomes so pivotal. Yes, I’m talking about deep learning for NLP tasks: a still relatively less […]

A Step-by-Step NLP Guide to Learn ELMo for Extracting Features from Text

Introduction: I work on different Natural Language Processing (NLP) problems (the perks of being a data scientist!). Each NLP problem is a unique challenge in its own way. That’s just a reflection of how complex, beautiful and wonderful human language is. But one thing that has always been a thorn in an NLP practitioner’s side is the inability (of machines) to understand the true meaning of a sentence. Yes, I’m talking about context. Traditional NLP techniques and frameworks were great when […]

11 Superb Data Science Videos Every Data Scientist Must Watch

Overview: Presenting 11 data science videos that will enhance and expand your current skillset. We have categorized these videos into three fields: Natural Language Processing (NLP), Generative Models, and Reinforcement Learning. Learn how the concepts in these videos work and build your own data science project! Introduction: I love learning and understanding data science concepts through videos. I simply do not have the time to pore through books and pages of text to understand different ideas and topics. […]
