Build Text Categorization Model with Spark NLP

Overview Setting up John Snow labs Spark-NLP on AWS EMR and using the library to perform a simple text categorization of BBC articles. Introduction Natural Language Processing is one of the important processes for data science teams across the globe. With ever-growing data, most of the organizations have already moved to big data platforms like Apache Hadoop and cloud offerings like AWS, Azure, and GCP. These platforms are more than capable of handling    

Read more

Text Classification & Word Representations using FastText (An NLP library by Facebook)

Introduction If you put a status update on Facebook about purchasing a car -don’t be surprised if Facebook serves you a car ad on your screen. This is not black magic! This is Facebook leveraging the text data to serve you better ads. The picture below takes a jibe at a challenge while dealing with text data. Well, it clearly failed in the above attempt to deliver the right ad. It is all the more important to capture the context […]

Read more

Ultimate guide to deal with Text Data (using Python) – for Data Scientists and Engineers

Introduction One of the biggest breakthroughs required for achieving any level of artificial intelligence is to have machines which can process text data. Thankfully, the amount of text data being generated in this universe has exploded exponentially in the last few years. It has become imperative for an organization to have a structure in place to mine actionable insights from the text being generated. From social media analytics to risk management and cybercrime protection, dealing with text data has never […]

Read more

Comprehensive Hands on Guide to Twitter Sentiment Analysis with dataset and code

Introduction Natural Language Processing (NLP) is a hotbed of research in data science these days and one of the most common applications of NLP is sentiment analysis. From opinion polls to creating entire marketing strategies, this domain has completely reshaped the way businesses work, which is why this is an area every data scientist must be familiar with. Thousands of text documents can be processed for sentiment (and other features including named entities, topics, themes, etc.) in seconds, compared to […]

Read more

Natural Language Processing for Beginners: Using TextBlob

Introduction Natural Language Processing (NLP) is an area of growing attention due to increasing number of applications like chatbots, machine translation etc. In some ways, the entire revolution of intelligent machines in based on the ability to understand and interact with humans. I have been exploring NLP for some time now.  My journey started with NLTK library in Python, which was the recommended library to get started at that time. NLTK is a perfect library for education and research, it becomes […]

Read more

Learn how to Build your own Speech-to-Text Model (using Python)

Overview Learn how to build your very own speech-to-text model using Python in this article The ability to weave deep learning skills with NLP is a coveted one in the industry; add this to your skillset today We will use a real-world dataset and build this speech-to-text model so get ready to use your Python skills!   Introduction “Hey Google. What’s the weather like today?” This will sound familiar to anyone who has owned a smartphone in the last decade. […]

Read more

Tapping Twitter Sentiments: A Complete Case-Study on 2015 Chennai Floods

Introduction We did this case study as a part of our capstone project at Great Lakes Institute of Management, Chennai. After we presented this study, we got an overwhelming response from our professors & mentors. Later, they encouraged us to share our work to help others learn something new. We’ve been following Analytics Vidhya for a while now. Everyone knows, it’s probably the largest engine to share analytics knowledge. We tried and got lucky in connecting with their content team. So, […]

Read more

Tutorial on Text Classification (NLP) using ULMFiT and fastai Library in Python

Introduction Natural Language Processing (NLP) needs no introduction in today’s world. It’s one of the most important fields of study and research, and has seen a phenomenal rise in interest in the last decade. The basics of NLP are widely known and easy to grasp. But things start to get tricky when the text data becomes huge and unstructured. That’s where deep learning becomes so pivotal. Yes, I’m talking about deep learning for NLP tasks – a still relatively less […]

Read more

Kaggle Solution: What’s Cooking ? (Text Mining Competition)

Introduction Tutorial on Text Mining, XGBoost and Ensemble Modeling in R I came across What’s Cooking competition on Kaggle last week. At first, I was intrigued by its name. I checked it and realized that this competition is about to finish. My bad! It was a text mining competition.  This competition went live for 103 days and ended on 20th December 2015. Still, I decided to test my skills. I downloaded the data set, built a model and managed to get a score of […]

Read more

Comprehensive Guide to Text Summarization using Deep Learning in Python

Introduction “I don’t want a full report, just give me a summary of the results”. I have often found myself in this situation – both in college as well as my professional life. We prepare a comprehensive report and the teacher/supervisor only has time to read the summary. Sounds familiar? Well, I decided to do something about it. Manually converting the report to a summarized version is too time taking, right? Could I lean on Natural Language Processing (NLP) techniques […]

Read more
1 2 3 4 5 6