A Comprehensive Guide to Understand and Implement Text Classification in Python

Improving Text Classification Models: While the above framework can be applied to a number of text classification problems, some improvements to the overall framework are needed to achieve good accuracy. The following tips can improve the performance of text classification models and of this framework. 1. Text Cleaning: Text cleaning helps reduce the noise present in text data in the form of stopwords, punctuation marks, suffix variations, etc. This article can help to understand how […]
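
The text-cleaning tip in the excerpt above can be illustrated with a minimal sketch. It uses only the Python standard library. The stopword list, the suffix rules, and the function names `clean_text` and `strip_suffix` are assumptions made for illustration and are not taken from the article; a real project would more likely rely on a full stopword list and a proper stemmer or lemmatizer.

```python
import string

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and"}

# Naive suffix rules, checked in this order (hypothetical; not a real stemmer).
SUFFIXES = ("ing", "ed", "s")


def strip_suffix(token):
    """Remove the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean_text(text):
    """Lowercase, drop punctuation and stopwords, then trim simple suffixes."""
    text = text.lower()
    # Delete every ASCII punctuation character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return [strip_suffix(t) for t in tokens]


print(clean_text("The players are playing, and the crowd cheered!"))
# → ['player', 'play', 'crowd', 'cheer']
```

The three steps map onto the three noise sources named in the tip: punctuation marks, stopwords, and suffix variations. The suffix rule is deliberately crude and will mangle some words, which is why it is shown only as a sketch.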

Introduction to Flair for NLP: A Simple yet Powerful State-of-the-Art NLP Library

Introduction: The last couple of years have been incredible for Natural Language Processing (NLP) as a domain! We have seen multiple breakthroughs – ULMFiT, ELMo, Facebook’s PyText, Google’s BERT, among many others. These have rapidly accelerated state-of-the-art research in NLP (and language modeling, in particular). We can now predict the next sentence, given a sequence of preceding words. What’s even more important is that machines are now beginning to understand the key element that had eluded them for so long. Context! Understanding context […]

How do Transformers Work in NLP? A Guide to the Latest State-of-the-Art Models

Overview: The Transformer model in NLP has truly changed the way we work with text data. The Transformer is behind the recent NLP developments, including Google’s BERT. Learn how the Transformer idea works, how it’s related to language modeling and sequence-to-sequence modeling, and how it enables Google’s BERT model. Introduction: I love being a data scientist working in Natural Language Processing (NLP) right now. The breakthroughs and developments are occurring at an unprecedented pace. From the super-efficient ULMFiT framework to Google’s […]

Innoplexus Sentiment Analysis Hackathon: Top 3 Out-of-the-Box Winning Approaches

Overview: Hackathons are a wonderful opportunity to gauge your data science knowledge and compete to win lucrative prizes and job opportunities. Here are the top 3 approaches from the Innoplexus Sentiment Analysis Hackathon – a superb NLP challenge. Introduction: I’m a big fan of hackathons. I’ve learned so much about data science from participating in these hackathons in the past few years. I’ll admit it – I have gained a lot of knowledge through this medium and this, in […]

A Beginner’s Guide to Exploratory Data Analysis (EDA) on Text Data (Amazon Case Study)

The Importance of Exploratory Data Analysis (EDA): There are no shortcuts in a machine learning project lifecycle. We can’t simply skip to the model building stage after gathering the data. We need to plan our approach in a structured manner, and the exploratory data analysis (EDA) stage plays a huge part in that. I can say this with the benefit of hindsight, having personally gone through this situation plenty of times. In my early days in this field, I couldn’t […]

Ultimate Guide to Understand and Implement Natural Language Processing (with codes in Python)

Overview: Complete guide on natural language processing (NLP) in Python. Learn various techniques for implementing NLP, including parsing & text processing. Understand how to use NLP for text feature engineering. Introduction: According to industry estimates, only 21% of the available data is present in structured form. Data is being generated as we speak, as we tweet, as we send messages on WhatsApp and in various other activities. The majority of this data exists in textual form, which is highly unstructured […]

How to create a poet / writer using Deep Learning (Text Generation using Python)?

Introduction: From short stories to 50,000-word novels, machines are churning out words like never before. There are tons of examples available on the web where developers have used machine learning to write pieces of text, and the results range from the absurd to the delightfully funny. Thanks to major advancements in the field of Natural Language Processing (NLP), machines are able to understand the context and spin up tales all by themselves. […]

Complete tutorial on Text Classification using Conditional Random Fields Model (in Python)

Introduction: The amount of text data being generated in the world is staggering. Google processes more than 40,000 searches EVERY second! According to a Forbes report, every single minute we send 16 million text messages and post 510,000 comments on Facebook. For a layman, it is difficult to even grasp the sheer magnitude of data out there. News sites and other online media alone generate tons of text content on an hourly basis. Analyzing patterns in that data can become […]

DataHack Radio #21: Detecting Fake News using Machine Learning with Mike Tamir, Ph.D.

Introduction: Fake news is one of the biggest scourges in our digitally connected world. That is no exaggeration. It is no longer limited to little squabbles – fake news spreads like wildfire and is impacting millions of people every day. How do you deal with such a sensitive issue? Millions of articles are being churned out every day on the internet – how do you tell real from fake? It’s not as easy as turning to a simple fact checker. […]

Introduction to PyTorch-Transformers: An Incredible Library for State-of-the-Art NLP (with Python code)

Overview: In this article, we look at PyTorch-Transformers, the latest state-of-the-art NLP library. We will also implement PyTorch-Transformers in Python using popular NLP models like Google’s BERT and OpenAI’s GPT-2! This has the potential to revolutionize the landscape of NLP as we know it. Introduction: “NLP’s ImageNet moment has arrived.” – Sebastian Ruder. Imagine having the power to build the Natural Language Processing (NLP) model that powers Google Translate. What if I told you this can be done […]
