Why and how to use BERT for NLP Text Classification?
Introduction
NLP, or Natural Language Processing, is an exponentially growing field. In the "new normal" imposed by COVID-19, a significant proportion of educational material, news, and discussion happens through digital media platforms. This makes even more text data available to work with!
Originally, simple RNNs (Recurrent Neural Networks) were used for training models on text data. But in recent years, many new research publications have delivered state-of-the-art results. One such model is BERT! In this blog, I'll put down my understanding of the BERT transformer and its application.
BERT stands for Bidirectional Encoder Representations from Transformers. To give a taste of where we're headed, a minimal usage sketch follows below; after that, I'll give a brief idea about transformers before proceeding further.
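The snippet below is a minimal sketch, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (both are my assumptions here, not something this article has introduced yet): it loads a pretrained BERT model with a classification head and runs a single sentence through it.

```python
# Minimal sketch (assumption: Hugging Face `transformers` and PyTorch are installed).
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load the pretrained tokenizer and a BERT model with a 2-class classification head.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Tokenize one sentence and run a forward pass without gradient tracking.
inputs = tokenizer("BERT makes text classification easy!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the higher-scoring class index (0 or 1).
predicted_class = logits.argmax(dim=-1).item()
print(predicted_class)
```

Note that the classification head here is freshly initialized, so the prediction is meaningless until the model is fine-tuned on labeled examples; fine-tuning is what turns this into a real classifier.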
Intro to Transformers
BERT is a transformer-based architecture.
What is a transformer?
Google introduced the transformer architecture in the paper