Python for NLP: Working with Text and PDF Files
This is the first article in my series on Python for Natural Language Processing (NLP). We will start with the basics and see how to work with simple text files and PDF files using Python.
Working with Text Files
Text files are probably the most basic type of file you will encounter in your NLP endeavors. In this section, we will see how to read from a text file in Python, how to create a text file, and how to write data to it.
Reading a Text File
Create a text file with the following text and save it in your local directory with a “.txt” extension.
Welcome to Natural Language Processing
It is one of the most exciting research areas as of today
We will see how Python can be used to work with text files.
In my case, I saved the file as “myfile.txt” in the root of my “D:” drive.
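If you would rather create the sample file from Python than from a text editor, the following is a minimal sketch. The helper name create_sample_file and the relative path "myfile.txt" are assumptions made for illustration; point the path at wherever you actually keep the file.

```python
from pathlib import Path

# The three sample lines shown above.
SAMPLE_TEXT = (
    "Welcome to Natural Language Processing\n"
    "It is one of the most exciting research areas as of today\n"
    "We will see how Python can be used to work with text files.\n"
)


def create_sample_file(path):
    """Write the sample text to `path` and return the location as a Path.

    Hypothetical helper for this article; not part of any library.
    """
    path = Path(path)
    # write_text creates the file, or overwrites it if it already exists.
    path.write_text(SAMPLE_TEXT, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Assumed location: the current working directory.
    # On Windows you could pass a raw string such as r"D:\myfile.txt" instead.
    sample = create_sample_file("myfile.txt")
    print(sample.resolve())
```

Passing the encoding explicitly keeps the file's contents the same across operating systems, which matters once you start reading it back.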
Reading All File Contents
Now let’s see how we can read the whole contents of the file. The first step is to specify the path of