Python for NLP: Parts of Speech Tagging and Named Entity Recognition

This is the fourth article in my series on Python for NLP. In my previous article, I explained how the spaCy library can be used to perform tasks like vocabulary and phrase matching. In this article, we will study parts of speech tagging and named entity recognition in detail. We will see how the spaCy library can be used to perform these two tasks. Parts of Speech (POS) Tagging Parts of speech tagging simply refers to assigning parts […]

Working with PostgreSQL in Python

Introduction PostgreSQL is one of the most advanced and widely used relational database management systems. It’s extremely popular for many reasons, a few of which include it being open source, its extensibility, and its ability to handle many different types of applications and varying loads. With Python, you can easily establish a connection to your PostgreSQL database. There are many Python drivers for PostgreSQL, with “psycopg” being the most popular one. Its current version is psycopg2. In this article, we’ll […]

Sorting Algorithms in Python

Introduction Sometimes, data we store or retrieve in an application can have little or no order. We may have to rearrange the data to correctly process it or efficiently use it. Over the years, computer scientists have created many sorting algorithms to organize data. In this article we’ll have a look at popular sorting algorithms, understand how they work and code them in Python. We’ll also compare how quickly they sort items in a list. For simplicity, algorithm implementations would […]

Getting Started with Selenium and Python

Introduction Web Browser Automation is gaining popularity, and many frameworks/tools have arisen to offer automation services to developers. Web Browser Automation is often used for testing purposes in development and production environments, though it’s also often used for web scraping data from public sources, analysis, and data processing. What you do with automation is up to you; just make sure that what you’re doing is legal, as “bots” created with automation tools can often infringe laws or a […]

Python for NLP: Sentiment Analysis with Scikit-Learn

This is the fifth article in the series of articles on NLP for Python. In my previous article, I explained how Python’s spaCy library can be used to perform parts of speech tagging and named entity recognition. In this article, I will demonstrate how to do sentiment analysis on Twitter data using the Scikit-Learn library. Sentiment analysis refers to analyzing opinions or feelings about almost anything, using data like text or images. Sentiment analysis helps companies in […]

Python for NLP: Topic Modeling

This is the sixth article in my series of articles on Python for NLP. In my previous article, I talked about how to perform sentiment analysis of Twitter data using Python’s Scikit-Learn library. In this article, we will study topic modeling, which is another very important application of NLP. We will see how to do topic modeling with Python. What is Topic Modeling Topic modeling is an unsupervised technique that intends to analyze large volumes of text data by clustering […]

Introduction to the Python lxml Library

lxml is a Python library which allows for easy handling of XML and HTML files, and can also be used for web scraping. There are a lot of off-the-shelf XML parsers out there, but for better results, developers sometimes prefer to write their own XML and HTML parsers. This is when the lxml library comes into play. The key benefits of this library are that it’s easy to use, extremely fast when parsing large documents, very well documented, and provides […]

Python for NLP: Introduction to the TextBlob Library

Introduction This is the seventh article in my series of articles on Python for NLP. In my previous article, I explained how to perform topic modeling using Latent Dirichlet Allocation and Non-Negative Matrix Factorization. We used the Scikit-Learn library to perform topic modeling. In this article, we will explore TextBlob, which is another extremely powerful NLP library for Python. TextBlob is built upon NLTK and provides an easy-to-use interface to the NLTK library. We will see how TextBlob […]

Introduction to the Python Calendar Module

Introduction Python has a built-in module named calendar that contains useful classes and functions to support a variety of calendar operations. By default, the calendar module follows the Gregorian calendar, where Monday is the first day (0) of the week and Sunday is the last day of the week (6). In Python, the datetime and time modules also provide low-level calendar-related functionality. In addition to these modules, the calendar module provides essential functions related to displaying and manipulating calendars. To print […]

Working with PDFs in Python: Reading and Splitting Pages

This article is the first in a series on working with PDFs in Python: The PDF Document Format Today, the Portable Document Format (PDF) is among the most commonly used data formats. In 1990, the structure of a PDF document was defined by Adobe. The idea behind the PDF format is that transmitted data/documents look exactly the same for both parties that are involved in the communication process – the creator, author or sender, and the receiver. PDF is the […]
