Using Regex for Text Manipulation in Python

Text preprocessing is one of the most important tasks in Natural Language Processing (NLP). For instance, you may want to remove all punctuation marks from text documents before they can be used for text classification. Similarly, you may want to extract numbers from a text string. Writing manual scripts for such preprocessing tasks requires a lot of effort and is prone to errors. Keeping in view the importance of these preprocessing tasks, the Regular Expressions (aka Regex) have been […]
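
As a minimal sketch of the two preprocessing tasks mentioned above (stripping punctuation and extracting numbers), Python's standard re module could be used roughly as follows. The sample string and variable names are illustrative assumptions, not taken from the article.

```python
import re
import string

# Hypothetical sample text, chosen only for illustration.
text = "Hello, world! Call 555-0199 before 5 pm."

# Build a character class from every ASCII punctuation mark and delete matches.
punct_pattern = f"[{re.escape(string.punctuation)}]"
no_punct = re.sub(punct_pattern, "", text)

# Collect every run of consecutive digits as a string.
numbers = re.findall(r"\d+", text)

print(no_punct)
print(numbers)
```

Note that string.punctuation covers ASCII punctuation only, so typographic quotes or other Unicode marks would need a broader pattern.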


Text Classification with Python and Scikit-Learn

Text classification is one of the most important tasks in Natural Language Processing. It is the process of classifying text strings or documents into different categories, depending upon the contents of the strings. Text classification has a variety of applications, such as detecting user sentiment from a tweet, classifying an email as spam or ham, classifying blog posts into different categories, automatic tagging of customer queries, and so on. In this article, we will see a real-world example of […]


Comparing Strings using Python

In Python, strings are sequences of characters, each of which is stored in memory as an object. Each object can be identified using the id() function, as you can see below. Python tries to re-use objects in memory that have the same value, which also makes comparing objects very fast in Python:

$ python
Python 2.7.9 (default, Jun 29 2016, 13:08:31)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "abc"
>>> b = […]
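
The difference between equal values and shared objects can be sketched as below. This is an illustration written for Python 3 rather than the truncated Python 2 session above, and the variable names are assumptions. Whether two equal strings automatically share one object is an implementation detail, so the sketch uses sys.intern to request sharing explicitly.

```python
import sys

# Two equal strings; the second is built at runtime.
a = "abc"
b = "".join(["a", "b", "c"])

# Value comparison: True whenever the characters match.
same_value = a == b

# sys.intern returns one canonical object per distinct string value,
# so interned copies of equal strings are the same object.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
same_object = a_interned is b_interned

print(same_value, same_object)
print(id(a_interned) == id(b_interned))
```

Use == when the question is whether two strings have the same content; identity checks with is or id() only say whether two names refer to one object.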


Beginner’s Tutorial on the Pandas Python Library

Pandas is an open source Python package that provides numerous tools for data analysis. The package comes with several data structures that can be used for many different data manipulation tasks. It also has a variety of methods that can be invoked for data analysis, which comes in handy when working on data science and machine learning problems in Python.

Advantages of Using Pandas

The following are some of the advantages of the Pandas library: It can present data in […]


Text Summarization with NLTK in Python

As I write this article, 1,907,223,370 websites are active on the internet and 2,722,460 emails are being sent per second. This is an unbelievably huge amount of data. It is impossible for a user to get insights from such huge volumes of data. Furthermore, a large portion of this data is either redundant or doesn’t contain much useful information. The most efficient way to get access to the most important parts of the data, without having to sift through […]


File Handling in Python

It is an unwritten consensus that Python is one of the best programming languages for a novice to start with. It is extremely versatile, easy to read and analyze, and quite pleasant to the eye. The Python programming language is highly scalable and is widely considered one of the best toolboxes for building tools and utilities that you may want to use for diverse reasons. This article briefly covers how Python handles one of the most important components of […]


Preparing for a Python Developer Interview

In this article I will be giving my opinions and suggestions for putting yourself in the best position to outperform competing candidates in a Python programming interview so that you can land a job as a Python developer. You may be thinking that, with the shortage of programmers in the job market, all you need to do is show up and answer a few questions about basic Python syntax and let your degree or bootcamp certificate take care of the […]


Implementing Word2Vec with Gensim Library in Python

Humans have a natural ability to understand what other people are saying and what to say in response. This ability is developed by consistently interacting with other people and society over many years. Language plays a very important role in how humans interact. Languages that humans use for interaction are called natural languages. The rules of various natural languages are different. However, natural languages have one thing in common: flexibility and evolution. Natural languages are […]


Creating a Simple Recommender System in Python using Pandas

Have you ever wondered how Netflix suggests movies to you based on the movies you have already watched? Or how an e-commerce website displays options such as “Frequently Bought Together”? These may look like relatively simple options, but behind the scenes, a complex statistical algorithm executes in order to predict these recommendations. Such systems are called Recommender Systems, Recommendation Systems, or Recommendation Engines. A Recommender System is one of the most famous applications of data science and machine learning. […]


How to Format Dates in Python

Python comes with a variety of useful objects that can be used out of the box. Date objects are examples of such objects. Date types are difficult to manipulate from scratch, due to the complexity of dates and times. However, Python date objects make it extremely easy to convert dates into the desired string formats. Date formatting is one of the most important tasks that you will face as a programmer. Different regions around the world have different ways […]
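
As a brief sketch of converting one date into several regional string formats, the standard datetime module's strftime method can be used as shown here. The sample date and the choice of formats are assumptions made for illustration.

```python
from datetime import date

# Hypothetical sample date, chosen only for illustration.
d = date(2018, 9, 15)

# Day-first style, common in much of Europe.
day_first = d.strftime("%d/%m/%Y")

# Month-first style, common in the United States.
month_first = d.strftime("%m/%d/%Y")

# ISO 8601 style (year-month-day).
iso = d.isoformat()

print(day_first, month_first, iso)
```

Numeric directives such as %d, %m, and %Y give the same result on any system, whereas name-based directives such as %B depend on the active locale.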
