Python tutorials

Converting Strings to datetime in Python

Introduction: One of the most common problems we face in software development is handling dates and times. After getting a date-time string from an API, for example, we need to convert it to a human-readable format. If the same API is used in different time zones, the conversion will differ, so a good date-time library should convert the time according to the time zone. This is just one of many nuances that need to be handled when dealing with dates […]
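As a minimal sketch of the conversion described above, the standard-library datetime module can parse a timestamp string and shift it to another UTC offset. The input string, its format, and the UTC+05:30 target offset are assumptions chosen for illustration, not values from the article.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical timestamp string, shaped like something an API might return.
raw = "2018-06-29 08:15:27 +0000"

# Parse the string into a timezone-aware datetime object.
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S %z")

# Express the same instant at a fixed UTC+05:30 offset (illustrative choice).
local = parsed.astimezone(timezone(timedelta(hours=5, minutes=30)))

# Render a human-readable string that includes the offset.
readable = local.strftime("%Y-%m-%d %H:%M (UTC%z)")
print(readable)
```

Both objects represent the same instant; only the displayed wall-clock time changes with the offset.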


The Naive Bayes Algorithm in Python with Scikit-Learn

When studying probability and statistics, one of the first and most important theorems students learn is Bayes’ Theorem. This theorem underpins reasoning about the probability of an event occurring based on prior knowledge of conditions that might be related to the event. The Naive Bayes Classifier brings the power of this theorem to machine learning, building a very simple yet powerful classifier. In this article, we will see an overview of how […]
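To make the theorem concrete, here is a small sketch of Bayes' rule, P(A|B) = P(B|A) P(A) / P(B), in plain Python. The function name and the spam-filter probabilities are hypothetical and only illustrate the arithmetic; this is not the Scikit-Learn classifier itself.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A) using Bayes' rule."""
    # Total probability of the evidence B.
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 20% of mail is spam, a given word appears in
# 50% of spam messages and in 5% of non-spam messages.
p_spam_given_word = posterior(prior=0.2, likelihood=0.5, likelihood_if_not=0.05)
print(round(p_spam_given_word, 3))
```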


Hierarchical Clustering with Python and Scikit-Learn

Hierarchical clustering is a type of unsupervised machine learning algorithm used to cluster unlabeled data points. Like K-means clustering, hierarchical clustering groups together data points with similar characteristics, and in some cases the results of the two methods can be similar. Before implementing hierarchical clustering using Scikit-Learn, let’s first understand the theory behind it. Theory of Hierarchical Clustering: There are two types of hierarchical clustering: agglomerative and divisive. In the former, data points are clustered using a […]
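The agglomerative variant can be sketched in a few lines of plain Python. This toy version assumes one-dimensional points and single linkage (the distance between two clusters is the distance between their closest members); both are illustrative choices, and the function name is hypothetical rather than part of Scikit-Learn.

```python
def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain (single linkage)."""
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest closest-member distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

result = agglomerate([1.0, 1.2, 5.0, 5.1, 9.0], n_clusters=3)
print(result)
```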


Cross Validation and Grid Search for Model Selection in Python

Introduction: A typical machine learning process involves training different models on a dataset and selecting the one with the best performance. However, evaluating the performance of an algorithm is not always a straightforward task. Several factors can help you determine which algorithm performs best. One such factor is the performance on the cross-validation set, and another is the choice of parameters for an algorithm. In this article we will explore these two factors in detail. We […]
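As a rough illustration of the cross-validation idea, the sketch below splits sample indices into k folds, so that each fold serves once as the validation set while the rest is used for training. The helper name is hypothetical, and this is a plain-Python approximation, not the Scikit-Learn API.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, validation
        start += size

folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print(len(train), validation)
```

A model would be trained on each training split and scored on the matching validation split; averaging those scores gives the cross-validation estimate.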


The Python Requests Module

Introduction: Dealing with HTTP requests is not an easy task in any programming language. Python 2 comes with two built-in modules, urllib and urllib2, to handle HTTP-related operations. The two modules offer different sets of functionality and often need to be used together. The main drawbacks of urllib are that it is confusing (a few methods are available in both urllib and urllib2), the documentation is not clear, and we need to write […]


Association Rule Mining via Apriori Algorithm in Python

Association rule mining is a technique to identify underlying relations between different items. Take the example of a supermarket where customers can buy a variety of items. Usually, there is a pattern in what customers buy. For instance, mothers with babies buy baby products such as milk and diapers. Young women may buy makeup items, whereas bachelors may buy beer and chips. In short, transactions involve a pattern. More profit can be generated if the relationship between the items […]
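One way to quantify such patterns is with the support and confidence measures commonly used in association rule mining. The sketch below computes both over a handful of made-up transactions; the data and helper names are hypothetical, and this is not a full Apriori implementation.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"milk", "diapers", "bread"},
    {"milk", "diapers"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"beer", "chips", "diapers"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"milk", "diapers"}))
print(confidence({"beer"}, {"chips"}))
```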


Using Regex for Text Manipulation in Python

Introduction: Text preprocessing is one of the most important tasks in Natural Language Processing (NLP). For instance, you may want to remove all punctuation marks from text documents before they are used for text classification. Similarly, you may want to extract numbers from a text string. Writing manual scripts for such preprocessing tasks requires a lot of effort and is prone to errors. Given the importance of these preprocessing tasks, regular expressions (aka regex) have been […]
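The two tasks mentioned above, removing punctuation and extracting numbers, can each be sketched with one call to Python's built-in re module. The sample sentence is invented for illustration, and the patterns are one reasonable choice among several.

```python
import re

# Hypothetical input text.
text = "Order #42 shipped on 2018-07-03; total: $19.99!"

# Drop every character that is neither a word character nor whitespace.
no_punct = re.sub(r"[^\w\s]", "", text)

# Collect every run of consecutive digits.
numbers = re.findall(r"\d+", text)

print(no_punct)
print(numbers)
```

Note that stripping punctuation also fuses the digits of the date and price, so numbers are usually extracted before punctuation is removed.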


Text Classification with Python and Scikit-Learn

Introduction: Text classification is one of the most important tasks in Natural Language Processing. It is the process of classifying text strings or documents into different categories, depending upon the contents of the strings. Text classification has a variety of applications, such as detecting user sentiment from a tweet, classifying an email as spam or ham, classifying blog posts into different categories, automatic tagging of customer queries, and so on. In this article, we will see a real-world example of […]


Comparing Strings using Python

In Python, strings are sequences of characters that are stored in memory as objects. Each object can be identified using the built-in id() function, as you can see below. Python tries to reuse objects in memory that have the same value, which also makes comparing such objects very fast:

    $ python
    Python 2.7.9 (default, Jun 29 2016, 13:08:31)
    [GCC 4.9.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = "abc"
    >>> b = […]
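The difference between comparing values and comparing identities can be sketched as follows. The variable names are illustrative, and whether two equal strings share one object is an implementation detail of the interpreter, so the identity results are printed rather than assumed.

```python
a = "abc"
b = "abc"
# Build an equal string at runtime instead of from a literal.
c = "".join(["a", "b", "c"])

# == compares the characters, so all three strings are equal.
print(a == b, a == c)

# id() returns an identifier for the object; "is" compares identities.
# Equal strings may or may not be the same object in memory.
print(id(a), id(b), id(c))
print(a is b, a is c)
```

For comparing text, == is the right operator; "is" only answers whether two names refer to the same object.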


Beginner’s Tutorial on the Pandas Python Library

Pandas is an open-source Python package that provides numerous tools for data analysis. The package comes with several data structures that can be used for many different data manipulation tasks. It also has a variety of methods that can be invoked for data analysis, which come in handy when working on data science and machine learning problems in Python. Advantages of Using Pandas: The following are some of the advantages of the Pandas library: It can present data in […]
