Course Review: Python for Data Science and Machine Learning Bootcamp

Before we get started, it would be helpful to know what data science and machine learning actually are. So in case you don’t know, here are some basic definitions: Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge or insights from data in various forms, either structured or unstructured. Machine learning is a field of computer science that often uses statistical techniques to give computers the ability to “learn” with data, without being […]


Course Review: Complete Python Bootcamp – Go from zero to hero in Python 3

Introduction The Python programming language has been around for a long time now, and given how powerful it is, it shouldn’t be a surprise if it keeps a strong foothold for years to come. Python’s extensible frameworks and rich set of libraries make it a top language across various fields such as data science, machine learning, and web development, to name a few. Students and professionals alike are using it to tackle day-to-day problems as well […]


Random Forest Algorithm with Python and Scikit-Learn

Random forest is a type of supervised machine learning algorithm based on ensemble learning. Ensemble learning is a type of learning where you combine different types of algorithms, or the same algorithm multiple times, to form a more powerful prediction model. The random forest algorithm combines multiple algorithms of the same type, i.e. multiple decision trees, resulting in a forest of trees, hence the name “Random Forest”. The random forest algorithm can be used for both regression and classification tasks. How […]


The Python tempfile Module

Introduction Temporary files, or “tempfiles”, are mainly used to store intermediate information on disk for an application. These files are normally created for purposes such as temporary backups, or when the application is dealing with a dataset larger than the system’s memory. Ideally, these files are located in a separate directory, which varies across operating systems, and the names of these files are unique. The data stored in temporary files is not always required after the […]
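As a rough illustration of the idea in this excerpt, here is a minimal sketch using the standard-library tempfile module. The text written to the file and the ".tmp" suffix are made up for the example and do not come from the article.

```python
import os
import tempfile

# A temporary file that is deleted automatically when the block exits.
with tempfile.TemporaryFile(mode="w+") as tmp:
    tmp.write("intermediate results")  # illustrative payload, not from the article
    tmp.seek(0)                        # rewind before reading the data back
    content = tmp.read()

# NamedTemporaryFile exposes a unique path inside the system's temp directory.
with tempfile.NamedTemporaryFile(suffix=".tmp") as named:
    temp_path = named.name
    existed_while_open = os.path.exists(temp_path)

# By default the named file is removed once it is closed.
exists_after_close = os.path.exists(temp_path)

print(content)
print(tempfile.gettempdir())  # the temp directory differs between operating systems
```

The context managers take care of cleanup, which matches the point that the data is usually not needed once the work is done.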


Converting Strings to datetime in Python

Introduction One of the many common problems that we face in software development is handling dates and times. After getting a date-time string from an API, for example, we need to convert it to a human-readable format. Again, if the same API is used in different timezones, the conversion will be different. A good date-time library should convert the time as per the timezone. This is just one of many nuances that need to be handled when dealing with dates […]
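To make the conversion concrete, here is a small sketch using only the standard-library datetime module. The input string, its format, and the +05:30 offset are assumptions chosen for illustration; a real API may return a different format.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical date-time string, as an API might return it (assumed to be in UTC).
raw = "2018-06-29 08:15:27"

# Parse the string into a datetime object using an explicit format.
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")

# Mark the value as UTC, then convert it to a fixed +05:30 offset.
utc_time = parsed.replace(tzinfo=timezone.utc)
local_time = utc_time.astimezone(timezone(timedelta(hours=5, minutes=30)))

# Format the result for display.
readable = local_time.strftime("%Y-%m-%d %H:%M")
print(readable)
```

A fixed offset keeps the sketch dependency-free; named timezones with daylight-saving rules need a timezone database, which this example deliberately avoids.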


The Naive Bayes Algorithm in Python with Scikit-Learn

When studying Probability & Statistics, one of the first and most important theorems students learn is Bayes’ Theorem. This theorem is the foundation of Bayesian inference, which focuses on determining the probability of an event occurring based on prior knowledge of conditions that might be related to the event. The Naive Bayes Classifier brings the power of this theorem to Machine Learning, building a very simple yet powerful classifier. In this article, we will see an overview of how […]


Hierarchical Clustering with Python and Scikit-Learn

Hierarchical clustering is a type of unsupervised machine learning algorithm used to cluster unlabeled data points. Like K-means clustering, hierarchical clustering groups together data points with similar characteristics. In some cases the results of hierarchical and K-means clustering can be similar. Before implementing hierarchical clustering using Scikit-Learn, let’s first understand the theory behind hierarchical clustering. Theory of Hierarchical Clustering There are two types of hierarchical clustering: Agglomerative and Divisive. In the former, data points are clustered using a […]


Cross Validation and Grid Search for Model Selection in Python

Introduction A typical machine learning process involves training different models on the dataset and selecting the one with the best performance. However, evaluating the performance of an algorithm is not always a straightforward task. There are several factors that can help you determine which algorithm performs best. One such factor is the performance on the cross-validation set, and another factor is the choice of parameters for an algorithm. In this article we will explore these two factors in detail. We […]
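The two factors mentioned in this excerpt can be sketched in plain Python. This is only a toy illustration under stated assumptions: the data, the "shrunken mean" predictor, and the helper names k_fold_indices and cv_score are all hypothetical, and this is not the Scikit-Learn code the article itself goes on to use.

```python
from statistics import mean

def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

def cv_score(y, shrink, k=3):
    """Mean squared error of a toy constant predictor, averaged over k folds."""
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        # Toy "model": predict shrink * mean of the training targets.
        prediction = shrink * mean(y[i] for i in train_idx)
        fold_errors.append(mean((y[i] - prediction) ** 2 for i in val_idx))
    return mean(fold_errors)

# Made-up targets and a small grid of candidate parameter values.
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
grid = [0.5, 1.0, 1.5]

# Grid search: score every candidate with cross validation, keep the lowest error.
scores = {shrink: cv_score(y, shrink) for shrink in grid}
best_shrink = min(scores, key=scores.get)
best_score = scores[best_shrink]
print(best_shrink, best_score)
```

Each candidate parameter is judged on data it was not fitted to, which is the point of combining the two techniques.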


The Python Requests Module

Introduction Dealing with HTTP requests is not an easy task in any programming language. Python 2 comes with two built-in modules, urllib and urllib2, to handle HTTP-related operations. The two modules offer different sets of functionality, and many times they need to be used together. The main drawback of using urllib is that it is confusing (a few methods are available in both urllib and urllib2), the documentation is not clear and we need to write […]


Association Rule Mining via Apriori Algorithm in Python

Association rule mining is a technique to identify underlying relations between different items. Take the example of a supermarket where customers can buy a variety of items. Usually, there is a pattern in what the customers buy. For instance, mothers with babies buy baby products such as milk and diapers. Damsels may buy makeup items, whereas bachelors may buy beer and chips. In short, transactions involve a pattern. More profit can be generated if the relationship between the items […]
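To make "a pattern in what the customers buy" concrete, here is a minimal sketch of the support, confidence, and lift measures commonly used to score association rules. The transactions are invented for illustration, the helper name support is hypothetical, and this is not a full Apriori implementation.

```python
# Made-up shopping baskets using items mentioned in the excerpt.
transactions = [
    {"milk", "diapers", "beer"},
    {"milk", "diapers"},
    {"beer", "chips"},
    {"milk", "chips"},
    {"milk", "diapers", "chips"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    matches = sum(1 for basket in transactions if itemset <= basket)
    return matches / len(transactions)

# Score the rule {milk} -> {diapers}.
support_both = support({"milk", "diapers"})      # how often both appear together
confidence = support_both / support({"milk"})     # P(diapers | milk)
lift = confidence / support({"diapers"})          # > 1 suggests a positive association

print(support_both, confidence, lift)
```

A lift above 1 means the two items appear together more often than they would if purchases were independent.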
