Python tutorials

Implementing LDA in Python with Scikit-Learn

In our previous article, Implementing PCA in Python with Scikit-Learn, we studied how to reduce the dimensionality of a feature set using PCA. In this article we will study another very important dimensionality reduction technique: linear discriminant analysis (LDA). But first, let’s briefly discuss how PCA and LDA differ from each other. PCA vs LDA: What’s the Difference? Both PCA and LDA are linear transformation techniques. However, PCA is an unsupervised dimensionality reduction technique, while LDA is a supervised one. […]
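
The unsupervised versus supervised distinction can be sketched in plain Python on a toy 2-D dataset. This is only an illustration, not scikit-learn's implementation: the data points and the helper names (`scatter`, `pca_direction`, `lda_direction`) are assumptions made up for this example, and the LDA part covers only the two-class Fisher case.

```python
import math

def mean(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def scatter(points):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about the mean."""
    mx, my = mean(points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxx, sxy, syy

def unit(vx, vy):
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)

def pca_direction(points):
    """Unsupervised: direction of maximum variance; labels are never used."""
    a, b, c = scatter(points)
    top = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)  # largest eigenvalue
    if b == 0:
        return (1.0, 0.0) if a >= c else (0.0, 1.0)
    return unit(b, top - a)  # eigenvector belonging to the largest eigenvalue

def lda_direction(class0, class1):
    """Supervised, two classes: Fisher direction Sw^-1 (m1 - m0)."""
    a0, b0, c0 = scatter(class0)
    a1, b1, c1 = scatter(class1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1  # within-class scatter Sw
    det = a * c - b * b                  # assumes Sw is invertible
    m0, m1 = mean(class0), mean(class1)
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # The inverse of [[a, b], [b, c]] is [[c, -b], [-b, a]] / det.
    return unit((c * dx - b * dy) / det, (-b * dx + a * dy) / det)

# Toy data (made up): both classes are spread along x but separated along y.
class0 = [(-2, 0.1), (-1, -0.1), (0, 0.0), (1, 0.1), (2, -0.1)]
class1 = [(-2, 1.1), (-1, 0.9), (0, 1.0), (1, 1.1), (2, 0.9)]

print("PCA direction:", pca_direction(class0 + class1))
print("LDA direction:", lda_direction(class0, class1))
```

On this data the PCA direction lies almost along the x-axis, where the overall variance is largest, while the LDA direction lies almost along the y-axis, where the two classes are actually separated.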

Introduction to the Python Coding Style

As a scripting language, Python is quite simple and compact. Compared to other languages, it has a relatively small number of keywords to internalize in order to write proper Python code. Furthermore, Python prides itself on favoring both simplicity and readability of code. To achieve both goals, it helps to follow the language’s specific guidelines. This article focuses on those guidelines to write valid code that […]

Course Review: Python for Data Science and Machine Learning Bootcamp

Before we get started, it would be helpful to know what data science and machine learning actually are. So in case you don’t know, here are some basic definitions: Data science is an interdisciplinary field of scientific methods, processes, algorithms and systems to extract knowledge or insights from data in various forms, either structured or unstructured. Machine learning is a field of computer science that often uses statistical techniques to give computers the ability to “learn” with data, without being […]

Course Review: Complete Python Bootcamp – Go from zero to hero in Python 3

Introduction The Python programming language has been around for a long time now, and given how powerful it is, it shouldn’t be a surprise if it keeps a strong foothold for years to come. Python’s extensible frameworks and rich set of libraries make it a top language across various fields such as data science, machine learning, and web development, to name a few. Students and professionals alike are using it to tackle day-to-day problems as well […]

Random Forest Algorithm with Python and Scikit-Learn

Random forest is a type of supervised machine learning algorithm based on ensemble learning. Ensemble learning is a type of learning where you combine different types of algorithms, or the same algorithm multiple times, to form a more powerful prediction model. The random forest algorithm combines multiple algorithms of the same type, i.e. multiple decision trees, resulting in a forest of trees, hence the name “Random Forest”. The random forest algorithm can be used for both regression and classification tasks. How […]
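
The ensemble idea can be sketched in a few lines of plain Python. This is a simplified illustration, not scikit-learn's implementation: the three "trees" are hypothetical hand-written rules standing in for fitted decision trees, and the feature names (`links`, `caps_ratio`, `length`) are invented for the example. Sampling rows with replacement is how a random forest typically gives each tree a different view of the training data.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement; each tree trains on its own sample."""
    return [rng.choice(rows) for _ in rows]

def forest_predict(trees, sample):
    """Classification: every tree votes and the most common label wins."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical rules standing in for three fitted decision trees.
trees = [
    lambda s: "spam" if s["links"] > 3 else "ham",
    lambda s: "spam" if s["caps_ratio"] > 0.5 else "ham",
    lambda s: "spam" if s["length"] < 20 else "ham",
]

message = {"links": 5, "caps_ratio": 0.1, "length": 10}
print(forest_predict(trees, message))  # the votes are spam, ham, spam

rng = random.Random(0)  # fixed seed so the resampling is repeatable
print(bootstrap_sample([1, 2, 3, 4, 5], rng))
```

For a regression task, the same structure would average the trees' numeric outputs instead of counting votes.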

The Python tempfile Module

Introduction Temporary files, or “tempfiles”, are mainly used to store intermediate information on disk for an application. These files are normally created for purposes such as temporary backups, or when the application is dealing with a dataset larger than the system’s memory. Ideally, these files are located in a separate directory, which varies across operating systems, and their names are unique. The data stored in temporary files is not always required after the […]
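
As a minimal sketch of these ideas using the standard library's `tempfile` module: the helper names `round_trip` and `scratch_file_lifecycle` and the file name `intermediate.dat` are illustrative assumptions, not part of the module.

```python
import os
import tempfile

def round_trip(data):
    """Write bytes to an anonymous temporary file and read them back."""
    with tempfile.TemporaryFile() as tmp:  # removed automatically on close
        tmp.write(data)
        tmp.seek(0)  # rewind before reading
        return tmp.read()

def scratch_file_lifecycle():
    """Report whether a scratch file exists inside and after its temp directory."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "intermediate.dat")
        with open(path, "wb") as handle:
            handle.write(b"partial results")
        existed_inside = os.path.exists(path)
    # Leaving the with-block deletes the directory and everything in it.
    return existed_inside, os.path.exists(path)

print(tempfile.gettempdir())     # platform-dependent location
print(round_trip(b"intermediate data"))
print(scratch_file_lifecycle())
```

The location printed by `tempfile.gettempdir()` differs between operating systems, which is why the sketch never hard-codes a path.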

Converting Strings to datetime in Python

Introduction One of the many common problems that we face in software development is handling dates and times. After getting a date-time string from an API, for example, we need to convert it to a human-readable format. Moreover, if the same API is used in different time zones, the conversion will differ. A good date-time library should convert the time according to the time zone. This is just one of many nuances that need to be handled when dealing with dates […]
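
A short sketch of that workflow with the standard `datetime` module follows. The sample string, its format, the assumption that the API reports UTC, and the helper names `parse_utc` and `to_offset` are all illustrative choices for this example.

```python
from datetime import datetime, timedelta, timezone

def parse_utc(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a date-time string and mark it as UTC (assumed source time zone)."""
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)

def to_offset(moment, hours):
    """Express the same instant in a fixed UTC offset."""
    return moment.astimezone(timezone(timedelta(hours=hours)))

parsed = parse_utc("2018-06-29 08:15:27")
local = to_offset(parsed, 2)

print(parsed.isoformat())  # 2018-06-29T08:15:27+00:00
print(local.isoformat())   # 2018-06-29T10:15:27+02:00
print(local.strftime("%d/%m/%Y %H:%M"))  # one possible human-readable form
```

A fixed offset keeps the sketch dependency-free; real time zones with daylight saving rules need a time zone database rather than a constant number of hours.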

The Naive Bayes Algorithm in Python with Scikit-Learn

When studying Probability & Statistics, one of the first and most important theorems students learn is Bayes’ Theorem. This theorem is the foundation of deductive reasoning, which focuses on determining the probability of an event occurring based on prior knowledge of conditions that might be related to the event. The Naive Bayes Classifier brings the power of this theorem to Machine Learning, building a very simple yet powerful classifier. In this article, we will see an overview of how […]
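
The theorem itself fits in a few lines. The sketch below is a worked arithmetic example rather than a Naive Bayes classifier; the function name `posterior` and the input probabilities are made-up values chosen for illustration.

```python
def posterior(prior, likelihood, false_alarm_rate):
    """P(A|B) = P(B|A) * P(A) / P(B), with P(B) expanded over A and not-A."""
    evidence = likelihood * prior + false_alarm_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Made-up numbers: the event has a 1% prior, the evidence appears in 90% of
# cases where the event is true and in 5% of cases where it is not.
p = posterior(prior=0.01, likelihood=0.9, false_alarm_rate=0.05)
print(round(p, 4))  # 0.1538
```

Even with strong evidence, the low prior keeps the updated probability at roughly 15%, which is the kind of prior-aware update the theorem formalizes.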

Hierarchical Clustering with Python and Scikit-Learn

Hierarchical clustering is a type of unsupervised machine learning algorithm used to cluster unlabeled data points. Like K-means clustering, hierarchical clustering groups together data points with similar characteristics. In some cases the results of hierarchical and K-means clustering can be similar. Before implementing hierarchical clustering using Scikit-Learn, let’s first understand the theory behind hierarchical clustering. Theory of Hierarchical Clustering There are two types of hierarchical clustering: Agglomerative and Divisive. In the former, data points are clustered using a […]
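
To make the agglomerative variant concrete, here is a minimal plain-Python sketch on one-dimensional points. It assumes single linkage (the distance between two clusters is the distance between their closest members), which is only one of several possible choices, and it is not how Scikit-Learn implements the algorithm; the function name and data are illustrative.

```python
def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    clusters = [[p] for p in points]  # start with every point on its own
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

print(agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], 3))
# [[1.0, 1.2], [5.0, 5.1], [9.0]]
```

The repeated pairwise search makes this far too slow for real datasets; it is meant only to show the merge order.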

Cross Validation and Grid Search for Model Selection in Python

Introduction A typical machine learning process involves training different models on the dataset and selecting the one with the best performance. However, evaluating the performance of an algorithm is not always a straightforward task. There are several factors that can help you determine which algorithm performs best. One such factor is the performance on the cross-validation set, and another is the choice of parameters for an algorithm. In this article we will explore these two factors in detail. We […]
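
The splitting step behind cross validation can be sketched without any library. The generator below is a simplified illustration of k-fold splitting (no shuffling, no stratification); the name `k_fold_indices` is an assumption for this example and is not the Scikit-Learn API.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    start = 0
    for fold in range(k):
        # The first n_samples % k folds get one extra sample.
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

for train, validation in k_fold_indices(5, 2):
    print("train:", train, "validation:", validation)
# train: [3, 4] validation: [0, 1, 2]
# train: [0, 1, 2] validation: [3, 4]
```

A parameter search would repeat this loop once per candidate setting, average the validation scores across folds, and keep the setting with the best average.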
