Develop an Intuition for Bayes Theorem With Worked Examples

Last Updated on August 19, 2020

Bayes Theorem provides a principled way for calculating a conditional probability. It is a deceptively simple calculation, providing a method that is easy to use for scenarios where our intuition often fails. The best way to develop an intuition for Bayes Theorem is to think about the meaning of the terms in the equation and to apply the calculation many times in a range of different real-world scenarios. This will provide the context for […]
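As a rough illustration of the calculation the excerpt describes, the sketch below applies Bayes Theorem, P(A|B) = P(B|A) * P(A) / P(B), to a diagnostic-test style scenario. The function name `bayes_posterior` and all of the probabilities are hypothetical values chosen for illustration; they are not taken from the post.

```python
def bayes_posterior(p_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) using Bayes Theorem.

    P(B) is expanded with the law of total probability:
    P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    """
    p_not_a = 1.0 - p_a
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a
    return p_b_given_a * p_a / p_b


# Hypothetical numbers (assumptions for this sketch only):
p_condition = 0.0002          # P(A): base rate of the condition
p_pos_given_condition = 0.85  # P(B|A): test is positive when the condition is present
p_pos_given_healthy = 0.05    # P(B|not A): false positive rate

posterior = bayes_posterior(p_condition, p_pos_given_condition, p_pos_given_healthy)
print(f"P(condition | positive test) = {posterior:.4%}")
```

With a base rate this low, the posterior stays well under 1% even though the test looks accurate, which is the kind of result where intuition tends to fail.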

How to Develop Super Learner Ensembles in Python

Last Updated on August 17, 2020

Selecting a machine learning algorithm for a predictive modeling problem involves evaluating many different models and model configurations using k-fold cross-validation. The super learner is an ensemble machine learning algorithm that combines all of the models and model configurations that you might investigate for a predictive modeling problem and uses them to make a prediction that is as good as or better than any single model that you may have investigated. The super learner algorithm is an application […]

Tune Hyperparameters for Classification Machine Learning Algorithms

Last Updated on August 28, 2020

Machine learning algorithms have hyperparameters that allow you to tailor the behavior of the algorithm to your specific dataset. Hyperparameters are different from parameters, which are the internal coefficients or weights for a model found by the learning algorithm. Unlike parameters, hyperparameters are specified by the practitioner when configuring the model. Typically, it is challenging to know what values to use for the hyperparameters of a given algorithm on a given dataset, therefore it […]

How to Transform Target Variables for Regression in Python

Last Updated on August 18, 2020

Data preparation is a big part of applied machine learning. Correctly preparing your training data can mean the difference between mediocre and extraordinary results, even with very simple linear algorithms. Performing data preparation operations, such as scaling, is relatively straightforward for input variables and has been made routine in Python via the Pipeline scikit-learn class. On regression predictive modeling problems where a numerical value must be predicted, it can also be critical to scale […]

Arithmetic, Geometric, and Harmonic Means for Machine Learning

Last Updated on August 19, 2020

Calculating the average of a variable or a list of numbers is a common operation in machine learning. It is an operation you may use every day either directly, such as when summarizing data, or indirectly, such as a smaller step in a larger procedure when fitting a model. The average is a synonym for the mean, a number that represents the expected value (the central tendency) of a probability distribution. As such, there are multiple […]
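To make the three means named in the title concrete, here is a minimal sketch using Python's standard `statistics` module (Python 3.8 or later is assumed for `geometric_mean`). The sample values are hypothetical and chosen only for illustration.

```python
from statistics import mean, geometric_mean, harmonic_mean

# Hypothetical positive values, for illustration only.
values = [1.0, 2.0, 4.0]

# Arithmetic mean: sum of the values divided by their count.
arithmetic = mean(values)

# Geometric mean: n-th root of the product of the n values.
geometric = geometric_mean(values)

# Harmonic mean: count divided by the sum of the reciprocals.
harmonic = harmonic_mean(values)

print("arithmetic:", arithmetic)
print("geometric: ", geometric)
print("harmonic:  ", harmonic)
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean.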

Best Results for Standard Machine Learning Datasets

Last Updated on August 28, 2020

It is important that beginner machine learning practitioners practice on small real-world datasets. So-called standard machine learning datasets contain actual observations, fit into memory, and are well studied and well understood. As such, they can be used by beginner practitioners to quickly test, explore, and practice data preparation and modeling techniques. A practitioner can confirm whether they have the data skills required to achieve a good result on a standard machine learning dataset. A […]

TensorFlow 2 Tutorial: Get Started in Deep Learning With tf.keras

Last Updated on August 27, 2020

Predictive modeling with deep learning is a skill that modern developers need to know. TensorFlow is the premier open-source deep learning framework developed and maintained by Google. Although using TensorFlow directly can be challenging, the modern tf.keras API brings the simplicity and ease of use of Keras to the TensorFlow project. Using tf.keras allows you to design, fit, evaluate, and use deep learning models to make predictions in just a few lines of code. […]

How to Use the ColumnTransformer for Data Preparation

Last Updated on August 18, 2020

You must prepare your raw data using data transforms prior to fitting a machine learning model. This is required to ensure that you best expose the structure of your predictive modeling problem to the learning algorithms. Applying data transforms like scaling or encoding categorical variables is straightforward when all input variables are the same type. It can be challenging when you have a dataset with mixed types and you want to selectively apply data […]

A Gentle Introduction to Imbalanced Classification

Last Updated on January 14, 2020

Classification predictive modeling involves predicting a class label for a given observation. An imbalanced classification problem is an example of a classification problem where the distribution of examples across the known classes is biased or skewed. The distribution can vary from a slight bias to a severe imbalance where there is one example in the minority class for hundreds, thousands, or millions of examples in the majority class or classes. Imbalanced classifications pose a […]

Best Resources for Imbalanced Classification

Last Updated on January 14, 2020

Classification is a predictive modeling problem that involves predicting a class label for a given example. It is generally assumed that the distribution of examples in the training dataset is even across all of the classes. In practice, this is rarely the case. Classification predictive modeling problems where the distribution of examples across class labels is not equal (i.e., is skewed) are called “imbalanced classification.” Typically, a slight imbalance is not a problem and […]
