How to Use Out-of-Fold Predictions in Machine Learning
Last Updated on August 28, 2020
Machine learning algorithms are typically evaluated using resampling techniques such as k-fold cross-validation.
During the k-fold cross-validation procedure, predictions are made on test sets composed of data not used to train the model. These predictions are referred to as out-of-fold predictions, a type of out-of-sample prediction.
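For concreteness, here is a minimal sketch of collecting out-of-fold predictions with scikit-learn; the synthetic dataset, fold count, and logistic regression model are illustrative assumptions, not prescriptions.

```python
# collect out-of-fold predictions manually with k-fold cross-validation
from numpy import zeros
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# synthetic binary classification dataset (illustrative)
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# one out-of-fold prediction will be stored per example
oof_preds = zeros(len(y))

kfold = KFold(n_splits=10, shuffle=True, random_state=1)
for train_ix, test_ix in kfold.split(X):
	model = LogisticRegression()
	# fit on the training folds only
	model.fit(X[train_ix], y[train_ix])
	# predict on the held-out fold: these are the out-of-fold predictions
	oof_preds[test_ix] = model.predict(X[test_ix])

# every example now has a prediction from a model that never saw it in training
print('Out-of-fold accuracy: %.3f' % accuracy_score(y, oof_preds))
```

The same result can be obtained in a single call with scikit-learn's cross_val_predict() function, which returns exactly one out-of-fold prediction per example in the dataset.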
Out-of-fold predictions play an important role in machine learning, both in estimating the performance of a model when making predictions on new data in the future (the so-called generalization performance of the model) and in the development of ensemble models.
In this tutorial, you will discover a gentle introduction to out-of-fold predictions in machine learning.
After completing this tutorial, you will know:
- Out-of-fold predictions are a type of out-of-sample prediction made on data not used to train a model.
- Out-of-fold predictions are most commonly used to estimate the performance of a model when making predictions on unseen data.
- Out-of-fold predictions can be used to construct an ensemble model called a stacked generalization or stacking ensemble, as sketched after this list.
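For example, here is a minimal sketch of a stacking ensemble built from out-of-fold predictions; the base models (k-nearest neighbors and a decision tree) and the logistic regression meta-model are illustrative choices, not requirements.

```python
# stacking: out-of-fold predictions from base models become meta-model inputs
from numpy import column_stack
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# synthetic dataset split into train and test portions (illustrative)
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

base_models = [KNeighborsClassifier(), DecisionTreeClassifier(random_state=1)]

# out-of-fold predictions from each base model form the meta-model's training
# features, so the meta-model never learns from predictions a base model made
# on its own training data
meta_X = column_stack([cross_val_predict(m, X_train, y_train, cv=10) for m in base_models])
meta_model = LogisticRegression().fit(meta_X, y_train)

# refit the base models on all training data for use at prediction time
for m in base_models:
	m.fit(X_train, y_train)

# ensemble prediction: base model outputs feed the meta-model
meta_X_test = column_stack([m.predict(X_test) for m in base_models])
print('Stacking accuracy: %.3f' % accuracy_score(y_test, meta_model.predict(meta_X_test)))
```

Note that scikit-learn v0.22 also provides StackingClassifier and StackingRegressor classes that perform this procedure internally.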
Let’s get started.
- Update Jan/2020: Updated for changes in scikit-learn v0.22 API.