How to Use Out-of-Fold Predictions in Machine Learning
Last Updated on August 28, 2020
Machine learning algorithms are typically evaluated using resampling techniques such as k-fold cross-validation.
During the k-fold cross-validation procedure, predictions are made on test sets composed of data not used to train the model. These predictions are referred to as out-of-fold predictions, a type of out-of-sample prediction.
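For concreteness, here is a minimal sketch of collecting out-of-fold predictions with scikit-learn; the synthetic dataset, fold count, and logistic regression model are illustrative assumptions, not prescriptions.

```python
# collect out-of-fold predictions manually with k-fold cross-validation
from numpy import zeros
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# synthetic binary classification dataset (illustrative)
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# one out-of-fold prediction will be stored per example
oof_preds = zeros(len(y))

kfold = KFold(n_splits=10, shuffle=True, random_state=1)
for train_ix, test_ix in kfold.split(X):
	model = LogisticRegression()
	# fit on the training folds only
	model.fit(X[train_ix], y[train_ix])
	# predict on the held-out fold: these are the out-of-fold predictions
	oof_preds[test_ix] = model.predict(X[test_ix])

# every example now has a prediction from a model that never saw it in training
print('Out-of-fold accuracy: %.3f' % accuracy_score(y, oof_preds))
```

The same result can be obtained in a single call with scikit-learn's cross_val_predict() function, which returns exactly one out-of-fold prediction per example in the dataset.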
Out-of-fold predictions play an important role in machine learning, both in estimating the performance of a model when making predictions on new data in the future (the so-called generalization performance of the model) and in the development of ensemble models.
In this tutorial, you will discover a gentle introduction to out-of-fold predictions in machine learning.
After completing this tutorial, you will know:
- Out-of-fold predictions are a type of out-of-sample prediction made on data not used to train a model.
- Out-of-fold predictions are most commonly used to estimate the performance of a model when making predictions on unseen data.
- Out-of-fold predictions can be used to construct an ensemble model called a stacked generalization or stacking ensemble, as sketched after this list.
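For example, here is a minimal sketch of a stacking ensemble built from out-of-fold predictions; the base models (k-nearest neighbors and a decision tree) and the logistic regression meta-model are illustrative choices, not requirements.

```python
# stacking: out-of-fold predictions from base models become meta-model inputs
from numpy import column_stack
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# synthetic dataset split into train and test portions (illustrative)
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

base_models = [KNeighborsClassifier(), DecisionTreeClassifier(random_state=1)]

# out-of-fold predictions from each base model form the meta-model's training
# features, so the meta-model never learns from predictions a base model made
# on its own training data
meta_X = column_stack([cross_val_predict(m, X_train, y_train, cv=10) for m in base_models])
meta_model = LogisticRegression().fit(meta_X, y_train)

# refit the base models on all training data for use at prediction time
for m in base_models:
	m.fit(X_train, y_train)

# ensemble prediction: base model outputs feed the meta-model
meta_X_test = column_stack([m.predict(X_test) for m in base_models])
print('Stacking accuracy: %.3f' % accuracy_score(y_test, meta_model.predict(meta_X_test)))
```

Note that scikit-learn v0.22 also provides StackingClassifier and StackingRegressor classes that perform this procedure internally.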
Let’s get started.
- Update Jan/2020: Updated for changes in scikit-learn v0.22 API.