How to Implement Resampling Methods From Scratch In Python
Last Updated on August 13, 2019
The goal of predictive modeling is to create models that make good predictions on new data.
We don’t have access to this new data at the time of training, so we must use statistical methods to estimate the performance of a model on new data.
These methods are called resampling methods because they resample your available training data.
In this tutorial, you will discover how to implement resampling methods from scratch in Python.
After completing this tutorial, you will know:
- How to implement a train and test split of your data.
- How to implement a k-fold cross validation split of your data.
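As a preview of the first item, here is a minimal sketch of a train and test split in pure Python. The function name `train_test_split`, the default `split=0.60`, and the `seed` parameter are assumptions made for this illustration, not necessarily what the tutorial uses.

```python
from random import Random

def train_test_split(dataset, split=0.60, seed=1):
    """Randomly move a fraction of the rows into a training list.

    Hypothetical helper for illustration; name and defaults are assumptions.
    """
    rng = Random(seed)                # local generator so runs are repeatable
    remaining = list(dataset)         # copy so the caller's list is not modified
    train_size = int(split * len(dataset))
    train = []
    while len(train) < train_size:
        index = rng.randrange(len(remaining))
        train.append(remaining.pop(index))  # each row is selected at most once
    return train, remaining           # rows left over form the test set

dataset = [[i] for i in range(1, 11)]
train, test = train_test_split(dataset)
print(len(train), len(test))
```

Rows are removed from the copy as they are chosen, so no row can appear in both the training and the test set.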
Let’s get started.
- Update Jan/2017: Changed the calculation of fold_size in cross_validation_split() to always be an integer. Fixes issues with Python 3.
- Update May/2018: Fixed typo re LOOCV.
- Update Aug/2018: Tested and updated to work with Python 3.6.
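To see why the Jan/2017 update matters, recall that in Python 3 the `/` operator always returns a float, so a fold size computed with it would not be a whole number. Below is a minimal sketch of a k-fold split that keeps `fold_size` an integer. The signature, the `seed` parameter, and the function body are assumptions for illustration and may differ from the tutorial's own code.

```python
from random import Random

def cross_validation_split(dataset, folds=3, seed=1):
    """Split rows into `folds` lists of equal size.

    Illustrative sketch only; signature and body are assumptions.
    """
    rng = Random(seed)
    remaining = list(dataset)           # copy so the original stays intact
    fold_size = len(dataset) // folds   # floor division keeps this an integer
    dataset_split = []
    for _ in range(folds):
        fold = []
        while len(fold) < fold_size:
            index = rng.randrange(len(remaining))
            fold.append(remaining.pop(index))
        dataset_split.append(fold)
    return dataset_split                # rows that do not divide evenly are left out

dataset = [[i] for i in range(1, 11)]
folds = cross_validation_split(dataset, folds=4)
print([len(fold) for fold in folds])  # [2, 2, 2, 2]
```

With 10 rows and 4 folds, each fold gets 10 // 4 = 2 rows and the 2 leftover rows are not used in any fold.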