Bagging and Random Forest for Imbalanced Classification
Last Updated on August 21, 2020
Bagging is an ensemble algorithm that fits multiple models on different subsets of a training dataset, then combines the predictions from all models.
Random forest is an extension of bagging that also randomly selects a subset of features to consider at each split point when constructing each tree. Both bagging and random forests have proven effective on a wide range of predictive modeling problems.
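As a quick illustration, here is a minimal sketch of both ensembles using scikit-learn; the synthetic dataset and the model parameters shown are illustrative assumptions, not a prescription.

```python
# Minimal sketch of standard bagging and random forest with scikit-learn.
# The synthetic dataset and parameters here are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# synthetic (balanced) binary classification dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# bagging: each model is fit on a bootstrap sample of the training data
bagging = BaggingClassifier(n_estimators=10, random_state=1)
print('Bagging accuracy: %.3f' % cross_val_score(bagging, X, y, cv=5).mean())

# random forest: bagging plus random feature selection at each split point
forest = RandomForestClassifier(n_estimators=10, random_state=1)
print('Random forest accuracy: %.3f' % cross_val_score(forest, X, y, cv=5).mean())
```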
Although effective, they are not well suited to classification problems with a skewed class distribution. Nevertheless, many modifications to the algorithms have been proposed that adapt their behavior and make them better suited to a severe class imbalance.
In this tutorial, you will discover how to use bagging and random forest for imbalanced classification.
After completing this tutorial, you will know:
- How to use Bagging with random undersampling for imbalanced classification.
- How to use Random Forest with class weighting and random undersampling for imbalanced classification.
- How to use the Easy Ensemble that combines bagging and boosting for imbalanced classification (see the sketch after this list).
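For orientation, the following is a minimal sketch of these techniques using scikit-learn and the imbalanced-learn library; the 1:100 synthetic dataset and the default parameters are assumptions chosen for illustration only.

```python
# Minimal sketch of the techniques covered in this tutorial, using
# scikit-learn and the imbalanced-learn library.
# The 1:100 synthetic dataset and all parameters are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, RepeatedStratifiedKFold
from imblearn.ensemble import (BalancedBaggingClassifier,
                               BalancedRandomForestClassifier,
                               EasyEnsembleClassifier)

# synthetic binary dataset with a severe 1:100 class imbalance
X, y = make_classification(n_samples=10000, n_features=20,
                           n_clusters_per_class=1, weights=[0.99],
                           flip_y=0, random_state=1)

# stratified cross-validation preserves the class ratio in each fold
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)

models = {
    # bagging with random undersampling of the majority class per bootstrap
    'Balanced Bagging': BalancedBaggingClassifier(random_state=1),
    # random forest with class weighting (standard scikit-learn option)
    'Weighted Random Forest': RandomForestClassifier(
        class_weight='balanced', random_state=1),
    # random forest with random undersampling per bootstrap
    'Balanced Random Forest': BalancedRandomForestClassifier(random_state=1),
    # Easy Ensemble: boosted learners fit on balanced undersampled subsets
    'Easy Ensemble': EasyEnsembleClassifier(random_state=1),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, scoring='roc_auc', cv=cv, n_jobs=-1)
    print('%s: mean ROC AUC = %.3f' % (name, scores.mean()))
```

Each of these is covered in detail in the sections that follow.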
Kick-start your project with my new book Imbalanced Classification with Python, including step-by-step tutorials and the Python source code files for all examples.
Let’s get started.