Standard Machine Learning Datasets for Imbalanced Classification
Last Updated on January 14, 2020
An imbalanced classification problem is a predictive modeling problem in which the distribution of class labels in the training dataset is skewed.
Many real-world classification problems have an imbalanced class distribution, so it is important for machine learning practitioners to become familiar with working with these types of problems.
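To make the idea of a skewed class distribution concrete, the short sketch below summarizes the number and percentage of examples in each class. The label list and its 99-to-1 split are hypothetical values chosen for illustration, and class_distribution() is a helper name introduced here, not part of any library.

```python
# summarize the class distribution of a (hypothetical) label vector
from collections import Counter


def class_distribution(labels):
    """Return {label: (count, percentage)} for a sequence of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


# assumed example data: 990 examples of class 0 and 10 examples of class 1
y = [0] * 990 + [1] * 10

# report each class with its count and share of the dataset
for label, (n, pct) in sorted(class_distribution(y).items()):
    print(f"Class={label}, Count={n}, Percentage={pct:.1f}%")
```

Running a summary like this on a real target column is a quick way to check how severe the imbalance is before choosing a model or an evaluation metric.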
In this tutorial, you will discover a suite of standard machine learning datasets for imbalanced classification.
After completing this tutorial, you will know:
- Standard machine learning datasets with an imbalance of two classes.
- Standard datasets for multiclass classification with a skewed class distribution.
- Popular imbalanced classification datasets used for machine learning competitions.
Let’s get started.
Tutorial Overview
This tutorial is divided into three parts; they are:
- Binary Classification Datasets
- Multiclass Classification Datasets