Rescaling Data for Machine Learning in Python with Scikit-Learn
Last Updated on June 30, 2020
Your data must be prepared before you can build models. The data preparation process can involve three steps: data selection, data preprocessing and data transformation.
In this post, you will discover two simple data transformation methods that you can apply to your data in Python using scikit-learn.
Let’s get started.
Update: See this post for a more up-to-date set of examples.
Data Rescaling
Your preprocessed data may contain attributes with a mixture of scales for various quantities such as dollars, kilograms and sales volume.
Many machine learning methods expect the data attributes to share the same scale, or are more effective when they do. Two popular data scaling methods are normalization and standardization.
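To make the two methods concrete, here is a minimal sketch on a small made-up dataset. The values in X are hypothetical, with columns standing in for dollars, kilograms and sales volume. It assumes that normalization means rescaling each attribute to the range 0 to 1 and uses scikit-learn's MinMaxScaler for that, and it uses StandardScaler for standardization. Treat it as one reasonable way to do this rather than the only one.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical toy data: columns are dollars, kilograms and sales volume.
X = np.array([
    [1200.0, 2.5, 30.0],
    [450.0, 10.0, 120.0],
    [9800.0, 0.8, 15.0],
    [3100.0, 5.2, 60.0],
])

# Normalization (assumed here to mean min-max rescaling):
# each column is mapped so its minimum is 0 and its maximum is 1.
X_normalized = MinMaxScaler().fit_transform(X)

# Standardization: each column is shifted to zero mean
# and scaled to unit standard deviation.
X_standardized = StandardScaler().fit_transform(X)

# Per-column summaries to confirm the attributes now share a scale.
print("normalized min:", X_normalized.min(axis=0))
print("normalized max:", X_normalized.max(axis=0))
print("standardized mean:", X_standardized.mean(axis=0).round(6))
print("standardized std:", X_standardized.std(axis=0).round(6))
```

Both scalers work column by column, so the dollar values no longer dominate the other attributes simply because they are larger numbers. In a real project you would fit the scaler on the training data only and reuse it to transform new data.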