How to Choose Data Preparation Methods for Machine Learning
Last Updated on July 15, 2020
Data preparation is an important part of a predictive modeling project.
Correct application of data preparation transforms raw data into a representation that allows learning algorithms to get the most out of the data and make skillful predictions. The problem is that choosing a transform, or sequence of transforms, that results in a useful representation is very challenging, so much so that it may be considered more of an art than a science.
In this tutorial, you will discover strategies that you can use to select data preparation techniques for your predictive modeling datasets.
After completing this tutorial, you will know:
- Data preparation techniques can be chosen based on detailed knowledge of the dataset and algorithm; this is the most common approach.
- Data preparation techniques can be grid searched as just another hyperparameter in the modeling pipeline.
- Data transforms can be applied to a training dataset in parallel to create many extracted features on which feature selection can be applied and a model trained.
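The second and third strategies can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the tutorial's own code: the synthetic dataset, the logistic regression model, the candidate transforms, and the step names (`prep`, `features`, `select`, `model`) are all chosen here for demonstration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in for a real dataset (an assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=1
)

# Grid search strategy: the data preparation step is just another
# hyperparameter of the modeling pipeline.
pipeline = Pipeline([
    ("prep", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
# Candidate values for the "prep" step; "passthrough" means no transform.
grid = {"prep": [StandardScaler(), MinMaxScaler(), "passthrough"]}
search = GridSearchCV(pipeline, grid, scoring="accuracy", cv=5)
search.fit(X, y)
print("Best preparation step:", search.best_params_["prep"])
print("Best mean accuracy: %.3f" % search.best_score_)

# Parallel strategy: apply several transforms side by side, concatenate
# their outputs as extracted features, then select a subset for the model.
union = FeatureUnion([
    ("std", StandardScaler()),
    ("minmax", MinMaxScaler()),
    ("pca", PCA(n_components=5)),
])
parallel = Pipeline([
    ("features", union),                     # 10 + 10 + 5 = 25 columns
    ("select", SelectKBest(f_classif, k=8)), # keep the 8 highest-scoring
    ("model", LogisticRegression(max_iter=1000)),
])
scores = cross_val_score(parallel, X, y, scoring="accuracy", cv=5)
print("Parallel transforms mean accuracy: %.3f" % scores.mean())
```

Placing the transforms inside a `Pipeline` means they are fit only on the training folds during cross-validation, which avoids leaking information from the held-out fold into the data preparation.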
Kick-start your project with my new book Data Preparation for Machine Learning, including step-by-step tutorials and the Python source code files for all examples.
Let’s get started.