Improve Model Accuracy with Data Pre-Processing
Last Updated on August 15, 2020
Data preparation can make or break the predictive ability of your model.
In Chapter 3 of their book Applied Predictive Modeling, Kuhn and Johnson introduce the process of data preparation. They refer to it as the addition, deletion or transformation of training set data.
In this post you will discover the data pre-process steps that you can use to improve the predictive ability of your models.
Kick-start your project with my new book Data Preparation for Machine Learning, including step-by-step tutorials and the Python source code files for all examples.
Let’s get started.
Data Preparation
You must pre-process your raw data before you model your problem. The specific preparation may depend on the data that you have available and the machine learning algorithms you want to use.
Sometimes, pre-processing of data can lead to unexpected improvements in model accuracy. This may be because a relationship in the data has been simplified or unobscured.