What Is Data Preparation in a Machine Learning Project

Last Updated on June 30, 2020

Data preparation may be one of the most difficult steps in any machine learning project.

The reason is that each dataset is different and highly specific to the project. Nevertheless, there are enough commonalities across predictive modeling projects that we can define a loose sequence of steps and subtasks that you are likely to perform.

This process provides a context for deciding what data preparation a project requires. That decision is informed both by the project definition, which comes before data preparation, and by the evaluation of machine learning algorithms, which comes after.

In this tutorial, you will discover how to consider data preparation as a step in a broader predictive modeling machine learning project.

After completing this tutorial, you will know:

  • Each predictive modeling project with machine learning is different, but there are common steps performed on each project.
  • Data preparation involves transforming the data so that it best exposes the unknown underlying structure of the problem to learning algorithms.
  • The steps before and after data preparation in a project can inform what data preparation methods to apply, or at least explore.
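
To make the last point concrete, here is a minimal sketch of how the evaluation step that follows data preparation can guide the choice of preparation method. It is an illustration under stated assumptions, not the method used later in the tutorial: the dataset is synthetic, the helper names (make_dataset, identity_fit, standardize_fit, loo_accuracy) are hypothetical, and a 1-nearest-neighbor classifier stands in for whatever algorithm a real project would evaluate. It uses only the Python standard library.

```python
import random
from statistics import mean, pstdev


def make_dataset(n=60, seed=1):
    """Synthetic two-class data (an assumption made for illustration).

    Feature 0 separates the classes on a small scale.
    Feature 1 is pure noise on a much larger scale.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        informative = rng.gauss(label * 4.0, 0.5)
        noise = rng.gauss(0.0, 1000.0)
        rows.append(([informative, noise], label))
    return rows


def identity_fit(train_X):
    """Candidate 1: no data preparation at all."""
    return lambda row: list(row)


def standardize_fit(train_X):
    """Candidate 2: standardize each column using training statistics only."""
    cols = list(zip(*train_X))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against zero spread
    return lambda row: [(v - m) / s for v, m, s in zip(row, mus, sds)]


def predict_1nn(train_X, train_y, row):
    """Return the label of the closest training row (squared Euclidean distance)."""
    nearest = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], row)),
    )
    return train_y[nearest]


def loo_accuracy(data, fit_prep):
    """Leave-one-out accuracy; the preparation is refit on each training fold."""
    correct = 0
    for i, (row, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        train_X = [x for x, _ in train]
        train_y = [y for _, y in train]
        prep = fit_prep(train_X)  # fit on training rows only, never the held-out row
        prepared = [prep(x) for x in train_X]
        if predict_1nn(prepared, train_y, prep(row)) == label:
            correct += 1
    return correct / len(data)


data = make_dataset()
candidates = [("none", identity_fit), ("standardize", standardize_fit)]
results = {name: loo_accuracy(data, fit) for name, fit in candidates}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Because the nearest-neighbor model relies on distances, the large-scale noise feature dominates when the data is left unprepared, so the evaluation score should favor the standardized version. The preparation is fit inside each fold so that the held-out row never influences the statistics used to transform it.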
