How to Use Feature Extraction on Tabular Data for Machine Learning

Last Updated on August 17, 2020

Machine learning predictive modeling performance is only as good as your data, and your data is only as good as the way you prepare it for modeling.

The most common approach to data preparation is to study a dataset and review the expectations of a machine learning algorithm, then carefully choose the most appropriate data preparation techniques to transform the raw data to best meet the expectations of the algorithm. This is slow, expensive, and requires a vast amount of expertise.

An alternative approach is to apply a suite of common, generally useful data preparation techniques to the raw data in parallel, then combine the outputs of all the transforms into a single large dataset on which a model can be fit and evaluated.
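As a rough illustration of applying transforms in parallel and concatenating their outputs, here is a minimal sketch using scikit-learn's FeatureUnion. The synthetic dataset, the particular transforms, and their settings are assumptions chosen for the example, not a prescribed recipe.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import (
    MinMaxScaler,
    QuantileTransformer,
    RobustScaler,
    StandardScaler,
)

# Synthetic tabular data standing in for a real dataset (assumption).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, random_state=1
)

# Each transform is fit on the same raw input; the outputs are stacked
# column-wise into one wide feature matrix.
union = FeatureUnion([
    ("minmax", MinMaxScaler()),
    ("standard", StandardScaler()),
    ("robust", RobustScaler()),
    ("quantile", QuantileTransformer(n_quantiles=100, output_distribution="normal")),
    ("pca", PCA(n_components=5)),
])

X_wide = union.fit_transform(X)

# Four transforms keep all 10 columns and PCA adds 5: 4 * 10 + 5 = 45.
print(X.shape, X_wide.shape)
```

The combined matrix is deliberately redundant: the same raw columns appear in several rescaled forms, and it is left to the downstream model to decide which of them are useful.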

This is an alternative philosophy for data preparation: it treats data transforms as a way to extract salient features from raw data, exposing the structure of the problem to the learning algorithm. It requires learning algorithms that scale to a large number of input features and that can weight or select the input features most relevant to the target being predicted.
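One way to meet that requirement is to place a feature selection step between the combined transforms and the model. The sketch below assumes scikit-learn, a synthetic dataset, recursive feature elimination (RFE) as the selector, and logistic regression as the model; all of these are illustrative choices, and the number of features kept (8) is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic tabular data standing in for a real dataset (assumption).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, random_state=1
)

# Two transforms applied in parallel produce 20 candidate features.
features = FeatureUnion([
    ("minmax", MinMaxScaler()),
    ("standard", StandardScaler()),
])

# RFE keeps the 8 candidate features the wrapped model ranks highest,
# then a final model is fit on that subset.
pipeline = Pipeline([
    ("features", features),
    ("select", RFE(estimator=LogisticRegression(solver="liblinear"),
                   n_features_to_select=8)),
    ("model", LogisticRegression(solver="liblinear")),
])

# Transforms and selection are re-fit inside each fold, which avoids
# leaking information from the held-out data.
scores = cross_val_score(pipeline, X, y, scoring="accuracy", cv=5)
print("Mean accuracy: %.3f" % scores.mean())

# Fit once on all data to inspect how many features survived selection.
pipeline.fit(X, y)
n_selected = pipeline.named_steps["select"].n_features_
print("Selected features:", n_selected)
```

Wrapping everything in a single Pipeline keeps the preparation, selection, and model as one estimator, so the whole procedure can be evaluated and compared against a baseline fit on the raw data.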

