Principal Component Analysis for Dimensionality Reduction in Python
Last Updated on August 18, 2020
Reducing the number of input variables for a predictive model is referred to as dimensionality reduction.
Fewer input variables can result in a simpler predictive model that may have better performance when making predictions on new data.
Perhaps the most popular technique for dimensionality reduction in machine learning is Principal Component Analysis, or PCA for short. This is a technique that comes from the field of linear algebra and can be used as a data preparation technique to create a projection of a dataset prior to fitting a model.
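To make the linear algebra concrete, here is a minimal sketch of a PCA projection written directly with NumPy. It assumes the common formulation of PCA as a singular value decomposition of the mean-centered data. The function name pca_project and the random dataset are hypothetical and used only for illustration, and in practice you would likely use a library implementation instead.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto the top n_components principal directions."""
    # Center each column so the components describe variance around the mean.
    mean = X.mean(axis=0)
    X_centered = X - mean
    # SVD of the centered data: the rows of Vt are the principal directions,
    # ordered from largest to smallest singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Sample variance captured by each retained component.
    explained_variance = singular_values[:n_components] ** 2 / (X.shape[0] - 1)
    # The projection is the centered data expressed in the new, smaller basis.
    return X_centered @ components.T, components, explained_variance

# Hypothetical data: 100 rows and 5 input variables of random noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))

X_reduced, components, explained_variance = pca_project(X, n_components=2)
print(X_reduced.shape)  # (100, 2)
```

The projected dataset keeps the same number of rows but has only two columns, which is the sense in which PCA reduces dimensionality.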
In this tutorial, you will discover how to use PCA for dimensionality reduction when developing predictive models.
After completing this tutorial, you will know:
- Dimensionality reduction involves reducing the number of input variables or columns in modeling data.
- PCA is a technique from linear algebra that can be used to automatically perform dimensionality reduction.
- How to evaluate predictive models that use a PCA projection as input and make predictions with new raw data.
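As a preview of the last point, the sketch below shows one way this workflow could look with scikit-learn. The synthetic dataset, the choice of 10 components, and the logistic regression model are all assumptions made for illustration, not a recommended configuration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic binary classification data (an assumption for this sketch).
X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_redundant=10, random_state=7)

# Placing PCA inside the pipeline means the projection is fit only on the
# training portion of each cross-validation fold, which avoids data leakage.
model = Pipeline(steps=[
    ("pca", PCA(n_components=10)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Evaluate the whole pipeline, not just the classifier.
scores = cross_val_score(model, X, y, scoring="accuracy", cv=5)
print("Mean accuracy: %.3f" % scores.mean())

# Fit on all available data, then predict from a raw (unprojected) row.
# The pipeline applies the learned PCA projection before the classifier.
model.fit(X, y)
new_row = X[:1]
prediction = model.predict(new_row)
print("Predicted class: %d" % prediction[0])
```

Because the projection lives inside the pipeline, new data can be passed in with its original 20 columns and no manual transform step is needed.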
Let’s get started.