Feature Importance and Feature Selection With XGBoost in Python
Last Updated on August 27, 2020
A benefit of using ensembles of decision trees, such as gradient boosting, is that they can automatically provide estimates of feature importance from a trained predictive model.
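For example, a trained XGBoost model exposes these scores directly through its feature_importances_ property. Here is a minimal sketch, assuming a synthetic dataset generated with scikit-learn's make_classification purely for illustration:

```python
# Minimal sketch: reading built-in importance scores from a trained model.
# The synthetic dataset is an illustrative assumption, not the post's data.
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# small synthetic binary classification problem with 5 input features
X, y = make_classification(n_samples=100, n_features=5, random_state=7)

# fit a gradient boosting model
model = XGBClassifier()
model.fit(X, y)

# one importance score per input feature
print(model.feature_importances_)
```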
In this post, you will discover how to estimate the importance of features for a predictive modeling problem using the XGBoost library in Python.
After reading this post, you will know:
- How feature importance is calculated using the gradient boosting algorithm.
- How to plot feature importance in Python calculated by the XGBoost model.
- How to use feature importance calculated by XGBoost to perform feature selection, as sketched after this list.
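As a quick preview of the plotting and selection steps above, the sketch below fits a model, plots its importance scores with XGBoost's plot_importance() function, and then uses scikit-learn's SelectFromModel to keep features above a threshold. The synthetic dataset and the 0.1 threshold are illustrative assumptions, not values from this post:

```python
# Minimal sketch: plot XGBoost feature importance and use it for selection.
# The synthetic dataset and the 0.1 threshold are illustrative assumptions.
from matplotlib import pyplot
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from xgboost import XGBClassifier, plot_importance

# synthetic binary classification problem with 5 input features
X, y = make_classification(n_samples=100, n_features=5, random_state=7)

# fit the gradient boosting model
model = XGBClassifier()
model.fit(X, y)

# bar chart of importance scores, one bar per input feature
plot_importance(model)
pyplot.show()

# keep only the features whose importance exceeds the threshold
selection = SelectFromModel(model, threshold=0.1, prefit=True)
X_selected = selection.transform(X)
print(X_selected.shape)
```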
Kick-start your project with my new book XGBoost With Python, including step-by-step tutorials and the Python source code files for all examples.
Let’s get started.
- Update Jan/2017: Updated to reflect changes in scikit-learn API version 0.18.1.
- Update Mar/2018: Added alternate link to download the dataset as the original appears to have been taken down.
- Update Apr/2020: Updated example for XGBoost 1.0.2.