How to Calculate the 5-Number Summary for Your Data in Python
Last Updated on August 8, 2019
Data summarization provides a convenient way to describe all of the values in a data sample with just a few statistical values.
The mean and standard deviation are commonly used to summarize data with a Gaussian distribution, but they may not be meaningful, and could even be misleading, if your data sample has a non-Gaussian distribution.
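As a quick illustration of how the mean can mislead, here is a minimal sketch using Python's standard `statistics` module. The sample values are made up for this example: a small, skewed sample with a single large outlier.

```python
# Illustrative only: a hypothetical skewed sample with one large outlier.
from statistics import mean, median, stdev

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 100]

mean_value = mean(data)      # pulled upward by the outlier
median_value = median(data)  # stays with the bulk of the values
std_value = stdev(data)      # sample standard deviation, inflated by the outlier

print('Mean:   %.1f' % mean_value)
print('Median: %.1f' % median_value)
print('Stdev:  %.1f' % std_value)
```

The mean here is 12.7, even though nine of the ten values are 5 or less, while the median is 3.0.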
In this tutorial, you will discover the five-number summary for describing the distribution of a data sample without assuming a specific data distribution.
After completing this tutorial, you will know:
- Summary statistics such as the mean and standard deviation are most meaningful for data with a Gaussian distribution.
- The five-number summary can be used to describe a data sample with any distribution.
- How to calculate the five-number summary in Python.
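To give a flavor of the last point, here is a minimal sketch of a five-number summary (minimum, first quartile, median, third quartile, maximum). It assumes Python 3.8 or later and uses the standard library's `statistics.quantiles`. The function name `five_number_summary` and the sample data are hypothetical and chosen only for illustration.

```python
# A minimal sketch, not a definitive implementation.
# Assumes Python 3.8+ for statistics.quantiles.
from statistics import quantiles


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a non-empty sequence of numbers."""
    # n=4 splits the data into quartiles; method='inclusive' treats the sample
    # minimum and maximum as the 0th and 100th percentiles and interpolates
    # linearly between observations.
    q1, q2, q3 = quantiles(data, n=4, method='inclusive')
    return min(data), q1, q2, q3, max(data)


# Hypothetical sample used only for this example.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
smallest, q1, med, q3, largest = five_number_summary(sample)
print('Min: %.1f' % smallest)
print('Q1: %.1f' % q1)
print('Median: %.1f' % med)
print('Q3: %.1f' % q3)
print('Max: %.1f' % largest)
```

Note that there are several conventions for computing quartiles, so other tools or settings may report slightly different values for Q1 and Q3 on the same data.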
Let’s get started.