Calculating the Pearson Correlation Coefficient in Python with NumPy
Introduction
This article introduces the Pearson correlation coefficient, shows how to calculate it by hand, and shows how to compute it with Python’s NumPy library.
The Pearson correlation coefficient measures the strength and direction of the linear association between two variables. It always lies between -1 and +1, and as a rough guide its value can be interpreted as follows:
- +1 – Complete positive correlation
- +0.8 – Strong positive correlation
- +0.6 – Moderate positive correlation
- 0 – No linear correlation
- -0.6 – Moderate negative correlation
- -0.8 – Strong negative correlation
- -1 – Complete negative correlation
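As a quick check of the two extremes in this list, the sketch below computes the coefficient for data that lie exactly on a straight line. The arrays and the slopes and intercepts are made-up illustrative values, not data from this article; `np.corrcoef` returns a 2x2 correlation matrix, and the coefficient between the two inputs is the off-diagonal entry.

```python
import numpy as np

# Illustrative data (assumed for this sketch): five evenly spaced points.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# y rises exactly linearly with x, so the coefficient should be +1.
y_up = 2.0 * x + 1.0
# y falls exactly linearly with x, so the coefficient should be -1.
y_down = -3.0 * x + 10.0

# np.corrcoef returns the full correlation matrix; entry [0, 1] is r(x, y).
r_up = np.corrcoef(x, y_up)[0, 1]
r_down = np.corrcoef(x, y_down)[0, 1]

print(f"rising line:  r = {r_up:.3f}")
print(f"falling line: r = {r_down:.3f}")
```

Real data rarely fall exactly on a line, so values strictly between these extremes are the usual case.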
We’ll illustrate how the correlation coefficient varies with different types of association. We’ll also show that zero correlation does not always mean zero association: variables that are related non-linearly may have a correlation coefficient close to zero.
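A minimal sketch of that last point, assuming a hypothetical quadratic relationship chosen only for illustration: y is completely determined by x, yet because x is spread symmetrically around zero, the positive and negative halves cancel and the linear correlation is essentially zero.

```python
import numpy as np

# Illustrative data (assumed for this sketch): x is symmetric around zero.
x = np.linspace(-1.0, 1.0, 201)

# y depends on x exactly, but the dependence is quadratic, not linear.
y = x ** 2

# Entry [0, 1] of the correlation matrix is the coefficient between x and y.
r = np.corrcoef(x, y)[0, 1]

print(f"r = {r:.6f}")  # very close to zero despite the exact dependence
```

The result depends on the symmetry of x: restricting x to positive values only would make the same quadratic relationship look strongly positively correlated.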
What Is the Pearson Correlation Coefficient?
The Pearson correlation coefficient is also known as the Pearson product-moment correlation coefficient. It is a measure of the linear relationship between two random variables, X and Y. Mathematically, if