# 4. Covariance and Correlation

Covariance and correlation describe how two variables are related.

• Variables are positively related if they move in the same direction.
• Variables are inversely related if they move in opposite directions.

Both covariance and correlation indicate whether variables are positively or inversely related. Correlation also tells you the degree to which the variables tend to move together.

## 4.1. Covariance

Covariance measures how two variables vary together. It extends the concept of variance, which describes how a single variable varies. It can take any value from $$-{\infty}$$ to $$+{\infty}$$.

• A positive number signifies positive covariance and denotes a direct relationship: an increase in one variable tends to be accompanied by an increase in the other, provided other conditions remain constant.
• A negative number signifies negative covariance and denotes an inverse relationship: an increase in one variable tends to be accompanied by a decrease in the other.
• Covariance reliably indicates the direction of a relationship, but its magnitude is hard to interpret because it depends on the units of both variables. A larger value does not by itself indicate a stronger relationship.

$\operatorname{COV_{x,y}} = \frac{\sum^{n}_{i=1}(x_{i}-\bar{x})(y_{i}-\bar{y})}{n-1} \qquad (1)$

where,

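• $$x_{i}$$ = the i-th observation of x, conventionally the independent variable
• $$y_{i}$$ = the i-th observation of y, conventionally the dependent variable (the formula is symmetric, so swapping x and y gives the same value)
• n = number of data points in the sample
• $$\bar{x}$$ = the sample mean of x
• $$\bar{y}$$ = the sample mean of y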
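
As a minimal sketch of equation (1), the sample covariance can be computed directly from its definition. The function name `sample_covariance` and the data values are illustrative assumptions, not part of the original text.

```python
from statistics import mean


def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences, following equation (1)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two data points are required")
    x_bar, y_bar = mean(x), mean(y)
    # Sum the products of paired deviations from the means, then divide by n - 1.
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical data: the two variables mostly rise together.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
print(sample_covariance(hours, scores))        # 1.5

# Hypothetical data: one variable falls as the other rises.
print(sample_covariance([1, 2, 3], [3, 2, 1]))  # -1.0
```

The sign of each result gives the direction of the relationship. The magnitudes (1.5 and 1.0) depend on the units of the data and are not comparable across data sets.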

## 4.2. Correlation

Correlation is another way to determine how two variables are related. In addition to telling you whether variables are positively or inversely related, correlation also tells you the degree to which the variables tend to move together.

Correlation standardizes the measure of interdependence between two variables and, consequently, tells you how closely the two variables move together. The resulting measure, called the correlation coefficient, always takes a value between $$-1$$ and $$+1$$.

• If the correlation coefficient is $$1$$, the variables have a perfect positive correlation. This means that if one variable moves a given amount, the second moves proportionally in the same direction. A positive correlation coefficient less than one indicates a less-than-perfect positive correlation, with the strength of the correlation growing as the number approaches one.
• If the correlation coefficient is $$0$$, no linear relationship exists between the variables. If one variable moves, you can make no linear prediction about the movement of the other variable. The variables are uncorrelated, although they may still be related in a nonlinear way.
• If the correlation coefficient is $$-1$$, the variables are perfectly negatively correlated (or inversely correlated) and move in opposition to each other. If one variable increases, the other variable decreases proportionally. A negative correlation coefficient greater than $$-1$$ indicates a less-than-perfect negative correlation, with the strength of the correlation growing as the number approaches $$-1$$.

$\operatorname{COR_{x,y}} = \frac{\operatorname{COV_{x,y}}}{\sigma_{x}\sigma_{y}} \qquad (2)$

where,

• $$\sigma_{x}$$ = sample standard deviation of the random variable x
• $$\sigma_{y}$$ = sample standard deviation of the random variable y
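
Equation (2) can be sketched in the same way: compute the sample covariance, then divide by the product of the two sample standard deviations. The function name `sample_correlation` and the data are illustrative assumptions. Because this coefficient captures linear association, the second call shows data that are clearly related (y is the square of x) yet have a coefficient of zero.

```python
from math import sqrt
from statistics import mean


def sample_correlation(x, y):
    """Correlation coefficient of two equal-length sequences, following equation (2)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two data points are required")
    x_bar, y_bar = mean(x), mean(y)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    # Sample standard deviations use the same n - 1 denominator as the covariance.
    sd_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    sd_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (sd_x * sd_y)


# Hypothetical data lying exactly on a rising line: result is 1 up to rounding.
print(sample_correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))

# Hypothetical symmetric data with y = x**2: related, but not linearly.
print(sample_correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))  # 0.0
```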

## 4.3. Difference

1. Meaning
• Covariance is an indicator of the direction in which two random variables vary together. Its magnitude depends on the units of the variables, so a larger number does not by itself denote a stronger dependency.
• Correlation is an indicator of both the direction and the strength of the linear relationship between the two variables. Its extreme values, $$+1$$ and $$-1$$, denote a perfect positive and a perfect negative linear relationship.
2. Relationship
• Correlation can be deduced from covariance.
• Correlation provides a measure of covariance on a standard scale. It is obtained by dividing the covariance by the product of the standard deviations of the two variables.
3. Value
• The value of covariance lies between $$-{\infty}$$ and $$+{\infty}$$.
• Correlation is limited to values between $$-1$$ and $$+1$$, inclusive.
4. Scalability
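• Correlation is not affected by a change of scale: multiplying a variable by a positive constant leaves it unchanged, and a negative constant only flips its sign.
• Covariance is affected by a change of scale: multiplying a variable by a constant multiplies the covariance by the same constant.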
5. Units
• Covariance has a definite unit: the product of the units of the two variables.
• Correlation is a dimensionless number between $$-1$$ and $$+1$$.
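
The scalability and units points can be checked numerically. The sketch below uses hypothetical height and weight values and the illustrative helper names `cov` and `corr`. It converts heights from metres to centimetres and compares both measures before and after.

```python
from math import sqrt
from statistics import mean


def cov(x, y):
    """Sample covariance (n - 1 denominator)."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (len(x) - 1)


def corr(x, y):
    """Correlation: covariance divided by the product of the standard deviations."""
    # cov(x, x) is the sample variance of x, so its square root is the standard deviation.
    return cov(x, y) / sqrt(cov(x, x) * cov(y, y))


# Hypothetical measurements.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weight_kg = [52.0, 58.0, 66.0, 75.0, 80.0]
height_cm = [100 * h for h in height_m]  # same data, different unit

# Covariance grows by the conversion factor (units: m*kg versus cm*kg).
print(cov(height_m, weight_kg), cov(height_cm, weight_kg))

# Correlation is the same in both cases, up to floating-point rounding.
print(corr(height_m, weight_kg), corr(height_cm, weight_kg))
```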
