Rank Correlation

A rank correlation is any of several statistics that measure the relationship between rankings.

Learning Objective

Evaluate the relationship between rankings of different ordinal variables using rank correlation

Key Points

A rank correlation coefficient measures the degree of similarity between two rankings, and can be used to assess the significance of the relation between them.
Kendall's tau ($\tau$) and Spearman's rho ($\rho$) are particular (and frequently used) cases of a general correlation coefficient.
The measure of significance of the rank correlation coefficient can show whether the measured relationship is small enough to be likely to be a coincidence.

Terms

concordant
Agreeing; correspondent; in keeping with; agreeable with.
rank correlation
Any of several statistics that measure the relationship between rankings of different ordinal variables or different rankings of the same variable.

Full Text

Rank Correlation

In statistics, a rank correlation is any of several statistics that measure the relationship between rankings of different ordinal variables or different rankings of the same variable, where a "ranking" is the assignment of ordering labels (first, second, third, and so on) to different observations of a particular variable. A rank correlation coefficient measures the degree of similarity between two rankings, and can be used to assess the significance of the relation between them.

If, for example, one variable is the identity of a college basketball program and another variable is the identity of a college football program, one could test for a relationship between the poll rankings of the two types of program: do colleges with a higher-ranked basketball program tend to have a higher-ranked football program? A rank correlation coefficient can measure that relationship, and the measure of significance of the rank correlation coefficient can show whether the measured relationship is small enough to be likely to be a coincidence.

If there is only one variable, the identity of a college football program, but it is subject to two different poll rankings (say, one by coaches and one by sportswriters), then the similarity of the two different polls' rankings can be measured with a rank correlation coefficient.

Some of the more popular rank correlation statistics include Spearman's rho ($\rho$) and Kendall's tau ($\tau$).

Spearman's $\rho$

Spearman developed a method of measuring rank correlation known as Spearman's rank correlation coefficient. It is generally denoted by $\rho$ or, when computed from a sample, by $r_s$. There are three cases to consider when calculating Spearman's rank correlation coefficient:

When ranks are given
When ranks are not given
When ranks are repeated (ties)

When all ranks are distinct (no repeated ranks), the formula for calculating Spearman's rank correlation coefficient is:

$\displaystyle{r_s = 1- \frac{6 \sum d^2}{n(n^2-1)}}$

where $n$ is the number of items or individuals being ranked, $d = R_1 - R_2$ is the difference between an item's two ranks ($R_1$ is its rank with respect to the first variable and $R_2$ is its rank with respect to the second variable), and the sum is taken over all $n$ items.
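As an illustration, the formula can be sketched in a few lines of Python. This is a minimal sketch that assumes the ranks are already given and that no rank is repeated; the function name `spearman_rho` and the two example polls are hypothetical and chosen only for this illustration.

```python
def spearman_rho(ranks_1, ranks_2):
    """Spearman's r_s from two lists of ranks, assuming no repeated ranks."""
    if len(ranks_1) != len(ranks_2):
        raise ValueError("both rankings must have the same length")
    n = len(ranks_1)
    if n < 2:
        raise ValueError("at least two items are needed")
    # d = R_1 - R_2 for each item; the formula uses the sum of d squared.
    sum_d_squared = sum((r1 - r2) ** 2 for r1, r2 in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical rankings of five programs by two polls.
coaches_poll = [1, 2, 3, 4, 5]
writers_poll = [2, 1, 3, 5, 4]

# The differences d are -1, 1, 0, -1, 1, so the sum of d^2 is 4
# and r_s = 1 - (6 * 4) / (5 * 24), which is about 0.8.
print(spearman_rho(coaches_poll, writers_poll))
```

Identical rankings give $r_s = 1$ and exactly reversed rankings give $r_s = -1$. When ranks are repeated, this simple formula no longer applies as written.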

Kendall's $\tau$

The definition of the Kendall coefficient is as follows:

Let $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$ be a set of observations of the joint random variables $X$ and $Y$, such that all the values of $x_i$ are unique and all the values of $y_i$ are unique. Any pair of observations $(x_i,y_i)$ and $(x_j,y_j)$ with $i \neq j$ is classified by these rules:

The observations are said to be concordant if the ranks for both elements agree; that is, if both $x_i > x_j$ and $y_i > y_j$, or if both $x_i < x_j$ and $y_i < y_j$.
The observations are said to be discordant if $x_i > x_j$ and $y_i < y_j$, or if $x_i < x_j$ and $y_i > y_j$.
The observations are neither concordant nor discordant if $x_i = x_j$ or $y_i = y_j$.

The Kendall $\tau$ coefficient is defined as follows:

$\displaystyle{\tau = \frac{(\text{number of concordant pairs}) - (\text{number of discordant pairs})}{\frac{1}{2} n (n-1)}}$

and has the following properties:

The denominator is the total number of pair combinations, so the coefficient must be in the range $-1 \leq \tau \leq 1$.
If the agreement between the two rankings is perfect (i.e., the two rankings are the same) the coefficient has value $1$.
If the disagreement between the two rankings is perfect (i.e., one ranking is the reverse of the other) the coefficient has value $-1$.
If $X$ and $Y$ are independent, then we would expect the coefficient to be approximately zero.
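The definition and the first three properties can be checked with a short Python sketch. It counts concordant and discordant pairs directly and divides by $\frac{1}{2} n (n-1)$; the function name `kendall_tau` and the example data are hypothetical, and the pairwise loop is a straightforward illustration rather than an efficient implementation.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau as (concordant - discordant) / (n * (n - 1) / 2)."""
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are needed")
    concordant = 0
    discordant = 0
    # Visit every unordered pair of observations exactly once.
    for (x_i, y_i), (x_j, y_j) in combinations(zip(xs, ys), 2):
        product = (x_i - x_j) * (y_i - y_j)
        if product > 0:
            concordant += 1  # both differences have the same sign
        elif product < 0:
            discordant += 1  # the differences have opposite signs
        # product == 0 means a tie: neither concordant nor discordant
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical rankings of five programs by two polls.
x = [1, 2, 3, 4, 5]
y = [2, 1, 3, 5, 4]

# Of the 10 pairs, 8 are concordant and 2 are discordant,
# so tau = (8 - 2) / 10, which is about 0.6.
print(kendall_tau(x, y))
print(kendall_tau(x, x))                # identical rankings
print(kendall_tau(x, [5, 4, 3, 2, 1]))  # reversed ranking
```

The last two calls illustrate the boundary cases: identical rankings make every pair concordant, and a reversed ranking makes every pair discordant.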

Kendall's $\tau$ and Spearman's $\rho$ are particular cases of a general correlation coefficient.
