Nonparametric methods for testing the independence of samples include Spearman's rank correlation coefficient, the Kendall tau rank correlation coefficient, the Kruskal–Wallis one-way analysis of variance, and the Walk–Wolfowitz runs test.
Spearman's Rank Correlation Coefficient
Spearman's rank correlation coefficient, often denoted by the Greek letter
For a sample of size
The sign of the Spearman correlation indicates the direction of association between
The Kendall Tau Rank Correlation Coefficient
Kendall's tau (
Let
The denominator is the total number pair combinations, so the coefficient must be in the range
The Kruskal–Wallis One-Way Analysis of Variance
The Kruskal–Wallis one-way ANOVA by ranks is a nonparametric method for testing whether samples originate from the same distribution. It is used for comparing more than two samples that are independent, or not related. When the Kruskal–Wallis test leads to significant results, then at least one of the samples is different from the other samples. The test does not identify where the differences occur or how many differences actually occur.
Since it is a non-parametric method, the Kruskal–Wallis test does not assume a normal distribution, unlike the analogous one-way analysis of variance. However, the test does assume an identically shaped and scaled distribution for each group, except for any difference in medians.
The Walk–Wolfowitz Runs Test
The Walk–Wolfowitz runs test is a non-parametric statistical test that checks a randomness hypothesis for a two-valued data sequence. More precisely, it can be used to test the hypothesis that the elements of the sequence are mutually independent.
A "run" of a sequence is a maximal non-empty segment of the sequence consisting of adjacent equal elements. For example, the 22-element-long sequence
consists of 6 runs, 3 of which consist of
The mean and variance parameters of the run do not assume that the positive and negative elements have equal probabilities of occurring, but only assume that the elements are independent and identically distributed. If the number of runs is significantly higher or lower than expected, the hypothesis of statistical independence of the elements may be rejected.