The Mann–Whitney
Assumptions and Formal Statement of Hypotheses
Although Mann and Whitney developed the test under the assumption of continuous responses, with the alternative hypothesis being that one distribution is stochastically greater than the other, there are many other ways to formulate the null and alternative hypotheses such that the test remains valid. A very general formulation is to assume that:
- All the observations from both groups are independent of each other.
- The responses are ordinal (i.e., one can at least say of any two observations which is the greater).
- The distributions of both groups are equal under the null hypothesis, so that the probability of an observation from one population ($X$) exceeding an observation from the second population ($Y$) equals the probability of an observation from $Y$ exceeding an observation from $X$. That is, there is a symmetry between populations with respect to probability of random drawing of a larger observation.
- Under the alternative hypothesis, the probability of an observation from one population ($X$) exceeding an observation from the second population ($Y$) (after exclusion of ties) is not equal to $0.5$. The alternative may also be stated in terms of a one-sided test, for example: $P(X > Y) + 0.5 \cdot P(X = Y) > 0.5$.
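The quantity $P(X > Y) + 0.5 \cdot P(X = Y)$ in the one-sided alternative can be estimated from two samples by comparing every pair of observations. The following is a minimal sketch in plain Python; the function name `prob_superiority` is illustrative and not part of any library.

```python
def prob_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X = Y) from two samples.

    Every observation in x is compared with every observation in y;
    a tie contributes half a "win".
    """
    wins = sum(1.0 for a in x for b in y if a > b)
    ties = sum(1.0 for a in x for b in y if a == b)
    return (wins + 0.5 * ties) / (len(x) * len(y))


# Identical samples are symmetric, so the estimate is exactly 0.5.
print(prob_superiority([1, 2, 3], [1, 2, 3]))  # 0.5
```

An estimate near $0.5$ is what the null hypothesis predicts; values well above or below $0.5$ point toward the alternative.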
Calculations
The test involves the calculation of a statistic, usually called $U$.
There are two ways of calculating $U$.
Method One
For small samples a direct method is recommended. It is very quick, and gives an insight into the meaning of the $U$ statistic.
- Choose the sample for which the ranks seem to be smaller (the only reason to do this is to make computation easier). Call this "sample 1," and call the other sample "sample 2."
- For each observation in sample 1, count the number of observations in sample 2 that have a smaller rank (count a half for any that are equal to it). The sum of these counts is $U$.
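The counting procedure of Method One can be sketched directly in Python. This is an illustrative implementation, not a library routine; the name `u_direct` is an assumption made for this example. Because a smaller value always has a smaller rank, the raw values can be compared without ranking them first.

```python
def u_direct(sample1, sample2):
    """Compute U for sample1 by direct counting.

    For each observation in sample1, count the observations in sample2
    that are smaller, counting one half for each tie.
    """
    u = 0.0
    for a in sample1:
        for b in sample2:
            if b < a:
                u += 1.0
            elif b == a:
                u += 0.5
    return u


# For 1: nothing smaller; for 3: one (2); for 5: two (2 and 4).
print(u_direct([1, 3, 5], [2, 4, 6]))  # 3.0
```

Swapping the roles of the two samples gives `u_direct([2, 4, 6], [1, 3, 5]) == 6.0`; the two values add up to the number of pairs, here $3 \times 3 = 9$.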
Method Two
For larger samples, a formula can be used.
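First, add up the ranks for the observations that came from sample 1. The sum of ranks in sample 2 is now determinate, since the sum of all the ranks equals:
$$\frac{N(N+1)}{2}$$
where $N$ is the total number of observations. $U$ is then given by:
$$U = R_1 - \frac{n_1(n_1+1)}{2}$$
where $n_1$ is the sample size for sample 1 and $R_1$ is the sum of the ranks in sample 1.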
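The rank-sum approach of Method Two can be sketched as follows, assuming the standard relation in which $U$ equals the sum of the ranks in sample 1 minus $n_1(n_1+1)/2$, with tied observations given the average of the ranks they span. The helper names `average_ranks` and `u_from_ranks` are hypothetical, chosen for this example only.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # i and j are 0-based positions
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def u_from_ranks(sample1, sample2):
    """Compute U for sample1 from the sum of its ranks in the pooled data."""
    n1 = len(sample1)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])  # rank sum of sample 1
    return r1 - n1 * (n1 + 1) / 2


# Pooled ranks of 1, 3, 5 are 1, 3, 5, so the rank sum is 9 and U = 9 - 6.
print(u_from_ranks([1, 3, 5], [2, 4, 6]))  # 3.0
```

With ties, `u_from_ranks([1, 2, 2], [2, 3])` gives `1.0`: the three tied 2s each receive rank 3, so the rank sum of sample 1 is $1 + 3 + 3 = 7$ and $U = 7 - 6 = 1$, the same value that pairwise counting with half-counts for ties produces.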
Example Statement of Results
In reporting the results of a Mann–Whitney test, it is important to state:
- a measure of the central tendencies of the two groups (means or medians; since the Mann–Whitney is an ordinal test, medians are usually recommended)
- the value of $U$
- the sample sizes
- the significance level
In practice some of this information may already have been supplied and common sense should be used in deciding whether to repeat it. A typical report might run:
"Median latencies in groups
Comparison to Student's $t$-Test
The
Ordinal Data
Robustness
As it compares the sums of ranks, the Mann–Whitney test is less likely than the
Efficiency
For distributions sufficiently far from normal and for sufficiently large sample sizes, the Mann–Whitney test is considerably more efficient than the $t$-test.