StATS: What is a Mann-Whitney test?

The Mann-Whitney test (sometimes called the Wilcoxon-Mann-Whitney test) is a nonparametric test. It compares two independent groups on an outcome variable that is ordinal. Ordinal means that the values can be ranked from low to high. This is a less stringent type of data than continuous data, and can incorporate measurements like grades (A+, A, A-, B+, etc.) and Likert scale items (Strongly Disagree, Moderately Disagree, Slightly Disagree, Neutral, etc.) where you may be uncomfortable assigning a numeric code to the data.

Because the Mann-Whitney test is nonparametric, it does not require the data to follow a normal distribution. It performs reasonably well for a variety of distributions that are decidedly non-normal and is less sensitive to outliers than the traditional two-sample t-test. The Mann-Whitney test uses a ranking of the data, and some statisticians feel this can distort the results, especially when there are a lot of tied values in the data.

There is extensive debate over when you should use this test, but in my opinion, the choice of this test versus a t-test is not all that critical.

There are two equivalent approaches for computing the Mann-Whitney test. The first approach calculates P[X>Y], the probability that a randomly selected patient from the first group has a larger value than a randomly selected patient from the second group. The second approach computes the rank of all the data and looks at whether the sum of the ranks for the patients in the first group is either too big or too small.

Here's an example from the web. Nine elderly and eight young patients were asked to stand on a device that measures postural sway, the tendency for a person's center of gravity to shift over time. A large postural sway might indicate a tendency to lose balance easily. Here is the data:

      age  fbsway sidesway
 1 Elderly     19       14
 2 Elderly     30       41
 3 Elderly     20       18
 4 Elderly     19       11
 5 Elderly     29       16
 6 Elderly     25       24
 7 Elderly     21       18
 8 Elderly     24       21
 9 Elderly     50       37

10 Young       25       17
11 Young       21       10
12 Young       17       16
13 Young       15       22
14 Young       14       12
15 Young       14       14
16 Young       22       12
17 Young       17       18

A boxplot of the front to back sway (fbsway) shows that the elderly patients have a tendency to have larger values.

Rank the data to get the following:

       age fbsway rfbsway
 1 Elderly     19     6/7
 2 Elderly     30      16
 3 Elderly     20       8
 4 Elderly     19     6/7
 5 Elderly     29      15
 6 Elderly     25   13/14
 7 Elderly     21    9/10
 8 Elderly     24      12
 9 Elderly     50      17

10 Young       25   13/14
11 Young       21    9/10
12 Young       17     4/5
13 Young       15       3
14 Young       14     1/2
15 Young       14     1/2
16 Young       22      11
17 Young       17     4/5

It's not too clear what to do with the ties, but the simplest thing is to average. If two values are tied for the smallest rank, rather than assigning the ranks of 1 and 2, compromise and assign 1.5 to both.

       age fbsway rfbsway
 1 Elderly     19     6.5
 2 Elderly     30      16
 3 Elderly     20       8
 4 Elderly     19     6.5
 5 Elderly     29      15
 6 Elderly     25    13.5
 7 Elderly     21     9.5
 8 Elderly     24      12
 9 Elderly     50      17

10 Young       25    13.5
11 Young       21     9.5
12 Young       17     4.5
13 Young       15       3
14 Young       14     1.5
15 Young       14     1.5
16 Young       22      11
17 Young       17     4.5
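The averaging rule for ties can be sketched in a few lines of Python. This is only an illustration, not the procedure any particular statistics package uses; the helper name average_ranks and the variable names are my own assumptions.

```python
# Front-to-back sway values copied from the table above.
elderly = [19, 30, 20, 19, 29, 25, 21, 24, 50]
young = [25, 21, 17, 15, 14, 14, 22, 17]


def average_ranks(values):
    """Return the rank of each value, giving tied values their average rank.

    Hypothetical helper written for this illustration.
    """
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1         # lowest 1-based rank held by v
        last = first + ordered.count(v) - 1  # highest 1-based rank held by v
        rank_of[v] = (first + last) / 2      # e.g. ranks 1 and 2 become 1.5
    return [rank_of[v] for v in values]


# Rank the pooled data, then split the ranks back into the two groups.
ranks = average_ranks(elderly + young)
elderly_ranks = ranks[:len(elderly)]
young_ranks = ranks[len(elderly):]

print(elderly_ranks)
print(young_ranks)
```

The two printed lists should match the rfbsway column in the table above, elderly patients first.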

The sum of ranks associated with the young patients is a little easier to calculate since there are only 8 of them. The sum is 49. The lowest possible sum (if all the values in the young group were smaller than all the values in the elderly group) would be 36 (1+2+...+8), and the largest possible value would be 108 (10+...+17). If the ranks of 1-17 were equally likely, then you'd expect to see a sum of 72 (8 patients times the average rank of 9). Clearly, the value in this data set is rather low, causing you to believe, perhaps, that young patients have less front-to-back sway than older patients.
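If you'd like to check the arithmetic for the rank sum and its possible range, here is a small Python sketch. The variable names are my own, and the ranks are simply typed in from the table of averaged ranks.

```python
n_elderly, n_young = 9, 8
n_total = n_elderly + n_young

# Averaged ranks of the eight young patients, copied from the table above.
young_ranks = [13.5, 9.5, 4.5, 3, 1.5, 1.5, 11, 4.5]

observed = sum(young_ranks)
lowest = sum(range(1, n_young + 1))                       # ranks 1 through 8
highest = sum(range(n_total - n_young + 1, n_total + 1))  # ranks 10 through 17
expected = n_young * (n_total + 1) / 2                    # 8 times the average rank of 9

print(observed, lowest, highest, expected)  # prints: 49.0 36 108 72.0
```

The expected value sits exactly halfway between the lowest and highest possible sums, which is a handy sanity check.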

Arrange the data to compute all pairwise differences. The first elderly value (19) minus the first young value (25) gives a difference of -6. The first elderly value (19) minus the second young value (21) gives a difference of -2. Keep on doing this until you subtract the last young value (17) from the last elderly value (50) to get a difference of 33.

     19 30 20 19 29 25 21 24 50
 

25   -6  5 -5 -6  4  0 -4 -1 25
21   -2  9 -1 -2  8  4  0  3 29
17    2 13  3  2 12  8  4  7 33
15    4 15  5  4 14 10  6  9 35
14    5 16  6  5 15 11  7 10 36
14    5 16  6  5 15 11  7 10 36
22   -3  8 -2 -3  7  3 -1  2 28
17    2 13  3  2 12  8  4  7 33

Notice that there is a mix of positive and negative differences, but mostly positive differences. There are exactly 58 positive differences, 12 negative differences, and 2 zero differences. It's not exactly clear what you should do with the zero differences, but treating each one as half positive and half negative seems to be reasonable. With 59 positive and 13 negative differences out of 72, you would estimate P[X>Y] at 59/72, or 82%. That's quite a bit larger than 50% and also seems to indicate that the elderly patients tend to have larger front-to-back sway values than young patients.
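The counting can be sketched in Python as follows. The variable names are my own. The last two lines use a standard identity that is not spelled out on this page: subtracting the smallest possible rank sum, n(n+1)/2, from a group's observed rank sum gives the half-weighted count of pairs in which that group has the larger value. That identity is why the two approaches are equivalent.

```python
elderly = [19, 30, 20, 19, 29, 25, 21, 24, 50]
young = [25, 21, 17, 15, 14, 14, 22, 17]

# Every elderly value minus every young value: 9 x 8 = 72 differences.
diffs = [x - y for y in young for x in elderly]

positive = sum(d > 0 for d in diffs)
negative = sum(d < 0 for d in diffs)
zero = sum(d == 0 for d in diffs)

# Treat each zero difference as half positive and half negative.
p_hat = (positive + zero / 2) / len(diffs)

# Cross-check with the rank-sum approach, using the young rank sum of 49.
young_rank_sum = 49
u_young = young_rank_sum - len(young) * (len(young) + 1) / 2

print(positive, negative, zero, round(p_hat, 2), u_young)  # prints: 58 12 2 0.82 13.0
```

The value of 13 from the rank sum agrees with the 12 negative differences plus half of the 2 zero differences.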

I won't show the details here, but you can easily compute a p-value for the Mann-Whitney test. A confidence interval takes a bit more work; it uses the pairwise differences described above. All the details can be found on pages 106-135 of Nonparametric Statistical Methods. Hollander M, Wolfe DA (1999) New York: John Wiley & Sons, Inc.
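For a rough idea of how a p-value could be obtained, here is a sketch of the large-sample normal approximation with a correction for ties. Please treat this as one common textbook approach under my own assumptions (no continuity correction, two-sided alternative), not necessarily the method your software uses. With samples this small, an exact calculation may be preferable.

```python
import math

n_elderly, n_young = 9, 8
n_total = n_elderly + n_young

# Half-weighted count of pairs in which the elderly value exceeds the young value.
u = 59

# Sizes of the tied groups in the pooled data: 14, 17, 19, 21 and 25 each occur twice.
tie_sizes = [2, 2, 2, 2, 2]

# Mean and tie-corrected variance of the count when the two groups do not differ.
mean_u = n_elderly * n_young / 2
tie_term = sum(t**3 - t for t in tie_sizes) / (n_total * (n_total - 1))
var_u = n_elderly * n_young / 12 * ((n_total + 1) - tie_term)

z = (u - mean_u) / math.sqrt(var_u)

# Two-sided p-value from the standard normal: erfc(|z| / sqrt(2)) = 2 * (1 - Phi(|z|)).
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(round(z, 2), round(p_two_sided, 3))
```

Different packages make different choices about continuity corrections and exact versus approximate calculations, so don't be surprised if your software reports a slightly different p-value.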

The data set described above is available at the OzDASL web site at http://www.statsci.org/data/general/balaconc.html. It was originally published in:

Teasdale, N., Bard, C., La Rue, J., and Fleury, M. (1993). On the cognitive penetrability of posture control. Experimental Aging Research 19, 1-13.

This page was written by Steve Simon while working at Children's Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children's Mercy Hospital website.