Stats: What is a Mann-Whitney test?

The Mann-Whitney test (sometimes called the Wilcoxon-Mann-Whitney test) is a nonparametric test. It compares two independent groups on an outcome variable that is ordinal. Ordinal means that the values can be ranked from low to high. This is a less stringent type of data than continuous data, and can incorporate measurements like grades (A+, A, A-, B+, etc.) and Likert scale items (Strongly Disagree, Moderately Disagree, Slightly Disagree, Neutral, etc.) where you may be uncomfortable assigning a numeric code to the data.

Because the Mann-Whitney test is nonparametric, it does not require the data to follow a normal distribution. It performs reasonably well for a variety of distributions that are decidedly non-normal and is less sensitive to outliers than the traditional two-sample t-test. The Mann-Whitney test uses a ranking of the data, and some statisticians feel this can distort the results, especially when there are a lot of tied values in the data.

There is extensive debate over when you should use this test, but in my opinion, the choice of this test versus a t-test is not all that critical.

There are two equivalent approaches for computing the Mann-Whitney test. The first approach calculates P[X>Y], the probability that a randomly selected patient from the first group has a larger value than a randomly selected patient from the second group. The second approach computes the rank of all the data and looks at whether the sum of the ranks for the patients in the first group is either too big or too small.

Here's an example from the web. Nine elderly and eight young patients were asked to stand on a device that measures postural sway, the tendency for a person's center of gravity to shift over time. A large postural sway might indicate a tendency to lose balance easily. Here is the data:

    age      fbsway  sidesway
 1  Elderly      19        14
 2  Elderly      30        41
 3  Elderly      20        18
 4  Elderly      19        11
 5  Elderly      29        16
 6  Elderly      25        24
 7  Elderly      21        18
 8  Elderly      24        21
 9  Elderly      50        37
10  Young        25        17
11  Young        21        10
12  Young        17        16
13  Young        15        22
14  Young        14        12
15  Young        14        14
16  Young        22        12
17  Young        17        18

A boxplot of the front to back sway (fbsway) shows that the elderly patients have a tendency to have larger values.

Rank the data to get the following:

    age      fbsway  rlsway
 1  Elderly      19     6/7
 2  Elderly      30      16
 3  Elderly      20       8
 4  Elderly      19     6/7
 5  Elderly      29      15
 6  Elderly      25   13/14
 7  Elderly      21    9/10
 8  Elderly      24      12
 9  Elderly      50      17
10  Young        25   13/14
11  Young        21    9/10
12  Young        17     4/5
13  Young        15       3
14  Young        14     1/2
15  Young        14     1/2
16  Young        22      11
17  Young        17     4/5

It's not too clear what to do with the ties, but the simplest thing is to average. If two values are tied for the smallest rank, rather than assigning the ranks of 1 and 2, compromise and assign 1.5 to both.

    age      fbsway  rfbsway
 1  Elderly      19      6.5
 2  Elderly      30       16
 3  Elderly      20        8
 4  Elderly      19      6.5
 5  Elderly      29       15
 6  Elderly      25     13.5
 7  Elderly      21      9.5
 8  Elderly      24       12
 9  Elderly      50       17
10  Young        25     13.5
11  Young        21      9.5
12  Young        17      4.5
13  Young        15        3
14  Young        14      1.5
15  Young        14      1.5
16  Young        22       11
17  Young        17      4.5
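The averaging rule for ties can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the code used to produce the table; the helper name `midranks` is my own.

```python
# fbsway values from the table above: nine elderly patients, then eight young patients.
fbsway = [19, 30, 20, 19, 29, 25, 21, 24, 50,
          25, 21, 17, 15, 14, 14, 22, 17]


def midranks(values):
    """Return 1-based ranks, giving tied values the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the run of sorted positions that hold the same value.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end are 0-based, so they cover ranks start+1..end+1.
        average = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = average
        start = end + 1
    return ranks


ranks = midranks(fbsway)
for value, rank in zip(fbsway, ranks):
    print(value, rank)
```

The two patients with a value of 14 both receive 1.5, the two with a value of 19 both receive 6.5, and so on, matching the table.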

The sum of ranks associated with the young patients is a little easier to calculate since there are only 8 of them. The sum is 49. The lowest possible sum (if all the values in the young group were smaller than all the values in the elderly group) would be 36 (1+2+...+8), and the largest possible sum would be 108 (10+11+...+17). If the ranks of 1-17 were equally likely, then you'd expect to see a sum of 72, which is 8 times the average rank of 9. Clearly, the value in this data set is rather low, causing you to believe, perhaps, that young patients have less front-to-back sway than older patients.
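If you want to check the arithmetic, here is a small Python sketch. The ranks are copied by hand from the table, and the variable names are just for illustration.

```python
# Average ranks of the eight young patients, copied from the table above.
young_ranks = [13.5, 9.5, 4.5, 3, 1.5, 1.5, 11, 4.5]

n_young = len(young_ranks)   # 8
n_total = 17                 # 9 elderly + 8 young

rank_sum = sum(young_ranks)
lowest = sum(range(1, n_young + 1))                       # ranks 1 through 8
highest = sum(range(n_total - n_young + 1, n_total + 1))  # ranks 10 through 17
expected = n_young * (n_total + 1) / 2                    # 8 times the average rank of 9

print(rank_sum, lowest, highest, expected)  # 49.0 36 108 72.0
```

The expected value sits exactly halfway between the lowest and highest possible sums, and the observed sum of 49 is much closer to the low end.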

Arrange the data to compute all pairwise differences. The first elderly value (19) minus the first young value (25) gives a difference of -6. The first elderly value (19) minus the second young value (21) gives a difference of -2. Keep on doing this until you subtract the last young value (17) from the last elderly value (50) to get a difference of 33.

        19   30   20   19   29   25   21   24   50
25      -6    5   -5   -6    4    0   -4   -1   25
21      -2    9   -1   -2    8    4    0    3   29
17       2   13    3    2   12    8    4    7   33
15       4   15    5    4   14   10    6    9   35
14       5   16    6    5   15   11    7   10   36
14       5   16    6    5   15   11    7   10   36
22      -3    8   -2   -3    7    3   -1    2   28
17       2   13    3    2   12    8    4    7   33

Notice that there is a mix of positive and negative differences, but mostly positive differences. There are exactly 58 positive differences, 12 negative differences, and 2 zero differences. It's not exactly clear what you should do with the zero differences, but treating each one as half positive and half negative seems to be reasonable. With 59 positive and 13 negative differences, you would estimate P[X>Y] at 82%. That's quite a bit larger than 50% and also seems to indicate that the elderly patients tend to have larger front-to-back sway values than young patients.
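The counts above are easy to reproduce. The sketch below also checks that the two approaches agree: the elderly rank sum minus its smallest possible value (1+2+...+9 = 45) should equal the number of positive differences, with ties counted as one half. The variable names are mine, and the elderly ranks are copied by hand from the ranking table.

```python
elderly = [19, 30, 20, 19, 29, 25, 21, 24, 50]
young = [25, 21, 17, 15, 14, 14, 22, 17]

# Every elderly value minus every young value: 9 x 8 = 72 differences.
differences = [x - y for y in young for x in elderly]

positive = sum(d > 0 for d in differences)
negative = sum(d < 0 for d in differences)
zero = sum(d == 0 for d in differences)

# Count each zero difference as half positive and half negative.
u_elderly = positive + zero / 2
p_x_greater = u_elderly / len(differences)

# Cross-check against the rank-sum approach, using the elderly average ranks.
elderly_rank_sum = 6.5 + 16 + 8 + 6.5 + 15 + 13.5 + 9.5 + 12 + 17
u_from_ranks = elderly_rank_sum - len(elderly) * (len(elderly) + 1) / 2

print(positive, negative, zero)                          # 58 12 2
print(u_elderly, u_from_ranks, round(p_x_greater, 2))    # 59.0 59.0 0.82
```

Both routes give 59 out of 72, which is why the two approaches are described as equivalent.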

I won't show the details here, but you can easily compute a p-value for the Mann-Whitney test. A confidence interval takes a bit more work; it uses the pairwise differences described above. All the details can be found on pages 106-135 of Hollander M, Wolfe DA (1999). Nonparametric Statistical Methods. New York: John Wiley & Sons, Inc.
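For readers who want a rough number, here is a sketch of one common way to get a p-value: the large-sample normal approximation to the rank sum, with the usual variance adjustment for ties. This is my choice for illustration and an assumption about method; it skips the continuity correction and is not the exact small-sample calculation, so statistical software may report a slightly different value.

```python
import math
from collections import Counter

elderly = [19, 30, 20, 19, 29, 25, 21, 24, 50]
young = [25, 21, 17, 15, 14, 14, 22, 17]
n1, n2 = len(young), len(elderly)
n = n1 + n2

rank_sum_young = 49.0       # sum of the young patients' average ranks
mean = n1 * (n + 1) / 2     # expected rank sum if the groups do not differ

# Variance of the rank sum, reduced for ties: each group of t tied values
# contributes t**3 - t to the correction term (untied values contribute 0).
tie_term = sum(t**3 - t for t in Counter(elderly + young).values())
variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))

z = (rank_sum_young - mean) / math.sqrt(variance)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # 2 * P(Z > |z|) for a standard normal Z

print(round(z, 2), round(p_two_sided, 3))
```

Under these assumptions the z statistic comes out a bit beyond -2 and the two-sided p-value falls below 0.05. If SciPy is installed, `scipy.stats.mannwhitneyu` runs the same test and is the safer choice for real work.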

The data set described above is available at the OzDASL web site at http://www.statsci.org/data/general/balaconc.html. It was originally published in:

Teasdale, N., Bard, C., La Rue, J., and Fleury, M. (1993). On the cognitive penetrability of posture control. Experimental Aging Research 19, 1-13.

This page was written by Steve Simon while working at Children's Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children's Mercy Hospital website.