P.Mean: Pearson correlation and ordinal data don't mix (created 2008-07-11)

I feel uncomfortable using a Pearson correlation coefficient for two variables that are measured on an ordinal scale (for example, 1=unaware, 2=aware, 3=fairly aware, 4=moderately aware, 5=very aware). But I can't explain why I am uncomfortable with this. Can you help?

The Pearson correlation uses an average in its calculation, and that alone will raise questions in some people's minds. An average assumes, among other things, that a measurement of the two values

3 (fairly aware) and 5 (very aware)

is equivalent to measurements of two values

4 (moderately aware) and 4 (moderately aware).

Unless you believe that your categories are equally spaced (e.g., "moderately" is exactly halfway between "fairly" and "very"), an average, and hence a Pearson correlation, is unjustified.
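A small sketch may make the equal-spacing point concrete. The 1 to 5 coding comes from the question above; the second coding, where "very aware" is scored 8, is purely a hypothetical alternative invented for illustration, as are the names `equal_spacing`, `unequal_spacing`, and `average_score`.

```python
from statistics import mean

# The usual equally spaced coding of the awareness scale.
equal_spacing = {"unaware": 1, "aware": 2, "fairly aware": 3,
                 "moderately aware": 4, "very aware": 5}

# Hypothetical unequal spacing (an assumption, not from the data):
# "very aware" is placed much further from "moderately aware".
unequal_spacing = {"unaware": 1, "aware": 2, "fairly aware": 3,
                   "moderately aware": 4, "very aware": 8}

pair_a = ["fairly aware", "very aware"]
pair_b = ["moderately aware", "moderately aware"]

def average_score(labels, coding):
    """Average the numeric codes assigned to a list of category labels."""
    return mean(coding[label] for label in labels)

# Under equal spacing the two pairs are indistinguishable by their average.
equal_a = average_score(pair_a, equal_spacing)
equal_b = average_score(pair_b, equal_spacing)

# Under the hypothetical unequal spacing they are not.
unequal_a = average_score(pair_a, unequal_spacing)
unequal_b = average_score(pair_b, unequal_spacing)

print(equal_a, equal_b)
print(unequal_a, unequal_b)
```

The ordering of the categories is the same in both codings; only the spacing differs, and that is enough to change which pairs of responses the average treats as equivalent.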

I tend not to fuss about this too much. The equal spacing assumption is almost surely wrong, but it's probably not terribly wrong. I suspect that the Pearson correlation will give much the same results as a method appropriate for ordinal data.
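One way to check this suspicion on your own data is to compute both the Pearson correlation and a rank-based alternative and compare them. The sketch below uses the Spearman rank correlation as the ordinal method; that choice is an assumption, since the text does not name a specific method. The responses in `x` and `y` are invented for illustration and are not real survey data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation computed from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank the values from 1 upward, giving tied values their average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1   # 1-based position of the first tie
        count = ordered.count(v)       # number of tied observations
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented responses on two 5-point ordinal items (illustration only).
x = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
y = [1, 1, 2, 2, 3, 4, 3, 4, 4, 5]

print("Pearson: ", pearson(x, y))
print("Spearman:", spearman(x, y))

# Recode the top category of x from 5 to 8 (a hypothetical unequal spacing).
# The ranks, and so the Spearman correlation, are unaffected by this
# order-preserving recoding; the Pearson correlation depends on the codes.
recode = {1: 1, 2: 2, 3: 3, 4: 4, 5: 8}
x_recoded = [recode[v] for v in x]

print("Pearson after recoding: ", pearson(x_recoded, y))
print("Spearman after recoding:", spearman(x_recoded, y))
```

If the two coefficients are close on your data, the equal spacing assumption is probably doing little harm; a large gap would be a reason to prefer the rank-based result.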

This work is licensed under a Creative Commons Attribution 3.0 United States License. This page was written by Steve Simon and was last modified on 2010-04-01.