StATS: Composite scores (January 27, 2000).

Dear Professor Mean, I have developed a method to distinguish among several products that we need to buy so our company can make a good purchasing decision. I created a composite score, which is a weighted average of several different indicators of quality. I want to use statistics to determine when two different products have significantly different composite scores.

It sounds like what you want is to select a product on the basis of the highest composite score, but if two composite scores are close, then you break the tie on the basis of price or convenience.

I would argue that statistics are a poor way to judge which scores are close. I've seen similar situations in medical research where a statistically significant change was of no practical importance. I helped with a medical study, for example, where we looked at various measures of male reproductive potential. One of the measures was semen pH. One group had a statistically significantly higher average semen pH, but the two averages were 6.9 versus 7.2 or something like that. Sperm can function well across a much broader range of pH levels, so I am told. You would need to see a full unit change in pH or maybe even more before any doctor would worry.

So I would suggest that you talk to the same people who helped you develop the weights for your composite score and ask them to tell you how much of a change in the composite score would be large enough to have a practical impact. This is an example where statistics is a poor substitute for human judgment.
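
Here is a minimal sketch in Python of how that decision rule might look once your experts have given you a threshold. Everything in it is an assumption for illustration: the indicator names, the weights, the prices, the function names, and the threshold value (min_meaningful_diff) are all made up, and you would replace them with your own.

```python
# Hypothetical weights for three made-up quality indicators.
weights = {"durability": 0.5, "ease_of_use": 0.3, "support": 0.2}

# Hypothetical products; the ratings and prices are invented for illustration.
products = [
    {"name": "A", "price": 120,
     "indicators": {"durability": 8, "ease_of_use": 7, "support": 9}},
    {"name": "B", "price": 100,
     "indicators": {"durability": 8, "ease_of_use": 7, "support": 8}},
    {"name": "C", "price": 80,
     "indicators": {"durability": 5, "ease_of_use": 6, "support": 7}},
]


def composite_score(indicators, weights):
    """Weighted average of the quality indicators."""
    total_weight = sum(weights.values())
    return sum(weights[k] * indicators[k] for k in weights) / total_weight


def choose_product(products, weights, min_meaningful_diff):
    """Pick the best product, treating near-ties as ties.

    Any product whose composite score is within min_meaningful_diff of the
    top score counts as tied with it; the tie is broken by lowest price.
    The threshold comes from human judgment, not from a significance test.
    """
    scored = [(composite_score(p["indicators"], weights), p) for p in products]
    best = max(score for score, _ in scored)
    close = [p for score, p in scored if best - score < min_meaningful_diff]
    return min(close, key=lambda p: p["price"])


# Assumed threshold: the experts say a gap under 0.5 points does not matter.
print(choose_product(products, weights, min_meaningful_diff=0.5)["name"])
```

With these made-up numbers, products A and B fall within half a point of each other, so the cheaper one wins. If your experts told you that a gap of 0.1 points already matters, the same code would pick the top scorer instead.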

This page was written by Steve Simon while working at Children's Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children's Mercy Hospital website.