P.Mean: The confidence interval for my odds ratio is too narrow! (created 2013-04-23).

Dear Professor Mean, I was reviewing a poster that presented an odds ratio of 1.00 and a 95% confidence interval of 1.00 to 1.00. The p-value was 0.01. When I asked to see the exact values, the odds ratio was 1.001024524 and the confidence interval was 1.000352 to 1.001698. It seems wrong somehow to have a confidence interval this narrow, but the presenter seemed to think that everything was just fine. What's going on here?

It sounds too good to be true, doesn't it? A confidence interval with zero width and with an extremely small p-value. Well, I'm afraid to tell you that this interval is deceptive. This happens a lot and it is effectively a units conversion problem. This problem occurs most often with logistic regression models. Here's an example of why it happens and what you need to do to fix things.

In a data set involving breast feeding patterns in pre-term infants, you can model the relationship between exclusive breast feeding at discharge and the birth weight of the baby. You'd expect a positive relationship for a variety of reasons, the most obvious being that the bigger babies are discharged from the hospital sooner and are home with the mother more quickly than the smaller babies. It's hard to breast feed when the mother is at home and the baby is stuck at the hospital. You have to do a lot of traveling, although a breast pump can help.

In this data set all of the babies are preterm births, so the birth weights are rather low, ranging from 1,001 to 2,453 grams. If you do an analysis, you get a table that looks like this.

Logistic regression analysis (grams)

It's not quite as extreme as your results, but it is close. If you round to two decimal places, which seems reasonable enough, the odds ratio is 1.00 and the confidence interval is 1.00 to 1.00. The p-value is 0.005. So what is going on here?

You need to look at what the odds ratio represents. For a continuous variable like birth weight, it represents the relative change in the odds of breast feeding when you increase birth weight by one unit. One unit here means one gram.

Well, of course, a one gram change in birthweight would cause a microscopic change in the odds of breast feeding. No one would ever try to envision the difference between a baby that weighs 1,182 grams and 1,183 grams.

Let's convert the birth weight to kilograms instead. The results look quite different.

Logistic regression analysis (kilograms)

The odds ratio is 6.1 and the confidence interval is 1.7 to 21.8. Quite a difference!
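The two tables describe the same fitted model, and you can check this on the log-odds scale. Here is a minimal Python sketch, assuming only the rounded kilogram results quoted above (6.1, 1.7, and 21.8). The function and variable names are mine, and because the inputs are rounded, the output will only approximately match the gram table from SPSS.

```python
import math

# Rounded kilogram-scale results quoted in the text (not the raw SPSS output).
or_kg, lo_kg, hi_kg = 6.1, 1.7, 21.8

GRAMS_PER_KG = 1000

def per_gram(odds_ratio_per_kg):
    """Convert an odds ratio per kilogram into an odds ratio per gram."""
    beta_kg = math.log(odds_ratio_per_kg)  # logistic coefficient per kilogram
    beta_g = beta_kg / GRAMS_PER_KG        # the same slope expressed per gram
    return math.exp(beta_g)

or_g, lo_g, hi_g = (per_gram(v) for v in (or_kg, lo_kg, hi_kg))

print(f"per gram, 6 decimals: {or_g:.6f} ({lo_g:.6f} to {hi_g:.6f})")
print(f"per gram, 2 decimals: {or_g:.2f} ({lo_g:.2f} to {hi_g:.2f})")
```

At two decimal places all three numbers print as 1.00, which is exactly the "zero width" interval that looked so suspicious. The information is still there. It has just been pushed out to the third and fourth decimal places.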

The graph below tries to show what is going on.

Graph of birth weight in grams and breast feeding probability

This graph shows birth weight in grams on the horizontal axis, predicted probability of exclusive breast feeding on the left vertical axis and the equivalent predicted odds on the right vertical axis. You can see that the predicted probability goes from 0.33 to 0.88. Equivalently, the odds go from 0.5 to 7.6. So how much of a change is represented by the odds ratio in the first logistic model? It turns out that you can't even see it on this graph. You need to expand the graph markedly.

Zoom region

The square in the above graph represents where we are going to zoom in.

One gram change

This is the zoomed-in plot showing the magnitude of a one gram change. I have arbitrarily set the one unit change at 1,182 grams, but the picture would be similar for any other choice. Even with a fifteen-fold zoom, the one unit change is still barely visible. A baby with a birth weight of 1,182 grams has a predicted probability of 0.4109 (odds of 0.6974) and a baby with a birth weight of 1,183 grams has a predicted probability of 0.4113 (odds of 0.6987). When the changes appear in only the third or fourth decimal place, it's hardly surprising that the odds ratio is almost indistinguishable from 1. The result is statistically significant, but because the researcher chose the wrong units of measurement, you have no way of understanding the practical implications of the odds ratio.

Kilogram graph

Here's a similar graph, but with the units of birth weight changed from grams to kilograms. A one unit change on this graph is huge. The probability for a 1.182 kilogram baby is the same as for a 1,182 gram baby, of course (0.4109 or an odds of 0.6974), but for a 2.182 kilogram baby, the probability is 0.8101 (odds 4.2657). When you convert from grams to kilograms, you get a much better understanding of the odds ratio. Each additional kilogram of weight increases the odds of exclusive breast feeding by a factor of 6.1.
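If you want to check the arithmetic linking the probabilities, the odds, and the odds ratio, here is a small Python sketch. It uses only the predicted odds quoted above; the helper names are my own, and nothing here refits the model.

```python
def prob_to_odds(p):
    """Odds = p / (1 - p)."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Probability = odds / (1 + odds)."""
    return odds / (1 + odds)

# Predicted odds quoted in the text for two babies one kilogram apart.
odds_1182_g = 0.6974   # 1.182 kg
odds_2182_g = 4.2657   # 2.182 kg

print(round(odds_to_prob(odds_1182_g), 4))   # 0.4109
print(round(odds_to_prob(odds_2182_g), 4))   # 0.8101

# The odds ratio for a one kilogram increase is the ratio of the two odds.
print(round(odds_2182_g / odds_1182_g, 1))   # 6.1
```

Notice that the odds ratio is a ratio of odds, not of probabilities. The probability roughly doubles (0.41 to 0.81), but the odds go up about sixfold.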

Now, I must admit that even though a gram change was way too small, a kilogram change does seem a bit too big. It might be logical to think about 100 gram changes (100 grams = 1 hectogram).

Hectogram graph

Here's a graph showing the birth weight in hectograms, and here the change is smaller, but still easy to interpret. The predicted probability for an 11.82 hectogram baby is still 0.4109 (odds 0.6974), and the probability for a 12.82 hectogram baby is 0.4553 (odds 0.8359). You can estimate the odds ratio on the hectogram scale by taking the odds ratio on the kilogram scale and raising it to the 1/10 power. This produces an odds ratio of 1.20. For every 100 gram increase in weight, the odds of exclusive breast feeding increase by about 20%.
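The 1/10 power trick works for the confidence limits as well as the odds ratio. Here is a hedged Python sketch, again starting from the rounded kilogram results (6.1, 1.7 to 21.8) rather than the raw output, so expect small rounding differences from a real re-analysis. The `rescale` helper is my own name, not anything from SPSS.

```python
# Rounded kilogram-scale results quoted earlier in the text.
or_kg, lo_kg, hi_kg = 6.1, 1.7, 21.8

# One hectogram is 100 grams, which is one tenth of a kilogram.
KG_PER_HECTOGRAM = 0.1

def rescale(odds_ratio, new_unit_in_old_units):
    """Odds ratio for a change of `new_unit_in_old_units` old units."""
    return odds_ratio ** new_unit_in_old_units

or_hg = rescale(or_kg, KG_PER_HECTOGRAM)
lo_hg = rescale(lo_kg, KG_PER_HECTOGRAM)
hi_hg = rescale(hi_kg, KG_PER_HECTOGRAM)

print(f"per hectogram: {or_hg:.2f} ({lo_hg:.2f} to {hi_hg:.2f})")

# Cross-check against the two predicted odds read off the graph.
print(f"ratio of predicted odds: {0.8359 / 0.6974:.2f}")
```

The odds ratio comes out at 1.20 either way. The limits computed from the rounded kilogram interval land within about 0.01 of what you would get by re-running the model in hectograms; the small gap is there because 1.7 and 21.8 were already rounded to one decimal place.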

Logistic regression analysis (hectogram)

Here is the SPSS analysis using hectograms. The odds ratio is 1.2 and the confidence interval goes from 1.06 to 1.36.

Now the same problem could occur, in theory, for linear regression, but a poor choice of units is easily noticed and easily corrected in linear regression. Here's a different outcome variable, the duration of breast feeding in weeks.

Linear regression analysis (grams)

For this analysis, if you use birth weight in grams, you get a ridiculously small regression coefficient (0.003). It is easier to spot, though, and if you wanted the results in kilograms, you could get them without re-running the analysis by multiplying the regression coefficient and both confidence limits by 1,000.

Linear regression analysis (kilograms)

If multiplying by 1,000 is too much work for you, here's the SPSS analysis in kilograms. Each kilogram of additional weight produces an increase of 3 weeks in duration of breast feeding, although this increase is not statistically significant.
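To convince yourself that multiplying by 1,000 really does give the same answer as re-running the regression, here is a small Python sketch. The six data points are made up purely for illustration (they are not from the breast feeding data set), and the slope function is a plain least squares formula written out by hand.

```python
import math

def ols_slope(x, y):
    """Least squares slope of y on x for simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data: birth weight in grams, breast feeding duration in weeks.
grams = [1001, 1240, 1480, 1730, 2010, 2453]
weeks = [4, 9, 3, 12, 6, 11]

kilograms = [g / 1000 for g in grams]

slope_g = ols_slope(grams, weeks)       # weeks per gram
slope_kg = ols_slope(kilograms, weeks)  # weeks per kilogram

print(f"slope per gram:     {slope_g:.6f}")
print(f"slope per kilogram: {slope_kg:.3f}")
print("kilogram slope equals 1000 x gram slope:",
      math.isclose(slope_kg, 1000 * slope_g, rel_tol=1e-9))
```

The contrast with logistic regression is that here the rescaling is a simple multiplication of the coefficient, while an odds ratio has to be raised to a power.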

In logistic regression, keep your eye open for a confidence interval that is outrageously narrow combined with an odds ratio that is extremely close to 1. It usually is a sign that you need to convert your independent variable to a different unit of measurement.

This page was written by Steve Simon and is licensed under the Creative Commons Attribution 3.0 United States License.