What is residual confounding? (created 2010-01-06).

Residual confounding is a frequent explanation for unusual research findings. Before I define the term and show an example, I need to address a more basic issue. The term "confounding" is used frequently but often without careful consideration of the true definition of the term. I tend to shy away from this term and typically use "covariate imbalance" instead.

A covariate is a variable that is not of direct interest in a research study but which needs to be accounted for as part of the research because it has the potential to influence the outcome variable. In many studies of cancer, the smoking status of the subject needs to be measured, not because we are trying to establish a link between smoking and cancer (that link was already well known for many types of cancer) but rather because smoking habits may differ in the patients exposed or not exposed to a toxic substance. This has the potential to mask a true relationship between exposure and cancer or to produce an artefactual relationship between exposure and cancer.

Covariate imbalance occurs when the mean of the covariate (for a continuous variable) or the probabilities for the covariate levels (for a categorical variable) differ between the exposed and unexposed groups.
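To make the definition concrete, here is a minimal sketch that checks for covariate imbalance in a tiny data set. The eight records, the variable names, and the helper functions are all hypothetical and chosen only for illustration; they do not come from any study cited here.

```python
from statistics import mean

# Hypothetical records: (exposed?, age in years, smoker?)
subjects = [
    (True, 52, True), (True, 61, True), (True, 47, False), (True, 58, True),
    (False, 44, False), (False, 39, True), (False, 50, False), (False, 41, False),
]

def group(exposed):
    """Return the records for the exposed or the unexposed group."""
    return [s for s in subjects if s[0] == exposed]

def mean_age(rows):
    """Mean of the continuous covariate (age)."""
    return mean(r[1] for r in rows)

def smoking_proportion(rows):
    """Probability of the 'smoker' level of the categorical covariate."""
    return mean(1 if r[2] else 0 for r in rows)

# Imbalance shows up as a nonzero gap between the two groups.
age_gap = mean_age(group(True)) - mean_age(group(False))
smoking_gap = smoking_proportion(group(True)) - smoking_proportion(group(False))

print(f"difference in mean age: {age_gap}")
print(f"difference in smoking proportion: {smoking_gap}")
```

In this made-up data the exposed group is both older and more likely to smoke, so both covariates are imbalanced. How large a gap matters in practice is a judgment call that this sketch does not try to settle.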

Confounding requires a more rigorous definition than covariate imbalance. A confounding variable must be correlated with exposure (essentially the same condition as covariate imbalance). A confounding variable must also be a direct cause of the outcome being studied or a surrogate for a direct cause of the outcome. Contrast this with the weaker requirement, a potential influence on the outcome, that we use to define a covariate.

In addition, a confounding variable must not be an intermediate step on the causal pathway between exposure and outcome. More recently, researchers have stated a more direct standard: a confounding variable cannot be a variable that is affected by exposure.

Residual confounding occurs when a confounding variable is measured imperfectly or with some error and the adjustment using this imperfect measure does not completely remove the effect of the confounding variable. An example appears in Chen et al (1999). In the unadjusted data, women who smoked during pregnancy appeared to have a decreased risk of having a Down syndrome birth (crude odds ratio 0.80). This is puzzling, as smoking is not often thought of as a good thing to do. Should we ask women to start smoking during pregnancy?

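It turns out that there is a relationship between age and smoking during pregnancy, with younger women being more likely to indulge in this bad habit. Younger women are also less likely to give birth to a child with Down syndrome. When you adjust the model relating smoking and Down syndrome for the important covariate of age, using the exact year of age, the effect of smoking disappears (Chen et al report an odds ratio of 1.00 after adjusting for exact age along with race and parity). But when you make the adjustment using a binary variable (age <35 years, age >=35 years), the protective effect of smoking appears to remain (odds ratio 0.87). Within each of these two broad categories, age still varies quite a bit and is still related to both smoking and Down syndrome, so the binary adjustment removes only part of the effect of age. This is an example of residual confounding.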
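The pattern described above can be reproduced with a small deterministic calculation. This is a sketch under invented assumptions, not a reanalysis of the Chen et al data: the age range, the number of births per age, and the two probability curves are all made up. Smoking is given no effect at all on Down syndrome, so any odds ratio below 1 is produced entirely by age. The Mantel-Haenszel pooled odds ratio is my choice of adjustment method for the illustration; the source does not say which method the authors used.

```python
import math

AGES = list(range(15, 45))   # maternal age in whole years (hypothetical range)
BIRTHS_PER_AGE = 100_000     # made-up count, the same at every age

def p_smoke(age):
    """Assumed smoking prevalence, falling linearly with age."""
    return 0.30 - 0.006 * (age - 15)

def p_down(age):
    """Assumed Down syndrome risk, rising with age; identical for smokers
    and nonsmokers, so smoking has no true effect in this sketch."""
    return 0.0005 * math.exp(0.12 * (age - 15))

def expected_table(age):
    """Expected 2x2 counts at one age: (smoker case, smoker non-case,
    nonsmoker case, nonsmoker non-case)."""
    ps, pd, n = p_smoke(age), p_down(age), BIRTHS_PER_AGE
    return (n * ps * pd, n * ps * (1 - pd),
            n * (1 - ps) * pd, n * (1 - ps) * (1 - pd))

def mantel_haenszel_or(stratum_of):
    """Pool single-year tables into strata, then combine the strata with
    the Mantel-Haenszel odds ratio estimator."""
    pooled = {}
    for age in AGES:
        cells = pooled.setdefault(stratum_of(age), [0.0, 0.0, 0.0, 0.0])
        for i, count in enumerate(expected_table(age)):
            cells[i] += count
    numerator = denominator = 0.0
    for a, b, c, d in pooled.values():
        total = a + b + c + d
        numerator += a * d / total
        denominator += b * c / total
    return numerator / denominator

crude_or = mantel_haenszel_or(lambda age: 0)           # one stratum: no adjustment
binary_or = mantel_haenszel_or(lambda age: age >= 35)  # <35 versus >=35
exact_or = mantel_haenszel_or(lambda age: age)         # one stratum per year of age

print(f"crude odds ratio:             {crude_or:.3f}")
print(f"adjusted for binary age:      {binary_or:.3f}")
print(f"adjusted for exact year of age: {exact_or:.3f}")
```

Under these assumptions the crude odds ratio falls furthest below 1, the binary adjustment moves it only part of the way back, and stratifying on exact year of age returns an odds ratio of 1 by construction. The gap between the last two numbers is the residual confounding left behind by the coarse age categories.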

Chi-Ling Chen, Tim J. Gilbert, Janet R. Daling. Maternal Smoking and Down Syndrome: The Confounding Effect of Maternal Age. Am. J. Epidemiol. 1999;149(5):442-446. Abstract: "Inconsistent results have been reported from studies evaluating the association of maternal smoking with birth of a Down syndrome child. Control of known risk factors, particularly maternal age, has also varied across studies. By using a population-based case-control design (775 Down syndrome cases and 7,750 normal controls) and Washington State birth record data for 1984-1994, the authors examined this hypothesized association and found a crude odds ratio of 0.80 (95% confidence interval 0.65-0.98). Controlling for broad categories of maternal age (<35 years, ≥35 years), as described in prior studies, resulted in a negative association (odds ratio = 0.87, 95% confidence interval 0.71-1.07). However, controlling for exact year of maternal age in conjunction with race and parity resulted in no association (odds ratio = 1.00, 95% confidence interval 0.82-1.24). In this study, the prevalence of Down syndrome births increased with increasing maternal age, whereas among controls the reported prevalence of smoking during pregnancy decreased with increasing maternal age. There is a substantial potential for residual confounding by maternal age in studies of maternal smoking and Down syndrome. After adequately controlling for maternal age in this study, the authors found no clear relation between maternal smoking and the risk of Down syndrome. Am J Epidemiol 1999; 149:442-6." [Accessed January 6, 2010]. Available at: http://aje.oxfordjournals.org/cgi/content/abstract/149/5/442.

R McNamee. Confounding and confounders. Occupational and Environmental Medicine. 2003;60(3):227-234. Excerpt: "Confounding should always be addressed in studies concerned with causality. When present, it results in a biased estimate of the effect of exposure on disease. The bias can be negative -- resulting in underestimation of the exposure effect -- or positive, and can even reverse the apparent direction of effect. It is a concern no matter what the design of the study or what statistic is used to measure the effect of exposure." [Accessed January 6, 2010]. Available at: http://oem.bmj.com/content/60/3/227.short.