What is a Generalized Estimating Equations model? (created 2010-08-19).

Generalized Estimating Equations (GEE) is a modeling approach that can account for dependence among some of your measurements due to repeated measures, cluster sampling, or a longitudinal design. It is an extension of the Generalized Linear Model (GLM). Like the GLM, the GEE model allows you to specify a link function and a mean-variance relationship. With the appropriate choice of these two items, you can specify a wide variety of models. Thus, GEE extends linear regression, logistic regression, and Poisson regression to data from repeated measures and other designs that involve multiple measurements per sampling unit. The GEE model handles unbalanced data (more data for some subjects than for others) and missing data (some subjects missed a measurement at one time point) without requiring you to discard the incomplete subjects. You should always be cautious about missing data, of course, and study the reasons why data are missing, because standard GEE estimates can be biased when the chance of a value being missing depends on the outcome itself.
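To make the idea of a link function and a mean-variance relationship concrete, here is a minimal Python sketch of the three pairings mentioned above. The names `FAMILIES` and `mean_and_variance` are made up for this illustration and do not come from any statistics package; a real GEE routine also estimates a scale parameter, which is fixed at 1 here.

```python
import math

# Each model pairs a link function (which connects the mean to the linear
# predictor) with a variance function (how the variance depends on the mean).
# The dispersion/scale parameter is assumed to be 1 for simplicity.
FAMILIES = {
    "linear": {                       # identity link, constant variance
        "inverse_link": lambda eta: eta,
        "variance": lambda mu: 1.0,
    },
    "logistic": {                     # logit link, binomial variance
        "inverse_link": lambda eta: 1.0 / (1.0 + math.exp(-eta)),
        "variance": lambda mu: mu * (1.0 - mu),
    },
    "poisson": {                      # log link, variance equal to the mean
        "inverse_link": lambda eta: math.exp(eta),
        "variance": lambda mu: mu,
    },
}


def mean_and_variance(family, eta):
    """Return (mean, variance) implied by a linear predictor value eta."""
    spec = FAMILIES[family]
    mu = spec["inverse_link"](eta)
    return mu, spec["variance"](mu)


for name in FAMILIES:
    print(name, mean_and_variance(name, 0.0))
```

Swapping the entry you look up in `FAMILIES` is all it takes to move from one type of regression to another, which is the sense in which the two choices define the model.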

In GEE, you need to specify a pattern of correlation among your measurements, often called the working correlation. There are a variety of patterns that you can consider. The compound symmetry pattern (also called exchangeable) makes the correlation equal for any two measurements on the same subject. The autoregressive pattern makes the correlation largest for values that are closest to one another in time and declining exponentially as the separation in time increases. It is important from an efficiency perspective to try to specify the correlation pattern correctly, but the GEE model uses robust (sandwich) standard errors, which remain valid in large samples even if the pattern you select for the correlations is incorrect.
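The two correlation patterns described above are easy to write out as matrices. The sketch below is only an illustration: the function names and the value 0.5 for the correlation are assumptions chosen for the example, and in a real analysis the software estimates the correlation from your data.

```python
def compound_symmetry(n_times, rho):
    """Every pair of measurements on a subject has the same correlation rho."""
    return [[1.0 if i == j else rho for j in range(n_times)]
            for i in range(n_times)]


def autoregressive(n_times, rho):
    """Correlation is rho raised to the number of time steps between
    two measurements, so it shrinks as the measurements get further apart."""
    return [[rho ** abs(i - j) for j in range(n_times)]
            for i in range(n_times)]


# Four measurements per subject, with an assumed correlation of 0.5.
for row in compound_symmetry(4, 0.5):
    print(row)
print()
for row in autoregressive(4, 0.5):
    print(row)
```

In the first matrix every off-diagonal entry is 0.5. In the second, the entries fall from 0.5 to 0.25 to 0.125 as you move away from the diagonal.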

There are other models that you should also consider for longitudinal or repeated measures data, such as mixed models and hierarchical models. There are subtle but important differences between GEE and these other models. For example, the coefficients for a GEE model estimate an average effect across all patients (a population-averaged effect) while the coefficients in a mixed model estimate an effect for an individual patient (a subject-specific effect). None of these models is easy to use, and you should consider getting advice from a professional statistician before fitting any of them.
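A small numerical sketch can show why the two kinds of coefficients differ. It assumes a logistic model in which each patient has their own random intercept drawn from a normal distribution. The coefficient of 1.0 and the standard deviation of 2.0 are made-up values for illustration, and the averaging uses a simple weighted sum over a grid rather than any particular package's method.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def population_probability(eta, sigma, steps=2001, width=8.0):
    """Average expit(eta + b) over patients whose random intercept b is
    Normal(0, sigma^2), using a weighted sum over an evenly spaced grid."""
    lo = -width * sigma
    h = 2.0 * width * sigma / (steps - 1)
    total, weight = 0.0, 0.0
    for k in range(steps):
        b = lo + k * h
        w = math.exp(-0.5 * (b / sigma) ** 2)   # unnormalized normal density
        total += w * expit(eta + b)
        weight += w
    return total / weight


beta_subject = 1.0   # assumed effect for an individual patient (log odds)
sigma = 2.0          # assumed spread of the patient-level intercepts

p_control = population_probability(0.0, sigma)
p_treated = population_probability(beta_subject, sigma)

# Log odds ratio comparing the two groups averaged over all patients.
beta_population = logit(p_treated) - logit(p_control)
print(beta_subject, beta_population)
```

Under these assumptions the population-averaged log odds ratio comes out smaller than the subject-specific value of 1.0, even though both describe the same data. Neither number is wrong; they answer different questions.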