Epidemiologic data are often clustered within multiple levels that may not be nested within each other. Generalized estimating equations are commonly used to adjust for correlation among observations within clusters when fitting regression models; however, standard software does not currently accommodate nonnested clusters. This paper introduces a simple generalized estimating equation strategy that uses available commercial or public software for the regression analysis of nonnested multilevel data. The authors describe how to obtain empirical standard error estimates for constructing valid confidence intervals and conducting statistical hypothesis tests. The method is evaluated using simulations and illustrated with an analysis of data from the Breast Cancer Surveillance Consortium that estimates the influence of woman, radiologist, and facility characteristics on the positive predictive value of screening mammography. Performance with a small number of clusters is discussed. Both the simulations and the example demonstrate the importance of accounting for the correlation within all levels of clustering for proper inference.
- Clustered data
- Generalized estimating equation
- Generalized linear model
ASJC Scopus subject areas