Marginal modeling of nonnested multilevel data using standard software

Diana L. Miglioretti, Patrick J. Heagerty

Research output: Contribution to journal › Article › peer-review

78 Scopus citations


Epidemiologic data are often clustered within multiple levels that may not be nested within each other. Generalized estimating equations are commonly used to adjust for correlation among observations within clusters when fitting regression models; however, standard software does not currently accommodate nonnested clusters. This paper introduces a simple generalized estimating equation strategy that uses available commercial or public software for the regression analysis of nonnested multilevel data. The authors describe how to obtain empirical standard error estimates for constructing valid confidence intervals and conducting statistical hypothesis tests. The method is evaluated using simulations and illustrated with an analysis of data from the Breast Cancer Surveillance Consortium that estimates the influence of woman, radiologist, and facility characteristics on the positive predictive value of screening mammography. Performance with a small number of clusters is discussed. Both the simulations and the example demonstrate the importance of accounting for the correlation within all levels of clustering for proper inference.
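The abstract does not spell out the variance formula, so the following is only a minimal sketch under a stated assumption: that empirical variances computed separately for each clustering level can be combined as V_A + V_B - V_AB, where V_AB clusters on the intersection of the two levels. To stay self-contained it uses an intercept-only linear model with working independence (so the estimate is just the sample mean). The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def cluster_variance(residuals, cluster_ids):
    """Empirical (sandwich) variance of a sample mean, treating
    observations that share a cluster id as correlated."""
    n = len(residuals)
    totals = defaultdict(float)
    for r, c in zip(residuals, cluster_ids):
        totals[c] += r  # sum residuals within each cluster
    return sum(t * t for t in totals.values()) / (n * n)


def nonnested_mean_se(y, level_a, level_b):
    """Mean of y with a standard error that accounts for two
    nonnested clustering levels (assumed rule: V_A + V_B - V_AB)."""
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]
    v_a = cluster_variance(resid, level_a)
    v_b = cluster_variance(resid, level_b)
    # Intersection clusters: observations sharing BOTH ids.
    v_ab = cluster_variance(resid, list(zip(level_a, level_b)))
    var = v_a + v_b - v_ab
    # The combined estimate can be negative in small samples;
    # clamping at zero is a guard for this sketch only.
    return mean, max(var, 0.0) ** 0.5


# Hypothetical toy data: outcomes crossed by radiologist and facility.
outcome = [1, 0, 1, 1, 0, 1, 0, 0]
radiologist = ["r1", "r1", "r2", "r2", "r3", "r3", "r1", "r2"]
facility = ["f1", "f2", "f1", "f2", "f1", "f2", "f1", "f1"]
estimate, std_err = nonnested_mean_se(outcome, radiologist, facility)
print(estimate, std_err)
```

With covariates, the same combination would apply to the full sandwich covariance matrices from three fits that differ only in the cluster identifier. The small-cluster caveat raised in the abstract still applies.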

Original language: English (US)
Pages (from-to): 453-463
Number of pages: 11
Journal: American Journal of Epidemiology
Issue number: 4
State: Published - Feb 2007
Externally published: Yes


Keywords

  • Clustered data
  • Generalized estimating equation
  • Generalized linear model

ASJC Scopus subject areas

  • Epidemiology

