Aspects of the design and analysis of high-dimensional SNP studies for disease risk estimation

Ross L. Prentice, Lihong Qi

Research output: Contribution to journal › Article › peer-review

28 Scopus citations


The state of readiness for high-dimensional single nucleotide polymorphism (SNP) epidemiologic association studies is described, as background for a discussion of statistical aspects of case-control study design and analysis. Specifically, the important role that multistage designs can play in the elimination of false-positive associations and in the control of study costs will be noted. Also, the trade-offs associated with using pooled DNA at early design stages for additional important cost reductions will be discussed in some detail. An odds ratio approach to relating SNP alleles to disease risk using pooled DNA will be proposed, in conjunction with a simple empirical variance estimator, based on comparisons among log-odds ratio estimators from distinct pairs of case and control pools. Simulation studies will be presented to evaluate the moderate sample size properties of such multistage designs and estimation procedures. The design of an ongoing three-stage study in the Women's Health Initiative to relate 250 000 SNPs to the risk of coronary heart disease, stroke, and breast cancer will provide illustration, and will be used to motivate the choice of simulation configurations.
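The pooled-DNA estimation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' exact estimator, which the abstract does not specify. It assumes that each of K case pools is paired with a distinct control pool, that each pool yields one estimated allele frequency for a given SNP, and that the pairwise log-odds ratios are treated as independent replicates, so their sample variance divided by K serves as an empirical variance for the mean. The function names and the input frequencies are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_log_odds_ratios(case_freqs, control_freqs):
    """Return one allele log-odds ratio per (case pool, control pool) pair.

    case_freqs[i] and control_freqs[i] are the estimated allele
    frequencies in the i-th case pool and its paired control pool
    (hypothetical inputs; values must lie strictly between 0 and 1).
    """
    if len(case_freqs) != len(control_freqs):
        raise ValueError("each case pool needs exactly one paired control pool")
    log_ors = []
    for p_case, p_ctrl in zip(case_freqs, control_freqs):
        if not (0.0 < p_case < 1.0 and 0.0 < p_ctrl < 1.0):
            raise ValueError("allele frequencies must be strictly between 0 and 1")
        logit_case = math.log(p_case / (1.0 - p_case))
        logit_ctrl = math.log(p_ctrl / (1.0 - p_ctrl))
        log_ors.append(logit_case - logit_ctrl)
    return log_ors


def summarize_log_odds_ratios(log_ors):
    """Average the pairwise log-odds ratios and attach an empirical standard error.

    Assumption: the pool pairs are independent replicates, so the
    between-pair sample variance divided by the number of pairs
    estimates the variance of the mean.
    """
    k = len(log_ors)
    if k < 2:
        raise ValueError("at least two pool pairs are needed for a variance estimate")
    estimate = mean(log_ors)
    std_error = math.sqrt(variance(log_ors) / k)
    return estimate, std_error


if __name__ == "__main__":
    # Hypothetical allele-frequency estimates from four pool pairs.
    cases = [0.32, 0.35, 0.30, 0.34]
    controls = [0.28, 0.29, 0.27, 0.30]
    est, se = summarize_log_odds_ratios(pooled_log_odds_ratios(cases, controls))
    print(f"log-odds ratio: {est:.4f}, empirical SE: {se:.4f}")
```

Because the variance comes from the spread among pool pairs, it reflects pool-construction and measurement variation as well as sampling variation, without requiring a model for each source. With few pool pairs the estimate is noisy, which is one of the trade-offs of pooling that the abstract mentions.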

Original language: English (US)
Pages (from-to): 339-354
Number of pages: 16
Issue number: 3
State: Published - Jul 2006
Externally published: Yes


Keywords

  • Case-control
  • Cohort
  • Genetic association
  • High-dimensional data
  • Multistage design
  • Odds ratio
  • Pooled DNA
  • Single nucleotide polymorphism

ASJC Scopus subject areas

  • Medicine(all)
  • Statistics and Probability
  • Statistics, Probability and Uncertainty

