Characterizing race/ethnicity and genetic ancestry for 100,000 subjects in the genetic epidemiology research on adult health and aging (GERA) cohort

Yambazi Banda, Mark N. Kvale, Thomas J. Hoffmann, Stephanie E. Hesselson, Dilrini Ranatunga, Hua Tang, Chiara Sabatti, Lisa A. Croen, Brad P. Dispensa, Mary Henderson, Carlos Iribarren, Eric Jorgenson, Lawrence H. Kushi, Dana Ludwig, Diane Olberg, Charles P. Quesenberry, Sarah Rowell, Marianne Sadler, Lori C. Sakoda, Stanley Sciortino, Ling Shen, David Smethurst, Carol P. Somkin, Stephen K. Van Den Eeden, Lawrence Walter, Rachel Whitmer, Pui Yan Kwok, Catherine Schaefer, Neil Risch

Research output: Contribution to journal › Article › peer-review

173 Scopus citations


Using genome-wide genotypes, we characterized the genetic structure of 103,006 participants in the Kaiser Permanente Northern California multi-ethnic Genetic Epidemiology Research on Adult Health and Aging Cohort and analyzed the relationship to self-reported race/ethnicity. Participants endorsed any of 23 race/ethnicity/nationality categories, which were collapsed into seven major race/ethnicity groups. By self-report the cohort is 80.8% white and 19.2% minority; 93.8% endorsed a single race/ethnicity group, while 6.2% endorsed two or more. Principal component (PC) and admixture analyses were generally consistent with prior studies. Approximately 17% of subjects had genetic ancestry from more than one continent, and 12% were genetically admixed, considering only nonadjacent geographical origins. Self-reported whites were spread on a continuum along the first two PCs, indicating extensive mixing among European nationalities. Self-identified East Asian nationalities correlated with genetic clustering, consistent with extensive endogamy. Individuals of mixed East Asian–European genetic ancestry were easily identified; we also observed a modest amount of European genetic ancestry in individuals self-identified as Filipinos. Self-reported African Americans and Latinos showed extensive European and African genetic ancestry, and Native American genetic ancestry for the latter. Among 3741 genetically identified parent–child pairs, 93% were concordant for self-reported race/ethnicity; among 2018 genetically identified full-sib pairs, 96% were concordant; the lower rate for parent–child pairs was largely due to intermarriage. The parent–child pairs revealed a trend toward increasing exogamy over time; the presence in the cohort of individuals endorsing multiple race/ethnicity categories creates interesting challenges and future opportunities for genetic epidemiologic studies.

Original language: English (US)
Pages (from-to): 1285-1295
Number of pages: 11
Issue number: 4
State: Published - Aug 1 2015
Externally published: Yes


Keywords

  • Admixture
  • Population structure
  • Principal components
  • Race/ethnicity

ASJC Scopus subject areas

  • Genetics

