Characterizing race/ethnicity and genetic ancestry for 100,000 subjects in the genetic epidemiology research on adult health and aging (GERA) cohort

Yambazi Banda, Mark N. Kvale, Thomas J. Hoffmann, Stephanie E. Hesselson, Dilrini Ranatunga, Hua Tang, Chiara Sabatti, Lisa A. Croen, Brad P. Dispensa, Mary Henderson, Carlos Iribarren, Eric Jorgenson, Lawrence H. Kushi, Dana Ludwig, Diane Olberg, Charles P. Quesenberry, Sarah Rowell, Marianne Sadler, Lori C. Sakoda, Stanley Sciortino, Ling Shen, David Smethurst, Carol P. Somkin, Stephen K. Van Den Eeden, Lawrence Walter, Rachel Whitmer, Pui Yan Kwok, Catherine Schaefer, Neil Risch

Research output: Contribution to journal › Article

87 Citations (Scopus)

Abstract

Using genome-wide genotypes, we characterized the genetic structure of 103,006 participants in the Kaiser Permanente Northern California multi-ethnic Genetic Epidemiology Research on Adult Health and Aging Cohort and analyzed the relationship to self-reported race/ethnicity. Participants endorsed any of 23 race/ethnicity/nationality categories, which were collapsed into seven major race/ethnicity groups. By self-report the cohort is 80.8% white and 19.2% minority; 93.8% endorsed a single race/ethnicity group, while 6.2% endorsed two or more. Principal component (PC) and admixture analyses were generally consistent with prior studies. Approximately 17% of subjects had genetic ancestry from more than one continent, and 12% were genetically admixed, considering only nonadjacent geographical origins. Self-reported whites were spread on a continuum along the first two PCs, indicating extensive mixing among European nationalities. Self-identified East Asian nationalities correlated with genetic clustering, consistent with extensive endogamy. Individuals of mixed East Asian–European genetic ancestry were easily identified; we also observed a modest amount of European genetic ancestry in individuals self-identified as Filipinos. Self-reported African Americans and Latinos showed extensive European and African genetic ancestry, and Native American genetic ancestry for the latter. Among 3741 genetically identified parent–child pairs, 93% were concordant for self-reported race/ethnicity; among 2018 genetically identified full-sib pairs, 96% were concordant; the lower rate for parent–child pairs was largely due to intermarriage. The parent–child pairs revealed a trend toward increasing exogamy over time; the presence in the cohort of individuals endorsing multiple race/ethnicity categories creates interesting challenges and future opportunities for genetic epidemiologic studies.
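As a rough illustration of the scale behind these figures, the sketch below converts the percentages quoted in the abstract into approximate head counts. The totals and percentages come from the abstract; the helper name `approx_count` and the dictionary labels are hypothetical, and because the published percentages are rounded, the resulting counts are estimates rather than values reported by the authors.

```python
# Approximate counts implied by the rounded percentages in the abstract.
# Assumption: each percentage applies to the full total named beside it.

COHORT_SIZE = 103_006       # participants (from the abstract)
PARENT_CHILD_PAIRS = 3_741  # genetically identified parent-child pairs
FULL_SIB_PAIRS = 2_018      # genetically identified full-sib pairs


def approx_count(total: int, percent: float) -> int:
    """Return the nearest whole count for `percent` of `total` (hypothetical helper)."""
    return round(total * percent / 100)


estimates = {
    "self-reported white": approx_count(COHORT_SIZE, 80.8),
    "self-reported minority": approx_count(COHORT_SIZE, 19.2),
    "single race/ethnicity group": approx_count(COHORT_SIZE, 93.8),
    "two or more groups": approx_count(COHORT_SIZE, 6.2),
    "concordant parent-child pairs": approx_count(PARENT_CHILD_PAIRS, 93),
    "concordant full-sib pairs": approx_count(FULL_SIB_PAIRS, 96),
}

for label, count in estimates.items():
    print(f"{label}: ~{count:,}")
```

Under these assumptions, roughly 6,400 participants endorsed two or more race/ethnicity groups, which gives a sense of how many individuals fall outside a single-category analysis.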

Original language: English (US)
Pages (from-to): 1285-1295
Number of pages: 11
Journal: Genetics
Volume: 200
Issue number: 4
DOIs: https://doi.org/10.1534/genetics.115.178616
State: Published - Aug 1 2015
Externally published: Yes

Fingerprint

Genetic Research
Molecular Epidemiology
Health
Ethnic Groups
North American Indians
Genetic Structures
Principal Component Analysis
Marriage
Hispanic Americans
African Americans
Self Report
Cluster Analysis
Epidemiologic Studies
Genotype
Genome

Keywords

  • Admixture
  • Population structure
  • Principal components
  • Race/ethnicity
  • RPGEH GERA

ASJC Scopus subject areas

  • Genetics

Cite this

Characterizing race/ethnicity and genetic ancestry for 100,000 subjects in the genetic epidemiology research on adult health and aging (GERA) cohort. / Banda, Yambazi; Kvale, Mark N.; Hoffmann, Thomas J.; Hesselson, Stephanie E.; Ranatunga, Dilrini; Tang, Hua; Sabatti, Chiara; Croen, Lisa A.; Dispensa, Brad P.; Henderson, Mary; Iribarren, Carlos; Jorgenson, Eric; Kushi, Lawrence H.; Ludwig, Dana; Olberg, Diane; Quesenberry, Charles P.; Rowell, Sarah; Sadler, Marianne; Sakoda, Lori C.; Sciortino, Stanley; Shen, Ling; Smethurst, David; Somkin, Carol P.; Van Den Eeden, Stephen K.; Walter, Lawrence; Whitmer, Rachel; Kwok, Pui Yan; Schaefer, Catherine; Risch, Neil.

In: Genetics, Vol. 200, No. 4, 01.08.2015, p. 1285-1295.


Banda, Y, Kvale, MN, Hoffmann, TJ, Hesselson, SE, Ranatunga, D, Tang, H, Sabatti, C, Croen, LA, Dispensa, BP, Henderson, M, Iribarren, C, Jorgenson, E, Kushi, LH, Ludwig, D, Olberg, D, Quesenberry, CP, Rowell, S, Sadler, M, Sakoda, LC, Sciortino, S, Shen, L, Smethurst, D, Somkin, CP, Van Den Eeden, SK, Walter, L, Whitmer, R, Kwok, PY, Schaefer, C & Risch, N 2015, 'Characterizing race/ethnicity and genetic ancestry for 100,000 subjects in the genetic epidemiology research on adult health and aging (GERA) cohort', Genetics, vol. 200, no. 4, pp. 1285-1295. https://doi.org/10.1534/genetics.115.178616
@article{cba427af3f2f42a2ba997dec50e5356d,
title = "Characterizing race/ethnicity and genetic ancestry for 100,000 subjects in the genetic epidemiology research on adult health and aging (GERA) cohort",
abstract = "Using genome-wide genotypes, we characterized the genetic structure of 103,006 participants in the Kaiser Permanente Northern California multi-ethnic Genetic Epidemiology Research on Adult Health and Aging Cohort and analyzed the relationship to self-reported race/ethnicity. Participants endorsed any of 23 race/ethnicity/nationality categories, which were collapsed into seven major race/ethnicity groups. By self-report the cohort is 80.8{\%} white and 19.2{\%} minority; 93.8{\%} endorsed a single race/ethnicity group, while 6.2{\%} endorsed two or more. Principal component (PC) and admixture analyses were generally consistent with prior studies. Approximately 17{\%} of subjects had genetic ancestry from more than one continent, and 12{\%} were genetically admixed, considering only nonadjacent geographical origins. Self-reported whites were spread on a continuum along the first two PCs, indicating extensive mixing among European nationalities. Self-identified East Asian nationalities correlated with genetic clustering, consistent with extensive endogamy. Individuals of mixed East Asian–European genetic ancestry were easily identified; we also observed a modest amount of European genetic ancestry in individuals self-identified as Filipinos. Self-reported African Americans and Latinos showed extensive European and African genetic ancestry, and Native American genetic ancestry for the latter. Among 3741 genetically identified parent–child pairs, 93{\%} were concordant for self-reported race/ethnicity; among 2018 genetically identified full-sib pairs, 96{\%} were concordant; the lower rate for parent–child pairs was largely due to intermarriage. The parent–child pairs revealed a trend toward increasing exogamy over time; the presence in the cohort of individuals endorsing multiple race/ethnicity categories creates interesting challenges and future opportunities for genetic epidemiologic studies.",
keywords = "Admixture, Population structure, Principal components, Race/ethnicity, RPGEH GERA",
author = "Yambazi Banda and Kvale, {Mark N.} and Hoffmann, {Thomas J.} and Hesselson, {Stephanie E.} and Dilrini Ranatunga and Hua Tang and Chiara Sabatti and Croen, {Lisa A.} and Dispensa, {Brad P.} and Mary Henderson and Carlos Iribarren and Eric Jorgenson and Kushi, {Lawrence H.} and Dana Ludwig and Diane Olberg and Quesenberry, {Charles P.} and Sarah Rowell and Marianne Sadler and Sakoda, {Lori C.} and Stanley Sciortino and Ling Shen and David Smethurst and Somkin, {Carol P.} and {Van Den Eeden}, {Stephen K.} and Lawrence Walter and Rachel Whitmer and Kwok, {Pui Yan} and Catherine Schaefer and Neil Risch",
year = "2015",
month = "8",
day = "1",
doi = "10.1534/genetics.115.178616",
language = "English (US)",
volume = "200",
pages = "1285--1295",
journal = "Genetics",
issn = "0016-6731",
publisher = "Genetics Society of America",
number = "4",

}
