A hidden Markov modeling approach for admixture mapping based on case-control data

Chun Zhang, Kun Chen, Michael F Seldin, Hongzhe Li

Research output: Contribution to journal › Article

24 Citations (Scopus)

Abstract

Admixture mapping is a potentially powerful method for mapping genes for complex human diseases when the disease frequency attributable to a particular susceptibility gene differs between founding populations of different ethnicity. The method tests for association between allele ancestry and the disease. Because the markers used to define ancestral populations are not fully informative about ancestry status, a direct test of this association is not possible. In this report, we develop a unified hidden Markov model (HMM) framework for estimating the unobserved ancestry haplotypes across a chromosomal region based on marker haplotype or genotype data. The HMM efficiently uses all the marker data to infer the latent ancestry states at the putative disease locus. Within this HMM framework, we develop a likelihood ratio test for association between allele ancestry and disease risk based on case-control data. The existence of such an association may imply linkage between the candidate locus and the disease locus. We evaluate by simulation how several factors affect the power of admixture mapping, including sample size, ethnicity relative risk, marker density, and the admixture dynamics. Our simulation results indicate correct type I error rates for the proposed likelihood ratio tests and a strong effect of marker density on power. The simulation results also indicate that the methods work well for admixed populations derived from both hybrid-isolation and continuous gene-flow models. Finally, we observed that the genotype-based HMM has power very similar to that of the haplotype-based HMM when the haplotypes are known and the set of markers is highly informative.
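
The abstract does not state the model's equations, so the following is only a minimal sketch of how an HMM can infer latent ancestry along one haplotype, not the authors' implementation. It assumes biallelic markers, known allele frequencies in each ancestral population, and a transition model in which ancestry is unchanged between adjacent markers with probability exp(-tau * d) and is otherwise redrawn from the admixture proportions. The function name `ancestry_posteriors` and the parameters `freqs`, `dists`, `admix`, and `tau` are hypothetical names chosen for illustration.

```python
import numpy as np


def ancestry_posteriors(alleles, freqs, dists, admix, tau):
    """Posterior P(ancestry = k | all markers) at each marker of one haplotype.

    alleles: (M,) observed alleles coded 0/1
    freqs:   (M, K) frequency of allele 1 in each of K ancestral populations
    dists:   (M-1,) genetic distance (Morgans) between adjacent markers
    admix:   (K,) admixture proportions, used as the initial distribution
    tau:     assumed number of generations since admixture
    """
    alleles = np.asarray(alleles)
    freqs = np.asarray(freqs, dtype=float)
    admix = np.asarray(admix, dtype=float)
    M, K = freqs.shape

    # Emission probability of the observed allele under each ancestry state.
    emit = np.where(alleles[:, None] == 1, freqs, 1.0 - freqs)

    def trans(d):
        # Assumed transition model: no switch with probability exp(-tau * d),
        # otherwise the ancestry is redrawn from the admixture proportions.
        stay = np.exp(-tau * d)
        return stay * np.eye(K) + (1.0 - stay) * np.tile(admix, (K, 1))

    # Scaled forward pass.
    fwd = np.zeros((M, K))
    fwd[0] = admix * emit[0]
    fwd[0] /= fwd[0].sum()
    for m in range(1, M):
        fwd[m] = (fwd[m - 1] @ trans(dists[m - 1])) * emit[m]
        fwd[m] /= fwd[m].sum()

    # Scaled backward pass.
    bwd = np.ones((M, K))
    for m in range(M - 2, -1, -1):
        bwd[m] = trans(dists[m]) @ (emit[m + 1] * bwd[m + 1])
        bwd[m] /= bwd[m].sum()

    # Combine and normalize so that each row is a probability distribution.
    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)


# Three highly informative markers, all carrying the allele common in population 0.
post = ancestry_posteriors(
    alleles=[1, 1, 1],
    freqs=[[0.99, 0.01]] * 3,
    dists=[0.01, 0.01],
    admix=[0.5, 0.5],
    tau=10.0,
)
```

Under these assumptions, the posterior at a candidate locus pools information from all markers on the haplotype, which is the role the abstract assigns to the HMM. An ancestry-disease association test would then be built on such posteriors for cases and controls.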

Original language: English (US)
Pages (from-to): 225-239
Number of pages: 15
Journal: Genetic Epidemiology
Volume: 27
Issue number: 3
DOI: 10.1002/gepi.20021
State: Published - Nov 2004

Fingerprint

Haplotypes
Alleles
Genotype
Population
Chromosome Mapping
Sample Size
Genes

Keywords

  • Admixed population
  • Ancestry association
  • Ancestry relative risk
  • Haplotype

ASJC Scopus subject areas

  • Genetics (clinical)
  • Epidemiology

Cite this

A hidden Markov modeling approach for admixture mapping based on case-control data. / Zhang, Chun; Chen, Kun; Seldin, Michael F; Li, Hongzhe.

In: Genetic Epidemiology, Vol. 27, No. 3, 11.2004, p. 225-239.

@article{d6c243c08fe24eeaa36eaba77810abe2,
title = "A hidden Markov modeling approach for admixture mapping based on case-control data",
keywords = "Admixed population, Ancestry association, Ancestry relative risk, Haplotype",
author = "Chun Zhang and Kun Chen and Seldin, {Michael F} and Hongzhe Li",
year = "2004",
month = "11",
doi = "10.1002/gepi.20021",
language = "English (US)",
volume = "27",
pages = "225--239",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "3",

}

TY - JOUR

T1 - A hidden Markov modeling approach for admixture mapping based on case-control data

AU - Zhang, Chun

AU - Chen, Kun

AU - Seldin, Michael F

AU - Li, Hongzhe

PY - 2004/11

Y1 - 2004/11

KW - Admixed population

KW - Ancestry association

KW - Ancestry relative risk

KW - Haplotype

UR - http://www.scopus.com/inward/record.url?scp=7444241441&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=7444241441&partnerID=8YFLogxK

U2 - 10.1002/gepi.20021

DO - 10.1002/gepi.20021

M3 - Article

VL - 27

SP - 225

EP - 239

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 3

ER -