TY - JOUR
T1 - Personalized copy number and segmental duplication maps using next-generation sequencing
AU - Alkan, Can
AU - Kidd, Jeffrey M.
AU - Marques-Bonet, Tomas
AU - Aksay, Gozde
AU - Antonacci, Francesca
AU - Hormozdiari, Fereydoun
AU - Kitzman, Jacob O.
AU - Baker, Carl
AU - Malig, Maika
AU - Mutlu, Onur
AU - Sahinalp, S. Cenk
AU - Gibbs, Richard A.
AU - Eichler, Evan E.
PY - 2009/10/1
Y1 - 2009/10/1
N2 - Despite their importance in gene innovation and phenotypic variation, duplicated regions have remained largely intractable owing to difficulties in accurately resolving their structure, copy number and sequence content. We present an algorithm (mrFAST) to comprehensively map next-generation sequence reads, which allows for the prediction of absolute copy-number variation of duplicated segments and genes. We examine three human genomes and experimentally validate genome-wide copy number differences. We estimate that, on average, 73-87 genes vary in copy number between any two individuals and find that these genic differences overwhelmingly correspond to segmental duplications (odds ratio = 135; P < 2.2 × 10^-16). Our method can distinguish between different copies of highly identical genes, providing a more accurate assessment of gene content and insight into functional constraint without the limitations of array-based technology.
AB - Despite their importance in gene innovation and phenotypic variation, duplicated regions have remained largely intractable owing to difficulties in accurately resolving their structure, copy number and sequence content. We present an algorithm (mrFAST) to comprehensively map next-generation sequence reads, which allows for the prediction of absolute copy-number variation of duplicated segments and genes. We examine three human genomes and experimentally validate genome-wide copy number differences. We estimate that, on average, 73-87 genes vary in copy number between any two individuals and find that these genic differences overwhelmingly correspond to segmental duplications (odds ratio = 135; P < 2.2 × 10^-16). Our method can distinguish between different copies of highly identical genes, providing a more accurate assessment of gene content and insight into functional constraint without the limitations of array-based technology.
UR - http://www.scopus.com/inward/record.url?scp=70349556543&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349556543&partnerID=8YFLogxK
U2 - 10.1038/ng.437
DO - 10.1038/ng.437
M3 - Article
C2 - 19718026
AN - SCOPUS:70349556543
VL - 41
SP - 1061
EP - 1067
JO - Nature Genetics
JF - Nature Genetics
SN - 1061-4036
IS - 10
ER -