Multi-platform discovery of haplotype-resolved structural variation in human genomes

Mark J.P. Chaisson, Ashley D. Sanders, Xuefang Zhao, Ankit Malhotra, David Porubsky, Tobias Rausch, Eugene J. Gardner, Oscar L. Rodriguez, Li Guo, Ryan L. Collins, Xian Fan, Jia Wen, Robert E. Handsaker, Susan Fairley, Zev N. Kronenberg, Xiangmeng Kong, Fereydoun Hormozdiari, Dillon Lee, Aaron M. Wenger, Alex R. Hastie, Danny Antaki, Thomas Anantharaman, Peter A. Audano, Harrison Brand, Stuart Cantsilieris, Han Cao, Eliza Cerveira, Chong Chen, Xintong Chen, Chen Shan Chin, Zechen Chong, Nelson T. Chuang, Christine C. Lambert, Deanna M. Church, Laura Clarke, Andrew Farrell, Joey Flores, Timur Galeev, David U. Gorkin, Madhusudan Gujral, Victor Guryev, William Haynes Heaton, Jonas Korlach, Sushant Kumar, Jee Young Kwon, Ernest T. Lam, Jong Eun Lee, Joyce Lee, Wan Ping Lee, Sau Peng Lee, Shantao Li, Patrick Marks, Karine Viaud-Martinez, Sascha Meiers, Katherine M. Munson, Fabio C.P. Navarro, Bradley J. Nelson, Conor Nodzak, Amina Noor, Sofia Kyriazopoulou-Panagiotopoulou, Andy W.C. Pang, Yunjiang Qiu, Gabriel Rosanio, Mallory Ryan, Adrian Stütz, Diana C.J. Spierings, Alistair Ward, Anne Marie E. Welch, Ming Xiao, Wei Xu, Chengsheng Zhang, Qihui Zhu, Xiangqun Zheng-Bradley, Ernesto Lowy, Sergei Yakneen, Steven McCarroll, Goo Jun, Li Ding, Chong Lek Koh, Bing Ren, Paul Flicek, Ken Chen, Mark B. Gerstein, Pui Yan Kwok, Peter M. Lansdorp, Gabor T. Marth, Jonathan Sebat, Xinghua Shi, Ali Bashir, Kai Ye, Scott E. Devine, Michael E. Talkowski, Ryan E. Mills, Tobias Marschall, Jan O. Korbel, Evan E. Eichler, Charles Lee

Research output: Contribution to journal (Article)

42 Citations (Scopus)

Abstract

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.

Original language: English (US)
Article number: 1784
Journal: Nature communications
Volume: 10
Issue number: 1
DOI: https://doi.org/10.1038/s41467-018-08148-z
State: Published - Dec 1 2019

Fingerprint

genome
Human Genome
sequencing
Haplotypes
platforms
Genes
Genome
Medical Genetics
Genomic Structural Variation
Inborn Genetic Diseases
inversions
biological diversity
recommendations
strands
Technology
Throughput
sensitivity

ASJC Scopus subject areas

  • Chemistry(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Physics and Astronomy(all)

Cite this

Chaisson, M. J. P., Sanders, A. D., Zhao, X., Malhotra, A., Porubsky, D., Rausch, T., ... Lee, C. (2019). Multi-platform discovery of haplotype-resolved structural variation in human genomes. Nature communications, 10(1), [1784]. https://doi.org/10.1038/s41467-018-08148-z

@article{5ad413a0178c4f8d9fb447fe75e41a54,
title = "Multi-platform discovery of haplotype-resolved structural variation in human genomes",
abstract = "The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.",
author = "Chaisson, {Mark J.P.} and Sanders, {Ashley D.} and Xuefang Zhao and Ankit Malhotra and David Porubsky and Tobias Rausch and Gardner, {Eugene J.} and Rodriguez, {Oscar L.} and Li Guo and Collins, {Ryan L.} and Xian Fan and Jia Wen and Handsaker, {Robert E.} and Susan Fairley and Kronenberg, {Zev N.} and Xiangmeng Kong and Fereydoun Hormozdiari and Dillon Lee and Wenger, {Aaron M.} and Hastie, {Alex R.} and Danny Antaki and Thomas Anantharaman and Audano, {Peter A.} and Harrison Brand and Stuart Cantsilieris and Han Cao and Eliza Cerveira and Chong Chen and Xintong Chen and Chin, {Chen Shan} and Zechen Chong and Chuang, {Nelson T.} and Lambert, {Christine C.} and Church, {Deanna M.} and Laura Clarke and Andrew Farrell and Joey Flores and Timur Galeev and Gorkin, {David U.} and Madhusudan Gujral and Victor Guryev and Heaton, {William Haynes} and Jonas Korlach and Sushant Kumar and Kwon, {Jee Young} and Lam, {Ernest T.} and Lee, {Jong Eun} and Joyce Lee and Lee, {Wan Ping} and Lee, {Sau Peng} and Shantao Li and Patrick Marks and Karine Viaud-Martinez and Sascha Meiers and Munson, {Katherine M.} and Navarro, {Fabio C.P.} and Nelson, {Bradley J.} and Conor Nodzak and Amina Noor and Sofia Kyriazopoulou-Panagiotopoulou and Pang, {Andy W.C.} and Yunjiang Qiu and Gabriel Rosanio and Mallory Ryan and Adrian St{\"u}tz and Spierings, {Diana C.J.} and Alistair Ward and Welch, {Anne Marie E.} and Ming Xiao and Wei Xu and Chengsheng Zhang and Qihui Zhu and Xiangqun Zheng-Bradley and Ernesto Lowy and Sergei Yakneen and Steven McCarroll and Goo Jun and Li Ding and Koh, {Chong Lek} and Bing Ren and Paul Flicek and Ken Chen and Gerstein, {Mark B.} and Kwok, {Pui Yan} and Lansdorp, {Peter M.} and Marth, {Gabor T.} and Jonathan Sebat and Xinghua Shi and Ali Bashir and Kai Ye and Devine, {Scott E.} and Talkowski, {Michael E.} and Mills, {Ryan E.} and Tobias Marschall and Korbel, {Jan O.} and Eichler, {Evan E.} and Charles Lee",
year = "2019",
month = "12",
day = "1",
doi = "10.1038/s41467-018-08148-z",
language = "English (US)",
volume = "10",
journal = "Nature Communications",
issn = "2041-1723",
publisher = "Nature Publishing Group",
number = "1",

}

TY - JOUR

T1 - Multi-platform discovery of haplotype-resolved structural variation in human genomes

AU - Chaisson, Mark J.P.

AU - Sanders, Ashley D.

AU - Zhao, Xuefang

AU - Malhotra, Ankit

AU - Porubsky, David

AU - Rausch, Tobias

AU - Gardner, Eugene J.

AU - Rodriguez, Oscar L.

AU - Guo, Li

AU - Collins, Ryan L.

AU - Fan, Xian

AU - Wen, Jia

AU - Handsaker, Robert E.

AU - Fairley, Susan

AU - Kronenberg, Zev N.

AU - Kong, Xiangmeng

AU - Hormozdiari, Fereydoun

AU - Lee, Dillon

AU - Wenger, Aaron M.

AU - Hastie, Alex R.

AU - Antaki, Danny

AU - Anantharaman, Thomas

AU - Audano, Peter A.

AU - Brand, Harrison

AU - Cantsilieris, Stuart

AU - Cao, Han

AU - Cerveira, Eliza

AU - Chen, Chong

AU - Chen, Xintong

AU - Chin, Chen Shan

AU - Chong, Zechen

AU - Chuang, Nelson T.

AU - Lambert, Christine C.

AU - Church, Deanna M.

AU - Clarke, Laura

AU - Farrell, Andrew

AU - Flores, Joey

AU - Galeev, Timur

AU - Gorkin, David U.

AU - Gujral, Madhusudan

AU - Guryev, Victor

AU - Heaton, William Haynes

AU - Korlach, Jonas

AU - Kumar, Sushant

AU - Kwon, Jee Young

AU - Lam, Ernest T.

AU - Lee, Jong Eun

AU - Lee, Joyce

AU - Lee, Wan Ping

AU - Lee, Sau Peng

AU - Li, Shantao

AU - Marks, Patrick

AU - Viaud-Martinez, Karine

AU - Meiers, Sascha

AU - Munson, Katherine M.

AU - Navarro, Fabio C.P.

AU - Nelson, Bradley J.

AU - Nodzak, Conor

AU - Noor, Amina

AU - Kyriazopoulou-Panagiotopoulou, Sofia

AU - Pang, Andy W.C.

AU - Qiu, Yunjiang

AU - Rosanio, Gabriel

AU - Ryan, Mallory

AU - Stütz, Adrian

AU - Spierings, Diana C.J.

AU - Ward, Alistair

AU - Welch, Anne Marie E.

AU - Xiao, Ming

AU - Xu, Wei

AU - Zhang, Chengsheng

AU - Zhu, Qihui

AU - Zheng-Bradley, Xiangqun

AU - Lowy, Ernesto

AU - Yakneen, Sergei

AU - McCarroll, Steven

AU - Jun, Goo

AU - Ding, Li

AU - Koh, Chong Lek

AU - Ren, Bing

AU - Flicek, Paul

AU - Chen, Ken

AU - Gerstein, Mark B.

AU - Kwok, Pui Yan

AU - Lansdorp, Peter M.

AU - Marth, Gabor T.

AU - Sebat, Jonathan

AU - Shi, Xinghua

AU - Bashir, Ali

AU - Ye, Kai

AU - Devine, Scott E.

AU - Talkowski, Michael E.

AU - Mills, Ryan E.

AU - Marschall, Tobias

AU - Korbel, Jan O.

AU - Eichler, Evan E.

AU - Lee, Charles

PY - 2019/12/1

Y1 - 2019/12/1

N2 - The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.

UR - http://www.scopus.com/inward/record.url?scp=85060084825&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85060084825&partnerID=8YFLogxK

U2 - 10.1038/s41467-018-08148-z

DO - 10.1038/s41467-018-08148-z

M3 - Article

C2 - 30992455

AN - SCOPUS:85060084825

VL - 10

JO - Nature Communications

JF - Nature Communications

SN - 2041-1723

IS - 1

M1 - 1784

ER -