Haplotype-phased synthetic long reads from short-read sequencing

James A. Stapleton, Jeongwoon Kim, John P. Hamilton, Ming Wu, Luiz C. Irber, Rohan Maddamsetti, Bryan Briney, Linsey Newton, Dennis R. Burton, Charles Brown, Christina Chan, C. Robin Buell, Timothy A. Whitehead

Research output: Contribution to journal (Article)

11 Scopus citations

Abstract

Next-generation DNA sequencing has revolutionized the study of biology. However, the short read lengths of the dominant instruments complicate assembly of complex genomes and haplotype phasing of mixtures of similar sequences. Here we demonstrate a method to reconstruct the sequences of individual nucleic acid molecules up to 11.6 kilobases in length from short (150-bp) reads. We show that our method can construct 99.97%-accurate synthetic reads from bacterial, plant, and animal genomic samples, full-length mRNA sequences from human cancer cell lines, and individual HIV env gene variants from a mixture. The preparation of multiple samples can be multiplexed into a single tube, further reducing effort and cost relative to competing approaches. Our approach generates sequencing libraries in three days from less than one microgram of DNA in a single-tube format without custom equipment or specialized expertise.

Original language: English (US)
Article number: e0147229
Journal: PLoS One
Volume: 11
Issue number: 1
DOI: https://doi.org/10.1371/journal.pone.0147229
State: Published - Jan 1 2016

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Agricultural and Biological Sciences (all)

Cite this

Stapleton, J. A., Kim, J., Hamilton, J. P., Wu, M., Irber, L. C., Maddamsetti, R., Briney, B., Newton, L., Burton, D. R., Brown, C., Chan, C., Buell, C. R., & Whitehead, T. A. (2016). Haplotype-phased synthetic long reads from short-read sequencing. PLoS One, 11(1), [e0147229]. https://doi.org/10.1371/journal.pone.0147229