A new method for DNA sequencing error verification and correction via an on-disk index tree

Yarong Gu, Xianying Liu, Qiang Zhu, Youchao Dong, Charles Brown, Sakti Pramanik

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

Existing sequencing error correction techniques demand large amounts of expensive memory. In this work, we introduce a new disk-based sequencing error correction method to address this problem. The key idea is to use a special on-disk index structure, called the BoND-tree, to store and access a large set of k-mers and their associated metadata on disk. With the BoND-tree, a set of special box queries that retrieve the relevant k-mers and their counts is processed efficiently. A comprehensive voting mechanism is adopted to identify and correct an erroneous base in a genome sequence. Experiments demonstrate that the proposed method is promising in terms of accuracy and scalability for verifying and correcting sequencing errors. Copyright is held by the author/owner(s).
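The abstract does not give algorithmic details, so the following is only a minimal in-memory sketch of the general idea (k-mer counts, a box query that leaves one position free, and per-window voting), not the authors' implementation. A plain `Counter` with a linear scan stands in for the on-disk BoND-tree, and the function names (`count_kmers`, `box_query`, `correct_read`), the one-free-position box shape, and the strict-majority voting rule are all assumptions made for illustration.

```python
from collections import Counter

BASES = "ACGT"


def count_kmers(reads, k):
    """Count every k-mer in the reads (in-memory stand-in for the on-disk index)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def box_query(counts, box):
    """Return {k-mer: count} for stored k-mers whose base at each position
    belongs to the allowed set for that position.

    `box` is a list of sets, one per position. A real index would prune the
    search; this hypothetical version simply scans all stored k-mers.
    """
    return {
        kmer: c
        for kmer, c in counts.items()
        if len(kmer) == len(box) and all(b in allowed for b, allowed in zip(kmer, box))
    }


def correct_read(read, counts, k):
    """Assumed voting rule: each k-mer window covering a position votes for the
    base with the highest count at that position; the base is replaced only if
    a different base wins a strict majority of the windows."""
    bases = list(read)
    for pos in range(len(read)):
        current = read[pos]
        votes = Counter()
        windows = 0
        first = max(0, pos - k + 1)
        last = min(pos, len(read) - k)
        for start in range(first, last + 1):
            window = read[start:start + k]
            offset = pos - start
            # Fix every position to the observed base except the one under test.
            box = [{b} for b in window]
            box[offset] = set(BASES)
            hits = Counter()
            for kmer, c in box_query(counts, box).items():
                hits[kmer[offset]] += c
            if not hits:
                continue
            windows += 1
            # Ties are broken in favour of the base already in the read.
            best = max(hits, key=lambda b: (hits[b], b == current))
            votes[best] += 1
        if windows:
            winner, n = votes.most_common(1)[0]
            if winner != current and 2 * n > windows:
                bases[pos] = winner
    return "".join(bases)
```

As a usage illustration under these assumptions: with five copies of the read `ACGTACGT` and one copy `ACGTTCGT` (a single substituted base) and `k = 4`, `correct_read("ACGTTCGT", counts, 4)` returns `ACGTACGT`, while the error-free read is left unchanged.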

Original language: English (US)
Title of host publication: BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher: Association for Computing Machinery, Inc
Pages: 503-504
Number of pages: 2
ISBN (Electronic): 9781450338530
DOIs: https://doi.org/10.1145/2808719.2811429
State: Published - Sep 9 2015
Externally published: Yes
Event: 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2015 - Atlanta, United States
Duration: Sep 9 2015 to Sep 12 2015

Other

Other: 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2015
Country: United States
City: Atlanta
Period: 9/9/15 to 9/12/15

Keywords

  • Bioinformatics
  • Disk index tree
  • Sequencing error correction

ASJC Scopus subject areas

  • Software
  • Health Informatics
  • Computer Science Applications
  • Biomedical Engineering

Cite this

    Gu, Y., Liu, X., Zhu, Q., Dong, Y., Brown, C., & Pramanik, S. (2015). A new method for DNA sequencing error verification and correction via an on-disk index tree. In BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 503-504). Association for Computing Machinery, Inc. https://doi.org/10.1145/2808719.2811429