Next-generation VariationHunter: Combinatorial algorithms for transposon insertion discovery

Fereydoun Hormozdiari, Iman Hajirasouliha, Phuong Dao, Faraz Hach, Deniz Yorukoglu, Can Alkan, Evan E. Eichler, S. Cenk Sahinalp

Research output: Contribution to journal (Article)

141 Citations (Scopus)

Abstract

Recent years have witnessed an increase in research activity for the detection of structural variants (SVs) and their association with human disease. The advent of next-generation sequencing technologies makes it possible to extend the scope of structural variation studies to a point previously unimaginable, as exemplified by the 1000 Genomes Project. Although various computational methods have been described for the detection of SVs, no such algorithm is yet fully capable of discovering transposon insertions, a class of SVs that is very important to the study of human evolution and disease. In this article, we provide a complete and novel formulation to discover both the loci and the classes of transposons inserted into genomes sequenced with high-throughput sequencing technologies. In addition, we present 'conflict resolution' improvements to our earlier combinatorial SV detection algorithm (VariationHunter) by taking the diploid nature of the human genome into consideration. We test our algorithms with simulated data from the Venter genome (HuRef) and are able to discover >85% of transposon insertion events with precision of >90%. We also demonstrate that our conflict resolution algorithm (denoted VariationHunter-CR) outperforms current state-of-the-art algorithms (such as the original VariationHunter, BreakDancer and MoDIL) when tested on the genome of the Yoruba African individual (NA18507).

Availability: The implementation of the algorithm is available at http://compbio.cs.sfu.ca/strvar.htm.
Contact: eee@gs.washington.edu; cenk@cs.sfu.ca.
Supplementary information: Supplementary data are available at Bioinformatics online.

Original language: English (US)
Article number: btq216
Journal: Bioinformatics
Volume: 26
Issue number: 12
DOI: https://doi.org/10.1093/bioinformatics/btq216
State: Published - Jun 1 2010
Externally published: Yes

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Hormozdiari, F., Hajirasouliha, I., Dao, P., Hach, F., Yorukoglu, D., Alkan, C., ... Sahinalp, S. C. (2010). Next-generation VariationHunter: Combinatorial algorithms for transposon insertion discovery. Bioinformatics, 26(12), [btq216]. https://doi.org/10.1093/bioinformatics/btq216

@article{937d5c84a4364af08188ba9ed32048c0,
title = "Next-generation VariationHunter: Combinatorial algorithms for transposon insertion discovery",
author = "Fereydoun Hormozdiari and Iman Hajirasouliha and Phuong Dao and Faraz Hach and Deniz Yorukoglu and Can Alkan and Eichler, {Evan E.} and Sahinalp, {S. Cenk}",
year = "2010",
month = "6",
day = "1",
doi = "10.1093/bioinformatics/btq216",
language = "English (US)",
volume = "26",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "12",

}

TY - JOUR

T1 - Next-generation VariationHunter

T2 - Combinatorial algorithms for transposon insertion discovery

AU - Hormozdiari, Fereydoun

AU - Hajirasouliha, Iman

AU - Dao, Phuong

AU - Hach, Faraz

AU - Yorukoglu, Deniz

AU - Alkan, Can

AU - Eichler, Evan E.

AU - Sahinalp, S. Cenk

PY - 2010/6/1

Y1 - 2010/6/1

UR - http://www.scopus.com/inward/record.url?scp=77954205450&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77954205450&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btq216

DO - 10.1093/bioinformatics/btq216

M3 - Article

C2 - 20529927

AN - SCOPUS:77954205450

VL - 26

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 12

M1 - btq216

ER -