Kevlar: A Mapping-Free Framework for Accurate Discovery of De Novo Variants

Research output: Contribution to journal › Article

Abstract

De novo genetic variants are an important source of causative variation in complex genetic disorders. Many methods for variant discovery rely on mapping reads to a reference genome, detecting numerous inherited variants irrelevant to the phenotype of interest. To distinguish between inherited and de novo variation, sequencing of families (parents and siblings) is commonly pursued. However, standard mapping-based approaches tend to have a high false-discovery rate for de novo variant prediction. Kevlar is a mapping-free method for de novo variant discovery, based on direct comparison of sequences between related individuals. Kevlar identifies high-abundance k-mers unique to the individual of interest. Reads containing these k-mers are partitioned into disjoint sets by shared k-mer content for variant calling, and preliminary variant predictions are sorted using a probabilistic score. We evaluated Kevlar on simulated and real datasets, demonstrating its ability to detect both de novo single-nucleotide variants and indels with high accuracy.
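The two steps the abstract names, finding abundant k-mers unique to the individual of interest and partitioning reads by shared k-mer content, can be illustrated with a toy sketch. This is not Kevlar's implementation: the function names, the `min_abund` threshold, and the use of exact in-memory counting are assumptions made for illustration, reverse-complement handling is ignored, and the variant-calling and probabilistic scoring steps are not covered.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every k-mer (substring of length k) in a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k):
    """Exact k-mer abundances for a list of reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def novel_kmers(proband_reads, parent_read_sets, k, min_abund=2):
    """k-mers abundant in the proband and absent from every parent.

    min_abund is a hypothetical threshold meant to filter out k-mers
    that arise from sequencing errors.
    """
    proband = count_kmers(proband_reads, k)
    parents = [count_kmers(reads, k) for reads in parent_read_sets]
    return {
        kmer for kmer, n in proband.items()
        if n >= min_abund and all(kmer not in p for p in parents)
    }


def partition_reads(reads, novel, k):
    """Group reads into disjoint sets linked by shared novel k-mers."""
    parent = {}  # union-find forest over read indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_seen = {}  # novel k-mer -> index of first read containing it
    for idx, read in enumerate(reads):
        hits = [km for km in kmers(read, k) if km in novel]
        if not hits:
            continue  # read carries no novel k-mer; ignore it
        parent[idx] = idx
        for km in hits:
            if km in first_seen:
                parent[find(idx)] = find(first_seen[km])
            else:
                first_seen[km] = idx

    groups = defaultdict(list)
    for idx in parent:
        groups[find(idx)].append(reads[idx])
    return sorted(groups.values())
```

For example, with `k=5`, parents that both carry only the read `ACGTACGTAC`, and a child with two copies of `ACGTTCGTAC` plus one inherited read, the five k-mers spanning the substituted base are reported as novel and the two variant-bearing reads end up in a single partition. Exact counting does not scale to whole genomes; a real tool would need a memory-efficient counting structure.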

Original language: English (US)
Pages (from-to): 28-36
Number of pages: 9
Journal: iScience
Volume: 18
DOI: 10.1016/j.isci.2019.07.032
State: Published - Aug 30 2019

Fingerprint

Inborn Genetic Diseases
Biological Science Disciplines
Computational Biology
Nucleotides
Genome
Phenotype
Datasets

Keywords

  • Bioinformatics
  • Biological Sciences
  • Genetics

ASJC Scopus subject areas

  • General

Cite this

Kevlar: A Mapping-Free Framework for Accurate Discovery of De Novo Variants. / Standage, Daniel S.; Brown, Charles; Hormozdiari, Fereydoun.

In: iScience, Vol. 18, 30.08.2019, p. 28-36.

@article{3479574d386246509ffee675f2d73411,
title = "Kevlar: A Mapping-Free Framework for Accurate Discovery of De Novo Variants",
abstract = "De novo genetic variants are an important source of causative variation in complex genetic disorders. Many methods for variant discovery rely on mapping reads to a reference genome, detecting numerous inherited variants irrelevant to the phenotype of interest. To distinguish between inherited and de novo variation, sequencing of families (parents and siblings) is commonly pursued. However, standard mapping-based approaches tend to have a high false-discovery rate for de novo variant prediction. Kevlar is a mapping-free method for de novo variant discovery, based on direct comparison of sequences between related individuals. Kevlar identifies high-abundance k-mers unique to the individual of interest. Reads containing these k-mers are partitioned into disjoint sets by shared k-mer content for variant calling, and preliminary variant predictions are sorted using a probabilistic score. We evaluated Kevlar on simulated and real datasets, demonstrating its ability to detect both de novo single-nucleotide variants and indels with high accuracy.",
keywords = "Bioinformatics, Biological Sciences, Genetics",
author = "Standage, {Daniel S.} and Charles Brown and Fereydoun Hormozdiari",
year = "2019",
month = "8",
day = "30",
doi = "10.1016/j.isci.2019.07.032",
language = "English (US)",
volume = "18",
pages = "28--36",
journal = "iScience",
issn = "2589-0042",
publisher = "Elsevier Inc.",

}

TY - JOUR

T1 - Kevlar

T2 - A Mapping-Free Framework for Accurate Discovery of De Novo Variants

AU - Standage, Daniel S.

AU - Brown, Charles

AU - Hormozdiari, Fereydoun

PY - 2019/8/30

Y1 - 2019/8/30

AB - De novo genetic variants are an important source of causative variation in complex genetic disorders. Many methods for variant discovery rely on mapping reads to a reference genome, detecting numerous inherited variants irrelevant to the phenotype of interest. To distinguish between inherited and de novo variation, sequencing of families (parents and siblings) is commonly pursued. However, standard mapping-based approaches tend to have a high false-discovery rate for de novo variant prediction. Kevlar is a mapping-free method for de novo variant discovery, based on direct comparison of sequences between related individuals. Kevlar identifies high-abundance k-mers unique to the individual of interest. Reads containing these k-mers are partitioned into disjoint sets by shared k-mer content for variant calling, and preliminary variant predictions are sorted using a probabilistic score. We evaluated Kevlar on simulated and real datasets, demonstrating its ability to detect both de novo single-nucleotide variants and indels with high accuracy.

KW - Bioinformatics

KW - Biological Sciences

KW - Genetics

UR - http://www.scopus.com/inward/record.url?scp=85069966566&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85069966566&partnerID=8YFLogxK

U2 - 10.1016/j.isci.2019.07.032

DO - 10.1016/j.isci.2019.07.032

M3 - Article

AN - SCOPUS:85069966566

VL - 18

SP - 28

EP - 36

JO - iScience

JF - iScience

SN - 2589-0042

ER -