A cancer cell-line titration series for evaluating somatic classification

Robert E. Denroche, Laura Mullen, Lee Timms, Timothy Beck, Christina K. Yung, Lincoln Stein, John Douglas McPherson, Andrew M.K. Brown

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

Background: Accurate detection of somatic single nucleotide variants and small insertions and deletions from DNA sequencing experiments of tumour-normal pairs is a challenging task. Tumour samples are often contaminated with normal cells, confounding the available evidence for the somatic variants. Furthermore, tumours are heterogeneous, so sub-clonal variants are observed at reduced allele frequencies. We present here a cell-line titration series dataset that can be used to evaluate somatic variant calling pipelines with the goal of reliably calling true somatic mutations at low allele frequencies.

Results: Cell-line DNA was mixed with matched normal DNA at 8 different ratios to generate samples with known tumour cellularities, and exome sequenced on Illumina HiSeq to depths of >300×. The data was processed with several different variant calling pipelines, and verification experiments were performed to assay >1500 somatic variant candidates using Ion Torrent PGM as an orthogonal technology. By examining the variants called at varying cellularities and depths of coverage, we show that the best performing pipelines are able to maintain a high level of precision at any cellularity. In addition, we estimate the number of true somatic variants undetected as cellularity and coverage decrease.

Conclusions: Our cell-line titration series dataset, along with the associated verification results, was effective for this evaluation and will serve as a valuable dataset for future somatic calling algorithm development. The data is available for further analysis at the European Genome-phenome Archive under accession number EGAS00001001016. Data access requires registration through the International Cancer Genome Consortium's Data Access Compliance Office (ICGC DACO).
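The link between tumour cellularity, allele frequency, and sequencing depth described in the abstract can be illustrated with a short calculation. The Python sketch below is not the authors' method: it assumes a clonal heterozygous variant in a diploid region, treats read sampling as binomial with no sequencing error, and uses hypothetical function names, cellularity values, and read thresholds chosen purely for illustration.

```python
from math import comb


def expected_vaf(cellularity, variant_copies=1, tumour_ploidy=2, normal_ploidy=2):
    """Expected variant allele frequency of a clonal somatic variant.

    Assumption (not from the paper): the variant is present on
    `variant_copies` of the `tumour_ploidy` copies in every tumour cell
    and absent from all normal cells.
    """
    if not 0.0 <= cellularity <= 1.0:
        raise ValueError("cellularity must be between 0 and 1")
    tumour_alleles = cellularity * tumour_ploidy
    normal_alleles = (1.0 - cellularity) * normal_ploidy
    return cellularity * variant_copies / (tumour_alleles + normal_alleles)


def prob_at_least_k_reads(depth, vaf, min_alt_reads):
    """P(X >= min_alt_reads) for X ~ Binomial(depth, vaf).

    Assumption: each read samples the variant allele independently with
    probability `vaf`; sequencing and mapping errors are ignored.
    """
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1.0 - vaf) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Hypothetical cellularities and depths, chosen only for illustration;
    # the abstract does not list the eight mixing ratios.
    for cellularity in (0.8, 0.4, 0.1, 0.05):
        vaf = expected_vaf(cellularity)
        for depth in (50, 300):
            p = prob_at_least_k_reads(depth, vaf, min_alt_reads=5)
            print(f"cellularity={cellularity:.2f} depth={depth} "
                  f"VAF={vaf:.3f} P(>=5 alt reads)={p:.3f}")
```

Under these assumptions, halving the cellularity halves the expected allele frequency, so the chance of collecting enough supporting reads at a fixed depth falls quickly at low cellularity. This is consistent with the abstract's observation that true somatic variants go undetected as cellularity and coverage decrease.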

Original language: English (US)
Article number: 1803
Journal: BMC Research Notes
Volume: 8
Issue number: 1
DOI: https://doi.org/10.1186/s13104-015-1803-7
State: Published - Dec 26 2015
Externally published: Yes

Fingerprint

Titration
Tumors
Cells
Cell Line
Pipelines
DNA
Neoplasms
Gene Frequency
Genes
Genome
Exome
Assays
DNA Sequence Analysis
Nucleotides
Experiments
Compliance
Ions
Technology
Mutation
Datasets

Keywords

  • Cancer bioinformatics
  • Normal contamination
  • Somatic mutation calling
  • Tumour cellularity
  • Whole exome sequencing dataset

ASJC Scopus subject areas

  • Medicine (all)
  • Biochemistry, Genetics and Molecular Biology (all)

Cite this

Denroche, R. E., Mullen, L., Timms, L., Beck, T., Yung, C. K., Stein, L., ... Brown, A. M. K. (2015). A cancer cell-line titration series for evaluating somatic classification. BMC Research Notes, 8(1), [1803]. https://doi.org/10.1186/s13104-015-1803-7

@article{6d1a227aecd34c15a01e1bf5754e92ea,
title = "A cancer cell-line titration series for evaluating somatic classification",
keywords = "Cancer bioinformatics, Normal contamination, Somatic mutation calling, Tumour cellularity, Whole exome sequencing dataset",
author = "Denroche, {Robert E.} and Laura Mullen and Lee Timms and Timothy Beck and Yung, {Christina K.} and Lincoln Stein and McPherson, {John Douglas} and Brown, {Andrew M.K.}",
year = "2015",
month = "12",
day = "26",
doi = "10.1186/s13104-015-1803-7",
language = "English (US)",
volume = "8",
journal = "BMC Research Notes",
issn = "1756-0500",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - A cancer cell-line titration series for evaluating somatic classification

AU - Denroche, Robert E.

AU - Mullen, Laura

AU - Timms, Lee

AU - Beck, Timothy

AU - Yung, Christina K.

AU - Stein, Lincoln

AU - McPherson, John Douglas

AU - Brown, Andrew M.K.

PY - 2015/12/26

Y1 - 2015/12/26

KW - Cancer bioinformatics

KW - Normal contamination

KW - Somatic mutation calling

KW - Tumour cellularity

KW - Whole exome sequencing dataset

UR - http://www.scopus.com/inward/record.url?scp=84955668238&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84955668238&partnerID=8YFLogxK

U2 - 10.1186/s13104-015-1803-7

DO - 10.1186/s13104-015-1803-7

M3 - Article

C2 - 26708082

AN - SCOPUS:84955668238

VL - 8

JO - BMC Research Notes

JF - BMC Research Notes

SN - 1756-0500

IS - 1

M1 - 1803

ER -