ISOLATE

A computational strategy for identifying the primary origin of cancers using high-throughput sequencing

Gerald Quon, Quaid Morris

Research output: Contribution to journal, Article

31 Citations (Scopus)

Abstract

Motivation: One of the most deadly cancer diagnoses is the carcinoma of unknown primary origin. Without the knowledge of the site of origin, treatment regimens are limited in their specificity and result in high mortality rates. Though supervised classification methods have been developed to predict the site of origin based on gene expression data, they require large numbers of previously classified tumors for training, in part because they do not account for sample heterogeneity, which limits their application to well-studied cancers.

Results: We present ISOLATE, a new statistical method that simultaneously predicts the primary site of origin of cancers and addresses sample heterogeneity, while taking advantage of new high-throughput sequencing technology that promises to bring higher accuracy and reproducibility to gene expression profiling experiments. ISOLATE makes predictions de novo, without having seen any training expression profiles of cancers with identified origin. Compared with previous methods, ISOLATE is able to predict the primary site of origin, de-convolve and remove the effect of sample heterogeneity and identify differentially expressed genes with higher accuracy, across both synthetic and clinical datasets. Methods such as ISOLATE are invaluable tools for clinicians faced with carcinomas of unknown primary origin.

Original language: English (US)
Pages (from-to): 2882-2889
Number of pages: 8
Journal: Bioinformatics
Volume: 25
Issue number: 21
DOIs: 10.1093/bioinformatics/btp378
State: Published - Nov 9 2009
Externally published: Yes

Fingerprint

Gene expression
Sequencing
High Throughput
Cancer
Throughput
Predict
Tumors
Neoplasms
Statistical methods
High Accuracy
Genes
Unknown
Supervised Classification
Mortality Rate
Reproducibility
Carcinoma
Gene Expression Data
Profiling
Statistical method
Gene Expression

ASJC Scopus subject areas

  • Statistics and Probability
  • Medicine (all)
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

ISOLATE: A computational strategy for identifying the primary origin of cancers using high-throughput sequencing. / Quon, Gerald; Morris, Quaid.

In: Bioinformatics, Vol. 25, No. 21, 09.11.2009, p. 2882-2889.

@article{fef437f30ab34a25a338faea175d4da8,
title = "ISOLATE: A computational strategy for identifying the primary origin of cancers using high-throughput sequencing",
abstract = "Motivation: One of the most deadly cancer diagnoses is the carcinoma of unknown primary origin. Without the knowledge of the site of origin, treatment regimens are limited in their specificity and result in high mortality rates. Though supervised classification methods have been developed to predict the site of origin based on gene expression data, they require large numbers of previously classified tumors for training, in part because they do not account for sample heterogeneity, which limits their application to well-studied cancers. Results: We present ISOLATE, a new statistical method that simultaneously predicts the primary site of origin of cancers and addresses sample heterogeneity, while taking advantage of new high-throughput sequencing technology that promises to bring higher accuracy and reproducibility to gene expression profiling experiments. ISOLATE makes predictions de novo, without having seen any training expression profiles of cancers with identified origin. Compared with previous methods, ISOLATE is able to predict the primary site of origin, de-convolve and remove the effect of sample heterogeneity and identify differentially expressed genes with higher accuracy, across both synthetic and clinical datasets. Methods such as ISOLATE are invaluable tools for clinicians faced with carcinomas of unknown primary origin.",
author = "Gerald Quon and Quaid Morris",
year = "2009",
month = "11",
day = "9",
doi = "10.1093/bioinformatics/btp378",
language = "English (US)",
volume = "25",
pages = "2882--2889",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "21"
}

TY - JOUR
T1 - ISOLATE
T2 - A computational strategy for identifying the primary origin of cancers using high-throughput sequencing
AU - Quon, Gerald
AU - Morris, Quaid
PY - 2009/11/9
Y1 - 2009/11/9
AB - Motivation: One of the most deadly cancer diagnoses is the carcinoma of unknown primary origin. Without the knowledge of the site of origin, treatment regimens are limited in their specificity and result in high mortality rates. Though supervised classification methods have been developed to predict the site of origin based on gene expression data, they require large numbers of previously classified tumors for training, in part because they do not account for sample heterogeneity, which limits their application to well-studied cancers. Results: We present ISOLATE, a new statistical method that simultaneously predicts the primary site of origin of cancers and addresses sample heterogeneity, while taking advantage of new high-throughput sequencing technology that promises to bring higher accuracy and reproducibility to gene expression profiling experiments. ISOLATE makes predictions de novo, without having seen any training expression profiles of cancers with identified origin. Compared with previous methods, ISOLATE is able to predict the primary site of origin, de-convolve and remove the effect of sample heterogeneity and identify differentially expressed genes with higher accuracy, across both synthetic and clinical datasets. Methods such as ISOLATE are invaluable tools for clinicians faced with carcinomas of unknown primary origin.
UR - http://www.scopus.com/inward/record.url?scp=70350697843&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70350697843&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btp378
DO - 10.1093/bioinformatics/btp378
M3 - Article
VL - 25
SP - 2882
EP - 2889
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 21
ER -