Unsupervised detection of genes of influence in lung cancer using biological networks

Anna Goldenberg, Sara Mostafavi, Gerald Quon, Paul C. Boutros, Quaid D. Morris

Research output: Contribution to journal, Article

8 Citations (Scopus)

Abstract

Motivation: Lung cancer is often discovered long after its onset, which makes it challenging to identify the genes important in its initiation and progression. By the time the tumors are discovered, we observe only the final sum of changes in the few genes that initiated the cancer and in the thousands of genes that they have influenced. Gene interactions and heterogeneity of samples make it difficult to identify genes that are consistent between different cohorts. Using gene and gene-product interaction networks, we propose a principled approach to identify a small subset of genes whose network neighbors exhibit consistently high expression change (in cancerous tissue versus normal) regardless of their own expression. We hypothesize that these genes can shed light on the larger scale perturbations in the overall landscape of expression levels.

Results: We benchmark our method on simulated data, and show that we can recover a true gene list in noisy measurement data. We then apply our method to four non-small cell lung cancer and two pancreatic cancer cohorts, finding several genes that are consistent within all cohorts of the same cancer type.

Conclusion: Our model is flexible and robust, and it identifies gene sets that are more consistent across cohorts than those found by several other approaches. Additionally, our method can be applied on a per-patient basis, so it does not require large cohorts of patients to find genes of influence. Our approach is generally applicable to gene expression studies where the goal is to identify a small set of influential genes that may in turn explain the much larger set of genome-wide expression changes.
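The central idea in the abstract, ranking a gene by the expression change of its network neighbors rather than by its own change, can be illustrated with a small sketch. This is not the authors' model, which the abstract does not specify. It is a deliberately naive stand-in that assumes a single cohort, an undirected network stored as an adjacency dictionary, and one log fold change per gene. The function names, gene labels, and numbers are all hypothetical.

```python
from statistics import mean


def neighbor_influence_scores(network, log_fold_change):
    """Score each gene by the mean absolute expression change of its neighbors.

    The gene's own expression change is deliberately ignored, mirroring the
    "regardless of their own expression" idea. This is an illustrative
    heuristic, not the model from the paper.
    """
    scores = {}
    for gene, neighbors in network.items():
        changes = [abs(log_fold_change[n]) for n in neighbors if n in log_fold_change]
        if changes:  # skip genes with no measured neighbors
            scores[gene] = mean(changes)
    return scores


def top_genes(scores, k):
    """Return the k genes with the highest neighbor-based score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy, made-up data: G1 barely changes itself, but its neighbors change a lot.
network = {
    "G1": ["G2", "G3", "G4"],
    "G2": ["G1"],
    "G3": ["G1"],
    "G4": ["G1", "G5"],
    "G5": ["G4"],
}
log_fold_change = {"G1": 0.1, "G2": 2.0, "G3": -1.8, "G4": 1.5, "G5": 0.0}

scores = neighbor_influence_scores(network, log_fold_change)
print(top_genes(scores, 1))
```

In this toy network G1 ranks first even though its own change is small, because the score depends only on its neighbors. A fuller treatment would also need to handle consistency across cohorts and noise in the measurements, which this sketch does not attempt.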

Original language: English (US)
Article number: btr533
Pages (from-to): 3166-3172
Number of pages: 7
Journal: Bioinformatics
Volume: 27
Issue number: 22
DOIs: https://doi.org/10.1093/bioinformatics/btr533
State: Published - Nov 1 2011
Externally published: Yes

Fingerprint

Lung Cancer
Biological Networks
Lung Neoplasms
Genes
Gene
Cancer
Influence
Benchmarking
Gene Regulatory Networks
Neoplasm Genes
Gene Networks
Pancreatic Neoplasms
Non-Small Cell Lung Carcinoma
Progression
Interaction
Large Set
Neoplasms
Gene Expression
Tumor

ASJC Scopus subject areas

  • Statistics and Probability
  • Medicine(all)
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Goldenberg, A., Mostafavi, S., Quon, G., Boutros, P. C., & Morris, Q. D. (2011). Unsupervised detection of genes of influence in lung cancer using biological networks. Bioinformatics, 27(22), 3166-3172. [btr533]. https://doi.org/10.1093/bioinformatics/btr533

@article{157e27b4e8634fdaafb17abb03122c99,
title = "Unsupervised detection of genes of influence in lung cancer using biological networks",
author = "Anna Goldenberg and Sara Mostafavi and Gerald Quon and Boutros, {Paul C.} and Morris, {Quaid D.}",
year = "2011",
month = "11",
day = "1",
doi = "10.1093/bioinformatics/btr533",
language = "English (US)",
volume = "27",
pages = "3166--3172",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "22",

}
