Partial least squares dimension reduction for microarray gene expression data with a censored response

Danh V. Nguyen

Research output: Contribution to journal › Article › peer-review

15 Scopus citations


An important application of DNA microarray technologies involves monitoring the global state of the transcriptional program in tumor cells. One goal in cancer microarray studies is to compare clinical outcomes, such as relapse-free or overall survival, for subgroups of patients defined by global gene expression patterns. A method of comparing patient survival as a function of gene expression was recently proposed in [Bioinformatics 18 (2002) 1625] by Nguyen and Rocke. Because of (a) the high dimensionality of microarray gene expression data and (b) censored survival times, a two-stage procedure was proposed to relate survival times to gene expression profiles. The first stage involves dimensionality reduction of the gene expression data by partial least squares (PLS), and the second stage involves prediction of survival probability using proportional hazard (PH) regression. In this paper, we provide a systematic assessment of the performance of this two-stage procedure. PLS dimension reduction involves complex non-linear functions of both the predictors and the response data, rendering exact analytical study intractable. Thus, we assess the methodology under a simulation model for gene expression data with a censored response variable. In particular, we compare the performance of PLS dimension reduction with dimension reduction via principal components analysis (PCA) and with a modified PLS (MPLS) approach. PLS performed substantially better than dimension reduction via PCA when the total predictor variance explained is low to moderate (e.g., 40%-60%). It performed similarly to MPLS, and slightly better in some cases. Additionally, we examine the effect of censoring on the dimension reduction stage. The performance of all methods deteriorates at a high censoring rate, although PLS followed by PH regression (PLS-PH) performed best overall.
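The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the simulated data, the use of the observed (possibly censored) log survival time as the PLS response, the choice of two components, and the helper names `pls_components` and `cox_fit` are all hypothetical. Stage one extracts PLS scores with a NIPALS-style PLS1 loop. Stage two fits a proportional hazards model on those scores by damped Newton steps on the partial likelihood, assuming no tied event times.

```python
import numpy as np

def pls_components(X, y, k):
    """Stage 1 (sketch): k PLS1 score vectors, i.e. linear combinations of the
    columns of X chosen for high covariance with the response y."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    scores = []
    for _ in range(k):
        w = X.T @ y                      # weights proportional to covariance with y
        w /= np.linalg.norm(w)
        t = X @ w                        # score vector (one value per sample)
        p = X.T @ t / (t @ t)            # loadings
        X = X - np.outer(t, p)           # deflate X
        y = y - t * (t @ y) / (t @ t)    # deflate y
        scores.append(t)
    return np.column_stack(scores)

def cox_fit(Z, time, event, n_iter=50):
    """Stage 2 (sketch): maximize the Cox partial likelihood by damped Newton
    steps. Assumes continuous times with no ties."""
    order = np.argsort(-time)            # descending time: cumsum gives risk-set sums
    Z, d = Z[order], event[order]
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        r = np.exp(Z @ beta)
        s0 = np.cumsum(r)
        s1 = np.cumsum(r[:, None] * Z, axis=0)
        s2 = np.cumsum(r[:, None, None] * Z[:, :, None] * Z[:, None, :], axis=0)
        mean = s1 / s0[:, None]
        grad = (d[:, None] * (Z - mean)).sum(axis=0)
        info = (d[:, None, None] * (s2 / s0[:, None, None]
                - mean[:, :, None] * mean[:, None, :])).sum(axis=0)
        step = np.linalg.solve(info, grad)
        beta = beta + step / max(1.0, np.linalg.norm(step))  # cap step length at 1
    return beta

# Hypothetical simulated data: n samples, p genes, 5 genes drive the hazard.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.normal(size=(n, p))
eta = 0.5 * X[:, :5].sum(axis=1)                 # assumed true log hazard ratio
true_time = rng.exponential(scale=np.exp(-eta))  # hazard = exp(eta)
censor_time = rng.exponential(scale=2.0, size=n)
time = np.minimum(true_time, censor_time)
event = (true_time <= censor_time).astype(float)  # 1 = event observed, 0 = censored

# Stage 1: reduce p genes to 2 PLS components (response: log observed time).
T = pls_components(X, np.log(time), 2)
T = T / T.std(axis=0)                             # put scores on a common scale

# Stage 2: proportional hazards regression on the PLS components.
beta = cox_fit(T, time, event)
print("PH coefficients on PLS components:", beta)
```

Replacing `pls_components` with scores from a principal components decomposition of `X` would give the PCA-based comparison mentioned in the abstract; the key difference is that PCA ignores the response when forming components.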

Original language: English (US)
Pages (from-to): 119-137
Number of pages: 19
Journal: Mathematical Biosciences
Issue number: 1
State: Published - Jan 2005


  • Dimension reduction
  • DNA Microarray
  • Gene expression
  • Partial least squares
  • Principal components
  • Proportional hazard regression

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Ecology, Evolution, Behavior and Systematics

