On partial least squares dimension reduction for microarray-based classification: A simulation study

Danh V. Nguyen, David M Rocke

Research output: Contribution to journal, Article, peer-review

55 Scopus citations

Abstract

In microarray tumor tissue classification studies, the expressions of thousands of genes (variables) are simultaneously measured across a few tissue samples. Standard statistical methodologies in classification do not work well when the dimension, p, is greater than the sample size, N. One approach to classification problems, when p≫N, is to first apply a dimension reduction method and then perform the classification in the reduced space. In this paper, we study dimension reduction for classification in high dimension based on partial least squares (PLS) and principal components analysis (PCA). In addition, we propose and explore two hybrid-PLS methods for dimension reduction. PLS components are linear combinations of the original predictors, but the weights are nonlinear functions of both the predictors and response variable. This makes it difficult to study the PLS classification methodologies analytically, so, in this paper, we turn to a numerical study using simulation.
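The two-step idea in the abstract (reduce the dimension, then classify in the reduced space) can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration, not the authors' simulation design: the toy data generator, the function names, the use of only the first PLS component, and the midpoint rule that stands in for the classifiers studied in the paper. For centered data the first PLS weight vector is proportional to X^T y, so the component is a linear combination of the genes whose weights depend on both the predictors and the response.

```python
import math
import random


def simulate(n_per_class=10, p=200, n_informative=10, shift=2.0, seed=0):
    """Toy p >> N data (hypothetical): two classes, only the first
    n_informative genes have a mean shift in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(p)]
            if label == 1:
                for j in range(n_informative):
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


def center_columns(X):
    """Subtract each column (gene) mean."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def first_pls_component(X, y):
    """Return the unit-norm weight vector w and the scores t = Xc w
    of the first PLS component for a single response y."""
    Xc = center_columns(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    p = len(Xc[0])
    # Weights are proportional to Xc^T yc, so they depend on X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(len(Xc))) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Each score is a linear combination of one sample's p gene values.
    t = [sum(row[j] * w[j] for j in range(p)) for row in Xc]
    return w, t


def classify_by_score(t, y):
    """Stand-in classifier in the reduced 1-D space: cut at the midpoint
    between the two class mean scores."""
    m0 = sum(s for s, c in zip(t, y) if c == 0) / y.count(0)
    m1 = sum(s for s, c in zip(t, y) if c == 1) / y.count(1)
    cut = (m0 + m1) / 2.0
    return [1 if s > cut else 0 for s in t]


X, y = simulate()
w, t = first_pls_component(X, y)
pred = classify_by_score(t, y)
accuracy = sum(int(a == b) for a, b in zip(pred, y)) / len(y)
print(f"training accuracy on one PLS component: {accuracy:.2f}")
```

The printed figure is a resubstitution (training) accuracy, which is optimistic when p≫N because the weights were fitted to the same labels; a fair comparison of PLS against PCA would score held-out samples, with the weights recomputed from the training portion only.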

Original language: English (US)
Pages (from-to): 407-425
Number of pages: 19
Journal: Computational Statistics and Data Analysis
Volume: 46
Issue number: 3
State: Published - Jun 15 2004

Keywords

  • DNA microarray
  • Logistic discrimination
  • Partial least squares
  • Principal components analysis

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Statistics, Probability and Uncertainty
  • Electrical and Electronic Engineering
  • Computational Mathematics
  • Numerical Analysis
  • Statistics and Probability
