Chemometric and statistical analyses of ToF-SIMS spectra of increasingly complex biological samples

Elena S F Berman, Ligang Wu, Susan L. Fortson, Kristen S. Kulp, David O. Nelson, Kuang Jen Wu

Research output: Contribution to journal, Article

20 Citations (Scopus)

Abstract

Characterizing and classifying molecular variations within biological samples are critical for determining the fundamental mechanisms of biological processes. Toward these ends, time-of-flight secondary ion mass spectrometry (ToF-SIMS) was used to examine increasingly complex samples of biological relevance. The large, multivariate datasets were analyzed using five common statistical and chemometric techniques: principal component analysis (PCA), linear discriminant analysis (LDA), partial least-squares discriminant analysis (PLSDA), soft independent modeling of class analogy (SIMCA), and decision-tree analysis by recursive partitioning. PCA was found to provide insight into both the relative groupings of samples and the molecular basis for those groupings. For monosaccharide, pure protein, and complex protein mixture samples, LDA, PLSDA, and SIMCA all produced excellent classification. For mouse embryo tissues, however, SIMCA did not classify samples as accurately. The decision-tree analysis was the least successful for all tested samples, providing neither as accurate a classification nor chemical insight. Based on these results we conclude that as the complexity of samples increases, so must the sophistication of the multivariate technique used to classify the samples. PCA is a preferred first step for understanding ToF-SIMS data that can be followed by either LDA or PLSDA for effective classification. This study demonstrates the strength of the combination of ToF-SIMS and multivariate analysis to classify increasingly complex biological samples. Applying these techniques to information-rich mass spectral data opens the possibilities for new applications including classification of subtly different biological samples that may provide insights into cellular processes, disease progress, and disease diagnosis.
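The workflow recommended in the abstract (PCA first, then a discriminant classifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "spectra" are synthetic five-peak intensity vectors rather than ToF-SIMS data from the study, the helper names (`mean_center`, `covariance`, `first_pc`) are hypothetical, and the classifier is a one-dimensional midpoint rule on the first principal component (what two-class LDA reduces to with equal priors and a shared variance), not the authors' LDA or PLSDA implementation.

```python
import random

def mean_center(rows):
    """Subtract each column (peak) mean from every sample."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of mean-centered data (peaks x peaks)."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_pc(cov, iters=200):
    """Leading eigenvector of the covariance matrix by power iteration."""
    p = len(cov)
    v = [1.0 / (i + 1) for i in range(p)]  # asymmetric start vector
    for _ in range(iters):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Synthetic data (assumption): two sample classes, five peak intensities each.
rng = random.Random(0)

def make_spectra(base, n):
    return [[b + rng.gauss(0, 0.3) for b in base] for _ in range(n)]

spectra = (make_spectra([10.0, 8.0, 5.0, 2.0, 1.0], 6)
           + make_spectra([2.0, 1.0, 5.0, 9.0, 10.0], 6))
labels = ["A"] * 6 + ["B"] * 6

# Step 1: PCA. Scores show sample groupings; loadings show which peaks
# drive the separation.
centered = mean_center(spectra)
loadings = first_pc(covariance(centered))
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]

# Step 2: one-dimensional discriminant rule on the PC1 scores
# (threshold at the midpoint between the two class means).
mean_a = sum(s for s, l in zip(scores, labels) if l == "A") / labels.count("A")
mean_b = sum(s for s, l in zip(scores, labels) if l == "B") / labels.count("B")
threshold = (mean_a + mean_b) / 2
high, low = ("A", "B") if mean_a > mean_b else ("B", "A")
predicted = [high if s > threshold else low for s in scores]

print("PC1 loadings:", [round(w, 3) for w in loadings])
print("predicted:", predicted)
```

In this toy setup the loadings are near zero for the peak that both classes share and large for the peaks that differ, which mirrors the abstract's point that PCA reveals both the groupings and their molecular basis. The classification here is evaluated on the same samples used to set the threshold, so real data would call for held-out samples or cross-validation.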

Original language: English (US)
Pages (from-to): 97-104
Number of pages: 8
Journal: Surface and Interface Analysis
Volume: 41
Issue number: 2
DOI: https://doi.org/10.1002/sia.2953
State: Published - Feb 2009
Externally published: Yes

Fingerprint

Discriminant analysis
Secondary ion mass spectrometry
Principal component analysis
Decision trees
Proteins
Monosaccharides
Tissue
Embryos
Classifying
Mice

Keywords

  • Biological
  • Decision tree
  • LDA
  • Multivariate analysis
  • PCA
  • PLSDA
  • SIMCA
  • ToF-SIMS

ASJC Scopus subject areas

  • Chemistry (all)
  • Condensed Matter Physics
  • Surfaces and Interfaces
  • Materials Chemistry
  • Surfaces, Coatings and Films

Cite this

Berman, E. S. F., Wu, L., Fortson, S. L., Kulp, K. S., Nelson, D. O., & Wu, K. J. (2009). Chemometric and statistical analyses of ToF-SIMS spectra of increasingly complex biological samples. Surface and Interface Analysis, 41(2), 97-104. https://doi.org/10.1002/sia.2953

@article{a376671c73e9401f97451f877464f7b2,
title = "Chemometric and statistical analyses of ToF-SIMS spectra of increasingly complex biological samples",
keywords = "Biological, Decision tree, LDA, Multivariate analysis, PCA, PLSDA, SIMCA, ToF-SIMS",
author = "Berman, {Elena S F} and Wu, Ligang and Fortson, {Susan L.} and Kulp, {Kristen S.} and Nelson, {David O.} and Wu, {Kuang Jen}",
year = "2009",
month = "2",
doi = "10.1002/sia.2953",
language = "English (US)",
volume = "41",
pages = "97--104",
journal = "Surface and Interface Analysis",
issn = "0142-2421",
publisher = "John Wiley and Sons Ltd",
number = "2",

}

TY - JOUR

T1 - Chemometric and statistical analyses of ToF-SIMS spectra of increasingly complex biological samples

AU - Berman, Elena S F

AU - Wu, Ligang

AU - Fortson, Susan L.

AU - Kulp, Kristen S.

AU - Nelson, David O.

AU - Wu, Kuang Jen

PY - 2009/2

Y1 - 2009/2

KW - Biological

KW - Decision tree

KW - LDA

KW - Multivariate analysis

KW - PCA

KW - PLSDA

KW - SIMCA

KW - ToF-SIMS

UR - http://www.scopus.com/inward/record.url?scp=60349102451&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=60349102451&partnerID=8YFLogxK

U2 - 10.1002/sia.2953

DO - 10.1002/sia.2953

M3 - Article

VL - 41

SP - 97

EP - 104

JO - Surface and Interface Analysis

JF - Surface and Interface Analysis

SN - 0142-2421

IS - 2

ER -