Swarm intelligence based wavelet coefficient feature selection for mass spectral classification: An application to proteomics data

Weixiang Zhao, Cristina E Davis

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

This paper introduces the ant colony algorithm (ACA), a novel swarm intelligence-based optimization method, as a new feature selection method that picks informative wavelet coefficients from mass spectral data for ovarian cancer diagnostics. After determining suitable parameters for the ACA-based search, we run the feature search 100 times with the number of selected features fixed at five. The results show that: (1) classification accuracy based on five selected wavelet coefficients can reach 100% on the training, validation, and independent testing sets; (2) the eight most frequently selected wavelet coefficients across the 100 runs give 100% accuracy on the training set, 100% on the validation set, and 98.8% on the independent testing set, which suggests that the proposed feature selection method is robust and accurate; and (3) the mass spectral data corresponding to these eight coefficients can be located by inverse wavelet transformation, and the located data still maintain high classification accuracy (100% for the training set, 97.6% for the validation set, and 98.8% for the testing set) while carrying physical and medical meaning for future studies of ovarian cancer mechanisms. Furthermore, the corresponding mass spectral data (potential biomarkers) are in good agreement with other studies that used the same sample set. Together, these results suggest that this feature extraction strategy will benefit the development of intelligent, real-time diagnosis and monitoring systems based on spectroscopy instrumentation.
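The workflow summarized in the abstract (wavelet transform, ant colony search over coefficient subsets of fixed size, then counting which coefficients recur across runs) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a one-level Haar transform stands in for the wavelet decomposition, a nearest-centroid classifier replaces the support vector machine named in the keywords so that only NumPy is needed, and the names `haar_dwt`, `centroid_accuracy`, `aco_select`, and `popular_features`, along with the parameters `n_ants`, `n_iter`, and `rho`, are hypothetical.

```python
from collections import Counter

import numpy as np


def haar_dwt(x):
    """One-level Haar transform along the last axis (even length assumed).

    Returns approximation coefficients followed by detail coefficients.
    """
    x = np.asarray(x, dtype=float)
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / np.sqrt(2)
    detail = (even - odd) / np.sqrt(2)
    return np.concatenate([approx, detail], axis=-1)


def centroid_accuracy(X_tr, y_tr, X_va, y_va):
    """Validation accuracy of a two-class nearest-centroid classifier.

    Stand-in for the SVM mentioned in the keywords; labels must be 0 or 1.
    """
    c0 = X_tr[y_tr == 0].mean(axis=0)
    c1 = X_tr[y_tr == 1].mean(axis=0)
    d0 = np.linalg.norm(X_va - c0, axis=1)
    d1 = np.linalg.norm(X_va - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return float((pred == y_va).mean())


def aco_select(X_tr, y_tr, X_va, y_va, n_select=5,
               n_ants=20, n_iter=30, rho=0.1, seed=0):
    """Simplified ant colony search for a subset of n_select features."""
    rng = np.random.default_rng(seed)
    n_features = X_tr.shape[1]
    pheromone = np.ones(n_features)
    best_subset, best_score = None, -1.0
    for _ in range(n_iter):
        deposit = np.zeros(n_features)
        for _ in range(n_ants):
            # Each ant samples features in proportion to pheromone level.
            p = pheromone / pheromone.sum()
            subset = rng.choice(n_features, size=n_select, replace=False, p=p)
            score = centroid_accuracy(X_tr[:, subset], y_tr,
                                      X_va[:, subset], y_va)
            deposit[subset] += score
            if score > best_score:
                best_subset, best_score = np.sort(subset), score
        # Evaporate old pheromone, then add the averaged new deposits.
        pheromone = (1.0 - rho) * pheromone + deposit / n_ants
    return best_subset, best_score


def popular_features(X_tr, y_tr, X_va, y_va, n_runs=100, top_k=8, **kwargs):
    """Repeat the search with different seeds and rank features by frequency."""
    counts = Counter()
    for seed in range(n_runs):
        subset, _ = aco_select(X_tr, y_tr, X_va, y_va, seed=seed, **kwargs)
        counts.update(subset.tolist())
    return [idx for idx, _ in counts.most_common(top_k)]


if __name__ == "__main__":
    # Synthetic spectra: class 1 differs from class 0 in the first 4 channels.
    rng = np.random.default_rng(1)
    y = np.array([0, 1] * 30)
    X = rng.normal(size=(60, 32))
    X[y == 1, :4] += 3.0
    W = haar_dwt(X)
    subset, score = aco_select(W[:40], y[:40], W[40:], y[40:])
    print("selected coefficients:", subset, "validation accuracy:", score)
    print("most frequent:", popular_features(W[:40], y[:40], W[40:], y[40:],
                                             n_runs=10, n_ants=10, n_iter=10))
```

The defaults `n_select=5`, `n_runs=100`, and `top_k=8` mirror the numbers quoted in the abstract; the pheromone update rule and all other settings are illustrative choices, since the abstract does not state the ones actually used.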

Original language: English (US)
Pages (from-to): 15-23
Number of pages: 9
Journal: Analytica Chimica Acta
Volume: 651
Issue number: 1
DOI: 10.1016/j.aca.2009.08.008
State: Published - Sep 28 2009

Fingerprint

proteomics
Intelligence
Proteomics
wavelet
Feature extraction
Ants
Ovarian Neoplasms
Testing
ant
Biomarkers
cancer
Spectrum Analysis
Spectroscopy
Monitoring
monitoring system
instrumentation
biomarker
spectroscopy
Swarm intelligence
method

Keywords

  • Ant colony algorithm
  • Feature selection
  • Mass spectrometry
  • Support vector machine
  • Swarm intelligence
  • Wavelet

ASJC Scopus subject areas

  • Biochemistry
  • Analytical Chemistry
  • Spectroscopy
  • Environmental Chemistry

Cite this

@article{618f7dc8e10f49fe904b130296e39245,
title = "Swarm intelligence based wavelet coefficient feature selection for mass spectral classification: An application to proteomics data",
abstract = "This paper introduces the ant colony algorithm, a novel swarm intelligence based optimization method, to select appropriate wavelet coefficients from mass spectral data as a new feature selection method for ovarian cancer diagnostics. By determining the proper parameters for the ant colony algorithm (ACA) based searching algorithm, we perform the feature searching process for 100 times with the number of selected features fixed at 5. The results of this study show: (1) the classification accuracy based on the five selected wavelet coefficients can reach up to 100{\%} for all the training, validating and independent testing sets; (2) the eight most popular selected wavelet coefficients of the 100 runs can provide 100{\%} accuracy for the training set, 100{\%} accuracy for the validating set, and 98.8{\%} accuracy for the independent testing set, which suggests the robustness and accuracy of the proposed feature selection method; and (3) the mass spectral data corresponding to the eight popular wavelet coefficients can be located by reverse wavelet transformation and these located mass spectral data still maintain high classification accuracies (100{\%} for the training set, 97.6{\%} for the validating set, and 98.8{\%} for the testing set) and also provide sufficient physical and medical meaning for future ovarian cancer mechanism studies. Furthermore, the corresponding mass spectral data (potential biomarkers) are in good agreement with other studies which have used the same sample set. Together these results suggest this feature extraction strategy will benefit the development of intelligent and real-time spectroscopy instrumentation based diagnosis and monitoring systems.",
keywords = "Ant colony algorithm, Feature selection, Mass spectrometry, Support vector machine, Swarm intelligence, Wavelet",
author = "Zhao, Weixiang and Davis, Cristina E.",
year = "2009",
month = "9",
day = "28",
doi = "10.1016/j.aca.2009.08.008",
language = "English (US)",
volume = "651",
pages = "15--23",
journal = "Analytica Chimica Acta",
issn = "0003-2670",
publisher = "Elsevier",
number = "1"
}

TY - JOUR

T1 - Swarm intelligence based wavelet coefficient feature selection for mass spectral classification

T2 - An application to proteomics data

AU - Zhao, Weixiang

AU - Davis, Cristina E

PY - 2009/9/28

Y1 - 2009/9/28

N2 - This paper introduces the ant colony algorithm, a novel swarm intelligence based optimization method, to select appropriate wavelet coefficients from mass spectral data as a new feature selection method for ovarian cancer diagnostics. By determining the proper parameters for the ant colony algorithm (ACA) based searching algorithm, we perform the feature searching process for 100 times with the number of selected features fixed at 5. The results of this study show: (1) the classification accuracy based on the five selected wavelet coefficients can reach up to 100% for all the training, validating and independent testing sets; (2) the eight most popular selected wavelet coefficients of the 100 runs can provide 100% accuracy for the training set, 100% accuracy for the validating set, and 98.8% accuracy for the independent testing set, which suggests the robustness and accuracy of the proposed feature selection method; and (3) the mass spectral data corresponding to the eight popular wavelet coefficients can be located by reverse wavelet transformation and these located mass spectral data still maintain high classification accuracies (100% for the training set, 97.6% for the validating set, and 98.8% for the testing set) and also provide sufficient physical and medical meaning for future ovarian cancer mechanism studies. Furthermore, the corresponding mass spectral data (potential biomarkers) are in good agreement with other studies which have used the same sample set. Together these results suggest this feature extraction strategy will benefit the development of intelligent and real-time spectroscopy instrumentation based diagnosis and monitoring systems.

KW - Ant colony algorithm

KW - Feature selection

KW - Mass spectrometry

KW - Support vector machine

KW - Swarm intelligence

KW - Wavelet

UR - http://www.scopus.com/inward/record.url?scp=69449088819&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=69449088819&partnerID=8YFLogxK

U2 - 10.1016/j.aca.2009.08.008

DO - 10.1016/j.aca.2009.08.008

M3 - Article

C2 - 19733729

AN - SCOPUS:69449088819

VL - 651

SP - 15

EP - 23

JO - Analytica Chimica Acta

JF - Analytica Chimica Acta

SN - 0003-2670

IS - 1

ER -