A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics

John T. Halloran, David M. Rocke

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, l2-SVM-MFN, large-scale SVM learning requires nearly only a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of l2-SVM-MFN, large-scale Percolator SVM learning is reduced to nearly only a fifth of the original runtime. Importantly, these speedups only affect the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both l2-SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under Apache license at bitbucket.org/jthalloran/percolator-upgrade.
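As a rough illustration of the kind of problem the abstract refers to, the sketch below fits a primal L2-regularized, squared-hinge (L2-loss) linear SVM that separates target from decoy peptide-spectrum matches. It is not Percolator's code and implements neither l2-SVM-MFN nor TRON: it uses a plain Newton iteration with Armijo backtracking and a dense linear solve, which is only practical for a handful of features. The function names, the toy two-feature data, the regularized bias feature, and the value of `C` are all assumptions made for illustration.

```python
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def solve(A, b):
    """Solve the small dense system A x = b by Gaussian elimination."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def objective(w, X, y, C):
    """0.5*||w||^2 + C * sum of squared hinge losses (L2-loss SVM primal)."""
    margins = (1.0 - yi * dot(w, xi) for xi, yi in zip(X, y))
    return 0.5 * dot(w, w) + C * sum(m * m for m in margins if m > 0)


def gradient_and_hessian(w, X, y, C):
    """Gradient and generalized Hessian from margin-violating examples only."""
    d = len(w)
    g = list(w)
    H = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for xi, yi in zip(X, y):
        m = 1.0 - yi * dot(w, xi)
        if m > 0:
            for i in range(d):
                g[i] -= 2.0 * C * yi * m * xi[i]
                for j in range(d):
                    H[i][j] += 2.0 * C * xi[i] * xi[j]
    return g, H


def fit_l2svm(X, y, C=1.0, tol=1e-8, max_iter=50):
    """Newton iterations with Armijo backtracking on the primal objective."""
    w = [0.0] * len(X[0])
    for _ in range(max_iter):
        g, H = gradient_and_hessian(w, X, y, C)
        if max(abs(gi) for gi in g) < tol:
            break
        step = solve(H, [-gi for gi in g])
        f0, slope, t = objective(w, X, y, C), dot(g, step), 1.0
        trial = [wi + si for wi, si in zip(w, step)]
        while t > 1e-12 and objective(trial, X, y, C) > f0 + 1e-4 * t * slope:
            t *= 0.5
            trial = [wi + t * si for wi, si in zip(w, step)]
        w = trial
    return w


def make_toy_psms(n_per_class=100, seed=0):
    """Hypothetical two-feature PSMs; the trailing 1.0 is a bias feature."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((1, 2.0), (-1, -2.0)):  # +1 = target, -1 = decoy
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 1.0), rng.gauss(mean, 1.0), 1.0])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_psms()
    w = fit_l2svm(X, y)
    accuracy = sum(yi * dot(w, xi) > 0 for xi, yi in zip(X, y)) / len(y)
    print("weights:", w, "training accuracy:", accuracy)
```

Because the objective is convex, different solvers that run to convergence should reach the same weights and differ only in runtime, which is consistent with the abstract's statement that the speedups do not alter recalibration performance.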

Original language: English (US)
Pages (from-to): 1978-1982
Number of pages: 5
Journal: Journal of Proteome Research
Volume: 17
Issue number: 5
DOI: 10.1021/acs.jproteome.7b00767
State: Published - May 4 2018

Fingerprint

Proteomics
Support vector machines
Learning systems
Licensure
Support Vector Machine
Machine Learning
Learning algorithms
Software
Learning
Databases
Engines
Peptides

Keywords

  • machine learning
  • percolator
  • support vector machine
  • tandem mass spectrometry
  • TRON

ASJC Scopus subject areas

  • Biochemistry
  • Chemistry (all)

Cite this

A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics. / Halloran, John T.; Rocke, David M.

In: Journal of Proteome Research, Vol. 17, No. 5, 04.05.2018, p. 1978-1982.

@article{5641e594019c4594a9c82f6dbccbb2bf,
title = "A Matter of Time: Faster {Percolator} Analysis via Efficient {SVM} Learning for Large-Scale Proteomics",
abstract = "Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, l2-SVM-MFN, large-scale SVM learning requires nearly only a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of l2-SVM-MFN, large-scale Percolator SVM learning is reduced to nearly only a fifth of the original runtime. Importantly, these speedups only affect the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both l2-SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under Apache license at bitbucket.org/jthalloran/percolator-upgrade.",
keywords = "machine learning, percolator, support vector machine, tandem mass spectrometry, TRON",
author = "Halloran, {John T.} and Rocke, {David M.}",
year = "2018",
month = "5",
day = "4",
doi = "10.1021/acs.jproteome.7b00767",
language = "English (US)",
volume = "17",
pages = "1978--1982",
journal = "Journal of Proteome Research",
issn = "1535-3893",
publisher = "American Chemical Society",
number = "5",

}

TY  - JOUR
T1  - A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics
AU  - Halloran, John T.
AU  - Rocke, David M.
PY  - 2018/5/4
Y1  - 2018/5/4
AB  - Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, l2-SVM-MFN, large-scale SVM learning requires nearly only a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of l2-SVM-MFN, large-scale Percolator SVM learning is reduced to nearly only a fifth of the original runtime. Importantly, these speedups only affect the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both l2-SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under Apache license at bitbucket.org/jthalloran/percolator-upgrade.
KW  - machine learning
KW  - percolator
KW  - support vector machine
KW  - tandem mass spectrometry
KW  - TRON
UR  - http://www.scopus.com/inward/record.url?scp=85046655682&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85046655682&partnerID=8YFLogxK
U2  - 10.1021/acs.jproteome.7b00767
DO  - 10.1021/acs.jproteome.7b00767
M3  - Article
C2  - 29607643
AN  - SCOPUS:85046655682
VL  - 17
SP  - 1978
EP  - 1982
JO  - Journal of Proteome Research
JF  - Journal of Proteome Research
SN  - 1535-3893
IS  - 5
ER  - 