Scalable training of sparse linear SVMs

Guo Xun Yuan, Kwan-Liu Ma

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

6 Scopus citations


Sparse linear support vector machines have been widely applied to variable selection in many applications. For large data, managing the cost of training a sparse model with good prediction performance is an essential topic. In this work, we propose a scalable training algorithm for large-scale data with millions of examples and features. We develop a dual alternating direction method for solving L1-regularized linear SVMs. The learning procedure simply involves quadratic programming in the same form as the standard SVM dual, followed by a soft-thresholding operation. The proposed training algorithm possesses two favorable properties. First, it is a decomposable algorithm, by which a large problem can be reduced to small ones. Second, the sparsity of intermediate solutions is maintained throughout the training process, since soft-thresholding naturally promotes solution sparsity. We demonstrate through experiments that our method outperforms state-of-the-art approaches on large-scale benchmark data sets. We also show that it is well suited for training large sparse models on a distributed system.
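The abstract describes an alternating-direction scheme that interleaves an SVM-form subproblem with a soft-thresholding operation. As a rough illustration of that structure (not the authors' exact dual formulation; the function names, the scaled-multiplier form, and the `solve_svm_subproblem` callback are assumptions for this sketch), the iteration could look like:

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding, the proximal operator of tau * ||.||_1.
    Components with |v_i| <= tau are set exactly to zero, which is how
    soft-thresholding keeps intermediate iterates sparse."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def admm_l1_svm_sketch(solve_svm_subproblem, n_features,
                       rho=1.0, lam=0.1, n_iter=50):
    """Illustrative alternating-direction loop (hypothetical interface):
    solve_svm_subproblem stands in for the quadratic program in
    standard-SVM-dual form described in the abstract."""
    w = np.zeros(n_features)  # weights from the SVM-form subproblem
    z = np.zeros(n_features)  # sparse copy, produced by soft-thresholding
    u = np.zeros(n_features)  # scaled dual (multiplier) variable
    for _ in range(n_iter):
        w = solve_svm_subproblem(z - u)       # QP step (placeholder)
        z = soft_threshold(w + u, lam / rho)  # sparsity-promoting step
        u = u + w - z                         # multiplier update
    return z
```

The key point the sketch illustrates is that each pass applies a cheap elementwise shrinkage, so zeros appear in the iterates long before convergence rather than only in the final solution.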

Original language: English (US)
Title of host publication: Proceedings - 12th IEEE International Conference on Data Mining, ICDM 2012
Number of pages: 10
State: Published - Dec 1, 2012
Event: 12th IEEE International Conference on Data Mining, ICDM 2012 - Brussels, Belgium
Duration: Dec 10, 2012 - Dec 13, 2012



ASJC Scopus subject areas

  • Engineering (all)


