Application of data mining tools for classification of protein structural class from residue based averaged NMR chemical shifts

Arun V. Kumar, Rehana F M Ali, Yu Cao, Viswanathan V Krishnan

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


The number of protein sequences deriving from genome sequencing projects is outpacing our knowledge about the function of these proteins. With the gap between experimentally characterized and uncharacterized proteins continuing to widen, it is necessary to develop new computational methods and tools for obtaining protein structural information that is directly related to function. Nuclear magnetic resonance (NMR) provides a powerful means to determine three-dimensional structures of proteins in the solution state. However, translation of the NMR spectral parameters to even low-resolution structural information such as protein class requires multiple time-consuming steps. In this paper, we present an unorthodox method that uses machine learning algorithms to predict the protein structural class directly from residue-based averaged chemical shifts (ACS). Experimental chemical shift information from 1491 proteins obtained from the Biological Magnetic Resonance Bank (BMRB) and their respective protein structural classes derived from the Structural Classification of Proteins (SCOP) were used to construct a data set with 119 attributes and 5 different classes. Twenty-four different classification schemes were evaluated using several performance measures. Overall, the residue-based ACS values can predict the protein structural classes with 80% accuracy, as measured by the Matthews correlation coefficient. Specifically, protein classes defined by mixed αβ or small proteins are classified with >90% correlation. Our results indicate that this NMR-based method can be utilized as a low-resolution tool for protein structural class identification without any prior chemical shift assignments.
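As a rough illustration of the two ideas named in the abstract, the sketch below averages chemical shifts per (residue type, atom type) pair to form a fixed-length attribute vector, and scores predictions for one class against the rest with the Matthews correlation coefficient. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the input tuple layout, the fill value for missing attributes, and the class labels are all hypothetical, and the abstract does not say how the 119 attributes are composed or whether the coefficient was computed per class.

```python
from collections import defaultdict
from math import sqrt


def averaged_chemical_shifts(shifts):
    """Average shifts per (residue type, atom type) pair.

    `shifts` is an iterable of (residue_type, atom_type, ppm) tuples;
    this layout is an assumption made for the example.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for residue, atom, ppm in shifts:
        sums[(residue, atom)] += ppm
        counts[(residue, atom)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def feature_vector(acs, attributes, missing=0.0):
    """Order the averages into a fixed-length attribute vector.

    Attributes with no observed shift get `missing` (an assumed choice).
    """
    return [acs.get(key, missing) for key in attributes]


def mcc_one_vs_rest(y_true, y_pred, positive):
    """Matthews correlation coefficient for one class against all others."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif pred == positive:
            fp += 1
        else:
            fn += 1
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Toy data only; the shift values and labels are placeholders.
toy_shifts = [("ALA", "CA", 52.0), ("ALA", "CA", 53.0), ("GLY", "CA", 45.0)]
toy_acs = averaged_chemical_shifts(toy_shifts)
toy_vector = feature_vector(
    toy_acs, [("ALA", "CA"), ("GLY", "CA"), ("SER", "CA")]
)
toy_score = mcc_one_vs_rest(
    ["alpha", "alpha", "beta", "beta"],
    ["alpha", "beta", "beta", "beta"],
    positive="alpha",
)
```

Any classifier could consume vectors like `toy_vector`. The coefficient ranges from -1 to 1, so a reported "80%" would correspond to a value of 0.8 on this scale.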

Original language: English (US)
Pages (from-to): 1545-1552
Number of pages: 8
Journal: Biochimica et Biophysica Acta - Proteins and Proteomics
Issue number: 10
State: Published - Oct 1 2015


Keywords

  • Chemical shift
  • Data mining
  • NMR
  • Protein structural class

ASJC Scopus subject areas

  • Analytical Chemistry
  • Biophysics
  • Biochemistry
  • Molecular Biology

