Evaluating Protein Engineering Thermostability Prediction Tools Using an Independently Generated Dataset

Peishan Huang, Simon K.S. Chu, Henrique N. Frizzo, Morgan P. Connolly, Ryan W. Caster, Justin B. Siegel

Research output: Contribution to journal, Article, peer-review



Engineering proteins to enhance thermal stability is a widely used approach for creating industrially relevant biocatalysts. The development of new experimental datasets and computational tools to guide these engineering efforts remains an active area of research. To complement the previously reported measures of T50 and kinetic constants, we report an expansion of our previously published dataset of β-glucosidase mutants to include measures of both TM and ΔΔG. For a set of 51 mutants, we found that T50 and TM are moderately correlated, with a Pearson correlation coefficient of 0.58 and a Spearman's rank coefficient of 0.47, indicating that the two methods capture different physical features. We also evaluated the stability predictions of nine computational tools on the dataset of 51 mutants; none was found to be a strong predictor of the observed changes in T50, TM, or ΔΔG. Furthermore, we examined the ability of the nine algorithms to predict the production of isolatable soluble protein, which revealed that Rosetta ΔΔG, FoldX, DeepDDG, PoPMuSiC, and SDM were capable of predicting whether a mutant could be produced and isolated as a soluble protein. These results further highlight the need for new algorithms for predicting modest, yet important, changes in thermal stability, as well as a new utility for current algorithms: prescreening designs for mutants that maintain fold and soluble production properties.
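The two correlation measures quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis code. The function names and the sample values below are hypothetical, chosen only to show how a Pearson coefficient (linear agreement) and a Spearman coefficient (rank agreement) would be computed for paired T50 and TM measurements.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank coefficient: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements in degrees Celsius (NOT data from the paper).
t50_values = [39.0, 41.5, 38.2, 44.0, 40.1, 42.3]
tm_values = [45.0, 46.8, 46.1, 48.5, 44.2, 50.9]

print("Pearson: ", round(pearson(t50_values, tm_values), 2))
print("Spearman:", round(spearman(t50_values, tm_values), 2))
```

Because Spearman's coefficient depends only on rank order, it is less sensitive than Pearson's to a few mutants with extreme values, which is one reason the two coefficients can differ on the same dataset.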

Original language: English (US)
Journal: ACS Omega
State: Accepted/In press - Jan 1 2020

ASJC Scopus subject areas

  • Chemistry(all)
  • Chemical Engineering(all)

