Toward fully automated high performance computing drug discovery: A massively parallel virtual screening pipeline for docking and molecular mechanics/generalized born surface area rescoring to improve enrichment

Xiaohua Zhang, Sergio E. Wong, Felice C Lightstone

Research output: Contribution to journalArticle

47 Scopus citations

Abstract

In this work we announce and evaluate a high throughput virtual screening pipeline for in-silico screening of virtual compound databases using high performance computing (HPC). Notable features of this pipeline are an automated receptor preparation scheme with unsupervised binding site identification. The pipeline includes receptor/target preparation, ligand preparation, VinaLC docking calculation, and molecular mechanics/generalized Born surface area (MM/GBSA) rescoring using the GB model by Onufriev and co-workers [J. Chem. Theory Comput. 2007, 3, 156-169]. Furthermore, we leverage HPC resources to perform an unprecedented, comprehensive evaluation of MM/GBSA rescoring when applied to the DUD-E data set (Directory of Useful Decoys: Enhanced), in which we selected 38 protein targets and a total of ∼0.7 million actives and decoys. The computer wall time for virtual screening has been reduced drastically on HPC machines, which increases the feasibility of extremely large ligand database screening with more accurate methods. HPC resources allowed us to rescore 20 poses per compound and evaluate the optimal number of poses to rescore. We find that keeping 5-10 poses is a good compromise between accuracy and computational expense. Overall the results demonstrate that MM/GBSA rescoring has higher average receiver operating characteristic (ROC) area under curve (AUC) values and consistently better early recovery of actives than Vina docking alone. Specifically, the enrichment performance is target-dependent. MM/GBSA rescoring significantly out performs Vina docking for the folate enzymes, kinases, and several other enzymes. The more accurate energy function and solvation terms of the MM/GBSA method allow MM/GBSA to achieve better enrichment, but the rescoring is still limited by the docking method to generate the poses with the correct binding modes.

Original languageEnglish (US)
Pages (from-to)324-337
Number of pages14
JournalJournal of Chemical Information and Modeling
Volume54
Issue number1
DOIs
StatePublished - Jan 27 2014
Externally publishedYes

ASJC Scopus subject areas

  • Chemistry(all)
  • Chemical Engineering(all)
  • Computer Science Applications
  • Library and Information Sciences
  • Medicine(all)

Fingerprint Dive into the research topics of 'Toward fully automated high performance computing drug discovery: A massively parallel virtual screening pipeline for docking and molecular mechanics/generalized born surface area rescoring to improve enrichment'. Together they form a unique fingerprint.

  • Cite this