Message passing interface and multithreading hybrid for parallel molecular docking of large databases on petascale high performance computing machines

Xiaohua Zhang, Sergio E. Wong, Felice C Lightstone

Research output: Contribution to journal (Article)

42 Citations (Scopus)

Abstract

A mixed parallel scheme that combines message passing interface (MPI) and multithreading was implemented in the AutoDock Vina molecular docking program. The resulting program, named VinaLC, was tested on the petascale high performance computing (HPC) machines at Lawrence Livermore National Laboratory. To exploit typical cluster-type supercomputers, thousands of docking calculations were dispatched by the master process to run simultaneously on thousands of slave processes, where each docking calculation takes one slave process on one node, and within the node each docking calculation runs via multithreading on multiple CPU cores with shared memory. Input and output of the program and the data handling within the program were carefully designed to deal with large databases and ultimately achieve HPC on a large number of CPU cores. Parallel performance analysis of the VinaLC program shows that the code scales up to more than 15K CPUs with a very low overhead cost of 3.94%. One million flexible compound docking calculations took only 1.4 h to finish on about 15K CPUs. The docking accuracy of VinaLC has been validated against the DUD data set by the re-docking of X-ray ligands and an enrichment study; 64.4% of the top-scoring poses have RMSD values under 2.0 Å. The program has been demonstrated to have good enrichment performance on 70% of the targets in the DUD data set. An analysis of the enrichment factors calculated at various percentages of the screening database indicates that VinaLC has very good early recovery of actives.
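The two-level dispatch described in the abstract (a master handing out one docking job per node, and each job fanning out over that node's cores) can be illustrated with a small sketch. This is not VinaLC's code and it does not use MPI: as a stated assumption, an outer thread pool stands in for the MPI worker processes, an inner thread pool stands in for the per-node multithreading, and `score_pose` is a hypothetical toy function rather than Vina's scoring. The names `dock_on_node`, `worker`, and `run_screen` are likewise illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Empty, Queue
from typing import Dict, List, Tuple


def score_pose(ligand: str, seed: int) -> float:
    """Hypothetical stand-in for one search thread (NOT Vina's scoring function)."""
    # Deterministic toy "energy"; lower is better.
    return -float((sum(map(ord, ligand)) * (seed + 1)) % 97) / 10.0


def dock_on_node(ligand: str, cores: int) -> Tuple[str, float]:
    """One docking job: fan out over `cores` threads and keep the lowest score."""
    with ThreadPoolExecutor(max_workers=cores) as pool:
        scores = list(pool.map(lambda s: score_pose(ligand, s), range(cores)))
    return ligand, min(scores)


def worker(tasks: Queue, cores: int) -> Dict[str, float]:
    """Worker loop (stand-in for one MPI worker process on one node)."""
    local: Dict[str, float] = {}
    while True:
        try:
            ligand = tasks.get_nowait()  # ask for the next job
        except Empty:
            return local  # no work left: hand results back
        name, best = dock_on_node(ligand, cores)
        local[name] = best


def run_screen(ligands: List[str], nodes: int = 4, cores: int = 3) -> Dict[str, float]:
    """Master: queue every ligand, start `nodes` workers, and merge their results."""
    tasks: Queue = Queue()
    for lig in ligands:
        tasks.put(lig)
    results: Dict[str, float] = {}
    with ThreadPoolExecutor(max_workers=nodes) as node_pool:
        futures = [node_pool.submit(worker, tasks, cores) for _ in range(nodes)]
        for fut in futures:
            results.update(fut.result())  # also re-raises any worker error
    return results


if __name__ == "__main__":
    screen = run_screen([f"lig{i}" for i in range(20)])
    for name in sorted(screen, key=screen.get)[:3]:
        print(name, screen[name])
```

Because workers pull jobs from a shared queue as they finish, a slow ligand delays only the worker that holds it, which is one plausible reason a dispatch scheme of this shape keeps overhead low when job run times vary. In an MPI setting the queue would be replaced by messages between the master and worker ranks.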

Original language: English (US)
Pages (from-to): 915-927
Number of pages: 13
Journal: Journal of Computational Chemistry
Volume: 34
Issue number: 11
DOI: 10.1002/jcc.23214
State: Published - Apr 30 2013
Externally published: Yes

Fingerprint

Molecular Docking
Message Passing Interface
Multithreading
Docking
Message passing
High Performance
Program processors
Computing
Data Handling
Scale-up
Supercomputers
Vertex of a graph
Shared Memory
Scoring
Performance Analysis
Screening
Percentage
Recovery

Keywords

  • AutoDock
  • Virtual Screening
  • HPC
  • Molecular Docking
  • MPI
  • Vina

ASJC Scopus subject areas

  • Chemistry(all)
  • Computational Mathematics

Cite this

@article{b460f1ebf8644279a9ead29deae98ee6,
title = "Message passing interface and multithreading hybrid for parallel molecular docking of large databases on petascale high performance computing machines",
abstract = "A mixed parallel scheme that combines message passing interface (MPI) and multithreading was implemented in the AutoDock Vina molecular docking program. The resulting program, named VinaLC, was tested on the petascale high performance computing (HPC) machines at Lawrence Livermore National Laboratory. To exploit typical cluster-type supercomputers, thousands of docking calculations were dispatched by the master process to run simultaneously on thousands of slave processes, where each docking calculation takes one slave process on one node, and within the node each docking calculation runs via multithreading on multiple CPU cores with shared memory. Input and output of the program and the data handling within the program were carefully designed to deal with large databases and ultimately achieve HPC on a large number of CPU cores. Parallel performance analysis of the VinaLC program shows that the code scales up to more than 15K CPUs with a very low overhead cost of 3.94{\%}. One million flexible compound docking calculations took only 1.4 h to finish on about 15K CPUs. The docking accuracy of VinaLC has been validated against the DUD data set by the re-docking of X-ray ligands and an enrichment study; 64.4{\%} of the top-scoring poses have RMSD values under 2.0 {\AA}. The program has been demonstrated to have good enrichment performance on 70{\%} of the targets in the DUD data set. An analysis of the enrichment factors calculated at various percentages of the screening database indicates that VinaLC has very good early recovery of actives.",
keywords = "AutoDock, Virtual Screening, HPC, Molecular Docking, MPI, Vina",
author = "Xiaohua Zhang and Wong, {Sergio E.} and Lightstone, {Felice C}",
year = "2013",
month = "4",
day = "30",
doi = "10.1002/jcc.23214",
language = "English (US)",
volume = "34",
pages = "915--927",
journal = "Journal of Computational Chemistry",
issn = "0192-8651",
publisher = "John Wiley and Sons Inc.",
number = "11"
}

TY - JOUR

T1 - Message passing interface and multithreading hybrid for parallel molecular docking of large databases on petascale high performance computing machines

AU - Zhang, Xiaohua

AU - Wong, Sergio E.

AU - Lightstone, Felice C

PY - 2013/4/30

Y1 - 2013/4/30

AB - A mixed parallel scheme that combines message passing interface (MPI) and multithreading was implemented in the AutoDock Vina molecular docking program. The resulting program, named VinaLC, was tested on the petascale high performance computing (HPC) machines at Lawrence Livermore National Laboratory. To exploit typical cluster-type supercomputers, thousands of docking calculations were dispatched by the master process to run simultaneously on thousands of slave processes, where each docking calculation takes one slave process on one node, and within the node each docking calculation runs via multithreading on multiple CPU cores with shared memory. Input and output of the program and the data handling within the program were carefully designed to deal with large databases and ultimately achieve HPC on a large number of CPU cores. Parallel performance analysis of the VinaLC program shows that the code scales up to more than 15K CPUs with a very low overhead cost of 3.94%. One million flexible compound docking calculations took only 1.4 h to finish on about 15K CPUs. The docking accuracy of VinaLC has been validated against the DUD data set by the re-docking of X-ray ligands and an enrichment study; 64.4% of the top-scoring poses have RMSD values under 2.0 Å. The program has been demonstrated to have good enrichment performance on 70% of the targets in the DUD data set. An analysis of the enrichment factors calculated at various percentages of the screening database indicates that VinaLC has very good early recovery of actives.

KW - AutoDock

KW - Virtual Screening

KW - HPC

KW - Molecular Docking

KW - MPI

KW - Vina

UR - http://www.scopus.com/inward/record.url?scp=84875366733&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84875366733&partnerID=8YFLogxK

U2 - 10.1002/jcc.23214

DO - 10.1002/jcc.23214

M3 - Article

C2 - 23345155

AN - SCOPUS:84875366733

VL - 34

SP - 915

EP - 927

JO - Journal of Computational Chemistry

JF - Journal of Computational Chemistry

SN - 0192-8651

IS - 11

ER -