Adverse drug reaction prediction using scores produced by large-scale drug-protein target docking on high-performance computing machines

Montiago X. Labute, Xiaohua Zhang, Jason Lenderman, Brian J. Bennion, Sergio E. Wong, Felice C. Lightstone

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

Late-stage or post-market identification of adverse drug reactions (ADRs) is a significant public health issue and a source of major economic liability for drug development. Thus, reliable in silico screening of drug candidates for possible ADRs would be advantageous. In this work, we introduce a computational approach that predicts ADRs by combining the results of molecular docking with known ADR information from DrugBank and SIDER. We employed a recently parallelized version of AutoDock Vina (VinaLC) to dock 906 small-molecule drugs to a virtual panel of 409 DrugBank protein targets. L1-regularized logistic regression models were trained on the resulting docking scores of a 560-compound subset of the initial 906 compounds to predict 85 side effects, grouped into 10 ADR phenotype groups. Only 21% (87 out of 409) of the drug-protein binding features involve known targets of the drug subset, providing a significant probe of off-target effects. As a control, associations of this drug subset with the 555 annotated targets of these compounds, as reported in DrugBank, were used as features to train a separate group of models. The Vina off-target models and the DrugBank on-target models yielded comparable median areas under the receiver operating characteristic curve (AUCs) during 10-fold cross-validation (0.60-0.69 and 0.61-0.74, respectively). Evidence was found in the PubMed literature to support several putative ADR-protein associations identified by our analysis. Among them, several associations between neoplasm-related ADRs and known tumor suppressor and tumor invasiveness marker proteins were found. A dual role for interstitial collagenase in both neoplasms and aneurysm formation was also identified. These associations all involve off-target proteins and could not have been found using available drug/on-target interaction data. This study illustrates a path forward to comprehensive ADR virtual screening that can potentially scale with an increasing number of CPUs to tens of thousands of protein targets and millions of potential drug candidates.
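
The modeling step described in the abstract (L1-regularized logistic regression on docking scores, evaluated by median AUC over 10-fold cross-validation) can be illustrated with a minimal sketch. The abstract does not say which software the authors used for the regression, so everything below is an assumption made for illustration: the data are synthetic z-scored "docking scores" with a single binary ADR label, the solver is plain proximal gradient descent, and all function names, parameter values, and the choice of informative features are hypothetical. It uses only the Python standard library.

```python
import math
import random
from statistics import median


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.02, lr=0.5, n_iter=100):
    """Fit L1-penalized logistic regression by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_aucs(X, y, k=10, seed=0):
    """Return one held-out AUC per fold of a stratified k-fold split."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    idx.sort(key=lambda i: y[i])           # stable sort groups the classes
    folds = [idx[i::k] for i in range(k)]  # striding spreads both classes
    results = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w, b = fit_l1_logistic([X[i] for i in train], [y[i] for i in train])
        scores = [b + sum(wj * xj for wj, xj in zip(w, X[i])) for i in fold]
        results.append(auc(scores, [y[i] for i in fold]))
    return results


def make_synthetic(n=120, d=10, seed=1):
    """Hypothetical data: z-scored 'docking scores' and one binary ADR label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    # Assumption for the demo: only features 0 and 3 influence the label.
    y = [1 if rng.random() < sigmoid(2.0 * x[0] - 1.5 * x[3]) else 0 for x in X]
    return X, y


X, y = make_synthetic()
fold_aucs = cross_validated_aucs(X, y)
print("median held-out AUC:", round(median(fold_aucs), 2))
```

In the study's setting, each row would correspond to one of the 560 compounds and each column to one of the 409 protein targets, with one such model trained per ADR group. The L1 penalty drives many weights to exactly zero, which is what makes the surviving drug-protein features readable as candidate ADR-protein associations.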

Original language: English (US)
Article number: e106298
Journal: PLoS One
Volume: 9
Issue number: 9
DOI: 10.1371/journal.pone.0106298
State: Published - Sep 5 2014
Externally published: Yes

Fingerprint

Computing Methodologies
Drug-Related Side Effects and Adverse Reactions
drugs
prediction
Pharmaceutical Preparations
Proteins
proteins
Logistic Models
Neoplasms
Matrix Metalloproteinase 1
Preclinical Drug Evaluations
Drug and Narcotic Control
neoplasms
Tumor Biomarkers
PubMed
Protein Binding
ROC Curve
Computer Simulation
Area Under Curve
Aneurysm

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)

Cite this

Adverse drug reaction prediction using scores produced by large-scale drug-protein target docking on high-performance computing machines. / Labute, Montiago X.; Zhang, Xiaohua; Lenderman, Jason; Bennion, Brian J.; Wong, Sergio E.; Lightstone, Felice C.

In: PLoS One, Vol. 9, No. 9, e106298, 05.09.2014.

@article{de5a29c8478c438884258be16f475387,
title = "Adverse drug reaction prediction using scores produced by large-scale drug-protein target docking on high-performance computing machines",
author = "Labute, {Montiago X.} and Xiaohua Zhang and Jason Lenderman and Bennion, {Brian J.} and Wong, {Sergio E.} and Lightstone, {Felice C}",
year = "2014",
month = "9",
day = "5",
doi = "10.1371/journal.pone.0106298",
language = "English (US)",
volume = "9",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "9",

}

TY - JOUR

T1 - Adverse drug reaction prediction using scores produced by large-scale drug-protein target docking on high-performance computing machines

AU - Labute, Montiago X.

AU - Zhang, Xiaohua

AU - Lenderman, Jason

AU - Bennion, Brian J.

AU - Wong, Sergio E.

AU - Lightstone, Felice C

PY - 2014/9/5

Y1 - 2014/9/5

UR - http://www.scopus.com/inward/record.url?scp=84906997312&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84906997312&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0106298

DO - 10.1371/journal.pone.0106298

M3 - Article

VL - 9

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 9

M1 - e106298

ER -