TY - JOUR
T1 - Adverse drug reaction prediction using scores produced by large-scale drug-protein target docking on high-performance computing machines
AU - Labute, Montiago X.
AU - Zhang, Xiaohua
AU - Lenderman, Jason
AU - Bennion, Brian J.
AU - Wong, Sergio E.
AU - Lightstone, Felice C.
PY - 2014/9/5
Y1 - 2014/9/5
N2 - Late-stage or post-market identification of adverse drug reactions (ADRs) is a significant public health issue and a source of major economic liability for drug development. Thus, reliable in silico screening of drug candidates for possible ADRs would be advantageous. In this work, we introduce a computational approach that predicts ADRs by combining the results of molecular docking with known ADR information from DrugBank and SIDER. We employed a recently parallelized version of AutoDock Vina (VinaLC) to dock 906 small molecule drugs to a virtual panel of 409 DrugBank protein targets. L1-regularized logistic regression models were trained on the resulting docking scores of a 560-compound subset of the initial 906 compounds to predict 85 side effects, grouped into 10 ADR phenotype groups. Only 21% (87 out of 409) of the drug-protein binding features involve known targets of the drug subset, providing a significant probe of off-target effects. As a control, associations of this drug subset with the 555 annotated targets of these compounds, as reported in DrugBank, were used as features to train a separate group of models. The Vina off-target models and the DrugBank on-target models yielded comparable median area-under-the-receiver-operating-characteristic-curves (AUCs) during 10-fold cross-validation (0.60-0.69 and 0.61-0.74, respectively). Evidence was found in the PubMed literature to support several putative ADR-protein associations identified by our analysis. Among them, several associations between neoplasm-related ADRs and known tumor suppressor and tumor invasiveness marker proteins were found. A dual role for interstitial collagenase in both neoplasms and aneurysm formation was also identified. These associations all involve off-target proteins and could not have been found using available drug/on-target interaction data. This study illustrates a path forward to comprehensive ADR virtual screening that can potentially scale with an increasing number of CPUs to tens of thousands of protein targets and millions of potential drug candidates.
UR - http://www.scopus.com/inward/record.url?scp=84906997312&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84906997312&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0106298
DO - 10.1371/journal.pone.0106298
M3 - Article
C2 - 25191698
AN - SCOPUS:84906997312
VL - 9
JO - PLoS One
JF - PLoS One
SN - 1932-6203
IS - 9
M1 - e106298
ER -