A data set of 348 urea-like compounds that inhibit the soluble epoxide hydrolase enzyme in mice and humans is examined. Compounds with IC50 values ranging from 0.06 to >500 μM (murine) and 0.10 to >500 μM (human) are categorized as active or inactive for classification, while quantitative modeling is performed on smaller compound subsets with IC50 values ranging from 0.07 to 431 μM (murine) and 0.11 to 490 μM (human). Each compound is represented by calculated structural descriptors that encode topological, geometrical, electronic, and polar surface features. Multiple linear regression (MLR) and computational neural networks (CNNs) are used to build quantitative models. Three classification algorithms, k-nearest neighbor (kNN), linear discriminant analysis (LDA), and radial basis function neural networks (RBFNN), are used to categorize compounds as active or inactive based on selected data split points. Quantitative modeling of human enzyme inhibition results in a nonlinear, five-descriptor model with root-mean-square errors (log units of IC50 [μM]) of 0.616 (r² = 0.66), 0.674 (r² = 0.61), and 0.914 (r² = 0.33) for the training, cross-validation, and prediction sets, respectively. The best classification results for both human and murine enzyme inhibition are obtained with kNN. Human classification rates using a seven-descriptor model are 89.1% and 91.4% for the training and prediction sets, respectively. Murine classification rates using a five-descriptor model are 91.5% and 88.6% for the training and prediction sets, respectively.
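The error statistics quoted above are expressed in log units of IC50 (μM). As a minimal sketch of how such figures can be computed, the Python function below takes observed and predicted IC50 values, converts them to log10, and returns the root-mean-square error together with r². The function name `log_rmse_and_r2` and the sample values are hypothetical. Two assumptions are made here that the abstract does not confirm: the RMSE divides by the number of compounds n, and r² is the squared Pearson correlation between observed and predicted log values.

```python
import math


def log_rmse_and_r2(observed_um, predicted_um):
    """Return (RMSE, r^2) computed on log10-transformed IC50 values in uM.

    Assumptions (not taken from the abstract): RMSE divides by n, and
    r^2 is the squared Pearson correlation of observed vs. predicted logs.
    """
    if len(observed_um) != len(predicted_um) or len(observed_um) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")

    # Work in log10 units, since the errors are reported in log units of IC50.
    obs = [math.log10(v) for v in observed_um]
    pred = [math.log10(v) for v in predicted_um]
    n = len(obs)

    # Root-mean-square error of the log-scale residuals.
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)

    # Squared Pearson correlation between observed and predicted logs.
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    r2 = cov ** 2 / (var_o * var_p)

    return rmse, r2


# Made-up IC50 values (uM) for illustration only; not data from the study.
observed = [0.11, 2.5, 40.0, 490.0]
predicted = [0.30, 1.8, 65.0, 210.0]
rmse, r2 = log_rmse_and_r2(observed, predicted)
print(f"RMSE (log units) = {rmse:.3f}, r^2 = {r2:.2f}")
```

Working on the log scale keeps a compound at 0.1 μM and one at 400 μM on comparable footing; an RMSE of about 0.6 log units corresponds to predictions that are typically within a factor of roughly four of the measured IC50.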
ASJC Scopus subject areas
- Organic Chemistry