Journal of Taibah University for Science 10 (2016) 534–542
Available online at www.sciencedirect.com
ScienceDirect
The inhibitory activity of aldose reductase of flavonoid compounds: Combining DFT and QSAR calculations
Mounir Ghamali a,∗, Samir Chtita a, Rachid Hmamouchi a, Azeddine Adad a, Mohammed Bouachrine b, Tahar Lakhlifi a
a Molecular Chemistry and Natural Substances Laboratory, Faculty of Science, University Moulay Ismail, Meknes, Morocco
b ESTM, University Moulay Ismail, Meknes, Morocco
Received 29 April 2015; received in revised form 12 September 2015; accepted 14 September 2015
Available online 6 November 2015
Abstract
The DFT-B3LYP method with the 6-31G(d) basis set was used to calculate several quantum chemical descriptors of 44 substituted flavonoids. The best descriptors were selected to establish the quantitative structure–activity relationship (QSAR) of the inhibitory activity against aldose reductase using principal components analysis (PCA), multiple linear regression (MLR), multiple nonlinear regression (MNLR) and an artificial neural network (ANN). We propose a quantitative model based on these analyses and interpret the activity of the compounds using the multivariate statistical analysis.
This study shows that the MLR and MNLR models predict the activity, but the predictions of the ANN model are better than those of the other two models. The results indicate that the ANN model is statistically significant and shows very good stability toward data variation under the validation method. The contribution of each descriptor to the structure–activity relationship was also evaluated.
© 2015 The Authors. Production and hosting by Elsevier B.V. on behalf of Taibah University. This is an open access article under
the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: QSAR model; DFT study; Flavonoids; Aldose reductase; Artificial neural network
∗ Corresponding author. Tel.: +212 670301669.
E-mail address: [anonimizat] (M. Ghamali).
Peer review under responsibility of Taibah University.
http://dx.doi.org/10.1016/j.jtusci.2015.09.006
1. Introduction
Aldose reductase (AR) is an essential enzyme of the
polyol pathway and plays a vital role in the development
of diabetic complications. Aldose reductase normally
functions to reduce toxic aldehydes in the cell to inactive
alcohols; however, when the glucose concentration in
the cell becomes too high, aldose reductase also reduces
the glucose to sorbitol, which is later oxidized to fruc-
tose. During the process of reducing high intracellular
glucose to sorbitol, aldose reductase consumes NADPH.
The accumulation of sorbitol can lead to an osmotic
imbalance and may contribute to the progression of
diabetic complications, such as cataracts, neuropathy,
and nephropathy [1–3] . The inhibition of AR is a
possible prevention or treatment for these effects [4].
Aldose reductase inhibitory activity has been found in several structurally diverse classes of compounds, including flavone, coumarin, xanthine, naphthalene and quinazoline derivatives [5–8], but the flavonoid derivatives were more potent. Recent QSAR studies on data sets of inhibitory activities against the AR enzyme for substituted flavonoids were reported using topological indices (JCIM 46, 90, 2006; JBMC 16, 7473, 2008) [9,10] and three-dimensional (3D)-QSAR methodologies (JMM 15, 841, 2009; JMC 6, 33, 2010) [11,12].
Flavonoids are a group of naturally occurring polyphenolic compounds that are ubiquitously found in fruits and vegetables [13–15]. Chemically, flavonoids are benzo-γ-pyrone derivatives, and they have potential applications for a variety of pharmacological targets. The structural diversity of these compounds provides anti-hepatitis, antibacterial, anti-inflammatory, anti-mutagenic, anti-allergic, and anti-viral activities [16–18].
The increase in the speed and efficiency of drug
discovery has been aided by large investments from
major pharmaceutical companies, with the primary aim
of reducing the cost per synthesized compound or assay.
Computational models that predict the biological activ-
ity of compounds based on their structural properties
are powerful tools to design highly active molecules.
In this sense, quantitative structure–activity relationship
(QSAR) studies have been successfully applied for mod-
eling the biological activities of natural and synthetic
chemicals [19].
The objective of this study was to develop predictive QSAR models of the inhibitory activity of flavonoid derivatives against the aldose reductase enzyme using several statistical tools: principal components analysis (PCA), multiple linear regression (MLR), multiple nonlinear regression (MNLR) and artificial neural network (ANN) calculations. To test the performance and stability of these models, we used leave-one-out cross-validation and external validation on test sets.
2. Material and methods
2.1. Data sources
In the present study, we selected 44 substituted flavonoids with activities reported in the literature by Stefanic-Petek et al. [20]. The activities are reported as log (1/IC50), or pIC50, values, where IC50 refers to the molar concentration of the compound required for 50% inhibition of AR activity. Stefanic-Petek et al. originally reported 75 compounds; the 31 compounds not retained here had structures that differed from the basic structure required for this study. Fig. 1 represents the basic structure of the flavonoids, and Table 1 shows the substituted compounds studied and their experimental activities expressed as pIC50. For the proper validation of our data set with a QSAR model, the 44 substituted flavonoids were divided into training and test sets. A total of 35 molecules were placed in the training set to build the QSAR models, whereas the remaining 9 molecules composed the test set. The division was performed by random selection.
Fig. 1. Chemical structure of the studied compounds.
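The pIC50 transformation and the random 35/9 partition described above can be sketched in Python as follows. This is an illustrative sketch only: the function names and the random seed are hypothetical, and the split it produces will not reproduce the actual test set marked in Table 1.

```python
import math
import random


def pic50(ic50_molar):
    """Convert a molar IC50 into pIC50 = log10(1 / IC50)."""
    return -math.log10(ic50_molar)


def random_split(ids, n_test, seed=0):
    """Randomly pick n_test compound numbers for the test set; the rest form the training set."""
    rng = random.Random(seed)  # hypothetical seed, fixed only for repeatability
    test = sorted(rng.sample(ids, n_test))
    train = [i for i in ids if i not in test]
    return train, test


compound_ids = list(range(1, 45))  # the 44 flavonoids, numbered as in Table 1
train_ids, test_ids = random_split(compound_ids, n_test=9)
print(len(train_ids), len(test_ids))
print(pic50(1e-7))  # an IC50 of 0.1 micromolar, chosen only as an example
```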
2.2. Molecular descriptors
Currently, there are a large number of molecular
descriptors used in QSAR studies. After validation, the
findings can be used to predict the activity of untested
compounds.
The electronic descriptors were computed using the Gaussian03W package [21]. The geometries of the 44 substituted flavonoids were optimized with the DFT method using the B3LYP functional and the 6-31G(d) basis set. Then, several related structural parameters were selected from the results of the quantum computation: the highest occupied molecular orbital energy (EHOMO), the lowest unoccupied molecular orbital energy (ELUMO), the dipole moment (DM), the total energy (ET), the absolute hardness (η), the absolute electronegativity (χ) and the reactivity index (ω) [22]. The η, χ and ω were determined using the following equations:
\[
\eta = \frac{E_{LUMO} - E_{HOMO}}{2}, \qquad \chi = \frac{E_{LUMO} + E_{HOMO}}{2}, \qquad \omega = \frac{\mu^{2}}{2\eta}
\]
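As a quick numerical check of these definitions, the following Python sketch recomputes η, χ and ω for compound 1 from its EHOMO and ELUMO values in Table 2. It assumes that μ in the expression for ω takes the same value as χ as defined above (the text does not define μ separately); the function name is illustrative only.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Return (eta, chi, omega) from the frontier orbital energies."""
    eta = (e_lumo - e_homo) / 2.0
    chi = (e_lumo + e_homo) / 2.0
    mu = chi  # assumption: mu is taken equal to chi as defined in the text
    omega = mu ** 2 / (2.0 * eta)
    return eta, chi, omega


# Compound 1 in Table 2: EHOMO = -5.484, ELUMO = -1.350
eta, chi, omega = reactivity_descriptors(-5.484, -1.350)
print(round(eta, 3), round(chi, 3), round(omega, 3))
```

Under this assumption the results agree with the tabulated values for compound 1 (η = 2.067, χ = −3.417, ω = 2.825) to within rounding.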

The ACD/ChemSketch program [23] was used to calculate the topological descriptors: molar volume (MV), molecular weight (MW), molar refractivity (MR), parachor (Pc), density (D), refractive index (n) and surface tension (γ).

Table 1
Observed activity of the studied flavonoid derivatives.
No.   Substituent                      pIC50    No.   Substituent                      pIC50
1     5,7,3′,4′-OH; 3,6-OCH3           7.590    23    7,3′,4′-OH; 3,5,8-OCH3           6.550
2     3′,4′-OH; 5,6,7,8-OCH3           7.490    24    5,6,3′,4′-OH; 7-OCH3             6.520
3     6,3′,4′-OH; 5,7,8-OCH3           7.440    25    6,3′,4′-OH; 3,5,7-OCH3           6.520
4     5,7,3′,4′-OH; 6-OCH3; 8-CH2Ph    7.470    26    5,3′,4′-OH; 3,6,7-OCH3           6.460
5     5,3′,4′-OH; 6,7,8-OCH3           7.410    27    5,7,4′-OH; 6,8-OCH3              6.390
6     3′,4′-OH; 5,7,8-OCH3             7.350    28    5,4′-OH; 6,7,8-OCH3              6.270
7     5,6,7,3′,4′-OH; 3-OCH3           7.240    29    5,6,3′,4′-OH; 3,7-OCH3           6.090
8     5,6,3′,4′-OH; 7,8-OCH3           7.190    30    3,5,7,3′,4′-OH                   6.020
9a    7,3′,4′-OH; 5,8-OCH3             7.130    31a   5,6,7,4′-OH; 8-OCH3              5.920
10    5,3′,4′-OH; 7,8-OCH3             7.110    32    5,4′-OH; 6,7-OCH3                5.850
11    3′,4′-OH; 5,6,7-OCH3             7.040    33a   5,7,4′-OH; 6,8,3′-OCH3           5.350
12    5,6,7,3′,4′-OH; 8-OCH3           6.920    34    6,4′-OH; 5,7,8,3′-OCH3           5.200
13    6,3′,4′-OH; 5,7-OCH3             6.850    35    5,4′-OH; 6,7,3′-OCH3             5.170
14    4′-OH; 5,6,7,8-OCH3              6.790    36a   5,7-OH; 6,8,4′-OCH3              5.140
15a   8,3′,4′-OH; 5,7-OCH3             6.790    37a   5,6,7-OH; 8-OCH3                 5.090
16    3′,4′-OH; 3,5,7,8-OCH3           6.770    38    5,6-OH; 7,8-OCH3                 5.080
17a   5,6,7,3′,4′-OH                   6.690    39    3′,4′-OH; 5,6,7-OCH3; 3-COCH3    5.050
18    5,3′,4′-OH; 6,7-OCH3             6.920    40    4′-OH; 5,6,7,8,3′-OCH3           4.740
19    5,8,3′,4′-OH; 7-OCH3             6.640    41    5,7-OH; 6,8,3′,4′-OCH3           4.670
20    5,7,3′,4′-OH; 3,8-OCH3           6.620    42    5,4′-OH; 6,7,8,3′-OCH3           4.420
21a   6,4′-OH; 5,7,8-OCH3              6.600    43    5,6,4′-OH; 7,8,3′-OCH3           3.960
22a   5,7,3′,4′-OH; 8-OCH3             6.550    44    6-OH; 5,7,8-OCH3                 4.440
a Test set.
2.3. Statistical analysis
The objective of quantitative structure–activity relationship (QSAR) analysis is to derive empirical models that relate the biological activity of compounds to their chemical structures. In this QSAR analysis, quantitative descriptors are used to describe the chemical structure, and the analysis yields a mathematical model describing the relationship between the chemical structure and the biological activity. To explain the structure–activity relationship, 14 descriptors were calculated for the 44 molecules using the Gaussian03W and ChemSketch programs.
The quantitative descriptors of the substituted flavonoids were studied using statistical methods based on principal component analysis (PCA) [24] with the software XLSTAT version 2013 [25]. PCA is a useful statistical technique for summarizing all of the information encoded in the structures of the compounds. It is also helpful for understanding the distribution of the compounds [26]. This is an essentially descriptive statistical method that aims to present, in graphic form, the maximum information contained in the data shown in Tables 1 and 2.
Multiple linear regression (MLR) analysis with backward (descending) selection and elimination of variables was used to model the structure–activity relationship. It is a mathematical technique that minimizes the difference between the actual and predicted values. Additionally, it selects the descriptors used as the input parameters in the multiple nonlinear regression (MNLR) and artificial neural network (ANN).
MLR and MNLR models were generated using the software XLSTAT version 2013. The equations used to predict the pIC50 are assessed by the correlation coefficient (R), the mean squared error (MSE), Fisher’s criterion (F) and the significance level (P).
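For a least-squares regression with an intercept, Fisher's criterion can be recovered from R, the number of compounds N and the number of descriptors p through the standard relation F = (R²/p) / ((1 − R²)/(N − p − 1)). The Python sketch below applies it to the values reported for the MLR model in Section 3.3 (R = 0.758, N = 35, six descriptors). The function name is illustrative, and the small difference from the reported F = 6.311 is assumed to come from the rounding of R.

```python
def f_statistic(r, n_compounds, n_descriptors):
    """Fisher's F for an ordinary least-squares fit that includes an intercept."""
    r2 = r * r
    dof_model = n_descriptors
    dof_residual = n_compounds - n_descriptors - 1
    return (r2 / dof_model) / ((1.0 - r2) / dof_residual)


# Values reported for the MLR model: R = 0.758, N = 35, six descriptors
f_mlr = f_statistic(0.758, 35, 6)
print(round(f_mlr, 3))
```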
The ANN analysis was performed using the Neural Fitting Tool (nftool) of the MATLAB software, version 2009a, on a data set of the compounds [27]. A number of individual ANN models were designed, built and trained. Three components constitute a neural network: the processing elements or nodes, the topology of the connections between the nodes, and the learning rule by which new information is encoded in the network. Although there are many different ANN models, the most frequently used type of ANN in QSAR is the three-layered feed-forward network [28]. In this type of network, the neurons are arranged in layers: an input layer, one hidden layer and an output layer. Each neuron in any layer is fully connected with the neurons of the succeeding layer, and there are no connections between neurons belonging to the same layer.
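The three-layer topology described above can be illustrated with a minimal forward pass in plain Python. This is a sketch under stated assumptions, not the configuration used with nftool: the hidden-layer size (4), the tanh activation, the random initial weights and the input values are all hypothetical. The six inputs stand for the six descriptors that the MLR step passes on to the ANN.

```python
import math
import random


def make_network(n_inputs, n_hidden, seed=0):
    """Random weights for a fully connected input -> hidden -> output network.

    Each weight vector stores its bias as the last element.
    """
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network, x):
    """Propagate one input vector through the hidden layer to the single output neuron."""
    hidden, output = network
    # zip stops at len(x), so the trailing bias weight is added separately
    h = [math.tanh(w[-1] + sum(wi * xi for wi, xi in zip(w, x))) for w in hidden]
    return output[-1] + sum(wo * hi for wo, hi in zip(output, h))


net = make_network(n_inputs=6, n_hidden=4)
scaled_descriptors = [0.1, -0.3, 0.5, 0.0, 0.2, -0.1]  # hypothetical scaled inputs
y = forward(net, scaled_descriptors)
print(y)
```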

Table 2
Values of the obtained parameters of the studied substituted flavonoids.
No. pIC50 ET EHOMO ELUMO DM ω η χ MW MR MV Pc n γ D
1b,c 7.590 −34,254.400 −5.484 −1.350 5.885 2.825 2.067 −3.417 346.288 83.460 208.700 651.900 1.731 95.100 1.650
2 7.490 −36,394.548 −5.742 −1.494 4.169 3.082 2.124 −3.618 374.341 94.680 272.100 740.600 1.612 54.800 1.375
3b 7.440 −35,324.509 −5.629 −1.500 4.494 3.077 2.065 −3.564 360.000 89.880 246.500 697.200 1.649 63.900 1.461
4 7.470 −38,498.254 −5.694 −1.362 3.173 2.873 2.166 −3.528 406.385 107.720 276.300 807.300 1.707 72.800 1.470
5 7.410 −35,324.908 −5.550 −1.695 6.296 3.405 1.927 −3.623 360.315 89.880 246.500 697.200 1.649 63.900 1.461
6 7.350 −33,276.487 −5.503 −1.264 2.697 2.701 2.119 −3.384 344.315 88.000 248.100 682.000 1.627 57.000 1.387
7 7.240 −33,184.078 −5.489 −1.364 5.096 2.846 2.063 −3.426 322.262 78.630 183.800 608.500 1.800 120.000 1.800
8 7.190 −34,254.815 −5.460 −1.715 5.974 3.437 1.872 −3.587 346.288 85.090 220.900 653.800 1.696 76.600 1.566
9a 7.130 −32,206.181 −5.572 −1.293 2.342 2.754 2.139 −3.433 330.289 83.200 222.500 638.600 1.670 67.700 1.483
10 7.110 −32,206.882 −5.571 −1.664 6.662 3.350 1.953 −3.618 330.289 83.200 222.500 638.600 1.670 67.700 1.483
11 7.040 −33,276.448 −5.830 −1.535 3.691 3.158 2.147 −3.683 344.315 88.000 248.100 682.000 1.627 57.000 1.387
12c 6.920 −33,184.761 −5.545 −1.708 5.445 3.427 1.919 −3.626 332.262 80.290 195.400 610.400 1.758 95.200 1.700
13b,c 6.850 −32,206.399 −5.701 −1.538 3.602 3.146 2.081 −3.619 330.289 83.200 222.500 638.600 1.670 67.700 1.483
14b 6.790 −34,346.416 −5.845 −1.506 5.014 3.114 2.169 −3.676 358.342 92.800 273.700 725.400 1.593 49.300 1.309
15a 6.790 −32,206.333 −5.247 −1.235 2.863 2.618 2.006 −3.241 330.289 83.200 222.500 638.600 1.670 67.700 1.483
16 6.770 −36,394.693 −5.327 −1.251 5.678 2.654 2.038 −3.289 374.341 93.130 258.500 738.700 1.639 66.600 1.440
17a 6.690 −30,066.647 −5.657 −1.725 4.870 3.465 1.966 −3.691 302.236 73.610 171.400 551.800 1.804 107.400 1.763
18 6.920 −32,206.811 −5.718 −1.708 4.994 3.437 2.005 −3.713 330.289 83.200 222.500 638.600 1.670 67.700 1.483
19 6.640 −31,136.717 −5.240 −1.630 3.980 3.268 1.805 −3.435 316.262 78.410 196.900 595.200 1.727 83.300 1.605
20 6.620 −34,254.747 −5.543 −1.675 5.184 3.368 1.934 −3.609 346.288 83.460 208.700 651.900 1.731 95.100 1.650
21a,c 6.600 −33,276.378 −5.673 −1.513 4.725 3.104 2.080 −3.593 344.315 88.000 248.100 682.000 1.627 57.000 1.387
22a 6.550 −31,136.575 −5.664 −1.700 6.423 3.421 1.982 −3.682 316.262 78.410 196.900 595.200 1.727 83.300 1.605
23 6.550 −35,324.389 −5.375 −1.275 5.037 2.696 2.050 −3.325 360.315 88.300 233.600 695.300 1.680 78.400 1.540
24 6.520 −31,136.716 −5.553 −1.732 5.098 3.472 1.911 −3.643 316.262 78.410 196.900 595.200 1.727 83.300 1.605
25 6.520 −35,324.585 −5.572 −1.539 3.903 3.135 2.016 −3.555 360.315 88.300 233.600 695.300 1.680 78.400 1.540
26b 6.460 −35,324.539 −5.485 −1.381 6.360 2.871 2.052 −3.433 360.315 88.300 233.600 695.300 1.680 78.400 1.540
27c 6.390 −32,206.813 −5.628 −1.704 4.190 3.425 1.962 −3.666 330.289 83.200 222.500 638.600 1.670 67.700 1.483
28c 6.270 −33,276.782 −5.532 −1.705 4.901 3.421 1.914 −3.619 344.315 88.000 248.100 682.000 1.627 57.000 1.387
29 6.090 −34,254.436 −5.383 −1.277 8.794 2.701 2.053 −3.330 346.288 83.460 208.700 651.900 1.731 95.100 1.650
30 6.020 −30,066.680 −5.476 −1.802 2.582 3.605 1.837 −3.639 302.236 73.310 167.900 549.900 1.823 114.800 1.799
31a 5.920 −31,136.624 −5.565 −1.719 4.111 3.448 1.923 −3.642 316.262 78.410 196.900 595.200 1.727 83.300 1.605
32 5.850 −30,158.665 −5.723 −1.708 3.938 3.439 2.007 −3.716 314.289 81.320 224.100 623.400 1.645 59.800 1.402
33a 5.350 −35,325.052 −5.616 −1.711 4.831 3.437 1.953 −3.663 360.315 89.880 246.500 697.200 1.649 63.900 1.461
34b 5.200 −36,394.605 −5.650 −1.512 5.014 3.100 2.069 −3.581 374.341 94.680 272.100 740.600 1.612 54.800 1.375
35b 5.170 −33,276.904 −5.701 −1.708 4.021 3.436 1.997 −3.704 344.315 88.000 248.100 682.000 1.627 57.000 1.387
36a 5.140 −33,277.134 −5.597 −1.673 4.645 3.367 1.962 −3.635 344.315 88.000 248.100 682.000 1.627 57.000 1.387
37a 5.090 −29,088.443 −5.620 −1.865 2.969 3.731 1.878 −3.743 300.263 76.530 198.500 580.000 1.697 72.800 1.512
38b,c 5.080 −30,158.533 −5.542 −1.913 3.522 3.828 1.814 −3.727 314.289 81.320 224.100 623.400 1.645 59.800 1.402
39 5.050 −37,432.725 −5.930 −1.616 3.534 3.301 2.157 −3.773 386.352 97.200 277.300 763.600 1.618 57.400 1.392
40 4.740 −37,464.656 −5.829 −1.512 5.022 3.120 2.159 −3.670 388.368 99.480 297.700 784.000 1.582 48.100 1.304
41c 4.670 −36,395.369 −5.590 −1.679 5.249 3.378 1.955 −3.634 374.341 94.680 272.100 740.600 1.612 54.800 1.375
42b 4.420 −36,395.021 −5.521 −1.708 5.609 3.427 1.906 −3.615 374.341 94.680 272.100 740.600 1.612 54.800 1.375
43 3.960 −35,324.943 −5.461 −1.763 4.739 3.528 1.849 −3.612 360.315 89.880 246.500 697.200 1.649 63.900 1.461
44c 4.440 −31,228.250 −5.738 −1.694 2.854 3.414 2.022 −3.716 328.316 86.120 249.700 666.800 1.606 50.800 1.314
a Test set.
b Test set 2.
c Test set 3.
According to the supervised learning adopted here, the networks are taught by providing examples of input patterns and the corresponding target outputs. Through an iterative process, the connection weights are modified until the network gives the desired results for the training set of data. A backpropagation algorithm is used to minimize the error function. This algorithm was described previously with a simple example of an application [29], and the details of this algorithm are provided elsewhere [30].
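The iterative weight adjustment can be illustrated with a small backpropagation loop in plain Python. This is a didactic sketch, not the algorithm variant used by nftool: the toy training pairs, the learning rate, the number of epochs, the hidden-layer size and the tanh activation are all hypothetical choices made for the example.

```python
import math
import random


def train(samples, n_hidden=3, lr=0.05, epochs=500, seed=1):
    """Train a one-hidden-layer network by per-sample gradient descent.

    Returns the mean squared error before training and after each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Each weight vector stores its bias as the last element
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def predict(x):
        h = [math.tanh(w[-1] + sum(a * b for a, b in zip(w, x))) for w in w1]
        return h, w2[-1] + sum(a * b for a, b in zip(w2, h))

    def mse():
        return sum((predict(x)[1] - t) ** 2 for x, t in samples) / len(samples)

    history = [mse()]
    for _ in range(epochs):
        for x, t in samples:
            h, y = predict(x)
            err = y - t  # derivative of 0.5 * (y - t)**2 with respect to y
            # Error propagated back to the hidden neurons (uses w2 before its update)
            deltas = [err * w2[j] * (1.0 - h[j] ** 2) for j in range(n_hidden)]
            for j in range(n_hidden):
                w2[j] -= lr * err * h[j]
            w2[-1] -= lr * err
            for j in range(n_hidden):
                for i in range(n_in):
                    w1[j][i] -= lr * deltas[j] * x[i]
                w1[j][-1] -= lr * deltas[j]
        history.append(mse())
    return history


# Hypothetical toy data: one scaled input and an activity-like target
samples = [([-1.0], 4.0), ([-0.5], 4.5), ([0.0], 5.0), ([0.5], 5.5), ([1.0], 6.0)]
history = train(samples)
print(history[0], history[-1])
```

The error recorded at the end of training should be well below the initial error, which is the behaviour the text describes as modifying the weights until the network reproduces the training targets.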
Testing the stability, predictive power and generalization ability of the models is a very important step in a QSAR study. For the validation of the predictive power of a QSAR model, two basic approaches, internal validation and external validation, are available. Cross-validation is one of the most popular methods for internal validation. In this study, the internal predictive capability of the model was evaluated using leave-one-out cross-validation (R²cv). A good R²cv often indicates good robustness and high internal predictive power of a QSAR model. However, recent studies [31] indicate that there is no evident correlation between the value of R²cv and the actual predictive power of a QSAR model, suggesting that R²cv remains inadequate as a reliable estimate of the model’s predictive power for all new chemicals. To determine both the generalizability of QSAR models for new chemicals and the true predictive power of the models, statistical external validation is applied during the model development step by properly using a prediction set for validation.
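Leave-one-out cross-validation can be sketched as follows. The text does not give the formula used for R²cv, so this example assumes the common definition R²cv = 1 − PRESS/SS, where PRESS is the sum of squared leave-one-out prediction errors and SS is the sum of squared deviations of the observed values from their mean. A one-descriptor linear model and invented data are used to keep the sketch short; the function names are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def loo_r2cv(xs, ys):
    """R2cv = 1 - PRESS/SS (assumed definition), refitting with each compound left out in turn."""
    press = 0.0
    for i in range(len(xs)):
        x_train = xs[:i] + xs[i + 1:]
        y_train = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss


# Hypothetical descriptor values and activities, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
q2 = loo_r2cv(xs, ys)
print(q2)
```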
3. Results and discussion
3.1. Data set for analysis
A QSAR study was performed for a series of 44 sub-
stituted flavonoids against aldose reductase, as reported
previously [20], to determine a quantitative relationship
between the structure and biological activity. The values
of the 14 descriptors are shown in Table 2.
3.2. Principal component analysis
All 14 descriptors coding the 44 molecules were submitted to principal components analysis (PCA) [32]. The first three principal axes are sufficient to describe the information provided by the data matrix. Indeed, the percentages of the variance are 47.32%, 26.41% and 10.08% for the axes F1, F2 and F3, respectively. Together, these three axes account for 83.81% of the total information.
The principal component analysis (PCA) [33] was conducted to identify the link between the different variables. The correlations between the fourteen descriptors are shown in Table 3 as a correlation matrix.
The obtained matrix provides information on the high or low interrelationship between the variables. In general, good co-linearity (|r| > 0.5) was observed between most of the variables. A high interrelationship was observed between ELUMO and ω (r = −0.999), and a low interrelationship was observed between ω and D (r = 0.001).
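The entries of such a correlation matrix are Pearson coefficients. As a small illustration, the Python sketch below computes r between ELUMO and ω using only the first six compounds of Table 2, so the result is expected to be strongly negative but need not equal the value of −0.999 obtained for all 44 compounds. The function name is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# ELUMO and omega for compounds 1-6 (Table 2)
e_lumo = [-1.350, -1.494, -1.500, -1.362, -1.695, -1.264]
omega = [2.825, 3.082, 3.077, 2.873, 3.405, 2.701]
r = pearson_r(e_lumo, omega)
print(round(r, 3))
```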
3.3. Multiple linear regression (MLR)
Many attempts were made to develop a relationship with the indicator variable of biological activity, pIC50. The best relationship obtained with this method is a linear combination of six selected descriptors: the total energy ET, the energy EHOMO, the energy ELUMO, the molar volume (MV), the refractive index (n) and the surface tension (γ).
The resulting equation is as follows:
\[
\mathrm{pIC_{50}} = -36.345 - 4.311 \times 10^{-4}\, E_{T} - 2.471\, E_{HOMO} + 4.540\, E_{LUMO} - 0.073\, MV + 29.583\, n - 0.153\, \gamma \qquad (1)
\]
N = 35; R = 0.758; R²cv = 0.616; MSE = 0.552; F = 6.311; P < 0.0001.
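As a worked example, Eq. (1) can be evaluated for compound 1 with its descriptor values from Table 2. The Python sketch below uses the published (rounded) coefficients, so a small deviation from the MLR value listed for compound 1 in Table 5 (7.286) is expected. The dictionary keys and function name are illustrative.

```python
# Coefficients of Eq. (1); keys are illustrative names for the six descriptors
INTERCEPT = -36.345
COEFFICIENTS = {
    "ET": -4.311e-4,
    "EHOMO": -2.471,
    "ELUMO": 4.540,
    "MV": -0.073,
    "n": 29.583,
    "gamma": -0.153,
}


def predict_mlr(descriptors):
    """Evaluate the linear model of Eq. (1) for one compound."""
    return INTERCEPT + sum(COEFFICIENTS[name] * descriptors[name] for name in COEFFICIENTS)


# Compound 1 (Table 2)
compound_1 = {
    "ET": -34254.400,
    "EHOMO": -5.484,
    "ELUMO": -1.350,
    "MV": 208.700,
    "n": 1.731,
    "gamma": 95.100,
}
pred_1 = predict_mlr(compound_1)
print(round(pred_1, 3))
```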
Table 3
Correlation matrix between different obtained descriptors.
pIC50 ET EHOMO ELUMO DM ω η χ MW MR MV Pc n γ D
pIC50 1
ET 0.034 1
EHOMO 0.162 0.126 1
ELUMO 0.471 −0.425 0.268 1
DM 0.095 −0.274 0.209 0.073 1
ω −0.475 0.426 −0.292 −0.999 −0.093 1
η 0.303 −0.475 −0.500 0.700 −0.089 −0.682 1
χ 0.415 −0.221 0.750 0.838 0.168 −0.851 0.198 1
MW −0.088 −0.982 −0.207 0.397 0.187 −0.394 0.510 0.155 1
MR −0.152 −0.896 −0.331 0.328 0.030 −0.319 0.541 0.038 0.960 1
MV −0.289 −0.760 −0.421 0.218 −0.039 −0.207 0.508 −0.088 0.853 0.937 1
Pc −0.171 −0.915 −0.307 0.359 0.079 −0.351 0.551 0.073 0.970 0.990 0.947 1
n 0.363 0.438 0.408 −0.060 0.112 0.050 −0.357 0.190 −0.558 −0.685 −0.889 −0.717 1
γ 0.357 0.327 0.434 0.039 0.205 −0.051 −0.286 0.272 −0.467 −0.632 −0.839 −0.638 0.975 1
D 0.366 0.361 0.456 −0.015 0.213 0.001 −0.351 0.247 −0.505 −0.675 −0.872 −0.684 0.982 0.989 1
Bold values mark the interrelationships between descriptors: a large absolute value indicates a strong correlation between two descriptors, and a small absolute value indicates a weak one.

In the equation, N is the number of compounds, R is the correlation coefficient, MSE is the mean squared error, F is Fisher’s criterion and P is the significance level. A higher correlation coefficient and a lower mean squared error indicate that the model is more reliable. A P value smaller than 0.05 indicates that the regression equation is statistically significant. The QSAR model expressed by Eq. (1) is cross-validated by its appreciable R²cv value (R²cv = 0.616) obtained by the leave-one-out (LOO) method. A value of R²cv greater than 0.5 is the basic requirement for qualifying a QSAR model as valid [31].
The elaborated QSAR model reveals that the inhibitory activity against aldose reductase may be explained by a number of electronic and topologic factors. The negative coefficients of the electronic descriptors (ET, EHOMO) and the topologic descriptors (MV, γ) show that an increase in the values of these factors leads to a decrease in the value of the pIC50, whereas the positive coefficients of the energy ELUMO and the refractive index (n) show that an increase in these descriptors leads to an increase in the value of the pIC50.
Based on Eq. (1), we explain the mechanisms of inhibitory activity of the substituted flavonoids against aldose reductase as follows:
(1) A low total energy ET is generally associated with high chemical stability of the compound. Therefore, the inhibitory activity varies inversely with the total energy ET of the substituted flavonoids. Consequently, the more chemically stable compounds are the more active against aldose reductase.
(2) EHOMO is a quantum chemical parameter that is often associated with the electron-donating ability of the molecule. The lower the EHOMO, the weaker the electron-donating ability, showing that the electrophilic reaction occurs more easily and the inhibitory activity of the substituted flavonoids is higher. Accordingly, molecules with large values of EHOMO have small values of the pIC50.
(3) ELUMO is a global molecular property that describes the electrophilicity of a compound in general terms and is also a measure of the ability of the molecule to act as an electron acceptor. The positive relationship between ELUMO and pIC50 demonstrates that the inhibitory activity of the substituted flavonoids is affected by electronic properties.
The correlation between the predicted and observed activities and the plot of the absolute residuals are illustrated in Fig. 2. The descriptors proposed in Eq. (1) by MLR are, therefore, used as the input parameters in the multiple nonlinear regression (MNLR) and artificial neural network (ANN).
3.4. Multiple nonlinear regression (MNLR)
We also used a nonlinear regression model to improve the structure–activity relationship in a quantitative manner, taking into account several parameters. This is the most common tool for the study of multidimensional data. We applied it to the data matrix constituted from the descriptors proposed by the MLR for the 35-compound training set.
The resulting equation is as follows:
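\[
\begin{aligned}
\mathrm{pIC_{50}} ={} & -193.772 - 3.425 \times 10^{-3}\, E_{T} - 19.900\, E_{HOMO} - 3.771\, E_{LUMO} - 0.017\, MV + 72.400\, n - 0.195\, \gamma \\
& - 4.695 \times 10^{-8}\, E_{T}^{2} - 1.617\, E_{HOMO}^{2} - 2.606\, E_{LUMO}^{2} - 4.663 \times 10^{-5}\, MV^{2} - 8.435\, n^{2} + 2.243 \times 10^{-4}\, \gamma^{2}
\end{aligned}
\qquad (2)
\]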
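Eq. (2) adds a squared term for each of the six descriptors. The Python sketch below evaluates it for compound 1 using the descriptor values from Table 2 and the published (rounded) coefficients; because the individual terms are large and nearly cancel, a small deviation from the MNLR value listed for compound 1 in Table 5 (7.263) is expected. The dictionary layout and function name are illustrative.

```python
# Eq. (2): (linear coefficient, quadratic coefficient) for each descriptor
INTERCEPT_MNLR = -193.772
COEFFS_MNLR = {
    "ET": (-3.425e-3, -4.695e-8),
    "EHOMO": (-19.900, -1.617),
    "ELUMO": (-3.771, -2.606),
    "MV": (-0.017, -4.663e-5),
    "n": (72.400, -8.435),
    "gamma": (-0.195, 2.243e-4),
}


def predict_mnlr(descriptors):
    """Evaluate the second-order polynomial model of Eq. (2) for one compound."""
    total = INTERCEPT_MNLR
    for name, (linear, quadratic) in COEFFS_MNLR.items():
        value = descriptors[name]
        total += linear * value + quadratic * value ** 2
    return total


# Compound 1 (Table 2)
compound_1_mnlr = {
    "ET": -34254.400,
    "EHOMO": -5.484,
    "ELUMO": -1.350,
    "MV": 208.700,
    "n": 1.731,
    "gamma": 95.100,
}
pred_1_mnlr = predict_mnlr(compound_1_mnlr)
print(round(pred_1_mnlr, 3))
```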
Fig. 2. Graphical representation of calculated and observed activity and the residual values calculated by MLR. (Panels: observed activity vs. predicted activity; residual vs. observation number.)
Fig. 3. Graphical representation of calculated and observed activity and the residual values calculated by MNLR. (Panels: observed activity vs. predicted activity; residual vs. observation number.)
Fig. 4. Graphical representation of calculated and observed activity and the residual values calculated by ANN. (Panels: observed activity vs. predicted activity; residual vs. observation number.)
The statistical parameters obtained for this model, which describes the topological and electronic aspects of the studied molecules, are as follows:
N = 35; R = 0.787; R²cv = 0.632; MSE = 0.628.
The QSAR model expressed by Eq. (2) is cross-validated by its appreciable R²cv value (R²cv = 0.632) obtained using the leave-one-out (LOO) method. A value of R²cv greater than 0.5 is the basic requirement for qualifying a QSAR model as valid [31].
The correlation between the predicted and observed activities and the plot of the absolute residuals are illustrated in Fig. 3.
3.5. Artificial neural networks (ANN)
Artificial neural networks (ANN) can generate predictive models of the quantitative structure–activity relationship (QSAR) between the set of molecular descriptors obtained from the MLR and the observed activity. The ANN model of the calculated activity was developed using the properties of the studied compounds. The correlation between the predicted and observed activities and the plot of the absolute residuals are illustrated in Fig. 4.
N = 35; R = 0.900; R²cv = 0.749; MSE = 0.458.
The obtained correlation coefficient (R) is 0.900 for this data set of substituted flavonoids. This confirms that the artificial neural network (ANN) gives the best results for building the quantitative structure–activity relationship model. Furthermore, the high R²cv value (R²cv = 0.749) shows that the obtained QSAR model can predict the inhibitory activity against aldose reductase.
3.6. External validation
To estimate the predictive power of the MLR, MNLR and ANN models, we must use a set of compounds that were not used in the training set to establish the QSAR model. The models established in the computation procedure using the 35 substituted flavonoids are used to predict the activity of the remaining 9 compounds. The main performance parameters of the three models are shown in Table 4. As seen from this table, the statistical parameters of the ANN model are better than those of the other models.
We assessed the best QSAR regression equations established in this study. Comparing the quality of the three models shows that the ANN model has a better predictive capability than the MLR and MNLR models. ANN establishes a satisfactory relationship between the molecular descriptors and the activity of the studied compounds.
Table 4
Performance comparison between models obtained by MLR, MNLR and ANN.
Model   Training set               Test set
        R       R²cv    MSE        R       R²ext   MSE
MLR     0.758   0.616   0.552      0.575   0.775   0.552
MNLR    0.787   0.632   0.628      0.620   0.847   0.628
ANN     0.900   0.749   0.458      0.756   0.858   1.614
Table 5
Observed values and calculated values of pIC50 according to different methods.
No.   pIC50 (obs.)   pIC50 (calc.)
                     MLR     MNLR    ANN
1     7.590          7.286   7.263   7.054
2     7.490          6.239   6.182   6.692
3     7.440          7.034   7.104   7.360
4     7.470          7.374   7.449   7.197
5     7.410          5.952   6.070   5.441
6     7.350          7.200   7.365   6.809
7     7.240          6.810   7.108   6.748
8     7.190          6.483   6.598   5.833
9a    7.130          8.271   8.166   7.010
10    7.110          6.584   6.705   7.080
11    7.040          6.778   6.927   6.917
12    6.920          7.103   7.232   6.968
13    6.850          7.479   7.501   6.693
14    6.790          5.719   6.056   4.305
15a   6.790          7.733   7.548   7.013
16    6.770          6.295   5.969   6.415
17a   6.690          7.194   6.861   5.791
18    6.920          6.751   6.730   5.900
19    6.640          6.618   6.495   6.553
20    6.620          5.957   6.049   6.405
21a   6.600          6.491   6.815   6.825
22a   6.550          7.347   7.114   7.144
23    6.550          7.059   6.857   5.183
24    6.520          6.930   6.755   6.538
25    6.520          6.345   6.350   5.814
26    6.460          6.851   6.783   5.704
27    6.390          6.544   6.607   5.981
28    6.270          5.272   5.679   6.053
29    6.090          7.367   7.250   5.407
30    6.020          6.077   5.930   5.358
31a   5.920          7.017   6.845   4.547
32    5.850          6.232   5.860   4.379
33a   5.350          6.044   6.112   4.563
34    5.200          5.928   5.971   4.054
35    5.170          5.677   5.959   5.395
36a   5.140          5.578   5.963   6.824
37a   5.090          6.212   5.227   7.163
38    5.050          5.997   4.397   6.584
39    5.080          4.857   5.411   6.711
40    4.740          5.114   4.846   6.702
41    4.670          5.023   5.109   6.579
42    4.420          4.719   4.827   5.958
43    3.960          5.425   5.528   5.729
44    4.440          5.161   5.318   5.719
a Test set.
Table 6
Results of test sets generated.
Model   Test set 2                 Test set 3
        R       R²ext   MSE        R       R²ext   MSE
MLR     0.502   0.616   0.606      0.445   0.711   0.600
MNLR    0.776   0.660   1.133      0.757   0.541   1.271
ANN     0.917   0.865   0.414      0.812   0.753   0.331
4. Conclusion
In this study, we investigated QSAR regression models to predict the inhibitory activity of flavonoid derivatives against aldose reductase.
The three models constructed in this study show good stability and good predictive power. Moreover, compared to the MLR and MNLR models, the ANN model performs better and is an effective tool to predict the inhibitory activity of the flavonoid derivatives. Furthermore, using the ANN approach, we established a satisfactory relationship between several descriptors and the inhibition values, pIC50, of the substituted flavonoids.
The accuracy and predictability of the proposed models were illustrated by comparing key statistical indicators, such as the R or R² of the different models obtained using different statistical tools and different descriptors, as shown in Table 5. To validate these results, we generated two additional test sets, as shown in Table 6.
Finally, we conclude that the studied descriptors, which are sufficiently rich in chemical, electronic and topological information to encode the structural features, may be used with other descriptors for the development of predictive QSAR models.
Acknowledgments
We are grateful to the “Association Marocaine des
Chimistes Théoriciens” (AMCT) for its pertinent help
concerning the programs.
References
[1] Z. Kyselova, M. Stefek, V. Bauer, Pharmacological prevention of
diabetic cataract, J. Diabetes Complicat. 18 (2004) 129–140.
[2] Y. Hamada, J. Nakamura, Clinical potential of aldose reductase
inhibitors in diabetic neuropathy, Treat. Endocrinol. 3 (2004)
245–255.
[3] J.M. Forbes, M.T. Coughlan, M.E. Cooper, Oxidative stress as a
major culprit in kidney disease in diabetes, Diabetes 57 (2008)
1446–1454.

[4] S. Miyamoto, Recent advances in aldose reductase inhibitors:
potential agents for the treatment of diabetic complications,
Expert Opin. Ther. Patents 12 (2002) 621–631.
[5] P.E. Kador, N.E. Sharpless, Structure–activity studies of aldose
reductase inhibitors containing the 4-oxo-4H-chromen ring sys-
tem, Biophys. Chem. 8 (1978) 81–85.
[6] P.E. Kador, J.H. Kinoshita, N.E. Sharpless, Aldose reductase
inhibitors: a potential new class of agents for the pharmacological
control of certain diabetic complications, J. Med. Chem. 28
(1985) 841–849.
[7] J.R. Pfister, W.E. Wymann, J.M. Mahoney, L.D. Water-
bury, Synthesis and aldose reductase inhibitory activity of
7-sulfamoylxanthone-2-carboxylic acids, J. Med. Chem. 23
(1980) 1264–1267.
[8] M.S. Malamas, J. Millen, Quinazoline acetic acids and related
analogues as aldose reductase inhibitors, J. Med. Chem. 34 (1991)
1492–1503.
[9] Y.S. Prabhakar, M.K. Gupta, N. Roy, Y. Venkateswarlu, A high
dimensional QSAR study on the aldose reductase inhibitory activ-
ity of some flavones: topological descriptors in modeling the
activity, J. Chem. Inf. Model. 46 (2006) 86–92.
[10] A.G. Mercader, P.R. Duchowicz, F.M. Fernandez, E.A. Castro,
D.O. Bennardi, J.C. Autino, et al., QSAR prediction of inhibition
of aldose reductase for flavonoids, J. Bioorg. Med. Chem. 16
(2008) 7470–7476.
[11] H. Liu, S. Liu, L. Qin, L. Mo, CoMFA, CoMSIA analysis of 2,4-
thiazolidinediones derivatives as aldose reductase inhibitors, J.
Mol. Model. 15 (2009) 837–845.
[12] S. Thareja, S. Aggarwal, T.R. Bhardwaj, M. Kumar, 3D-QSAR
studies on a series of 5-arylidine-2,4-thiazolidinediones as aldose
reductase inhibitors: a self-organizing molecular field analysis
approach, Med. Chem. 6 (2010) 30–36.
[13] G.D. Carlo, N. Mascolo, A.A. Izzo, F. Capasso, Flavonoid: old
and new aspects of a class of natural therapeutic drugs, Life Sci.
65 (1999) 337–353.
[14] P.C.H. Hollman, I.C.W. Arts, Flavonols, flavones and flavanols
– nature, occurrence and dietary burden, J. Sci. Food Agric. 80
(2000) 1081–1093.
[15] S.A. Aherne, N.M. O’Brien, Dietary flavonols: chemistry, food
content, and metabolism, Nutrition 18 (2002) 75–81.
[16] O.P. Agarwal, The anti-inflammatory action of nepitrin, a
flavonoid, Agents Actions 12 (1982) 298–302.
[17] M.K. Church, Cromoglycate-like anti-allergic drugs: a review,
Drugs Today 14 (1978) 281–341.
[18] B.F. Rasulev, N.D. Abdullaev, V.N. Syrov, J. Leszczynski, A
quantitative structure–activity relationship (QSAR) study of the
antioxidant activity of flavonoids, QSAR Comb. Sci. 24 (2005)
1056–1065.
[19] H. Kubinyi, QSAR: Hansch analysis and related approaches, in:
R. Mannhold, P. Krogsgaard-Larsen, H. Timmerman (Eds.),
Methods and Principles in Medicinal Chemistry, VCH, Wein-
heim, 1993.
[20] A. Stefanic-Petek, A. Krbavcic, T. Solmajer, QSAR of flavonoids.
4. Differential inhibition of aldose reductase and p56lck protein
tyrosine kinase, Croat. Chem. Acta 75 (2002) 517–529.
[21] M.J. Frisch, Gaussian 03 Revision B.01, Gaussian, Inc., Pitts-
burgh, PA, 2003.
[22] U. Sarkar, R. Parthasarathi, V. Subramanian, P.K. Chattaraj, Toxicity
analysis of polychlorinated dibenzofurans through global, J.
Mol. Des. IECMD (2004) 1–24.
[23] Advanced Chemistry Development, Inc., Toronto, Canada, 2009
(www.acdlabs.com/resources/freeware/chemsketch/).
[24] M. Larif, A. Adad, R. Hmammouchi, A.I. Taghki, A. Soulaymani,
A. Elmidaoui, M. Bouachrine, T. Lakhlifi, Biological activities
of triazine derivatives combining DFT and QSAR results, Arab.
J. Chem. (2016), http://dx.doi.org/10.1016/j.arabjc.2012.12.033
(in press).
[25] XLSTAT software, XLSTAT Company, 2013, http://www.xlstat.com.
[26] S. Chtita, M. Larif, M. Ghamali, M. Bouachrine, T. Lakhlifi,
DFT-based QSAR studies of MK801 derivatives for non com-
petitive antagonists of NMDA using electronic and topological
descriptors, J. Taibah Univ. Chem. 9 (2) (2014) 143–154.
[27] A. Adad, M. Larif, R. Hmamouchi, M. Bouachrine, T. Lakhlifi,
J. Comp. Methods Mol. Des. 4 (3) (2014) 72–83.
[28] R. Hmamouchi, M. Larif, A. Adad, M. Bouachrine, T. Lakhlifi,
J. Comp. Methods Mol. Des. 4 (3) (2014) 61–71.
[29] D. Cherqaoui, D. Villemin, J. Chem. Soc. Faraday Trans. 90
(1994) 97–102.
[30] J.A. Freeman, D.M. Skapura, Neural Networks, Algorithms,
Applications and Programming Techniques, Addison-Wesley
Publishing Company, Reading, MA, 1991.
[31] A. Golbraikh, A. Tropsha, Beware of q2!, J. Mol. Graph. Model.
20 (2002) 269–276.
[32] Software STATITCF, Technical Institute of Cereals and Fodder,
Paris, France, 1987.
[33] A. Ousaa, B. Elidrissi, M. Ghamali, S. Chtita, M. Bouachrine,
T. Lakhlifi, J. Comp. Methods Mol. Des. 4 (3) (2014) 10–18.
