qsar modeling of carbonic anhydrase-i - Revue Roumaine de Chimie

Revue Roumaine de Chimie, 2006, 51(7-8), 703–717

Dedicated to the memory of Professor Mircea D. Banciu (1941–2005)

QSAR MODELING OF CARBONIC ANHYDRASE-I, -II AND -IV INHIBITORY ACTIVITIES: RELATIVE CORRELATION POTENTIAL OF SIX TOPOLOGICAL INDICES

Padmakar V. KHADIKAR,a* Brian W. CLARE,b Alexandru T. BALABAN,c* Claudiu T. SUPURAN,d* Vijay K. AGARWAL,e Jyoti SINGH,e Ashok K. JOSHIf and Meenakshi LAKHWANIg

a Research Division, Laxmi Fumigation and Pest Control, Pvt. Ltd., 3, Khatipura, Indore 452007, India, e-mail: [email protected]
b School of Biomedical and Chemical Sciences, University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia, e-mail: [email protected]
c Texas A&M University at Galveston, 5007 Avenue U, Galveston, TX, 77551, USA, e-mail: [email protected]
d Laboratorio di Chimica Bioinorganica, Dipartimento di Chimica, University of Florence, Via della Lastruccia, 3, Polo Scientifico, 50019 Sesto Fiorentino, Firenze, Italy.
e Department of Chemistry, A.P.S. University, Rewa, 486 003, India, e-mail: vijay- [email protected]
f Upzon Drugs Pvt. Ltd., MPLUN Plot No. 32, Sector 'F', Sanwer Road, Indore 452015, India.
g Department of Chemistry, Holkar Model & Autonomous College, Indore, India.

Received December 9, 2005

QSAR studies on modeling the biological activities of 26 benzenesulfonamide derivatives as inhibitors of carbonic anhydrases CA-I, CA-II and CA-IV were performed using six distance-based topological indices, including Balaban-type indices JhetV and JhetE. Satisfactory multiparametric correlations were obtained.

INTRODUCTION

Carbonic anhydrases (CAs) are important enzymes found in red blood cells, gastric mucosa, pancreatic cells, and renal tubules. The physiological and physio-pathological processes in which carbonic anhydrases are involved have been thoroughly investigated because of the pharmacological applications of their inhibitors, chiefly sulfonamides.1-3 Such inhibitors have been shown to be important clinical agents.4 As a result, a large number of aromatic and heterocyclic sulfonamides have been synthesized and tested for their CA inhibitory potential.5-20 Among these carbonic anhydrase inhibitors, benzenesulfonamides have attracted much attention.5-20 One of the authors (CTS) has published extensive work on such inhibitors,2-4,14-25 and has reported CA inhibition data Ki (nmol) for hCA-I, hCA-II and bCA-IV. However, until now few QSAR (quantitative structure-activity relationship) studies employing distance-based topological indices have been reported. Filling this gap is one of the objectives of the present study.

QSAR methodology is very useful for screening a large library of possible drug candidates for selectivity and potency, arriving at models that correlate molecular structure with biological activity.4-20 A current interest in predictive QSAR is the estimation of the biological activities of organic compounds acting as drugs from their calculated structural parameters. Molecular structure is encoded through numerical descriptors, which correspond to topological, geometrical, chemical, or electronic structural features. During the last decades, QSAR modeling based on topological (graph-theoretical) indices has undergone explosive growth owing to rapid progress in chemical graph theory and to advances in computer technology.

In view of the potential of QSAR methodology, we have recently investigated the relative activity of carbonic anhydrase inhibitors.21-36 The parameters used in these studies were chiefly distance-based topological indices. In a few cases, information-theoretic indices were also used.21-41 Having noted that the literature contains only a few QSAR studies based on the Balaban index (J)42 and its extensions that take into account the presence of heteroatoms and/or multiple bonds,43-45 we decided to undertake the present study to investigate the relative potential of Balaban-type indices for modeling the CA inhibitory activities of the set of compounds indicated in Table 1.

*Authors for correspondence: Phone +91-731-531906 (PVK); +1-409-741-4313 (ATB); +39-055-4573005 (CTS)

Table 1
Structural details of carbonic anhydrase inhibitors used in the present investigation

[Table 1: chemical structure drawings of the sulfonamide inhibitors, numbered 1-25, each bearing an SO2-NH2 group]


STRUCTURES AND MOLECULAR DESCRIPTORS

Structural details of the 25 carbonic anhydrase sulfonamide inhibitors used in the present study are presented in Table 1. The CA inhibitory data (Ki, nmol) reported by one of us (CTS) against isozymes I, II and IV were converted to logarithmic units (log Ki) before use.13 The distance-based molecular descriptors selected for the present study comprise topostructural indices, namely the Balaban (J),42 Wiener (W),46 Szeged (Sz)47-49 and first-order Randić connectivity (1χ)59,51 indices, together with topochemical indices. Several distance-based Balaban-type topochemical indices have been described that account for heteroatoms via their atomic number (JhetZ),52 atomic weight (JhetM), atomic radius (JhetV), electronegativity (JhetE), and polarizability (JhetP). Table 2 contains the values of the molecular descriptors for the 25 sulfonamides from Table 1.

Table 2
Various topological descriptors used in the present study and their values*

Comp.  W     J      JhetZ  JhetM  JhetV  JhetE  JhetP  1χ     Sz
1      144   2.545  4.788  4.789  3.092  3.614  3.531  5.016  220
2      148   2.461  4.585  4.586  3.015  3.504  3.430  4.999  228
3      152   2.394  4.425  4.426  2.953  3.415  3.348  4.999  236
4      201   2.359  4.121  4.122  2.695  3.331  2.884  5.537  306
5      201   2.359  4.008  4.009  2.895  3.257  3.216  5.537  306
6      262   2.305  3.632  3.633  2.791  3.072  3.046  6.037  388
7      189   2.512  4.526  4.540  2.878  3.592  3.083  5.410  291
9      189   2.512  4.645  4.652  3.101  3.568  3.509  5.410  291
10     189   2.512  4.720  4.730  3.150  3.553  3.562  5.410  291
11     189   2.512  4.745  4.754  3.178  3.523  3.623  5.410  291
12     458   2.991  5.883  5.894  3.629  4.203  4.260  7.459  668
13     399   2.853  5.633  5.640  3.456  4.014  4.073  7.032  582
14     113   2.449  6.035  6.039  2.416  3.271  2.913  4.499  132
15     146   2.538  5.769  5.772  2.360  3.437  2.654  4.910  171
16     403   2.304  3.826  3.827  2.064  2.805  2.189  6.931  452
17     853   1.861  3.779  3.781  1.848  2.401  2.159  9.182  1124
18     948   1.937  3.741  3.743  1.836  2.452  2.093  9.593  1237
19     1004  1.816  3.225  3.226  1.984  2.473  2.205  9.682  1502
20     960   1.900  3.409  3.410  2.049  2.575  2.287  9.682  1414
21     669   1.731  2.651  2.652  1.818  2.419  1.788  8.449  1057
22     287   1.987  3.927  3.929  2.304  2.736  2.648  6.465  430
23     287   1.987  3.953  3.955  2.261  2.749  2.583  6.465  430
24     543   1.856  3.228  3.228  1.816  2.500  1.909  8.003  776
25     201   2.359  4.036  4.036  2.827  3.276  3.120  5.537  306
26     262   2.305  3.651  3.652  2.737  3.086  2.973  6.037  388

* W: Wiener index; J: Balaban distance connectivity index; JhetZ: Balaban-type index from Z-weighted distance matrix (Barysz et al. matrix); JhetM: Balaban-type index from mass-weighted distance matrix; JhetV: Balaban-type index from van der Waals weighted distance matrix; JhetE: Balaban-type index from electronegativity-weighted distance matrix; JhetP: Balaban-type index from polarizability-weighted distance matrix; 1χ: first-order Randić connectivity index; Sz: Szeged index.
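The two simplest descriptors in Table 2 can be illustrated on a small graph. The sketch below is not the software used in this study; it assumes the standard definitions of the Wiener index (sum of topological distances over all atom pairs of the hydrogen-depleted graph) and of the Balaban index J = [m/(μ+1)] Σ (s_i s_j)^(-1/2) over the m edges, where s_i is the distance sum of vertex i and μ = m - n + 1 is the number of rings. The function names and the two test graphs (n-butane and a six-membered ring) are illustrative choices, and the heteroatom-weighted Jhet variants are not implemented.

```python
from collections import deque
from math import sqrt


def distance_matrix(adj):
    """All-pairs shortest path lengths (in bonds) by BFS from every atom.

    `adj` maps each vertex 0..n-1 to a list of its neighbours in a
    connected, hydrogen-depleted molecular graph.
    """
    n = len(adj)
    dist = [[0] * n for _ in range(n)]
    for src in range(n):
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for v, d in seen.items():
            dist[src][v] = d
    return dist


def wiener_index(adj):
    """W: sum of distances over all unordered vertex pairs."""
    dist = distance_matrix(adj)
    n = len(adj)
    return sum(dist[i][j] for i in range(n) for j in range(i + 1, n))


def balaban_j(adj):
    """J: average distance-sum connectivity (unweighted, no heteroatoms)."""
    dist = distance_matrix(adj)
    n = len(adj)
    s = [sum(row) for row in dist]                      # distance sums
    edges = [(i, j) for i in range(n) for j in adj[i] if i < j]
    m = len(edges)
    mu = m - n + 1                                      # cyclomatic number
    return m / (mu + 1) * sum(1 / sqrt(s[i] * s[j]) for i, j in edges)


# Illustrative graphs: a 4-atom chain and a 6-membered ring.
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ring6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

print(wiener_index(butane), round(balaban_j(butane), 4))
print(wiener_index(ring6), round(balaban_j(ring6), 4))
```

Because J is normalised by the number of edges and rings, it varies much less with graph size than W does, which is the behaviour the uniparametric results below rely on.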

We have carried out regression analysis 53-55 for modeling the inhibitory activity of sulfonamides against CA-I, CA-II, and CA-IV using the maximum R2 method,53 and the results are discussed below.


RELATIVE POWER OF BALABAN-TYPE INDICES FOR MODELING LOG Ki(hCA-I), LOG Ki(hCA-II), AND LOG Ki(bCA-IV) ACTIVITIES

The first step is to investigate uniparametric modeling separately for each descriptor. The Balaban indices, however, were designed to account for "topological shape" (degree of branching, centricity of branches, and cyclicity), and not for molecular size. Consequently, it has been emphasized that for physical-chemical properties or biological activities that depend on molecular size, J and J-type indices should always be used in multiparametric correlations.42-45 As expected, Table 3 shows that in uniparametric correlations the size-independent J and J-type indices lead to poorer correlations (lower absolute values of the correlation coefficient R) than the indices W, 1χ and Sz, which increase with graph size. Among the Jhet indices, the best results (comparable to J) are observed for JhetV and JhetE.

Table 3
Regression parameters and quality of correlations for uniparametric modeling

            CA-I                       CA-II                      CA-IV
Parameter   R       St.err.  F         R       St.err.  F         R       St.err.  F
W           -0.812  0.763    44.36     -0.721  0.482    24.83     -0.653  0.641    17.14
J            0.757  0.853    30.91      0.615  0.548    13.99      0.526  0.720     9.25
JhetZ        0.493  1.136     7.38      0.359  0.648     3.40      0.280  0.818     1.96
JhetM        0.493  1.137     7.37      0.359  0.649     3.39      0.279  0.819     1.94
JhetV        0.775  0.826    34.53      0.748  0.461    29.18      0.567  0.702    10.91
JhetE        0.788  0.805    37.60      0.689  0.505    20.63      0.561  0.702     7.55
JhetP        0.712  0.917    23.63      0.690  0.499    21.60      0.502  0.737     7.78
1χ          -0.830  0.729    50.81     -0.750  0.460    29.61     -0.694  0.613    21.25
Sz          -0.780  0.818    35.68     -0.678  0.511    19.58     -0.620  0.669    14.39
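The F values in Table 3 follow from R and the sample size through the usual relation F = (R²/k) / ((1 - R²)/(n - k - 1)). The short sketch below assumes n = 25 compounds and k = 1 descriptor. The function name is illustrative, and because R is printed to only three decimals the recomputed F matches the tabulated value only approximately.

```python
def f_ratio(r, n, k):
    """Overall regression F statistic from the correlation coefficient.

    r: multiple correlation coefficient (sign is irrelevant)
    n: number of compounds
    k: number of descriptors in the model
    """
    r2 = r * r
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Uniparametric models for log Ki(hCA-I), R taken from Table 3.
for name, r in [("W", -0.812), ("J", 0.757), ("1chi", -0.830)]:
    print(name, round(f_ratio(r, n=25, k=1), 2))
```

For W the relation gives F close to 44.5, against the tabulated 44.36. The small difference is what rounding of R would be expected to produce.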

The next step was to proceed to biparametric correlations associating one of the W, 1χ and Sz indices with each of the J and J-type indices. In particular, we concentrate on six topological indices, namely the foremost three Balaban indices (J, JhetV and JhetE) and three distance-based indices (W, 1χ and Sz). Tables 4 and 5 present the results for biparametric correlations involving these pairwise associations. It is evident that models 5, 14, and 23, which involve JhetV and 1χ, have the highest R values and the lowest standard errors.

Table 4
Regression parameters for biparametric modeling of log Ki(hCA-I), log Ki(hCA-II), and log Ki(bCA-IV)

Model No.  Parameters used and regression coefficients      Constant  St.error  R2A     R       F-ratio

(i) For modeling of log Ki(hCA-I)
1          W  -0.0025(±0.0006); J     1.5451(±0.5497)       0.6902    0.6692    0.7259  0.8653  32.785
2          1χ -0.4518(±0.1033); J     1.4770(±0.5207)       2.8764    0.6378    0.7511  0.8785  37.207
3          Sz -0.0016(±0.0004); J     1.6762(±0.5924)       0.3072    0.7158    0.6865  0.8442  27.277
4          W  -0.0024(±0.0005); JhetV 1.0114(±0.3141)       1.5676    0.6432    0.7468  0.8763  36.396
5          1χ -0.4383(±0.0954); JhetV 0.9852(±0.2892)       3.6084    0.6031    0.7774  0.8922  42.914
6          Sz -0.0015(±0.0004); JhetV 1.1144(±0.3167)       1.2345    0.6687    0.7264  0.8656  32.858
7          W  -0.0023(±0.0006); JhetE 1.0967(±0.3476)       0.7117    0.6473    0.7436  0.8746  35.796
8          1χ -0.4237(±0.1023); JhetE 1.0474(±0.3290)       2.7775    0.6167    0.7673  0.8869  40.566
9          Sz -0.0015(±0.0004); JhetE 1.2032(±0.3585)       0.2981    0.6798    0.7172  0.8607  31.430

(ii) For modeling of log Ki(hCA-II)
10         W  -0.0013(±0.0004); J     0.5229(±0.3891)       0.8625    0.4737    0.5152  0.7454  13.753
11         1χ -0.2494(±0.0735); J     0.4619(±0.3708)       2.1383    0.4541    0.5545  0.7691  15.934
12         Sz -0.0008(±0.0003); J     0.6159(±0.4118)       0.5898    0.4975    0.4652  0.7140  11.437
13         W  -0.0009(±0.0003); JhetV 0.6054(±0.2031)       0.3527    0.4159    0.6263  0.8108  21.111
14         1χ -0.1909(±0.0625); JhetV 0.5711(±0.1896)       1.3267    0.3954    0.6623  0.8309  24.532
15         Sz -0.0005(±0.0002); JhetV 0.6638(±0.2029)       0.1545    0.4283    0.6037  0.7979  19.277
16         W  -0.0011(±0.0004); JhetE 0.4898(±0.2431)       0.4417    0.4527    0.5571  0.7707  16.096
17         1χ -0.2161(±0.0722); JhetE 0.4452(±0.2320)       1.5762    0.4349    0.5914  0.7909  18.371
18         Sz -0.0006(±0.0003); JhetE 0.5629(±0.2479)       0.1511    0.4700    0.5227  0.7500  14.139

(iii) For modeling of log Ki(bCA-IV)
19         W  -0.0015(±0.0006); J     0.4832(±0.5320)       1.7555    0.6476    0.3975  0.6691   8.916
20         1χ -0.3016(±0.1004); J     0.3783(±0.5060)       3.3901    0.6197    0.4483  0.7030  10.750
21         Sz -0.0009(±0.0004); J     0.5694(±0.5525)       1.5034    0.6676    0.3598  0.6428   7.744
22         W  -0.0014(±0.0005); JhetV 0.3842(±0.3116)       1.8217    0.6380    0.4153  0.6812   9.523
23         1χ -0.2810(±0.0964); JhetV 0.3352(±0.2924)       3.2498    0.6096    0.4662  0.7146  11.479
24         Sz -0.0008(±0.0004); JhetV 0.4584(±0.3086)       1.5748    0.6516    0.3900  0.6640   8.674
25         W  -0.0015(±0.0006); JhetE 0.3582(±0.3459)       1.7074    0.6442    0.4039  0.6735   9.132
26         1χ -0.2910(±0.1024); JhetE 0.2846(±0.3292)       3.2911    0.6172    0.4529  0.7060  10.932
27         Sz -0.0009(±0.0004); JhetE 0.4360(±0.3483)       1.4019    0.6604    0.3735  0.6525   8.155
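As a worked illustration of how a row of Table 4 is used, the sketch below evaluates model 5 (log Ki(hCA-I) from 1χ and JhetV) for compound 1, taking its descriptor values from Table 2 and the observed value 4.657 from Table 5. It also recomputes the adjusted R² from R, assuming the usual definition R2A = 1 - (1 - R²)(n - 1)/(n - k - 1) with n = 25 and k = 2. The function names are illustrative, and the coefficients are used exactly as printed, so agreement with the tabulated estimates is limited by their rounding.

```python
def predict_model5(chi1, jhet_v):
    """log Ki(hCA-I) estimated with model 5 of Table 4 (printed coefficients)."""
    return -0.4383 * chi1 + 0.9852 * jhet_v + 3.6084


def adjusted_r2(r, n, k):
    """Adjusted R-squared for a model with k descriptors fitted to n compounds."""
    return 1.0 - (1.0 - r * r) * (n - 1) / (n - k - 1)


# Compound 1: 1chi = 5.016 and JhetV = 3.092 (Table 2); observed log Ki = 4.657.
estimate = predict_model5(chi1=5.016, jhet_v=3.092)
residual = 4.657 - estimate          # residual = observed - estimated

print(round(estimate, 3), round(residual, 3))
print(round(adjusted_r2(0.8922, n=25, k=2), 4))
```

The estimate and residual reproduce the entries for compound 1 under model 5 in Table 5 (4.456 and 0.201), and the adjusted R² agrees with the tabulated 0.7774 to within rounding.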

The next step is to examine triparametric correlations involving JhetV and two of the distance-based indices (W, 1χ and Sz). The results are displayed in Table 6 as models 28-33. A slight increase in R and a slight decrease in the standard error may be seen by comparing Tables 4 and 6. A final possible refinement may be obtained by introducing an indicator variable I1 to take into account the large residuals observed in Table 3 for the calculated versus observed inhibitory activity of CA-I for compounds 21, 22, and 23: the indicator parameter I1 signifies the presence (=1) or absence (=0) of an electron-donating group (NH2 or OR) attached to the benzene ring of a thiazole group. The resulting tetraparametric correlations are presented in the lower part of Table 6 as models 34-39. Tables 7 and 8 show the observed and calculated data, with the corresponding residuals, for the tri- and tetraparametric regressions, respectively.

PREDICTIVE POWER OF THE PROPOSED MODELS

We now discuss the predictive power of the best models for logKi(hCA-I), logKi(hCA-II), and logKi(bCA-IV). The correlation coefficients R2pred are 0.8942 (model 35), 0.7709 (model 36), and 0.6663 (model 38) for logKi(hCA-I), logKi(hCA-II), and logKi(bCA-IV), respectively. In order to investigate the intercorrelations between descriptors in the proposed models, Table 9 presents the correlation matrix computed from the data in Table 2. This can be useful in determining whether certain variables are redundant and therefore not needed in the model. Because JhetZ and JhetM are so highly intercorrelated with one another, and because JhetP correlates with JhetV, we decided to keep only the topostructural index J and the topochemical indices JhetV and JhetE, along with the indices W, 1χ, and Sz.
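The redundancy argument above can be checked directly from the descriptor columns. The following sketch computes Pearson correlation coefficients for two pairs of Balaban-type indices, using the JhetZ, JhetM, JhetV and JhetP values of the 25 compounds as read from Table 2, in table order. The helper function and variable names are illustrative, and no claim is made that this reproduces the authors' software.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Descriptor columns as read from Table 2 (25 compounds, table order).
JHET_Z = [4.788, 4.585, 4.425, 4.121, 4.008, 3.632, 4.526, 4.645, 4.720,
          4.745, 5.883, 5.633, 6.035, 5.769, 3.826, 3.779, 3.741, 3.225,
          3.409, 2.651, 3.927, 3.953, 3.228, 4.036, 3.651]
JHET_M = [4.789, 4.586, 4.426, 4.122, 4.009, 3.633, 4.540, 4.652, 4.730,
          4.754, 5.894, 5.640, 6.039, 5.772, 3.827, 3.781, 3.743, 3.226,
          3.410, 2.652, 3.929, 3.955, 3.228, 4.036, 3.652]
JHET_V = [3.092, 3.015, 2.953, 2.695, 2.895, 2.791, 2.878, 3.101, 3.150,
          3.178, 3.629, 3.456, 2.416, 2.360, 2.064, 1.848, 1.836, 1.984,
          2.049, 1.818, 2.304, 2.261, 1.816, 2.827, 2.737]
JHET_P = [3.531, 3.430, 3.348, 2.884, 3.216, 3.046, 3.083, 3.509, 3.562,
          3.623, 4.260, 4.073, 2.913, 2.654, 2.189, 2.159, 2.093, 2.205,
          2.287, 1.788, 2.648, 2.583, 1.909, 3.120, 2.973]

r_zm = pearson(JHET_Z, JHET_M)   # atomic-number vs mass weighting
r_vp = pearson(JHET_V, JHET_P)   # radius vs polarizability weighting
print(f"r(JhetZ, JhetM) = {r_zm:.4f}")
print(f"r(JhetV, JhetP) = {r_vp:.4f}")
```

Both coefficients come out close to 1, so each pair carries nearly the same information. This is the basis for retaining only one member of each pair.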

Table 5
Observed and calculated values and their residuals
(each row lists the values for compounds 1-25 in order)

A. Estimated log Ki (hCA-I) using biparametric models

Comp.             1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25.
Obs.              4.657 4.398 4.447 4.895 4.398 4.322 3.919 3.991 3.813 3.778 3.785 3.924 3.934 3.968 2.658 0.778 0.954 1.623 1.643 2.839 1.845 1.740 1.699 4.380 4.255
Model 1 Estim.    4.267 4.127 4.014 3.839 3.839 3.604 4.105 4.105 4.105 4.105 4.180 4.113 4.195 4.251 3.255 1.458 1.341 1.016 1.254 1.712 3.051 3.051 2.217 3.839 3.604
Model 1 Residual  0.39 0.271 0.433 1.056 0.559 0.718 -0.186 -0.114 -0.292 -0.327 -0.395 -0.189 -0.261 -0.283 -0.597 -0.680 -0.387 0.607 0.389 1.127 -1.206 -1.311 -0.518 0.541 0.651
Model 2 Estim.    4.369 4.253 4.154 3.859 3.859 3.553 4.142 4.142 4.142 4.142 3.924 3.913 4.461 4.406 3.148 1.476 1.403 1.184 1.308 1.615 2.890 2.890 2.002 3.859 3.553
Model 2 Residual  0.288 0.145 0.293 1.036 0.539 0.769 -0.223 -0.151 -0.329 -0.364 -0.139 0.011 -0.527 -0.438 -0.490 -0.698 -0.449 0.439 0.335 1.224 -1.045 -1.150 -0.303 0.521 0.702
Model 3 Estim.    4.229 4.076 3.951 3.782 3.782 3.564 4.062 4.062 4.062 4.062 4.275 4.179 4.206 4.294 3.462 1.667 1.618 1.000 1.279 1.554 2.965 2.965 2.204 3.782 3.564
Model 3 Residual  0.428 0.322 0.496 1.113 0.616 0.758 -0.143 -0.071 -0.249 -0.284 -0.490 -0.255 -0.272 -0.326 -0.804 -0.889 -0.664 0.623 0.364 1.285 -1.120 -1.225 -0.505 0.598 0.691
Model 4 Estim.    4.351 4.264 4.191 3.813 4.016 3.765 4.027 4.253 4.302 4.331 4.144 4.110 3.741 3.606 2.693 1.400 1.161 1.177 1.347 1.809 3.213 3.169 2.108 3.947 3.710
Model 4 Residual  0.306 0.134 0.256 1.082 0.382 0.557 -0.108 -0.262 -0.489 -0.553 -0.359 -0.186 0.193 0.362 -0.035 -0.622 -0.207 0.446 0.296 1.030 -1.368 -1.429 -0.409 0.433 0.545
Model 5 Estim.    4.456 4.388 4.326 3.836 4.033 3.712 4.072 4.292 4.340 4.368 3.914 3.931 4.017 3.781 2.604 1.404 1.212 1.319 1.383 1.696 3.044 3.002 1.889 3.966 3.659
Model 5 Residual  0.201 0.010 0.121 1.059 0.365 0.610 -0.153 -0.301 -0.527 -0.590 -0.129 -0.007 -0.083 0.187 0.054 -0.626 -0.258 0.304 0.260 1.143 -1.199 -1.262 -0.190 0.414 0.596
Model 6 Estim.    4.341 4.243 4.162 3.766 3.989 3.747 3.994 4.242 4.297 4.328 4.250 4.189 3.724 3.601 2.838 1.562 1.375 1.132 1.340 1.632 3.140 3.092 2.063 3.914 3.687
Model 6 Residual  0.316 0.155 0.285 1.129 0.409 0.575 -0.075 -0.251 -0.484 -0.550 -0.465 -0.265 0.210 0.367 -0.180 -0.784 -0.421 0.491 0.303 1.207 -1.295 -1.352 -0.364 0.466 0.568
Model 7 Estim.    4.344 4.214 4.108 3.903 3.822 3.479 4.217 4.190 4.174 4.141 4.269 4.197 4.039 4.146 2.862 1.385 1.222 1.116 1.329 1.827 3.053 3.067 2.206 3.843 3.494
Model 7 Residual  0.313 0.184 0.339 0.992 0.576 0.843 -0.298 -0.199 -0.361 -0.363 -0.484 -0.273 -0.105 -0.178 -0.204 -0.607 -0.268 0.507 0.314 1.012 -1.208 -1.327 -0.507 0.537 0.761
Model 8 Estim.    4.438 4.330 4.236 3.920 3.843 3.437 4.248 4.223 4.207 4.175 4.019 4.002 4.297 4.297 2.779 1.402 1.281 1.265 1.372 1.731 2.904 2.918 2.005 3.863 3.452
Model 8 Residual  0.219 0.068 0.211 0.975 0.555 0.885 -0.329 -0.232 -0.394 -0.397 -0.234 -0.078 -0.363 -0.329 -0.121 -0.624 -0.327 0.358 0.271 1.108 -1.059 -1.178 -0.306 0.517 0.803
Model 9 Estim.    4.327 4.183 4.064 3.861 3.772 3.430 4.197 4.168 4.150 4.114 4.384 4.282 4.042 4.185 3.016 1.553 1.450 1.090 1.341 1.672 2.965 2.981 2.178 3.795 3.447
Model 9 Residual  0.330 0.215 0.383 1.034 0.626 0.892 -0.278 -0.177 -0.337 -0.336 -0.599 -0.358 -0.108 -0.217 -0.358 -0.775 -0.496 0.533 0.302 1.167 -1.120 -1.241 -0.479 0.585 0.808

B. Estimated log Ki (hCA-II) using biparametric models

Comp.              1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25.
Obs.               2.470 2.380 2.477 2.505 2.230 2.204 1.778 2.041 1.602 1.845 1.447 1.875 1.778 1.279 0.477 0.301 0.778 0.778 0.954 1.079 0.954 0.903 0.845 2.097 2.041
Model 10 Estim.    2.003 1.954 1.913 1.830 1.830 1.721 1.926 1.926 1.926 1.926 1.821 1.827 1.994 1.996 1.534 0.708 0.622 0.484 0.586 0.883 1.522 1.522 1.115 1.830 1.721
Model 10 Residual  0.467 0.426 0.564 0.675 0.400 0.483 -0.148 0.115 -0.324 -0.081 -0.374 0.048 -0.216 -0.717 -1.057 -0.407 0.156 0.294 0.368 0.196 -0.568 -0.619 -0.27 0.267 0.320
Model 11 Estim.    2.063 2.028 1.997 1.847 1.847 1.697 1.949 1.949 1.949 1.949 1.660 1.702 2.147 2.086 1.474 0.708 0.641 0.562 0.601 0.831 1.444 1.444 1.000 1.847 1.697
Model 11 Residual  0.407 0.352 0.480 0.658 0.383 0.507 -0.171 0.092 -0.347 -0.104 -0.213 0.173 -0.369 -0.807 -0.997 -0.407 0.137 0.216 0.353 0.248 -0.490 -0.541 -0.155 0.250 0.344
Model 12 Estim.    1.979 1.921 1.873 1.795 1.795 1.695 1.901 1.901 1.901 1.901 1.891 1.876 1.991 2.015 1.643 0.826 0.781 0.492 0.615 0.800 1.465 1.465 1.105 1.795 1.695
Model 12 Residual  0.491 0.459 0.604 0.710 0.435 0.509 -0.123 0.140 -0.299 -0.056 -0.444 -0.001 -0.213 -0.736 -1.166 -0.525 -0.003 0.286 0.339 0.279 -0.511 -0.562 -0.260 0.302 0.346
Model 13 Estim.    2.084 2.034 1.993 1.789 1.910 1.787 1.911 2.046 2.076 2.093 2.104 2.057 1.705 1.639 1.210 0.642 0.542 0.577 0.660 0.803 1.468 1.442 0.924 1.869 1.755
Model 13 Residual  0.386 0.346 0.484 0.716 0.320 0.417 -0.133 -0.005 -0.474 -0.248 -0.657 -0.182 0.073 -0.360 -0.733 -0.341 0.236 0.201 0.294 0.276 -0.514 -0.539 -0.079 0.228 0.286
Model 14 Estim.    2.135 2.094 2.059 1.809 1.923 1.768 1.937 2.065 2.093 2.109 1.975 1.958 1.847 1.737 1.182 0.629 0.544 0.611 0.648 0.752 1.408 1.384 0.836 1.884 1.737
Model 14 Residual  0.335 0.286 0.418 0.696 0.307 0.436 -0.159 -0.024 -0.491 -0.264 -0.528 -0.083 -0.069 -0.458 -0.705 -0.328 0.234 0.167 0.306 0.327 -0.454 -0.481 0.009 0.213 0.304
Model 15 Estim.    2.077 2.021 1.975 1.763 1.895 1.778 1.893 2.041 2.073 2.092 2.169 2.105 1.680 1.620 1.257 0.717 0.642 0.584 0.679 0.736 1.430 1.401 0.901 1.850 1.742
Model 15 Residual  0.393 0.359 0.502 0.742 0.335 0.426 -0.115 0.000 -0.471 -0.247 -0.722 -0.230 0.098 -0.341 -0.780 -0.416 0.136 0.194 0.275 0.343 -0.476 -0.498 -0.056 0.247 0.299
Model 16 Estim.    2.050 1.992 1.944 1.848 1.812 1.653 1.989 1.977 1.970 1.955 1.987 1.961 1.917 1.961 1.364 0.662 0.580 0.528 0.627 0.877 1.460 1.467 1.058 1.821 1.660
Model 16 Residual  0.42 0.388 0.533 0.657 0.418 0.551 -0.211 0.064 -0.368 -0.110 -0.540 -0.086 -0.139 -0.682 -0.887 -0.361 0.198 0.250 0.327 0.202 -0.506 -0.564 -0.213 0.276 0.381
Model 17 Estim.    2.101 2.056 2.016 1.862 1.830 1.639 2.006 1.995 1.989 1.975 1.835 1.843 2.060 2.045 1.327 0.661 0.595 0.585 0.630 0.827 1.397 1.403 0.960 1.838 1.645
Model 17 Residual  0.369 0.324 0.461 0.643 0.400 0.565 -0.228 0.046 -0.387 -0.130 -0.388 0.032 -0.282 -0.766 -0.850 -0.360 0.183 0.193 0.324 0.252 -0.443 -0.500 -0.115 0.259 0.396
Model 18 Estim.    2.038 1.971 1.916 1.822 1.780 1.621 1.979 1.965 1.957 1.940 2.071 2.022 1.904 1.972 1.428 0.752 0.705 0.540 0.656 0.807 1.404 1.411 1.040 1.791 1.629
Model 18 Residual  0.432 0.409 0.561 0.683 0.450 0.583 -0.201 0.076 -0.355 -0.095 -0.624 -0.147 -0.126 -0.693 -0.951 -0.451 0.073 0.238 0.298 0.272 -0.450 -0.508 -0.195 0.306 0.412

C. Estimated log Ki (bCA-IV) using biparametric models

Comp.              1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25.
Obs.               3.117 3.342 3.477 3.507 3.447 3.389 2.255 2.505 1.820 2.097 2.243 2.204 2.732 2.550 2.097 0.699 0.903 1.699 1.724 2.188 1.279 1.230 1.176 2.748 2.653
Model 19 Estim.    2.763 2.716 2.677 2.585 2.585 2.465 2.677 2.677 2.677 2.677 2.493 2.518 2.764 2.756 2.246 1.337 1.227 1.082 1.191 1.559 2.272 2.272 1.814 2.585 2.465
Model 19 Residual  0.354 0.626 0.800 0.922 0.862 0.924 -0.422 -0.172 -0.857 -0.580 -0.250 -0.314 -0.032 -0.206 -0.149 -0.638 -0.324 0.617 0.533 0.629 -0.993 -1.042 -0.638 0.163 0.188
Model 20 Estim.    2.840 2.814 2.788 2.613 2.613 2.442 2.709 2.709 2.709 2.709 2.272 2.349 2.960 2.870 2.172 1.325 1.230 1.157 1.189 1.497 2.192 2.192 1.679 2.613 2.442
Model 20 Residual  0.277 0.528 0.689 0.894 0.834 0.947 -0.454 -0.204 -0.889 -0.612 -0.029 -0.145 -0.228 -0.320 -0.075 -0.626 -0.327 0.542 0.535 0.691 -0.913 -0.962 -0.503 0.135 0.211
Model 21 Estim.    2.738 2.683 2.637 2.549 2.549 2.438 2.651 2.651 2.651 2.651 2.556 2.562 2.769 2.782 2.375 1.469 1.402 1.076 1.209 1.460 2.216 2.216 1.805 2.549 2.438
Model 21 Residual  0.379 0.659 0.840 0.958 0.898 0.951 -0.396 -0.146 -0.831 -0.554 -0.313 -0.358 -0.037 -0.232 -0.278 -0.770 -0.499 0.623 0.515 0.728 -0.937 -0.986 -0.629 0.199 0.215
Model 22 Estim.    2.803 2.768 2.738 2.569 2.645 2.518 2.656 2.742 2.761 2.771 2.558 2.577 2.588 2.519 2.036 1.307 1.166 1.143 1.231 1.560 2.295 2.278 1.740 2.619 2.497
Model 22 Residual  0.314 0.574 0.739 0.938 0.802 0.871 -0.401 -0.237 -0.941 -0.674 -0.315 -0.373 0.144 0.031 0.061 -0.608 -0.263 0.556 0.493 0.628 -1.016 -1.048 -0.564 0.129 0.156
Model 23 Estim.    2.877 2.856 2.835 2.597 2.664 2.489 2.694 2.769 2.785 2.795 2.370 2.432 2.795 2.661 1.994 1.289 1.169 1.194 1.216 1.485 2.205 2.191 1.609 2.641 2.471
Model 23 Residual  0.240 0.486 0.642 0.910 0.783 0.900 -0.439 -0.264 -0.965 -0.698 -0.127 -0.228 -0.063 -0.111 0.103 -0.590 -0.266 0.505 0.508 0.703 -0.926 -0.961 -0.433 0.107 0.182
Model 24 Estim.    2.794 2.752 2.716 2.535 2.627 2.505 2.632 2.735 2.757 2.770 2.638 2.636 2.564 2.503 2.115 1.411 1.304 1.134 1.243 1.458 2.244 2.225 1.710 2.596 2.481
Model 24 Residual  0.323 0.590 0.761 0.972 0.82 0.884 -0.377 -0.230 -0.937 -0.673 -0.395 -0.432 0.168 0.047 -0.018 -0.712 -0.401 0.565 0.481 0.730 -0.965 -0.995 -0.534 0.152 0.172
Model 25 Estim.    2.790 2.744 2.707 2.605 2.578 2.422 2.716 2.707 2.702 2.691 2.538 2.558 2.713 2.723 2.119 1.312 1.190 1.115 1.216 1.589 2.265 2.269 1.803 2.585 2.427
Model 25 Residual  0.327 0.598 0.770 0.902 0.869 0.967 -0.461 -0.202 -0.882 -0.594 -0.295 -0.354 0.019 -0.173 -0.022 -0.613 -0.287 0.584 0.508 0.599 -0.986 -1.039 -0.627 0.163 0.226
Model 26 Estim.    2.860 2.834 2.808 2.628 2.607 2.409 2.739 2.732 2.728 2.720 2.317 2.387 2.913 2.841 2.073 1.303 1.198 1.178 1.207 1.521 2.189 2.192 1.674 2.612 2.413
Model 26 Residual  0.257 0.508 0.669 0.879 0.840 0.980 -0.484 -0.227 -0.908 -0.623 -0.074 -0.183 -0.181 -0.291 0.024 -0.604 -0.295 0.521 0.517 0.667 -0.910 -0.962 -0.498 0.136 0.240
Model 27 Estim.    2.777 2.722 2.676 2.575 2.543 2.387 2.703 2.692 2.686 2.672 2.625 2.621 2.708 2.744 2.212 1.423 1.342 1.110 1.234 1.492 2.202 2.208 1.784 2.551 2.393
Model 27 Residual  0.340 0.620 0.801 0.932 0.904 1.002 -0.448 -0.187 -0.866 -0.575 -0.382 -0.417 0.024 -0.194 -0.115 -0.724 -0.439 0.589 0.490 0.696 -0.923 -0.978 -0.608 0.197 0.260


Table 6
Regression parameters and quality of correlations for modeling logKi(hCA-I), logKi(hCA-II), and logKi(bCA-IV) using tri- and tetra-parametric regressions

(A) Tri-parametric regressions

Model No.  Parameters used and regression coefficients                                Constant  Se      R2A     R       F-ratio

(i) For modeling of log Ki(hCA-I)
28         JhetV 0.9873(±0.2766); 1χ -1.0371(±0.3540); Sz 0.0025(±0.0014)            6.1738    0.5766  0.7965  0.9066  32.315
29         JhetV 1.0445(±0.2947); 1χ -0.8661(±0.4288); W  0.0026(±0.0025)            5.2838    0.6025  0.7779  0.8976  29.019

(ii) For modeling of log Ki(hCA-II)
30         JhetV 0.5730(±0.1681); 1χ -0.7409(±0.2152); Sz 0.0023(±0.0008)            3.6827    0.3505  0.7346  0.8762  23.144
31         JhetV 0.6285(±0.1874); 1χ -0.6049(±0.2727); W  0.0025(±0.0016)            2.9484    0.3832  0.6828  0.8500  18.221

(iii) For modeling of log Ki(bCA-IV)
32         JhetV 0.3376(±0.2723); 1χ -0.9850(±0.3485); Sz 0.0029(±0.0014)            6.2659    0.5677  0.5371  0.7713  10.281
33         JhetV 0.4176(±0.2912); 1χ -0.8756(±0.4237); W  0.0036(±0.0025)            5.5788    0.5953  0.4909  0.7447   8.715

(B) Tetra-parametric regressions

Model No.  Parameters used and regression coefficients                                                       Constant  Se      R2A     R       F-ratio

(i) For modeling of log Ki(hCA-I)
34         JhetV  0.6932(±0.2582); 1χ -0.6037(±0.3387); Sz  0.0005(±0.0014); I1 -1.0854(±0.3734)            5.2972    0.4954  0.8498  0.9353  34.949
35         JhetV  0.3921(±0.2739); 1χ  0.3795(±0.4446); W  -0.0053(±0.0027); I1 -1.7484(±0.4272)            2.0072    0.4554  0.8731  0.9456  42.281

(ii) For modeling of log Ki(hCA-II)
36         JhetV  0.5353(±0.1860); 1χ -0.6853(±0.2439); Sz  0.0020(±0.0010); I1 -0.1391(±0.2689)            3.5704    0.3568  0.7250  0.8780  16.820
37         JhetV  0.5359(±0.2335); 1χ -0.4282(±0.3790); W   0.0014(±0.0023); I1 -0.2481(±0.3641)            2.4835    0.3881  0.6745  0.8537  13.433

(iii) For modeling of log Ki(bCA-IV)
38         JhetV  0.1146(±0.2752); 1χ -0.6562(±0.3610); Sz  0.0014(±0.0015); I1 -0.8234(±0.3979)            5.6009    0.5279  0.5997  0.8163   9.987
39         JhetV -0.0409(±0.3218); 1χ -0.0001(±0.5224); W  -0.0020(±0.0032); I1 -1.2289(±0.5019)            3.2758    0.5350  0.5888  0.8107   9.590

Table 7
Observed and calculated log Ki(hCA-I), log Ki(hCA-II), and log Ki(bCA-IV) values using tri-parametric models 28-33
(each row lists the values for compounds 1-25 in order)

Comp.              1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25.

CA-I logKi
Obs.               4.657 4.398 4.447 4.895 4.398 4.322 3.919 3.991 3.813 3.778 3.785 3.924 3.934 3.968 2.658 0.778 0.954 1.623 1.643 2.839 1.845 1.740 1.699 4.380 4.255
Model 28 Calc.     4.573 4.535 4.494 3.856 4.053 3.637 4.131 4.351 4.399 4.427 3.688 3.746 4.223 3.838 2.152 1.282 1.126 1.841 1.685 1.845 2.817 2.774 1.604 3.986 3.583
Model 28 Residual  0.084 -0.137 -0.047 1.039 0.345 0.685 -0.212 -0.360 -0.586 -0.649 0.097 0.178 -0.289 0.130 0.506 -0.504 -0.172 -0.218 -0.042 0.994 -0.972 -1.034 0.095 0.394 0.672
Model 29 Calc.     4.541 4.486 4.431 3.822 4.031 3.647 4.093 4.325 4.377 4.406 3.797 3.834 4.203 3.873 2.478 1.465 1.341 1.563 1.518 1.593 2.832 2.787 1.652 3.960 3.591
Model 29 Residual  0.116 -0.088 0.016 1.073 0.367 0.675 -0.174 -0.334 -0.564 -0.628 -0.012 0.090 -0.269 0.095 0.180 -0.687 -0.387 0.060 0.125 1.246 -0.987 -1.047 0.047 0.420 0.664

CA-II logKi
Obs.               2.470 2.380 2.477 2.505 2.230 2.204 1.778 2.041 1.602 1.845 1.447 1.875 1.778 1.279 0.477 0.301 0.778 0.778 0.954 1.079 0.954 0.903 0.845 2.097 2.041
Model 30 Calc.     2.243 2.229 2.212 1.826 1.941 1.699 1.991 2.119 2.147 2.163 1.768 1.788 2.037 1.789 0.767 0.516 0.464 1.090 0.926 0.888 1.199 1.175 0.753 1.902 1.668
Model 30 Residual  0.227 0.151 0.265 0.679 0.289 0.505 -0.213 -0.078 -0.545 -0.318 -0.321 0.087 -0.259 -0.510 -0.290 -0.215 0.314 -0.312 0.028 0.191 -0.245 -0.272 0.092 0.195 0.373
Model 31 Calc.     2.217 2.189 2.160 1.795 1.921 1.705 1.957 2.097 2.128 2.145 1.862 1.864 2.028 1.826 1.060 0.687 0.669 0.848 0.779 0.652 1.203 1.176 0.606 1.878 1.671
Model 31 Residual  0.253 0.191 0.317 0.710 0.309 0.499 -0.179 -0.056 -0.526 -0.300 -0.415 0.011 -0.250 -0.547 -0.583 -0.386 0.109 -0.070 0.175 0.427 -0.249 -0.273 0.239 0.219 0.370

CA-IV logKi
Obs.               3.117 3.342 3.477 3.507 3.447 3.389 2.255 2.505 1.820 2.097 2.243 2.204 2.732 2.550 2.097 0.699 0.903 1.699 1.724 2.188 1.279 1.230 1.176 2.748 2.653
Model 32 Calc.     3.015 3.029 3.032 2.620 2.687 2.401 2.763 2.838 2.855 2.864 2.105 2.214 3.037 2.728 1.462 1.145 1.067 1.808 1.571 1.660 1.938 1.923 1.274 2.665 2.382
Model 32 Residual  0.102 0.313 0.445 0.887 0.760 0.988 -0.508 -0.333 -1.035 -0.767 0.138 -0.010 -0.305 -0.178 0.635 -0.446 -0.164 -0.109 0.153 0.528 -0.659 -0.693 -0.098 0.083 0.271
Model 33 Calc.     2.995 2.992 2.981 2.578 2.661 2.399 2.722 2.815 2.836 2.847 2.207 2.297 3.054 2.789 1.819 1.373 1.349 1.534 1.403 1.342 1.910 1.892 1.279 2.633 2.376
Model 33 Residual  0.122 0.350 0.496 0.929 0.786 0.990 -0.467 -0.310 -1.016 -0.750 0.036 -0.093 -0.322 -0.239 0.278 -0.674 -0.446 0.165 0.321 0.846 -0.631 -0.662 -0.103 0.115 0.277

1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25.

Comp.

4.657 4.398 4.447 4.895 4.398 4.322 3.919 3.991 3.813 3.778 3.785 3.924 3.934 3.968 2.658 0.778 0.954 1.623 1.643 2.839 1.845 1.740 1.699 4.380 4.255

Obs.

Table 8
Observed and calculated log Ki(hCA-I), log Ki(hCA-II), and log Ki(bCA-IV) values using tetra-parametric models 34-39.

CA-I, log Ki (models 34 and 35)
Row   Calc.(34)  Resid.(34)  Calc.(35)  Resid.(35)
1     4.529       0.128      4.358       0.299
2     4.490      -0.092      4.300       0.098
3     4.451      -0.004      4.254       0.193
4     3.984       0.911      4.097       0.798
5     4.123       0.275      4.175       0.223
6     3.792       0.530      4.000       0.322
7     4.180      -0.261      4.184      -0.265
8     4.334      -0.343      4.271      -0.280
9     4.368      -0.555      4.291      -0.478
10    4.388      -0.610      4.302      -0.524
11    3.662       0.123      3.826      -0.041
12    3.755       0.169      3.910       0.014
13    4.326      -0.392      4.061      -0.127
14    4.059      -0.091      4.020      -0.052
15    2.782      -0.124      3.304      -0.646
16    1.628      -0.850      1.682      -0.904
17    1.431      -0.477      1.328      -0.374
18    1.619       0.004      1.122       0.501
19    1.618       0.025      1.381       0.262
20    2.014       0.825      2.370       0.469
21    2.133      -0.288      2.090      -0.245
22    2.103      -0.363      2.073      -0.333
23    1.048       0.651      1.121       0.578
24    4.076       0.304      4.148       0.232
25    3.755       0.500      3.979       0.276

CA-II, log Ki (models 36 and 37)
Row   Obs.    Calc.(36)  Resid.(36)  Calc.(37)  Resid.(37)
1     2.470   2.237       0.233      2.191       0.279
2     2.380   2.224       0.156      2.163       0.217
3     2.477   2.207       0.270      2.135       0.342
4     2.505   1.843       0.662      1.834       0.671
5     2.230   1.950       0.280      1.941       0.289
6     2.204   1.719       0.485      1.755       0.449
7     1.778   1.997      -0.219      1.970      -0.192
8     2.041   2.117      -0.076      2.089      -0.048
9     1.602   2.143      -0.541      2.116      -0.514
10    1.845   2.158      -0.313      2.131      -0.286
11    1.447   1.764      -0.317      1.866      -0.419
12    1.875   1.789       0.086      1.874       0.001
13    1.778   2.050      -0.272      2.007      -0.229
14    1.279   1.818      -0.539      1.847      -0.568
15    0.477   0.848      -0.371      1.177      -0.700
16    0.301   0.561      -0.260      0.718      -0.417
17    0.778   0.503       0.275      0.667       0.111
18    0.778   1.062      -0.284      0.785      -0.007
19    0.954   0.917       0.037      0.759       0.195
20    1.079   0.910       0.169      0.762       0.317
21    0.954   1.111      -0.157      1.097      -0.143
22    0.903   1.088      -0.185      1.074      -0.171
23    0.845   0.502       0.343      0.530       0.315
24    2.097   1.913       0.184      1.905       0.192
25    2.041   1.690       0.351      1.726       0.315

CA-IV, log Ki (models 38 and 39)
Row   Obs.    Calc.(38)  Resid.(38)  Calc.(39)  Resid.(39)
1     3.117   2.981       0.136      2.866       0.251
2     3.342   2.995       0.347      2.861       0.481
3     3.477   2.999       0.478      2.856       0.621
4     3.507   2.717       0.790      2.770       0.737
5     3.447   2.740       0.707      2.762       0.685
6     3.389   2.518       0.871      2.647       0.742
7     2.255   2.800      -0.545      2.786      -0.531
8     2.505   2.826      -0.321      2.777      -0.272
9     1.820   2.831      -1.011      2.775      -0.955
10    2.097   2.834      -0.737      2.774      -0.677
11    2.243   2.085       0.158      2.228       0.015
12    2.204   2.221      -0.017      2.351      -0.147
13    2.732   3.116      -0.384      2.955      -0.223
14    2.550   2.896      -0.346      2.892      -0.342
15    2.097   1.941       0.156      2.400      -0.303
16    0.699   1.407      -0.708      1.525      -0.826
17    0.903   1.299      -0.396      1.339      -0.436
18    1.699   1.639       0.060      1.223       0.476
19    1.724   1.520       0.204      1.307       0.417
20    2.188   1.788       0.400      1.888       0.300
21    1.279   1.419      -0.140      1.389      -0.110
22    1.230   1.414      -0.184      1.390      -0.160
23    1.176   0.852       0.324      0.906       0.270
24    2.748   2.732       0.016      2.765      -0.017
25    2.653   2.512       0.141      2.649       0.004
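The residual columns of Table 8 are the observed minus the calculated log Ki values. As a small consistency check of that relation, the sketch below recomputes the model 36 residuals for the first five CA-II entries listed in the table. The variable names are illustrative and do not come from the original work.

```python
# First five CA-II entries of Table 8: observed log Ki, model 36 values,
# and the residuals as printed in the table.
observed = [2.470, 2.380, 2.477, 2.505, 2.230]
calculated = [2.237, 2.224, 2.207, 1.843, 1.950]
published_residuals = [0.233, 0.156, 0.270, 0.662, 0.280]

# Residual = observed - calculated, rounded to the three decimals used in the table.
residuals = [round(o - c, 3) for o, c in zip(observed, calculated)]

# True when every recomputed residual agrees with the printed one.
matches = all(abs(r - p) < 1e-9 for r, p in zip(residuals, published_residuals))
print(residuals, matches)
```

The same check can be repeated for any other model column by swapping in its calculated and residual values.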


Table 9
Intercorrelation matrix for topological indices according to data in Table 2.

         W        J        JhetZ    JhetM    JhetV    JhetE    JhetP    1χ       Sz
W        1
J       -0.6527   1
JhetZ   -0.5352   0.8467   1
JhetM   -0.5348   0.8472   0.9999   1
JhetV   -0.6423   0.9037   0.6511   0.652    1
JhetE   -0.6736   0.9771   0.8143   0.815    0.9475   1
JhetP   -0.6057   0.9079   0.7198   0.7205   0.9883   0.9445   1
1χ       0.4284   0.3537   0.3419   0.343    0.2477   0.2955   0.3055   1
Sz       0.9848  -0.6473  -0.5642  -0.5638  -0.6252  -0.6683  -0.5943   0.4642   1

Eqs. (34) and (35) are the best models for log Ki(hCA-I). Eq. (35) contains JhetV, 1χ, I1 and W as parameters, whose correlation matrix is shown in Table 10. These results show that the topological indices 1χ and W are highly correlated (r = 0.985). We have proposed eq. (36) for modeling log Ki(hCA-II) and eq. (38) for modeling log Ki(bCA-IV). The correlation matrices (Table 10) show that the indices 1χ and Sz appearing in eqs. (36) and (38) suffer from the same collinearity defect (r = 0.980 and 0.979, respectively).

Table 10
Correlation matrices for the parameters of eqs. (35), (36), and (38)

Equation (35)
                 logKi(hCA-I)   W        1χ       JhetV    I1
logKi(hCA-I)      1.000
W                -0.812         1.000
1χ               -0.830         0.985    1.000
JhetV             0.775        -0.642   -0.625    1.000
I1               -0.455        -0.005    0.098   -0.331    1.000

Equation (36)
                 logKi(hCA-II)  Sz       1χ       JhetV    I1
logKi(hCA-II)     1.000
Sz               -0.678         1.000
1χ               -0.750         0.980    1.000
JhetV             0.748        -0.613   -0.625    1.000
I1               -0.368         0.004    0.098   -0.331    1.000

Equation (38)
                 logKi(bCA-IV)  Sz       1χ       JhetV    I1
logKi(bCA-IV)     1.000
Sz               -0.620         1.000
1χ               -0.694         0.979    1.000
JhetV             0.567        -0.613   -0.625    1.000
I1               -0.477         0.004    0.098   -0.331    1.000
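The pairwise correlations among the predictors of eq. (35) in Table 10 can also be summarized per variable as variance inflation factors, VIF_i = 1/(1 − R_i²), where R_i is the multiple correlation of predictor i with the remaining predictors. For standardized predictors, VIF_i equals the i-th diagonal element of the inverse of the predictor correlation matrix. The following is a minimal sketch under that identity, using the rounded Table 10 values, so the results are only approximate. The function names are illustrative and are not the programs used by the authors.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    # The right-hand half now holds the inverse.
    return [row[n:] for row in aug]


def vif_from_correlations(corr):
    """Return one VIF per predictor: the diagonal of the inverse correlation matrix."""
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(corr))]


# Predictor block of the eq. (35) matrix in Table 10 (order: W, 1chi, JhetV, I1).
names = ["W", "1chi", "JhetV", "I1"]
corr_eq35 = [
    [1.000, 0.985, -0.642, -0.005],
    [0.985, 1.000, -0.625, 0.098],
    [-0.642, -0.625, 1.000, -0.331],
    [-0.005, 0.098, -0.331, 1.000],
]

vifs = vif_from_correlations(corr_eq35)
for name, value in zip(names, vifs):
    print(f"{name}: VIF = {value:.1f}")
```

Because W and 1χ correlate at 0.985, the pairwise bound 1/(1 − 0.985²) ≈ 33.6 already places both of their VIFs above the threshold of 10. Taking the other predictors into account can only raise these values.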

A thorough investigation of collinearity involves examining the R² values obtained by regressing each predictor variable against all the others. The relationship between predictor variables can be judged from the variance inflation factor (VIF), defined as VIF = 1/(1 − Ri²), where Ri is the multiple correlation coefficient of the i-th independent variable regressed on all other independent variables. A VIF is defined for each variable in the equation and not for the equation (model) as a whole. There are therefore as many VIFs as there are independent variables in the model, and all should be less than 10; any independent variable with VIF > 10 indicates collinearity. We observed that 1χ and W, as well as 1χ and Sz, involved in eqs. (35), (36), and (38), have VIF values higher than 10, so there is a collinearity problem in these models. Collinearity thus exists in all three proposed models, and statistically they are disputable.

However, Randić has investigated this problem 56,57 and recommended that under certain circumstances even highly correlated descriptors can be retained in a model. We therefore follow Randić’s recommendations in the present case. Randić noted that a descriptor that correlates strongly with another descriptor already used in a regression is, in most studies, discarded. For example, 1χ and 2χ often correlate strongly, and in many structure-property-activity studies 2χ has been discarded. This practice is not theoretically justified and, despite being widespread, should be stopped. Although two highly correlated descriptors depict overall the same features of molecular structure, it is important to recognize that even highly interrelated descriptors differ in some other structural traits. The difference between them may be relatively small but nevertheless very important for a structure-property regression. The criteria for inclusion or exclusion of descriptors should not be based on parallelism between descriptors, even if overwhelming, but on whether the part in which the two descriptors disagree is relevant for the characterization of the property considered. If the part in which the second descriptor differs from the first, however small, is relevant for the property under consideration, then the descriptor should be included.

Randić further stated that the selection of descriptors for structure-property-activity studies should not be delegated solely to computers,56 although statistical criteria will continue to be useful for the preliminary screening of descriptors taken from a large pool. In an automated selection, a descriptor is often discarded because it is highly correlated with another descriptor already selected. What matters, however, is not whether two descriptors parallel one another, i.e. duplicate much of the same structural information, but whether they are complementary in those parts that are important for structure-property-activity correlations. Hence, the residual of the correlation between two descriptors should be examined, and the descriptor kept or discarded depending on how well this residual improves the correlation based on the already selected descriptors.
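Randić’s suggestion to examine the residual of the correlation between two descriptors can be sketched as follows: fit the second descriptor to the first by a straight line, keep what is left over, and see whether that leftover part still relates to the property. The descriptor and property values below are hypothetical and serve only to make the sketch runnable; the function names are likewise illustrative.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residual_of(d2, d1):
    """Part of descriptor d2 that a least-squares line in d1 does not explain."""
    m1, m2 = mean(d1), mean(d2)
    slope = (sum((a - m1) * (b - m2) for a, b in zip(d1, d2))
             / sum((a - m1) ** 2 for a in d1))
    intercept = m2 - slope * m1
    return [b - (intercept + slope * a) for a, b in zip(d1, d2)]


# Hypothetical data: two nearly parallel descriptors and a property.
d1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d2 = [1.1, 2.0, 3.2, 3.9, 5.1, 6.0]
prop = [2.0, 2.9, 4.4, 4.6, 6.1, 6.8]

resid = residual_of(d2, d1)
print("r(d1, d2)      =", round(pearson(d1, d2), 4))
print("r(d1, resid)   =", round(pearson(d1, resid), 4))
print("r(resid, prop) =", round(pearson(resid, prop), 4))
```

By construction the residual is uncorrelated with the first descriptor, so any correlation it shows with the property reflects information that the first descriptor does not carry.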
In view of Randić’s recommendations and the fact that the 1χ, W, and Sz indices have different information contents, these highly correlated descriptors can be retained in the proposed models. It is worth mentioning that the problems caused by collinearity, and how to deal with them, continue to be of prime concern to theoretical statisticians. From a decision maker’s viewpoint, one should be aware that collinearity can (and usually does) exist and should recognize the basic problems it can cause. Some of the most obvious problems and indications of severe multi-collinearity are: (i) incorrect signs on the coefficients; (ii) a change in the values of the previous coefficients when a new variable is added to the model; (iii) a previously significant variable becoming insignificant when a new variable is added to the model; and (iv) an increase in the standard error of the estimate when a variable is added to the model. In the present case, most of the correlated variables have coefficients smaller than their respective standard errors.

We now comment on the adjusted R² (R²A). This statistic adjusts R² for the number of variables in the model. Therefore, if a variable is added that does not contribute its fair share, R²A will actually decline. It also takes into account the relationship between sample size and number of variables. The coefficient of determination R² may appear artificially high if the number of variables is high compared with the sample size. That is, R² never decreases when a new independent variable is added, but R²A decreases if the added variable does not reduce the unexplained variation enough to offset the loss of degrees of freedom. From Tables 4 and 6 we observe that in each case R²A increases with the number of variables (an exception is provided by the data on CA-II in going from tri- to tetraparametric regressions). Thus, the added variable contributes significantly to the developed model.
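The behaviour of the adjusted R² described above follows from its usual definition, R²A = 1 − (1 − R²)(n − 1)/(n − k − 1), for n samples and k predictors. The sketch below applies that formula. The R² values and the sample size of 25 are hypothetical illustrations, not results from Tables 4 and 6.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 for a linear model with an intercept."""
    dof = n_samples - n_predictors - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more samples than fitted parameters")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical case: a fourth variable raises R^2 only from 0.800 to 0.805.
three_parameter = adjusted_r2(0.800, 25, 3)
four_parameter = adjusted_r2(0.805, 25, 4)
print(f"tri-parametric:   {three_parameter:.4f}")
print(f"tetra-parametric: {four_parameter:.4f}")
```

In this hypothetical case R² rises slightly but the adjusted value falls, which is the signature of a variable that does not offset the degree of freedom it consumes.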
All these points indicate that multi-collinearity is not that serious in the proposed models.

EXPERIMENTAL

1. Inhibitory activities. All three sets of inhibitory activities, logKi(hCA-I), logKi(hCA-II), and logKi(bCA-IV), were taken from our earlier publication after conversion to log units.13

2. Topological indices. All topological indices used in this paper were calculated from the hydrogen-suppressed molecular graphs of the benzenesulfonamides presented in Table 1. Their calculation is well documented in the literature.58-63 We used the Luko-1 program of Lukovits, Hungarian Academy of Sciences, Budapest, for the calculation of the Szeged index (Sz), while the other indices were calculated using Todeschini’s Dragon software.64

3. Regression Analysis. The maximum-R² method 53-55 was adopted for the regression analysis. The Regress-1 program of Lukovits as well as the Origin-6 and NCSS programs were used.

ACKNOWLEDGEMENTS. The authors are thankful to Professor Istvan Lukovits, Hungarian Academy of Sciences, Budapest, Hungary, for providing software to carry out the regression analysis. The authors are also thankful to CSIR New Delhi, India, for providing financial support through project No 01(1785)/02/EMR-II.


REFERENCES

1. A. T. Balaban, S. C. Basak, A. Beteringhe, D. Mills and C. T. Supuran, Molecular Diversity, 2004, 8, 401.
2. C. T. Supuran, A. Scozzafava and J. Conway (Eds.), “Carbonic Anhydrase Inhibitors and Activators”, CRC Press, Boca Raton, 2004.
3. C. T. Supuran and A. T. Balaban, Rev. Roum. Chem., 1994, 39, 107.
4. C. T. Supuran, A. Casini, A. Mastrolorenzo and A. Scozzafava, Mini-Rev. Med. Chem., 2004, 4, 625.
5. P. M. Bell and R. O. Roblin, J. Am. Chem. Soc., 1942, 64, 2905.
6. C. Silipo and A. Vittoria, Farmaco, Ediz. Sci., 1979, 34, 858.
7. G. H. Miller, P. M. Doukas and J. K. Seydel, J. Med. Chem., 1972, 15, 700.
8. J. K. Seydel, J. Med. Chem., 1971, 14, 714.
9. W. Walter and R. F. Becker, Liebigs Ann. Chem., 1969, 727, 71.
10. N. Kakey, M. Aoki, A. Kamada and N. Yata, Chem. Pharm. Bull., 1969, 17, 1010.
11. G. Dauphin and A. Kergomard, Bull. Soc. Chim. Fr., 1961, 486.
12. M. Yoshika, K. Hamamoto and T. Kubota, Bull. Chem. Soc. Jpn., 1962, 35, 1723.
13. V. K. Agrawal, J. Singh, M. Gupta, P. V. Khadikar and C. T. Supuran, Eur. J. Med. Chem., 2005, 40, 1002.
14. C. T. Supuran, A. Scozzafava and A. Casini, “Carbonic Anhydrase, Its Inhibitors and Activators”, C. T. Supuran, A. Scozzafava, J. Conway (Eds.), CRC Press, Boca Raton, 2004, p. 67.
15. F. Mincione, L. Menabuoni and C. T. Supuran, Ibid., p. 243-254.
16. C. T. Supuran, F. Briganti, L. Menabuoni, G. Mincione, F. Mincione and A. Scozzafava, Eur. J. Med. Chem., 2000, 35, 309.
17. C. T. Supuran and B. W. Clare, Eur. J. Med. Chem., 1995, 30, 687.
18. C. T. Supuran and B. W. Clare, Eur. J. Med. Chem., 1999, 34, 41.
19. B. W. Clare and C. T. Supuran, Eur. J. Med. Chem., 1997, 32, 311.
20. C. T. Supuran and B. W. Clare, Eur. J. Med. Chem., 1998, 33, 489.
21. C. T. Supuran and A. Scozzafava, J. Enz. Inhib., 2000, 15, 597.
22. A. Casini, J. Antel, F. Abbate, A. Scozzafava, S. David, H. Waldeck, S. Schafer and C. T. Supuran, Bioorg. Med. Chem. Lett., 2003, 13, 841.
23. B. W. Clare and C. T. Supuran, J. Pharm. Sci., 1994, 83, 768.
24. C. T. Supuran and A. Scozzafava, SAR QSAR Environ. Res., 2001, 12, 17.
25. C. T. Supuran and A. Scozzafava, Eur. J. Med. Chem., 2000, 35, 867.
26. A. Thakur, M. Thakur, P. V. Khadikar and C. T. Supuran, Bioorg. Med. Chem. Lett., 2005, 15, 203.
27. D. Mandoli, S. Joshi, P. V. Khadikar and N. Khosla, Bioorg. Med. Chem. Lett., 2005, 15, 405.
28. P. V. Khadikar, V. Sharma, S. Karmarkar and C. T. Supuran, Bioorg. Med. Chem. Lett., 2005, 15, 931.
29. P. V. Khadikar, V. Sharma, S. Karmarkar and C. T. Supuran, Bioorg. Med. Chem. Lett., 2005, 15, 923.
30. A. T. Balaban, P. V. Khadikar, C. T. Supuran, A. Thakur and M. Thakur, Bioorg. Med. Chem. Lett., 2005, 15, 3966.
31. V. K. Agrawal, M. Banerji, M. Gupta, J. Singh, P. V. Khadikar and C. T. Supuran, Eur. J. Med. Chem., 2005, 40, 1002.
32. M. Jaiswal, P. V. Khadikar and C. T. Supuran, Bioorg. Med. Chem. Lett., 2004, 14, 5661.
33. M. Jaiswal, P. V. Khadikar, A. Scozzafava and C. T. Supuran, Bioorg. Med. Chem. Lett., 2004, 14, 3283.
34. V. K. Agrawal, S. Bano, C. T. Supuran and P. V. Khadikar, Eur. J. Med. Chem., 2004, 39, 593.
35. M. Jaiswal, P. V. Khadikar and C. T. Supuran, Bioorg. Med. Chem., 2004, 12, 2477.
36. A. Thakur, M. Thakur, P. V. Khadikar, C. T. Supuran and P. Sudele, Bioorg. Med. Chem., 2004, 12, 789.
37. A. Saxena, V. K. Agrawal and P. V. Khadikar, Oxid. Commun., 2003, 26, 9.
38. V. K. Agrawal, S. Shrivastava, P. V. Khadikar and C. T. Supuran, Bioorg. Med. Chem., 2003, 11, 5353.
39. V. K. Agrawal and P. V. Khadikar, Bioorg. Med. Chem. Lett., 2003, 13, 447.
40. V. K. Agrawal, R. Sharma and P. V. Khadikar, Bioorg. Med. Chem., 2002, 10, 2993.
41. A. Saxena and P. V. Khadikar, Acta Pharm., 1999, 49, 171.
42. A. T. Balaban, Chem. Phys. Lett., 1982, 89, 399.
43. A. T. Balaban, MATCH, Commun. Math. Computer Chem., 1986, 21, 115.
44. A. T. Balaban and O. Ivanciuc, “MATH/CHEM/COMP 1988 Studies in Physical and Theoretical Chemistry Series”, A. Graovac (Ed.), No. 63, Elsevier, Amsterdam, 1989, p. 193-211.
45. O. Ivanciuc, T. Ivanciuc and A. T. Balaban, J. Chem. Inf. Comput. Sci., 1998, 38, 395-401.
46. H. Wiener, J. Am. Chem. Soc., 1947, 69, 17.
47. I. Gutman, Graph Theory Notes New York, 1994, 27, 9.
48. P. V. Khadikar, N. V. Deshpande, P. P. Kale, A. Dobrynin, I. Gutman and G. Domotor, J. Chem. Inf. Comput. Sci., 1995, 35, 547.
49. P. V. Khadikar, P. P. Kale, N. V. Deshpande, S. Karmarkar and V. K. Agrawal, MATCH, Commun. Math. Comput. Chem., 2001, 43, 7.
50. M. Randić, J. Am. Chem. Soc., 1975, 97, 6609.
51. J. Devillers and A. T. Balaban (Eds.), “Topological indices and related descriptors in QSAR and QSPR”, Gordon and Breach, Williston VT, 2000.
52. M. Barysz, G. Jashari, R. S. Lall, V. K. Srivastava and N. Trinajstić, “Chemical applications of topology and graph theory”, R. B. King (Ed.), Elsevier, Amsterdam, 1983, p. 222.
53. S. Chaterjee, A. S. Hadi and B. Price, “Regression analysis by examples”, 3rd ed., Wiley, New York, 2000.
54. H. Van de Waterbeemd, “Chemometric methods in molecular design”, VCH, Weinheim, 1995.
55. J. Devillers, W. Karcher (Eds.), “Applied multiparametric analysis in SAR and environmental studies”, Kluwer Academic, Dordrecht, 1991.
56. M. Randić, Acta Chem. Slov., 1998, 45, 239.
57. M. Randić, J. Chem. Inf. Comput. Sci., 1997, 37, 672.
58. M. V. Diudea and P. V. Khadikar, “Molecular topology and its applications”, Galgotia Publ., New Delhi, India, (in press).
59. N. Trinajstić, “Chemical Graph Theory”, 2nd ed., CRC Press, Boca Raton, Florida, 1992, chapter 10, p. 225.
60. M. V. Diudea (Ed.), “QSPR/QSAR studies by molecular descriptors”, Nova Science, Huntington, New York, 2000.
61. R. Todeschini and V. Consonni, “Handbook of molecular descriptors”, Wiley-VCH, Weinheim, 2000.
62. A. T. Balaban, A. Chiriac, I. Motoc and Z. Simon, “Steric fit in quantitative structure activity relations”, Lecture Notes in Chemistry No. 15, Springer Verlag, Berlin, 1980.
63. A. T. Balaban, I. Motoc, D. Bonchev and O. Mekenyan, Topological indices for structure-activity correlations. In Steric Effects in Drug Design, (M. Charton, I. Motoc, eds.), Topics Curr. Chem., 1983, 114, 21-55, Springer, Berlin.
64. Dragon software for calculation of Balaban-type and other indices, www.disat.unimib.it