Understanding the toxic potencies of xenobiotics

SAR and QSAR in Environmental Research

ISSN: 1062-936X (Print) 1029-046X (Online) Journal homepage: http://www.tandfonline.com/loi/gsar20

Understanding the toxic potencies of xenobiotics inducing TCDD/TCDF-like effects A. D. Şahin & M. T. Saçan To cite this article: A. D. Şahin & M. T. Saçan (2018) Understanding the toxic potencies of xenobiotics inducing TCDD/TCDF-like effects, SAR and QSAR in Environmental Research, 29:2, 117-131, DOI: 10.1080/1062936X.2017.1414075 To link to this article: https://doi.org/10.1080/1062936X.2017.1414075


Published online: 08 Jan 2018.


SAR and QSAR in Environmental Research, 2018 VOL. 29, NO. 2, 117–131 https://doi.org/10.1080/1062936X.2017.1414075

Understanding the toxic potencies of xenobiotics inducing TCDD/TCDF-like effects$ A. D. Şahin and M. T. Saçan  Ecotoxicology and Chemometrics Laboratory, Institute of Environmental Sciences, Bogazici University, Besiktas/Istanbul, Turkey

ABSTRACT

Toxic potencies of xenobiotics such as halogenated aromatic hydrocarbons inducing 2,3,7,8-tetrachlorodibenzo-p-dioxin/2,3,7,8-tetrachlorodibenzofuran (TCDD/TCDF)-like effects were investigated by quantitative structure–toxicity relationships (QSTR) using their aryl hydrocarbon receptor (AhR) binding affinity data. A descriptor pool was created using the SPARTAN 10, DRAGON 6.0 and ADMET 8.0 software packages, and the descriptors were selected using QSARINS (v.2.2.1) software. The QSTR models generated for AhR binding affinities of chemicals with TCDD/TCDF-like effects were internally and externally validated in line with the Organisation for Economic Co-operation and Development (OECD) principles. The TCDD-based model had six descriptors from DRAGON 6.0 and ADMET 8.0, whereas the TCDF-based model had seven descriptors from DRAGON 6.0. The predictive ability of the generated models was tested on a diverse group of chemicals including polychlorinated/brominated biphenyls, dioxins/furans, ethers, polyaromatic hydrocarbons with fused heterocyclic rings (i.e. phenoxathiins, thianthrenes and dibenzothiophenes) and polyaromatic hydrocarbons (i.e. halogenated naphthalenes and phenanthrenes) with no AhR binding data. For the external set chemicals, the structural coverage of the generated models was 90% and 89% for TCDD- and TCDF-like effects, respectively.

ARTICLE HISTORY

Received 27 October 2017
Accepted 4 December 2017

KEYWORDS

QSTR; TCDD; TCDF; AhR

Introduction

Xenobiotics can belong to the group of persistent organic pollutants (POPs). Production of most of these chemicals (e.g. polychlorinated biphenyls) stopped many years ago, yet their residues can still be found in different parts of the environment (e.g. water, soil and air). Moreover, they have adverse health effects [1]. The Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) regulation states the need for the evaluation of chemicals for their environmental and human hazards. In this respect, persistence, bioaccumulation and ecotoxicological (PBT) properties are of major

CONTACT M. T. Saçan
$ Presented at the 9th International Symposium on Computational Methods in Toxicology and Pharmacology Integrating Internet Resources, CMTPI-2017, 27–30 October 2017, Goa, India.
Supplemental data for this article can be accessed at https://doi.org/10.1080/1062936X.2017.1414075
© 2018 Informa UK Limited, trading as Taylor & Francis Group

concern, together with carcinogenicity, mutagenicity and reproductive toxicity, the so-called CMR properties [2]. This information is not, however, available for the majority of existing chemicals. Some xenobiotics are known to exert toxic effects by binding to the aryl hydrocarbon receptor (AhR). AhR is a ligand-activated transcription factor that mediates transcription of many downstream target genes, including those of the cytochrome P450 metabolizing enzymes. Activation of AhR by ligands such as TCDD leads to a wide variety of responses, including acute toxicity, teratogenicity and cancer. Therefore, the binding affinity of chemicals to AhR can be accepted as a toxicological endpoint. Aryl nuclear receptors are crucial in cellular processes such as metabolism and cell growth [3]. The aryl hydrocarbon receptor is one of these receptors. It is a member of the basic helix-loop-helix transcription factor family and is located in the cytoplasm. Ligand binding to AhR is thought to lead to a conformational change. These ligands can be either synthetic or natural. Polyhalogenated aromatic hydrocarbons such as biphenyls, dibenzofurans and dioxin-like chemicals are classified among the synthetic ligands [4]. According to the proposed mechanism, activated AhR translocates into the nucleus and forms a heterodimer by binding to the AhR nuclear translocator protein (Arnt). The heterodimer then binds to xenobiotic-responsive elements (XRE) [5]. AhR ligands such as polychlorinated biphenyls (PCBs), polychlorinated dibenzofurans (PCDFs), polyhalogenated dibenzo-p-dioxins (PHDDs) and naphthalenes are classified among the persistent environmental pollutants and are widespread in the environment [6,7]. Halogenated aromatic hydrocarbons (HAHs) have serious adverse effects on health due to their carcinogenic, teratogenic and mutagenic effects [8]. Therefore, it is important to study their binding affinities to AhR.
Assessing binding affinities experimentally is, however, generally time-consuming and costly. To avoid these problems, in vitro and/or in silico methods (e.g. quantitative structure–activity/toxicity relationships (QSA/TRs)) can be employed for assessing the binding affinities. Various QSTRs have been developed to explain the HAH binding affinity towards AhR [9–15]. Data sets in previous studies include various groups of chemicals, yet on many occasions researchers [12] have chosen to model each chemical group separately. This limits the applicability domain (AD) of the developed model. To overcome this, we aimed to achieve broad ADs by developing models using diverse chemical groups. Furthermore, up-to-date internal and external validation parameters were not reported for some of the above-mentioned works [14,15]. The aims of the present study were: (i) to investigate the toxic potencies of xenobiotics (i.e. PCBs, PCDDs, PCDFs, naphthalenes and indolocarbazoles) inducing TCDD/TCDF-like effects using their AhR binding affinities; (ii) to develop robust QSTR models that comply with the Organisation for Economic Co-operation and Development (OECD) principles for both endpoints [16], and to indicate the reliability of the predicted AhR binding affinities of the test set chemicals in each of the data sets comprising TCDD- and TCDF-like chemicals with regard to the AD of the developed models; (iii) to predict the AhR binding affinities of more than 900 chemicals from diverse groups such as polychlorinated/brominated biphenyls, dioxins, ethers, furans, phenoxathiins, thianthrenes and dibenzothiophenes, and polyaromatic hydrocarbons (PAHs) (e.g. naphthalene, phenanthrene, anthracene, acridine) with no AhR binding data; and (iv) to compare the predicted toxic potencies of HAHs inducing TCDF-like effects with those inducing TCDD-like effects.
By predicting the AhR binding affinities of chemicals that lack experimental values, the developed QSTR models are intended to contribute to the data needed under REACH.


Materials and methods

Data set

Two different data sets were used in the present study. First, the AhR binding affinities for rat hepatocytes were compiled from the literature [17–19]. Chemicals were tested on the cytochrome P450 isoenzymes purified from rat liver cytosol. This data set has 107 AhR ligands including 25 dibenzo-p-dioxins, 35 dibenzofurans, 18 diphenylethers, 14 biphenyls and 15 different biphenyl derivatives. AhR binding affinity data were utilized as the negative of the log of the molar concentration of the chemical necessary to displace 50% of the radiolabelled TCDD from the Ah receptor. These values are presented as pIC50,TCDD in the present study.

The second data set was obtained from Waller and McKinney's work in 1995 [18]. This data set was previously used by Lo Piparo et al. [10]. Initially, they removed five chemicals from the original data set, which had 99 chemicals, since the exact binding data were not readily available. In addition, one more chemical was eliminated, since it was the duplicate of another chemical in the data set. In the present study, three more chemicals were eliminated since they had exactly the same structures and binding affinities. The remaining 90 chemicals comprise 25 dibenzo-p-dioxins, 35 dibenzofurans, 14 biphenyls, 5 naphthalenes, 7 indolocarbazoles and 4 indolocarbazole derivatives. In the data set of chemicals with TCDF-like effects, AhR binding affinity values were utilized as the negative of the log of the molar concentration of the chemical necessary to displace 50% of the radiolabelled TCDD from the Ah receptor. Since the data were compiled from different laboratories that used 2,3,7,8-TCDF as an internal standard to eliminate laboratory variations, all binding affinities were normalized to a value of 8.444 for TCDF [10]. AhR values for chemicals with TCDF-like effects are presented as pIC50,TCDF in the present study.
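The pIC50 scale used for both data sets can be illustrated with a short sketch. The function names and the example concentration are illustrative assumptions and are not taken from the data sets; the only relation used is pIC50 = −log10(IC50), with IC50 expressed in mol/L. The normalization to 8.444 for TCDF is not reproduced here, since its procedure is described in [10] rather than in the present text.

```python
import math


def pic50_from_ic50(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50), with IC50 given in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive molar concentration")
    return -math.log10(ic50_molar)


def ic50_from_pic50(pic50: float) -> float:
    """Invert the transform: IC50 (mol/L) = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


# Hypothetical example: a ligand that displaces 50% of the radiolabelled
# TCDD at 10 nM (1e-8 mol/L) has a pIC50 of 8.
example_pic50 = pic50_from_ic50(1e-8)
```

On this scale a higher pIC50 corresponds to a lower displacing concentration, that is, to stronger binding.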

Molecular descriptors

Molecular descriptors were calculated using the SPARTAN 10 [20], DRAGON 6.0 [21] and ADMET 8.0 [22] software packages. Structures were drawn in SPARTAN 10 software, and conformer search and geometry optimization were carried out using the semi-empirical PM6 method. Molecular descriptors were calculated using the conformer with the lowest Eaq value. Molecular weight (MW), dipole moment (μ), the energy of the lowest unoccupied molecular orbital (ELUMO), the energy of the highest occupied molecular orbital (EHOMO), gas-phase energy (E), aqueous-phase energy (Eaq), the logarithm of the octanol/water partitioning coefficient (log P), space-filling (CPK) volume and area values were obtained from SPARTAN 10. In total, approximately 3000 descriptors were calculated from the above-mentioned software packages.

Model development and validation

The data set was divided into training (~80% of chemicals) and test sets (~20% of chemicals). Two different division methods were applied: (i) ordering by response; and (ii) ordering the molecules by their score on the first principal component (PC1) of a principal component analysis (PCA) of the molecular descriptors, using QSARINS software [23,24]. For splitting by response, the test chemicals were selected in such a manner that all representative points of the test


set were close to those of the training set after the chemicals were ordered by increasing pIC50. For splitting by structure, the chemicals were ranked by the PC1 score of the molecular descriptors as implemented in QSARINS software. Of the models generated from each division, the model fulfilling the OECD validation requirements [16] and with the best multi-criteria decision-making (MCDM) score as implemented in QSARINS software [23,24] was selected. The Topliss ratio was employed to avoid over-fitting due to a high number of descriptors [24]. Chance correlation was eliminated by applying the QUIK rule [25]. Y-scrambling was repeated 2000 times, scrambling the response values while keeping the X matrix as is, to ensure that the models were not obtained by chance [26]. The OECD principles state that a QSAR model should have appropriate measures of goodness of fit, robustness and predictivity. To ensure this, the developed models were validated internally and externally. For internal validation the coefficient of determination (r2), adjusted coefficient of determination (r2adj) and Fisher statistics (F) were stated. Applying the Y-scrambling technique, we further eliminated the chance correlation that may happen among independent variables. To test the models' predictive reliability an external validation was performed. The predictive ability and external validation of the models were tested with the most widely used metrics [27] and the mean absolute error (MAE)-based criteria [28]. The AD was established by using the ranges of descriptors and endpoints. If a chemical's standardized residual value was higher than 3σ, it was identified as a response outlier. The critical leverage value (h*) was set at 3p′/n, where p′ is the number of descriptors plus one and n is the number of chemicals in the training set [29].
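The thresholds described above lend themselves to a brief worked sketch. The function names below are illustrative assumptions; the formulas are the 3p′/n leverage limit, the three-standard-deviation residual limit (applied symmetrically here, which is an assumption), and the MAE-based "good prediction" criteria quoted from [28] in the footnote of Table 1 (MAE ≤ 0.1 × training set range, and MAE plus three standard deviations of the absolute errors ≤ 0.2 × training set range).

```python
def critical_leverage(n_descriptors: int, n_training: int) -> float:
    """Critical hat value h* = 3 * p' / n, with p' = number of descriptors + 1."""
    if n_training <= 0:
        raise ValueError("the training set must contain at least one chemical")
    return 3.0 * (n_descriptors + 1) / n_training


def is_structural_outlier(hat_value: float, h_star: float) -> bool:
    """A chemical whose leverage exceeds h* lies outside the structural AD."""
    return hat_value > h_star


def is_response_outlier(standardized_residual: float, limit: float = 3.0) -> bool:
    """Flag residuals beyond +/- 3 standard deviations (symmetry is assumed)."""
    return abs(standardized_residual) > limit


def meets_good_mae_criteria(mae: float, mae_plus_3sd: float,
                            training_range: float) -> bool:
    """Both MAE-based conditions for 'good' predictions must hold."""
    return (mae <= 0.1 * training_range
            and mae_plus_3sd <= 0.2 * training_range)


# Six descriptors and 87 training chemicals (TCDD-based model).
h_star_tcdd = critical_leverage(6, 87)
# Seven descriptors and 73 training chemicals (TCDF-based model).
h_star_tcdf = critical_leverage(7, 73)
```

With the training set sizes and descriptor counts reported for the two models, this arithmetic gives h* values of about 0.241 and 0.329, which match the critical hat values quoted with the Williams plots.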

Results and discussion

The selected models for the pIC50 of chemicals with TCDD- and TCDF-like effects are given in Equations (1) and (2), respectively, together with the standard errors of the regression coefficients.

$$
\begin{aligned}
\mathrm{pIC_{50,TCDD}} ={}& -3.605\,(\pm 0.744) - 3.930\,(\pm 0.816)\,\mathrm{MATS5m} + 4.812\,(\pm 0.844)\,\mathrm{MATS5v} \\
& - 1.237\,(\pm 0.160)\,\mathrm{F09[C\text{–}Br]} + 2.018\,(\pm 0.197)\,\mathrm{M\_RNG} \\
& + 2.692\,(\pm 0.247)\,\mathrm{RgGrav\_\_3D} + 0.863\,(\pm 0.223)\,\mathrm{Mor03v}
\end{aligned}
\tag{1}
$$

where nTr = 87, Q2LOO = 0.810, r2 = 0.840, r2adj = 0.830, RMSETr = 0.671, F = 70.023, CCCTr = 0.913, nTest = 20, r2Test = 0.910 and RMSETest = 0.407.

$$
\begin{aligned}
\mathrm{pIC_{50,TCDF}} ={}& -1.468\,(\pm 0.595) - 3.392\,(\pm 1.038)\,\mathrm{RFD} - 1.450\,(\pm 0.489)\,\mathrm{MATS5s} \\
& + 0.635\,(\pm 0.039)\,\mathrm{Tm} - 0.609\,(\pm 0.149)\,\mathrm{nHAcc} + 1.408\,(\pm 0.254)\,\mathrm{B04[O\text{–}Cl]} \\
& - 0.535\,(\pm 0.110)\,\mathrm{F04[Cl\text{–}Cl]} - 1.360\,(\pm 0.510)\,\mathrm{LOC}
\end{aligned}
\tag{2}
$$

where nTr = 73, Q2LOO = 0.815, r2 = 0.850, r2adj = 0.834, RMSETr = 0.638, F = 52.493, CCCTr = 0.919, nTest = 17, r2Test = 0.913 and RMSETest = 0.476.

Both models were tested for external validation parameters (Table 1). Compliance with all these parameters points to a satisfactory external prediction ability. TCDD-like effects were explained with descriptors from DRAGON 6.0 and ADMET 8.0 software. The selected descriptors mainly describe 3D properties as well as features such as the van der Waals volume, atomic mass and ring structure types, and each is significant


Table 1. External validation parameters (a) and literature thresholds (b) of Equations (1) and (2).

Parameter | Model for TCDD-like effects [Equation (1)] | Model for TCDF-like effects [Equation (2)]
r2Test | 0.910 | 0.913
Q2F1 | 0.908 | 0.893
Q2F2 | 0.898 | 0.893
Q2F3 | 0.941 | 0.917
CCCTest | 0.946 | 0.940
Δr2m | 0.080 | 0.090
k | 1.028 | 1.027
k′ | 0.968 | 0.970
r02 | 0.910 | 0.910
r0′2 | 0.902 | 0.890
r2m | 0.869 | 0.798
MAE (95% of the data) | 0.292 | 0.326
MAE + 3σ | 0.988 | 1.160
Training set range | 7.630 | 7.258
MAE-based prediction quality | Good | Good

(a) r2Test: coefficient of determination for the test set; Q2F1, Q2F2, Q2F3: predictive squared correlation coefficients; CCCTest: concordance correlation coefficient for the test set.
(b) Literature thresholds and corresponding references: r2Test > 0.6, (r2 − r02)/r2 < 0.1 and 0.85 ≤ k ≤ 1.15, or (r2 − r0′2)/r2 < 0.1 and 0.85 ≤ k′ ≤ 1.15, or |r02 − r0′2|, where r02 is the predicted versus observed and r0′2 is the observed versus predicted squared correlation coefficient, and k and k′ are the slopes of the regression lines through the origin [29]. Mean absolute error (MAE)-based criteria: as a general rule, an error of 10% of the training set range should be acceptable, while an error of more than 20% of the training set range should be considered very high. Thus, the criteria for good predictions are MAE ≤ 0.1 × training set range and MAE + 3σ ≤ 0.2 × training set range, where σ is the standard deviation of the absolute error values for the test set data. Assuming a normal distribution, mean ± 3σ covers 99.7% of the data points [28]. r2m > 0.5, Δr2m < 0.2 [30]; Q2F1, Q2F2, Q2F3 > 0.6, CCCTest > 0.85 [31].

for the predictivity of the model. Among these, MATS5m and MATS5v are from the 2D autocorrelations block, weighted by the atomic mass and van der Waals volume, respectively. Mor03v, from the 3D-MoRSE block, reflects the importance of substituents in the molecule together with their atomic van der Waals volume. F09[C–Br] is the frequency of C–Br atom pairs at a topological distance of 9 in the structure [32]. As the frequency of C–Br at a topological distance of 9 in the skeleton of a xenobiotic increases, its binding affinity to AhR decreases. M_RNG indicates the presence of ring structures other than benzene and its condensed rings. Xenobiotics with this kind of ring structure had a higher binding affinity to AhR. Finally, RgGrav__3D is the gravitational radius of gyration, which is a measure of molecular compactness. This descriptor takes a small value if the majority of atoms in the chemical are close to the centre of mass; as the structure spreads out further from its centre, the binding affinity increases. TCDF-like effects were explained with descriptors from DRAGON 6.0 software. Among these, ring fusion density (RFD) made the most significant contribution to explaining the AhR binding affinity. The contribution of the MATS5s descriptor to AhR binding affinity seems to be chemical-specific. Tm indicates the importance of the holistic structure of xenobiotics for their binding to AhR: an increase in the size of the molecule increases its binding affinity. nHAcc describes the hydrogen-bonding capacity of a molecule, expressed as the number of possible hydrogen-bond acceptors. B04[O–Cl] and F04[Cl–Cl] describe the presence or absence of an O–Cl atom pair at a topological distance of 4 and the frequency of Cl–Cl atom pairs at a topological distance of 4, respectively. The final descriptor is the lopping centric index (LOC), which is defined as the mean information content derived from the pruning partition of a graph [33–35].
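As an illustration of how Equations (1) and (2) are applied, the sketch below evaluates a linear QSTR model from a table of coefficients. The dictionary layout and the function name are assumptions made for this example, the coefficient values are those printed in the two equations, and the descriptor values used in the checks are hypothetical placeholders rather than values calculated by DRAGON 6.0 or ADMET 8.0.

```python
# Intercepts and coefficients as printed in Equations (1) and (2).
EQ1_INTERCEPT = -3.605
EQ1_COEFFICIENTS = {
    "MATS5m": -3.930,
    "MATS5v": 4.812,
    "F09[C-Br]": -1.237,
    "M_RNG": 2.018,
    "RgGrav__3D": 2.692,
    "Mor03v": 0.863,
}

EQ2_INTERCEPT = -1.468
EQ2_COEFFICIENTS = {
    "RFD": -3.392,
    "MATS5s": -1.450,
    "Tm": 0.635,
    "nHAcc": -0.609,
    "B04[O-Cl]": 1.408,
    "F04[Cl-Cl]": -0.535,
    "LOC": -1.360,
}


def predict_pic50(intercept: float, coefficients: dict, descriptors: dict) -> float:
    """Evaluate intercept + sum(coefficient * descriptor value)."""
    missing = sorted(set(coefficients) - set(descriptors))
    if missing:
        raise KeyError(f"missing descriptor values: {missing}")
    return intercept + sum(
        coefficient * descriptors[name]
        for name, coefficient in coefficients.items()
    )


# Hypothetical input: every descriptor set to zero except M_RNG = 1.
hypothetical = {name: 0.0 for name in EQ1_COEFFICIENTS}
hypothetical["M_RNG"] = 1.0
prediction = predict_pic50(EQ1_INTERCEPT, EQ1_COEFFICIENTS, hypothetical)
```

A prediction obtained this way is only meaningful for a chemical that falls inside the applicability domain of the corresponding model.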


Applicability domain of the proposed models

The plots of the predicted pIC50 values from Equations (1) and (2) vs. the observed pIC50 values are given in Figure 1(a) and (b), respectively. Training and test set chemicals for the TCDD and TCDF data sets are given in the Supplementary Material (Tables S1 and S2, respectively) together with the descriptor values. Figure 1(c) and (d) define the AD visually, based on the leverage approach, for Equations (1) and (2), respectively. Remarkably, all of the chemicals in the training and test sets were within the response and descriptor ranges of the two models, suggesting that the AhR predictions were reliably interpolated by these two models. Both models were further used to predict the pIC50 values of more than 900 external chemicals, consisting of PCBs and derivatives, PBBs, PBDEs and derivatives, PCDEs, PBDD/PCDDs, PCDF/PBDFs, polychlorinated phenoxathiins (PCPTs), polychlorinated thianthrenes (PCTAs), polychlorinated dibenzothiophenes (PCDTs), polychlorinated diphenyl sulphides (PCDPSs), indoles, carbazoles, naphthalenes and PAHs that are environmentally significant. A complete list of external set chemicals is given in the Supplementary Material (Tables S3 and S4) together with descriptor values and predicted pIC50 values from Equations (1) and (2), respectively. The external sets tested on each model contained slightly different numbers of chemicals from each group, yet they are very similar. Among the 944 chemicals tested for Equation (1), 852 chemicals were in the AD, which yielded 90% structural coverage. Of the 938 external set chemicals tested for Equation (2), 831 were within the AD, which resulted in 89% structural coverage. Structural or response outliers were mainly chemicals that were highly substituted or structurally different from the chemicals in the training set, so the descriptors in the models do not represent them.
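The structural coverage figures quoted above follow from simple arithmetic, sketched below. The function name and data layout are illustrative assumptions; the counts are the totals reported for the two external sets and the per-group counts reported for Equation (1) in Table 2.

```python
def structural_coverage(n_in_domain: int, n_total: int) -> float:
    """Percentage of chemicals that fall inside the applicability domain."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_in_domain / n_total


# (chemicals in AD, chemicals tested) for the complete external sets.
coverage_eq1 = structural_coverage(852, 944)
coverage_eq2 = structural_coverage(831, 938)

# Per-group counts for Equation (1), as listed in Table 2.
EQ1_GROUPS = {
    "PAH": (20, 28),
    "PBBs, PCBs": (326, 351),
    "PCB derivatives": (48, 50),
    "PCDD/PCDF": (170, 175),
    "PCDE/PBDE": (198, 224),
    "Hydroxylated, methoxylated derivatives of PBDE": (27, 31),
    "Indoles, carbazoles, naphthalenes": (30, 42),
    "PCTAs, PCPTs, PCDTs": (33, 43),
}

group_coverage_eq1 = {
    group: round(structural_coverage(n_in, n_all))
    for group, (n_in, n_all) in EQ1_GROUPS.items()
}
```

Summing the per-group counts for Equation (1) reproduces the totals of 944 tested chemicals and 852 chemicals inside the AD.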
It is of interest to examine the predictive performance of both models for chemicals in the external set in more detail. Many of the external chemicals of the TCDD-based model that fell outside the AD belonged to the polyaromatic hydrocarbon, polychlorinated thianthrene and dibenzothiophene groups (Figure 2(a) and (b)). Insubria graphs obtained from QSARINS [24] indicate that Equation (1) can, with few exceptions, make reliable predictions for this group of chemicals, although there are no structurally similar chemicals in the TCDD-normalized data set. Of the 400 different congeners and derivatives of PBBs, PCBs and PCB derivatives, 374 were predicted within the structural AD (Figure 2(c)). The pIC50,TCDD values of 255 chemicals from the PCDE and PBDE groups and the hydroxylated and methoxylated derivatives of PBDEs were predicted from Equation (1) (Figure 2(d)). This model was reliable for predicting the pIC50,TCDD values of the ether groups, with a few exceptions: 225 congeners were predicted within the AD. Even though the methoxy group is not represented in the training set, the TCDD-based model is satisfactory in predicting methoxy-substituted chemicals. Finally, among 107 different congeners of halogenated dioxins and furans (Figure 2(e)), octabromo-dibenzo-p-dioxin (OBDD), which is highly substituted with bromine atoms, has a hat value that is slightly higher than the critical hat value (h* = 0.241). Equation (2) was better in terms of predicting polyaromatic hydrocarbons and indolocarbazoles (Figure 3(a)). The descriptors in Equation (2) represent these groups well in the TCDF-normalized data. The pIC50,TCDF predictions for PCTAs, PCPTs and PCDTs in the external set were satisfactory (Figure 3(b)). The pIC50,TCDF predictions for PBB and PCB congeners are not as reliable as those from Equation (1) (Figure 3(c)). It is likely that the descriptors in this model do not represent PCBs, PBBs and their derivatives very well.

Figure 1.  (a) Predicted pIC50,TCDD from Equation (1) vs. observed pIC50,TCDD; (b) predicted pIC50,TCDF from Equation (2) vs. observed pIC50,TCDF. Solid line represents the line of unity; (c) Williams plot for the QSTR model generated using TCDD-normalized data set; and (d) Williams plot for the QSTR model generated using TCDF-normalized data set. Vertical line stands for the critical hat values of the two models (h* = 0.241 and h* = 0.329, respectively), horizontal dashed lines are response outlier limits.


Figure 2.  Insubria graphs for Equation (1) including: (a) PAHs, carbazoles, indoles and naphthalenes; (b) PCPTs, PCTAs, PCDTs and PCDPSs; (c) PCBs, PCB derivatives and PBBs; (d) PCDE, PBDE and derivatives of PBDE; and (e) PCBDDs and PCBDFs as an external set. Predicted pIC50,TCDD values of Training, Test and external set chemicals from Equation (1) and their hat values, where the critical hat value (h*) is 0.241.


Figure 3.  Insubria graphs for Equation (2) including: (a) PAHs, carbazoles, indoles and naphthalenes; (b) PCPTs, PCTAs, PCDTs and PCDPSs; (c) PCBs, PCB derivatives and PBBs; (d) PCDE, PBDE and derivatives of PBDE; and (e) PCBDDs and PCBDFs as an external set. Predicted pIC50,TCDF values of Training, Test and external set chemicals from Equation (2) and their hat values, where the critical hat value (h*) is 0.329.


The pIC50,TCDF values of 271 chemicals from the PCDE and PBDE groups and the hydroxylated and methoxylated derivatives of PBDEs were well predicted from Equation (2) (Figure 3(d)). It has been suggested that bromine substitution at the para and meta positions of PBDEs results in a higher AhR binding affinity. On the other hand, ortho substitution of bromines is thought to be disfavoured by AhR, and the binding affinity decreases, as stated by Papa et al. [11] and Gu et al. [14]. Lastly, among 175 different congeners of halogenated dioxins and furans, only dibenzo-p-dioxin had a hat value (0.374) that is slightly higher than the critical hat value of 0.329 (Figure 3(e)). Unlike Equation (1), for which some of the highly brominated chemicals were response or structural outliers, Equation (2) was good at predicting the highly halogenated congeners of dioxins and furans. Two descriptors in Equation (2) are directly related to the Cl–Cl and O–Cl atom pairs in the chemical, which might explain this trend. Regarding the TCDD-normalized data, the pIC50 values of seven chemicals were higher than the TCDD pIC50 value of 8.00. Among these seven chemicals, 2,3,7,8-tetrabromo dibenzofuran had the highest pIC50,TCDD value of 9.807. It was followed by 2,3,4,7,8-pentabromo dibenzofuran (pIC50,TCDD = 9.499), 2,2′,3,6-tetrabromodiphenyl ether (pIC50,TCDD = 9.288), 1,2,3,4,7,8-hexabromo dibenzo-p-dioxin (pIC50,TCDD = 8.784), 1,2,3,4,7,8-hexabromo dibenzofuran (pIC50,TCDD = 8.560), 1,2,3,7,8-pentabromo dibenzo-p-dioxin (pIC50,TCDD = 8.543) and 1,2,3,7,8-pentabromo dibenzofuran (pIC50,TCDD = 8.088). All of these chemicals were predicted within the model's AD. Furthermore, six of the above-mentioned seven chemicals, the exception being 2,2′,3,6-tetrabromodiphenyl ether, showed both TCDD- and TCDF-like effects. In general, dioxins and furans that had bromine atoms in the 2, 3, 7 and 8 positions showed a high binding affinity towards AhR.
The predictive ability of Equation (1) for the AhR binding affinities of PBDD congeners is high. In contrast, PBDEs, their methoxylated and hydroxylated derivatives, PCBs and PAHs showed less TCDD-like behaviour. Regarding the TCDF-normalized data, 63 chemicals had pIC50,TCDF values higher than that of TCDF. These chemicals and their predicted pIC50,TCDF values obtained from Equation (2) are given in the Supplementary Material (Table S5). The total numbers of external chemicals and the structural applicability of Equations (1) and (2) for each subgroup are given in detail in Table 2.

Table 2. Total numbers and groups of external chemicals used in Equations (1) and (2) and their structural coverage.

Chemicals | No. of chemicals, Equation (1) | No. of chemicals, Equation (2) | No. of chemicals in AD, Equation (1) | No. of chemicals in AD, Equation (2) | Structural coverage (%), Equation (1) | Structural coverage (%), Equation (2)
Training set chemicals | 87 | 90 | All | All | 100 | 100
Test set chemicals | 20 | 17 | All | All | 100 | 100
External set chemicals (total) | 944 | 938 | 852 | 831 | 90 | 89
PAH | 28 | 23 | 20 | 23 | 71 | 100
PBBs, PCBs | 351 | 340 | 326 | 292 | 93 | 86
PCB derivatives | 50 | 61 | 48 | 47 | 96 | 77
PCDD/PCDF | 175 | 175 | 170 | 174 | 97 | 99
PCDE/PBDE | 224 | 240 | 198 | 210 | 88 | 88
Hydroxylated, methoxylated derivatives of PBDE | 31 | 31 | 27 | 25 | 87 | 81
Indoles, carbazoles, naphthalenes | 42 | 25 | 30 | 22 | 71 | 88
PCTAs, PCPTs, PCDTs | 43 | 43 | 33 | 39 | 77 | 91

Table 3. Comparison of the statistical parameters of generated models to those of the previously published models.

Chemical group | Method | n (a) | r2 | Q2 | Q2LOO | r2Test | RMSETest | Reference
Dioxins, furans, biphenyls, naphthalenes, carbazole derivatives | CoMFA | 99 | 0.824 | 0.453 | N/A | N/A | N/A | [18]
Dioxins, furans, biphenyls, naphthalenes | CoMFA/CoMSIA | 95 | 0.9/0.873 | 0.631/0.711 | N/A | N/A | N/A | [9]
Dioxins, furans, biphenyls, naphthalenes, carbazole derivatives | CoMFA | 91 | 0.910 | 0.620 | 0.620 | N/A | N/A | [10]
BFR | MLR | | 0.900 | | 0.790 | 0.730 | 0.420 | [11]
Dioxins and furans | PLS | 60 | 0.549 | 0.603 | N/A | N/A | N/A | [12]
Dioxins, furans and biphenyls | PLS | 65 | 0.992 | 0.907 | N/A | N/A | 0.446 | [13]
Diphenyl ethers | PLS | 18 | 0.932 | 0.894 | N/A | N/A | N/A | [14]
Dioxins, furans and biphenyls | CoMFA | 78 | 0.858 | 0.684 | N/A | N/A | N/A | [15]
Dioxins, furans, biphenyls, diphenyl ethers | MLR | 107 | 0.840 | N/A | 0.810 | 0.910 | 0.407 | Present work (TCDD-based)
Dioxins, furans, biphenyls, diphenyl ethers, naphthalenes and carbazole derivatives | MLR | 90 | 0.850 | N/A | 0.815 | 0.913 | 0.476 | Present work (TCDF-based)

(a) n = number of compounds.


A comparison of the predicted pIC50 values for hydroxylated and methoxylated derivatives indicates that HO-PBDEs show greater AhR binding affinity than MeO-PBDEs, and both of these BDE derivatives show greater potencies for inducing AhR. Furthermore, para substitution of hydrophobic groups such as methoxy and chlorine on the benzene ring enhances the binding affinity towards AhR. Our findings are consistent with the findings in the literature [36,37]. PCB congeners that lack ortho substitution are known to be the most potent towards AhR. Due to their planar structure they can easily fit into the binding site of the receptor, and PCB 126 (3,3′,4,4′,5-penta-CB) shows the most dioxin-like effect among the PCB congeners. Many planar PCB congeners in the external set, including PCB 169 (3,3′,4,4′,5,5′-hexa-CB) and PCB 123 (2,3′,4,4′,5′-penta-CB), had high predicted pIC50 values. Chemicals that have ortho substitution tend to be bulkier and, therefore, do not have a planar configuration. In addition, hydroxy and methoxy substitution at the meta position of the benzene ring in PCB derivatives resulted in increased activity. These results are consistent with the findings from previous work [36,38]. For polyaromatic hydrocarbons, adding halogens such as chlorine and bromine, or groups such as methyl, was observed to enhance the binding affinity of PAHs towards AhR. This trend is supported by the literature [39]. Comparing our models with previously published studies is important, since the comparison points out the strengths and weaknesses of the present study. We are aware that an exact comparison is impossible, as each author uses different software and a different number of chemicals for their modelling data sets. Additionally, internal and external validation parameters are not available for all of the published models.
Nevertheless, it is useful to compare and contrast the models proposed in this study with those in the literature using some of their features and parameters (Table 3). Our models offer superior performance in terms of providing rigorous validation metrics [26–31]. Moreover, having shown wide applicability on a heterogeneous external set, they can be used as a potential tool for predicting the AhR binding affinity of chemicals with either TCDD- or TCDF-like effects, or both.

Conclusions

In the present study, two QSTR models, one for TCDD-normalized data and one for TCDF-normalized data, were developed and validated both internally and externally. Both models fully comply with the OECD criteria. The TCDD- and TCDF-like behaviour of the complex data set, comprising halogenated dioxins, dibenzofurans, biphenyls, biphenyl derivatives, diphenyl ethers, naphthalenes and polyaromatic hydrocarbons, was explained with six and seven descriptors, respectively. The descriptors that represented the entire data were quite complex, which suggests that AhR binding affinity is too complicated to be explained with a simple pathway. Nevertheless, it is possible to interpret the mechanism of AhR binding affinity for chemicals with TCDD/TCDF-like effects using the information gathered from the definition and sign of the descriptors appearing in the generated models. Substitution patterns significantly affected the AhR binding affinity. The models were externally tested with more than 900 chemicals that are structurally close to the chemicals in the data sets. The TCDD- and TCDF-based models had 90% and 89% structural coverage, respectively. Both models were reliable in terms of predicting

SAR AND QSAR IN ENVIRONMENTAL RESEARCH 

 129

brominated biphenyls, halogenated dibenzo-p-dioxins and halogenated dibenzofurans. The TCDF-based model was more reliable in terms of predicting the pIC50,TCDF values of PAHs, whereas the TCDD-based model had a better predictive ability for the pIC50,TCDD values of substituted diphenyl ethers and substituted biphenyls, especially when the substituents are bulky groups. Seven chemicals showed higher binding towards AhR compared to the binding affinity of TCDD and TCDF. These seven chemicals included polybrominated dibenzofurans. They are very persistent in the environment and they can be found in different media. In addition, they have adverse effects on human health. These chemicals did not have any experimental or predicted AhR binding affinity data; the present study provides reliable predicted pIC50,TCDD and pIC50,TCDF values for these chemicals.
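The structural coverage figures quoted above can be made concrete with a small sketch. It assumes a leverage-based applicability domain with the commonly used warning threshold h* = 3(p + 1)/n, where p is the number of descriptors and n the number of training chemicals; whether this is exactly how the coverage was computed for the present models is an assumption, and the function names and data below are hypothetical.

```python
import numpy as np

def leverages(X_train, X_query):
    """Leverage h = x^T (X^T X)^-1 x, with an intercept column added to both sets."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    Xt = np.column_stack([np.ones(len(X_train)), X_train])
    Xq = np.column_stack([np.ones(len(X_query)), X_query])
    xtx_inv = np.linalg.pinv(Xt.T @ Xt)
    # Row-wise quadratic form x_i^T (X^T X)^-1 x_i
    return np.einsum("ij,jk,ik->i", Xq, xtx_inv, Xq)

def structural_coverage(X_train, X_query):
    """Fraction of query chemicals whose leverage does not exceed h* = 3(p + 1)/n."""
    X_train = np.asarray(X_train, dtype=float)
    n, p = X_train.shape
    h_star = 3.0 * (p + 1) / n
    inside = leverages(X_train, X_query) <= h_star
    return float(inside.mean()), h_star

# Hypothetical descriptor matrices: 50 training chemicals, 6 descriptors
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 6))
X_external = rng.normal(size=(900, 6))
coverage, h_star = structural_coverage(X_train, X_external)
print(f"h* = {h_star:.2f}, structural coverage = {coverage:.0%}")
```

Under this assumption, a coverage of 90% would mean that 90% of the external chemicals fall at or below the leverage threshold, so their predictions are interpolations within the descriptor space of the training set rather than extrapolations.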

Acknowledgements

The support of this study by the Scientific and Technological Research Council of Turkey and Slovenian Research Agency (TUBITAK-ARRS) Grant Number 214Z225 is acknowledged. The authors would like to thank Prof. Gramatica for providing the QSARINS 2.2.1 software.

Disclosure statement

The authors report no potential conflict of interest.

ORCID

M. T. Saçan http://orcid.org/0000-0003-2902-4965

References

[1] U.S. EPA, PCBs: Cancer Dose-Response Assessment and Application to Environmental Mixtures, EPA/600/P-96/001F, Office of Research and Development, National Center for Environmental Assessment, U.S. Environmental Protection Agency, Washington, DC, 1996.
[2] Regulation (EC) No 1907/2006 of the European Parliament and of the Council of 18 December 2006 concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH), establishing a European Chemicals Agency, amending Directive 1999/45/EC and repealing Council Regulation (EEC) No 793/93 and Commission Regulation (EC) No 1488/94 as well as Council Directive 76/769/EEC and Commission Directives 91/155/EEC, 93/67/EEC, 93/105/EC and 2000/21/EC, 2006; available at http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:02006R1907-20140410&from=EN.
[3] H. Gronemeyer, J.A. Gustafsson, and V. Laudet, Principles for modulation of the nuclear receptor superfamily, Nat. Rev. Drug Discov. 3 (2004), pp. 950–964.
[4] M.S. Denison, A. Pandini, S.R. Nagy, E.P. Baldwin, and L. Bonati, Ligand binding and activation of the Ah receptor, Chem. Biol. Interact. 141 (2002), pp. 3–24.
[5] A. Poland and J.C. Knutson, 2,3,7,8-Tetrachlorodibenzo-p-dioxin and related halogenated aromatic hydrocarbons: Examination of the mechanism of toxicity, Annu. Rev. Pharmacol. Toxicol. 22 (1982), pp. 517–554.
[6] J. Chovancova, A. Kocan, and S. Jursa, PCDDs, PCDFs and dioxin-like PCBs in food of animal origin (Slovakia), Chemosphere 61 (2005), pp. 1305–1311.
[7] J.L. Domingo and A. Bocio, Levels of PCDD/PCDFs and PCBs in edible marine species and human intake: A literature review, Environ. Int. 33 (2007), pp. 397–405.


[8] K. Hilscherova, M. Machala, K. Kannan, A.L. Blankenship, and J.P. Giesy, Cell bioassays for detection of aryl hydrocarbon (AhR) and estrogen receptor (ER) mediated activity in environmental samples, Environ. Sci. Pollut. Res. 7 (2000), pp. 159–171.
[9] A. Ashek, L. Cheolju, P. Hyunsung, and J.C. Seung, 3D QSAR studies of dioxins and dioxin-like compounds using CoMFA and CoMSIA, Chemosphere 65 (2006), pp. 521–529.
[10] E. Lo Piparo, K. Koehler, A. Chana, and E. Benfenati, Virtual screening for aryl hydrocarbon receptor binding prediction, J. Med. Chem. 49 (2006), pp. 5702–5709.
[11] E. Papa, S. Kovarich, and P. Gramatica, QSAR modeling and prediction of the endocrine-disrupting potencies of brominated flame retardants, Chem. Res. Toxicol. 23 (2010), pp. 946–954.
[12] J. Diao, Y. Li, S. Shi, and Y. Sun, QSAR models for predicting toxicity of polychlorinated dibenzo-p-dioxins and dibenzofurans using quantum chemical descriptors, Bull. Environ. Contam. Toxicol. 85 (2010), pp. 109–115.
[13] F. Li, X. Li, L. Zhang, L. You, J. Zhao, and H. Wu, Docking and 3D-QSAR studies on the Ah receptor binding affinities of polychlorinated biphenyls (PCBs), dibenzo-p-dioxins (PCDDs) and dibenzofurans (PCDFs), Environ. Toxicol. Pharmacol. 32 (2011), pp. 478–485.
[14] C. Gu, M. Goodarzi, X. Yang, Y. Bian, C. Sun, and X. Jiang, Predictive insight into the relationship between AhR binding property and toxicity of polybrominated diphenyl ethers by PLS-derived QSAR, Toxicol. Lett. 208 (2012), pp. 269–274.
[15] J. Yuan, Y. Pu, and P. Yin, Docking-based three-dimensional quantitative structure–activity relationship (3D-QSAR) predicts binding affinities to aryl hydrocarbon receptor for polychlorinated dibenzodioxins, dibenzofurans, and biphenyls, Environ. Toxicol. Chem. 32 (2013), pp. 1453–1458.
[16] OECD, Guidance document on the validation of (quantitative) structure–activity relationship [(Q)SAR] models, ENV/JM/MONO(2007)2, OECD Environment Health and Safety Publications Series of Testing and Assessment No. 69, Organisation for Economic Co-operation and Development, Paris, France, 2007.
[17] S. Safe, Polychlorinated biphenyls (PCBs), dibenzo-p-dioxins (PCDDs), dibenzofurans (PCDFs), and related compounds: Environmental and mechanistic considerations which support the development of toxic equivalency factors (TEFs), Crit. Rev. Toxicol. 21 (1990), pp. 51–88.
[18] C.L. Waller and J.D. McKinney, Three-dimensional quantitative structure–activity relationships of dioxins and dioxin-like compounds: Model validation and Ah receptor characterization, Chem. Res. Toxicol. 8 (1995), pp. 847–858.
[19] S. Safe, S. Bandiera, T. Sawyer, B. Zmudzka, G. Mason, M. Romkes, M.A. Denomme, J. Sparling, A.B. Okey, and T. Fujita, Effects of structure on binding to the 2,3,7,8-TCDD receptor protein and AHH induction—Halogenated biphenyls, Environ. Health Perspect. 61 (1985), pp. 21–33.
[20] SPARTAN 10, Wavefunction Inc., Irvine, USA, 2010; software available at https://www.wavefun.com/products/windows/Spartan10/win_spartan.html.
[21] DRAGON for Windows 6.0, Talete srl, Milan, Italy, 2014; software available at http://www.talete.mi.it/.
[22] ADMET 8.0, Simulations Plus, Lancaster, CA, 2015; software available at http://www.simulationsplus.com/software/admet-property-prediction-qsar/.
[23] P. Gramatica, N. Chirico, E. Papa, S. Kovarich, and S. Cassani, QSARINS: A new software for the development, analysis, and validation of QSAR MLR models, J. Comput. Chem. 34 (2013), pp. 2121–2132.
[24] P. Gramatica, S. Cassani, and N. Chirico, QSARINS-Chem: Insubria datasets and new QSAR/QSPR models for environmental pollutants in QSARINS, J. Comput. Chem., Software news and updates 35 (2014), pp. 1036–1044.
[25] J.G. Topliss and R.P. Edwards, Chance factors in studies of quantitative structure–activity relationships, J. Med. Chem. 22 (1979), pp. 1238–1244.
[26] S. Wold and L. Eriksson, Statistical validation of QSAR results, in Chemometric Methods in Molecular Design, H. van de Waterbeemd, ed., Wiley-VCH, Weinheim, 1995, pp. 309–318.
[27] P. Gramatica and A. Sangion, A historical excursus on the statistical validation parameters for QSAR models: A clarification concerning metrics and terminology, J. Chem. Inf. Model. 56 (2016), pp. 1127–1131.
[28] K. Roy, R.D. Das, P. Ambure, and R.B. Aher, Be aware of error measures. Further studies on validation of predictive QSAR models, Chemom. Intell. Lab. Syst. 152 (2016), pp. 18–33.


[29] A. Golbraikh and A. Tropsha, Beware of q2!, J. Mol. Graph. Model. 20 (2002), pp. 269–276.
[30] N. Chirico and P. Gramatica, Real external predictivity of QSAR models: How to evaluate it? Comparison of different validation criteria and proposal of using the concordance correlation coefficient, J. Chem. Inf. Model. 51 (2011), pp. 2320–2335.
[31] P.K. Ojha, I. Mitra, R.N. Das, and K. Roy, Further exploring r2m metrics for validation of QSPR models, Chemom. Intell. Lab. Syst. 107 (2011), pp. 194–205.
[32] P. Gramatica, Principles of QSAR models validation: Internal and external, QSAR Comb. Sci. 26 (2007), pp. 694–701.
[33] R. Todeschini and V. Consonni, Molecular Descriptors for Chemoinformatics, Vol. 1, Wiley-VCH, Weinheim, 2009.
[34] A.T. Balaban, Chemical graphs. XXXII. Five new topological indices for the branching of tree-like graphs, Theor. Chim. Acta 53 (1979), pp. 355–375.
[35] R. Todeschini, M. Lasagni, and E. Marengo, New molecular descriptors for 2D and 3D structures, J. Chemom. 8 (1994), pp. 236–272.
[36] F. Cao, X. Li, L. Xie, Y. Wang, W. Shi, X. Qian, Y. Zhu, and H. Yu, Molecular docking, molecular dynamics simulation, and structure-based 3D-QSAR studies on the aryl hydrocarbon receptor agonistic activity of hydroxylated polychlorinated biphenyls, Environ. Toxicol. Pharmacol. 36 (2013), pp. 626–635.
[37] G. Su, J. Xia, H. Liu, M.H.W. Lam, H. Yu, J.P. Giesy, and X. Zhang, Dioxin-like potency of HO- and MeO-analogues of PBDEs’ the potential risk through consumption of fish from Eastern China, Environ. Sci. Technol. 46 (2012), pp. 10781–10788.
[38] J. Lindén, S. Lensu, J. Tuomisto, and R. Pohjanvirta, The aryl hydrocarbon receptor and the central regulation of energy balance, Front. Neuroendocrinol. 31 (2010), pp. 452–478.
[39] S. Lee, W. Shin, S. Hong, H. Kang, D. Jung, U.H. Yim, W.J. Shim, J.S. Khim, C. Seok, J.P. Giesy, and K. Choi, Measured and predicted affinities of binding and relative potencies to activate the AhR of PAHs and their alkylated analogues, Chemosphere 139 (2015), pp. 23–29.