Supporting Information for

On the Virtues of Automated QSAR ‒ The New Kid on the Block

Best models generated with AutoQSAR for each disease

RESULTS

Table S1. Alzheimer’s disease (hBACE1) best models generated with AutoQSAR for each training set. Overall best model is highlighted.

Training set, %  Model               Score              SD    RMSE  t, min
70               KPLS_desc_7         0.83   0.82  0.83  0.45  0.41  55
71               KPLS_radial_25      0.80   0.82  0.83  0.44  0.44  55
72               KPLS_Dendritic_13   0.85   0.82  0.85  0.42  0.40  55
73               KPLS_molprint2D_1   0.85   0.83  0.87  0.39  0.38  55
74               KPLS_molprint2D_44  0.86   0.84  0.88  0.38  0.37  55
75               PLS_50              0.83   0.85  0.83  0.44  0.42  55
76               KPLS_molprint2D_42  0.88   0.89  0.88  0.38  0.31  55
77               KPLS_molprint2D_11  0.83   0.80  0.83  0.44  0.42  55
78               KPLS_molprint2D_11  0.83   0.80  0.83  0.44  0.42  55
79               KPLS_dendritic_21   0.86   0.85  0.86  0.41  0.35  55
80               KPLS_radial_36      0.88   0.87  0.88  0.37  0.36  55

Table S2. Asthma (hBACE1) best models generated with AutoQSAR for each training set. Overall best model is highlighted.

Training set, %  Model               Score              SD    RMSE  t, min
70               MLR_14              0.73   0.75  0.76  0.52  0.51  39
71               KPLS_desc_28        0.76   0.79  0.75  0.52  0.48  39
72               KPLS_desc_28        0.76   0.79  0.75  0.52  0.48  39
73               KPLS_radial_16      0.75   0.79  0.74  0.52  0.47  39
74               KPLS_dendritic_26   0.76   0.82  0.89  0.34  0.43  39
75               KPLS_dendritic_26   0.76   0.82  0.89  0.34  0.43  39
76               KPLS_desc_37        0.77   0.78  0.77  0.51  0.49  39
77               KPLS_radial_4       0.82   0.85  0.82  0.45  0.42  39
78               KPLS_radial_4       0.82   0.85  0.82  0.45  0.42  39
79               KPLS_radial_18      0.87   0.88  0.87  0.39  0.34  39
80               KPLS_radial_18      0.87   0.88  0.87  0.39  0.34  39

Table S3. Cardiovascular (ABHD6) best models generated with AutoQSAR for each training set. Overall best model is highlighted.

Training set, %  Model               Score              SD    RMSE  t, min
70               MLR_42              0.54   0.67  0.50  0.68  0.59  22
71               KPLS_dendritic_39   0.75   0.69  0.75  0.50  0.47  22
72               KPLS_dendritic_39   0.75   0.69  0.75  0.50  0.47  22
73               KPLS_dendritic_39   0.75   0.69  0.75  0.50  0.47  22
74               KPLS_dendritic_30   0.73   0.71  0.73  0.52  0.47  22
75               KPLS_dendritic_30   0.73   0.71  0.73  0.52  0.47  22
76               KPLS_dendritic_30   0.73   0.71  0.73  0.52  0.47  22
77               KPLS_linear_13      0.73   0.71  0.72  0.52  0.46  22
78               KPLS_linear_13      0.73   0.71  0.72  0.52  0.46  22
79               KPLS_desc_19        0.72   0.70  0.74  0.50  0.49  22
80               KPLS_desc_19        0.72   0.70  0.74  0.50  0.49  22

Table S4. Chagas disease (cruzain) best models generated with AutoQSAR for each training set. Overall best model is highlighted.

Training set, %  Model               Score              SD    RMSE  t, min
70               KPLS_molprint2D_29  0.87   0.87  0.88  0.36  0.35  32
71               KPLS_molprint2D_32  0.87   0.87  0.87  0.37  0.35  32
72               KPLS_molprint2D_32  0.87   0.87  0.87  0.37  0.35  32
73               KPLS_desc_35        0.86   0.87  0.90  0.35  0.35  32
74               KPLS_molprint2D_30  0.84   0.84  0.87  0.37  0.38  32
75               KPLS_molprint2D_30  0.84   0.84  0.87  0.37  0.38  32
76               KPLS_desc_45        0.89   0.89  0.89  0.35  0.32  32
77               KPLS_desc_45        0.89   0.89  0.89  0.35  0.32  32
78               KPLS_desc_23        0.85   0.88  0.90  0.33  0.36  32
79               KPLS_desc_23        0.84   0.87  0.90  0.34  0.36  32
80               KPLS_desc_23        0.84   0.87  0.90  0.34  0.36  32

Table S5. Diabetes (PTP1B) best models generated with AutoQSAR for each training set. Overall best model is highlighted.

Training set, %  Model               Score              SD    RMSE  t, min
70               KPLS_molprint2D_20  0.81   0.90  0.80  0.34  0.25  12
71               KPLS_molprint2D_20  0.81   0.90  0.80  0.34  0.25  12
72               KPLS_molprint2D_46  0.76   0.74  0.75  0.40  0.32  12
73               KPLS_molprint2D_46  0.76   0.74  0.75  0.40  0.32  12
74               KPLS_molprint2D_46  0.76   0.74  0.75  0.40  0.32  12
75               KPLS_linear_27      0.86   0.83  0.86  0.31  0.26  12
76               KPLS_linear_27      0.86   0.83  0.86  0.31  0.26  12
77               KPLS_molprint2D_30  0.86   0.91  0.86  0.29  0.24  12
78               KPLS_molprint2D_30  0.86   0.91  0.86  0.29  0.24  12
79               KPLS_molprint2D_30  0.86   0.91  0.86  0.29  0.24  12
80               KPLS_dendritic_22   0.89   0.86  0.89  0.27  0.24  12

Table S6. Malaria (PfDHODH) best models generated with AutoQSAR for each training set. Overall best model is highlighted.

Training set, %  Model               Score              SD    RMSE  t, min
70               PLS_12              0.76   0.77  0.78  0.52  0.52  97
71               KPLS_desc_35        0.82   0.82  0.82  0.47  0.46  97
72               KPLS_desc_7         0.81   0.81  0.81  0.48  0.47  97
73               KPLS_desc_50        0.82   0.83  0.84  0.45  0.45  97
74               KPLS_desc_7         0.81   0.81  0.82  0.47  0.47  97
75               KPLS_desc_22        0.81   0.83  0.81  0.48  0.46  97
76               KPLS_desc_37        0.77   0.79  0.81  0.49  0.50  140
77               KPLS_desc_48        0.81   0.81  0.80  0.49  0.47  97
78               KPLS_desc_26        0.83   0.84  0.82  0.47  0.44  97
79               KPLS_desc_20        0.83   0.84  0.84  0.44  0.44  97
80               KPLS_desc_11        0.85   0.85  0.85  0.44  0.43  97

Table S7. Schistosomiasis (SmTGR) best models generated with AutoQSAR for each training set. Overall best model is highlighted.

Training set, %  Model               Score              SD    RMSE  t, min
70               KPLS_desc_34        0.73   0.75  0.73  0.45  0.39  18
71               KPLS_desc_34        0.73   0.75  0.73  0.45  0.39  18
72               KPLS_linear_39      0.88   0.89  0.87  0.31  0.25  18
73               KPLS_linear_39      0.88   0.89  0.87  0.31  0.25  18
74               KPLS_linear_39      0.88   0.89  0.87  0.31  0.25  18
75               MLR_19              0.78   0.62  0.77  0.43  0.33  18
76               MLR_19              0.78   0.62  0.77  0.43  0.33  18
77               MLR_19              0.78   0.62  0.77  0.43  0.33  18
78               KPLS_linear_4       0.88   0.90  0.88  0.29  0.25  18
79               KPLS_linear_4       0.88   0.90  0.88  0.29  0.25  18
80               KPLS_linear_4       0.88   0.90  0.88  0.29  0.25  18
