Machine Learning Methods

Preface

1. Fundamentals of Machine Learning
   1.1. A Brief History of Machine Learning
   1.2. Key Concepts of Machine Learning
   1.3. Fundamentals of the Theory of Machine Learning
   1.4. Chemoinformatics and Machine Learning

2. Machine Learning Methods
   2.1. Multiple Linear Regression (MLR)
      2.1.1. Fundamentals of the Method
      2.1.2. Stepwise Multiple Linear Regression
      2.1.3. Descriptor Selection Using Stochastic Optimization Algorithms
         2.1.3.1. Genetic Algorithm
         2.1.3.2. The Method of “Simulated Annealing”
   2.2. Linear Models with Regularization
      2.2.1. The Concept of Regularization
      2.2.2. L2-Regularization and Ridge Regression
      2.2.3. L1-Regularization and LASSO
   2.3. Multivariate Analysis
      2.3.1. The Concept of Linear Multivariate Analysis
      2.3.2. Principal Component Analysis (PCA)
      2.3.3. Partial Least Squares (PLS)
      2.3.4. Linear Multivariate Analysis in Chemoinformatics
   2.4. Similarity-Based Methods
      2.4.1. k-Nearest Neighbors Algorithm (k-NN) and Its Generalizations
      2.4.2. The Problem of Fast Detection of Nearest Neighbors
      2.4.3. The “Curse of Dimensionality” Problem and Methods for Its Solution
   2.5. Support Vector Machines (SVM)
      2.5.1. Fundamentals of the Method
      2.5.2. Search for a Separating Hyperplane
      2.5.3. The Use of Kernels in Building SVM Models
      2.5.4. SVM Regression
      2.5.5. The One-Class Classification Method 1-SVM
      2.5.6. SVM in Chemoinformatics
   2.6. Bayesian Approach to Machine Learning
      2.6.1. Fundamentals of the Bayesian Approach to Machine Learning
      2.6.2. The Naïve Bayes Classifier
      2.6.3. Gaussian Process (GP) Regression and Kernel Ridge Regression (KRR)
   2.7. Ensemble Learning
   2.8. Decision Trees
      2.8.1. Fundamentals of the Approach
      2.8.2. Random Forest (RF)
   2.9. Active Learning
   2.10. Graph Mining

   2.11. Artificial Neural Networks
      2.11.1. Fundamentals of the Approach
      2.11.2. Backpropagation Neural Networks
         2.11.2.1. Error Function
         2.11.2.2. Backpropagation Algorithm for Computing the Gradient of the Error Function
         2.11.2.3. Gradient Methods for Training Neural Networks
         2.11.2.4. Basic Principles of Building Models Using Backpropagation Neural Networks
      2.11.3. Modifications of Backpropagation Neural Networks Important for Chemoinformatics
         2.11.3.1. Associative Neural Networks (ASNN)
         2.11.3.2. Bayesian Regularized Neural Networks (BRNN)
         2.11.3.3. Autoencoders
      2.11.4. Self-Organizing Kohonen Maps (SOM) and Other Networks with Competition Layers
      2.11.5. Counterpropagation Networks
      2.11.6. Neural Networks with Radial Basis Functions (RBF Networks)
      2.11.7. Recurrent Neural Networks
         2.11.7.1. Hopfield Neural Networks
         2.11.7.2. Boltzmann Machines
         2.11.7.3. Restricted Boltzmann Machines (RBM)
      2.11.8. Convolutional Neural Networks
         2.11.8.1. General Principles of Building Convolutional Neural Networks
         2.11.8.2. Neural Device for Searching Direct Correlations between Structures and Properties of Chemical Compounds
      2.11.9. Neural Networks for Graphs
         2.11.9.1. General Principles for Building Neural Networks for Working on Graphs
         2.11.9.2. Kvasnicka’s Neural Network
         2.11.9.3. The ChemNet and MolNet Neural Networks
         2.11.9.4. Recursive Cascade Correlation Neural Network
         2.11.9.5. Dreyfus’ Graph Machines
      2.11.10. Neural Networks with Deep Learning: a Way to Artificial Intelligence
      2.11.11. The History of the Use of Neural Networks in Chemoinformatics
   2.12. Inductive Knowledge Transfer and Transfer Learning
   2.13. Generative Topographic Mapping (GTM)
      2.13.1. The Standard Method of Generative Topographic Mapping
      2.13.2. Activity Landscapes and Regression “Structure-Property” Models Based on GTM
      2.13.3. GTM-Based Classification Models
         2.13.3.1. Classification in the Initial Data Space
         2.13.3.2. Classification in the Latent Space
      2.13.4. Extensions of the GTM Approach
         2.13.4.1. Latent Trait Model (LTM)
         2.13.4.2. Incremental Algorithm iGTM
         2.13.4.3. meta-GTM
         2.13.4.4. Stargate GTM

   2.14. Unsupervised Machine Learning
      2.14.1. Cluster Analysis
         2.14.1.1. Methods of Hierarchical Clustering
            2.14.1.1.1. Agglomerative Hierarchical Clustering
            2.14.1.1.2. Divisive Hierarchical Clustering
         2.14.1.2. Methods of Nonhierarchical Clustering
            2.14.1.2.1. Single-Pass Methods. The Leader Algorithm
            2.14.1.2.2. Nearest Neighbor Methods. The Jarvis-Patrick Algorithm
            2.14.1.2.3. Relocation Methods. The k-Means Algorithm
      2.14.2. Dimensionality Reduction
         2.14.2.1. Linear Dimensionality Reduction Methods
            2.14.2.1.1. Multidimensional Scaling
            2.14.2.1.2. Independent Component Analysis (ICA)
            2.14.2.1.3. Canonical Correlation Analysis (CCA)
         2.14.2.2. Nonlinear Dimensionality Reduction Methods
      2.14.3. Density Estimation
         2.14.3.1. General Concepts
         2.14.3.2. Nonparametric Density Estimation Methods. The Parzen Window
         2.14.3.3. Parametric Density Estimation Methods. The Gaussian Mixture Model (GMM)
         2.14.3.4. Density Estimation in Chemoinformatics
      2.14.4. One-Class Classification
   2.15. Semi-Supervised and Transductive Machine Learning
   2.16. Multi-Instance Learning

3. Machine Learning Methods in Chemoinformatics
   3.1. Specificity of Machine Learning Methods in Chemoinformatics
      3.1.1. The Nature of Chemical Objects
      3.1.2. The Representativity Problem
      3.1.3. Data Heterogeneity and Heteroscedasticity
      3.1.4. The Unbalanced Data Set Problem
      3.1.5. Uncertainty of Labelling for Inactives
      3.1.6. Interpretability of Models
   3.2. Recommendations on the Application of Machine Learning Methods in Chemoinformatics
      3.2.1. Amount of Data
      3.2.2. Data Distribution over Chemical Space
      3.2.3. Data Types and Complexity