Materials and Manufacturing Processes, 22: 570–576, 2007. Copyright © Taylor & Francis Group, LLC. ISSN: 1042-6914 print / 1532-2475 online. DOI: 10.1080/10426910701319654

Scalability of a Hybrid Extended Compact Genetic Algorithm for Ground State Optimization of Clusters

Kumara Sastry1, David E. Goldberg1, and D. D. Johnson2

1 Department of Industrial and Enterprise Systems Engineering, Illinois Genetic Algorithms Laboratory (IlliGAL), University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
2 Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA

We analyze the utility and scalability of the extended compact genetic algorithm (eCGA)—a genetic algorithm (GA) that automatically and adaptively mines the regularities of the fitness landscape using machine-learning methods and information-theoretic measures—for ground-state optimization of clusters. To reduce the computational time requirements while retaining high reliability in predicting near-optimal structures, we employ two efficiency-enhancement techniques: (1) hybridizing eCGA with a local search method, and (2) seeding the initial population with the lowest-energy structures of a smaller cluster. The proposed method is exemplified by optimizing silicon clusters with 4–20 atoms. The results indicate that the population size required to obtain near-optimal solutions with 98% probability scales sub-linearly (as n^0.83) with the cluster size. The total number of function evaluations (cluster energy calculations) scales sub-cubically (as n^2.45), which is a significant improvement over the exponential scaling of poorly designed evolutionary algorithms.

Keywords: Cluster optimization; Competent and efficient optimization; Competent genetic algorithms; Efficiency-enhancement techniques; Efficient genetic algorithms; Extended compact genetic algorithms; Hybridization; Scalability analysis; Seeding; Silicon clusters.

Received August 21, 2006; Accepted December 14, 2006. Address correspondence to D. D. Johnson, Department of Materials Science and Engineering, 312 MSE Building, C-246, 1304 W. Green St., Urbana, IL 61801, USA; E-mail: [email protected]

1. Introduction

One of the challenging problems in computational materials science and chemistry is the prediction of the lowest-energy structures of atomic and molecular clusters. Geometry optimization can be daunting, even for moderate-size clusters, mainly due to the exponential increase in the number of local optima [1–4]. While classical gradient-based optimization methods fail to reliably predict globally optimal geometries, genetic and evolutionary algorithms [5–7] have been particularly successful [8–14]. Most of the genetic and evolutionary algorithms used for geometry optimization employ custom evolutionary operators that incorporate problem knowledge to varying degrees. In this study we investigate the utility of the extended compact genetic algorithm (eCGA) [15]—a competent genetic algorithm (GA) that automatically and adaptively discovers regularities in fitness landscapes, in terms of variable interactions, by data mining a population of promising solutions—in predicting ground-state structures of clusters. By discovering the regularities of the search problem, we aim to circumvent (or at least alleviate) the need for problem-specific representations and/or evolutionary operators. While competence takes us from intractability to tractability, we often need a variety of efficiency-enhancement techniques to take us from tractability to practicality. Therefore, to enable eCGA to predict lowest-energy structures in practical time, we employ two efficiency-enhancement techniques: (1) hybridizing eCGA with the Nelder–Mead simplex method [16], and (2) seeding the initial population with the lowest-energy structures of a smaller cluster [1].

GA practitioners are primarily interested in the quality of the solution obtained, and the scalability of the evolutionary algorithm often takes a back seat. However, scalability becomes critical as we endeavor to solve problems of increasing size and complexity. Therefore, the purpose of this paper is to investigate the scalability of the hybrid eCGA in predicting near-optimal geometries of clusters with high reliability. As a test case, we use silicon clusters with 4–20 atoms.

This paper is structured as follows. A brief literature review of GA-based cluster optimizers is presented in the next section, followed by a description of the extended compact GA in Section 3. In Section 4, we briefly describe the silicon potential used in the current study. The cluster optimization algorithm used in the scalability analysis is described in Section 5. The results are discussed in Section 6, followed by a summary and conclusions.

2. GA-based cluster optimizers

One of the key areas of interest in computational materials science and chemistry is the design of functional materials, and, given that the chemical and physical properties of nanomaterials can be tuned by varying the size of the clusters, the prediction of lowest-energy structures is critical. That is, for a given potential energy function and cluster size, we are interested in searching for the arrangement of atoms, ions, or molecules that has the lowest potential energy. Recently, there has been considerable interest in employing GAs for predicting the lowest-energy structures of atomic and molecular clusters [2, 8–12, 17–26]. Judson [27] has shown that GAs outperform the traditional Monte Carlo based simulated annealing method. Zeiri [13] reports that GAs are superior to simulated annealing in predicting the geometry of Ar_n–H2. GAs were successful in predicting the optimal atomic structure of the C60 cluster, for which simulated annealing failed [9]; in that study the authors used a geometric crossover to create new individuals. Hartke [12] used a simple GA to predict the structures of Si4 and Si10 with a semi-empirical potential. Gregurick and Alexander [11] proposed a hybrid GA, in which they combined a binary-coded GA and a conjugate gradient local search, and reported significant improvement in performance when compared to a GA without local search. Niesse and Mayne [2] modified this hybrid GA by encoding real values instead of binary. Iwamatsu [14] used a Nelder–Mead simplex [16] instead of a conjugate gradient technique to predict the cluster structures of Si_n (n = 3–15). Erkoç et al. [10] compared the performance of two different evolutionary algorithms in predicting the lowest-energy structures of silicon clusters for a number of empirical potential energy functions. Chakraborti et al. [8, 17–21] have extensively used evolutionary algorithms to predict the lowest-energy structures of hydrogenated silicon and copper clusters. A number of these studies have used GAs with either fixed recombination and mutation operators or have resorted to search operators that incorporate problem knowledge. Additionally, many studies employ one or more efficiency-enhancement techniques [6, 28, 29], including hybridization—where GAs are coupled with local search methods—and seeding [1, 2]—where the initial population is seeded with clusters obtained by perturbing optimal lower-order clusters.

3. eCGA

The eCGA [15] is an estimation of distribution algorithm [30–32] that replaces the variation operators of GAs with building and sampling probabilistic models of promising candidate solutions.
The eCGA is based on the key idea that the choice of a good probability distribution is equivalent to linkage learning. The quality of a distribution is quantified using minimum description length (MDL) models. The key concept behind MDL models is that, all things being equal, simpler distributions are better than more complex ones. The MDL restriction penalizes both inaccurate and complex models, thereby leading to an optimal probability distribution. Thus, the MDL restriction reformulates the problem of finding a good distribution as an optimization problem that minimizes the cost of both the probability model and the population representation. The probability distribution used in eCGA belongs to a class of probability models known as marginal product models (MPMs). MPMs are formed as a product of marginal distributions on a partition of the genes and are similar to the models of the compact GA (CGA) [33] and PBIL [34]. Unlike the models used in CGA and PBIL, however, MPMs can represent probability distributions for more than one gene at a time. MPMs also provide a direct linkage map, with each partition separating tightly linked genes. For example, the MPM [1, 3], [2], [4] for a four-bit problem represents that the 1st and 3rd genes are linked, while the 2nd and 4th genes are independent. This MPM consists of the following marginal probabilities:

{p(x1 = 0, x3 = 0), p(x1 = 0, x3 = 1), p(x1 = 1, x3 = 0), p(x1 = 1, x3 = 1), p(x2 = 0), p(x2 = 1), p(x4 = 0), p(x4 = 1)}, where xi is the value of the ith gene.

The eCGA can be algorithmically outlined as follows:

1. Initialization: The population is usually initialized with random individuals. However, other initialization procedures can also be used.
2. Evaluation: Evaluate the fitness values of the individuals.
3. Selection: The eCGA uses s-wise tournament selection [35]. However, other selection procedures can be used instead.
4. Build the probabilistic model: In eCGA, both the structure and the parameters of the model are searched. A greedy search heuristic is used to find an optimal model of the selected individuals in the population.
5. Create new individuals: In eCGA, new individuals are created by sampling the probabilistic model.
6. Replace the parental population with the offspring population.
7. Repeat steps 2–6 until some convergence criterion is met.

Two things need further explanation: one is the identification of the MPM using MDL, and the other is the creation of a new population based on the MPM. The identification of the MPM in every generation is formulated as a constrained optimization problem:

Minimize   Cm + Cp
Subject to 2^{ki} ≤ Np,  ∀ i ∈ {1, …, m}   (1)
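The seven steps above can be sketched in a few lines of code. This is an illustrative skeleton only, not the authors' implementation: the OneMax fitness, population size, and the fixed single-gene partitions are assumptions made for the demonstration, whereas the real eCGA re-discovers the partition structure each generation with the MDL-driven greedy search of step 4.

```python
import random

def tournament_selection(pop, fitness, s, rng):
    """Step 3: s-wise tournament selection (best of s random picks per slot)."""
    return [max(rng.sample(pop, s), key=fitness) for _ in pop]

def build_mpm(selected, partitions):
    """Step 4 (simplified): estimate marginal frequencies for fixed gene
    partitions.  The full eCGA also searches the partition structure itself."""
    n = len(selected)
    model = []
    for part in partitions:
        counts = {}
        for chrom in selected:
            key = tuple(chrom[i] for i in part)
            counts[key] = counts.get(key, 0) + 1
        model.append((part, {k: c / n for k, c in counts.items()}))
    return model

def sample_mpm(model, length, rng):
    """Step 5: create one offspring by sampling each partition independently."""
    child = [0] * length
    for part, probs in model:
        keys = list(probs)
        chosen = rng.choices(keys, weights=[probs[k] for k in keys])[0]
        for idx, bit in zip(part, chosen):
            child[idx] = bit
    return child

def ecga(fitness, length, partitions, pop_size=200, gens=30, s=8, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]  # step 1
    for _ in range(gens):                                                        # step 7
        selected = tournament_selection(pop, fitness, s, rng)                    # steps 2-3
        model = build_mpm(selected, partitions)                                  # step 4
        pop = [sample_mpm(model, length, rng) for _ in range(pop_size)]          # steps 5-6
    return max(pop, key=fitness)

# toy run: OneMax fitness with each gene in its own partition
best = ecga(sum, 10, [(i,) for i in range(10)])
```

With each gene in its own partition the sampling step reduces to a univariate model, as in CGA and PBIL; the MPM formulation becomes essential once multi-gene partitions are discovered.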

where Cm is the model complexity, which represents the cost of a complex model. In essence, the model complexity Cm quantifies the model representation size in terms of the number of bits required to store all the marginal probabilities. Let a given problem of size ℓ, with binary alphabets, have m partitions with ki genes in the ith partition, such that Σ_{i=1}^{m} ki = ℓ. Then each partition i requires 2^{ki} − 1 independent frequencies to completely define its marginal distribution. Furthermore, each frequency is of size log2(Np), where Np is the population size. Therefore, the model complexity (or model representation size) Cm is given by

Cm = log2(Np) Σ_{i=1}^{m} (2^{ki} − 1)   (2)
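As a rough illustration, Eq. (2), together with the compressed population complexity of Eq. (3) and the greedy partition-merge search described later in the text, might be sketched as follows. The function names and toy data are our own, not the authors' code, and zero-frequency entropy terms are simply skipped.

```python
import math
from itertools import combinations

def mdl_score(selected, partitions):
    """Combined complexity Cm + Cp of an MPM over the selected population:
    Cm = log2(Np) * sum_i (2^ki - 1)           (Eq. 2)
    Cp = -Np * sum_i sum_j pij * log2(pij)     (Eq. 3, zero-frequency terms skipped)
    """
    n = len(selected)
    cm = cp = 0.0
    for part in partitions:
        cm += math.log2(n) * (2 ** len(part) - 1)
        counts = {}
        for chrom in selected:
            key = tuple(chrom[i] for i in part)
            counts[key] = counts.get(key, 0) + 1
        for c in counts.values():
            p = c / n
            cp -= n * p * math.log2(p)
    return cm + cp

def greedy_mpm_search(selected, length):
    """Start with all genes independent; repeatedly apply the pairwise merge
    that most reduces Cm + Cp; stop when no merge improves the score."""
    parts = [(i,) for i in range(length)]
    best = mdl_score(selected, parts)
    while True:
        candidates = []
        for a, b in combinations(range(len(parts)), 2):
            if 2 ** (len(parts[a]) + len(parts[b])) > len(selected):
                continue  # constraint of Eq. (1): 2^ki <= Np
            merged = parts[:a] + parts[a + 1:b] + parts[b + 1:] + [parts[a] + parts[b]]
            candidates.append((mdl_score(selected, merged), merged))
        if not candidates:
            return parts
        score, model = min(candidates)
        if score < best:
            best, parts = score, model
        else:
            return parts
```

On a toy population where genes 0 and 1 are perfectly correlated while gene 2 is independent, the search merges genes 0 and 1 into one partition and leaves gene 2 alone: the drop in Cp from modeling the correlated pair jointly outweighs the increase in Cm, while further merging does not.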

The compressed population complexity Cp represents the cost of using a simple model as opposed to a complex one. In essence, the compressed population complexity Cp quantifies the data compression in terms of the entropy of the marginal distributions over all partitions. Therefore, Cp is evaluated as

Cp = −Np Σ_{i=1}^{m} Σ_{j=1}^{2^{ki}} pij log2(pij)   (3)

where pij is the frequency of the jth gene sequence of the genes belonging to the ith partition. In other words,

pij = Nij/Np, where Nij is the number of chromosomes in the population (after selection) possessing bit sequence j ∈ {1, …, 2^{ki}}¹ for the ith partition. The constraint in Eq. (1) arises due to the finite population size.

The following greedy search heuristic is used to find an optimal or near-optimal probabilistic model:

1. Assume each variable is independent of the others. The model is a vector of probabilities.
2. Compute the model complexity and compressed population complexity values of the current model.
3. Consider all possible ℓ(ℓ − 1)/2 pairwise merges of two variables.
4. Evaluate the model complexity and compressed population complexity values for each candidate model structure.
5. Select the merged model with the lowest combined complexity.
6. If the combined complexity of the best merged model is better than that of the model evaluated in step 2, replace the current model with the best merged model and go to step 2.
7. Otherwise, the model cannot be improved, and the model of step 2 is the probabilistic model of the current generation.

The offspring population is generated by randomly choosing subsets from the current individuals according to the probabilities of the subsets as calculated in the probabilistic model. The scalability of eCGA has been analyzed, and bounds for population sizing and run duration have been derived elsewhere [15, 36].

4. Silicon potential

Many potential energy functions have been developed for silicon, owing to its technological importance, including the Stillinger–Weber potential [37], the Tersoff potential [38], and the Gong potential [39]. A comparative study of geometry optimization for these and other potentials is given elsewhere [10, 14], and a comparison of several empirical potentials in predicting the properties of both bulk silicon and silicon clusters is also given elsewhere [40].
¹ Note that a BB (building block) of length k has 2^k possible sequences, where the first sequence denotes 00…0 and the last sequence 11…1.

Although the Tersoff and Stillinger–Weber potentials have satisfactorily predicted important materials properties [41, 42], they are not as accurate in predicting structural properties [39]. For example, the three-body term in the Stillinger–Weber potential becomes zero only for the perfect tetrahedral angle (∼109.5°). On the other hand, ab initio molecular dynamics calculations indicate a large peak at 60° and a smaller peak at 100° [39]. The Gong potential, based on the Stillinger–Weber potential, contains a correction in the three-body term to incorporate not only the tetrahedral angle but also the preferred angle. The Gong potential has the following form:

Utot = Σ_{i<j}^{n} v2(i, j) + Σ_{i<j<k}^{n} v3(i, j, k)

where v2 is the two-body term and v3 is the three-body term.
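The pair-and-triple structure of the total energy can be illustrated with a generic evaluator. The v2 and v3 terms below are placeholders (a Lennard-Jones-like pair term and a zero three-body term), not the actual Gong or Stillinger–Weber parameterizations, which are defined in the cited references.

```python
import math
from itertools import combinations

def dist(a, b):
    """Euclidean distance between two atomic positions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def total_energy(positions, v2, v3):
    """Utot = sum over all pairs of v2 plus sum over all triples of v3."""
    e = 0.0
    for i, j in combinations(range(len(positions)), 2):
        e += v2(positions[i], positions[j])
    for i, j, k in combinations(range(len(positions)), 3):
        e += v3(positions[i], positions[j], positions[k])
    return e

# placeholder terms, NOT the Gong/Stillinger-Weber parameterization:
lj = lambda a, b: 4.0 * (dist(a, b) ** -12 - dist(a, b) ** -6)  # LJ-like pair term
zero3 = lambda a, b, c: 0.0                                     # stand-in three-body term
```

For an n-atom cluster this evaluator costs O(n^3) per energy calculation because of the triple sum, which is one reason the number of energy evaluations, not generations, is the natural cost measure in the scalability analysis.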