Decision making in a hybrid genetic algorithm

Fernando G. Lobo
Grupo de Análise de Sistemas Ambientais
Faculdade de Ciências e Tecnologia
Universidade Nova de Lisboa
[email protected]

David E. Goldberg
Department of General Engineering
University of Illinois at Urbana-Champaign
117 Transportation Building, 104 South Mathews Avenue, Urbana, IL 61801
[email protected]

IlliGAL Report No. 96009
September 4, 1996

Abstract

There are several issues that need to be taken into consideration when designing a hybrid problem solver. This paper focuses on one of them: decision making. More specifically, we address the following questions: given two different methods, how do we get the most out of both of them? When should we use one and when should we use the other in order to get maximum efficiency? These are fundamental questions that come to mind when we think about hybridization, and here they are investigated in detail. We present a model for hybridizing genetic algorithms (GAs) based on a concept that decision theorists call probability matching. Essentially, it can be viewed as a stochastic learning automaton, and we use it to combine an elitist selecto-recombinative GA with a simple hillclimber (HC). Tests on an easy problem with a small population size match our intuition that both the GA and the HC are needed to solve the problem efficiently.

1 Introduction

Oftentimes, the combination of a genetic algorithm (GA) with a more specific technique tailored to a problem performs better than either technique applied alone. This suggests that there is an opportunity in genetic optimization to capture the best of both schemes. However, little theory has been developed on how to make this combination as efficient as possible.

It is interesting to note that even a pure simple GA can be viewed as a hybrid algorithm. Selection alone is an expensive algorithm for computing the maximum of a finite set of numbers, and crossover alone is little more than a random shuffler. The real power of the GA comes from the combined effect of selection and crossover, the combined effect of exploitation and exploration. When hybridizing a GA with another method, the same problem arises: when should we use the GA and when should we use the other method? We are faced again with a dilemma, and it needs to be solved in a rational way. In this paper we limit our investigation to this critical question and ignore all the other issues of hybridization.

The next section reviews past work on this topic. Section 3 briefly outlines several issues that need to be addressed by a complete theory of hybridization. Section 4 describes a model for combining two different methods, and section 5 specializes it to create a hybrid of an elitist selecto-recombinative GA with a simple hillclimber. Experimental results on the bit counting (onemax) problem are presented in section 6. A simple mathematical analysis and a comparison of the model with experiment are given in section 7. Finally, we suggest some extensions to this work.

Visiting student in the Department of General Engineering at the University of Illinois at Urbana-Champaign.


2 Related work

This section reviews some techniques that are often used in hybridization, followed by an example that gives evidence of their importance in real-world applications. Finally, some of the theoretical research is discussed.

2.1 GAs as hybrids

GAs are optimization techniques that work well on a broad range of problems. They search the most promising areas of the space irrespective of local optima and without any specific knowledge about the problem other than the objective function. This is a very good characteristic of GAs, and it is one of the reasons why they are so elegant, so robust, and so easy to apply. However, when we have a real problem to solve, it is advantageous to incorporate specific knowledge about the problem itself. This can be done in a variety of ways (Goldberg, 1989; Davis, 1991), and there is empirical evidence on a range of applications indicating that hybridization can give a boost in performance (Powell, Tong, & Skolnick, 1989; Davis, Orvosh, Cox, & Qiu, 1993; Karr, 1995; Kado, Ross, & Corne, 1995).

Because GAs do not require specific knowledge about the problem other than the objective function, it is very easy to hybridize them with other techniques. Typically, the GA finds the most promising areas of the search space and the more specialized scheme polishes off the solutions to obtain an extra improvement. Another technique that has been used is to seed the initial population with a certain percentage of good solutions obtained by a specific method; this way the GA has good building blocks right from the beginning. Yet another way is to incorporate knowledge about the problem in the genetic operators themselves (Grefenstette, 1987).

2.2 Applications as mainly hybrids

It is very unlikely that a GA will outperform a specialized scheme tailored to a problem. However, a combination of the two usually performs better than either one alone. This happens because a hybrid offers the possibility of incorporating domain knowledge, which gives it an advantage over a purely blind searcher such as a GA. That is the reason why most of the successful applications of GAs have been hybrids. An example was the EnGENEous project at General Electric (Powell, Tong, & Skolnick, 1989). They used a combination of GAs, expert systems, and numerical optimization techniques to create a general-purpose tool. The combination performed better than each technique applied alone, and with it they could improve the efficiency of the gas turbine that is part of the Boeing 777 jet engine. In (Davis, 1991), a number of other successful hybrid applications are described.

2.3 Theory of hybrids

Little theory has been developed on how to design hybrid GAs, and most of the work in this area is done empirically, tuned to the specific problem that people are trying to solve. This is not a criticism because, after all, when designing some product there is always the need to tune it to get that little extra improvement. However, we think this topic deserves more attention from GA researchers because of the obvious practical implications that hybrid problem solvers can have in industry.

The theoretical work in this area has focused mainly on combining the global search power of the GA with the local search power of a HC. Two interesting lines of work have been studied in some depth: Lamarckian evolution and the Baldwin effect. Both use the metaphor that an individual learns (hillclimbs) during its lifetime (generation). In the Lamarckian case, the resulting individual (after hillclimbing) is put back into the population. In the Baldwinian case, only the fitness is changed and the genotype remains unchanged. The drawback of the Lamarckian approach is loss of diversity. In contrast, the Baldwinian strategy maintains the diversity in the population and can be very useful in a search space that doesn't have "nice hills" (Hinton & Nowlan, 1987). Orvosh and Davis (1993) did an interesting combination of these two strategies for the Survivable Network Design Problem and verified that 95% Baldwinian steps and 5% Lamarckian steps performed best for their particular problem. This seems to indicate that a high proportion of Baldwinian steps is beneficial, but it is unknown whether these results scale up to arbitrary problems. More recently, and on more theoretical ground, Whitley (1995) used Markov chains to try to model hybrid GAs, but no practical consequences seem to result from that work.

3 Outline of a theory of hybridization

Although there are some guidelines and techniques for doing hybrids, no theory of hybridization seems to exist. Such a theory should address at least the following issues:

- Costs
  1. of solving the problem with a given method.
  2. of seeking better knowledge.
  3. of deciding among different methods.
- Knowledge
  1. of the problem domain.
  2. of the strengths and weaknesses of various methods.
- Reliability
  1. of knowledge of the problem.
  2. of knowledge of the strengths and weaknesses of various methods.
- Decision making
  1. deciding among different methods.
  2. deciding whether or not to seek better knowledge.

Throughout the rest of the paper, our attention is focused on the decision-making problem of deciding among different methods; all the other issues mentioned above are ignored. The next section goes directly into this issue and presents a simple model for combining two methods. To keep things simple, each method is isolated in a black box so that the two do not interfere directly with each other. We will use one or the other based on which is more likely to help us solve the problem.


4 Deciding between two methods

Assume there are two different methods to solve a particular problem. Let's call them method A and method B. Also, assume that the problem is solvable in a certain number of time steps by repeatedly applying either one of the methods. At each time step a choice must be made between one or the other. What is the best strategy to use? If at each time step we knew which of the methods was going to be most useful, we would always choose the better one. But since it is unknown beforehand which method is the better one, we are left with a decision-making problem similar to the two-armed bandit problem. Moreover, we are facing a bandit that can change its mind, because which method is the better one is likely to change through time. The next paragraphs illustrate the concept of probability matching and its connection to the decision-making process involving two mutually exclusive alternatives.

Consider a two-armed bandit like the one described in (Holland, 1975; Goldberg, 1989). On average, one of the arms is better than the other, but we don't know which of the two it is. In that case, some experimentation is needed to figure out which arm is likely to be the better one. After that, the arm that was better during the experimental phase would always be chosen. Now let's imagine that we don't know anything about the bandit at all. It is unknown whether one of the arms is better than the other. Maybe the bandit switches its behavior once in a while. We could imagine that for a certain period of time the left arm was the better one and suddenly, after some time, the bandit's right arm started paying more. If we do the experimental phase while the left arm is paying more, we will be misled and will not be able to detect the switch to the right arm. If instead we keep experimenting all the time and allocate our trials in proportion to the payoff given by each of the arms, we can learn the bandit's behavior and adjust if suddenly the right arm starts giving more payoff. This latter behavior is called probability matching (Goldberg, 1990), and it can be shown to be a useful one in a non-stationary environment. An individual using such a strategy is, according to Simon (1956), behaving rationally and attempting to minimize his regret. It is interesting to observe that humans and non-humans tend to adopt a probability-matching-like strategy when confronted with similar situations. In fact, such experiments were performed with real people and real animals, and they tended to allocate their trials in proportion to the payoff of each of the two mutually exclusive alternatives. Those experiments were concerned with the study of animal behavior, a field that is a bridge between mathematics and psychology. Many of those experiments were first described in (Bush & Mosteller, 1958).

Going back to the model, we can think of the bandit's arms as corresponding to our two methods. At a given time step, one of them should be picked, and we will do so using a probability-matching approach. The whole idea here is to do the experimentation continuously and make the best use of the information that we have at a given time. What is needed now is a measure for the payoff. When method A is picked, we get some reward: a measure of how well method A helped us to solve the problem. Similarly, when method B is chosen, we are rewarded by a measure of how well method B performed.
In addition, we would like recent rewards to have more weight than rewards obtained a long time ago. Doing so allows the system to detect a switch much faster. A simple way to achieve this is with a weighted geometric average. The simple equation

$$W_{t+1} = (1 - c)\,W_t + c\,R_t,$$

where $W_t$ denotes the weighted reward at time $t$, $R_t$ is the reward at time $t$, and $c$ is a constant in $[0, 1]$, produces such an effect. The constant $c$ is a control parameter that tells us how far back past rewards are remembered: $c = 0$ means full memory (remember everything; the present reward has no effect), and $c = 1$ means no memory (just use the last reward and forget about the past).

Having a measure for the payoff, we are now able to use the probability-matching strategy. This mechanism has similarities with work done on adapting operator probabilities (Davis, 1989; Julstrom, 1995). All these schemes adapt the operator probabilities based on their recent contributions to the algorithm's performance; the main difference is in the updating scheme itself. Overall, these systems can be viewed as stochastic learning automata (Narendra & Thathachar, 1974). Many other kinds of learning automata exist, and some of them are described in (Lakshmivarahan, 1981). Having presented our simple model, we proceed to the next section and specialize the bandit by putting a hillclimber on the left arm and a selecto-recombinative GA on the right arm.
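To make the mechanism concrete, the following is a minimal Python sketch of the probability-matching decision rule described above. It is an illustration, not the implementation used in the paper; the class name, the initial weights, and the guard against a zero total are our own choices:

```python
import random

class ProbabilityMatcher:
    """Probability matching between two methods (a minimal sketch).

    Weighted rewards follow W <- (1 - c) * W + c * R, and each method is
    chosen with probability proportional to its weighted reward.
    """

    def __init__(self, c=0.1, w_init=0.5):
        self.c = c                  # memory control parameter in [0, 1]
        self.w = [w_init, w_init]   # weighted rewards for methods A and B

    def choose(self):
        """Return 0 (method A) or 1 (method B) by probability matching."""
        total = self.w[0] + self.w[1]
        p_a = self.w[0] / total if total > 0 else 0.5  # guard: fall back to 50/50
        return 0 if random.random() < p_a else 1

    def update(self, method, reward):
        """Fold the latest reward into the chosen method's weighted reward."""
        self.w[method] = (1.0 - self.c) * self.w[method] + self.c * reward
```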

5 A simple hybrid GA

Now method A is a hillclimber and method B is a selecto-recombinative GA. Let's look at the mechanics of each of them alone. The hillclimber works as follows: pick one individual from the population, flip a randomly chosen bit, and insert the better of the two individuals into the next generation. This is like an elitist mutation because we never get an individual worse than what we had before. Now let's look at the GA. We could have used a standard GA, but we decided to use an elitist selecto-recombinative GA (Thierens & Goldberg, 1994b) in order to make a fair comparison with the HC. The elitist GA picks two individuals randomly from the population, crosses them using uniform crossover, and inserts the two best among parents and offspring into the next generation. This algorithm also never produces an individual worse than what we had before. Every time a GA step is performed, it generates two new individuals. To make things fair, whenever the HC is chosen, it is applied to two distinct individuals (a sketch of both operators appears at the end of this section).

Having looked at each method separately, let's see how the ideas presented in the previous section can be applied. First of all, a measure of reward for each of the methods is needed. For both of them, the reward is measured by the ratio between the fitness improvement and the number of function evaluations that were spent. In both methods, only one fitness function evaluation per individual is performed, so the measure of reward is just the fitness improvement. By fitness improvement we mean the difference between the best individuals after and before the method is applied. In other words, there is a reward when there is an innovation, an individual that is better than what we had before.

The updating of the weighted rewards is incremental. Every time a GA step is done on a pair of individuals, we update the weighted reward of the GA and, consequently, the probabilities of picking the GA and the HC on the next time step. The same thing happens when the HC is chosen. The updating of the rewards could have been done on a generational basis, but we opted for the incremental one so that the system is able to adapt faster.
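For concreteness, here is a minimal Python sketch of the two operators as described above, with onemax as the fitness function for illustration (function names are ours):

```python
import random

def onemax(x):
    """Fitness of a bit string: the number of 1s."""
    return sum(x)

def hc_step(ind):
    """One-bit hillclimbing step: flip a random bit, keep the better string."""
    mutant = ind[:]
    i = random.randrange(len(mutant))
    mutant[i] ^= 1
    return mutant if onemax(mutant) > onemax(ind) else ind

def elitist_ga_step(p1, p2):
    """Elitist recombination: uniform crossover, keep the 2 best of the 4."""
    c1, c2 = p1[:], p2[:]
    for i in range(len(p1)):
        if random.random() < 0.5:      # swap each gene with probability 1/2
            c1[i], c2[i] = c2[i], c1[i]
    family = sorted([p1, p2, c1, c2], key=onemax, reverse=True)
    return family[0], family[1]
```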

6 Experimental results

The hybrid algorithm was tested on the bit counting function (onemax). In this problem, the fitness value of an individual is the number of 1s in the chromosome. The first experiments used population sizes set according to Goldberg, Deb, and Clark (1992), and the hybrid picked the GA almost all the time (figure 1). This happens because, on average, the expected reward of the GA is greater than the expected reward of the HC.

Figure 1: Typical runs of the hybrid GA on a 200-bit problem using a population size of 400. The runs finish when all the individuals reach the optimum. The three panels plot the probability of using the GA vs. generation for (a) c=0.001, (b) c=0.01, and (c) c=0.1. Note that for small values of c the system takes longer to adapt.

In order to give the HC a chance, we ran the simulation again with a small population size. This gives the GA a hard time, and we expect that at some point the population will be almost converged to a non-optimal solution and the GA will not give much improvement. At that point we expect to observe a switch to the HC. That is exactly what happened (figure 2).

Figure 2: Typical runs of the hybrid GA on a 200-bit problem using a population size of 30. The runs finish when all the individuals reach the optimum. The three panels plot the probability of using the GA vs. generation for (a) c=0.001, (b) c=0.01, and (c) c=0.1. Note that for small values of c the system takes longer to adapt.

Let's comment on the results, concentrating on the case where the control parameter c = 0.1 (graph (c) of figure 2). The hybrid starts off using the GA roughly 85% of the time, and then it reaches a point where the population has almost converged and, because of that, crossover is not giving much improvement. The system detects this fact and switches to the HC. Note that at the end of the run it switches back to the GA, probably to distribute the last correct alleles to the other individuals in the population.

In another simulation, the same settings were used, but this time with 100% GA steps (a pure GA) and then with 100% HC steps (a pure HC). Twenty runs were performed with each method, and in all of them the GA alone could never find the optimum due to the small population size. The HC alone took on average 663 generations to reach an optimal individual. The combination (hybrid) performed better than either the pure GA or the pure HC.

The hybrid was also run using fixed probabilities of using the GA and HC, in increments of 5%. It was interesting to observe that the algorithm performed about the same when the probability of using the GA was between 45% and 75% (figure 3).

Figure 3: Number of generations to reach an optimum for various fixed probabilities of using the GA. popsize = 30, l = 200. Results are averaged over 20 runs.

In a final set of experiments, the HC was turned off and the elitist GA was run with mutation applied after crossover. After that, the offspring are evaluated and the two best are chosen for the next generation. Again, the algorithm only spends one function evaluation per individual. Mühlenbein (1992) and Bäck (1993) showed that when optimizing a unimodal binary function, a mutation rate of 1/l performs best (assuming a fixed mutation rate throughout the run). Their result was obtained using an asexual evolutionary algorithm; it was not extended to evolutionary algorithms with crossover. Nevertheless, if the population size is small, a mutation rate of 1/l (here, and throughout the rest of the paper, l always denotes the chromosome length) is still a very well tuned parameter when optimizing an easy function such as onemax. Experiments with mutation rates of 1/(10l), 1/l, 3/l, and 4/l were performed, and the results are summarized in table 1, along with the different experiments performed with the hybrid.

Table 1: Number of generations needed to reach an optimum. popsize = 30, l = 200. Results are averaged over 20 runs.

  algorithm                       generations
  hybrid with c=0.001             93
  hybrid with c=0.01              83
  hybrid with c=0.05              87
  hybrid with c=0.1               93
  hybrid with c=0.2               88
  hybrid with c=0.5               89
  elitist GA with mut=1/(10l)     155
  elitist GA with mut=1/l         69
  elitist GA with mut=3/l         109
  elitist GA with mut=4/l         162
  pure HC                         663
  pure GA                         never

As expected, the GA with a mutation rate of 1/l performed better than the hybrid. The extra number of function evaluations performed by the hybrid is a cost of exploration, a price that had to be paid in order to learn the "optimal" mutation rate. As mentioned in section 3, the decision-making facet was isolated and we ignored all the other issues of hybridization, including the use of knowledge. The fact that a mutation rate of 1/l performs well on this problem is valuable knowledge that the hybrid didn't use. The other runs illustrate the case where the mutation rates are poorly chosen (1/(10l), 3/l, 4/l), a situation that might occur when there is no knowledge about what a good mutation rate might be. In those cases the hybrid performed better. Summarizing, the hybrid works quite well under various settings of the control parameter c, but of course, it never beats a fine-tuned parameter setting. This was also observed by Davis (1989). In his paper, he concluded: ". . . using an adaptive mechanism leads to performance slightly inferior to that of carefully-derived interpolated parameter settings . . . "

7 Mathematical analysis

This section presents a simple mathematical analysis that tries to predict the expected reward of each of the methods when applied alone on the onemax problem. From the empirical results, we suspect that the GA gives better reward at the beginning and that there will be a point where the HC starts giving more reward.

7.1 The hillclimber alone

Let $f_t$ be the average fitness of the population at generation $t$. At the next generation, the fitness increase is given by the probability of flipping a 0 to a 1:

$$f_{t+1} = f_t + \frac{l - f_t}{l}.$$

Solving this equation and setting $f_0 = l/2$, the expected average fitness at the initial generation, we obtain an equation that gives us the average fitness of the population at generation $t$:

$$f_t = l - \frac{l}{2}\left(1 - \frac{1}{l}\right)^t. \qquad (1)$$

If we only do HC steps on the onemax problem, the fitness is a random variable with a binomial distribution, and it can be approximated by the normal distribution $N(\mu, \sigma^2)$, where $\mu$ and $\sigma^2$ are the mean and variance, respectively. In our case $\mu = l\,p_t$ and $\sigma^2 = l\,p_t(1 - p_t)$, where $p_t$ is the proportion of correct alleles at generation $t$. Let $X$ and $X'$ be random variables that give the fitness distribution at generations $t$ and $t+1$, respectively. Then the expected reward from generation $t$ to $t+1$ is given by $\mu_{X',2:2} - \mu_{X,2:2}$, where $\mu_{X,2:2}$ is the mean of the second order statistic of a sample of size 2 of the random variable $X$ (Miller & Goldberg, 1995). Using a result from that paper, we have

$$\mu_{X,2:2} = \mu_X + \sigma_X \frac{1}{\sqrt{\pi}}. \qquad (2)$$

From equation 1, we can obtain $p_t = f_t/l$. Combining this with equation 2, we obtain the expected reward $R(t)$ when going from generation $t$ to generation $t+1$:

$$R(t) = \mu_{X',2:2} - \mu_{X,2:2}
     = \left(l\,p_{t+1} + \sqrt{\frac{l\,p_{t+1}(1 - p_{t+1})}{\pi}}\right) - \left(l\,p_t + \sqrt{\frac{l\,p_t(1 - p_t)}{\pi}}\right)
     = \frac{k}{2} + \frac{\sqrt{k(l-1)(k + 2l - kl)/l} - \sqrt{(2-k)\,k\,l}}{2\sqrt{\pi}}, \qquad (3)$$

where $k = (1 - 1/l)^t$.
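Equation 3 is easy to evaluate numerically straight from equations 1 and 2, which also serves as a check on the closed form in $k$. A minimal Python sketch (function names are ours):

```python
import math

def hc_expected_reward(t, l):
    """Expected one-step reward of the one-bit hillclimber on onemax
    (equation 3), evaluated from equations 1 and 2 rather than from the
    closed form in k. This is the population-level approximation."""
    def p(t):  # proportion of correct alleles, from equation 1
        return 1.0 - 0.5 * (1.0 - 1.0 / l) ** t

    def mu_22(p):  # mean of the better of two samples, equation 2
        mu = l * p
        sigma = math.sqrt(l * p * (1.0 - p))
        return mu + sigma / math.sqrt(math.pi)

    return mu_22(p(t + 1)) - mu_22(p(t))

# Example: the predicted reward curve for l = 200, as in figure 4.
for t in (0, 200, 400, 600, 800, 1000):
    print(t, round(hc_expected_reward(t, 200), 4))
```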

Experimental results match the theoretical ones (figure 4).

Figure 4: Comparison of theory (solid line) and experiment (dots) using the hillclimber alone, plotting reward vs. generation. Results are averaged over 20 runs. popsize = 30, l = 200.

7.2 The GA alone

Thierens and Goldberg (1994a) showed that on the onemax problem, the elitist GA behaves very much like the simple GA using binary tournament selection. More recently, Miller and Goldberg (1995) used order statistics and extended the previous model for tournament selection to account for different tournament sizes. Their model showed that for a tournament size $s$, the fitness improvement between generations is given by

$$\Delta \bar{F}_t = \bar{F}_{t+1} - \bar{F}_t = \sigma_{F,t}\,\mu_{s:s},$$

where $\bar{F}_t$ and $\sigma_{F,t}$ are the mean and standard deviation of the population fitness at generation $t$, and $\mu_{s:s}$ is the mean of the maximal order statistic of size $s$ of the standard normal distribution $N(0, 1)$. Miller and Goldberg (1995) give a brief review of order statistics, which should be enough to follow the analysis presented here. For a more detailed description of order statistics, the reader should see David (1981).

Based on their results, we are now able to compute the expected reward when using the GA alone. The reward is given by the difference between the best of parents and offspring and the best of the parents. The expected fitness of the best of parents and offspring is given by the mean of the maximal order statistic of a sample of size 4, and the expected fitness of the best of the parents is given by the mean of the maximal order statistic of a sample of size 2. Using this, we can derive the expected reward $R(t)$ when going from generation $t$ to $t+1$:

$$R(t) = \sigma_{F,t}\,(\mu_{4:4} - \mu_{2:2})
      = \sqrt{l\,p_t(1 - p_t)}\;(\mu_{4:4} - \mu_{2:2})
      = \sqrt{l\cdot 0.5\left(1 + \sin\frac{t}{\sqrt{l}}\right)\left(1 - 0.5\left(1 + \sin\frac{t}{\sqrt{l}}\right)\right)}\;(\mu_{4:4} - \mu_{2:2}). \qquad (4)$$

The last step in the derivation of equation 4 uses the convergence equation for the proportion of correct alleles obtained by Thierens and Goldberg (1994a),

$$p_t = 0.5\left(1 + \sin\frac{t}{\sqrt{l}}\right).$$

Here $\mu_{4:4}$ and $\mu_{2:2}$ are the expected values of the maximal order statistics of sizes 4 and 2 for the standard normal distribution ($\mu_{4:4} = 1.0294$, $\mu_{2:2} = 0.5642$).
Figure 5 shows a comparison of theory and experiment.

Figure 5: Comparison of theory and experiment (experiments 1 and 2) using the GA alone. The difference between experiments 1 and 2 is that in the second one, the individuals are shuffled after crossover. Results are averaged over 20 runs. popsize = 400, l = 200.

The small discrepancy between the two is mainly because the model assumes that parents and offspring are statistically independent. This is not exactly true because, after crossover, the resulting offspring are highly correlated with the parents. In another experiment, we shuffled the individuals after performing crossover and only then computed the reward for every pair of individuals. This way, the four individuals needed to compute the reward value are more statistically independent of each other, and we get a much better fit between the model and experiment. The small difference that still remains is due to the build-up of covariances between the alleles, as explained in (Thierens & Goldberg, 1994a). Although the model is not exact, it is qualitatively good, and in a practical sense, the small difference does not really alter what we can extract from it: the expected reward of the GA will be greater at the beginning of the run and, at some point, when the population has almost converged, the expected reward of the HC will be greater (compare figure 4 with figure 5).

8 Extensions

We started with the onemax problem because it is easy to analyze, and for a first choice it is wise to keep things as simple as possible. The main reason why we chose the one-bit mutation hillclimber is, likewise, that it is easy to analyze. The mechanics of most evolutionary algorithms are very simple and easy to implement, but their mathematical analysis is surprisingly complex. However, the model presented here is very general, and many possibilities exist for further exploration; we just need to be careful to make a fair comparison between the two methods. The system now needs to be tested on a range of test functions to see how it behaves on more difficult problems.

Looking back, the hybrid presented here is still a blind searcher; it is a combination of two blind searchers. In a real application, we would probably want to incorporate previous knowledge about the problem, and by doing so, we would certainly see the real advantages of using a hybrid. Further research on this and other facets of hybridization is needed and is under way at the Illinois Genetic Algorithms Laboratory.

9 Summary and conclusions

In this paper we recognize the importance of hybridization in genetic optimization and try to understand how to combine the best of two different methods. A simple model was introduced and, based on it, a hybrid genetic algorithm was developed. We combined an elitist selecto-recombinative GA with a hillclimber and verified that both methods had an important role in solving the problem. A simple mathematical analysis confirmed our intuition and predicted that a switch from the GA to the HC would occur.

The goal of this work was to get some insight into the decision-making problem in a hybrid GA. The objective was accomplished, but we are aware that this is only a small part of the whole hybridization issue. We think of this work as a first step towards an economic theory of hybridization of genetic algorithms (Goldberg, 1995). There is no doubt that hybridization is an important issue and that any efficient real-world application of genetic algorithms is most likely to be some sort of hybrid. We need to have some idea of how to do this hybridization in an efficient way. This paper has begun to scratch the surface of this issue using a combination of careful experimentation and a simple model based on the sound theoretical grounds of probability matching.

Acknowledgments

This work was supported by "Junta Nacional de Investigação Científica e Tecnológica" (JNICT, Portugal) through scholarship PRAXIS BD4020/94 and by "EDP - Electricidade de Portugal". This effort was also sponsored by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant numbers F49620-94-1-0103 and F49620-95-1-0338. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Office of Scientific Research or the U.S. Government.


References

Bäck, T. (1993). Optimal mutation rates in genetic search. In Forrest, S. (Ed.), Proceedings of the Fifth International Conference on Genetic Algorithms (pp. 2-8). San Mateo, CA: Morgan Kaufmann.

Bush, R. R., & Mosteller, F. (1958). Stochastic models for learning. New York: Wiley.

David, H. A. (1981). Order statistics (2nd ed.). New York: John Wiley and Sons.

Davis, L. (1989). Adapting operator probabilities in genetic algorithms. In Schaffer, J. D. (Ed.), Proceedings of the Third International Conference on Genetic Algorithms (pp. 61-69). San Mateo, CA: Morgan Kaufmann.

Davis, L. (Ed.) (1991). Handbook of genetic algorithms. New York: Van Nostrand Reinhold.

Davis, L., Orvosh, D., Cox, A., & Qiu, Y. (1993). A genetic algorithm for survivable network design. In Forrest, S. (Ed.), Proceedings of the Fifth International Conference on Genetic Algorithms (pp. 408-415). San Mateo, CA: Morgan Kaufmann.

Goldberg, D. E. (1989). Genetic algorithms in search, optimization, and machine learning. New York: Addison-Wesley.

Goldberg, D. E. (1990). Probability matching, the magnitude of reinforcement, and classifier system bidding. Machine Learning, 5(4), 407-426. (Also TCGA Report No. 88002).

Goldberg, D. E. (1995). Toward a mechanics of conceptual machines (IlliGAL Report No. 95011). Urbana, IL: University of Illinois at Urbana-Champaign.

Goldberg, D. E., Deb, K., & Clark, J. H. (1992). Genetic algorithms, noise, and the sizing of populations. Complex Systems, 6, 333-362.

Grefenstette, J. J. (1987). Incorporating problem specific knowledge into genetic algorithms. Genetic Algorithms and Simulated Annealing, 420-460.

Hinton, G. E., & Nowlan, S. J. (1987). How learning can guide evolution. Complex Systems, 1, 495-502.

Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor: University of Michigan Press.

Julstrom, B. A. (1995). What have you done for me lately? Adapting operator probabilities in a steady-state genetic algorithm. In Eshelman, L. (Ed.), Proceedings of the Sixth International Conference on Genetic Algorithms (pp. 81-87). San Francisco, CA: Morgan Kaufmann.

Kado, K., Ross, P., & Corne, D. (1995). A study of genetic algorithm hybrids for facility layout problems. In Eshelman, L. (Ed.), Proceedings of the Sixth International Conference on Genetic Algorithms (pp. 498-505). San Francisco, CA: Morgan Kaufmann.

Karr, C. L. (1995). Genetic algorithms and fuzzy logic for adaptive process control. In Goonatilake, S., & Khebbal, S. (Eds.), Intelligent Hybrid Systems (pp. 63-83). Chichester: John Wiley and Sons.

Lakshmivarahan, S. (1981). Learning algorithms theory and applications. New York: Springer-Verlag.

Miller, B. L., & Goldberg, D. E. (1995). Genetic algorithms, tournament selection, and the effects of noise. Complex Systems, 9(3), 193-212.

Mühlenbein, H. (1992). How genetic algorithms really work: I. Mutation and hillclimbing. In Männer, R., & Manderick, B. (Eds.), Parallel Problem Solving from Nature, 2 (pp. 15-25). Amsterdam, The Netherlands: Elsevier Science.

Narendra, K. S., & Thathachar, M. A. L. (1974). Learning automata - a survey. IEEE Transactions on Systems, Man, and Cybernetics, 4, 323-334.

Orvosh, D., & Davis, L. (1993). Shall we repair? Genetic algorithms, combinatorial optimization, and feasibility constraints. Proceedings of the Fifth International Conference on Genetic Algorithms, 650.

Powell, D. J., Tong, S. S., & Skolnick, M. M. (1989). EnGENEous: domain independent, machine learning for design optimization. In Schaffer, J. D. (Ed.), Proceedings of the Third International Conference on Genetic Algorithms (pp. 151-159). San Mateo, CA: Morgan Kaufmann.

Simon, H. A. (1956). A comparison of game theory and learning theory. Psychometrika, 21(3), 267-272.

Thierens, D., & Goldberg, D. (1994a). Convergence models of genetic algorithm selection schemes. In Davidor, Y., Schwefel, H.-P., & Männer, R. (Eds.), Parallel Problem Solving from Nature, 3 (pp. 119-129). Berlin: Springer-Verlag.

Thierens, D., & Goldberg, D. (1994b). Elitist recombination: An integrated selection recombination GA. Proceedings of the First IEEE Conference on Evolutionary Computation, I, 508-512.

Whitley, D. (1995). Modeling hybrid genetic algorithms. In Winter, G., Periaux, J., Galan, M., & Cuesta, P. (Eds.), Genetic Algorithms in Engineering and Computer Science (Chapter 10, pp. 191-201). Chichester: John Wiley and Sons.

Appendix

The pseudocode of the hybrid algorithm is shown below.

  create initial generation randomly
  W_ga := W_hc := 0.5                                { set initial weighted rewards }
  P_ga := W_ga / (W_ga + W_hc)                       { probability of using the GA }
  P_hc := W_hc / (W_ga + W_hc)                       { probability of using the HC }
  while stop criteria is not reached do
  begin
    for i := 1 to popsize/2 do                       { generation loop }
    begin
      if flip(P_ga) then                             { do a GA step }
        parents := pick two individuals without replacement from the population
        kids := cross parents
        insert the 2 best of (parents + kids) into the next generation
        R_ga := best of insert - best of parents     { reward of GA }
        W_ga := (1 - c) W_ga + c R_ga                { update weighted reward of GA }
      else                                           { do a HC step }
        x1 := pick an individual without replacement
        x1' := flip a randomly chosen bit of x1
        x1'' := best of (x1, x1')
        insert x1'' into the next generation
        x2 := pick an individual without replacement
        x2' := flip a randomly chosen bit of x2
        x2'' := best of (x2, x2')
        insert x2'' into the next generation
        R_hc := best of (x1'', x2'') - best of (x1, x2)  { reward of HC }
        W_hc := (1 - c) W_hc + c R_hc                { update weighted reward of HC }
      endif
      P_ga := W_ga / (W_ga + W_hc)                   { update probabilities using a }
      P_hc := W_hc / (W_ga + W_hc)                   { probability matching strategy }
    endfor
  endwhile
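For readers who prefer executable code, the following is a minimal Python transcription of the pseudocode above. It is a sketch under the paper's description, not the authors' original program; onemax is used as the fitness function and the parameter defaults mirror the small-population experiments (popsize is assumed even):

```python
import random

def onemax(x):
    """Fitness of a bit string: the number of 1s."""
    return sum(x)

def hybrid(l=200, popsize=30, c=0.1, max_gens=2000):
    """Hybrid of an elitist selecto-recombinative GA and a one-bit
    hillclimber, choosing between them by probability matching."""
    pop = [[random.randint(0, 1) for _ in range(l)] for _ in range(popsize)]
    w_ga = w_hc = 0.5  # initial weighted rewards

    for gen in range(max_gens):
        if all(onemax(ind) == l for ind in pop):
            return gen                       # all individuals reached the optimum
        random.shuffle(pop)                  # pair individuals without replacement
        next_pop = []
        for i in range(0, popsize - 1, 2):
            x, y = pop[i], pop[i + 1]
            before = max(onemax(x), onemax(y))
            if random.random() < w_ga / (w_ga + w_hc):   # GA step
                c1, c2 = x[:], y[:]
                for j in range(l):                       # uniform crossover
                    if random.random() < 0.5:
                        c1[j], c2[j] = c2[j], c1[j]
                family = sorted([x, y, c1, c2], key=onemax, reverse=True)
                next_pop += family[:2]                   # keep the 2 best of 4
                reward = onemax(family[0]) - before
                w_ga = (1 - c) * w_ga + c * reward
            else:                                        # HC step on both individuals
                kept = []
                for ind in (x, y):
                    mut = ind[:]
                    mut[random.randrange(l)] ^= 1        # flip one random bit
                    kept.append(max(ind, mut, key=onemax))
                next_pop += kept
                reward = max(onemax(k) for k in kept) - before
                w_hc = (1 - c) * w_hc + c * reward
        pop = next_pop
    return max_gens

# Example usage (stochastic): print(hybrid())
```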