Why Parameter Control Mechanisms Should Be Benchmarked Against Random Variation

Giorgos Karafotias, Mark Hoogendoorn, and A.E. Eiben
Computer Science Department, VU University Amsterdam

Abstract—Parameter control mechanisms in evolutionary algorithms (EAs) dynamically change the values of the EA parameters during a run. Research over the last two decades has delivered ample examples where an EA using a parameter control mechanism outperforms its static version with fixed parameter values. However, very few studies have investigated why such parameter control approaches perform better. In principle, it could be that merely using different parameter values during a run is already sufficient, and that EA performance can be improved without sophisticated control strategies; this raises an issue in the methodology of evaluating parameter control mechanisms. This paper investigates whether very simple random variation of parameter values during an evolutionary run can already provide improvements over static values. Results suggest that random variation of parameters should be included in the benchmarks when evaluating a new parameter control mechanism.

I. INTRODUCTION

When setting up an evolutionary algorithm (EA), one aspect that needs to be addressed is defining appropriate values for the various parameters of the algorithm. If inappropriate values are chosen, the performance of the EA can be severely degraded. The question whether a certain parameter value is appropriate is far from trivial, as different phases of an evolutionary run may require different values. In fact, there are two principal options for setting such values [4]: (1) trying to find fixed parameter values that work well across the entire evolutionary run (parameter tuning), and (2) finding a suitable control strategy to adjust the parameter values during a run (parameter control). Furthermore, we can distinguish three forms of parameter control: (a) deterministic parameter control, which uses a fixed control scheme without any input from the state of the process; (b) adaptive parameter control, which utilizes information from the state of the process to determine good parameter values; and (c) self-adaptive parameter control, whereby the parameter values are part of the evolutionary process itself. In the literature, a variety of evolutionary algorithms equipped with sophisticated parameter control strategies have been shown to outperform their static counterparts (see e.g. [12], [3] and [18]), and many have acknowledged that dynamically adjusting parameter values is a very good idea (see e.g. [14]). In the majority of work presenting parameter control mechanisms, the value of the controller is assessed only by comparing its performance to that of the static version of the EA that keeps parameter values fixed. The motivation of this paper is based on the idea that such performance benefits

observed when using parameter control mechanisms instead of static parameter values might in fact result simply from the variation of the parameter values and not from the intelligent strategy itself. Some authors have hinted in this direction (see the next section), but none have performed a rigorous analysis. If variation on its own (without some intelligent strategy) can improve performance, a methodological issue is raised: when evaluating a parameter control mechanism, the actual contribution of the intelligent strategy to the performance gain should not be taken for granted but should be explicitly assessed by also including 'naive variation' in the set of benchmarks used. The goal of this paper is to investigate whether (non-intelligent) variation alone might indeed improve EA performance compared to keeping parameter values fixed. To this end, we implement a few simple random methods to vary parameter values during the run of an EA and investigate their impact on a set of standard test problems. In particular, we use a uniform distribution and a Gaussian distribution, and compare the resulting EAs with an EA whose parameter values are fixed (by a powerful tuning algorithm) and with an EA whose parameters change according to a sine wave based schedule (enabling increasing and decreasing values). This paper is organized as follows. Section II explains the motivation in more detail and discusses related work. Section III presents the experimental setup, and Section IV presents the results. Finally, Section V concludes the paper and outlines avenues for future work.

II. MOTIVATION AND RELATED WORK

It is widely accepted in EC that parameter control is preferable to static parameter values because different parameter values are needed at different stages of the evolutionary process (e.g. switching from global to local search). Additionally, information about the fitness landscape that is accumulated during the search can be used to improve parameter values in the later phases of the process [2]. Several parameter control methods for evolutionary algorithms have been suggested; literature reviews can be found in [4], [16], [7] and [13]. In many of the studies introducing these parameter control strategies, performance comparisons between the EA using the control mechanism and the equivalent EA with static parameter values are presented as proof of the controller's value.

However, usually no further investigation is carried out as to how exactly the parameters are varied and to what extent the performance gain is a result of the specific control strategy rather than the mere fact that parameters change during the run, i.e. whether just adding some variation to the parameter values already brings added value. The idea that simply changing the values of a parameter, regardless of how that change is done, can result in better performance has been hinted at in previous work. In [17], Spears experiments with self-adaptive operator selection for genetic algorithms. Results show that a GA with random operator selection performs similarly to the self-adaptive GA, suggesting that it is the availability of multiple operators that improves performance and not self-adaptation. Randomized values are purposefully used in [5] to set the parameters of different islands of a distributed EA, with the rationale that, at each moment during the search process, there will be at least one parameter configuration that is favorable for further progress. In this paper we attempt to answer the question whether random variation of parameter values by itself (with no intelligence, purpose or strategy) can have a positive effect on the performance of an evolutionary algorithm compared to keeping parameter values fixed. Though theoretical studies on optimal parameter values or ideal control strategies do exist (see e.g. [6], [9], [8] and [11]), we believe that such a theoretical approach here would be infeasible or greatly oversimplifying. For this reason we prefer the experimental approach described in the following section.

III. EXPERIMENTAL SETUP

As explained in the previous sections, the purpose of the experiments presented here is to determine whether the mere variation of parameter values (with no particular method or strategy) can result in a performance gain for the evolutionary algorithm when compared to keeping parameters fixed. In order to assess the effect of parameter variation in isolation from the effect of an "intelligent" control method or strategy, we use the most naive parameter variation approach possible, i.e. random variation of parameter values. Keeping all other factors identical, we compare the performance of an evolutionary algorithm when its parameter values are kept fixed during the whole search and when its parameter values vary according to some random distribution. To show the difference between random variation and a non-random (but certainly not sophisticated) variation approach, an additional way of varying the parameter values is used: a sine-based function, which produces sequences of increasing and decreasing values. Before describing the experimental setup, two important points must be emphasized. First, we are not trying to establish as a general truth that parameter variation will by itself lead to better performance, but rather to determine whether better performance can be observed as a result of only the availability or application of multiple parameter values, regardless of any control strategy. Second, we do not propose random variation as a parameter control method. The performance comparison between search with static parameter values and with randomly varying parameters aims only at exploring the effect of parameter variation, not at designating a winner.

A. The evolutionary algorithm and the test functions

As an evolutionary algorithm we use a (µ + λ) Evolution Strategy with n-point crossover, Gaussian mutation and tournament selection for both parent and survivor selection; a schematic sketch is given after the parameter list. The parameters used in these experiments are the following six:

• population size µ
• generation gap g (g is the ratio of the number of offspring to the population size)
• number of crossover points n
• mutation step size σ
• parent selection tournament size kp
• survivor selection tournament size ks
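The following is a minimal, self-contained sketch of such a (µ + λ) ES, intended only to make the roles of the six parameters concrete. The objective function, domain bounds, and parameter values in the example are illustrative placeholders, not the tuned settings of Table II.

```python
import numpy as np

def n_point_crossover(a, b, n_points, rng):
    """Classic n-point crossover: alternate segments between the two parents."""
    D = len(a)
    cuts = np.sort(rng.choice(np.arange(1, D), size=min(n_points, D - 1), replace=False))
    child = a.copy()
    use_b = False
    prev = 0
    for cut in list(cuts) + [D]:
        if use_b:
            child[prev:cut] = b[prev:cut]
        use_b = not use_b
        prev = cut
    return child

def tournament(fitness, k, rng):
    """Index of the best (lowest fitness) of k uniformly drawn individuals."""
    contestants = rng.integers(0, len(fitness), size=k)
    return contestants[np.argmin(fitness[contestants])]

def es_run(f, D, mu, g, sigma, n, kp, ks, budget=10000, seed=0):
    rng = np.random.default_rng(seed)
    lam = max(1, int(round(g * mu)))            # generation gap g: lambda = g * mu
    pop = rng.uniform(-5.0, 5.0, size=(mu, D))  # illustrative domain
    fit = np.array([f(x) for x in pop])
    evals = mu
    while evals < budget:
        offspring = np.empty((lam, D))
        for i in range(lam):                    # parent selection: tournaments of size kp
            p1 = pop[tournament(fit, kp, rng)]
            p2 = pop[tournament(fit, kp, rng)]
            child = n_point_crossover(p1, p2, n, rng)
            offspring[i] = child + rng.normal(0.0, sigma, size=D)  # Gaussian mutation
        off_fit = np.array([f(x) for x in offspring])
        evals += lam
        merged = np.vstack([pop, offspring])    # (mu + lambda) selection pool
        merged_fit = np.concatenate([fit, off_fit])
        keep = [tournament(merged_fit, ks, rng) for _ in range(mu)]  # survivor tournaments
        pop, fit = merged[keep], merged_fit[keep]
    return fit.min()

if __name__ == "__main__":
    sphere = lambda x: float(np.sum(x ** 2))    # placeholder objective
    print(es_run(sphere, D=10, mu=50, g=2.0, sigma=0.5, n=3, kp=4, ks=8))
```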

For test problems we use a set of seven standard continuous optimization test functions (see Table I). All functions are to be minimized in 10 dimensions. All are multimodal except one (f3, Rosenbrock). A single EA run is 10000 function evaluations.

B. Comparison approach

We use the following workflow to facilitate the desired comparison; in steps (2)-(4) experimental results are generated and comparisons are made:

1) Tune the parameter values of the ES using a dedicated parameter tuner, resulting in a set of basic parameter values.
2) Add variation to the basic parameter values found under (1) using a Gaussian and a uniform distribution with fixed variation widths.
3) Instead of using fixed variation widths as in (2), try to find the best variation widths using a parameter tuning approach, given the basic values found under (1).
4) Tune all parameter values (both basic and variation values) at the same time, and also include a non-randomized parameter value generator which can express basic sequences of increasing and/or decreasing values.

Each of these steps is discussed more elaborately below.

1) Tuning the ES: As a first step, the ES is tuned for every test function separately (all six parameters are tuned concurrently, with one tuning process per problem). For tuning we use Bonesa [15], a state-of-the-art method for tuning real-valued parameters. This step results in seven parameter vectors $\vec{p}_i$, one for each problem $f_i$, $i = 1,\dots,7$, with good static values for each parameter. The ranges and the results of the tuning process are shown in Table II. A single Bonesa tuning run was given a budget of 10000 algorithm tests.

TABLE I. TEST FUNCTIONS

f1 (Ackley):      $f(\vec{x}) = -20 \cdot \exp\left(-0.2\sqrt{\frac{1}{D}\sum_{i=1}^{D} x_i^2}\right) - \exp\left(\frac{1}{D}\sum_{i=1}^{D} \cos(2\pi x_i)\right) + 20 + e$
f2 (Rastrigin):   $f(\vec{x}) = A \cdot D + \sum_{i=1}^{D} \left(x_i^2 - A \cdot \cos(2\pi x_i)\right)$
f3 (Rosenbrock):  $f(\vec{x}) = \sum_{i=1}^{D} \left(100 \cdot (x_i^2 - x_{i+1})^2 + (1 - x_i)^2\right)$
f4 (Schaffer):    $f(\vec{x}) = \sum_{i=1}^{D} \left((x_i^2 + x_{i+1}^2)^{0.25} \cdot \left((\sin 50(x_i^2 + x_{i+1}^2)^{0.1})^2 + 1\right)\right)$
f5 (Bohachevsky): $f(\vec{x}) = \sum_{i=1}^{D} \left(x_i^2 + 2x_{i+1}^2 - 0.3\cos(3\pi x_i) - 0.4\cos(4\pi x_{i+1}) + 0.7\right)$
f6 (Griewangk):   $f(\vec{x}) = \sum_{i=1}^{D} \frac{x_i^2}{4000} - \prod_{i=1}^{D} \cos\frac{x_i}{\sqrt{i}} + 1$
f7 (Shekel):      $f(\vec{x}) = \sum_{i=1}^{m} \frac{1}{c_i + \sum_{j=1}^{D} (x_j - \alpha_{ij})^2}$

TABLE II. PARAMETERS AND CORRESPONDING TUNING RANGES AND TUNED VALUES FOR EACH PROBLEM.

Param   Range      f1      f2      f3      f4      f5      f6      f7
µ       [1, 200]   199     147     190     150     5       186     195
g       [0, 15]    0.137   4.062   0       1.826   0.001   0.327   0.114
σ       [0, 2]     1.053   0.008   0.09    1.736   1.991   1.808   0.037
N       [1, 9]     6       6       6       5       6       7       6
kp      [1, 200]   21      2       32      3       4       2       121
ks      [1, 200]   163     16      3       9       196     6       127

2) Experiment 1: Adding variation around tuned values: In order to determine the effect of variation we use the tuned vectors $\vec{p}_i$ as a starting point and add some random variation. Specifically, at each generation, parameter values are drawn from a distribution (Gaussian and uniform distributions are tested). For each parameter a separate distribution is used, and the "centers" of these distributions are set to the tuned values (a sketch of these draws follows the list):

• Gaussian: for problem i, values for parameter j are drawn from a normal distribution $N(\vec{p}_i(j),\ d \cdot \vec{p}_i(j))$
• uniform: for problem i, values for parameter j are drawn uniformly from the interval $[\vec{p}_i(j) - \frac{w}{2},\ \vec{p}_i(j) + \frac{w}{2}]$, where $w = d \cdot \vec{p}_i(j)$
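A minimal sketch of these per-generation draws follows, under stated assumptions: clipping out-of-range draws to the tuning ranges of Table II and rounding the integer-valued parameters are our assumptions, as the paper does not specify how illegal values are handled. The tuned vector below is the f1 column of Table II.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tuned values for f1 from Table II (order: mu, g, sigma, n, kp, ks)
p = np.array([199.0, 0.137, 1.053, 6.0, 21.0, 163.0])
lo = np.array([1.0, 0.0, 0.0, 1.0, 1.0, 1.0])        # tuning ranges (Table II)
hi = np.array([200.0, 15.0, 2.0, 9.0, 200.0, 200.0])
INTEGER = np.array([0, 3, 4, 5])                      # mu, n, kp, ks are integers

def draw_gaussian(p, d, rng):
    """Gaussian variant: value_j ~ N(p_j, d * p_j)."""
    return rng.normal(loc=p, scale=d * p)

def draw_uniform(p, d, rng):
    """Uniform variant: value_j ~ U[p_j - w/2, p_j + w/2] with w = d * p_j."""
    w = d * p
    return rng.uniform(p - w / 2.0, p + w / 2.0)

def generation_values(p, d, dist, rng):
    vals = draw_gaussian(p, d, rng) if dist == "gaussian" else draw_uniform(p, d, rng)
    vals = np.clip(vals, lo, hi)             # assumption: clip to the legal ranges
    vals[INTEGER] = np.round(vals[INTEGER])  # assumption: round integer parameters
    return vals

print(generation_values(p, d=0.1, dist="gaussian", rng=rng))
```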

Several width coefficients d are tried: d = 0.01, 0.02, 0.03, 0.05, 0.1, 0.15, 0.2, 0.3, 0.5, 0.8. Separate runs are made with each parameter varied alone and with all parameters varied together. For every setting (i.e., combination of parameter, distribution, and d), the ES is run 30 times to derive statistically reliable results.

3) Experiment 2: "Tuning" the range of the variation: The above process attempts to determine whether adding some variance around the parameter values can result in improved performance; however, only a small number of hand-picked ranges (defined by the values of d) are tested. For a more thorough and rigorous test, the rationale of experiment 1 is maintained but we use Bonesa as a search algorithm to find good values for the standard deviation of the Gaussian distribution. Thus, for problem i, values for parameter j are drawn from a normal distribution $N(\vec{p}_i(j), \sigma_i^j)$, with every $\sigma_i^j$ being derived through a search process by Bonesa (one tuning process per problem was executed that concurrently tuned the deviations of all six parameters). If the tuning process of Bonesa for a $\sigma_i^j$ converges to a non-zero value, that indicates that some random variation is indeed beneficial. A much longer tuning process (25000 algorithm tests) was used for this experiment to increase the reliability of the results.


Due to time limitations, this experiment was performed only for function f1.

4) Experiment 3: "Tuning" all the settings of the variation: As a final test we make a fair comparison between the performance of the ES using static parameter values and its performance using varying values. Since the static values were derived through a tuning process, the settings that determine the varying values must, for a fair comparison, be calibrated on equal terms. Thus, an identical tuning process (using Bonesa with the same budget of 10000 algorithm tests) is used. Here, besides the normal and uniform random distributions employed previously, we also use an approach based on a sine wave, which is able to generate sequences of increasing and/or decreasing parameter values. For each variation mechanism, the following settings are tuned:

• Gaussian: for each problem i, a tuning process calibrates for each parameter j the mean $m_i^j$ and standard deviation $\sigma_i^j$ of the normal distribution. Thus, each tuning process tunes 12 values.
• uniform: for each problem i, a tuning process calibrates for each parameter j the minimum $l_i^j$ and the width $w_i^j$ of the range from which values are drawn. Thus, each tuning process tunes 12 values.
• sine: for each problem i, a tuning process calibrates for each parameter j the amplitude $A_i^j$, frequency $f_i^j$, angular frequency $\omega_i^j$ and phase $\phi_i^j$ that define a sine wave used as a deterministic schedule. Thus, each tuning process tunes 24 values.

After the tuning is complete, for each problem and variation setting combination, the ES is run 30 times to derive statistically reliable results.
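The paper does not give the exact waveform of the sine schedule, and the tuned settings list both a frequency and an angular frequency. The sketch below is therefore only one plausible reading: it folds the two rates into the single rate ω (as if ω = 2πf), oscillates around the midpoint of the legal range (an assumption, since no center value is among the tuned settings), and clips to that range. The settings shown are hypothetical.

```python
import math

def sine_schedule(t, A, omega, phi, lo, hi):
    """Parameter value at generation t under an assumed sine schedule."""
    center = (lo + hi) / 2.0             # assumption: oscillate around the range midpoint
    v = center + A * math.sin(omega * t + phi)
    return min(max(v, lo), hi)           # clip to the legal range

# Hypothetical tuned settings for the mutation step size sigma (range [0, 2])
for t in range(0, 50, 10):
    print(t, round(sine_schedule(t, A=0.8, omega=0.3, phi=0.0, lo=0.0, hi=2.0), 3))
```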

IV. RESULTS AND ANALYSIS

The results of experiment 1 are presented in Table III. The table shows the performance of the three ES variants run in this first experiment: with tuned static parameter values, with values drawn from a Gaussian distribution (for various values of d defining the standard deviation), and with values drawn from a uniform distribution (for various values of d defining the width). Emphasized numbers indicate a significant improvement over fixed parameters. These results suggest that merely varying the value of a parameter around its static value, without any strategy or purpose, can lead to better performance. For 4 out of 7 problems, and for 9 out of 49 combinations of problem and parameter, there exists some kind of variation that significantly improves performance. The Gaussian distribution appears more often, perhaps indicating that mild noise is preferable; however, there are also cases where drawing values from a uniform distribution is beneficial compared to keeping parameters fixed. An important observation is that, in most cases where changing parameter values is beneficial, performance improves as the range of the change becomes wider, with the best results achieved when the range is 80% of the center value. Figure 2 shows some examples where performance improves with variation and how this performance gain is influenced by the variation width d. The parameter that most often responds positively is the mutation step size σ, but there are also cases where varying the population size or the number of crossover points results in improvement. Finally, for function f2, varying all parameters together significantly improves performance while varying each parameter independently does not.

The results of experiment 2 are shown in Table IV. The best three vectors resulting from the tuning process are presented, each vector defining the standard deviations of the Gaussian distributions from which parameter values are drawn (the results concern only f1). For all parameters except g, the tuning process converged to deviation values far from zero, indicating that the existence of variation (non-zero deviation) was preferred by the tuning process. Using the best vector of deviations (and the tuned vector $\vec{p}_i$ for the mean values), the ES was run 30 times with Gaussian variation of all parameters. A comparison with keeping the parameters static (at the tuned values $\vec{p}_i$) is shown in Figure 1. The two best cases of experiment 1 for this problem are also included, namely varying all parameters with a width of d = 0.3 and varying only σ with a width of d = 0.8. The tuned deviations of this experiment produce better results than the static values and than varying all parameters with d = 0.3 (experiment 1). However, they are not better than varying only σ. This may be because varying one of the parameters other than σ has a very detrimental effect (for this problem and EA) while the tuning process was not able to set that parameter's deviation to zero.

The results of experiment 3 are presented in Table V, showing the performance of the ES using static parameter values and completely tuned variations (Gaussian and uniform distributions as well as a completely tuned sine function). Underlined values denote the best average performance and bold values indicate performance not significantly worse than the best. We can again see that varying the parameter values can result in better performance in some cases.

Fig. 1. Experiment 2: a comparison of the performance when keeping all parameters fixed at the tuned values (ST), when varying all parameters with a Gaussian distribution with tuned standard deviations (TD), when varying all parameters with a Gaussian distribution with d = 0.3 (VA), and when varying only σ with a Gaussian distribution with d = 0.8. The function used is f1 (Ackley), which is a minimization problem.

TABLE IV. RESULTS OF EXPERIMENT 2 SHOWING THE BEST THREE PARAMETER VECTORS FOUND BY BONESA. FOR EACH PARAMETER THE TUNED DEVIATIONS ARE PRESENTED.

Parameter   Vector 1   Vector 2   Vector 3
µ           17.923     19.616     25.785
g            0.025      0.06       0.03
σ            0.862      1.049      0.897
N            2.692      1.064      2.771
kp          25.464     21.966     25.662
ks          39.978     22.644     38.916

However, tuning the settings of the random variations (Gaussian and uniform) did not produce any improvement compared to the results acquired in experiment 1 with hand-picked d values (see Table III). For function f5 the performance of the tuned Gaussian is much worse than the performance acquired simply by setting the deviation of all parameters to $0.8 \cdot \vec{p}_i(j)$. Furthermore, for functions f2 and f4, while experiment 1 showed improvement when varying all parameters, here we see worse performance. It might be that the task of the tuning process is too hard when tuning the settings of the variation mechanisms, due to the number of values tuned: for the Gaussian and uniform distributions there are twice as many settings as when tuning static values (two settings per parameter), while for the sine wave the factor is four. Consequently, though the same tuning effort was spent on static values and on variation mechanisms, the outcomes are unbalanced. The tuned sine wave variation performs best on problem f7; the corresponding parameter variation is shown in Figure 3. Except for µ, the variation of all other parameters is just a very fast oscillation within a certain range, showing that tuning resulted in a process resembling random variation rather than a more "meaningful" schedule that a sine wave could express.
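The comparisons above are reported as significant at 0.95 confidence over 30 runs, but the paper does not name the statistical test used. The snippet below illustrates one common choice for such two-sample comparisons, Welch's t-test, on synthetic samples; it is an illustration of the methodology, not a reproduction of the paper's analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic stand-ins for 30 runs per setting (lower is better)
static_runs = rng.normal(2.67, 0.30, size=30)
varied_runs = rng.normal(2.41, 0.30, size=30)

# Welch's t-test: no equal-variance assumption between the two samples
t_stat, p_two_sided = stats.ttest_ind(varied_runs, static_runs, equal_var=False)
# One-sided reading at the 0.05 level: "varied" is significantly better (lower mean)
better = varied_runs.mean() < static_runs.mean() and p_two_sided / 2 < 0.05
print(f"t = {t_stat:.2f}, two-sided p = {p_two_sided:.4f}, significantly better: {better}")
```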

TABLE III. RESULTS OF EXPERIMENT 1. THE LEFT PART IS FOR THE GAUSSIAN DISTRIBUTION AND THE RIGHT FOR THE UNIFORM. THERE IS A SUBTABLE FOR EVERY FUNCTION AND DISTRIBUTION COMBINATION. FOR EACH SUBTABLE, EVERY LINE DENOTES WHICH PARAMETER IS VARIED (µ, g, σ, N, kp, ks, OR ALL). THE FIRST COLUMN OF EACH SUBTABLE SHOWS THE PERFORMANCE WHEN THE PARAMETER IS KEPT STATIC AND THE SUBSEQUENT COLUMNS SHOW THE PERFORMANCE WHEN THE PARAMETER IS VARIED WITH THE CORRESPONDING VALUE OF d (d = 0.01, 0.02, 0.03, 0.05, 0.10, 0.15, 0.20, 0.30, 0.50, 0.80). ALL NUMBERS ARE AVERAGES OVER 30 RUNS. EMPHASIZED VALUES SHOW PERFORMANCE THAT IS SIGNIFICANTLY BETTER THAN STATIC (WITH 0.95 CONFIDENCE). ALL FUNCTIONS ARE TO BE MINIMIZED.

[Full numeric subtables not reproduced here; static reference values: f1 = 2.67, f2 = 7.32, f3 = 7.56, f4 = 11.51, f5 = 16.16, f6 = 0.87, f7 = -1.97.]

Fig. 2. Four cases from the results of experiment 1: (a) Ackley, σ, Gaussian; (b) Rastrigin, all, Gaussian; (c) Rastrigin, N, uniform; (d) Bohachevsky, σ, uniform. Each subgraph shows the performance when varying a parameter (or all) according to a random distribution. The x-axis is the width d of the distribution. The horizontal dashed line shows the performance when keeping the parameter values static at the tuned values. The caption of each subgraph lists the test function, the parameter varied, and the type of the random distribution. Lower values are better for all functions.

Fig. 3. The parameter values over time when using the sine wave with the tuned settings from experiment 3 with f7: (a) µ, (b) g, (c) σ, (d) N, (e) kp, (f) ks. Each subgraph shows the values of one parameter over the generations.

TABLE V. RESULTS OF EXPERIMENT 3. FOR EACH PROBLEM, THE PERFORMANCES USING THE VARIATION METHODS WITH TUNED SETTINGS AND THE PERFORMANCE KEEPING THE VALUES FIXED TO THE TUNED VALUES ARE SHOWN. LOWER VALUES ARE BETTER FOR ALL FUNCTIONS. FOR EACH COLUMN, UNDERLINED VALUES DENOTE THE BEST AND BOLD VALUES DENOTE PERFORMANCE NOT SIGNIFICANTLY WORSE THAN THE BEST.

           f1      f2      f3      f4      f5      f6      f7
Gaussian   1.74    14.57   9.58    15.88   14.28   6.18    -1.26
uniform    3.08    12.17   13.28   18.34   21.76   3.54    -2.24
sine       3.21    48.27   38.76   14.13   15.40   17.14   -3.08
static     2.67    7.32    7.56    11.51   16.16   0.87    -1.97

V. CONCLUSIONS AND FUTURE WORK

In this paper we have investigated the effect of randomly changing the values of an evolutionary algorithm's parameters. Specifically, we put forward the hypothesis that random variation, without intelligence or strategy, can improve EA performance compared to keeping parameters fixed, simply by making multiple parameter values available to the evolutionary process. To test this hypothesis we performed three separate experiments examining the effect of randomly varying the parameter values. All three experiments showed that it is indeed possible to significantly improve the performance of an evolutionary algorithm by randomly changing its parameter values.

The results of this paper raise an important methodological issue. It is common practice in the literature presenting parameter control mechanisms to evaluate a controller by comparing it to the equivalent EA with static parameter values. However, as the results of this paper show, observing an improvement in such a comparison does not necessarily show that the controller is good, since it remains unknown whether the observed improvement is a consequence of the intelligent control strategy itself or merely of the variation of the values. We believe that a complete evaluation of a control mechanism should also include an analysis of how the parameters are varied during a run, and we suggest that a "naive" variation scheme for the same parameters should be included in the baseline benchmarks.

Future work will focus on comparing sophisticated parameter control approaches with the non-sophisticated random variation approach presented in the experimental part of this paper, to investigate the differences between the two in terms of performance.

REFERENCES

[1] J. Costa, R. Tavares, and A. Rosa. An experimental study on dynamic random variation of population size. In Systems, Man, and Cybernetics, 1999. IEEE International Conference on, volume 1, pages 607-612, 1999.
[2] K. A. De Jong. Parameter setting in EAs: a 30 year perspective. In Lobo et al. [10], pages 1-18.
[3] A. Eiben, M. Horvath, W. Kowalczyk, and M. Schut. Reinforcement learning for online control of evolutionary algorithms. In Brueckner, Hassas, Jelasity, and Yamins, editors, Proceedings of the 4th International Workshop on Engineering Self-Organizing Applications (ESOA'06), volume 4335 of Lecture Notes in Computer Science, pages 151-160. Springer, 2006.
[4] A. Eiben, Z. Michalewicz, M. Schoenauer, and J. Smith. Parameter control in evolutionary algorithms. In Lobo et al. [10], pages 19-46.
[5] Y. Gong and A. Fukunaga. Distributed island-model genetic algorithms using heterogeneous parameter settings. In IEEE Congress on Evolutionary Computation, pages 820-827, 2011.
[6] J. Hesser and R. Männer. Towards an optimal mutation probability for genetic algorithms. In H.-P. Schwefel and R. Männer, editors, Proceedings of the 1st Conference on Parallel Problem Solving from Nature, number 496 in Lecture Notes in Computer Science, pages 23-32. Springer, Berlin, Heidelberg, New York, 1991.
[7] R. Hinterding, Z. Michalewicz, and A. Eiben. Adaptation in evolutionary computation: a survey. In Proceedings of the IEEE International Conference on Evolutionary Computation, pages 65-69, 1997.
[8] T. Jansen and K. A. De Jong. An analysis of the role of offspring population size in EAs. In W. Langdon et al., editors, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2002), pages 238-246. Morgan Kaufmann, San Francisco, 2002.
[9] J. L. J. Laredo, C. Fernandes, J. J. Merelo, and C. Gagné. Improving genetic algorithms performance via deterministic population shrinkage. In F. Rothlauf, editor, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2009), pages 819-826. ACM, 2009.
[10] F. Lobo, C. Lima, and Z. Michalewicz, editors. Parameter Setting in Evolutionary Algorithms. Springer, 2007.
[11] F. G. Lobo. Idealized dynamic population sizing for uniformly scaled problems. In N. Krasnogor et al., editors, GECCO '11: Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, pages 917-924, Dublin, Ireland, 2011. ACM.
[12] B. McGinley, J. Maher, C. O'Riordan, and F. Morgan. Maintaining healthy population diversity using adaptive crossover, mutation, and selection. IEEE Transactions on Evolutionary Computation, 15(5):692-714, 2011.
[13] S. Meyer-Nieberg and H.-G. Beyer. Self-adaptation in evolutionary algorithms. In Lobo et al. [10], pages 47-76.
[14] Z. Michalewicz and M. Schmidt. Parameter control in practice. In Lobo et al. [10], pages 277-294.
[15] S. Smit and A. E. Eiben. Multi-problem parameter tuning using Bonesa. In J. Hao, P. Legrand, P. Collet, N. Monmarché, E. Lutton, and M. Schoenauer, editors, Artificial Evolution, pages 222-233, 2011.
[16] J. E. Smith and T. C. Fogarty. Operator and parameter adaptation in genetic algorithms. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 1:81-87, 1997.
[17] W. M. Spears. Adapting crossover in evolutionary algorithms. In Proceedings of the Fourth Annual Conference on Evolutionary Programming, pages 367-384. MIT Press, 1995.
[18] Y.-Y. Wong, K.-H. Lee, K.-S. Leung, and C.-W. Ho. A novel approach in parameter adaptation and diversity maintenance for genetic algorithms. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 7:506-515, 2003.