
Journal of Intelligent & Fuzzy Systems 27 (2014) 2305–2317 DOI:10.3233/IFS-141195 IOS Press

Improved environmental adaption method and its application in test case generation K.K. Mishra∗ , Shailesh Tiwari and A.K. Misra

Computer Science and Engineering Department, MNNIT Allahabad, Allahabad, India

Abstract. An Environmental Adaption Method (EAM) has recently been proposed in which each solution updates its structure on the basis of the current environmental fitness and its own fitness. EAM uses a new operator known as the adaption operator. In this paper, an improved EAM, henceforth called IEAM, is proposed, and the parameters of the algorithm are properly tuned to find the optimal solution in minimum time. IEAM has been tested on a set of benchmark functions. The results show the superiority of IEAM over PSO-TVAC, SADE and other optimization algorithms; even for complex functions, the performance of IEAM is very good. In addition, IEAM has been applied in software testing to generate test cases for white box testing.

Keywords: Evolutionary computing, improved EAM, PSO

1. Introduction
The term optimization refers to the minimization or maximization of a real function by systematically choosing the values of real or integer variables within an allowed set. Although a number of gradient-based approaches [19–21] are available in mathematics for identifying optimal solutions, writing a computer algorithm that implements an accurate solution of the underlying differential equations is very difficult. The practical alternative is to write a randomized algorithm that can search for the optimal solution in a large problem search space. In many cases, users have no idea about the nature of the function, so the direction of search cannot be predicted. Keeping this fact in mind, most of these randomized algorithms are derived from nature, because

∗ Corresponding author. Dr. Krishn Mishra, Computer Science and Engineering Department, MNNIT Allahabad, Allahabad, India. E-mail: [email protected].

nature provides a good framework for finding good solutions when the search direction is not clear. Many nature-inspired algorithms, such as Evolutionary Programming, Evolutionary Strategies, Differential Evolution, Genetic Programming [22–25], Genetic Algorithms [10] and Particle Swarm Optimization (PSO), have already gained wide acceptance for solving optimization problems. Although a number of randomized algorithms are available in the literature for solving optimization problems, their design objectives are the same. Each algorithm has been designed with certain goals, such as minimizing the total number of fitness evaluations needed to reach a nearly optimal solution and capturing diverse optimal solutions in multi-modal problems. Moreover, all these algorithms should be able to escape from local optimal solutions, which is possible only if the random parameters used in these algorithms are properly tuned [6, 7]. Each randomized algorithm uses operators that search for optimal solutions either by exploiting or by exploring the problem search space.

1064-1246/14/$27.50 © 2014 – IOS Press and the authors. All rights reserved

Among all these operators, only some govern the major portion of the search; these are known as dominating operators. They are important because, if their nature is known, one can easily determine for which type of optimization problem the algorithm is best suited. For example, if the major portion of the search in a randomized algorithm is done by operators that exploit the search space, the algorithm will be suitable for uni-modal problems, which have only one optimal solution: once the algorithm finds a good solution, it can exploit that region to retrieve the optimum. In multi-modal problems, however, there may be many optimal solutions, and the whole search space must be explored until a solution near the global optimum is captured. The requirements of uni-modal and multi-modal problems are therefore different. Even if an algorithm is very effective at solving uni-modal problems, it can be trapped in a local optimum on multi-modal problems, whereas the reverse is not true: any multi-modal optimization algorithm can be applied to solve uni-modal problems. For example, the main operators of the Genetic Algorithm and Particle Swarm Optimization focus on exploiting good regions, which is why these algorithms work well on uni-modal optimization problems. However, they are not good at capturing multiple solutions in multi-modal problems [8] and should be modified before being applied to them. Recently, an Environmental Adaption Method (EAM) [9] has been proposed for finding solutions to optimization problems. The algorithm uses three operators: adaption, alteration and selection. In EAM, the major contribution to the search is provided by the adaption operator, which explores the search space for good solutions; a minor contribution comes from the alteration operator, which exploits the search space.

EAM works well with both uni-modal and multi-modal problems. However, its performance can be enhanced further: to search the whole search space, the operator must exploit all finite regions having a very high probability of containing the optimal solution. In this way, an improved version can achieve a balance between exploitation and exploration. In the present paper, an improved version of the Environmental Adaption Method [26] is proposed in which some ideas taken from PSO are incorporated into EAM, so that its operators can exploit all good regions discovered by the adaption operator of EAM. The proposed algorithm selects these regions by utilizing the information received from the known phenotypic structures of the best solutions. The newly designed algorithm performs well on both types of problems, i.e. uni-modal and multi-modal. Different state-of-the-art algorithms are compared on benchmark functions to check its performance. Moreover, the algorithm has been applied to generate test cases for white box testing. Before discussing the proposed algorithm, let us take a brief overview of the Environmental Adaption Method and the Particle Swarm Optimization algorithm.

2. Background detail

2.1. Environment adaption method [EAM]

EAM is a computerized implementation of the process used by species as they adapt themselves to survive in a changing environment. According to this theory, all existing species have a tendency to update their phenotypic structure to attain a better phenotypic structure and to contribute to the creation of a better environment than the previous one. An environmental fitness, equal to the average fitness of all species residing in the current environment, is used to measure the fitness of the current and previous environments. Using its own fitness and the environmental fitness, each species refines its structure. During refinement, some alteration may occur due to environmental noise. Finally, the best species selected from the newly created species and the existing species form the new environment; species that do not contribute to the new environment die. Since genomic changes are driven by the current environment, the probability of obtaining a good species in the next generation is very high. Like GA and PSO, this is a population-based algorithm: it processes a randomly generated initial population using three operators. The first operator, the adaption operator, causes every solution of the current generation to update its phenotypic structure taking its current fitness and the environmental fitness into account. The second operator performs alteration, which may result from environmental noise. After both operators have been applied to the current generation, the next generation is generated. The selection operator then selects the best solutions from the combination of the current and previous generations, forming a current generation with better environmental fitness; solutions that are not selected are discarded. This process is repeated until either we get


the desired solution or the maximum number of iterations is reached.

Adaption operator: The adaption operator updates a solution Pi as follows:

Pi+1 = (α ∗ (decoded decimal value of the binary coding of Pi^n) ∗ F(Pi)/Favg + β) % (2^L)

where F(Pi) is the fitness value of Pi, α and β are random numbers to be decided as per the requirements of the problem, L is the total number of bits in an individual, and Favg is the average fitness value of the current population, representing the current environmental fitness.

Alteration: The alteration operator generates a new solution Pi+1 by flipping one or more bits of Pi. Alteration is carried out according to the following pseudo-code:

Begin
  for each member in the population
    for each bit in the member's genome
      invert the bit with a probability of Palt {Palt is the probability of alteration}
End

Each new member so generated is considered for selection for the next generation.

Selection: Best solutions equal to the size of the initial population are selected from the combination of the current and previous generations to form the current generation.

2.2. Particle swarm optimization [PSO]

PSO is a robust optimization technique, first proposed by J. Kennedy and R. Eberhart [1]. In this technique, a set of potential solutions (called particles) is dispersed at various points in the solution space. Each point has some objective function value, and thus a fitness, for its current location. Each particle's movement is influenced by two factors:
• the best fitness achieved by the particle so far (BFS);
• the best fitness achieved by any particle in the neighborhood (BFN).
Many variants of PSO have been proposed in the literature. In canonical PSO the neighborhood of a particle is the whole population, whereas in the local PSO model each particle updates its position by considering only some solutions as its neighbors. Interested readers may refer to [1–5] for detailed information. M. Senthil Arumugam et al. [6] proposed a new version of PSO (called GLBestPSO) which approaches the global best solution in a different manner; the same authors further modified the algorithm in 2009 [7]. Although PSO has a very good convergence rate, it may face the problem of stagnation. Tuning of the random parameters is a very important issue; if done properly, it can avoid premature convergence of PSO to local optimal solutions. Many papers [3–5] address this issue by giving various methods to tune the random parameters (like w, c1, c2, r1 and r2).

3. Proposed approach

Every optimization algorithm is designed to minimize the search time needed to reach the global optimal solution and to escape from local optimal solutions. This improved version of EAM is also designed to fulfill these requirements. To improve the convergence rate of the existing EAM and to prevent it from converging on a local optimal solution, a slight change has been made in the adaption operator of basic EAM. In EAM, each solution updated its structure only on the basis of the current environmental fitness and its own fitness; the best particles played no role. However, since initially no information is available about the structure of the optimal solution, all solutions should update their structure in the direction of the best solution, in the hope that the optimal structure exists somewhere near the structure of the best solution. This idea was implemented in PSO, and in the proposed version the same idea is used to modify the adaption operator of EAM. To avoid the problem of stagnation, the best solution still uses the adaption operator of EAM and explores the full search space. Moreover, the parameters α and β have been properly tuned so that the global optimum can be found as soon as possible. In improved EAM, all solutions other than the best use the following adaption operator:

Pi+1 = (α ∗ (decoded decimal value of the binary version of Pi^n) ∗ F(Pi)/Favg + β ∗ (Gi − Pi)) % (2^l)    (1)

where Gi is the position vector of the best particle and Pi is the position of the particle that is updating its structure. For the best solution, the following adaption operator will be used:


Pi+1 = (α ∗ (decoded decimal value of the binary version of Pi^n) ∗ F(Pi)/Favg + β) % (2^l)    (2)

Fig. 1. Improved environmental adaption method.

This adaption operator can be used both for exploitation and for exploration of the search space. In the updated version there is no change in the alteration and selection operators; this is represented in Fig. 1. In the proposed version, the parameters α and β are also set carefully so that the global optimal solution can be retrieved in early generations. For all solutions other than the best, the value of α is taken as 1 and the value of β is taken as high, around 0.5. For the best solution, the values of α and β are taken as very high, close to 2^l, and 0.5 respectively. This is done because the first requirement in all optimization problems is to find the optimal value as soon as possible: in initial generations these values force the algorithm to search for the optimum by exploiting the regions around the best solutions, while at the same time the best solution explores the whole search space to find other good regions. This process is repeated until all solutions tend to converge to one optimal structure. To check whether this condition has been reached, the value of Favg/F(Gi) is examined in each generation; with its help, one can identify whether the solutions are forming a cluster or not. In initial generations this value will be very low, say around 0.3 or 0.4, because the solutions are distributed over the full search space and the initial population contains a mixture of good and bad solutions. In late generations, when the whole population is converging toward one structure (which is possible only if all solutions are targeting either a local or the global optimum), this value approaches 1. To prevent the algorithm from being trapped in local optima, a check is applied when the value of Favg/F(Gi) becomes equal to or greater than 0.8; this procedure decides whether the algorithm is converging on the global optimal solution or not. When Favg/F(Gi) ≥ 0.8, the value of α is taken as very high (a value near the highest value). This is done to perturb the solutions of the current generation so that premature convergence can be avoided. The reason can be explained as follows: if all solutions of the previous generation were approaching the global optimum, no new solution will replace the existing solutions in the population, since the selection operator chooses only the best solutions. If the search was targeted toward a local best, this stretching of α will force the new generation to capture new good solutions and will again create an unstable population that contains a mixture of good and


bad solutions. This stretching of α is repeated again and again until a stable population is obtained. In this way, the algorithm is able to cover many optimal solutions in a minimum number of iterations. The proposed IEAM can be explained as follows.

3.1. Proposed algorithm

Types of encoding: As in EAM [9], any type of coding can be used with this optimization algorithm. With binary coding, solutions are represented as binary numbers; with real-parameter coding, the algorithm is applied directly to the actual values of each solution. IEAM works well with both types of encoding. The following steps are used in the binary version of IEAM.

Input: MAXG (maximum number of generations), PS (population size)
Output: Q* (final set)
Other variables: n (generation number), O' (intermediate offspring), O (final offspring), P (temporary pool), i_n (ith individual of the nth generation), Gi (best individual of the ith generation)

Steps:
1. Set n = 0.
2. Generate the initial population POP0 of PS random individuals.
3. Apply the improved adaption operator (calculation given below) to each member of population POPn and form On'.
   For the best solution:
   Gi+1 = (α ∗ (decoded decimal value of the binary version of Gi) ∗ F(Gi)/Favg + β) % (2^l)
   For all other solutions:
   Pi+1 = (α ∗ (decoded decimal value of the binary version of Pi) ∗ F(Pi)/Favg + β ∗ (Gi − Pi)) % (2^l)
4. Apply the alteration operator to On' as per the probability of alteration and form On.
5. Pn = {POPn ∪ On}.
6. n = n + 1.
7. If n < MAXG then go to 10, else go to 8.
8. Set Q* = the best solutions of Pn, chosen on the basis of the fitness function.
9. Return Q*.
10. POPn = the PS fittest individuals selected from Pn−1 on the basis of the fitness function; go to 4.

3.2. Details of algorithm

1) Population initialization: A fixed number of random initial solutions serve as the first-generation population.
2) Creating the next generation: Applying the improved adaption, alteration and selection operators to the current generation generates the next generation. The function of each operator can be explained as follows.
2a) Improved adaption operator: The adaption operator can both explore and exploit the problem search space. All solutions other than the best use the following equation for adaption:

Pi+1 = (α ∗ (decoded decimal value of the binary version of Pi^n) ∗ F(Pi)/Favg + β ∗ (Gi − Pi)) % (2^l)


where Gi is the position vector of the best particle and Pi is the position of the particle that is updating its structure. For the best solution, the following adaption operator is used:

Xi+1 = (α ∗ (decoded decimal value of the binary version of Xi^n) ∗ F(Xi)/Favg + β) % (2^l)

where F(Xi) is the fitness value of Xi, α and β are random numbers to be decided as per the requirements of the problem, l is the total number of bits in an individual, and Favg is the average fitness value of the current population.

2b) Alteration operator: In early generations, this operator is responsible for exploiting a particular region; in late generations, it works as an explorer. Alteration is carried out according to the following pseudo-code:


for each member in the population
  for each bit in the member's genome
    invert the bit with a probability of Palt
Each new member so generated is considered for selection for the next generation.

2) Selection: Best solutions equal to the size of the initial population are selected from the combination of the next generation and the previous generation to form the current generation.
3) Generation step and evolution: Each generation step consists of generating new offspring by the formulae described above, followed by alteration of the existing population with probability Palt and selection of the best-fit members generated. Figure 2 describes the working of this step.
4) This evolution process is repeated until either a solution with the desired fitness is obtained or the number of generations exceeds the maximum number of generations. A detailed flowchart of the above procedure is given in Fig. 3.

Fig. 2. Generation step and evolution of IEAM.

4. Result analysis

Fig. 3. Flow chart of IEAM.

The performance of IEAM is compared with the SADE, PSO, IPOP-SEP-CMA-ES, AMALGAM IDEA and BIPOP-CMA-ES algorithms. To check the efficiency of the improved version of the Environmental Adaption Method (IEAM), the algorithm has been applied to standard benchmark functions: Axis parallel hyper-ellipsoid, Sphere, Quartic, Rastrigin, Griewank, Schwefel and Zakharov. Of these, Axis parallel hyper-ellipsoid, Sphere and Quartic are uni-modal functions, while Rastrigin, Griewank, Schwefel and Zakharov are multi-modal functions. All benchmark functions were analyzed for dimensions D = 10, 20 and 30. In the first experiment the initial population size for each algorithm was 50, which was extended to 100 in later experiments; each experiment was repeated five times, the average values are shown in the tables, and the number of generations is G = 1000. The parameters of SADE need not be initialized, as they are updated automatically. Each algorithm was executed five times; the reported results are the average of the best minimum values obtained and the average of the standard deviations over the five runs. Statistical data containing the minimum and the standard deviation have been calculated for each experiment. The detailed result analysis for each function is given below. Table 1 shows the different benchmark functions used in this paper.

A) Uni-modal functions.
1) Axis parallel hyper-ellipsoid function: From Table 2 and Fig. 4, it is clear that the performance of IEAM is far better than that of SADE and PSO-TVAC on the Axis parallel hyper-ellipsoid function. Its performance is a little better than IPOP-SEP-CMA-ES and AMALGAM IDEA and almost equal to BIPOP-CMA-ES. When the population size is increased to 100, it performs better than BIPOP-CMA-ES on higher dimensions.
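The per-run statistics reported in the tables (the mean of the best minima and the standard deviation over the five repeated runs) can be computed with a small helper; the sketch below is illustrative and not part of the original experimental code.

```python
from statistics import mean, stdev

def summarize_runs(best_values):
    """Summarize the best minima found in repeated runs of an algorithm.

    Returns (mean, sample standard deviation), as reported in the tables.
    """
    return mean(best_values), stdev(best_values)

# Example: best minima from five hypothetical runs
m, s = summarize_runs([1.0, 2.0, 3.0, 4.0, 5.0])
```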


Table 1
List of benchmark functions

Function                        Formula                                                              Range
Axis parallel hyper-ellipsoid   f(x) = Σ_{i=1}^{n} i·x_i^2                                           [−5.12, 5.12]
Sphere                          f(x) = Σ_{i=1}^{n} x_i^2                                             [−100.0, 100.0]
Quartic                         f(x) = Σ_{i=1}^{n} i·x_i^4 + Rand[0, 1]                              [−1.28, 1.28]
Rastrigin                       f(x) = Σ_{i=1}^{n} [x_i^2 − 10·cos(2πx_i) + 10]                      [−10.0, 10.0]
Griewank                        f(x) = (1/4000)·Σ_{i=1}^{n} x_i^2 − Π_{i=1}^{n} cos(x_i/√i) + 1      [−600.0, 600.0]
Schwefel                        f(x) = Σ_{i=1}^{n} −x_i·sin(√|x_i|)                                  [−500.0, 500.0]
Zakharov                        f(x) = Σ_{i=1}^{n} x_i^2 + (Σ_{i=1}^{n} 0.5·i·x_i)^2 + (Σ_{i=1}^{n} 0.5·i·x_i)^4   [−5.0, 10.0]
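For reference, three of the benchmark functions in Table 1 can be implemented directly from their conventional definitions; the Python sketch below shows the Sphere, Rastrigin and Griewank functions (all three have their global minimum 0 at the origin).

```python
import math

def sphere(x):
    """Sphere: f(x) = sum(x_i^2)."""
    return sum(xi * xi for xi in x)

def rastrigin(x):
    """Rastrigin: f(x) = sum(x_i^2 - 10*cos(2*pi*x_i) + 10)."""
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)

def griewank(x):
    """Griewank: f(x) = sum(x_i^2)/4000 - prod(cos(x_i/sqrt(i))) + 1."""
    s = sum(xi * xi for xi in x) / 4000.0
    p = 1.0
    for i, xi in enumerate(x, start=1):  # indices are 1-based in the product term
        p *= math.cos(xi / math.sqrt(i))
    return s - p + 1.0
```

Any candidate solution decoded from the binary encoding of Section 3 can be passed to these functions to obtain its fitness.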

Table 2 Statistical analysis of SADE, PSO-TVAC, IPOP-SEP-CMA-ES, AMALGAM IDEA, BIPOP-CMA-ES, IEAM for Axis parallel hyper-ellipsoid function Population Iterations Dimension SADE

PSO-TVAC

IPOP-SEP-CMA-ES AMALGAM IDEA BIPOP-CMA-ES IEAM

50

22.748911 1.87299765 25.7844685 2.97845233 30.5678456 3.2345314 12.422762 1.7251827 15.446862 1.7869463 20.564616 3.22729384

20.993261 1.559346 22.15574962 2.5701364 29.7451203 2.7541102 11.653212 1.5677412 12.1841603 1.5945671 20.231497 3.1164379

20 30 100

1000

10 20 30

22.934678 2.2987444 27.876665 2.994761 30.2334456 3.5566823 15.80040 1.987444 17.987451 2.1231462 24.2417559 3.41635301

20.97896 1.540693 21.984342 2.5063417 29.5630127 2.6745012 11.041523 1.5014588 11.041523 1.4654102 20.0197456 3.021478

20.834189 1.530143 21.436518 2.4578403 28.9451203 2.5521436 10.984423 1.4562412 11.3226753 1.4087452 19.7545216 2.962037

20.957634 1.535624 21.3568923 2.4035745 28.0432667 2.5145217 10.958437 1.3562412 11.3226753 1.103668 19.6564374 2.7541098

Bold value represents standard deviation (std).

10

(MEAN STD. DEV)

1000

Fig. 4. Comparison among SADE, PSO-TVAC, IPOP-SEP-CMAES, AMALGAM IDEA, BIPOP-CMA-ES and IEAM on Axis Parallel Hyper-Ellipsoid for Min value (population size 100).

2) Sphere: The Sphere function is a convex uni-modal function. For the Sphere function, a big difference was not observed between smaller and higher dimensions. BIPOP-CMA-ES performs slightly better than IEAM at the higher dimension of 30. Surprisingly, for the intermediate dimension of 20 with a population size of 50, SADE and IEAM perform better than all the other algorithms. Table 3 and Fig. 5 confirm this analysis.

Fig. 5. Comparison among SADE, PSO-TVAC, IPOP-SEP-CMAES, AMALGAM IDEA, BIPOP-CMA ES and IEAM on Sphere for Min value (population size 100).

3) Quartic function: The Quartic function is also a uni-modal function, of degree four. BIPOP-CMA-ES outperforms IEAM on all dimensions for both the 50 and

Table 3
Statistical analysis of SADE, PSO-TVAC, IPOP-SEP-CMA-ES, AMALGAM IDEA, BIPOP-CMA-ES, IEAM for Sphere function

Population

Iterations

Dimension

SADE

PSO-TVAC

IPOP-SEP-CMA-ES

AMALGAM IDEA

BIPOP-CMA-ES

IEAM

(MEAN STD. DEV) 1000

10 20 30

100

1000

10 20 30

0.013378 0.000905 0.012201 0.001438 0.011783 0.000809 0.012879 0.001230 0.012418 0.001139 0.012165 0.000936

0.013072 0.001240 0.013031 0.000708 0.011608 0.000895 0.012870 0.001464 0.012469 0.001011 0.012371 0.000871

0.0156458 0.003215 0.0164231 0.0023654 0.0113284 0.00051265 0.013607 0.001397 0.012582 1.5945671 0.010178 0.0007561

Bold value represents standard deviation (std).

0.0144684 0.003164 0.012146 0.0031549 0.0119853 0.0003645 0.015309 0.001315 0.013461 1.4654102 0.012643 0.0008934

50

0.012616 0.001876 0.013324 0.0002679 0.0114568 0.00022564 0.0113655 0.0010159 0.012984 1.4087452 0.015324 0.0009214

0.0111231 0.0010022 0.0116547 0.001047 0.0118876 0.000902 0.0111231 0.0010022 0.0116547 0.001047 0.0118876 0.000902

Population Iterations Dimension SADE

PSO-TVAC

50

0.00997766 0.000324553 0.0143444 0.0003328 0.02710098 0.0039933 0.00897688 0.0002538 0.00957723 0.0003328 0.01098763 0.0009338

Table 4 Statistical analysis of SADE, PSO-TVAC, IPOP-SEP-CMA-ES, AMALGAM IDEA, BIPOP-CMA-ES, IEAM for Quartic function IPOP-SEP-CMA-ES AMALGAM IDEA BIPOP-CMA-ES IEAM (MEAN STD. DEV)

20 30 100

1000

10 20 30

0.001661722 0.00455642 0.0172334 0.00477042 0.0282175 0.00499168 0.0071780 0.0006425 0.0102312 0.001042 0.0122465 0.001678

Bold value represents standard deviation (std).

0.0067942 0.00035975 0.012649 0.0003658 0.0285602 0.00036211 0.0061613 0.00020361 0.0076122 0.0002017 0.0140394 0.00044194

10

1000

0.007910 0.0004065 0.0171516 0.0035543 0.0290317 0.00031302 0.0073911 0.0001906 0.0061975 0.0002169 0.0173103 0.00043129

0.002609 0.0002536 0.010319 0.00020059 0.0263458 0.00015894 0.0060324 0.0000941 0.007263 0.0001852 0.0094325 0.00038261

0.008654 0.0002002 0.0110725 0.0002153 0.02600501 0.0002429 0.006837 0.0001001 0.0072543 0.0002153 0.00897562 0.0004426

100 population sizes. As shown in Table 4 and Fig. 6, IEAM still gives much better results than SADE, PSO-TVAC and IPOP-SEP-CMA-ES.

Fig. 6. Comparison among SADE, PSO-TVAC, IPOP-SEP-CMAES, AMALGAM IDEA, BIPOP-CMA ES and IEAM on Quartic for Min value (population size 100).

B) Multi-modal functions.
1) Rastrigin function: The Rastrigin function is one of the simpler multi-modal functions, with cosine modulation. It may have many local minima with a unique global minimum. For this multi-modal function we get a series of good results in which IEAM is better than its counterparts: the minimum values are always better than those of SADE, PSO-TVAC, IPOP-SEP-CMA-ES, AMALGAM IDEA and BIPOP-CMA-ES. Table 5 and Fig. 7 show the result analysis.
2) Griewank function: Griewank is also a multi-modal function with cosine modulation. It is clear from Table 6 and Fig. 8 that the gap in performance between IEAM and all the other algorithms is huge; IEAM clearly outmatches all of them by a large margin.


Table 5 Statistical analysis of SADE, PSO-TVAC, IPOP-SEP-CMA-ES, AMALGAM IDEA, BIPOP-CMA-ES, IEAM for Rastrigrin function Population

Iterations

Dimension

SADE

PSO-TVAC

IPOP-SEP-CMA-ES

AMALGAM IDEA

BIPOP-CMA-ES

IEAM

1.8465213 0.210069 11.492325 0.9451303 29.946264 1.49677403 1.7854209 0.0000941 10.9842601 0.8471261 28.369461 0.00038261

1.725166 0.200157 10.478592 0.800209 28.353167 1.4223344 1.635162 0.200157 10.478592 0.800209 28.353167 1.4223344

(MEAN STD. DEV) 1000

10 20 30

100

1000

10 20 30

2.813421 0.416443 15.778827 1.507193 37.473776 1.978166 2.921780 0.325387 15.963878 1.507193 36.973841 2.325681

1.889838 0.220286 11.436818 1.336443 30.845319 1.431310 1.897269 0.201921 11.390910 0.771820 30.725416 2.032141

2.469263 0.216425 11.365902 1.1298604 32.976456 1.5006943 1.8206974 0.00020361 11.6064962 0.8106952 30.8740319 0.00044194

2.7646215 0.221643 11.294094 1.031264 30.946132 1.4979561 1.7746532 0.0001906 11.2974620 0.8646294 29.9495261 0.00043129

50

Bold value represents standard deviation (std).

Population

Iterations

Dimension

SADE

PSO-TVAC

Table 6 Statistical analysis of SADE, PSO-TVAC, IPOP-SEP-CMA-ES, AMALGAM IDEA, BIPOP-CMA-ES, IEAM for Griewank function IPOP-SEP-CMA-ES

AMALGAM IDEA

BIPOP-CMA-ES

IEAM

(MEAN STD. DEV)

10 20 30

100

1000

10 20 30

0.057967 0.016779 0.031974 0.004682 0.021444 0.001739 0.056765 0.012475 0.030488 0.004266 0.021481 0.002026

0.0369462 0.017954 0.029462 0.0029462 0.0194626 0.001976 0.040952 0.040384 0.024462 0.0026543 0.016432 0.001169

0.0319796 0.019496 0.0219792 0.0029796 0.0144962 0.0014632 0.0394612 0.0390096 0.025646 0.0028976 0.014623 0.0001199

0.03974126 0.011616 0.0279567 0.0024913 0.0149865 0.0016324 0.034976 0.021943 0.023946 0.0025973 0.019732 0.001267

0.031867 0.010331 0.022300 0.002132 0.014538 0.001035 0.031867 0.010331 0.022300 0.003247 0.014538 0.001354

Bold value represents standard deviation (std).

0.036360 0.009843 0.023736 0.002594 0.014623 0.001147 0.044003 0.007963 0.023083 0.002390 0.015396 0.001124

1000

50

Fig. 7. Comparison among SADE, PSO-TVAC, IPOP-SEP-CMAES, AMALGAM IDEA, BIPOP-CMA ES and IEAM on Rastrigin for Min value (population size 100).

Fig. 8. Comparison among SADE, PSO-TVAC, IPOP-SEP-CMAES, AMALGAM IDEA, BIPOP-CMA ES and IEAM on Griewank for Min value (population size 100).

3) Schwefel: Schwefel is one of the toughest benchmark functions; it is a multi-modal function with sine modulation. Note that we execute two test cases for Schwefel: one with a two-dimensional search space and a population size of 50, and the other with a two-dimensional search space and a population size of 100. Table 7 and Fig. 9 give an idea of the performance of IEAM. For the lower population size we get very good results for IEAM, and for the higher population size,

Table 7
Statistical analysis of SADE, PSO-TVAC, IPOP-SEP-CMA-ES, AMALGAM IDEA, BIPOP-CMA-ES, IEAM for Schwefel function

Population

Iterations

Dimension

SADE

PSO-TVAC

IPOP-SEP-CMA-ES

AMALGAM IDEA

BIPOP-CMA-ES

IEAM

0.003163 0.000756 0.003719 0.001496

0.003227 0.001277 0.003227 0.001277

(MEAN STD. DEV) 50

2

1000

100

2

1000

0.004142 0.001577 0.004474 0.001914

0.003010 0.001188 0.003397 0.001035

0.003461 0.000756 0.003449 0.001349

0.003713 0.000756 0.003603 0.001764

Bold value represents standard deviation (std).

Population Iterations Dimension

SADE

PSO

Table 7b Statistical analysis of SADE, PSO-TVAC, IPOP-SEP-CMA-ES, AMALGAM IDEA, BIPOP-CMA-ES, IEAM for Zakharov function IPOP-SEP-CMA-ES AMALGAM IDEA BIPOP-CMA-ES

IEAM

(MEAN STD. DEV) 1000

10 20 30

2.785612 0.0187274 7.5908371 0.2234146 10.2541675 0.423163530

2.4352722 0.0173518 5.4688862 0.20899943 10.1564261 0.322073293

2.409416 0.017976 5.29762 0.21495603 10.106462 0.3006464

2.386405 0.0161561 5.009462 0.2079462 10.116462 0.2867952

50

2.0058667 0.01356243 3.33467012 0.18103634 10.004574 0.23340987

For Zakharov also, IEAM is better than its counterparts on the lower dimensions. The mean error and standard deviation were calculated for each algorithm over 25 runs. Wilcoxon's rank sum test was performed between IEAM and SADE, PSO, IPOP-SEP-CMA-ES, AMALGAM IDEA and BIPOP-CMA-ES at the 0.05 significance level; "−" denotes that the performance of an algorithm is worse than, "+" better than, and "≈" similar to the performance of the proposed IEAM.

AU

TH

OR

Bold value represents standard deviation (std).

2.29746 0.015762 4.66264 0.206796 10.19562 0.25976203

Fig. 9. Comparison among SADE, PSO-TVAC, IPOP-SEP-CMAES, AMALGAM IDEA, BIPOP-CMA-ES and IEAM on Schwefel for Min value (Population size 100).

its performance remains unchanged and is far better than the others. 4) Zakharov: Zakharov is also a complex multi-modal function. We evaluate Zakharov in three dimensions (10, 20 and 30) with population size 50; the evaluation is shown in Table 7b. The minimum function values differ very little; for example, for dimension 30 and population size 100, the minimum values were 2.043, 2.006, 2.0058, 2.0035, 2.009 and 1.999 for SADE, PSO-TVAC, IPOP-SEP-CMA-ES, AMALGAM IDEA, BIPOP-CMA-ES and IEAM, respectively.
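The two functions discussed in this subsection can be stated concretely. Below is a sketch in Python of the standard textbook forms; the paper does not say whether shifted or rotated variants were used, so the plain forms are assumed:

```python
import math

def schwefel(x):
    # Standard Schwefel function: multi-modal with sine modulation;
    # global minimum near 0 at x_i = 420.9687 on [-500, 500]^d.
    d = len(x)
    return 418.9829 * d - sum(xi * math.sin(math.sqrt(abs(xi))) for xi in x)

def zakharov(x):
    # Standard Zakharov function; global minimum f(0) = 0.
    s1 = sum(xi * xi for xi in x)
    s2 = sum(0.5 * (i + 1) * xi for i, xi in enumerate(x))
    return s1 + s2 ** 2 + s2 ** 4
```

For instance, `zakharov([1.0])` evaluates to 1 + 0.25 + 0.0625 = 1.3125, and `schwefel` is nearly zero at the known optimum.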

5. Real life application of IEAM

Although IEAM has been established as an optimization algorithm, its performance has not yet been checked on a real-life optimization problem. In this section, a real-life application of IEAM is discussed. The problem is related to software testing, and its main objective is to generate test cases for white box testing. Software testing is the most commonly used technique for improving the quality of software. It is a labor-intensive process and typically accounts for about half of the total cost of software development and maintenance [11]. To gain confidence that software will work as per user expectations, newly developed software must be carefully exercised by fault-revealing test cases. Searching for appropriate test data is a difficult optimization problem in which a set of test cases is to be searched from


Table 8
Performance evaluation of IEAM by Wilcoxon's rank sum test

| Performance | SADE | PSO | IPOP-SEP-CMA-ES | AMALGAM IDEA | BIPOP-CMA-ES |
|---|---|---|---|---|---|
| + | 20 | 19 | 14 | 13 | 11 |
| − | 1 | 2 | 2 | 4 | 6 |
| ≈ | 4 | 4 | 6 | 8 | 8 |
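The +/−/≈ counts in Table 8 are obtained from pairwise rank-sum tests over the 25 runs per function. The per-run data is not reproduced in the paper, so the following is only a sketch of one such comparison, using the normal approximation to the rank-sum statistic (tie-averaged ranks are omitted; SciPy's `ranksums` provides the full test):

```python
import math

def rank_sum_compare(a, b, z_crit=1.96):
    """Two-sided Wilcoxon rank-sum test via the normal approximation
    (z threshold 1.96 corresponds to the 0.05 significance level).
    Returns '+' if sample a is significantly larger than sample b,
    '-' if significantly smaller, and '≈' otherwise."""
    n1, n2 = len(a), len(b)
    # Pool both samples and rank them; ties are not rank-averaged here.
    combined = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    r1 = sum(rank for rank, (v, src) in enumerate(combined, start=1) if src == 0)
    mu = n1 * (n1 + n2 + 1) / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (r1 - mu) / sigma
    if z > z_crit:
        return '+'
    if z < -z_crit:
        return '-'
    return '≈'
```

In a benchmarking setting the two samples would be the 25 per-run error values of IEAM and of one competitor; since lower error is better, the sign of the outcome must be interpreted accordingly.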

the big input domain of the variables to optimize certain coverage criteria. Hence, there is a need to generate test data automatically to save development time and cost; some techniques are mentioned in references [12–18]. Although many meta-heuristic approaches have been used to automate the process of test case generation with some success, there is still a need for a new meta-heuristic algorithm that minimizes the overall time for test case generation and reduces the overall production cost. A good meta-heuristic algorithm will be general and robust, and will generate the right test cases corresponding to the testing criteria for use in the real world of software testing [12]. Sometimes identifying the right test case is difficult because conditions or predicates in the software restrict the input domain, that is, the set of valid data. Therefore, the search algorithm must be effective enough to decide where the best values (test cases) may lie and to concentrate its search there. Hence, there is a need for a new meta-heuristic algorithm that can search the whole search space and produce better test cases in fewer iterations. To produce effective test data, this paper proposes the application of IEAM to white box testing. Branch coverage is taken as the adequacy criterion to determine when to end the testing process. Results generated by the proposed algorithm are compared with those of other algorithms; the comparisons show the effectiveness of the proposed approach.

5.1. Test case generation using IEAM for white box testing

In this section, IEAM is used to generate new test cases whose branch coverage is better than that of existing test cases. First of all, an initial set of test cases is generated randomly from the domain of the input variables. To generate new test cases, IEAM needs an input population; the set of randomly generated test cases works as the initial population for IEAM. For test case generation, the framework shown in Fig. 10 has been used. This framework has three modules:

1) Starting module
2) Optimization module
3) Test case coverage module

Fig. 10. Test case generation framework.

Starting module: This module contains all the information required to start the other modules. The following parameters are stored in this module:
1. Information required for starting IEAM: maximum number of generations, number of solutions in the population, and type of coding.
2. Information required to generate a new population: number of test data in each individual, type of input variable, and values of population-related parameters; for example, in the case of IEAM this stores the value of Favg (the average fitness of the population).
3. The fitness function used to evaluate the fitness of an individual, and information about the test tool.

Optimization module: This module implements the functionality of the optimization algorithm. Its purpose is to generate the initial population and new populations with the help of the necessary operators.

Test case coverage module: This module implements the functionality of the coverage tool.

IEAM has been used as the optimization algorithm to implement this framework. It searches for test cases that have better branch coverage than existing test cases; the branch coverage of a test case is taken as the fitness function. The Gcov coverage tool (available as free open-source software at http://gcc.gnu.org/onlinedocs/gcc/Gcov.html) is used to identify the percentage of branches covered by each test case. The process of test case generation with IEAM can be explained by the following pseudo code.
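For the coverage module, the per-file branch-coverage percentage has to be pulled out of Gcov's textual summary. A small sketch follows; it assumes the "Taken at least once:" summary line that `gcov -b` prints, whose exact wording may vary across gcc versions:

```python
import re

def branch_coverage(gcov_summary: str) -> float:
    """Extract the branch-coverage percentage ("Taken at least once")
    from a `gcov -b` summary; returns 0.0 if the line is absent."""
    m = re.search(r"Taken at least once:\s*([\d.]+)%", gcov_summary)
    return float(m.group(1)) if m else 0.0
```

In the framework this number would be computed for each test case and used directly as its fitness value.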

1. Create an initial population of size N by randomly selecting N test cases from the input domain of the variables. Test cases must be represented in binary.
2. While (number of generations < maximum generations)
   (a) Apply Gcov and calculate the percentage of branches covered by each test case; this works as the fitness of each solution.
   (b) Calculate the average fitness by taking the average of the fitness values of all solutions.
   (c) Apply the improved adaption operator to each test case and create an intermediate population.
   (d) Apply the alteration operator to some solutions of the intermediate population.
   (e) Merge the initial population with the intermediate population and sort the merged population according to branch coverage.
   (f) Select the N best solutions and generate the new population.
   (g) Number of generations = number of generations + 1
3. End while.
4. Print the final test suite.

The proposed algorithm has been applied to generate test cases for white box testing; its goal is to generate test cases with higher branch-coverage capability than the initial test cases. To gauge its performance, IEAM has been applied to four programs: 1) Triangle program, 2) Max-Find program, 3) F-Closure and 4) First-fit algorithm. We have compared the performance of IEAM with Genetic Algorithms (GA) and EAM. Twenty iterations have been used, and it is seen that EAM and IEAM have a very fast convergence rate. The fitness values are shown in Table 9.

Table 9
Comparison of IEAM with GA and EAM for test case generation (branch coverage, %)

| Generation | Triangle GA | Triangle EAM | Triangle IEAM | Max-Find GA | Max-Find EAM | Max-Find IEAM | F-closure GA | F-closure EAM | F-closure IEAM | First-fit GA | First-fit EAM | First-fit IEAM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 67.5 | 68.75 | 68.75 | 68.5 | 55 | 55 | 30.94375 | 30.7 | 30.7 | 91.300003 | 87.05 | 87.05 |
| 2 | 68.68 | 75 | 68.75 | 68.93 | 72.5 | 80 | 29.70553 | 38.9 | 38.55 | 90.17424 | 96.4 | 96.4 |
| 3 | 64.93 | 75 | 75 | 71.93 | 72.5 | 80 | 35.65553 | 40.8 | 40.35 | 95.42424 | 98.2 | 97.9 |
| 4 | 64.93 | 75 | 75 | 53.93 | 72.5 | 80 | 28.95553 | 42.1 | 41.65 | 87.92424 | 99.7 | 99.1 |
| 5 | 68.68 | 75 | 75 | 53.43 | 72.5 | 80 | 34.10553 | 42.5 | 42.9 | 93.77425 | 100 | 100 |
| 6 | 67.43 | 75 | 75 | 64.43 | 72.5 | 80 | 33.55553 | 42.7 | 43.2 | 85.92424 | 100 | 100 |
| 7 | 69.93 | 75 | 75 | 60.93 | 72.5 | 80 | 28.80553 | 43 | 43.7 | 94.07424 | 100 | 100 |
| 8 | 71.18 | 75 | 75 | 63.43 | 72.5 | 80 | 30.50553 | 43.25 | 43.85 | 84.32424 | 100 | 100 |
| 9 | 64.93 | 75 | 75 | 59.43 | 72.5 | 80 | 33.65553 | 43.35 | 44.05 | 90.02425 | 100 | 100 |
| 10 | 68.68 | 75 | 75 | 60.93 | 72.5 | 80 | 30.60553 | 43.7 | 44.2 | 89.87425 | 100 | 100 |
| 11 | 67.43 | 75 | 75 | 67.93 | 72.5 | 80 | 30.45553 | 43.8 | 44.25 | 90.32424 | 100 | 100 |
| 12 | 62.43 | 75 | 75 | 55.93 | 72.5 | 80 | 31.95553 | 43.8 | 44.5 | 91.52425 | 100 | 100 |
| 13 | 69.93 | 75 | 75 | 59.43 | 72.5 | 80 | 30.95553 | 43.8 | 44.6 | 87.47424 | 100 | 100 |
| 14 | 68.68 | 75 | 75 | 54.93 | 72.5 | 80 | 30.20553 | 43.8 | 44.65 | 87.17424 | 100 | 100 |
| 15 | 66.18 | 75 | 75 | 60.43 | 72.5 | 80 | 32.10553 | 43.8 | 44.65 | 89.87425 | 100 | 100 |
| 16 | 66.18 | 75 | 75 | 59.93 | 72.5 | 80 | 34.75553 | 43.8 | 44.65 | 90.12425 | 100 | 100 |
| 17 | 66.18 | 75 | 75 | 58.43 | 72.5 | 80 | 31.20553 | 43.85 | 44.65 | 82.17424 | 100 | 100 |
| 18 | 67.43 | 75 | 75 | 61.93 | 72.5 | 80 | 31.45553 | 43.9 | 44.75 | 85.22424 | 100 | 100 |
| 19 | 68.68 | 75 | 75 | 59.93 | 72.5 | 80 | 36.75553 | 44.7 | 44.85 | 87.92424 | 100 | 100 |
| 20 | 69.93 | 75 | 75 | 65.5 | 72.5 | 80 | 32.905523 | 45.5 | 44.849 | 92.57424 | 100 | 100 |
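The generational loop of the pseudo code above can be sketched in Python. This is a simplified illustration, not the paper's exact method: the real improved adaption and alteration operators are defined elsewhere in the paper, so a plain bit-flip stands in for them here, and an injectable fitness function replaces the Gcov-based branch coverage.

```python
import random

def generate_tests(fitness, n_bits=16, pop_size=20, max_gen=30, seed=1):
    """Generational loop of the framework: evaluate, adapt toward the
    average fitness, alter (mutate), merge, and keep the N best.
    `fitness` maps a binary-encoded test case to a value in [0, 1]."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(max_gen):
        scores = [fitness(ind) for ind in pop]            # step (a)
        f_avg = sum(scores) / len(scores)                 # step (b): Favg
        intermediate = []
        for ind, f in zip(pop, scores):                   # step (c): adaption
            child = ind[:]
            if f < f_avg:  # below-average solutions adapt (stand-in: one flip)
                child[rng.randrange(n_bits)] ^= 1
            intermediate.append(child)
        for ind in intermediate:                          # step (d): alteration
            if rng.random() < 0.1:
                ind[rng.randrange(n_bits)] ^= 1
        merged = pop + intermediate                       # step (e): merge/sort
        merged.sort(key=fitness, reverse=True)
        pop = merged[:pop_size]                           # step (f): N best
    return pop
```

As a toy stand-in for branch coverage, `generate_tests(lambda ind: sum(ind) / len(ind))` drives the population toward all-ones test cases; the elitist merge-and-select in steps (e)–(f) guarantees the best coverage found never decreases, which matches the monotone columns of Table 9.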

6. Conclusion

To improve the performance of the EAM algorithm, an improved version of EAM has been suggested. Result analysis shows that the Improved Environmental Adaptation Method (IEAM) performs better than the other optimization methods compared. Moreover, this paper establishes the application of IEAM in generating test suites for white box testing that achieve maximum branch coverage. Further, the following has been established:
• A new framework has been designed to automate the process of test case generation.
• With the help of EAM and IEAM, the average branch coverage of the initial test suite has been increased.
• The performance of EAM, IEAM and GA has been tested on standard benchmark programs, and the IEAM algorithm is found to be the most suitable for this application.
• Even for complex programs, where conditions or predicates in the software restrict the input domain, the performance of IEAM and EAM was very good compared to the GA algorithm.

References

[1] J. Kennedy and R. Eberhart, Particle swarm optimization, IEEE International Conference on Neural Networks 4 (1995), 1942–1948.
[2] R. Poli, J. Kennedy and T. Blackwell, Particle swarm optimization: An overview, in: Swarm Intelligence, Springer, New York, 2007, pp. 33–57.
[3] Y. Shi and R. Eberhart, Parameter selection in particle swarm optimization, in: Evolutionary Programming VII, LNCS, Springer, Berlin, 1998, pp. 591–600.
[4] R.C. Eberhart and Y. Shi, Comparing inertia weights and constriction factors in particle swarm optimization, 2000 Congress on Evolutionary Computation 1 (2000), 84–88.
[5] M. Clerc and J. Kennedy, The particle swarm – explosion, stability, and convergence in a multidimensional complex space, IEEE Transactions on Evolutionary Computation 6(1) (2002), 58–73.
[6] M. Senthil Arumugam, M.V.C. Rao and A. Chandramohan, A new and improved version of particle swarm optimization algorithm with global-local best parameters, Knowledge and Information Systems 16(3) (2008), 331–357.
[7] M. Senthil Arumugam, M. Venkata Chalapathy Rao and A.W.C. Tan, A novel and effective particle swarm optimization like algorithm with extrapolation technique, Applied Soft Computing 9(1) (2009), 308–320.
[8] O.K. Erol and I. Eksin, A new optimization method: Big bang-big crunch, Advances in Engineering Software 37 (2006), 106–111.
[9] K.K. Mishra, S. Tiwari and A.K. Misra, A bio inspired algorithm for solving optimization problems, 2nd International Conference on Computer and Communication Technology (ICCCT-2011), 2011, pp. 653–659.
[10] J.H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor, 1975.
[11] B. Beizer, Software Testing Techniques, 2nd edition, 1990.
[12] C.V. Ramamoorthy, S.-B.F. Ho and W.T. Chen, On the automated generation of program test data, 2nd International Conference on Software Engineering, San Francisco, California, United States, IEEE Computer Society Press, Los Alamitos, CA, USA, 1981.
[13] J.-C. Lin and P.-L. Yeh, Automatic test data generation for path testing using GAs, Information Sciences 131(1-4) (2001), 47–64.
[14] N. Mansour and M. Salame, Data generation for path testing, Software Quality Journal, 2004.
[15] J. Miller, M. Reformat and H. Zhang, Automatic test data generation using genetic algorithm and program dependence graphs, Information and Software Technology 48 (2006), 586–605.
[16] M.R. Girgis, Automatic test data generation for data flow testing using a genetic algorithm, Journal of Universal Computer Science 11 (2005), 898–915.
[17] A.S. Ghiduk, M.J. Harrold and M.R. Girgis, Using genetic algorithms to aid test-data generation for data-flow coverage, 14th IEEE Asia-Pacific Software Engineering Conference (APSEC-07), Nagoya, Japan, 2007, pp. 41–48.
[18] M.A. Ahmed and I. Hermadi, GA-based multiple paths test data generator, Computers and Operations Research, 2008.
[19] O. Kramer, D. Echeverría Ciaurri and S. Koziel, Derivative-free optimization, in: Computational Optimization, Methods and Algorithms, Studies in Computational Intelligence, Vol. 356, Springer, Berlin Heidelberg, 2011, pp. 61–83.
[20] T.G. Kolda, R.M. Lewis and V. Torczon, Optimization by direct search: New perspectives on some classical and modern methods, SIAM Review 45(3) (2003), 385–482.
[21] A.R. Conn, K. Scheinberg and L.N. Vicente, Introduction to Derivative-Free Optimization, MPS-SIAM Series on Optimization, SIAM, 2009.
[22] I. Rechenberg, Evolution strategy, in: Zurada et al. (eds), 1994, pp. 147–159.
[23] L.J. Fogel, A.J. Owens and M.J. Walsh, Artificial Intelligence Through Simulated Evolution, John Wiley, New York, 1966.
[24] R. Storn and K. Price, Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces, Journal of Global Optimization 11 (1997), 341–359.
[25] J.R. Koza, Genetic programming: A paradigm for genetically breeding populations of computer programs to solve problems, Tech. Rep., Stanford, CA, USA, 1990.
[26] K.K. Mishra, S. Tiwari and A.K. Misra, Improved environmental adaption method for solving optimization problems, in: Computational Intelligence and Intelligent Systems, Communications in Computer and Information Science, Springer, Berlin Heidelberg, 2013, pp. 300–313.
[27] N. Hansen, Benchmarking a BI-population CMA-ES on the BBOB-2009 function testbed, in: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, ACM, 2009, pp. 2389–2396.
[28] P.A.N. Bosman, J. Grahl and D. Thierens, AMaLGaM IDEAs in noiseless black-box optimization benchmarking, in: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, ACM, 2009, pp. 2247–2254.
[29] R. Ros, Benchmarking sep-CMA-ES on the BBOB-2009 function testbed, in: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, ACM, 2009, pp. 2435–2440.