Improving GA based Automated Test Data Generation Technique for Object Oriented Software

Nirmal Kumar Gupta
Department of Computer Science and Information Systems
Birla Institute of Technology & Science, Pilani
Pilani, India

Mukesh Kumar Rohil
Department of Computer Science and Information Systems
Birla Institute of Technology & Science, Pilani
Pilani, India

Abstract— Genetic algorithms have been successfully applied in the area of software testing. The demand for automating test case generation in object-oriented software testing is increasing. Extensive testing can only be achieved through a test automation process. The benefits of test automation include lowering the cost of testing and, consequently, the cost of the whole software development process. Several studies have used genetic algorithms to automate test data generation, but the technique is expensive and cannot be applied properly to programs having complex structures. Previous approaches to object-oriented testing are also limited in terms of test case feasibility because of call dependences and runtime exceptions. This paper proposes a strategy for evaluating the fitness of both feasible and unfeasible test cases, which improves the evolutionary search by achieving higher coverage and evolving more unfeasible test cases into feasible ones.

Keywords—Genetic algorithms, Object oriented testing, Test automation, Fitness function.

I.

INTRODUCTION

Testing is one of the most important activities performed during software development. Software testing is generally conducted by executing the developed program with test inputs and comparing the observed output with the expected one [7]. Because the input space of the Software Under Test (SUT) might be very large, testing has to be conducted with a representative subset of test cases. Creating a relevant subset of test cases is the most critical activity in software testing. The test cases used to examine the SUT must be able to expose faults and must form a representative subset of the possible inputs. The quality and significance of the overall test are directly affected by the set of test cases used during testing [2]. Test data is used to create the test cases. Test data is generally generated through a test data generation tool, while a test adequacy criterion assures the quality of the generated test cases and indicates when the testing process can end [8]. Many test data generators have been developed, each using different kinds or variations of existing testing techniques [3]. The identification of good test cases generally follows some predefined testing criteria, for example code coverage, path coverage, statement coverage and branch coverage [2, 9].

In recent years more advanced heuristic search techniques have been applied to software testing. These techniques are based on evolutionary algorithms, and their performance in finding test cases was found to be at least as good as random testing and in many cases much better [1]. These testing techniques are commonly referred to as evolutionary testing. Evolutionary testing uses a meta-heuristic search technique, the Genetic Algorithm (GA), to convert the task of test case generation into an optimization problem. It searches for optimal test parameter combinations that satisfy a predefined test criterion. This test criterion is represented by a "cost function" that measures how well each of the automatically generated optimization parameters satisfies the given test criterion.

The focus of this work is on employing evolutionary algorithms for generating and evolving test cases for the structural unit testing of object-oriented Java programs. For this purpose a strategy is proposed for efficiently guiding the search process towards full structural coverage, which involves favoring test cases that exercise difficult structures and control-flow paths through the method.

978-1-4673-4529-3/12/$31.00 ©2012 IEEE

II.

FITNESS EVALUATION BACKGROUND

The fitness function is constructed on the basis of the software under test. The function itself is not of interest for the problem; the only goal is to find test data that fits a test criterion. A well-constructed fitness function [9] can considerably increase the chance of finding a solution and reach better coverage of the software under test. It also gives better guidance to the search and thus leads to optimizations with fewer iterations. Other work on designing fitness functions and their effect on the optimization process can be found in [4], which investigates the use of various distance functions, such as the Hamming distance and the reciprocal function, and their influence on optimization performance. In [4], a decision was made in favor of the Hamming distance because the authors used genetic algorithms with a bit representation of all parameters in their approach.
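To make the Hamming distance mentioned above concrete, the following is a minimal sketch of how it could serve as a distance for an equality branch condition when parameters are bit-encoded. It is not the implementation from [4]; the class and method names are hypothetical, and the use of 32-bit integers as the bit representation is an assumption.

```java
final class HammingDistanceSketch {
    // Number of bit positions in which the two bit-encoded values differ.
    // A distance of 0 means the values are identical, so a branch
    // condition of the form (actual == target) would be satisfied.
    static int hamming(int actual, int target) {
        return Integer.bitCount(actual ^ target);
    }

    public static void main(String[] args) {
        // 1010 and 0110 differ in the two highest of the four bits.
        System.out.println(hamming(0b1010, 0b0110)); // prints 2
        System.out.println(hamming(42, 42));         // prints 0
    }
}
```

A smaller distance indicates that fewer bits have to be flipped by mutation or crossover before the condition holds, which is why this measure pairs naturally with a bit-string chromosome.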


Modifying the distance function of branch conditions is only one possible mechanism for modifying the fitness function. In the presented research we argue that more general alterations to the fitness function may lead to better results in evolutionary testing, that is, a higher chance of finding the solution or better performance of the optimization process in general. Carlos [5] proposed a methodology for evaluating the quality of both feasible and unfeasible test cases, i.e., those that complete and terminate with a call to the method under test, and those that abort prematurely because a runtime exception is thrown during test case execution. In that approach, unfeasible test cases are considered at certain stages of the evolutionary search, promoting diversity and enhancing the possibility of achieving full coverage. The weights of Control Flow Graph (CFG) nodes are reevaluated by multiplying them by three factors defined in that work. In this research a simple approach is followed and these factors are redefined, which can help to reduce the possible variation in their computed values. With the presented approach, the quality of a given test case is related to the CFG nodes of the Method Under Test (MUT) that are the targets of the evolutionary search at the current stage of the search process. Test cases that exercise less explored (or unexplored) CFG nodes and paths must be favored, with the objective of attaining the primary goal of the test case generation process: finding a set of test cases that achieves full code coverage of the test object [11]. However, the execution of test cases may abort prematurely if a runtime exception is thrown during execution. When this happens, it is not possible to trace the structural entities traversed in the MUT because the final instruction of the Method Call Sequence (MCS) is not reached [2].
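The distinction between test cases that complete and those that abort on a runtime exception can be sketched as follows. This is only an illustration: modelling a method call sequence as a list of `Runnable` steps whose last step stands for the call to the MUT is an assumption, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class FeasibilitySketch {
    // Runs the steps of a method call sequence in order.
    // Returns true if every step, including the final one, completes;
    // returns false as soon as any step throws a runtime exception.
    static boolean isFeasible(List<Runnable> methodCallSequence) {
        try {
            for (Runnable step : methodCallSequence) {
                step.run();
            }
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Runnable completes = () -> { };
        Runnable aborts = () -> {
            throw new IllegalStateException("object not in a usable state");
        };
        System.out.println(isFeasible(Arrays.asList(completes, completes)));         // prints true
        System.out.println(isFeasible(Arrays.asList(completes, aborts, completes))); // prints false
    }
}
```

In the second sequence the final step never runs, which mirrors the situation in which no trace of the MUT can be collected.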
Test cases can thus be separated into two classes: feasible test cases are effectively executed and terminate with a call to the MUT; unfeasible test cases terminate prematurely because a runtime exception is thrown by an instruction of the MCS. As a general rule, longer and more intricate test cases are more prone to throw runtime exceptions; however, complex method call sequences are often needed for defining elaborate state scenarios and traversing certain problem nodes [10]. If unfeasible test cases are blindly penalized, the definition of elaborate state scenarios will be discouraged. The issue of steering the search towards the traversal of interesting CFG nodes and paths was addressed by assigning weights to the CFG nodes; the higher the weight of a given node, the higher the cost of exercising it, and hence the higher the cost of traversing the corresponding control-flow path.

III.

STRATEGY FOR FITNESS EVALUATION

At the start of the first generation the weight of each CFG node is set to a predefined initial value. In each successive generation the weight of each CFG node is then reevaluated to accommodate the following factors:

1. The Hit Count Factor (HCF) accounts for deteriorating the weight of recurrently hit CFG nodes. It is computed from two quantities: the count of how many times the node was exercised by the test programs of the previous generations, and the number of test cases produced in the previous generation. The value of HCF remains between 0 and 1. The more often a node is hit, the closer HCF is to 0, which decreases the weight of the corresponding node more rapidly.

2. The Path Factor (PF) is used to improve the weight of nodes that lead to interesting nodes and thus belong to interesting paths. PF is computed as the average, over all successor nodes of the node in the CFG, of the ratio of the change in a successor's weight to its initial weight. PF therefore decreases the weight of the node during reevaluation if the overall weight of its successor nodes has decreased from their initial values, and increases it otherwise. That is, if the node leads to unexplored or interesting CFG nodes, PF increases its weight.

3. The Weight Factor (WF), represented as α, is needed in node reevaluation because HCF always decreases the weight of a node for each test case, while PF may increase or decrease it. To intensify the path search, the weight of a node must therefore be restrained. WF is used for this purpose; its value should be selected carefully so that the search process completes in the minimum number of generations.

Considering all the above factors, the weight of each CFG node is reevaluated in every generation according to the following equation:

(1)

The fitness of a feasible test case is computed on the basis of its trace information, which includes the nodes hit by that test case. Given the set of nodes traversed by a test case, and hence the number of nodes along this path, the fitness of the test case is evaluated as follows:


2013 3rd IEEE International Advance Computing Conference (IACC)

(2)

Using this strategy, the fitness of test cases that traverse an already traversed path deteriorates in subsequent traversals, because the weight of frequently hit nodes is increased, which worsens the fitness of the test cases that execute through that path.
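The fitness of a feasible test case can be illustrated with a minimal sketch, assuming that equation (2) averages the weights of the traversed nodes (sum of their weights divided by the number of traversed nodes). This averaging is an assumption based on the definitions above, and the class, method and node names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class FeasibleFitnessSketch {
    // Assumed form of the feasible fitness: mean weight of the CFG nodes
    // that the test case traversed. Every traced node must have a weight.
    static double fitness(Set<String> tracedNodes, Map<String, Double> nodeWeights) {
        if (tracedNodes.isEmpty()) {
            throw new IllegalArgumentException("a feasible test case hits at least one node");
        }
        double weightSum = 0.0;
        for (String node : tracedNodes) {
            weightSum += nodeWeights.get(node);
        }
        return weightSum / tracedNodes.size();
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("n1", 1.0);
        weights.put("n2", 2.0);
        weights.put("n3", 3.0);

        Set<String> trace = new LinkedHashSet<>();
        trace.add("n1");
        trace.add("n3");

        System.out.println(fitness(trace, weights)); // prints 2.0
    }
}
```

Because the weights are reevaluated in every generation, the same trace can receive a different fitness value from one generation to the next.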

The fitness of unfeasible test cases is computed as the ratio of the weights of all the remaining possible nodes in the CFG, from the point where the runtime exception occurred, to the weights of all the nodes exercised by that test case before the runtime exception occurred. Here the remaining nodes are the set of all descendants, in the CFG, of the node at which the exception occurs, and the exercised nodes are the set of all nodes traversed by the test case before the exception occurs.

(3)

In this manner the fitness of unfeasible test cases also depends on the unfeasibility factor β, which is added to penalize the fitness of unfeasible test cases. Unfeasible test cases are selected at certain points of the evolutionary search so that they can be improved into feasible ones, which favors the diversity and complexity of MCSs. If β is large, more unfeasible test cases may be selected for the next generation, which may reduce the chance that a better feasible test case is selected. If β is very small, only a few unfeasible test cases will be selected for the next generation, which undermines the overall idea of giving weights to CFG nodes for computing fitness. This value must therefore be selected very carefully for better results.

IV.

CASE STUDY

For the case study the example shown in Figure 1 is considered. This is a classical example used by many researchers in software testing to test code coverage [6]. The example is a simple program for the classification of triangles. The class for which test cases are to be generated is the TriangleTest class, and the method under test is Triang(), which takes three parameters of type Side. If these three parameters define an invalid triangle, this is treated as an exception and the program terminates.

The first experiment investigates the importance of the crossover rate (CR) and mutation rate (MR) for the efficiency of the genetic algorithm. Another parameter that might most obviously affect performance is the population size, which is varied among the values 8, 15, 20 and 30. To establish that the values chosen for the mutation and crossover rates were reasonable, these two parameters are jointly varied over a range of values. Initial experiments showed that mutation and crossover rates less than 0.1 or greater than 0.8 did not improve performance, so crossover and mutation rates were varied between 0.1 and 0.8 per generation. For this initial experiment the values of α and β are 1 and 0, respectively. These values ensure that α and β have no impact of their own on the computed fitness of feasible and unfeasible test cases, as their impact is considered in separate experiments.
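The parameter space of this first experiment can be enumerated as in the sketch below. The population sizes and the 0.1 to 0.8 range come from the text; the step of 0.1 between rates is an assumption, as the text does not state one, and the class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ParameterGridSketch {
    // One GA configuration: population size, crossover rate, mutation rate.
    static final class Config {
        final int populationSize;
        final double crossoverRate;
        final double mutationRate;

        Config(int populationSize, double crossoverRate, double mutationRate) {
            this.populationSize = populationSize;
            this.crossoverRate = crossoverRate;
            this.mutationRate = mutationRate;
        }
    }

    // Jointly varies CR and MR for every population size.
    static List<Config> build() {
        int[] populationSizes = {8, 15, 20, 30};
        List<Config> grid = new ArrayList<>();
        for (int size : populationSizes) {
            // Integer loop counters avoid accumulating floating-point error.
            for (int cr = 1; cr <= 8; cr++) {      // assumed step of 0.1
                for (int mr = 1; mr <= 8; mr++) {  // assumed step of 0.1
                    grid.add(new Config(size, cr / 10.0, mr / 10.0));
                }
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        System.out.println(build().size()); // prints 256
    }
}
```

Under the assumed step this yields 4 × 8 × 8 = 256 configurations; a coarser step would reduce the number of runs accordingly.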

static void Triang (Side Side1, Side Side2, Side Side3) {
    // triOut and triexcept are the class variables
    // Triang = 1 if triangle is scalene
    // Triang = 2 if triangle is isosceles
    // Triang = 3 if triangle is equilateral

if (Side1.getSide()