A Degraded ILP Approach for Test Suite Reduction

Zhenyu Chen, Xiaofang Zhang and Baowen Xu
School of Computer Science and Engineering, Southeast University, Nanjing, China.
{zychen, xfzhang, bwxu}@seu.edu.cn

Abstract: Because executing and maintaining a large test suite is expensive, many heuristic techniques have been proposed for test suite reduction, although none of them guarantees a minimum-size result. The integer linear programming (ILP) approach can generate minimum test suites, but it may take exponential time. This paper proposes a degraded ILP (DILP) approach to bridge the gap between the ILP method and traditional heuristic methods. DILP first produces a lower bound on the size of a minimum test suite and then searches for a small test suite close to that lower bound. An empirical evaluation of DILP is carried out on Boolean specification-based testing, in which four typical heuristic reduction strategies, G, GRE, H and GC, are compared with DILP. The experimental results show that DILP always performs at least as well as the other heuristic reduction strategies and that it can sometimes guarantee the minimum size.

I. INTRODUCTION

In a software testing process, the testing requirements first need to be defined from software specifications or implementations. Then test cases are designed, manually or automatically, to satisfy the requirements. A test case designed for a particular requirement may also satisfy other requirements in practice, so a requirement may be satisfied by more than one test case. As a result, the constructed test suite may contain redundancy: some subsets of the constructed test suite may still satisfy the same testing requirements. Because redundancy increases the cost of executing and maintaining the test suite, it is valuable to generate a small test suite that satisfies all testing requirements.

We use R = {r1, · · · , rm} to denote the set of testing requirements that must be satisfied in the testing process. A testing requirement is said to be feasible if at least one test case satisfies it. We assume in this paper that every testing requirement is feasible. A test suite is a set of test cases, denoted by T = {t1, · · · , tn}. The set of all testing requirements satisfied by a test case t is denoted by Req(t). The set of all test cases satisfying a requirement r is denoted by Test(r). A test suite T satisfies R if, for each testing requirement r in R, at least one test case in T satisfies r. T′ is said to be a representative set of T if T′ is a subset of T that satisfies R. A test case t is said to be 1-1 redundant to T if Req(t) ⊆ Req(t′) for another t′ ∈ T.
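The definitions above can be made concrete with a minimal Python sketch. The data structure (a dictionary from test case names to requirement sets), the function names and the toy data are illustrative assumptions and do not come from the paper.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy data: REQ maps each test case t to Req(t).
REQ: Dict[str, Set[str]] = {
    "t1": {"r1", "r2"},
    "t2": {"r2"},
    "t3": {"r3"},
}
R = {"r1", "r2", "r3"}


def covering_tests(r: str, req: Dict[str, Set[str]]) -> Set[str]:
    """Test(r): the set of all test cases that satisfy requirement r."""
    return {t for t, reqs in req.items() if r in reqs}


def satisfies(suite: Iterable[str], requirements: Set[str],
              req: Dict[str, Set[str]]) -> bool:
    """True if every requirement is satisfied by at least one test case in suite."""
    covered = set().union(*(req[t] for t in suite))
    return requirements <= covered


def is_representative(subset, suite, requirements, req) -> bool:
    """T' is a representative set of T if T' is a subset of T that satisfies R."""
    return set(subset) <= set(suite) and satisfies(subset, requirements, req)


def is_one_one_redundant(t, suite, req) -> bool:
    """t is 1-1 redundant if Req(t) is contained in Req(u) for another test case u."""
    return any(u != t and req[t] <= req[u] for u in suite)
```

In this toy instance {t1, t3} is a representative set, and t2 is 1-1 redundant because Req(t2) ⊆ Req(t1).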
This work was supported in part by the National Natural Science Foundation of China (60425206, 60773104, 60403016, 60633010), Natural Science Foundation of Jiangsu Province (BK2005060), High Technology Research Project of Jiangsu Province (BG2005032), Excellent Talent Foundation on Teaching and Research of Southeast University, and Open Foundation of State Key Laboratory of Software Engineering in Wuhan University, Doctor subject fund of education ministry(20060286020), Jiangsu Planned Projects for Postdoctoral Research Funds (0701003B).

T − {t} is a representative set of T for a 1-1 redundant test case t. This is the so-called 1-1 reduction strategy. A test case t is said to be essential to R if there exists r ∈ R such that Test(r) = {t}. An essential test case must be in every representative set.

The objective of test suite reduction is to find a small representative set for a given test suite. A minimum-size representative set is desirable. However, the problem of finding a minimum test suite is equivalent to the set covering problem, which is known to be NP-complete [8]. A test suite reduction problem can be translated into an ILP problem, and ILP tools can then be used to produce a minimum test suite [11]. However, ILP is not suitable for large test suites because it may take exponential time. A practical alternative is to develop heuristic strategies, even though they do not guarantee minimum test suites. This line of work is often referred to as search based software engineering [9].

A challenge for existing heuristic methods is the so-called stopping criterion: testers cannot estimate whether the result is good enough, and hence cannot determine whether an expensive method (such as ILP) is needed to improve it. In this paper, a degraded ILP (DILP) method is proposed to bridge the gap between the ILP method and traditional heuristic methods. DILP first produces a lower bound (Lb) on the size of minimum test suites. It then uses a single-branch strategy to search efficiently for a small test suite T′ close to Lb. As a result, testers can distinguish three cases: (1) The size of T′ equals Lb; then T′ is a minimum test suite, i.e. the best result. (2) The size of T′ is close to Lb; then T′ can also be considered a good result. (3) The size of T′ is far from Lb; then ILP or other expensive methods are needed to improve T′.

The rest of this paper is organized as follows. In the next section, we describe some related work on test suite reduction. In Section III, we propose the DILP approach, including the preprocessing by 1-1 reduction, the single-branch strategy and the DILP algorithm. Section IV describes an empirical evaluation on Boolean specification-based testing, in which four typical heuristic strategies are compared with the DILP approach. The conclusion is drawn in the last section.

II. RELATED WORK

The greedy strategy (denoted by G) [7] has been used in many fields of computer science, including test suite reduction. M. J. Harrold et al. proposed the heuristic reduction strategy H by grouping test cases [10]. T. Y. Chen et al. proposed two enhanced versions of G, called GE and GRE, by combining the 1-1 reduction strategy and the essential strategy [2]. The above reduction strategies ignore the fact that there are some complex interrelations among testing requirements. Based on testing requirement optimization, it is possible to obtain a smaller test suite more efficiently. X. F. Zhang et al. presented a requirement optimization model to enhance the existing test suite reduction strategies G, GE, GRE and H [16]. Recently, Z. Y. Chen et al. improved it and proposed a graph contraction (denoted by GC) method for testing requirement optimization to achieve test suite reduction [6]. The experimental results showed that GC was very competitive with GRE and that it always outperformed the other heuristic strategies. GRE, H, ILP and a genetic method have also been studied for insight into the selection of test suite reduction techniques [17]. Four typical heuristic reduction strategies, G, GRE, H and GC, are compared with DILP in this paper.

• G is the greedy algorithm [7] for test suite reduction. It selects a test case t that satisfies the maximum number of testing requirements in R and removes the satisfied requirements Req(t) from R. It then selects a test case that satisfies the maximum number of remaining testing requirements. The selection repeats until all requirements in R are satisfied.
• GRE is an enhanced version of the greedy heuristic [2]. It combines three strategies: the essential strategy, the 1-1 reduction strategy and the greedy strategy. It is basically the alternate application of the essential strategy and the 1-1 reduction strategy; the greedy strategy is applied only when neither of them can be applied.
• H is a heuristic algorithm that categorizes test cases according to their degree of ‘essentialness’ [10]. All requirements r1, · · · , rm are divided into R1, · · · , Rd, where Ri denotes the set of all requirements in R that are satisfied by exactly i test cases in T, and d denotes the maximum number of test cases by which a requirement can be satisfied. Roughly speaking, test cases that satisfy requirements in Ri are considered more ‘essential’ than those that satisfy requirements in Rj for i < j. H first selects the test cases that satisfy requirements in R1. It then considers the unsatisfied requirements in R2, · · · , Rd in order and selects test cases until all requirements are satisfied.
• GC is a heuristic algorithm that contracts testing requirements based on a requirement relation graph [6]. A requirement relation graph G(V, E) is first constructed by testing requirement analysis. V is the set of testing requirements, and an edge (v, v′) ∈ E if and only if some test case satisfies both v and v′. Graph contraction strategies are then applied to merge vertices. As a result, a minimal set of testing requirements is obtained and test suite reduction is achieved.
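As an illustration of the simplest of these strategies, the greedy strategy G can be sketched as below. This is a minimal sketch, not the implementation used in the experiments: the function name, the dictionary representation of Req and the tie-breaking rule (first test case in dictionary order) are assumptions, and the toy instance is hypothetical.

```python
from typing import Dict, List, Set


def greedy_reduce(req: Dict[str, Set[str]], requirements: Set[str]) -> List[str]:
    """Greedy strategy G: repeatedly pick the test case that satisfies
    the largest number of still-unsatisfied requirements."""
    remaining = set(requirements)
    selected: List[str] = []
    while remaining:
        # Ties are broken by dictionary order (an assumption of this sketch).
        best = max(req, key=lambda t: len(req[t] & remaining))
        if not req[best] & remaining:
            raise ValueError("some requirement is infeasible")
        selected.append(best)
        remaining -= req[best]
    return selected


# Hypothetical instance on which G does not reach the minimum size:
# {tA, tB} satisfies everything, but G starts with tG and needs three test cases.
REQ = {
    "tA": {"r1", "r2", "r3"},
    "tB": {"r4", "r5", "r6"},
    "tG": {"r1", "r2", "r4", "r5"},
}
suite = greedy_reduce(REQ, {"r1", "r2", "r3", "r4", "r5", "r6"})
```

Here G returns three test cases although two suffice, which illustrates why the heuristic strategies carry no guarantee of minimum size.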

In real-world software testing, there are often multiple test criteria [1]. K. R. Walcott et al. considered the execution time of the test suite as an important cost driver [13]. S. Yoo et al. introduced the concept of Pareto efficiency to solve multi-objective test suite reduction [15]. For simplicity, this paper treats test suite reduction as a single-objective optimization problem.

III. DEGRADED ILP APPROACH FOR TEST SUITE REDUCTION

A. Preprocess of Reduction

At present, all known algorithms for NP-complete problems require time that is superpolynomial in the input size, and it is unknown whether any faster algorithms exist. However, there exist polynomial-time, even linear-time, algorithms that can simplify the original NP-complete problem into a smaller one. Although the resulting problem is still NP-complete, it can be solved more efficiently than the original one.

The satisfiability relation between T and R can be represented as a set S(T, R) = {(r, t) ∈ R × T : t satisfies r}. Let TS(T, R) denote the set of all representative sets of T w.r.t. R, and let OptTS(T, R) denote the set of all minimum representative sets of T w.r.t. R. For a 1-1 redundant test case t in T, OptTS(T − {t}, R) ⊆ OptTS(T, R). That is, a 1-1 redundant test case can be eliminated to simplify the satisfiability relation between R and T [4]. Similarly, a requirement r is said to be 1-1 redundant to R if there exists another requirement r′ such that Test(r′) ⊆ Test(r), i.e. any test case that satisfies r′ also satisfies r. For a 1-1 redundant requirement r, TS(T, R − {r}) = TS(T, R). Hence OptTS(T, R − {r}) = OptTS(T, R) [6], and a 1-1 redundant requirement can be removed to simplify the satisfiability relation.

Given a satisfiability relation S(T, R), a 1-1 reduced satisfiability relation S′(T′, R′) can be obtained by removing 1-1 redundant test cases and testing requirements one by one until no 1-1 redundant one remains in S′(T′, R′). S′(T′, R′) is smaller than S(T, R) and OptTS(T′, R′) ⊆ OptTS(T, R), so the test suite reduction problem is simplified.

B. ILP Approach

In mathematics, integer linear programming (ILP) problems involve the optimization of a linear objective function subject to inequality constraints with integer variables [12]. Given a satisfiability relation S(T, R), the test suite reduction problem can be translated into an ILP (actually a 0-1 ILP) problem of the form [11]:

$$\min \sum_{j=1}^{n} x_j, \quad x_j \in \{0, 1\}, \qquad \text{subject to } S \times x \ge \mathbf{1} \tag{1}$$

S is an m × n relational matrix with si,j = 1 if tj satisfies ri and si,j = 0 otherwise. 1 is the m-vector (1, · · · , 1), and x is an n-vector (x1, · · · , xn) to be determined.

A naive approach to ILP is to enumerate all possible solutions, but this is feasible only for very small problems. The usual ILP methods are implicit enumeration techniques. “Implicit” means that many solutions will hopefully be skipped during enumeration because they are known to be non-optimal. One of the usual implicit enumeration techniques used to solve ILP problems is the Branch-and-Bound algorithm [12]. A branching strategy for 0-1 ILP is to pick a variable xj and to replace the current problem by two subproblems, which are copies of the current problem with xj set to 0 in one and to 1 in the other. Since xj has to take the value 0 or 1 in an optimal solution, this branching scheme guarantees that an optimal solution of the original problem is an optimal solution of one of the two subproblems. The bounding operation is a function that returns a bound on the optimal solution of the current subproblem. Subproblems whose bound is worse than the value of the best currently known solution of the original problem can be discarded.

C. Single-Branch Strategy

The Branch-and-Bound algorithm used to solve ILP problems may create exponentially many subproblems in the worst case, because it creates two subproblems for each variable. The basic idea of degraded ILP (DILP) is the single-branch strategy, i.e. to select only the single most promising subproblem for each variable. Hence, DILP creates at most n subproblems for n variables.

The LP relaxation of an ILP problem is obtained by removing the integrality constraints on the variables. LP problems can be solved in polynomial time, whereas ILP problems are NP-hard in general [12]. Since the feasible solutions of the ILP problem are all feasible for the LP problem, the LP solution also provides a lower bound on the optimal value of the ILP problem. If the solution of the relaxation happens to have integer components, then it also solves the ILP problem. Given a satisfiability relation matrix S, an LP relaxation of the ILP problem is formed as $\min \sum_{j=1}^{n} x_j$ subject to $S \times x \ge \mathbf{1}$ with $x \in [0, 1]^n$. The LP relaxation is easy to solve by algorithms such as the simplex algorithm [12].
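To see how the relaxation bounds the integer optimum, consider a small hypothetical instance (not one of the experimental subjects) with three requirements and three test cases, in which every test case satisfies exactly two requirements. The worked example below only uses formulation (1) and its relaxation.

```latex
S = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix},
\qquad
S \times x \ge \mathbf{1}
\iff
\begin{cases}
x_1 + x_3 \ge 1 \\
x_1 + x_2 \ge 1 \\
x_2 + x_3 \ge 1
\end{cases}

% Adding the three constraints gives a bound that holds for every LP-feasible x:
2\,(x_1 + x_2 + x_3) \ge 3
\;\Longrightarrow\;
\sum_{j=1}^{3} x_j \ge \tfrac{3}{2},
\qquad \text{attained at } x = \left(\tfrac12, \tfrac12, \tfrac12\right).

% In the 0-1 problem a single test case always leaves one requirement unsatisfied,
% while any two test cases satisfy all three, so the ILP optimum is 2:
\underbrace{\tfrac{3}{2}}_{\text{LP optimum}} \;<\; \underbrace{2}_{\text{ILP optimum}}
```

The LP optimum 3/2 is fractional and strictly below the integer optimum. Because the ILP optimum must be an integer, the bound can be rounded up to 2 in this instance, which the ILP optimum then attains.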
A feasible solution v can be output by an LP algorithm, in which vj is the value of xj. If vj is not an integer for some j, then v is not a solution of the original ILP problem. However, the value vj can be interpreted as the likelihood that xj equals 1 in an optimal solution of the ILP problem. For example, vi = 0.9 and vj = 0.3 indicate that xi is more likely than xj to be 1 in the optimal solutions of the ILP problem. A single-branch strategy for 0-1 ILP is to pick a variable xj with vj close to 1 and to replace the current problem by a restricted problem, which is a copy of the current problem with the variable xj fixed to 1.

A challenge of the single-branch strategy is which variable should be fixed first. A natural choice is to fix to 1 the variable xj with the highest value vj. Note, however, that the variable xj contributes to a testing requirement ri only if si,j = 1, i.e. tj satisfies ri. Hence a more suitable metric is introduced as follows:

$$d_j := \sum_{i=1}^{m} s_{i,j} \cdot v_j \tag{2}$$
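The metric in (2) weights each LP value by the number of requirements that the corresponding test case satisfies. A minimal sketch follows; the function name is an assumption, and the matrix S and the vector v are hypothetical values chosen for illustration rather than the output of an actual LP solver.

```python
from typing import List, Sequence


def branching_scores(S: Sequence[Sequence[int]], v: Sequence[float]) -> List[float]:
    """d_j = sum_i s[i][j] * v[j]: v_j times the number of requirements t_j satisfies."""
    m, n = len(S), len(v)
    return [sum(S[i][j] for i in range(m)) * v[j] for j in range(n)]


# Hypothetical example: 3 requirements (rows) and 3 test cases (columns).
S = [[1, 1, 0],
     [1, 0, 1],
     [1, 0, 0]]
v = [0.5, 0.75, 0.5]                 # illustrative values only
scores = branching_scores(S, v)      # column sums are 3, 1 and 1
best = max(range(len(v)), key=lambda j: scores[j])
```

Although the second variable has the largest value vj, the first one obtains the largest dj because its test case satisfies three requirements, so the single-branch strategy would fix it first.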

D. DILP Algorithm

The pseudo-code of the degraded ILP (DILP) approach is shown in Algorithm 1.

Algorithm 1: DILP(S)
 1: 1-1 reduction S;
 2: v = LP(S);
 3: Lb = Int(Σj vj);
 4: while 1
 5:     if each vj is an integer
 6:         return v and Lb;
 7:     end if
 8:     Maxd = 0;
 9:     for each j
10:         if vj == 1
11:             Fix xj to 1;
12:         else
13:             Compute dj;
14:             if dj > Maxd
15:                 Maxd = dj;
16:                 k = j;
17:             end if
18:         end if
19:     end for
20:     Fix xk to 1;
21:     v = LP(S);
22: end while
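Algorithm 1 can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation, and it rests on several assumptions: the 1-1 reduction of line 1 is omitted, Int(·) is taken to round up (the ILP optimum is an integer no smaller than the LP optimum), ties in dj are broken by the lowest index, and the LP relaxation is solved by a toy exact solver that enumerates basic solutions with rational arithmetic. That toy solver is exponential and is only meant for tiny instances; in practice it would be replaced by a real LP algorithm such as simplex. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil


def solve(A, b):
    """Solve the square system A x = b exactly (Gauss-Jordan); None if singular."""
    n = len(A)
    M = [[Fraction(a) for a in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return None
        M[c], M[p] = M[p], M[c]
        M[c] = [a / M[c][c] for a in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [a - M[r][c] * e for a, e in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def lp_relaxation(S, fixed):
    """Toy LP solver: min sum(x) s.t. S x >= 1, 0 <= x <= 1, x_j = 1 for j in fixed.
    It enumerates basic solutions, so it only suits very small examples."""
    n = len(S[0])
    unit = lambda j: [int(k == j) for k in range(n)]
    # Every constraint is stored as (row, rhs), meaning row . x >= rhs.
    cons = [(list(row), 1) for row in S]
    cons += [(unit(j), 1 if j in fixed else 0) for j in range(n)]   # lower bounds
    cons += [([-e for e in unit(j)], -1) for j in range(n)]         # x_j <= 1
    best = None
    for tight in combinations(cons, n):
        x = solve([row for row, _ in tight], [rhs for _, rhs in tight])
        if x is None:
            continue
        if all(sum(a * xi for a, xi in zip(row, x)) >= rhs for row, rhs in cons):
            if best is None or sum(x) < sum(best):
                best = x
    return best


def dilp(S):
    """Sketch of Algorithm 1 (lines 2-22); returns selected column indices and Lb."""
    n = len(S[0])
    fixed = set()
    v = lp_relaxation(S, fixed)
    lb = ceil(sum(v))                                   # assumption: Int(.) rounds up
    while any(vj.denominator != 1 for vj in v):         # some v_j is fractional
        fixed |= {j for j in range(n) if v[j] == 1}     # line 11
        d = {j: sum(row[j] for row in S) * v[j] for j in range(n) if v[j] != 1}
        k = max(d, key=d.get)                           # lines 13-16
        fixed.add(k)                                    # line 20
        v = lp_relaxation(S, fixed)                     # line 21
    return [j for j in range(n) if v[j] == 1], lb


# Hypothetical instance: three test cases, each satisfying two of three requirements.
S = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 1]]
selected, lower_bound = dilp(S)
```

On this instance the first relaxation returns (1/2, 1/2, 1/2), so Lb is 2 after rounding up. One single-branch step then fixes the first variable, and the second relaxation is integral with two selected test cases, so the result meets the lower bound.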

First, a satisfiability relation matrix S is input to the procedure DILP. We use 1-1 reduction as a preprocessing step of DILP until there are no 1-1 redundant test cases and no 1-1 redundant testing requirements (line 1). The LP relaxation of the ILP problem in equation (1) is then solved by an LP algorithm (line 2). The sum of the components of the result v, converted to an integer by Int, can be taken as a lower bound Lb on the size of a minimum test suite (line 3), because any ILP solution is also an LP solution. If each vj is an integer, then v is a feasible solution of the original ILP problem, and it is an optimal one when this happens before any fractional variable has been fixed (lines 5-7). Otherwise, each xj with vj = 1 is fixed to 1 (line 11). We calculate the metric dj for each remaining xj, i.e. those with vj < 1 (line 13). The variable xk with the maximal dk is selected (lines 15-16) and fixed to 1 (line 20). This forms a restricted LP problem, which is solved by the LP algorithm again (line 21). The statements in the loop (lines 4-22) are repeated until each vj is an integer, at which point v and Lb are output. The loop stops after at most n iterations, because at least one xj is fixed in each iteration and there are n variables in x in total.

IV. EMPIRICAL EVALUATION

In this section, an experiment on a suite of Boolean specifications from TCAS II is designed and implemented to evaluate the DILP approach. Four heuristic reduction strategies, G, GRE, H and GC, are also compared with DILP.

A. Experiment Design

Given a Boolean specification P, an implementation may be a mutant M obtained by making simple syntactic changes to P. A test case t is an assignment to all variables. P(t)

TABLE I
EXPERIMENTAL SUBJECTS

No.   |V|   |L|   |T|    |LNF|   |LRF|
1     7     23    62     21      235
2     9     36    113    36      539
3     12    46    2970   46      981
4     5     5     29     5       40
5     9     20    392    20      302
6     11    28    142    25      468
7     10    21    210    21      362
8     8     17    36     17      228
9     7     10    16     10      120
10    13    15    256    15      353
11    13    20    2188   20      443
12    14    17    4290   17      438
13    12    13    1731   13      279
14    7     12    107    12      142
15    9     18    372    18      266
16    12    37    2834   37      794
17    11    11    1033   11      220
18    10    11    584    11      198
19    8     9     116    9       126
20    7     8     24     8       96

and M(t) denote the values of P and M evaluated on the test case t, respectively. In general, a mutant M may happen to be logically equivalent to the specification P, in which case it cannot be distinguished by any test case. A non-equivalent mutant is called a fault; an equivalent mutant is not regarded as a fault. A fault M is said to be killed (or detected) by a test case t if M(t) ≠ P(t), that is, if t is a satisfying assignment of M ⊕ P (⊕ is the exclusive-or operator). For each mutant Mi, a testing requirement is formed as the Boolean expression ri = P ⊕ Mi. A testing requirement ri is feasible if and only if ri is satisfiable. Since P has finitely many variables, the number of test cases is finite. Before test suite reduction, all test cases, i.e. satisfying assignments, can be generated to construct an initial test suite.

Our experimental analysis was done using software that was specifically designed and implemented for this purpose. The software allows the analysis of a given Boolean specification. The experimental steps involved in the empirical analysis were as follows:

1. Select the subject Boolean specifications.
2. Generate the mutants and testing requirements.
3. Construct the initial test suites.
4. Reduce the test suites using the reduction strategies.
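Steps 2 and 3 can be illustrated by enumerating the truth table of a small specification. The sketch below is not the experimental software: the function names are hypothetical, and the specification (a ∧ b) ∨ (¬b ∧ c) with one literal negated is used purely as an illustration of the kill relation M(t) ≠ P(t).

```python
from itertools import product


def spec(a, b, c):
    # P = (a AND b) OR (NOT b AND c)
    return (a and b) or ((not b) and c)


def mutant(a, b, c):
    # M: the literal NOT b is replaced by b
    return (a and b) or (b and c)


def killing_tests(p, m, num_vars):
    """All assignments t with p(t) != m(t), i.e. the satisfying assignments of p XOR m."""
    return [t for t in product([False, True], repeat=num_vars)
            if bool(p(*t)) != bool(m(*t))]


kills = killing_tests(spec, mutant, 3)      # test cases that satisfy r = P XOR M
is_fault = len(kills) > 0                   # an equivalent mutant has no killing test
```

Three of the eight assignments kill this mutant, so the corresponding testing requirement is feasible. A mutant for which the list is empty would be equivalent to P and would be ignored.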

We used a set of 20 Boolean specifications that originated from the specification of an aircraft collision avoidance system called TCAS II [14]. Two fault classes, LNF and LRF [5], were considered in the experiment.

Literal Negation Fault (LNF): A literal is replaced by its negation, e.g., (a ∧ b) ∨ (¬b ∧ c) is implemented as (a ∧ b) ∨ (b ∧ c), with ¬b replaced by b.

Literal Reference Fault (LRF): A literal is replaced by another literal that appears in the decision, e.g., (a ∧ b) ∨ (¬b ∧ c) is implemented as (a ∧ b) ∨ (¬a ∧ c), with ¬b replaced by ¬a.

Each mutant M of P was generated first, and then the testing requirement was formed as P ⊕ M. |LNF| and |LRF|

TABLE II
EXPERIMENTAL RESULTS OF LNF

Columns G, GRE, H, GC and DILP give Rsti(∗); column ILP gives Bsti.

No.   |T|    G     GRE   H     GC    DILP   ILP
1     62     7     7     8     7     7      7
2     113    6     6     6     6     6      6
3     2970   10    10    10    10    10     10
4     29     2     2     2     2     2      2
5     392    6     5     6     5     5      5
6     142    4     4     5     4     4      4
7     210    3     3     4     4     3      3
8     36     2     2     2     2     2      2
9     16     2     2     2     2     2      2
10    256    3     3     4     3     3      3
11    2188   5     5     5     5     5      5
12    4290   4     4     5     4     4      4
13    1731   3     3     3     3     3      3
14    107    4     4     4     4     4      4
15    372    4     4     4     4     4      4
16    2834   7     7     9     7     7      7
17    1033   3     3     3     3     3      3
18    584    3     3     3     4     3      3
19    116    2     2     3     2     2      2
20    24     2     2     2     2     2      2

denote the numbers of feasible testing requirements for LNF and LRF, respectively; infeasible testing requirements are ignored. |V| and |L| denote the number of variables and the number of literal occurrences in a Boolean specification, respectively. The number of all possible test cases is 2^|V|. |T| denotes the number of test cases used in LNF and LRF. The details are shown in Table I. For the i-th specification, the number of LNF mutants equals |Li| and the number of LRF mutants equals |Li| · (|Vi| − 1) · 2. However, the number of feasible testing requirements may be less than the number of mutants, because some mutants may be equivalent to P. For example, |R1| = 21 < 23 = |L1| for LNF in Table I. The size of a test suite is 2^|Vi| in the worst case. However, the real size was always smaller, and usually much smaller, than 2^|Vi|. For example, |T8| = 36 < 2^8 = 256 for LNF in Table I.

B. Experimental Results and Analysis

The effectiveness of each reduction strategy was measured by the size of the resulting test suite for each Boolean specification. For comparison, four typical reduction strategies, G, GRE, H and GC, were also implemented for test suite reduction. Rsti(∗) denotes the size of the resulting test suite for the i-th Boolean specification using the reduction strategy ∗. A minimum test suite is desirable for test suite reduction. We computed the size of a minimum test suite for each Boolean specification using the ILP method [11], denoted by Bsti. |T| denotes the size of the initial test suite for LNF or LRF. The detailed experimental results for LNF and LRF are shown in Tables II and III, respectively. The main observations of the empirical analysis are as follows.

Effectiveness of DILP Approach. The 1-1 reduction was applied first as a preprocessing step until there were no 1-1 redundant testing requirements and no 1-1 redundant test cases. The results of the evaluation of 1-1 reduction are shown in Fig. 1 and 2. The numbers of original requirements and test cases were

TABLE III
EXPERIMENTAL RESULTS OF LRF

Columns G, GRE, H, GC and DILP give Rsti(∗); column ILP gives Bsti.

No.   |T|    G     GRE   H     GC    DILP   ILP
1     62     19    18    19    18    18     18
2     113    26    26    26    26    26     26
3     2970   36    36    38    35    33     31
4     29     5     5     5     5     5      5
5     392    18    15    16    15    15     15
6     142    26    25    27    25    25     25
7     210    21    21    20    19    19     19
8     36     18    18    18    18    18     18
9     16     12    12    12    12    12     12
10    256    16    16    17    16    15     15
11    2188   19    20    23    21    18     17
12    4290   16    17    18    17    15     15
13    1731   13    13    15    13    13     13
14    107    14    13    14    13    12     12
15    372    17    17    17    16    16     16
16    2834   28    28    36    28    25     25
17    1033   13    13    12    11    11     10
18    584    13    12    14    13    11     10
19    116    12    11    11    10    9      8
20    24     10    10    10    10    10     10

Fig. 1. Evaluation of 1-1 Reduction for LNF. [Two bar charts, "1-1 Reduction of Testing Requirements" (y-axis: Number of testing requirements) and "1-1 Reduction of Test Cases" (y-axis: Number of test cases); x-axis: Boolean Specification, 1 to 20; series: Redundant, Remained.]

Fig. 2. Evaluation of 1-1 Reduction for LRF. [Two bar charts with the same axes and series as Fig. 1.]

represented as the whole plots. The numbers of 1-1 redundant requirements and test cases were represented as the gray plots, and the numbers of remaining requirements and test cases as the black plots. As is evident from Fig. 1 and 2, the results of 1-1 reduction were very encouraging, particularly for the 1-1 reduction of test cases for LNF. One reason may be that there are far fewer testing requirements than test cases for LNF, so that there are many 1-1 redundant test cases.

The lower bound was obtained from the LP relaxation by LP algorithms. DILP then searched for a feasible solution close to the lower bound. Note that the lower bound may not be the greatest lower bound, i.e. it might not be attainable. The greatest lower bound, i.e. the minimum size, of the reduced test suite can be calculated by ILP algorithms. However, if the result of DILP equals the lower bound, then the DILP result must be a minimum-size test suite; that is, the DILP approach happens to solve the ILP problem. The comparison of the DILP result, the lower bound and the minimum size is shown in Fig. 3.

Fig. 3. Evaluation of DILP Approach. [Two plots, LNF and LRF; x-axis: Boolean Specification No.; y-axis: Size of Test Suite; series: DILP Result, Lower Bound, Minimum Size.]

The experimental results of the DILP approach were very encouraging. In the case of LNF, DILP generated a minimum test suite for all Boolean specifications. The lower bounds of Boolean specifications no. 1 and 3 were not the greatest lower bounds, hence we could not conclude from the bound alone that the DILP results were minimum, although they actually were. In the case of LRF, DILP generated minimum test suites for 15 Boolean specifications. DILP can guarantee minimum test suites for 10 Boolean specifications, because their results reached the lower bounds.

Comparison of Heuristic Reduction Strategies. The detailed experimental results of the different reduction strategies are shown in Tables II and III. To make the comparison easier to follow, two evaluation metrics were introduced. BT(∗) denotes the number of times the reduction strategy ∗ generated a minimum test suite. That is,

$$BT(*) = \sum_{i=1}^{20} \big( Rst_i(*) == Bst_i \big) \tag{3}$$

where the comparison evaluates to 1 if the two sizes are equal and to 0 otherwise.

For a further comparison, a deviation analysis was introduced to quantify how far the results are from the goal of test suite reduction. It is formalized as follows:

$$DV(*) = \sum_{i=1}^{20} \frac{Rst_i(*) - Bst_i}{Bst_i} \tag{4}$$

For a reduction strategy, a higher value of BT(∗) suggests better results with respect to the other reduction strategies, and a lower value of DV(∗) suggests better results. The results for the four typical heuristic reduction strategies and the DILP approach are shown in Fig. 4. For BT(∗), the DILP approach achieved the best score among all reduction strategies, for both LNF and LRF. The results for DV(∗) agreed with those for BT(∗). In general, the results for LRF were more significant than those for LNF, because LRF involves many more requirements and test cases than LNF. The experimental results showed that DILP always performed at least as well as the other heuristic reduction strategies.

Fig. 4. Comparison of Heuristic Reduction Strategies. [Two bar charts, "Comparison of Best Results" (y-axis: BT(∗)) and "Comparison of Deviation Analysis" (y-axis: DV(∗)); x-axis: Reduction Strategy ∗ (G, GRE, H, GC, DILP); series: LRF, LNF.]

V. CONCLUSION AND FUTURE WORK

In this paper, a degraded ILP (DILP) approach was proposed to bridge the gap between the ILP method and traditional heuristic methods. DILP produces a lower bound on the minimum size and then searches for a feasible solution close to the lower bound. The experimental results showed that DILP always performed at least as well as typical heuristic reduction methods and that it can sometimes guarantee the generation of a minimum test suite. However, the complexity of LP algorithms is higher than that of the typical heuristic methods, although LP problems can be solved in polynomial time. The comparison of time cost needs to be discussed further. The empirical evaluation is still preliminary, and rich conclusions about DILP and the other heuristic reduction strategies cannot yet be drawn from it. In the future, large-scale testing objects [17] and simulation data [3] will be studied for insight into the selection of test suite reduction techniques.
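As a cross-check of the evaluation metrics of Section IV-B, BT(∗) in (3) and DV(∗) in (4) can be recomputed directly from the LNF sizes reported in Table II. The sketch below is not the authors' tooling; the function and variable names are assumptions, and the lists are copied from Table II.

```python
# Resulting sizes Rst_i(*) for LNF, copied from Table II (one entry per specification).
RST = {
    "G":    [7, 6, 10, 2, 6, 4, 3, 2, 2, 3, 5, 4, 3, 4, 4, 7, 3, 3, 2, 2],
    "GRE":  [7, 6, 10, 2, 5, 4, 3, 2, 2, 3, 5, 4, 3, 4, 4, 7, 3, 3, 2, 2],
    "H":    [8, 6, 10, 2, 6, 5, 4, 2, 2, 4, 5, 5, 3, 4, 4, 9, 3, 3, 3, 2],
    "GC":   [7, 6, 10, 2, 5, 4, 4, 2, 2, 3, 5, 4, 3, 4, 4, 7, 3, 4, 2, 2],
    "DILP": [7, 6, 10, 2, 5, 4, 3, 2, 2, 3, 5, 4, 3, 4, 4, 7, 3, 3, 2, 2],
}
# Minimum sizes Bst_i computed by ILP, also from Table II.
BST = [7, 6, 10, 2, 5, 4, 3, 2, 2, 3, 5, 4, 3, 4, 4, 7, 3, 3, 2, 2]


def bt(rst, bst):
    """BT(*): number of specifications on which the strategy reaches the minimum size."""
    return sum(r == b for r, b in zip(rst, bst))


def dv(rst, bst):
    """DV(*): sum over specifications of the relative excess over the minimum size."""
    return sum((r - b) / b for r, b in zip(rst, bst))


bt_scores = {name: bt(sizes, BST) for name, sizes in RST.items()}
dv_scores = {name: dv(sizes, BST) for name, sizes in RST.items()}
```

For LNF this gives BT values of 19, 20, 12, 18 and 20 for G, GRE, H, GC and DILP, respectively, and a DV of 0 for DILP, in line with the observation that DILP reached the minimum size on every LNF subject.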

REFERENCES

[1] J. Black, E. Melachrinoudis, and D. Kaeli. Bi-criteria models for all-uses test suite reduction. In Proceedings of 26th International Conference on Software Engineering, pages 106–115. ACM Press, 2004.
[2] T. Y. Chen and M. F. Lau. A new heuristic for test suite reduction. Information and Software Technology, 40(5):347–354, 1998.
[3] T. Y. Chen and M. F. Lau. A simulation study on some heuristics for test suite reduction. Information and Software Technology, 40(13):777–787, 1998.
[4] T. Y. Chen and M. F. Lau. On the completeness of a test suite reduction strategy. The Computer Journal, 42(5):430–440, 1999.
[5] Z. Y. Chen, B. W. Xu, and C. H. Nie. Comparing fault-based testing strategies of general Boolean specifications. In Proceedings of the 31st International Computer Software and Applications Conference, pages 621–622. IEEE Computer Society Press, 2007.
[6] Z. Y. Chen, B. W. Xu, X. F. Zhang, and C. H. Nie. A novel approach for test suite reduction based on requirement relation contraction. In Proceedings of the 23rd Annual ACM Symposium on Applied Computing, pages 390–394. ACM Press, 2008.
[7] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, MA, 1990.
[8] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-completeness. W. H. Freeman and Company, 1979.
[9] M. Harman. The current state and future of search based software engineering. In Proceedings of Workshop on Future of Software Engineering (FOSE’07), pages 20–26. ACM/IEEE, 2007.
[10] M. J. Harrold, R. Gupta, and M. L. Soffa. A methodology for controlling the size of a test suite. ACM Transactions on Software Engineering and Methodology, 2(3):270–285, 1993.
[11] J. G. Lee and C. G. Chung. An optimal representative set selection method. Information and Software Technology, 42(1):17–25, 2000.
[12] A. Schrijver. Theory of Linear and Integer Programming. John Wiley & Sons, 1998.
[13] K. R. Walcott, M. L. Soffa, G. M. Kapfhammer, and R. S. Roos. Time aware test suite prioritization. In Proceedings of International Symposium on Software Testing and Analysis (ISSTA’06), pages 1–12. ACM Press, 2006.
[14] E. Weyuker, T. Goradia, and A. Singh. Automatically generating test data from a Boolean specification. IEEE Transactions on Software Engineering, 20(5):353–363, 1994.
[15] S. Yoo and M. Harman. Pareto efficient multi-objective test case selection. In Proceedings of International Symposium on Software Testing and Analysis (ISSTA’07), pages 140–150. ACM Press, 2007.
[16] X. F. Zhang, B. W. Xu, C. H. Nie, and L. Shi. An approach for optimizing test suite based on testing requirement reduction. Journal of Software, 18(4):821–831, 2007.
[17] H. Zhong, L. Zhang, and H. Mei. An experimental comparison of four test suite reduction techniques. In Proceedings of 28th International Conference on Software Engineering, pages 636–640. ACM Press, 2006.