2012 International Symposium on Computer Applications and Industrial Electronics (ISCAIE 2012), December 3-4, 2012, Kota Kinabalu Malaysia

Applying Evolution Programming Search Based Software Engineering (SBSE) in Selecting the Best Open Source Software Maintainability Metrics

A. D. Bakar, A. B. Sultan, H. Zulzalil and J. Din
Faculty of Computer Science and Information Technology, University Putra Malaysia, 43400 Serdang, Selangor, Malaysia

documentation that can guide its practitioners. Maintenance in OSS is more than fixing bugs or changing or adding a business requirement; using the available code to design a new system for a given application context is also important. Bakar et al. [5] identify a list of metrics for measuring this aspect of the software development stage. This is clear evidence that practitioners are confronted not only with thousands of software products but also with dozens of metrics for measuring particular quality attributes. To control quality criteria such as maintainability, quantitative measures of software attributes are available in the scientific base of software engineering. For example, metrics for size, coupling, cohesion, modularity, algorithmic complexity and control flow structure have been identified as able to measure maintainability in OSS [4]. Although the literature acknowledges researchers who have sought and introduced quantitative and qualitative measures of software quality in terms of maintainability, extra effort is needed to put them into practice in the OSS archetype in order to reduce the growing effort and time of OSS adaptation. This paper presents the middle stage of ongoing research that proposes a metrics model which OSS practitioners could use to predict the maintainability effort of OSS products. The proposed idea is validated through the proven Chidamber and Kemerer (CK) [4] maintainability metrics suite, optimized based on the characteristics of the product to be measured.
Furthermore, the study elaborates the Evolutionary Programming (EP) search approach used to optimize the metrics for predicting the maintainability of OSS. This paper is organized as follows: Section II explains the concept of Search Based Software Engineering techniques. Section III describes Evolutionary Programming as one of the search techniques. Section IV compares the study with the underlying concept of the Evolutionary Programming approach. Section V introduces the novel findings and the validation analysis of the proposed idea using the CK metrics suite. Expected results are presented in Section VI and, finally, Section VII concludes the study.

Abstract - The nature of the Open Source Software (OSS) development paradigm forces individual practitioners and organizations to adopt software through a trial and error approach. This leads to the problem of adopting software and then abandoning it after realizing that it lacks qualities important to their requirements, or of facing difficulties in maintaining the software. These problems are caused by the lack of recognized guidelines to lead practitioners in selecting, out of the dozens of available metrics, the best metric(s) to measure the quality of OSS. In this study, the novel results provide guidelines that lead to the development of a metrics model that can select the best metric(s) to predict the maintainability of Open Source Software.

Keywords - Open Source Software, Search Based Software Engineering, Evolutionary Programming, Maintainability

I. INTRODUCTION
In software engineering, the crucial issue is to develop products that meet users' quality requirements. Maintainability is considered one of the most sensitive external quality criteria of a software product. Because this phase of software development consumes more effort, resources and time [1], several studies have sought to reduce the technological barriers to maintaining a product. In the classical (conventional) software development paradigm, the maintainability of the product is decided during the development process. This differs considerably in Open Source Software (OSS), where maintainability is determined by practitioners after the product has been released. In addition, the nature of OSS practice adds extra challenges. For example, the profiles of the developers who participate in development are not known. The main philosophy behind this technology is the bazaar model, in which contributors from unknown locations contribute to the software without even knowing who manages it [2]. Thus, OSS practitioners who inherit ready-made code are required to select suitable software that they can maintain with minimum risk and impact. One of the advantages of Open Source Software is the plethora of data available for software research, because the practitioner has direct access to the source code, which can be used to measure any software attribute. However, according to Yu et al. [3] the technology lacks some important resources like

978-1-4673-3033-6/10/$26.00 ©2012 IEEE


II. SEARCH BASED SOFTWARE ENGINEERING TECHNIQUES
Search based approaches usually find an optimal or near-optimal solution in order to reduce the time consumed in search problems. This heuristic (iterative) technique is inspired by the steps of natural evolution: inheritance, mutation, recombination and selection. There are two concepts behind the method. The first is the solution space, that is, the optimization problem whose candidate solutions play the role of individuals in the population. The second is the fitness function, which determines the environment in which a potential solution lives; it measures how close a given design solution comes to achieving the set aims. Different heuristic algorithms, such as evolutionary algorithms, are used to solve complex problems as the search space grows [6]. Evolutionary computation ranges over Genetic Algorithms (GA) [7], Evolutionary Programming (EP), Evolution Strategies (ES) and Genetic Programming (GP) [8][9]. These approaches differ in their implementation details and problem domains, although all of them lie in the field of software engineering and build on the biological evolution concepts of selection and mutation within the selected population and of survival of the fittest, which produces the optimal solution in a reasonable computational time.

Evaluate $P'(t) := \{x'_1(t), \ldots, x'_\mu(t)\}$: $\Phi(P'(t)) = (\Phi(\{x'_1(t)\}), \ldots, \Phi(\{x'_\mu(t)\}))$;

Select $P(t+1) := s(P(t) \cup P'(t))$; $t = t + 1$;

IV. ANALOGOUS TO THE STUDY
In this study, an individual gene (metric) is first selected from a group of OSS maintainability metrics (chromosome), as shown in Fig. 1. The metric is mutated in order to produce the best metric (parent), which then changes to produce offspring that are passed on to participate in the next generation. This process continues until the best metrics for measuring the OSS are found.

III. EVOLUTION PROGRAMMING SEARCH TECHNIQUES
This study used EP to optimize the metrics for predicting the maintainability of OSS. Evolutionary Programming, like any other evolutionary search algorithm, is based on mutation-selection strategies. The major difference from GA and ES is that there is no recombination: EP works with a simple mutation and self-adaptation scheme in which the mutation of a single parent provides one offspring, and the survivors for the next generation are chosen by a probabilistic selection technique [9] [11] [12]. In this approach, a population of potential solutions applies the principle of survival of the fittest to produce better and better approximations to a solution. In this optimization technique, a group of candidate solutions for a given quality function is created randomly, and elements of the function domain are then selected based on their fitness, as shown in Fig. 1. The fitness measure is applied to the quality function, and the candidates with the best fitness are selected for the next generation. The production of new candidates is again based on fitness. The process is repeated until the candidates meet the maximum quality solution or the computational limit is reached [7] [8] [9] [13]. The pseudo code of the Evolutionary Programming algorithm is given below.
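As a rough, runnable illustration of the cycle just described (random initial population, one offspring per parent by mutation, fitness-based survival, and termination on a quality target or a computational limit), a minimal Python sketch might look as follows. The names `evolve` and `sphere`, all parameter values, and the choice to keep the best half of parents plus offspring are assumptions made for illustration; they are not taken from the paper, and classical EP typically uses a probabilistic tournament instead of this simple truncation.

```python
import random


def sphere(x):
    """Hypothetical quality function to minimise (0 is the best possible score)."""
    return sum(v * v for v in x)


def evolve(fitness, dim, pop_size=20, sigma=0.3, max_gens=100, target=1e-6, seed=0):
    """Mutation-only evolutionary loop; returns the best candidate and its score."""
    rng = random.Random(seed)
    # Create the initial population randomly.
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(max_gens):  # computational limit
        # Each parent produces exactly one offspring by mutation (no recombination).
        offspring = [[v + rng.gauss(0.0, sigma) for v in parent]
                     for parent in population]
        # Survival of the fittest: keep the best pop_size of parents and offspring.
        population = sorted(population + offspring, key=fitness)[:pop_size]
        # Stop early once a candidate meets the quality target.
        if fitness(population[0]) <= target:
            break
    best = min(population, key=fitness)
    return best, fitness(best)


best, score = evolve(sphere, dim=3)
print(best, score)
```

Because parents compete with their own offspring, the best score in this sketch can never get worse from one generation to the next, which is the "better and better approximations" behaviour the text refers to.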

Figure 1. Open Source Software optimization maintainability quality model

The main aim is to allow the EP to decide which metric, out of the dozens available, is suitable for predicting the maintainability of the selected OSS. A metric or group of metrics from a suite is selected as an individual (parent), and the metric is mutated to produce new offspring based on the properties of the selected product; if the resulting metric exceeds the minimum requirement for survival, it supersedes its predecessor. The process is repeated until the most suitable metric is found. For clarification, classical Evolutionary Programming (CEP) is stated in four steps as follows [12] [13].

Initialization: Randomly generate an initial population of N individuals (metrics).

Mutation process: From each individual X, generate a new individual X' using a Gaussian mutation (CEP), that is, $X' = X + N(\mu, \sigma)$, where $\mu$ is the mean and $\sigma$ is the standard deviation. Assuming $\mu = 0$ for the random variable, the equation becomes

Initialize $t := 0$;
Initialize $P(0) := \{x_1(0), \ldots, x_\mu(0)\} \in I^\mu$, where $I = \mathbb{R}^n$;
Evaluate $P(0)$: $\Phi(x_k(0)) = \delta(f(x_k(0)), v_k)$;
While termination criteria not fulfilled do
Mutate $x'_k(t) := m(x_k(t)) \;\; \forall k \in \{1, \ldots, \mu\}$;

$X' = X + N(0, \sigma)$.
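The Gaussian rule above, and the self-adaptive variant that the paper adopts from [12], in which every individual carries its own step sizes $\sigma_i$, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the example values of `tau` and `tau_prime`, and the representation of an individual as a plain list are hypothetical and not taken from the paper.

```python
import math
import random


def mutate_fixed(x, sigma, rng):
    """Classical rule X' = X + N(0, sigma), applied to every component."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]


def mutate_self_adaptive(x, sigmas, tau, tau_prime, rng):
    """Mutate the step sizes first, then use them to mutate the objective variables."""
    shared = rng.gauss(0.0, 1.0)  # one N(0, 1) draw shared by all components
    new_sigmas = [
        s * math.exp(tau * shared + tau_prime * rng.gauss(0.0, 1.0))
        for s in sigmas
    ]
    new_x = [xi + s * rng.gauss(0.0, 1.0) for xi, s in zip(x, new_sigmas)]
    return new_x, new_sigmas


rng = random.Random(1)
parent = [1.0, 2.0, 3.0]
child = mutate_fixed(parent, sigma=0.5, rng=rng)
# tau and tau_prime are illustrative values, not recommendations from the paper.
child_sa, steps = mutate_self_adaptive(parent, [0.5, 0.5, 0.5],
                                       tau=0.2, tau_prime=0.4, rng=rng)
print(child, child_sa, steps)
```

Multiplying each step size by an exponential keeps it strictly positive, so the search can shrink or grow its own mutation strength without ever producing an invalid standard deviation.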


coupling between classes. Both features determine the quality of the product as well as the effort of maintaining it. Since both determine the maintainability of the OSS to be adopted, WMC is an important metric for the maintainability of the OSS. It is equal to the total number of methods in the class. Software whose classes have more methods is less maintainable, since a class increases complexity when it invokes a method from another class. RFC measures the number of different methods that can be executed when an object of that class receives a message; complexity grows with the number of methods that can be invoked for that object. Software in which more methods refer to methods of other classes therefore requires extra effort to maintain. To calculate this value, one has to find each method of the class and the methods that the class calls, and then repeat this for each called method. DIT and NOC are related, since both deal with the connection between super-classes and sub-classes, and they also determine the maintenance effort of the overall adopted product. A deeper class hierarchy consumes more time and effort in testing and reading the system, which are important sub-characteristics of the maintainability quality attribute, since all descendant classes and methods must be handled. The CK suite has proved to be a major ensemble, has attracted many researchers in prototyping metrics implementations in the field of software engineering, and is a sensible representative of maintainability metrics suites for OSS as a quality focus [15] [16] [17]. In this study, the suite has been used as the objective context, and the Chidamber and Kemerer Java Metrics (CKJM) tool introduced by Spinellis [18] has been used as the bridge between the maintainability metric model and the object of the study. The object is the OSS product to be investigated.
The tool lists the value of each metric based on the individual attributes measured from the product. These values are then used in the EP search as ranking criteria for selecting the best metrics for the proposed OSS product. The selection criteria are determined by the benchmark stated for each metric.
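To give a feel for the kind of per-class values such a tool reports, the following Python sketch computes WMC, DIT and NOC for a small, hypothetical class hierarchy using introspection. It is only an analogy: CKJM itself analyses Java bytecode, and the class names, the unit weight per method, and the decision not to count the built-in `object` root in DIT are all assumptions made for this example.

```python
import inspect


def wmc(cls):
    """Weighted Methods per Class with unit weights: methods defined in cls itself."""
    return sum(1 for member in vars(cls).values() if inspect.isfunction(member))


def dit(cls):
    """Depth of Inheritance Tree: longest path to a root class, ignoring `object`."""
    parents = [base for base in cls.__bases__ if base is not object]
    return 0 if not parents else 1 + max(dit(base) for base in parents)


def noc(cls):
    """Number of Children: immediate subclasses only."""
    return len(cls.__subclasses__())


# Hypothetical hierarchy standing in for the classes of an OSS product.
class Shape:
    def area(self): ...
    def describe(self): ...


class Polygon(Shape):
    def sides(self): ...


class Square(Polygon):
    def area(self): ...


class Circle(Shape):
    def area(self): ...


report = {
    cls.__name__: {"WMC": wmc(cls), "DIT": dit(cls), "NOC": noc(cls)}
    for cls in (Shape, Polygon, Square, Circle)
}
for name, values in report.items():
    print(name, values)
```

Values of this kind, one row per class, are what would then be compared with the benchmark of each metric when ranking candidates in the search.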

In considering the mutation rule [12], we apply self-adaptation: the objective variables $x$ and the strategy parameters $\sigma$ are introduced, so that an individual becomes $a = (x, \sigma)$. The parameters are used to mutate the individual as follows:

$\sigma'_i = \sigma_i \cdot \exp[\tau \cdot N(0,1) + \tau' \cdot N_i(0,1)]$
$x'_i = x_i + \sigma'_i \cdot N(0,1)$

where $x_i$ is the ith component of the real-valued vector representation $x$, $\sigma_i$ is the step size for the ith component, $\tau$ and $\tau'$ are operator-set parameters, and $N(0,1)$ represents the standard normal (Gaussian) probability distribution.

Selection process: In this step, based on the objective function, the performance of the individual in hand (the selected metric) is compared with that of the new generation one by one, and the better individuals are retained to constitute the next generation.

Halting criteria: This is the point that determines whether or not the termination condition is satisfied. If so, the search process stops; otherwise the above process is repeated from step 2.

More broadly, the scenario can be stated as follows: the user attempts to compute an optimal or near-optimal solution to a problem through simulated evolution [14]. In this process a solution (a set of software metrics in our case) is encoded in a gene, and a collection of genes (the best solutions, which are the best metrics in our study) constitutes the population. EP uses natural selection and genetics based on Darwinian evolution as the basis for searching for the optimal gene, which is the set of software metrics that gives the fittest solution (metric). A population of solutions is modified using mutation and self-adaptation to produce offspring. The fittest individual is selected to continue into the next generation.

V. NOVEL FINDINGS AND VALIDATION ANALYSIS USING CK METRIC SUITE
In this part, the object-oriented metrics proposed by Chidamber and Kemerer [4] were studied and analyzed. The suite is composed of the following metrics:

Weighted Methods per Class (WMC): the total number of methods defined in a class.
Response for a Class (RFC): the count of the methods that can potentially be invoked in response to a message received by an object of a particular class.
Coupling Between Object classes (CBO): the count of the number of classes to which a class is coupled. It designates the interdependency of one class on other classes.
Lack of Cohesion of Methods (LCOM): the total number of method pairs whose similarity is zero, minus the count of method pairs whose similarity is not zero.
Depth of Inheritance Tree (DIT): the height of the class in the inheritance tree.
Number of Children (NOC): the total number of subclasses that inherit methods from a super-class.

The main concept of object-oriented development is to build highly cohesive classes and maintain loose

VI. EXPECTED RESULTS
The main contribution of this research is the introduction of a new FOSS maintainability quality model based on the maintainability attribute. In contributing to this body of knowledge, Evolutionary Programming has been proposed as a component of a Search Based Software Engineering approach. This yields a model that simplifies the selection of the best metrics for predicting the maintainability of OSS. The study combines two different disciplines: classical evolution from biology and the foundations of software engineering. Merging these techniques, by bringing the philosophy of biological evolution into the OSS discipline, provides a wide range of constructive solutions to complex selection problems. Furthermore, in classical object-oriented software development, a product can contain more than one


property. Each property uses its own metrics to measure a single quality attribute. Therefore, for an OSS product with multiple attributes, the Evolutionary Programming SBSE technique is a better way to select the best metric or set of metrics for predicting the maintainability quality characteristic. This selection is determined by the merged, prioritized complexity attributes of the selected software and metrics [19] [20].

VII. CONCLUSION
This article provided insight into how an Open Source Software metrics model has been designed to select the best metric(s) for predicting the maintainability of a product. The study also showed how Search Based Software Engineering techniques can be used to solve this kind of problem, using the CK suite as the case of the study. The next step in this ongoing work is the empirical validation of the discussed model.

ACKNOWLEDGMENT
The authors would like to thank S. A. Mohammed for his comments. The authors would also like to thank the anonymous reviewers for their valuable suggestions for this work.

REFERENCES
[1]. I. M. SoI, "Software Complexity: An Aid to Software Maintainability", Microelectronics Reliability, Vol. 25, pp. 223-228, 1985.
[2]. E. Raymond, "The Cathedral and the Bazaar", Knowledge, Technology & Policy, Vol. 12, pp. 23-49, 1999.
[3]. L. Yu, S. Schach and K. Chen, "Measuring the Maintainability of Open-Source Software", IEEE, pp. 297-303, 2005.
[4]. S. R. Chidamber and C. F. Kemerer, "A Metrics Suite for Object-Oriented Design", IEEE Trans. Software Eng., Vol. 20, pp. 476-493, 1994.
[5]. A. D. Bakar, A. B. Md. Sultan, H. Zulzalil and J. Din, "Review of 'Maintainability' Metrics in Open Source Software", International Review on Computers and Software, Vol. 7, pp. 903-908, 2012.
[6]. R. Cheong, "A Comparison between Genetic Algorithms and Evolutionary Programming based on Cutting Stock Problem", Engineering Letters, Vol. 14, 2007.
[7]. J. H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor, 1975.
[8]. H. G. Beyer, The Theory of Evolution Strategies, Springer, 2001.
[9]. T. Bäck, Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Algorithms and Genetic Algorithms, Oxford University Press, 1994.
[10]. G. B. Fogel and D. B. Fogel, "Continuous Evolutionary Programming: Analysis and Experiments", International Journal of Cybernetics and Systems, Vol. 25, 1995, pp. 7990.
[11]. X. Yao, Y. Liu and G. Lin, "Evolutionary programming made faster", IEEE Trans. Evolutionary Computation, Vol. 2, pp. 82-102, 1999.
[12]. T. Bäck and H.-P. Schwefel, "An overview of evolutionary algorithms for parameter optimization", Evolutionary Computation, Vol. 1, pp. 1-23, 1993.
[13]. L. J. Fogel, Intelligence through Simulated Evolution: Forty Years of Evolution Programming, Wiley-Interscience, 1999.
[14]. D. Choi and S. Oh, "A New Mutation Rule for Evolutionary Programming Motivated from Back propagation Learning", IEEE Transactions on Evolutionary Computation, Vol. 4, pp. 188-190, 2000.
[15]. R. Harrison, L. G. Samaraweera, M. R. Dobie and P. H. Lewis, "An evaluation of code metrics for object oriented programs", Information and Software Technology, Vol. 38, pp. 443-450, 1996.
[16]. J. Dallal, "Transitive-based Object-oriented Lack-of-cohesion Metric", Procedia Computer Science, Vol. 3, pp. 1581-1587, 2011.
[17]. S. K. Dubey and A. Rana, "Assessment of Maintainability Metrics for Object-Oriented Software System", ACM SIGSOFT Software Engineering Notes, Vol. 36, pp. 1-7, 2011.
[18]. D. Spinellis, "Tool Writing: A Forgotten Art?", IEEE Software, pp. 9-11.
[19]. R. Harrison, S. Counsell and R. Nithi, "Experimental assessment of the effect of inheritance on the maintainability of object-oriented systems", The Journal of Systems and Software, Vol. 52, pp. 173-179, 2000.
[20]. H. Wang, T. M. Khoshgoftaar and N. Seliya, "How Many Software Metrics Should be Selected for Defect Prediction?", Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference, Association for the Advancement of Artificial Intelligence, pp. 69-74, 2011.
