The Importance of Dealing with Uncertainty in the Evaluation of Software Engineering Methods and Tools

Gerardo Canfora and Luigi Troiano
RCOST – Research Centre on Software Technology
Department of Engineering, University of Sannio
Palazzo ex Poste, via Traiano – 82100 Benevento, Italy
(canfora/troiano)@unisannio.it

ABSTRACT

The correct choice of software tools and methods is a critical success factor in reaching and maintaining market leadership. A mature approach to estimating the impact and risk of technology adoption is required. This paper underlines the need to deal with uncertainty in order to manage the risk of decision making correctly, and proposes a method for evaluating software engineering methods and tools. The method, named Software Engineering Fuzzy Evaluation Method (SEFEM), is centred on a new class of fuzzy aggregators named Ordered Fuzzy Number Weighted Averaging (OFNWA).

Categories and Subject Descriptors
H.4.2 [Information Systems Applications]: Types of Systems – Decision support. I.2.3 [Artificial Intelligence]: Deduction and Theorem Proving – Answer/reason extraction, Uncertainty, "fuzzy," and probabilistic reasoning.

General Terms
Management, Measurement, Experimentation.

Keywords
Decision Support Systems, Decision Making, Evaluation, Aggregation, Fuzzy Logic.

1. INTRODUCTION

Information Technology (IT) is pervasive in all modern organizations, and its strategic importance is well known. The rapid development of communication infrastructures is changing the structure of markets, the way of doing business, and working practices, leading to the Information Society. Due to its strategic nature, IT does not merely support traditional activities, making them cost and time effective; it is more and more the focus around which new business models and processes are conceived and built.

The correct choice of software tools and methods is today among the most critical success factors in reaching and maintaining market leadership, so a mature approach to estimating the impact and risk of technology adoption is required. Such an approach should take several aspects into account. The most relevant is that evaluation is based on qualitative elements that can hardly be modelled by classical mathematical approaches. Indeed, most of the risk associated with an evaluation process derives from the imprecision of the available information and the consequent uncertainty of the judgments expressed. In the last 30 years several authors have highlighted the need to take advantage of the imprecision of information in decision making, as a means for assessing and managing risks.

A second key problem is the definition of a systematic and repeatable evaluation process. In addition to minimizing uncertainty and risk, a systematic process is key to learning from the experience gained.

This paper underlines the necessity of dealing with uncertainty to manage decision risk correctly, and proposes a solution based on fuzzy mathematics.

The paper illustrates some key aspects of the Software Engineering Fuzzy Evaluation Method (SEFEM), an integrated evaluation methodology developed at the Research Centre on Software Technology (RCOST) of the University of Sannio. SEFEM is centred on an integrated definition of the evaluation process, the application of a new class of fuzzy aggregators named Ordered Fuzzy Number Weighted Averaging (OFNWA), and the use of a decision support tool known as decision graphs. To show the value of dealing with the uncertainty of verbal judgments, after a brief description of the mathematical model underlying SEFEM, the paper presents results of applying the method to a concrete case study. A more detailed description of SEFEM can be found in [1]; references [2,3] describe OFNWA and its applications in detail.

The remainder of this paper is organised as follows. Section 2 discusses related work, while Section 3 rationalizes the concept of uncertainty and shows the importance of dealing with it to manage decision risk. Section 4 illustrates a solution based on fuzzy mathematics. Section 5 presents the application of the method to the selection of a Software Configuration Management (SCM) system. Section 6 draws conclusions and looks forward to future work.


2. SOFTWARE ENGINEERING RELATED WORK

Evaluation is in itself a wide topic, ranging from rigorous statistical and mathematical techniques to social and behavioural models. Here we focus on solutions applied to evaluate software engineering objects. The evaluation of software engineering products is a complex activity that cannot be underestimated. It involves such main tasks as criteria identification and selection, assessment, and aggregation.

We can categorize attributes into functional and non-functional. The former regard the operation of the software, comprising all characteristics desired for the project. The difficulty of finding a common set of attributes for a wide class of projects may explain the scarce literature on the identification of functional attributes. A larger body of literature describes different sets of non-functional attributes. A reference standard is ISO 9126 [4]; top-level attributes in this standard are functionality, reliability, usability, efficiency, maintainability, and portability. Other contributions on this topic are Bergmann [5], Boehm [6,7,8], Loral Federal Systems [9], Maiden [10], and Robertson [11].

Once attributes are identified, their assessment can be made in different ways: looking at documentation, experimentation, and expert consulting are typical solutions. Some interesting works are Bergmann [5], Herschel [12], Miller [13], and Wikenheiser [14].

Aggregation is the information synthesis step in evaluating software. Starting from empirical knowledge, aggregation comprises the methods and techniques used to produce synthetic knowledge about the overall assessment of the objects examined. The literature on this topic usually refers to Multi-Attribute Utility Theory (MAUT) [19,29]. A widely studied method is the Analytic Hierarchy Process (AHP) [20,28,26,35]; comparisons between MAUT and AHP can be found in [21,22,23,24,27]. Other interesting contributions are Anderson [15], Jeanrenaud [30], Mayrand [34], Morisio [16], Rivett [36], and Roy [18,17]. Several other papers can be found in the literature that are not mentioned here. In all these methods, a numerical value is somehow derived from each attribute assessment; usually, weights are assigned to the attributes, and the weighted attributes are numerically combined to produce an overall score. We did not find any paper on the application of fuzzy mathematics to the evaluation of software engineering products.

Another shortcoming of software engineering evaluation techniques is their scarce integration: often, research efforts focus on a single aspect, missing the global vision of evaluation as a business process. In the attempt to provide a more integrated approach, the DESMET (Determining an Evaluation methodology for Software MEthods and Tools) initiative by the UK Department of Trade and Industry (DTI) [31] stands out. The principal aim of DESMET was to "address the problem of the objective determination of the effects and effectiveness of the methods and tools for the development of SW based systems". As a result, the DESMET project developed a comprehensive evaluation methodology that enables independent and consistent evaluations. To accomplish this, DESMET offers a common terminology for describing how an evaluation is performed, and by which methods and tools it can be achieved. DESMET identifies a number of different quantitative and qualitative approaches. The quantitative methods are: Experimental Design and Analysis; Case Study Design and Analysis; Survey Design and Analysis. The qualitative methods are: Feature Analysis; Qualitative Effects Analysis. For each evaluation type, DESMET identifies how to plan and execute it, its advantages and disadvantages, its requirements (time, cost, ...), and the DESMET maturity level necessary for a successful evaluation. Kitchenham [32,33] offers an interesting review of these evaluation techniques.

The main merit of DESMET lies in the attempt to develop a common framework for evaluation in the software engineering domain. However, the aggregation technique is still based on (crisp) numeric combination, and the necessity of dealing with the uncertainty and vagueness of assessments is not considered at all. SEFEM builds on the DESMET framework and moves on to define an integrated business process. The management of uncertainty is central to the SEFEM aggregation and analysis steps: indeed, the aim of a well-settled evaluation is to make risk more predictable, and SEFEM is built around the idea that taking uncertainty into account is the key to better risk estimation.

3. UNCERTAINTY AS A SOURCE OF INFORMATION

"Risk" and "uncertainty" are two terms basic to any decision making framework.

"Uncertain: Not certain; doubtful. (a) Not known in regard to nature, qualities, or general character (b) Not known as regards quantity or extent; indefinite; problematical; (c) Having doubts; without certain knowledge; not sure. (d) Not sure as to aim or effect desired (e) Unreliable; insecure; not to be depended on (f) Not firm or fixed; vague; indeterminate in nature (g) Undecided; hesitating; not resolved (h) Not steady; fitful (i) Liable to change; fickle; inconstant; capricious; irresolute"

"Uncertainty: 1. The character or state of being uncertain; want of certainty (a) Of things: the state of not being certainly known; absence of certain knowledge; doubtfulness; want of reliability; precariousness (b) Of persons: a state of doubt; a state in which one knows not what to think or do; hesitation; irresolution. 2. Something not certainly or exactly known; anything not determined, settled, or established; a contingency." (Century Dictionary, Vol. VIII, page 6586)

"Risk: Hazard; danger; peril; exposure to mischance or harm; venture [...] In com.: The hazard of loss of ship, goods or other property. [...]" (Century Dictionary, Vol. VI, page 5194)


Many different definitions of risk and uncertainty can be found in the literature, usually framed within probability theory. With reference to the notion commonly used in modern decision theory, e.g. employed by Tversky and Kahneman [37] and proposed much earlier by Knight [38], risk can be defined as imperfect knowledge where the probabilities of the possible outcomes are known, e.g. in a fair roulette game, while uncertainty exists when these probabilities are not well known. But uncertainty may also derive from other information deficiencies, such as incompleteness, impreciseness, and vagueness. A more general usage of these terms would define uncertainty as imperfect knowledge and risk as uncertain consequences. Risk and uncertainty are then strictly connected, because an incomplete knowledge of the characteristics of products entails a higher risk in choosing one alternative.

The objective of any evaluation is to provide an overall judgment for each alternative, starting from a set of elementary assessments on basic criteria. The evaluation is then used to make a choice among the alternatives, so that it is basically a Multi-Criteria Decision Making (MCDM) problem. Therefore, central to the evaluation is an aggregation step.

4. A FUZZY APPROACH

An MCDM problem can be mathematically formulated as follows. Let $X = \{X_1, \ldots, X_m\}$ be a set of alternatives and $C_1, \ldots, C_n$ a collection of criteria to rate the alternatives $X_i \in X$. On each criterion $C_j$ we can express a judgment $G_j$ that gives a measure of the criterion satisfaction. Thus, the evaluation $E_i$ for alternative $X_i$ will be a function of $G_1, \ldots, G_n$. Formally:

$$E_i = f(G_1, \ldots, G_n) \qquad (1)$$

where $f$ aggregates $G_1, \ldots, G_n$ to return the aggregate judgment $E_i$.
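For illustration only, Eq. (1) can be instantiated with familiar aggregation functions; the judgment values below are hypothetical, and the three choices of f show how conjunctive, compromising, or disjunctive an evaluation can be.

```python
# Hypothetical satisfaction degrees G_1, G_2, G_3 in [0, 1] for one alternative.
judgments = [0.8, 0.6, 0.9]

# Three possible choices of the aggregation function f in Eq. (1):
and_like = min(judgments)                  # conjunctive: every criterion must hold
average = sum(judgments) / len(judgments)  # compromise between the two extremes
or_like = max(judgments)                   # disjunctive: one good criterion suffices

print(and_like, average, or_like)          # 0.6 0.766... 0.9
```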

The basic judgments $G_1, \ldots, G_n$ can represent quantitative data or, more frequently, qualitative judgments. More than in other engineering fields, evaluations of software engineering products are based on qualitative attributes; the ISO 9126 set [4] is typical. Usually such sets of attributes are structured in several levels, showing the dependency of more complex attributes on simpler ones. Once the attribute set is chosen and an initial assessment is made, the alternatives are scored according to various numerical techniques [19,29,20,15,30,34,16,36,18,17]. In all these methods, a (crisp) numerical value is derived and then aggregated. Traditional techniques thus produce a reduction of uncertainty and, consequently, are not able to capture the information richness of human communication and cognition. For instance, having an optimistic or pessimistic attitude can influence the overall judgment. Yager [39] defines a measure of optimism, termed orness (σ), for a wide class of averaging-based aggregators. The arithmetic average, widely used in traditional techniques, corresponds to a middle level of optimism (σ = 0.5); therefore, using the arithmetic average entails fixing the level of optimism, even if different contexts may require being more or less conservative.

The partiality of knowledge raises an ontological issue. The information is completed by the decision maker a posteriori, according to his/her knowledge and experience, ignoring part of the a priori knowledge because it is not available anymore. Even if the evaluator and the decision maker are the same person, this problem cannot be solved: indeed, the resulting knowledge is produced by an external entity (the aggregation model applied) that is not part of the evaluator/decision maker experience. In any case an important source of information is cut off. This increases the decision risk, because it makes consequences more unpredictable. Moreover, the result produced is too synthetic, usually consisting of a numerically ranked list; the only way to use such information is to adopt the alternatives according to their position in the list, and any other approach can be based only on the personal experience of the decision maker, outside the knowledge produced by the aggregation process.

Such basic limits of the methods presented in the literature can reduce their effectiveness. Taking into account the vagueness of judgments can improve the quality of the knowledge produced by the aggregation and provide better support to the decision maker.

A well-known solution for dealing with uncertainty is fuzzy mathematics [40,41,42]. In the literature it is possible to find several proposals of fuzzy aggregators (see [43] for a survey). A class of widely used aggregators, known as Ordered Weighted Averaging (OWA), has been proposed by Yager [39]. This class of aggregators offers an ideal bridge between and-like operators (t-norms) and or-like operators (s-norms).
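A minimal sketch of an OWA operator and of Yager's orness measure follows; the weight vectors are illustrative, and only the two formulas are taken from [39].

```python
# Yager's OWA operator [39]: reorder the arguments decreasingly, then take a
# weighted sum. The orness of a weight vector measures its optimism: 1 for the
# max operator (pure or-like), 0 for the min operator (pure and-like).

def owa(values, weights):
    """OWA aggregation of crisp values with a given weight vector."""
    ordered = sorted(values, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))

def orness(weights):
    """Yager's optimism measure sigma of an OWA weight vector."""
    n = len(weights)
    return sum((n - i) * w for i, w in enumerate(weights, start=1)) / (n - 1)

judgments = [0.8, 0.6, 0.9]
print(owa(judgments, [1.0, 0.0, 0.0]), orness([1.0, 0.0, 0.0]))  # 0.9, 1.0 (max)
print(owa(judgments, [0.0, 0.0, 1.0]), orness([0.0, 0.0, 1.0]))  # 0.6, 0.0 (min)
w_avg = [1/3, 1/3, 1/3]
print(owa(judgments, w_avg), orness(w_avg))  # arithmetic mean, orness 0.5
```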

Related to OWA operators, Yager [44] proposed an importance qualification model for the OWA aggregation that is based on logical rules and importance operators. Importance operators are functions

$$I = I(v, a), \quad I : [0,1] \times [0,1] \to [0,1] \qquad (2)$$

that transform the judgment ($a$) according to the importance level ($v$) associated with the related criterion. Depending on the nature of the aggregation, we can define different importance operators $I_T$ for and-like and $I_S$ for or-like aggregations, such that

$$I_T(0, a_i) = 1, \quad I_T(1, a_i) = a_i \qquad\qquad I_S(0, a_i) = 0, \quad I_S(1, a_i) = a_i \qquad (3)$$

This leads to three rules to determine the effective satisfaction for each criterion:

i1: if importance($C_i$) is high then $g_i = a_i$
i2: if importance($C_i$) is low and $F$ is and-like then $g_i = 1$
i3: if importance($C_i$) is low and $F$ is or-like then $g_i = 0$

Following the Sugeno [45] method, the effective judgment is computed as

$$g_i = \frac{\tau_1 \cdot a_i + \tau_2 \cdot 1 + \tau_3 \cdot 0}{\tau_1 + \tau_2 + \tau_3} \qquad (4)$$

where $\tau_1, \tau_2, \tau_3$ are the firing levels of rules i1, i2, i3:

$$\tau_1 = v_i, \qquad \tau_2 = (1 - v_i)(1 - \sigma), \qquad \tau_3 = (1 - v_i)\,\sigma, \qquad \tau_1 + \tau_2 + \tau_3 = 1$$
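The following sketch computes the effective satisfaction of Eq. (4); the variable names are ours, and the firing levels are exactly those given above.

```python
# Importance-qualified satisfaction of Eq. (4), with the firing levels
# tau1 = v, tau2 = (1 - v)(1 - sigma), tau3 = (1 - v) * sigma given in the text.

def effective_judgment(a, v, sigma):
    """Transform judgment a in [0,1] by importance v in [0,1] at orness sigma."""
    tau1 = v                       # rule i1: importance is high -> keep a
    tau2 = (1 - v) * (1 - sigma)   # rule i2: low importance, and-like -> push to 1
    tau3 = (1 - v) * sigma         # rule i3: low importance, or-like  -> push to 0
    return (tau1 * a + tau2 * 1 + tau3 * 0) / (tau1 + tau2 + tau3)

# A fully important criterion is left untouched; an unimportant one is driven
# towards the neutral element of the aggregation (1 for and-like, 0 for or-like).
print(effective_judgment(0.4, 1.0, 0.0))  # 0.4  (v = 1: judgment preserved)
print(effective_judgment(0.4, 0.0, 0.0))  # 1.0  (v = 0, and-like aggregation)
print(effective_judgment(0.4, 0.0, 1.0))  # 0.0  (v = 0, or-like aggregation)
```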

In [2] some limits of OWA operators have been highlighted. OWA aggregates crisp numbers, so that an early defuzzification step is required; usually, the fuzzy expected value (FEV) is used to defuzzify the fuzzy numbers representing verbal assessments. However, early defuzzification does not preserve all the information available for decision making, and particularly the uncertainty of the individual judgments. In addition, OWA aggregation produces a misrepresentation of the judgments (called judgment relocation) when it deals with criteria that have a different weight in the overall decision. Such drawbacks may have a significant impact on decision support systems. In [2] we introduced a new class of aggregators, named Ordered Fuzzy Number Weighted Averaging (OFNWA). An OFNWA operator is defined as

$$F(A_1, \ldots, A_n) = \sum_{i=1}^{n} w_i B_i \qquad (5)$$

where $A_1, \ldots, A_n$ are fuzzy numbers modeling verbal judgments, and $B_1, \ldots, B_n$ is a decreasing ordering of $A_1, \ldots, A_n$.
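As an illustrative sketch, Eq. (5) can be applied to triangular fuzzy numbers represented as triples (x_L, x_M, x_R); since the text does not spell out the ordering criterion for fuzzy numbers, ordering by prototypal value is assumed here.

```python
# Illustrative sketch of Eq. (5) on triangular fuzzy numbers (l, m, r), where m
# is the prototypal value and [l, r] the support. Ordering by prototypal value
# is our simplifying assumption, not a detail taken from the paper.

def ofnwa(fuzzy_judgments, weights):
    """Weighted sum of the decreasingly ordered fuzzy numbers (triangular arithmetic)."""
    ordered = sorted(fuzzy_judgments, key=lambda t: t[1], reverse=True)  # B_1 >= ... >= B_n
    return tuple(
        sum(w * b[k] for w, b in zip(weights, ordered)) for k in range(3)
    )

# Two "middle-ish" judgments on scales of different granularity keep their own
# spread, and the aggregate is again triangular: uncertainty is propagated,
# not discarded by an early defuzzification.
a1 = (0.25, 0.50, 0.75)   # "middle" on a coarse 3-level scale: wide support
a2 = (0.55, 0.60, 0.65)   # a finer assessment: narrow support
print(ofnwa([a1, a2], [0.5, 0.5]))  # approx (0.40, 0.55, 0.70)
```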

Moreover, in [2] an appropriate importance model has been introduced. Such a model is based on the iterative application of two logical assertions:

p1: if importance($C_i$) is high then the aggregation should consider $C_i$
p2: if importance($C_i$) is low then the aggregation can ignore $C_i$

Following a reasoning similar to the one proposed by Yager, the aggregate judgment is defined as

$$F = \sum_{C' \subseteq C} \tau_{C'} F_{C'} \qquad (6)$$

where

$$F_{C' \subseteq C} = F(C'_1, \ldots, C'_p) = F\{C'_i\}_{i=1,\ldots,p}, \qquad C' = \{C'_i\}_{i=1,\ldots,p} \subseteq C = \{C_i\}_{i=1,\ldots,n} \qquad (7)$$

indicates an OFNWA aggregation on the criteria $C'_1, \ldots, C'_p$ according to Eq. (5), and

$$\tau_{C' \subseteq C} = \prod_{C'_i \in C'} \tau_{C'_i}, \qquad C' = \{C'_i\}_{i=1,\ldots,p} \subseteq C = \{C_i\}_{i=1,\ldots,n} \qquad (8)$$

We assume the element $F_\emptyset$ to be a primitive entity, called the undetermined element and denoted as

$$F_\emptyset = \varnothing \qquad (9)$$

so that we can write Eq. (10):

$$F = \tau_{1,2}\tau_{2,2}\cdots\tau_{n,2} \cdot \varnothing + (1 - \tau_{1,2}\tau_{2,2}\cdots\tau_{n,2}) \cdot F_N = \xi_J \cdot I + (1 - \xi_J) \cdot F_N \qquad (10)$$

As observed in [2], $I$ can be assimilated to the concept of total ignorance. The coefficient $\xi_J = \tau_{1,2}\tau_{2,2}\cdots\tau_{n,2}$ is called the judgment indeterminateness and represents the vagueness of the aggregation deriving from how many criteria are relevant for the evaluation. In addition, we call the coefficient

$$\zeta_J = (1 - \xi_J)$$

the judgment determinateness, so that we can write, more conveniently,

$$F = \xi_J \cdot I + \zeta_J \cdot F_N \qquad (11)$$

We call this kind of number a fuzzy number with indeterminateness. By equating Eq. (6) with Eq. (11),

$$F_N = \frac{1}{1 - \xi_J} \sum_{\substack{C' \subseteq C \\ C' \neq \emptyset}} \tau_{C'} F_{C'} \qquad (12)$$

While $I$ represents total ignorance, $F_N$ keeps all the qualitative information on the goodness of the aggregate judgment: it is a fuzzy number that takes into account the impreciseness of the judgments used for the aggregation. Instead, the determinateness $\zeta_J$ gives indications about the strength of the aggregate judgment, while the indeterminateness $\xi_J$ represents the weight of ignorance about the aggregated judgments and their relevance and strength.
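A small sketch of the indeterminateness coefficients follows; it assumes that the firing level of assertion p2 for criterion C_i is 1 − v_i, with v_i the importance of C_i, which is our reading of the model rather than an explicit prescription of the text.

```python
# Indeterminateness coefficients of Eqs. (10)-(11), assuming tau_i2 = 1 - v_i
# (the firing level of "ignore C_i") for importance v_i; this is our assumption.
from math import prod

def indeterminateness(importances):
    """xi_J: the weight of the undetermined element, i.e. the chance that
    every criterion is ignored by the aggregation."""
    return prod(1 - v for v in importances)

importances = [0.9, 0.7, 0.8]          # hypothetical relevance of three criteria
xi = indeterminateness(importances)    # 0.1 * 0.3 * 0.2, approx 0.006
zeta = 1 - xi                          # judgment determinateness, Eq. (11)
print(xi, zeta)
```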

The result obtained in this way can be aggregated to a higher level; details on higher-level aggregations can be found in [1]. SEFEM aggregation is based on this class of aggregators.

SEFEM is structured in four main phases: requirements, planning, execution, and analysis. The requirements phase defines the goals and constraints of the evaluation process. Planning is when the alternatives, the criteria, and their importance are identified, and all evaluation tests are defined in detail. Execution is when all tests are executed and the results are collected. Analysis is the phase in which a (fuzzy) aggregated result is produced for each alternative to support a comparative analysis. Each phase is characterized by several documents: some are input to the phase, others are produced as output. Two documents are particularly important: the evaluation graph and the analysis report. The evaluation graph is a semiformal graphic description of the aggregation process. The analysis report is provided in natural language as the final result, to describe the outcome of the evaluation; the analysis is conducted by the evaluator(s) using specific decision graphs, which summarize all the information and provide a comparison tool for the decision maker.



5. CASE STUDY: EVALUATING SOFTWARE CONFIGURATION MANAGEMENT TOOLS

We applied SEFEM to evaluate four Software Configuration Management (SCM) tools: PVCS, ClearCase, SourceSafe, and Continuus/CM [1]. The evaluation was based on the 23 criteria listed in Tab. 2, and aggregation was performed at different levels; for instance, C13, C14, C15, and C16 were aggregated into the Usability attribute, which was in turn aggregated with the Reliability and Maintainability attributes to evaluate the complex property "reducing development time".

SEFEM provides several ways to model verbal judgments. One of these consists of choosing a verbal assessment on one of several verbal scales. For instance, it is possible to choose the assessment "middle" on the scale [bad] / [middle] / [good] or on a more detailed scale such as [very bad] / [bad] / [middle] / [good] / [very good], so that a different level of uncertainty can be assumed for the assessment. Such verbal scales are translated into fuzzy numbers by making some "reasonable" assumptions: we assume verbal assessments are represented by equidistant, symmetric, triangular fuzzy numbers, as shown in Fig. 1. A synthetic notation for these assessments is a couple of integers [k/p], where k is the assessment and p is the cardinality of the scale; the membership function of the corresponding triangular fuzzy number is easily derived from this couple of integers. In this work, the relevance of the criteria has also been evaluated in this way.

Fig. 1 – Examples of judgment scales
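One plausible reading of the [k/p] notation is sketched below; the exact mapping is not spelled out in the text, so placing the prototypes at (k - 1)/(p - 1), with supports reaching the adjacent prototypes, is an assumption.

```python
# A minimal sketch of one plausible translation of [k/p] into an equidistant,
# symmetric, triangular fuzzy number on [0, 1]: levels k = 1..p, prototypes at
# (k - 1)/(p - 1), supports reaching the adjacent prototypes. This mapping is
# our assumption, not a detail taken from the paper.

def verbal_to_triangular(k, p):
    """[k/p] -> (x_L, x_M, x_R): a coarser scale (small p) yields a wider support."""
    step = 1.0 / (p - 1)              # distance between adjacent prototypes
    x_m = (k - 1) * step              # prototypal value of the k-th level
    x_l = max(0.0, x_m - step)        # support clipped to [0, 1]
    x_r = min(1.0, x_m + step)
    return (x_l, x_m, x_r)

# The same "middle" judgment carries more uncertainty on the coarser scale:
print(verbal_to_triangular(2, 3))  # middle of a 3-level scale: (0.0, 0.5, 1.0)
print(verbal_to_triangular(3, 5))  # middle of a 5-level scale: (0.25, 0.5, 0.75)
```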

The survey presented in this work has been conducted, for simplicity, only by checklists. Several techniques to deal with checklist assessments are present in the literature; here, the final score (k) is estimated by counting how many items are satisfied, and the granularity (p) of the assessment is given by the number of items in the list. An example is reported in Tab. 1. For simplicity, the entire evaluation was conducted as a single test. A summary of the results is reported in Tab. 2.

The analysis in SEFEM is conducted by OFNWA aggregation. There are three types of analysis: with fixed orness, with variable orness, and with variable orness by range. When the weights (and the orness) are fixed, the aggregation of triangular fuzzy numbers (with indeterminateness) still yields a triangular fuzzy number (with indeterminateness). Such a result is rich in information concerning the impact of the initial judgment uncertainty on the global assessment and the determinateness of the whole evaluation process, so that the evaluator can better understand the meaning and the risk of the decisions he/she will take.

But we can do more: we can analyse how the final result varies with the evaluator's optimism. Since each triangular fuzzy number can be completely described by its support $[x_L, x_R]$ and its prototypal value $x_M$, we can analyse how these characteristic points vary when the orness level σ changes. We can vary σ uniformly between 0 and 1, at the same time for all aggregation levels. But if we consider the different conflict of some criteria, we can limit each $\sigma_i$ within an interval $[a_i, b_i] \subseteq [0,1]$ at any aggregation level; the evaluator's optimism can then be described by a general parameter $k$, so that $\sigma_i = a_i + k\,(b_i - a_i)$. In the following we make an analysis with variable orness.
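The parameterization can be sketched as follows; the orness intervals are hypothetical.

```python
# "Variable orness by range": each aggregation level i has its own admissible
# orness interval [a_i, b_i], and a single master parameter k in [0, 1] drives
# them all at once. The intervals below are hypothetical.

def orness_profile(ranges, k):
    """sigma_i = a_i + k * (b_i - a_i) for each aggregation level."""
    return [a + k * (b - a) for a, b in ranges]

ranges = [(0.0, 1.0),    # unconstrained level
          (0.0, 0.35),   # and-like aggregation (e.g. jointly required criteria)
          (0.6, 1.0)]    # or-like aggregation (synergetic criteria)

for k in (0.0, 0.5, 1.0):            # conservative, neutral, optimistic evaluator
    print(k, orness_profile(ranges, k))
# 0.0 -> [0.0, 0.0, 0.6]   ...   1.0 -> [1.0, 0.35, 1.0]
```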

Tab. 1 – An example of checklist for SCM evaluation

Learnability checklist items: structure of commands, presence of wizards, tutorial on line, consistent help.

Score:   PVCS 4/5   ClearCase 4/5   SourceSafe 3/5   Continuus/CM 4/5

Tab. 2 – SCM assessment scoring

Basic Characteristics        PVCS    ClearCase   SourceSafe   Continuus/CM
C1  Adequacy                 9/10    10/10       6/10         8/10
C2  Accuracy                 4/5     4/5         3/5          4/5
C3  Security                 2/3     3/3         2/3          3/3
C4  Interoperability         2/3     2/3         2/3          2/3
C11 Maturity                 2/2     2/2         ?            ?
C12 Recoverability           2/2     2/2         2/2          2/2
C13 Comprehensibility        3/4     4/4         3/4          4/4
C14 Learnability             4/5     4/5         3/5          4/5
C15 Usability                3/5     4/5         4/5          4/5
C16 Attractiveness           3/4     4/4         3/4          4/4
C17 Answer time              3/3     3/3         3/3          3/3
C18 Resource usage           5/5     5/5         5/5          5/5
C19 Analyzability            3/3     3/3         2/3          2/3
C20 Adaptability             3/3     3/3         2/3          2/3
C21 Ease of installation     4/4     4/4         3/4          3/4
C22 License costs            ?       $746        $549         ?
C23 Technical support        ?       $0          $245         ?



To better understand the meaning of such an approach, let us consider an analogy with a musical mixer. A mixer is an electronic device used by sound engineers to combine different sound sources into just one track. Outside, it appears as a desk with several trimmers: essentially, each trimmer pilots one source and is used to give more or less prominence to that source. The final result depends strongly on the way the engineer sets each trimmer. As the musical track is the result of a particular mix of elementary sources, so the aggregation is the result of a particular mix of elementary aggregate judgments. In this case, the "trimmers" given to the evaluator control the orness at each level of the aggregation. As we can obtain a musical track by deciding in advance the level of each source, so we get one aggregation by fixing the orness at each level (fixed orness).

But we can do more. For instance, we could be interested in understanding how the global evaluation is affected by varying the orness levels, in the same way the sound changes when the level of each source is modified. There are infinite levels of orness we can fix: even deciding to quantify each orness by m levels, we have m^(n+1) possible combinations to consider. We can reduce such complexity by assuming that the levels of orness can only increase or decrease together, but not both. Indeed, since orness is a measure of optimism [39], we can analyse how the overall result is affected by a simultaneous increase (or reduction) at all aggregation levels; in this case we have just m possible cases to examine. The simplest choice is to set all orness parameters to the same initial value, in the same way we could put all trimmers of our mixer on the same line; we can then analyse the result by increasing or decreasing all orness parameters uniformly, in the same way we can move all trimmers at the same time keeping them on the same line. Different conflictive criteria can upper-limit the level of orness, in the same way synergetic criteria can set a lower limit for it. Therefore, it is realistic to consider orness varying within intervals: we can analyse how the overall judgment changes under a uniform variation of the orness parameters within appropriate intervals. Coming back to our analogy, it is like establishing a particular profile of sources and then increasing or decreasing all levels at the same time using a master trimmer. In both cases we are interested in analysing how the overall result changes when the general optimism of the evaluation increases or decreases, so that the output depends on one parameter.

We can trace several characteristics of the result as a function of the orness on graphs referred to, in SEFEM terminology, as decision graphs. Such tools support decision making.
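The data behind such a decision graph can be sketched as follows; a simple two-weight OWA on crisp scores stands in for the full OFNWA machinery, and the tool names and scores are invented for illustration.

```python
# Sweep the master optimism parameter k and trace, for each alternative, an
# orness-dependent aggregate. A two-weight OWA family (weights sigma on the
# maximum and 1 - sigma on the minimum, whose orness is exactly sigma) is used
# here as a crisp stand-in for OFNWA. Names and scores are hypothetical.

def owa_with_orness(values, sigma):
    """Blend of max (sigma = 1) and min (sigma = 0)."""
    return sigma * max(values) + (1 - sigma) * min(values)

alternatives = {
    "ToolA": [0.9, 0.8, 0.6],
    "ToolB": [0.7, 0.7, 0.7],
}

steps = 5
for name, scores in alternatives.items():
    series = [round(owa_with_orness(scores, k / (steps - 1)), 3)
              for k in range(steps)]
    print(name, series)   # one curve per alternative: the decision graph data
# ToolA [0.6, 0.675, 0.75, 0.825, 0.9] -- spread reveals excellent/weak features
# ToolB [0.7, 0.7, 0.7, 0.7, 0.7]      -- flat curve: a uniform profile
```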

Fig. 2 (a-h) – (a-g) Decision graphs and (h) the outcome of a traditional comparative evaluation method

The first question of interest is how the overall prototypal value varies for each alternative; the graph reporting this information is depicted in Fig. 2. From the graph in Fig. 2(a) it appears that ClearCase is the best solution in the evaluation context we considered. There is a big difference with respect to the other alternatives, especially when the orness is low; this means that, even assuming a conservative approach to the evaluation, it is still the best solution. Increasing the orness, Continuus/CM and PVCS get near to ClearCase, while SourceSafe remains pretty far; this means that ClearCase, PVCS, and Continuus/CM have some excellent features, highlighted by the high level of orness. To be more confident about our analysis, we need to evaluate the uncertainty. The graph in Fig. 2(b) shows that there is enough separation between ClearCase and the others. To better appreciate such a difference, we can trace the result when σ = 1 (Fig. 2(c)): even considering the uncertainty associated with the final result, the truth grade of the assessment associated with ClearCase is low near the prototypal values of the other solutions; the opposite is also true, since the truth grade of the judgments associated with the other solutions is low near the prototypal value of ClearCase. Analysing the indeterminateness (Fig. 2(f)), we can deduce that it is too high for Continuus/CM and PVCS, so that it could be better to review the evaluation process for them. Other conclusions can be drawn from the graphs in Fig. 2(d) and Fig. 2(e), where there is a direct comparison between alternatives. Comparing ClearCase and Continuus/CM (Fig. 2(d)), we can see that for low orness levels (conservative approach) there is enough separation between them, so that we can assume with good confidence that ClearCase is better than Continuus/CM. For higher orness levels, both products show the same tendency to have excellent characteristics, but Continuus/CM has a higher uncertainty around its prototypal value and, considering we are near to 1, a higher possibility of being worse than our analysis suggests. Thus ClearCase is to be preferred to Continuus/CM in the given context. A direct comparison between Continuus/CM and PVCS (Fig. 2(e)) shows that they have pretty much the same evaluation profile.

- SEKE '02 - 696 -

Finally, we can make an analysis limiting the variation of the orness within given ranges, according to considerations about the nature of the aggregation of some criteria. For instance, the aggregate characteristic economical value depends, among others, jointly on the characteristics license costs (C22) and technical support (C23), so that it is convenient to mark their and-like nature by limiting the orness level between 0 and 0.35. Fig. 2(f) shows how the decision graph changes by limiting the orness at that level of aggregation. Fig. 2(g) shows the results obtained by a traditional method using averaging: while the outcomes are very similar, the richness of the information provided by SEFEM gives a stronger justification of the results and a better decision support.

6. CONCLUSIONS AND FUTURE WORKS

In this work we showed why uncertainty is basic to a better risk assessment, and how we can use it as a source of information. Central to a better management of uncertainty is the selection of an appropriate aggregation method. Traditional numerical techniques cut away uncertainty by making some semantic assumptions. Such an approach raises an ontological issue: the knowledge produced by the aggregation method requires an integration by the decision maker, which makes the evaluation not objective even if it is based on a mathematical approach. Fuzzy mathematics offers a solution to this problem. A new class of fuzzy aggregators, known as OFNWA, has been proposed; applying OFNWA operators to the evaluation problem, we are able to produce synthetic overall assessments that take into account the uncertainty of the basic judgments. The SEFEM methodology makes massive use of this class of aggregators. The main aim of the SEFEM project is to define a set of tools, methodologies, and business processes in an attempt to provide a mature approach to software engineering evaluation. Further efforts are planned to make the aggregation model more robust and able to manage a wider class of aggregations (multi-expert, multi-person, etc.). On the other hand, we are working to add tools to the SEFEM methodology in an attempt to gain a deeper comprehension of the nature of uncertainty and to better manage decision risk. A wider number of trial evaluations is necessary to validate the SEFEM methodology.

7. REFERENCES

[1] Troiano, L., "Valutazione di Strumenti e Metodi per l'Ingegneria del Software mediante la Matematica Fuzzy," Tesi di Laurea in Ingegneria Informatica, Università degli Studi di Napoli Federico II, Napoli, a.a. 1998-99.
[2] Canfora, G., L. Troiano, "An Extended Model for Ordered Weighted Averaging Applied to Decision Making," submitted to Fuzzy Sets and Systems, 2001.
[3] Canfora, G., L. Troiano, "An Extensive Comparison between OWA and OFNWA Aggregation," VIII SIGEF Congress, Naples, Italy, July 2001.
[4] ISO/IEC, "Information Technology – Software Product Evaluation," ISO/IEC 9126, 1991.
[5] Bergmann, W. B., "Buying Commercial and Nondevelopmental Items: A Handbook," Office of the Under Secretary of Defense for Acquisition and Technology, 1996.
[6] Boehm, B., B. Clark, E. Horowitz, C. Westland, R. Madachy, R. Selby, "Cost models for future software life cycle processes: COCOMO 2.0," Annals of Software Engineering, vol. 1, Baltzer, Amsterdam, 1995, pp. 57-94.

[7] Boehm, B., B. Clark, E. Horowitz, C. Westland, R. Madachy, R. Selby, "The COCOMO 2.0 software cost estimation model: a status report," American Programmer, vol. 9, no. 7, 1996, pp. 2-17.
[8] Center for Software Engineering, "COCOMO II Model Definition Manual," Computer Science Department, Univ. of Southern California, 1998.
[9] Loral Federal Systems, "COTS Product Identification and Evaluation Process," 1995.
[10] Maiden, N. A., C. Ncube, "Acquiring COTS software selection requirements," IEEE Software, vol. 15, no. 2, 1998, pp. 46-56.
[11] Robertson, J., S. Robertson, "Volere Requirements Specification Template, Edition 4," Atlantic Systems Guild, 1997.
[12] Herschel, D., "Techniques for selecting a start-up vendor," Gartner Group, #TG-06-0532, 1998.
[13] Miller, J. R., R. Jeffries, "Usability Evaluation: Science of Trade-offs," IEEE Software, 1992, pp. 97-102.
[14] Wikenheiser, D. A., G. Tyrl, "NSIPS/Oracle HR Fit Analysis Report," 1996.
[15] Anderson, E., "A Heuristic for Software Evaluation and Selection," Software Practice and Experience, vol. 19, no. 8, 1989, pp. 707-717.
[16] Morisio, M., A. Tsoukiàs, "IusWare: a methodology for the evaluation and selection of software products," IEE Proceedings – Software Engineering, vol. 144, no. 3, 1997, pp. 162-174.
[17] Roy, B., D. Bouyssou, "Comparison of a Multi-Attribute Utility and an Outranking Model Applied to a Nuclear Power Plant Siting Example," Proceedings of Decision Making with Multiple Objectives, Cleveland, Ohio, 1984.
[18] Roy, B., "The Outranking Approach and the Foundations of ELECTRE Methods," Theory and Decision, vol. 31, Kluwer Academic Publishers, Netherlands, 1991, pp. 49-73.
[19] Yoon, K., C. Hwang, "Multiple Attribute Decision Making: An Introduction," Sage Publications, 1995.
[20] Saaty, T., "The Analytic Hierarchy Process," McGraw-Hill, New York, 1990.
[21] Belton, V., "A comparison of the analytic hierarchy process and a simple multi-attribute value function," European Journal of Operational Research, vol. 26, no. 1, 1986, pp. 7-21.
[22] Dyer, J., "Remarks on the Analytic Hierarchy Process," Management Science, vol. 36, no. 3, 1990, pp. 249-275.
[23] Saaty, T., "An Exposition of the AHP in Reply to the Paper 'Remarks on the Analytic Hierarchy Process'," Management Science, vol. 36, no. 3, 1990.
[24] Harker, P., L. Vargas, "Reply to 'Remarks on the Analytic Hierarchy Process'," Management Science, vol. 36, no. 3, 1990.
[25] Edwards, W., J. R. Newman, "Multiattribute Evaluation," Series on Quantitative Applications in the Social Sciences, Sage Publications, Beverly Hills, 1982.
[26] Finnie, G., G. Wittig, D. Petkov, "Prioritizing Software Development Productivity Factors Using the Analytic Hierarchy Process," Journal of Systems and Software, vol. 22, no. 3, 1993, pp. 129-139.


[27] Forman, E., "Facts and Fictions about the Analytic Hierarchy Process," Mathematical and Computer Modelling, vol. 17, no. 4/5, Pergamon Press, Great Britain, 1993, pp. 19-26.
[28] Hong, S., R. Nigam, "Analytic Hierarchy Process Applied to Evaluation of Financial Modeling Software," Proceedings of the 1st International Conference on Decision Support Systems, Atlanta, GA, 1981.
[29] Hwang, C. L., K. Yoon, "Multiple Attribute Decision Making: Methods and Applications. A State-of-the-Art Survey," no. 186, Springer-Verlag, Berlin, 1981.
[30] Jeanrenaud, A., P. Romanizzi, "Software Product Evaluation Metrics: A Methodological Approach," Software Quality Management II: Building Quality into Software, vol. 2, 1995, pp. 59-69.
[31] Kitchenham, B., S. Linkman, D. Law, "DESMET: a methodology for evaluating software engineering methods and tools," Computing & Control Engineering Journal, vol. 8, no. 3, 1997, pp. 120-126.
[32] Kitchenham, B. A., S. G. Linkman, D. T. Law, "Critical Review of Quantitative Assessment," Software Engineering Journal, vol. 9, no. 2, 1994, pp. 43-53.
[33] Kitchenham, B., L. Pickard, S. L. Pfleeger, "Case Studies for Method and Tool Evaluation," IEEE Software, vol. 12, 1995, pp. 52-62.
[34] Mayrand, J., F. Coallier, "System Acquisition Based on Software Product Assessment," Proceedings of the 18th International Conference on Software Engineering, 1996, pp. 210-219.
[35] Min, H., "Selection of Software: The Analytic Hierarchy Process," International Journal of Physical Distribution and Logistics Management, vol. 22, no. 1, 1992, pp. 42-52.

[36] Rivett, P., "Multi-dimension scaling for multi-objective policies," OMEGA, International Journal of Management Science, vol. 5, no. 4, Pergamon Press, 1977, pp. 367-378.
[37] Tversky, A., D. Kahneman, "Advances in prospect theory: Cumulative representation of uncertainty," Journal of Risk and Uncertainty, vol. 5, 1992, pp. 297-323.
[38] Knight, F. H., "Risk, Uncertainty and Profit," Houghton Mifflin, Boston, 1921.
[39] Yager, R. R., "On ordered weighted averaging aggregation operators in multi-criteria decision making," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 1, 1988, pp. 183-190.
[40] Bellman, R. E., L. A. Zadeh, "Decision making in a fuzzy environment," Management Science, vol. 17, no. 4, 1970, pp. 141-164.
[41] Zadeh, L. A., "A fuzzy-set-theoretic interpretation of linguistic hedges," Journal of Cybernetics, vol. 2, no. 2, 1972, pp. 4-34.
[42] Zadeh, L. A., "Outline of a new approach to the analysis of complex systems and decision processes," IEEE Transactions on Systems, Man, and Cybernetics, SMC-3, 1973.
[43] Chen, S. J., C. L. Hwang, "Fuzzy Multiple Attribute Decision Making: Methods and Applications," Springer-Verlag, New York, 1992.
[44] Yager, R. R., "On the Issue of Importance Qualification in Fuzzy Multi-Criteria Decision Making," Machine Intelligence Institute, Iona College, New Rochelle, NY.
[45] Sugeno, M., T. Takagi, "A new approach to design of fuzzy controllers," in Advances in Fuzzy Sets, Possibility Theory and Applications, P. P. Wang (Ed.), Plenum Press, New York, 1983, pp. 325-334.
