NEW COMPLEXITY WEIGHTS FOR FUNCTION POINT ANALYSIS USING ARTIFICIAL NEURAL NETWORKS

By MOHAMMED ABDULLAH HASAN AL-HAGRI

Thesis Submitted to the School of Graduate Studies, Universiti Putra Malaysia, in Fulfilment of the Requirements for the Degree of Doctor of Philosophy

October 2004

Dedicated to my parents, Abdullah and Neammh; to my wife; to my kids, Ammar and Afnan; and to my family.


Abstract of thesis presented to the Senate of Universiti Putra Malaysia in fulfilment of the requirements for the Degree of Doctor of Philosophy

NEW COMPLEXITY WEIGHTS FOR FUNCTION POINT ANALYSIS USING ARTIFICIAL NEURAL NETWORKS

By MOHAMMED ABDULLAH HASAN AL-HAGRI

October 2004

Chairman: Associate Professor Abdul Azim Abdul Ghani, Ph.D.

Faculty: Computer Science and Information Technology

Function points are intended to measure the amount of functionality in a system as described by a specification. Function points were first proposed in 1979 and are currently known as the International Function Points User Group (IFPUG) version 4.1. Function points are computed through three steps. The first step is counting the number of each of the five components in a system, which are external inputs, external outputs, external inquiries, external files, and internal files. The second step is assigning a complexity weight to each of the components using weighting factors that are established according to an ordinal scale: simple, average, or complex. The last step is determining 14 technical complexity factors. Although function points are widely used, they still have limitations.

Function points suffer from a problem of subjective weighting in the second step, since the weights used may not be appropriate. The weights are derived from IBM experience. Besides that, the calculation of function points combines measures from an ordinal scale with counts that are on a ratio scale; thus the linear combination in the calculation is inconsistent with measurement theory. As a result, the function point measure used in estimation will produce inaccurate estimates.

This thesis proposes new complexity weights for the function point measure by modifying the original complexity weights using an artificial neural network algorithm. In particular, the Back Propagation algorithm is employed to derive the proposed complexity weights. The derived complexity weights are established according to an absolute scale, which is much more flexible and suitable.

Real industrial data sets assembled by the International Software Benchmarking Standards Group are used to compare the function point measure obtained using the original complexity weights with that obtained using the proposed complexity weights. The results obtained with the proposed complexity weights show an improvement in software effort estimation accuracy. The results also show a reduction of the error margins in effort estimation: the ratio of average error using the original complexity weights to that using the proposed complexity weights is 65% to 35%, respectively.


Abstrak tesis yang dikemukakan kepada Senat Universiti Putra Malaysia sebagai memenuhi keperluan untuk Ijazah Doktor Falsafah

PEMBERAT KOMPLEKSITI BAHARU UNTUK ANALISIS MATA FUNGSI DENGAN MENGGUNAKAN RANGKAIAN NEURAL BUATAN

Oleh

MOHAMMED ABDULLAH HASAN AL-HAGRI

Oktober 2004

Pengerusi: Profesor Madya Abdul Azim Abdul Ghani, Ph.D.

Fakulti: Sains Komputer dan Teknologi Maklumat

Function Points bertujuan untuk mengukur amaun kefungsian dalam suatu sistem yang dihuraikan oleh spesifikasi. Titik Fungsi mula dicadangkan pada tahun 1979 dan sekarang ini dikenali sebagai International Function Points User Group (IFPUG) versi 4.1. Function Points dikira melalui tiga langkah. Langkah pertama adalah pengiraan bilangan lima komponen sistem iaitu input luaran, output luaran, pertanyaan luaran, fail luaran, dan fail dalaman. Langkah kedua adalah mengumpukkan pemberat kompleksiti ke setiap komponen dengan menggunakan faktor pemberat yang ditetapkan mengikut skala ordinal simple, average, atau complex. Langkah terakhir penentuan 14 faktor kompleksiti teknikal. Walaupun Function Points digunakan secara meluas, ianya masih mempunyai batasan.

Function Points mengalami masalah dengan pemberatan subjektif dalam langkah kedua oleh kerana pemberat yang digunakan mungkin tidak sesuai. Pemberat diperoleh daripada pengalaman IBM. Selain daripada itu, pengiraan Function Points menggabungkan ukuran daripada skala ordinal dengan bilangan yang berskala nisbah; dengan yang demikian, kombinasi linear pengiraan tidak konsisten dengan teori pengukuran. Kesannya, ukuran Function Points yang digunakan dalam penganggaran akan mengeluarkan anggaran yang tidak tepat.

Tesis ini mencadangkan pemberat kompleksiti baharu untuk ukuran Function Points menerusi pengubahsuaian pemberat kompleksiti asal dengan menggunakan algoritma rangkaian neural buatan. Secara khususnya, algoritma Back Propagation digunakan untuk menerbitkan pemberat kompleksiti cadangan. Pemberat kompleksiti yang diterbitkan ini ditetapkan menuruti skala mutlak yang lebih fleksibel dan sesuai.

Set data sebenar industri yang dihimpun oleh International Software Benchmarking Standard Group digunakan untuk perbandingan antara ukuran Function Points yang dihasilkan menerusi penggunaan pemberat kompleksiti asal dan cadangan. Keputusan yang dihasilkan oleh pemberat kompleksiti cadangan menunjukkan pembaikan dalam ketepatan penganggaran keupayaan perisian. Keputusan juga menunjukkan pengurangan margin ralat dalam penganggaran keupayaan dengan nisbah purata ralat dalam penggunaan pemberat kompleksiti asal dan cadangan adalah masing-masing 65% ke 35%.


ACKNOWLEDGEMENTS

In the name of ALLAH, the Beneficent, the Compassionate, who gave me the strength, patience, and motivation to complete this research work. I would like to take this opportunity to record my gratitude to the great people who were an important support during the phases of this research, particularly those who helped me during the time I was doing my Ph.D. research. My deepest appreciation and gratitude go to the research committee led by Associate Prof. Dr. Abdul Azim Abdul Ghani, who always took time to listen to my ideas and patiently answered my questions. His invaluable guidance, fruitful discussion, patience, and continued encouragement supported me at every stage of this work, and he always provided golden recommendations and suggestions for my inquiries, calmly and accurately. He also contributed, through the faculty, to helping me obtain the licence for the data that I needed for the research, which carried this work to success. I would also like to extend my great thanks to all the members of my Ph.D. supervision committee: Associate Prof. Dr. Md. Nasir Sulaiman, for his support and attention during my research work and his guidance in every discussion throughout all steps of this work, and Associate Prof. Mohd. Hasan Selamat, who helped me more than I expected, provided inspiration for this work, and offered virtuous guidance, encouragement, and help during the time of doing the research. Great thanks to the Faculty of Computer Science and Information Technology, the university library, and Universiti Putra Malaysia, which provided the working environment for performing this work. I would also like to thank the faculty dean's secretary, Puan Norhaidah, and the faculty deputy dean's secretary, Puan Suraiya, for their help and kindness during the progress of my work. My special thanks to my honest and gentle Yemeni friends Walid Saeed Al Shargabi, Ali Al Sharafi and Makarem Bamatraf, to Al-Taher Seedik from Sudan, and to the other friends from Yemen, Libya, Iraq, Jordan and Nigeria for their good dealings.

MOHAMMED ABDULLAH HASAN AL-HAGRI October 2004


I certify that an Examination Committee met on 26/10/2004 to conduct the final examination of Mohammed Abdullah Hasan Al-Hagri on his Doctor of Philosophy thesis entitled "New complexity weights for Function Point Analysis using Artificial Neural Networks" in accordance with Universiti Pertanian Malaysia (Higher Degree) Act 1980 and Universiti Pertanian Malaysia (Higher Degree) Regulations 1981. The Committee recommends that the candidate be awarded the relevant degree. Members of the Examination Committee are as follows:

Hj. Ali Mamat, Ph.D.
Associate Professor
Faculty of Computer Science and Information Technology
Universiti Putra Malaysia
(Chairman)

Ramlan Mahmod, Ph.D.
Associate Professor
Faculty of Computer Science and Information Technology
Universiti Putra Malaysia
(Member)

Hajah Fatimah Dato' Ahmad, Ph.D.
Associate Professor
Faculty of Computer Science and Information Technology
Universiti Putra Malaysia
(Member)

Y. Bhg. Safaai Deris, Ph.D.
Professor
Faculty of Computer Science and Information Systems
Universiti Teknologi Malaysia
(Independent Examiner)

__________________________________ GULAM RUSUL RAHMAT ALI, Ph.D. Professor/Deputy Dean School of Graduate Studies Universiti Putra Malaysia Date :


This thesis was submitted to the Senate of Universiti Putra Malaysia and has been accepted as fulfilment of the requirement for the degree of Doctor of Philosophy. The members of the Supervisory Committee are as follows:

Abdul Azim Abdul Ghani, Ph.D.
Associate Professor
Faculty of Computer Science and Information Technology
Universiti Putra Malaysia
(Chairman)

Md. Nasir Sulaiman, Ph.D.
Associate Professor
Faculty of Computer Science and Information Technology
Universiti Putra Malaysia
(Member)

Mohd. Hasan Selamat, M.Phil.
Associate Professor
Faculty of Computer Science and Information Technology
Universiti Putra Malaysia
(Member)

_______________________ AINI IDERIS, Ph.D. Professor/Dean School of Graduate Studies Universiti Putra Malaysia Date:


DECLARATION

I hereby declare that the thesis is based on my original work except for quotations and citations which have been duly acknowledged. I also declare that it has not been previously or concurrently submitted for any other degree at UPM or other institutions.

—————————————————————— MOHAMMED ABDULLAH HASAN AL-HAGRI Date :26/10/2004


TABLE OF CONTENTS

DEDICATION
ABSTRACT
ABSTRAK
ACKNOWLEDGEMENTS
APPROVAL
DECLARATION
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER

1  INTRODUCTION
   1.1  Background
   1.2  Problem Statements
   1.3  Research Objectives
   1.4  Research Methodology
   1.5  Organisation of Thesis

2  LITERATURE REVIEW
   2.1  Introduction
   2.2  Software Measurement
   2.3  Measurement Theory
        2.3.1  Measurement Methods
        2.3.2  Software Size Measures
   2.4  Measurement Scales
   2.5  Software Metrics and Software Functionality
   2.6  Application of Software Measurement
        2.6.1  Estimation
        2.6.2  Controlling
   2.7  Function Point Measure
   2.8  Function Point Analysis
        2.8.1  Function Point Components
        2.8.2  Function Point Complexity Weights
        2.8.3  Function Point Complexity Factors
        2.8.4  Function Point Counting Procedure
        2.8.5  Function Point Applications
   2.9  Extended Function Point Analysis Techniques
        2.9.1  Feature Points
        2.9.2  Asset-R Function Points
        2.9.3  Mark II Function Point and Mark II Model
        2.9.4  Banker's Object Points
        2.9.5  3D Function Point
        2.9.6  Hallmark Cards
        2.9.7  Application Feature
        2.9.8  IFPUG Version
        2.9.9  Full Function Point
        2.9.10 COSMIC Full Function Point
   2.10 Limitations of Function Point (IFPUG version)
        2.10.1 Weights Limitations and Its Effect on Software Cost Estimation
        2.10.2 Function Point Limitations with Accuracy & Scale Type
   2.11 Artificial Neural Networks
   2.12 General Applications of Artificial Neural Networks
   2.13 Applications of Neural Networks in Software Engineering
   2.14 Neural Networks Methods
        2.14.1 Standard Back Propagation Algorithm
        2.14.2 Neural Networks Training
        2.14.3 Neural Networks Testing
   2.15 Summary

3  RESEARCH METHODOLOGY
   3.1  Introduction
   3.2  General Description of Research Methodology
   3.3  Analysing the Original Weights
   3.4  Mathematical Representation of Complexity Weights
   3.5  Computer Resources
   3.6  Artificial Neural Networks
        3.6.1  Reasons for Using Artificial Neural Networks
        3.6.2  Using an Improved Model of Back Propagation Algorithm
        3.6.3  Reasons for Using the Back Propagation Algorithm
        3.6.4  Network Architecture
   3.7  Derivation of Training Data
   3.8  Normalisation of Training Data
   3.9  Collection of Measurement Data
   3.10 Data Sampling
   3.11 Data Analysis
        3.11.1 Mean Magnitude of Relative Error (MMRE)
        3.11.2 Ratio of Average Error
        3.11.3 Correlation Coefficient
        3.11.4 Error Limits
   3.12 Calculating the Function Points Using the New and the Original Weights
   3.13 Results Using the Effort and Cost Models
   3.14 Summary

4  DEVELOPMENT OF FUNCTION POINT COMPLEXITY WEIGHTS
   4.1  Introduction
   4.2  Detailed Steps of Establishing Training Database
        4.2.1  Using the Albrecht Weights Tables as Baselines
        4.2.2  Closing the Open Intervals of ILF Table
        4.2.3  Closing the Open Intervals of DET
        4.2.4  Closing the Open Intervals of RET
        4.2.5  Applying the Mid Point Rule for Calculating Weights Samples
        4.2.6  Increasing the Weights Samples
   4.3  Normalization of Training Database
   4.4  Description of the Proposed Network
        4.4.1  Network Inputs
        4.4.2  Training Process
        4.4.3  Testing Process
        4.4.4  Network Outputs (Proposed Complexity Weights of ILF)
   4.5  Summary

5  RESULTS AND DISCUSSION
   5.1  Introduction
   5.2  Results
        5.2.1  Sample 1
        5.2.2  Sample 2
        5.2.3  Sample 3
        5.2.4  Sample 4
        5.2.5  Sample 5
        5.2.6  Sample 6
        5.2.7  Sample 7
        5.2.8  Results of Total Data Sets
   5.3  Results Discussion
   5.4  Summary

6  CONCLUSIONS AND FUTURE WORKS
   6.1  Conclusions
        6.1.1  Estimation Accuracy of the Proposed Weights
        6.1.2  Capabilities of the Proposed Weights
   6.2  Research Contribution
   6.3  Suggestions for Further Works

BIBLIOGRAPHY
APPENDICES
PUBLICATIONS
BIODATA OF THE AUTHOR


LIST OF TABLES

2.1: Five Common Scale Types
2.2: Function Point Complexity Weights
2.3: The Internal Complexity Factors
2.4: The Hallmark Cards Complexity Weights
3.1: Internal Logical File (ILF) Weights
4.1: Rule Table of Internal Logical File (ILF)
4.2: Rule Table of External Outputs
4.3: Rule Table of External Inquiry
4.4: Rule Table of External Inputs
4.5: Rule Table of External Interface File
4.6: Closed Intervals of Internal Logical Files Rule Table
4.7: Closed Intervals of External Inputs Rule Table
4.8: Closed Intervals of External Outputs Rule Table
4.9: Closed Intervals of External Inquiry Rule Table
4.10: Closed Intervals of External Interface File Rule Table
4.11: Internal Logical File Training Patterns
4.12: Normalized Data Sets of Internal Logical File
4.13: Special Parameters of the Used Network
4.14: General Comparison of Internal Logical File Weights
5.1: Sampling the Data According to Error Rate
5.2: Measurement Results of Sample 1
5.3: Measurement Results of Sample 2
5.4: Measurement Results of Sample 3
5.5: Measurement Results of Sample 4
5.6: Measurement Results of Sample 5
5.7: Measurement Results of Sample 6
5.8: Measurement Results of Sample 7
5.9: Summarised Results for Sample 1 to Sample 4
5.10: Summarised Results for Sample 5 to Sample 7 with Total Data Sets


LIST OF FIGURES

2.1: Software Metrics Collection Process
2.2: Evolution of Functional Size Measurement
2.3: The Standard Back Propagation Algorithm
3.1: The General Steps of Research Methodology
3.2: Mathematical Representation of the Original Complexity Weights
3.3: Mathematical Representation of the Proposed Complexity Weights
3.4: The Architecture of the Used Network
3.5: General Steps of Establishing Training Data
3.6: Example of Error Gaps Between the Actual and Estimated Values
3.7: The Function Point Analysis Using the Original Weights
3.8: Function Point Analysis Using the Proposed Weights
3.9: Comparison of Results Using Cost Estimation Model
4.1: Steps of Generating the Main Training Patterns
4.2: Selecting Important Cases of ILF Weights
4.3: Calculation of Five Main Patterns for the ILF
4.4: Algorithm of Generating Enough Training Patterns for One Component
4.5: Algorithm of Generating All Function Point Training Patterns
4.6: The Proposed Network
4.7: General Process of Network Training
4.8: General Process of Network Testing
5.1: Ratio of Average Error for Sample 1
5.2: Error Curve of Sample 1
5.3: Scatter Plot of Sample 1 with the Original Complexity Weights
5.4: Scatter Plot of Sample 1 with the Proposed Complexity Weights
5.5: Ratio of Average Error for Sample 2
5.6: Error Curve of Sample 2
5.7: Scatter Plot of Sample 2 with the Original Complexity Weights
5.8: Scatter Plot of Sample 2 with the Proposed Complexity Weights
5.9: Ratio of Average Error for Sample 3
5.10: Error Curve of Sample 3
5.11: Scatter Plot of Sample 3 with the Original Complexity Weights
5.12: Scatter Plot of Sample 3 with the Proposed Complexity Weights
5.13: Ratio of Average Error for Sample 4
5.14: Error Curve of Sample 4
5.15: Scatter Plot of Sample 4 Using the Original Complexity Weights
5.16: Scatter Plot of Sample 4 Using the Proposed Complexity Weights
5.17: Ratio of Average Error for Sample 5
5.18: Error Curve of Sample 5
5.19: Scatter Plot of Sample 5 with the Original Complexity Weights
5.20: Scatter Plot of Sample 5 with the Proposed Complexity Weights
5.21: Ratio of Average Error for Sample 6
5.22: Error Curve of Sample 6
5.23: Scatter Plot of Sample 6 Using the Original Complexity Weights
5.24: Scatter Plot of Sample 6 Using the Proposed Complexity Weights
5.25: Ratio of Average Error for Sample 7
5.26: Error Curve of Sample 7
5.27: Scatter Plot of Sample 7 with the Original Complexity Weights
5.28: Scatter Plot of Sample 7 with the Proposed Complexity Weights
5.29: Ratio of Average Error for All Data Sets
5.30: A Scatter Plot of Effort Estimation Using the Original Complexity Weights
5.31: A Scatter Plot of Effort Estimation Using the Proposed Complexity Weights


LIST OF ABBREVIATIONS

ANN      Artificial Neural Networks
BP       Back Propagation
COTS     Commercial-Off-The-Shelf
COSMIC   Common Software Measurement International Consortium
CASE     Computer-Assisted Software Engineering
CAD      Computer-Aided Design
CAM      Computer-Aided Management
CPM      Counting Practices Manual
DB       Database
DET      Data Elements
EI       External Inputs
EQ       External Inquiry
EO       External Outputs
FTR      File Type Reference
FP       Function Point
GSC      General System Characteristics
GUI      Graphical User Interface
IFI      Internal File Interface
ILF      Internal Logical File
IFPUG    International Function Point User Groups
ISBSG    International Software Benchmarking Standards Group
IC       Interval Centre
LOC      Line Of Code
MIS      Management Information Systems
MMRE     Mean Magnitude of Relative Error
MP       Mid Point rule
NN       Neural Networks
OOP      Object-Oriented Programming
RET      Record Elements
RSC      Reusable Software Component
SELAM    Software Engineering Laboratory in Applied Metrics
SEMRL    Software Engineering Management Research Laboratory
SPR      Software Productivity Research
TCA      Technical Complexity Adjustment
TCF      Technical Complexity Factors
3GL      Third Generation Language
TDI      Total Degree of Influence
UFP      Unadjusted Function Point
UFPC     Unadjusted Function Point Count
WH       Working Hours


BIODATA OF THE AUTHOR

Mohammed Abdullah Hasan Al-Hagri was born at Allahaj village in IBB city, Republic of Yemen, on 5th January 1971. He received his primary education at Ali ben Abi Talib Primary School, IBB, Yemen, from 1977 to 1982, and his intermediate education at Kalied ben Alwaleed School, Waddi Athahb, IBB, Republic of Yemen, from 1983 to 1986. Mr Al-Hagri finished his secondary education at Al Nahda School, Garaffa, IBB, Republic of Yemen, from 1987 to 1989. The Yemen Government sent him to the University of Technology, Department of Computer Science, Baghdad, Iraq, in 1991, where he obtained his BSc in Computer Science in 1994. In February 1995, Mr. Mohammed Al Hagri was appointed as a lecturer in the Department of Computer Science, Faculty of Science and Engineering, University of Science and Technology, Republic of Yemen. In 1996, the University gave him the chance to complete his higher education at the same university, and he obtained his MSc in Computer Science in 1998 from the Faculty of Science and Engineering, University of Science and Technology, Yemen. In 2001, he was awarded a scholarship and enrolled in the Ph.D. programme at the Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, where he is now waiting to be conferred the degree. Mr. Mohammed Al Hagri is married and has a son, Ammar, and a daughter, Afnan. Currently, Al Hagri holds the position of academic advisor and lecturer at the Department of Computer Science, Faculty of Science and Engineering, University of Science and Technology, main branch, Sana'a, Republic of Yemen.


NEW COMPLEXITY WEIGHTS FOR FUNCTION POINT ANALYSIS USING ARTIFICIAL NEURAL NETWORKS

MOHAMMED ABDULLAH HASAN AL-HAGRI

DOCTOR OF PHILOSOPHY
UNIVERSITI PUTRA MALAYSIA
2004


CHAPTER 1

INTRODUCTION

1.1  Background

Function Point is a well-known method used to measure the functionality of a system from the user's point of view. The function point includes the standard function point and many different models derived from it. The standard function point method created by Albrecht (1979) is currently known as the International Function Point User Group (IFPUG) version. The IFPUG proposes and documents the function point as a technology-independent measure of software (Longstreet, 2001).

The Function Point Analysis (FPA) is a method of breaking systems down into smaller components for better understanding and analysis. It encompasses three main activities. The first is the determination of the function point components, which are the External Inputs (EI), External Outputs (EO), External Inquiries (EQ), Internal Logical Files (ILF) and External Interface Files (EIF). The second is the assignment of complexity weights to each component. The complexity weights of function point consist of six tables that are established according to an ordinal scale; five of these tables are called the rule tables, which are associated with each function point component, and the sixth is the general table of the weights. The final activity is determining the general system characteristics that are taken into account during the function point calculations; these consist of 14 complexity factors used to describe the internal complexity of the software system.
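To make the three activities concrete, the sketch below computes an adjusted function point count in the usual IFPUG style: component counts are multiplied by weights from the general weight table, and the 14 general system characteristics adjust the unadjusted count. The weight values and the 0.65 + 0.01 × TDI adjustment formula are the commonly published IFPUG ones, not figures taken from this thesis, so treat the sketch as an illustrative assumption rather than the thesis's own procedure.

```python
# Illustrative IFPUG-style function point calculation
# (assumed standard weights and adjustment formula, not the weights proposed in this thesis).

# General weight table: component -> (simple, average, complex)
WEIGHTS = {
    "EI":  (3, 4, 6),    # External Inputs
    "EO":  (4, 5, 7),    # External Outputs
    "EQ":  (3, 4, 6),    # External Inquiries
    "ILF": (7, 10, 15),  # Internal Logical Files
    "EIF": (5, 7, 10),   # External Interface Files
}
LEVEL = {"simple": 0, "average": 1, "complex": 2}

def unadjusted_fp(counts):
    """counts: {component: {complexity level: number of occurrences}}."""
    total = 0
    for component, per_level in counts.items():
        for level, n in per_level.items():
            total += n * WEIGHTS[component][LEVEL[level]]
    return total

def adjusted_fp(ufp, gsc_ratings):
    """gsc_ratings: 14 degree-of-influence ratings, each between 0 and 5."""
    tdi = sum(gsc_ratings)          # Total Degree of Influence
    vaf = 0.65 + 0.01 * tdi         # Value Adjustment Factor
    return ufp * vaf

# Hypothetical component counts for a small system.
counts = {
    "EI":  {"simple": 3, "average": 2},
    "EO":  {"average": 4},
    "EQ":  {"simple": 2, "complex": 1},
    "ILF": {"average": 3},
    "EIF": {"simple": 1},
}
ufp = unadjusted_fp(counts)
print(ufp, adjusted_fp(ufp, [3] * 14))
```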


Currently, a number of models employ function point as a primary input for effort, cost and schedule estimation. Hundreds of organisations and companies around the world are using function point to measure the functionality size of applications in an effort to meet customer demand on time and within budget (UKSMA, 1998; SMS, 2001).

However, function point suffers from fundamental problems and has therefore passed through many stages of development. A number of researchers have tried to develop the function point in order to solve some of its problems. They created various releases such as Feature Point, 3D Function Point, Mark II Function Point, Full Function Point and COSMIC Full Function Point.

• Feature Point: This method was developed by Capers Jones of Software Productivity Research, Inc. in 1986 to measure the functionality of operating systems, telephone switching systems, military systems, etc. There is no standards organisation for this measure, as there is for IFPUG (SCT, 1997; UCSE, 1997).

• Mark II Function Point: Charles Symons introduced the Mark II Function Point technique in 1988. This version was designed to measure Graphical User Interface, Object-Oriented Programming, and Client/Server applications (GIFPA, 1998a).

• 3D Function Point: Between 1989 and 1992, the Boeing Company considered the use of function point to measure productivity. They designed the 3D function point to address two classic problems associated with the Albrecht approach. First, the approach is considered difficult to use. Second, it does not properly measure scientific and real-time software (Hastings, 1995; SCT, 1997).

• Full Function Point: In 1997, a new extension to function point was developed for measuring the functional size of real-time software in order to address weaknesses of the IFPUG version. This extension is the full function point.

• COSMIC Full Function Point: In 1998, the Common Software Measurement International Consortium (COSMIC) method was created as a refinement of Full Function Point, Mark II and the function point analysis technique, in order to address a variety of software domains, especially MIS and real-time systems (Symons and Rule, 1999).

Each of these releases is designed to solve a specific problem, but none of them solves the problem of the subjective weights of the standard function point measure (IFPUG version). Until now, this measure has remained the leading software measure around the world (UKSMA, 1998; Symons, 2001).

This research is planned to solve the problem of the subjective weights of the function point measure, and its related problems, by using an improved standard Back Propagation (BP) algorithm. The reasons for using this algorithm are discussed in Chapter 3. This algorithm is one of the most common methods of Artificial Neural Networks (ANN). It has been used successfully in many software metric-modelling studies, such as predicting the development effort required for software systems with a given set of requirements (Gray, 1996). Accordingly, BP is used in this thesis as a prediction tool to develop the current complexity weights of function point. Consequently, new complexity weights are created and an absolute scale is implemented.
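The improved BP variant used in this thesis is detailed in Chapter 3; the fragment below is only a minimal sketch of standard back propagation on a small fully connected network, written in plain NumPy, to remind the reader of the mechanics (forward pass, error signal, gradient descent on the weights). The network shape, learning rate and data are arbitrary placeholders, not the thesis's training setup.

```python
import numpy as np

# Minimal standard back propagation for a small 2-layer network (illustrative only).
rng = np.random.default_rng(0)
X = rng.random((40, 3))                  # placeholder inputs (e.g. normalised component features)
y = X.sum(axis=1, keepdims=True) / 3.0   # placeholder target in [0, 1]

W1, b1 = rng.normal(0, 0.5, (3, 5)), np.zeros((1, 5))
W2, b2 = rng.normal(0, 0.5, (5, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for epoch in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass (mean squared error, sigmoid derivatives)
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient descent updates
    W2 -= lr * h.T @ d_out / len(X); b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_h / len(X);  b1 -= lr * d_h.mean(axis=0)

print("final MSE:", float(((out - y) ** 2).mean()))
```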


1.2  Problem Statements

The main problem addressed by this research is the subjectivity of the complexity weights of the function point measure. This problem directly affects the calculation accuracy of the final count of the measure, which is currently known as the IFPUG version 4.1. Its original complexity weights do not cover all the complexity states because the measure deals with function point components as three groups only, namely simple, average and complex, whereas each function point component has its own degree of complexity. Thus, each component should be associated with weights that reflect this degree of complexity. The original weights used for calculating function points were determined subjectively from IBM experience; therefore, these values may not be appropriate in other development environments. The classification of components as simple, average and complex only simplifies the calculations. Hence, more complex and less complex function components receive the same complexity weights and the same number of function points, as the sketch below illustrates. Therefore, a failure to measure large software functionality sizes becomes visible.
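As a hypothetical illustration of this coarseness, the snippet below applies the commonly published IFPUG rule table for Internal Logical Files (the DET/RET boundaries and the 7/10/15 weights are quoted from public IFPUG material, not from this thesis) to two files of clearly different size: both fall into the "average" cell and therefore contribute exactly the same number of function points.

```python
# Hypothetical example: two ILFs of different size receive the same weight
# under the ordinal simple/average/complex classification
# (IFPUG-style rule table; boundaries and weights are the commonly published ones).

ILF_WEIGHTS = {"simple": 7, "average": 10, "complex": 15}

def ilf_complexity(det, ret):
    """Classify an Internal Logical File by its Data Element Types (DET)
    and Record Element Types (RET)."""
    if ret == 1:
        return "simple" if det <= 50 else "average"
    if 2 <= ret <= 5:
        if det <= 19:
            return "simple"
        return "average" if det <= 50 else "complex"
    # ret >= 6
    return "average" if det <= 19 else "complex"

small = ilf_complexity(det=22, ret=2)   # barely past the 'simple' boundary
large = ilf_complexity(det=50, ret=5)   # at the upper edge of the same cell
print(small, ILF_WEIGHTS[small])        # average 10
print(large, ILF_WEIGHTS[large])        # average 10 -> identical contribution
```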

Despite their subjectivity, the complexity weights are still in use. Most research in function point analysis (Hastings, 1995; UCSE, 1997; SCT, 1997; GIFPA, 1998a; UKSMA, 1998; Symons and Rule, 1999; Symons, 2001) does not specifically address the subjectivity of the complexity weights. As these weights represent the core of the measure, two further subproblems emerge that are related to the subjective weights:


a. Problem with measurement theory: The function point calculation combines different scales (the complexity weights are organised on an ordinal scale while the final count is generated on an absolute scale) in a manner that is inconsistent with measurement theory. The weights should therefore be established according to a higher-order scale.
b. Problem of accuracy: This problem appears as a result of the weakness of the weights and the use of the lower-order scale.
Therefore, the research concentrates on the subjective weights and the two related problems.

1.3  Research Objectives

The main objective of this research is to propose new complexity weights for the Function Point Analysis method. This includes creating complexity weights for the five function point components and organising the proposed weights according to a higher-order scale such as the absolute scale. The proposed complexity weights are intended to increase the calculation accuracy of the function point measure.

The above objectives can be achieved by the following steps:
a. To analyse the original weights of the function point (IFPUG version 4.1) mathematically in order to discover their weaknesses and discrepancies.
b. To employ these weights as a baseline and a kernel for creating new complexity weights by using the Back Propagation algorithm.
c. To generate a training database for the purpose of establishing a trained and tested network.


d. To generate the proposed complexity weights for the five function point components according to the absolute scale.
In this thesis, the performance of the suggested method is measured using the effort model, where effort (working hours) represents cost. The accuracy of the method is demonstrated by the reduction of the error rate in the effort results when the final function point count is used as the main parameter in the effort estimation model. Accordingly, this research increases the calculation accuracy of the function point measure, enabling users, companies, and those interested in function point analysis to obtain a more exact functionality size for their software.

1.4  Research Methodology

The research methodology includes many substages, which consist of mathematical calculations, many complex operations, special algorithms, creating the training database (DB), and training the suggested network. The created DB is used for training one of the well-known neural network methods as a prediction tool. This tool is an improved BP method. In order to use the suggested network to predict the proposed complexity weights, it is necessary to build the training DB (five DBs for the five function point components) according to the original complexity weights. The research methodology is designed completely in Chapter 3. The important layouts of this methodology include the following:
1. Using the original complexity weights as baselines for the solution.
2. Constructing a suitable training DB.
3. Learning and testing the suggested network.
4. Producing the weight tables and applying them in the function point count by using data sets of real industrial projects.
5. Applying the original complexity weights in the function point count by using the same data.
6. Applying the two types of function point counts in the effort estimation model.
7. Comparing the effort results of using the original and proposed weights with the actual effort values to validate the proposed method (a compressed sketch of this comparison is given below).
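A hypothetical, compressed view of steps 4 to 7 is sketched below: two function point counts (one per weight set) feed a simple effort model, and the resulting estimates are compared with actual effort through the Mean Magnitude of Relative Error (MMRE). The linear effort model, its coefficient, and the project tuples are invented placeholders; the thesis's actual models and the ISBSG data are described in Chapters 3 and 5.

```python
# Hypothetical end-to-end comparison of original vs. proposed weights
# (placeholder effort model and data; the real models and data are in Chapters 3-5).

def estimate_effort(fp_count, hours_per_fp=8.0):
    """Toy linear effort model: working hours as a multiple of function points."""
    return hours_per_fp * fp_count

def mmre(actual, estimated):
    """Mean Magnitude of Relative Error over paired actual/estimated efforts."""
    return sum(abs(a - e) / a for a, e in zip(actual, estimated)) / len(actual)

# (fp with original weights, fp with proposed weights, actual effort in working hours)
projects = [(420, 455, 3650), (130, 118, 980), (760, 810, 6400), (95, 101, 820)]

actual = [p[2] for p in projects]
est_original = [estimate_effort(p[0]) for p in projects]
est_proposed = [estimate_effort(p[1]) for p in projects]

print("MMRE with original weights:", round(mmre(actual, est_original), 3))
print("MMRE with proposed weights:", round(mmre(actual, est_proposed), 3))
```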

1.5  Organisation of Thesis

The thesis is organised in accordance with the standard structure of theses and dissertations at Universiti Putra Malaysia. The thesis has six chapters, including this introductory chapter, which covers the background information that leads into a detailed treatment of the concepts of function point and its complexity weights.

Chapter 2 explains the main concepts of software measurement, measures, metrics, scale types, function point components, function point calculations and their applications. This chapter also covers the literature review of previous work and related studies, especially in the function point field. Furthermore, the discussion focuses on the historical background and the development stages of function point. Several research efforts on solving the function point problems are described. The chapter is completed with an introduction to ANN techniques and their uses in software engineering and software measurement, and an explanation of ANN training and testing concepts.

Chapter 3 describes the general architecture of the research methodology. It explains all components of the suggested method of constructing the proposed complexity weights tables. In addition, this chapter describes the general steps of deriving the training data, data collection, data sampling, measurement methods and the proposed structure of the network used. The reasons for using ANN and the BP algorithm are included. Several detailed stages of the methodology are designed in this chapter, which stands as the theoretical framework of the research.

Chapter 4 explains some hypotheses and mathematical operations for preparing the training database. Furthermore, this chapter describes the detailed steps of generating the proposed complexity weights, which are organised in five large tables.

In Chapter 5, the performance of the function point with the proposed complexity weights is compared with the results measured by the original complexity weights. Real industrial data sets of software projects are used to compare the results of the function point counts when they are used as a main parameter in the software cost model. The results are compared to the actual values of total effort.

Finally, Chapter 6 gives the concluding remarks of the research with some recommendations and suggestions for future work. A description of the capabilities and contribution of the proposed method is also presented, while the recommendations are given as guidelines for further research that can be added to this work.


CHAPTER 2

LITERATURE REVIEW

2.1  Introduction

Software measurement plays an important role inside software organisations that use software measurement applications. This chapter presents an overview of software measurement, measurement theory, functionality, size measures, the standard function point (IFPUG version), complexity weights and ANN with their applications in software engineering. The chapter includes an explanation of ANN training and testing in detail. ANN have many real-world applications, such as software engineering, software measurement, and medical diagnosis. ANN include a number of methods, one of which is the BP method. Most ANN are capable of abstracting the essence of a set of inputs; they can learn to produce output for inputs they have never seen before. A number of studies are looking at the use of ANN to predict software development effort (Dracopoulos, 1997; Schofield, 1998).

2.2  Software Measurement

Software measurement is the process by which numbers or symbols are assigned to attributes of entities in the real world in such a way as to describe them according to clearly defined rules. Software measurement plays an important role inside software organisations that use software measurement applications. These applications include estimation, controlling and other types (Jacquet and Abran, 1997). A measurement method is a logical sequence of operations, described generically and used in quantifying an attribute with respect to a specified scale. The operations may involve activities such as counting occurrences or observing the passage of time. The same measurement method may be applied to multiple attributes. However, each unique combination of an attribute and a method produces a different base measure (SESC, 2002).

Software measurement is important for three basic activities (Fenton and Pfleeger, 1997). Firstly, there are measures that help us to understand what is happening during development and maintenance. We assess the current situation, establishing baselines that help us to set goals for future behaviour. In this sense, the measurements make aspects of process and product more visible to us, giving us a better understanding of relationships among activities and the entities they affect. Secondly, measurement encourages us to improve our processes and products. For instance, we may increase the number or type of design reviews, based on measures of specification quality and predictions of likely design quality. Lastly, the measurement allows us to control our projects. Using our baselines, goals and understanding of relationships, we predict what is likely to happen and make changes to processes and products that help us to meet our goals.

2.3  Measurement Theory

Measurement theory is the body of knowledge which shows how to construct a measure from the observation that there are some attributes that can be represented as numbers (Bache and Bazzana, 1994). In empirical software engineering, as in other empirical sciences (e.g., experimental psychology) where measurement is noisy, uncertain, and difficult, the definition of sensible measures, their statistical analysis, and the search for patterns amongst variables are difficult activities. In recent years, measurement theory has been proposed and extensively discussed by Fenton and Pfleeger (1997) as a means to evaluate the software engineering measures and rules that have been proposed in the literature, and to establish criteria for the statistical techniques to be used in data analysis and in the search for patterns. Measurement theory is a very convenient theoretical framework to explicitly define the underlying theories upon which software engineering measures are based. This means that measures are not defined out of context and that the theories on which they are based can be discussed, adapted, and refined (Briand et al., 1995). Lord Kelvin said, "When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind. It may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science" (Pressman, 2001).

2.3.1  Measurement Methods

Measurements in the physical world can be categorised in two ways: direct measures (e.g., the length of a bolt) and indirect measures (e.g., the quality of bolts produced, measured by counting rejects). Software measures can be categorised similarly. Direct measures of the software engineering process include cost and effort applied. Direct measures of the software product include execution speed, defects reported over some set period of time, Line of Code (LOC), speed, memory size, and number of errors. Rakitin (2001) describes 13 examples of direct measures with a short description for each example. Indirect measures of the software product include functionality (size), quality, complexity, efficiency, reliability, maintainability, and many other abilities (Pressman, 2001). The direct measurement of an attribute is a measurement that does not depend on the measurement of any other attribute (Fenton, 1994). Indirect measurement of an attribute is a measurement which involves the measurement of one or more other attributes. The measures for software size occur in both classes of measures, that is, function points and LOC (Vickers, 1998).

The cost and effort required to build software, the number of LOC produced, and other direct measures are relatively easy to collect, as long as specific conventions for measurement are established in advance. However, the quality and functionality of software, or its efficiency or maintainability, are more difficult to assess and can be measured only indirectly (Pressman, 1993; 2001).

2.3.2  Software Size Measures

Software size is a fundamental product measure that can be used for assessment, prediction, and improvement purposes (Hastings and Sajeev, 2001). The software size is used as a main parameter in the effort, cost and many other models. There is no one way to measure software size that will meet all the objectives of the measures. One technique may be better for estimating the size of structured programming code and another for estimating the general functionality of the code. The best solution is either to use the technique best suited to its purpose or to combine several methods to come up with a more accurate estimate. Measures can be used to quantify software products as well as the process by which they are developed (Abran, 1999). Once obtained, these measures are used to build cost estimation models and productivity models. From this perspective, a key measure is the size of a software product. There are basically two kinds of size measures:

• Technical size measures: These measures are used to quantify software products and processes from a developer's point of view. They can also be used in efficiency analysis to improve the performance of the design.

• Functional measures: They are used to quantify software products and services from a user's perspective. Being independent of technical development and implementation decisions, functional measures can thus be used to compare the productivity of different techniques and technologies. In this context, organisations frequently use functional size measurement methods to quantify software products included in their outsourcing contracts (Abran, 1999). There are many types of software size measures, which include LOC, token counts and function points.

(1) Lines of Code, source instructions, or delivered source instructions: although defined differently, these are all used in a similar way. LOC is used as input to most algorithmic costing and scheduling models (Verner and Tate, 1992). LOC can be classified as a natural deliverable of a software project. Other natural deliverables are the number of pages of paper documents and test cases. These are all strictly volume deliverables and the tangible outputs of many tasks (Nystedt, 1999). Ramil and Lehman (2000) use LOC as a software size measure for finding the effort model in order to measure software maintenance.

(2) Token counts: These are counts of basic syntactic units. They are at a lower level than LOC and may have validity in some situations, for example, in Computer-Assisted Software Engineering (CASE) application development, where useful equivalents to LOC may be difficult to find. For the size measures selected, it is necessary to have a clear and consistent definition which can be used for automatic counting (Verner and Tate, 1992).

(3) Function points: These are based on the theory that the functions of an application are the best measurement of a software application's functionality (Hastings and Sajeev, 1997; 2001). Since function points measure functionality, they should be independent of the technology and language used for the software implementation (Low and Jeffery, 1990). Function points are better than LOC because they are programming-language independent, as opposed to LOC (Ferens, 1999; Kusumoto et al., 2002). On the other hand, Morisio et al. (2000) classify software size measures into three types, i.e. Function Point Analysis, Object-Oriented (OO) Function Point Analysis and LOC. The first approach, used to measure the functionality size of applications, is function point analysis. The second metric is OO function points, introduced as a method for estimating the functionality size and consequently the effort and duration of OO software development projects. In OO function points, the central concept is logical files, and the complexity of each class is calculated by counting single attributes of each class as Data Element Types (DETs) and complex attributes (Morisio et al., 2000; Ram and Raju, 2000). The third metric, which is used to measure the size of software, is LOC, which is the number of physical LOC delivered in each application (Morisio et al., 2000).

2.4  Measurement Scales

Measurement scales determine what interpretations we can meaningfully make and the kind of transformations we can perform.


Table 2.1 defines the five well-known scale types (Hastings and Sajeev, 2001). These types include the following:

• Nominal Scale: The nominal scale is used to name objects or events for identification purposes only, and there are no quantitative implications associated with it. Only nonparametric statistics can be used (Abran and Robillard, 1996). This scale is the lowest-order scale; to allow as much flexibility as possible, software measures should aim for a higher-order scale. The nominal scale deals with elements as classes or categories, based on the value of these elements (Fenton and Pfleeger, 1997).

Table 2.1: Five Common Scale Types (Hastings and Sajeev, 2001)

No   Scale type   Allowable transformations
1    Nominal      g() is an ordered set
2    Ordinal      g() is strictly increasing
3    Interval     g(x) = ax + b, a > 0
4    Ratio        g(x) = ax, a > 0
5    Absolute     g(x) = x

Ordinal Scale: The ordinal scale is often useful to augment the nominal scale with information about an ordering of the classes or categories (Fenton and Pfleeger, 1997). If each category in a nominal scale and each point in a restricted ordinal scale does not have its own definition, it is difficult to believe that different people will use the measure in the same way (Kitchenham et al., 2001). The ordering allows analysis that is not possible with nominal measures.

Interval Scale: The interval scale carries more information, making it more powerful than the nominal or ordinal scales. This scale captures information about the size of the intervals that separate the classes, so that we understand the size of the jump from one class to another (Fenton and Pfleeger, 1997).

Ratio Scale: With the ratio scale, measurement values may only be rescaled by multiplication with a positive constant; negative multipliers are not admissible. Percentage calculations and all statistics that apply to the interval scale can be applied (Abran and Robillard, 1996). Although the interval scale gives more information and allows more analysis than either the nominal or ordinal scale, we sometimes need to be able to do even more. For example, we would like to be able to say that one liquid is twice as hot as another, or that one project took twice as long as another. This need for ratios gives rise to the ratio scale, one that is common in the physical sciences (Fenton and Pfleeger, 1997).

Absolute Scale: As the scales of measurement carry more information, the defining classes of admissible transformations become increasingly restrictive. The absolute scale is the most restrictive of all. For any two measures K and Ќ, there is only one admissible transformation: the identity transformation. That is, there is only one way in which the measurement can be made, so K and Ќ must be equal (Fenton and Pfleeger, 1997; Abran and Robillard, 1996).
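As a brief worked illustration of these admissible transformations (the temperature and duration figures below are standard conversions chosen for illustration, not data from the cited sources): rescaling a duration from hours to minutes is a ratio-scale transformation g(x) = ax with a = 60, so a 500-hour project remains twice as long as a 250-hour project after the transformation. Converting temperature from Celsius to Fahrenheit is only an interval-scale transformation g(x) = ax + b:

\[ F = 1.8C + 32, \qquad \frac{40}{20} = 2 \quad \text{but} \quad \frac{104}{68} \approx 1.53, \]

so the claim that 40°C is "twice as hot" as 20°C is not preserved under the transformation, which is why ratio statements require at least a ratio scale.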

2.5 Software Metrics and Software Functionality

A software metric is defined as a method of quantitatively determining the extent to which a software process, product, or project possesses a certain attribute (Daskalantonakis, 1992). In general, software metrics can be established in three steps: metrics collection, metrics computation and metrics evaluation. Figure 2.1 illustrates the process for establishing a metrics baseline (Pressman, 1993; 2001). According to Daskalantonakis (1992), metrics can be classified into three types across the software life cycle: process metrics, product metrics and project metrics.


(1) Process metrics: are those that can be used for improving the software development and maintenance process. Examples of such metrics include the defect containment effectiveness associated with defect containment process (e.g., inspection and testing), the efficiency of such processes, and their cost (Daskalantonakis, 1992).

Figure 2.1: Software Metrics Collection Process (Pressman, 2001). [The figure shows the software engineering process, software project and software product feeding a data collection step that yields measures, a metrics computation step that yields metrics, and a metrics evaluation step that yields indicators.]

(2) Product metrics: are those that can be used for improving the software product. Examples of such metrics include the complexity of the design, the size of the source code, and the usability of the documentation produced (Daskalantonakis, 1992).

(3) Project metrics: are those that can be used for tracking and improving a project. Examples of such metrics include the number of software developers, the effort allocation per phase of the project and the amount of design reuse achieved by the project (Daskalantonakis, 1992).

Functionality is the number of functions a system provides. Many software engineers argue that length is misleading, and that the amount of functionality inherent in a product paints a better picture of product size. In particular, those who generate effort and duration estimates from early development products often prefer to estimate functionality rather than physical size. As a distinct attribute, functionality captures an intuitive notion of the amount of functions contained in a delivered product or a description of how the product is supposed to be (Fenton and Pfleeger, 1997). The relative amount of functionality of the various components of a system is reflected by the set of weight coefficients (Witting et al., 1997).

There have been several attempts to measure the functionality of software products. Fenton and Pfleeger (1997) report: "we examine three approaches; Albrecht Function point, Demarco's specification weight, and the COCOMO approach to Object Points. Each of these measures is derived as part of a larger effort to supply size information to a cost or productivity model, based on measurable early products, rather than estimates of LOC". All three approaches measure the functionality of specification documents, but each can also be applied to later life-cycle products to refine the size estimate and therefore the cost or productivity estimate. Indeed, our intuitive notion of functionality tells us that if a program P is an implementation of specification S, then P and S should have the same functionality (Fenton and Pfleeger, 1997).

2.6 Applications of Software Measurement

Software measurement can be used to assist both managers and engineers. For example, managers can measure the time and effort involved in the various processes that comprise software production, such as the time it takes for staff to specify, design, code and test the system. By measuring the time it takes to perform each major development activity and calculating its effect on quality and productivity, they can weigh the costs and benefits of each practice to determine whether the benefit is worth the cost. Managers can also measure software quality, enabling them to compare different products, predict the effects of change, assess the effects of new practices, and set targets for process and product improvement. In addition, managers can assess usability, reliability and functionality by determining whether all of the requested requirements have actually been implemented properly. Furthermore, software measurement can assist engineers by analysing the requirements and measuring the number of faults in the specification, design, code, and test plans. Engineers can measure the characteristics of the products and processes that tell us whether we have met standards, satisfied a requirement, or met a process goal. Thus, software measurement can lead to good estimation, understanding, control and improvement of software (Fenton and Pfleeger, 1997).

2.6.1 Estimation

Two important points about software estimation are: it’s best to understand the background of an estimate before you use it and it’s best to orient your estimation approach to the use that you are going to make of the estimate. We can often get both better and simpler estimates if we keep the use of our estimate in mind (Boehm and Fairley, 2000). There are many types of software estimation such as cost, effort, performance, errors, defects, time, etc. Our focus in this research will be on three types of software estimation, which are size, effort and cost.


Software cost estimation is an important activity during software development. Cost has to be estimated continually throughout all the software development phases, and it depends on the nature and characteristics of a project. At any phase, the accuracy of the estimate depends on the amount of information known about the final product. Software cost estimation depends mainly on software size (Ram and Raju, 2000). Software size is the key input for existing cost estimation models (Shoval and Feldman, 1996). Accurate cost estimates are an essential element for providing competitive bids and remaining successful in the market (Ruhe et al., 2003).

2.6.2 Controlling

Measurement of software development productivity is needed in order to control software costs (Banker et al., 1994). Software organisations adopt measures, as components of measurement plans in order to obtain the quantitative insight necessary to better manage and improve the development process and its products (Fenton and Pfleeger, 1997). At any organisational level, measures allow management to understand, assess, compare, plan, control and monitor the software process and its related products, and resources (Cantone and Donzelli, 2000). Several companies are beginning to realize the important role that software metrics can play in planning and controlling software projects, as well as improving software processes, products, and projects over time. Such improvement results in increased productivity and quality, and reduced cycle time, all of which make a company competitive in the software business (Daskalantonakis, 1992; Kitchenham and Pfleeger, 1995).


2.7 Function Point Measure

Function points are generally used for measuring software from a user perspective (April et al., 1997). This method measures software by quantifying the functionality that the software provides to the user, based primarily on its logical design. It is a unit of measure for software, much as the hour is for measuring time, the mile is for measuring distance or degrees Celsius are for measuring temperature (Hastings and Sajeev, 1997; 2001; Rakitin, 2001). The function point measure was developed initially for information systems. The method for counting function points has been well documented by the IFPUG (Rakitin, 2001). In 1986, the IFPUG was established with the aim of defining the standard and spreading the function point measure all over the world (Tortora et al., 2000).

During the last few years, the function point metric has gained increasing approval among application developers. They have exploited the function point count for two main tasks. The first task is concerned with productivity monitoring and can be accomplished by estimating the number of function points delivered per person-month employed. The other is the estimation of software development cost (Tortora et al., 2000). Matson et al. (1994) report that the function point approach has features that overcome the major problems with using LOC as a measure of the physical size of systems. First, function points are independent of the language, tools or methodologies used for implementation; unlike LOC, they are not technology dependent, as discussed by Symons (1988) and Low and Jeffery (1990). Second, function points can be estimated from requirements specifications or design. Currently, function points are used by numerous large Australian organisations in the measurement of productivity for project review purposes and effort estimation (Low and Jeffery, 1990). The function point model developed by Albrecht (1979) is widely used in the USA, Europe and other parts of the world. It has given the IT industry a standard against which productivity can be measured and compared across teams, divisions and organisations (Shoval and Feldman, 1996; Vickers, 1998).

2.8 Function Point Analysis

Function point analysis is a well-known method to estimate the size of software systems and software projects. However, since it is based on functional documentation, it is hardly used for sizing legacy systems, in particular enhancement projects (Klusener, 2003). Function point analysis is a method that divides systems into smaller components so that they can be better understood and analysed. Function point analysis is very similar to completing a functional decomposition (Longstreet, 2002). It can also be referred to as a structured technique for classifying the components of a system (Hastings and Sajeev, 1997; 2001). Analysing function points involves three main steps. The first step determines the five function point components, which are EI, EO, EQ, EIF and ILF. The second is assigning the complexity weights. The third is determining the general system characteristics, which include 14 technical complexity factors.


2.8.1 Function Point Components

The function point has five components, which are described in the IFPUG manual (Longstreet, 2002). They include the following:

1. Internal Logical Files (ILF): An ILF is a user-identifiable group of logically related data that resides entirely within the application boundary and is maintained through External Inputs. The IFPUG manual presents a definition of the ILF entity type, as well as identification rules to ensure that each and every ILF can be clearly recognised within software (Jacquet and Abran, 1997).

2. External Interface Files (EIF): An EIF is a user-identifiable group of logically related data that is used for reference purposes only. The data resides entirely outside the application boundary and is maintained by another application's external inputs. An external interface file is an internal logical file of another application.

3. External Inquiries (EQ): This is an elementary process with both input and output components that results in data retrieval from one or more ILFs and EIFs. The input process does not update or maintain any File Types Referenced (FTRs), which may be ILFs or EIFs, and the output side does not contain derived data.

4. External Inputs (EI): This is an elementary process in which data crosses the boundary from outside to inside. This data may come from a data input screen or another application. The data may be used to maintain one or more ILFs. The data can be either control information or business information. If the data is control information, it does not have to maintain an ILF.

5. External Outputs (EO): An EO is an elementary process in which derived data passes across the boundary from inside to outside, and may update an ILF. The data creates reports or output files sent to other applications, where the reports and files are created from information contained in one or more ILFs and EIFs.

2.8.2 Function Point Complexity Weights

The original function point complexity weights are organised into six tables that represent three levels of complexity: simple, average, and complex. Five of the weight tables are called the rule tables (refer to the IFPUG manual) and the sixth is called the general weights table, as shown in Table 2.2.

Table 2.2: Function Point Complexity Weights (Longstreet, 2002)

Function Point Components     Simple      Average      Complex
External inputs               3 * EI      4 * EI       6 * EI
External outputs              4 * EO      5 * EO       7 * EO
External inquiries            3 * EQ      4 * EQ       6 * EQ
External interface files      7 * EIF     10 * EIF     15 * EIF
Internal logical files        5 * ILF     7 * ILF      10 * ILF

Each function point component count is multiplied by the corresponding weight value, and the summation of all products represents the Unadjusted Function Points (UFP).
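To make the weighting step concrete, the following Python sketch applies the Table 2.2 weights to a set of component counts; the counts passed in the example call are invented for illustration and are not taken from any of the cited data sets.

# Illustrative sketch of the IFPUG weighting step (Table 2.2).
# The counts passed to unadjusted_function_points() below are made-up example values.

WEIGHTS = {                      # component -> (simple, average, complex)
    "EI":  (3, 4, 6),            # external inputs
    "EO":  (4, 5, 7),            # external outputs
    "EQ":  (3, 4, 6),            # external inquiries
    "EIF": (7, 10, 15),          # external interface files
    "ILF": (5, 7, 10),           # internal logical files
}
LEVELS = {"simple": 0, "average": 1, "complex": 2}

def unadjusted_function_points(counts):
    """counts maps (component, level) -> number of items, e.g. ("EI", "average") -> 5."""
    total = 0
    for (component, level), n in counts.items():
        total += n * WEIGHTS[component][LEVELS[level]]
    return total

if __name__ == "__main__":
    example = {("EI", "simple"): 4, ("EO", "average"): 6, ("EQ", "complex"): 2,
               ("ILF", "average"): 3, ("EIF", "simple"): 1}
    print(unadjusted_function_points(example))   # 4*3 + 6*5 + 2*6 + 3*7 + 1*7 = 82

Running the example gives 4x3 + 6x5 + 2x6 + 3x7 + 1x7 = 82 UFP.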

2.8.3 Function Point Complexity Factors

These factors are called general system characteristics (GSC) and are intended to measure general aspects of size. The objective of calculating these factors is to adjust the final function point count for better prediction. The complexity factor of a system is determined according to the 14 technical attributes shown in Table 2.3. Each factor is assigned a rating between 0 (low) and 5 (high).

Table 2.3: The Internal Complexity Factors (Longstreet, 2002)

No   Factor Name                      No   Factor Name
1    Data communications              8    On-line update
2    Distributed data processing      9    Complex processing
3    Performance                      10   Reusability
4    Heavily used configuration       11   Installation ease
5    Transaction rate                 12   Operational ease
6    On-line data entry               13   Multiple sites
7    End-user efficiency              14   Facilitate change

The sum of their rating scores is called the Total Degree of Influence (TDI). This expresses the influence of the 14 GSC on the overall complexity of the software system (Gao and Lo, 1996; Finnie et al., 1997). These factors are also called Technical Complexity Factors (TCF); they give a clear picture of the internal complexity of any software and are used to derive an adjustment value. The adjustment value is used to adjust the final function point count.

2.8.4 Function Point Counting Procedure

The following are the major steps of function point analysis:
1. Determine the type of function point count: development project, enhancement project or application.
2. Identify the application boundary.
3. Determine the unadjusted function point count.
4. Determine the value adjustment factor.
5. Calculate the final adjusted function point count.

These steps are explained clearly in the function point manual and by the Information Technology Services department (ITS, 2002). Kusumoto et al. (2002) explain in detail all the function point counting steps, which include identifying the counting boundary, counting data function types, counting transactional function types and determining the unadjusted function point count, as given in the IFPUG manual (Longstreet, 2002). This count includes three main steps:

1. Computing the Unadjusted Function Points Count (UFPC): this part of the calculation is performed over the five components, as presented in Eq. 2.1.

   UFPC = Σ (i=1 to 5) (Number of items of component i) × (Weight of component i)        (2.1)

2. Computing the Adjustment Value: the adjustment value is sometimes called the Adjusted Function Points value, where each complexity factor takes a value on the ordinal scale 0 to 5 as its degree of influence. For this purpose, Eq. 2.2 is used. The two constants 0.65 and 0.01 are essential parts of the function point equation.

   Adjusted Function Points Value = 0.65 + 0.01 × Σ (i=1 to 14) TCFi        (2.2)

3. Computing the Final Function Points Count: the final count of function points can be measured by the Albrecht model, which is given in Eq. 2.3. To use this model, the software development organisation has to maintain a database of its projects, including duration, cost, effort and function points. Based on this database, the cost (in terms of time and money) of one function point can be computed (Shoval and Feldman, 1996).

   Function Points Final Count = UFPC × Adjusted Function Points Value        (2.3)
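A minimal sketch of the adjustment and final-count steps (Eqs. 2.2 and 2.3) follows; the UFPC value of 82 reuses the illustrative total from the previous sketch, and the 14 degree-of-influence ratings are likewise invented for illustration.

# Illustrative sketch of Eqs. 2.2 and 2.3; the inputs below are made-up example values.

def adjustment_value(gsc_ratings):
    """gsc_ratings: list of 14 degrees of influence, each an integer from 0 to 5 (Eq. 2.2)."""
    assert len(gsc_ratings) == 14 and all(0 <= r <= 5 for r in gsc_ratings)
    return 0.65 + 0.01 * sum(gsc_ratings)

def final_function_points(ufpc, gsc_ratings):
    """Final adjusted function point count (Eq. 2.3)."""
    return ufpc * adjustment_value(gsc_ratings)

if __name__ == "__main__":
    ratings = [3, 2, 4, 1, 3, 5, 2, 0, 4, 3, 1, 2, 3, 2]   # TDI = 35
    print(adjustment_value(ratings))                        # 0.65 + 0.01*35 = 1.00
    print(final_function_points(82, ratings))               # 82 * 1.00 = 82.0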

2.8.5 Function Point Applications

Function points can be used to measure the functionality and the productivity of web-based applications (Morisio et al., 2000; Hiang, 2001). In general, function points make it possible to measure the performance of on-going and completed software developments, to estimate timescales, staffing and cost for future work, and to control requirements and scope creep (SMS, 2001). They can also be used for estimating software development effort for web applications (Ruhe et al., 2003).

There are many applications of function points during the software life-cycle inside a software organisation. These applications assist in the following activities:

 Estimating test cases and defects: There is a strong relationship between the number of test cases and the number of function points. The number of acceptance test cases can be estimated by multiplying the number of function points by 1.2. Like function points, acceptance test cases should be independent of technology and implementation techniques (Longstreet, 2003).

 Understanding wide productivity ranges: Inconsistent productivity rates between projects may be an indication that a standard process is not being followed. Productivity is defined as the ratio of outputs to inputs; for software, productivity is defined in terms of the amount of effort required to deliver a given set of functionality (Longstreet, 2003).

 Understanding scope creep: Scope creep is the insidious growth or change in the scale of a system during the life of a project. It typically involves adding or modifying features as a project evolves. It can increase costs, require time-consuming rework, delay the launch date and fray tempers (Longstreet, 2003). Scope creep can be tracked and monitored by understanding the functional size at all phases of a project. Frequently, this type of count is called a baseline function point count (Longstreet, 2002).

 Estimating overall project cost, schedule and effort: Software size is a key input for existing effort and cost estimation models. For example, COCOMO uses LOC or function points as the primary data for calculating nominal development effort. The function points approach, on the other hand, calculates function counts from its five components. Mukhopadhyay and Kekre (1992) estimate each of these software size metrics directly from user application features of process control systems.

 Understanding maintenance costs: Many maintenance budgets are established based on prior years of performance. Many organisations attempt to reduce maintenance cost and do not plan on increasing maintenance cost from year to year for any particular application. If the amount of new functionality is tracked and added to the base product, the unit maintenance cost may actually fall while actual maintenance expenditure remains constant or increases. If maintenance cost is increasing at a rate lower than the function point growth, then maintenance costs are actually falling (Longstreet, 2003).

 Defining when and what to re-engineer: Many re-engineering projects are undertaken without any cost-benefit analysis being done. Cost-benefit analysis seeks the best ratio of benefits to cost; this means, for example, finding the applications that will benefit the most from re-engineering efforts.

 Understanding the appropriate set of metrics: Of course, function points need to be collected in association with other measures; function points by themselves provide little or no benefit. Many metrics need to be reported at the organisational level. For example, both productivity/cost metrics and quality metrics need to be reported at the organisational level. Productivity/cost metrics are used to demonstrate the rate and cost of functionality that is being delivered. Quality metrics are used to demonstrate existing levels of quality and to track continuous improvement efforts in the software development process. These metrics should be tracked and trended (Longstreet, 2003).

 Using function points to help with contract negotiations: Contract managers can use their knowledge of function points to construct and manage projects based on the price per function point, as well as to compare vendor pricing. These individuals establish cost-effective use of third parties in development, validate bids based on function point size, and can evaluate the impact of cancelled projects. Function points can be used to help specify key deliverables to a vendor, to ensure that appropriate levels of functionality will be delivered, and to develop objective measures of cost effectiveness and quality. They are most effectively used with fixed-price contracts as a means of specifying exactly what will be delivered (Longstreet, 2003).

2.9 Extended Function Points Analysis Techniques

Since 1979, the function points measure has suffered from essential problems such as double counting, counter-intuitive values, accuracy of calculation, early life-cycle use, changing requirements, differentiating specified items, technology dependence, measurement theory and subjective weighting (Fenton and Pfleeger, 1997). Hence it has passed through many stages of development. A number of researchers have tried to develop the function points measure in order to solve some of its problems. They created various releases of the function points measure, such as Feature Points, 3D Function Points, Mark II Function Points, Full Function Points, COSMIC Full Function Points, and other methods.

2.9.1 Feature Points

The feature point is a software sizing method developed by Capers Jones of Software Productivity Research, Inc. in 1986. This technique considers the number of algorithms used in the application and slightly modifies some of the weights of the traditional function point constituents. It is designed to suit typical business applications, an area where function points have been applied successfully, and such applications will have the same size whether calculated with function points or feature points (Jones, 1998). The feature points measure accommodates applications in which algorithmic complexity is high; real-time, process control, and embedded software applications tend to have high algorithmic complexity and are therefore amenable to feature points (Pressman, 1993; 2001). This method is used for applying function point logic to system software such as operating systems, telephone switching systems, military systems, etc. (UCSE, 1997). Capers Jones's Feature Points, besides transactions and files, also consider the number and type of algorithms present in the application, but only at a global level (Meli, 1997a).

Feature Points work well if a system is high in algorithmic complexity. A sixth parameter is added to the original five of function points: the number of algorithms. An algorithm is defined as "a set of rules which must be described and encoded to solve a computational problem" (UCSE, 1997). Thus, the feature points approach considers the number of algorithms in a program and reduces the number of adjustment factors from 14 to only three (Jones, 1998). Although feature points consider the number and type of algorithms present in the application, they only count at the global level of the system, and there is no standards organisation, like the IFPUG, that is concerned with their use.

2.9.2 Asset-R Function Points

Reifer (1990) introduced the Asset-R technique in 1986. According to this technique, function point analysis is unable to handle four important characteristics inherent in real-time and scientific software: parallelism, synchronisation, concurrency, and intensive mathematical calculation. Reifer proposes the following system classification for generating the function point count:
(1) Data processing software for MIS = Function Point Analysis
(2) Scientific system = number of inputs + number of outputs + number of master files + number of modes + number of inquiries + number of interfaces
(3) Real-time system = number of inputs + number of outputs + number of stimulus/response relationships + number of rendezvous + number of modes + number of inquiries + number of master files + number of interfaces.

In this technique, the weighting values and the adjustment factors have been eliminated. This version does not weigh attributes differently; it assigns a coefficient of 1 to all attributes and uses other inputs to adjust the total for complexity. It also uses three additional attributes beyond those used in traditional function points. It is very difficult for this approach to cover three types of software, namely MIS, scientific systems and real-time systems, at the same time.

2.9.3 Mark II Function Points and Mark II Model

Charles Symons introduced the Mark II function point technique in 1988. This technique is the second most commonly used functional size measurement method after function point analysis and is supported by the UK Software Metrics Association (GIFPA, 1998b). The technique views software functionality in terms of logical transactions (input data, process, output data) and generates counts by multiplying the logical transactions by pre-defined weights, using published industry average weights or weights tailored to suit the client environment (Symons, 1991; UKSMA, 1998). Mark II function point analysis is a method for the quantitative analysis and measurement of information processing applications. It quantifies the information processing requirements specified by the user to provide a figure that expresses the size of the resulting software product (UKSMA, 1998). In the context of Mark II function point analysis, "information processing requirements" means the set of functions required by the commissioning user of the application software product (excluding any technical and quality requirements). The activity could be the development, change or maintenance of the software product needed to meet the requirements (UKSMA, 1998). This method is intended to comply with ISO 14143/1, the international standard for functional size measurement, and its analysis rules (UKSMA, 1998). The Mark II function point analysis technique considers the system to be measured as a collection of logical transactions, with each transaction consisting of an input, a process, and an output component (Symons, 1991). In addition, there are five new degrees of influence beyond the 14 adjustment factors of the GSC. These additional characteristics are interfaces, security and privacy, user training, third-party use and documentation. Clients can also define additional characteristics if necessary (UKSMA, 1998). Improvements of this method cover broader and more modern software development approaches such as Graphical User Interface (GUI), OOP, and client/server development, and it may be applicable to other software domains (GIFPA, 1998a).

The Mark II model is represented by Eq. 2.4. It computes function points by multiplying two factors, the Information Processing Size and the TCF:

   Function Points (FP) = (Information Processing Size) × TCF        (2.4)

1. Information Processing Size: According to this model, a system consists of logical transactions. A transaction performs a stand-alone business process that consists of one or more types of input, a process, and one or more types of output. The information processing size is measured as the number of UFP counted over all transactions of the system. For every transaction, the UFP is computed as shown in Eq. 2.5:

   UFP = Wi × (Number of input data elements) + We × (Number of entity types referenced) + Wo × (Number of output data elements)        (2.5)

where Wi, We and Wo are weights that can be calibrated. The model provides rules for counting the number of data element types and entity types in a transaction. The total UFP of the system is the sum of the UFP of its transactions (Shoval and Feldman, 1996).

2. TCF of Mark II: The Mark II model utilises Albrecht's approach with the following adjustments:
- The number of attributes on which the adjustment factor is based is 19. Moreover, the user may add more attributes that are specific to the project at hand.
- The weight of each attribute can be calibrated, and the TCF is computed as TCF = 0.65 + C × (sum of the degrees of influence of the 19 application characteristics, plus any client-defined characteristics), where C can be calibrated by the user (Shoval and Feldman, 1996).

Nevertheless, the weighting method is inconsistent because of the different scale types implemented in the Mark II model, and the weights assigned to each attribute are fixed by three constants that can be calibrated without any obvious criteria. The Mark II manual rules do not take into account the types of software with complex algorithms found in scientific and engineering software, nor do the rules specifically take into account real-time requirements. Applying Mark II function point analysis to these other domains may be possible, or may require extensions or new interpretations of the rules given in its manual.
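A minimal sketch of the Mark II transaction-level calculation (Eq. 2.5) is given below. The values Wi = 0.58, We = 1.66 and Wo = 0.26 are the commonly cited industry-average calibrations and are used here only as an assumption; the transaction counts themselves are invented.

# Sketch of the Mark II UFP calculation (Eq. 2.5), summed over logical transactions.
# The weights below are commonly cited industry-average values and can be recalibrated;
# the transactions themselves are made-up examples.

W_INPUT, W_ENTITY, W_OUTPUT = 0.58, 1.66, 0.26   # Wi, We, Wo (assumed calibration)

def mk2_transaction_ufp(inputs, entities_referenced, outputs):
    """UFP of one logical transaction (Eq. 2.5)."""
    return W_INPUT * inputs + W_ENTITY * entities_referenced + W_OUTPUT * outputs

def mk2_system_ufp(transactions):
    """Information processing size: sum of UFP over all transactions."""
    return sum(mk2_transaction_ufp(*t) for t in transactions)

if __name__ == "__main__":
    transactions = [(5, 2, 8), (3, 1, 12)]        # (inputs, entities, outputs) per transaction
    print(round(mk2_system_ufp(transactions), 2)) # 8.30 + 6.52 = 14.82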

2.9.4 Banker's Object Points

With the increasing popularity of OO development methods, the object point is currently a popular sizing research area. However, there is no "standard" measure of object points; various researchers define them differently. Three of the current approaches to object points are discussed in Ferens (1999). According to Banker, object points are similar to "basic" function points except that object counts are used instead of function counts (Banker et al., 1992). Currently, these object points are used primarily in developing CASE tools, although they may have other applications in the future. The four object types used are rule sets, Third Generation Language (3GL) modules, screen definitions, and user reports (Ferens, 1999).

2.9.5 3D Function Points

Between 1989 and 1992, the Boeing Company considered the use of function points to measure productivity. An internal Boeing Company document published in the proceedings of the Pacific Northwest Software Quality Conference in 1992 defined a technique called 3D Function Points. The Boeing Company developed this technique for real-time systems and engineered products. The technique was designed by Whitmire to address two classic problems associated with the Albrecht approach. First, the approach is considered difficult to use. Second, it does not properly measure scientific and real-time systems (Hastings, 1995; SCT, 1997). Pressman (2001) explains the three dimensions of 3D function points: the data dimension, the functional dimension, and the control dimension. In the data dimension, the internal program data structure and the external data (inputs, outputs, inquiries, and external references) are used along with measures of complexity to derive a data dimension count. The functional dimension is measured by considering the number of internal operations required to transform input data to output data. The control dimension is measured by counting the number of transitions between states. A state represents some externally observable mode of behaviour, and a transition occurs as a result of some event that causes the software or system to change its mode of behaviour. The data dimension in this technique is similar to Albrecht's approach. The functional dimension adds transformations, which are similar to algorithms. The control dimension adds transitions, which enumerate changes in the application state (SCT, 1997; Pressman, 2001). Abran et al. (1998) find that the concept of this technique is well described but the measurement rules are not detailed enough to allow the new function types to be identified without ambiguity, thus raising reliability issues for the results produced. Another major drawback of this technique is its reliance on a specific type of documentation (finite-state machines) in performing the measurement, which reduces its practicality. According to Hastings (1995), only this technique attempts to measure the complete size of a system regardless of application domain or implementation technology. The method has potential for measuring real-time systems but is rarely used outside Boeing (Morris, 1998).

On the other hand, the 3D function point is more difficult to use than the IFPUG version and does not cover all of its objectives. This attempt is rarely used outside the Boeing Company because the measurement rules are not detailed enough, and another major drawback of the technique is its dependence on a specific type of documentation.

2.9.6 Hallmark Cards

In 1982, Steve Drummond of Hallmark Cards, Inc., modified the Albrecht method in several respects. One modification expanded Albrecht’s three levels of complexity weights for each function element to the five levels presented in Table 2.4. The other change reduced the general application characteristics from 14 factors to eight (Putnam and Myers, 1992).


Table 2.4: The Hallmark Cards Complexity Weights (Putnam and Myers, 1992)

Function Point Components   Simple   Moderate   Average   Complex   Highly Complex
Input                       2        3          4         5         6
Output                      3        4          5         6         7
Inquiry                     2        3          4         5         6
Master Files                5        7          10        13        15
Interface                   4        5          7         9         10

The extension of the function point weights from three levels to five levels still suffers from the same problem in assigning the complexity weights. Furthermore, this attempt does not provide a description of the weights of each component as documented in the IFPUG version. Likewise, it still uses the ordinal scale of the original function point method, and software projects cannot be adequately represented on this type of scale.

2.9.7 Application Feature

Mukhopadhyay and Kekre (1992) propose the concept of application features, which attempts to develop a model to estimate function point counts early in the life-cycle. The method uses characteristics such as positioning and movement, which are very specific to the process control software found in the manufacturing domain, a category of real-time software, and thus not applicable to other domains (Abran et al., 1998). The method identifies three main application features of process control software in the manufacturing domain:
1. Communication Feature: This applies when a process requires real-time coordination with an upstream/downstream process or with a master process.
2. Position Feature: This feature is used to accurately position a component for machining into a subassembly, or onto a moving conveyor or automated guided vehicle.
3. Motion Feature: This feature is used in applications requiring synchronised movements of components and tools in processes.
Abran et al. (1998) find that the authors do not provide detailed procedures and rules for the measurement of the three application features identified. Thus, it is a challenge to use the method and generate results consistently.

The application feature method was designed without a full description manual. The documentation of this release does not consider different types of software such as scientific systems, MIS and GUI requirements. It also ignores measuring the general system characteristics and algorithmic complexity.

2.9.8 IFPUG Version 4.1

Albrecht's original function point analysis method has evolved over the last 20 years into the method now known as "IFPUG", though the original basic concepts and weighting methods have not changed since 1984. Over this same period, other methods have been put forward, each attempting to overcome weaknesses perceived in Albrecht's original model, or to extend its field of application (Symons, 2001). Figure 2.2 shows the development of some of the principal ideas for improving functional size measurement since Albrecht's original function point analysis method.


Figure 2.2: Evolution of Functional Size Measurement (Symons, 2001). [The figure is a timeline from 1980 to 2000 showing Allan Albrecht's FPA followed by Feature Points, MKII FPA and MKII FPA 1.3, 3-D Function Points, IFPUG 4.0 and 4.1 (UFP), Full Function Points V.1, COSMIC FFP V.2 and the ISO 'FSM' Standard.]

The IFPUG version is a modified version of Albrecht's function points. In the modification, the evaluation of the complexity of the software is subjectively established and the rules of the counting procedures are described. In this version, the counting procedure for function points consists of seven steps, according to the function point manual version 4.1 (Kusumoto et al., 2002). The function point counting rules have been improved continuously since function point analysis was proposed (Gao and Lo, 1996). The IFPUG has reviewed these rules, which focus on measuring the functional size of software from the user's point of view.

The ISO announced in early 2001 that recognition of function points as an international standard had taken a major step forward. By a large majority, the national bodies comprising ISO approved the application for recognition filed by the IFPUG. Following resolution of the comments accompanying the votes of approval, function point will become the first software functional sizing methodology to be recognised as an international standard (IFPUG, 2002). This version still suffers from some essential problems such as the subjectivity in the complexity weights.


Many researchers have tended to develop the function point measure by creating new releases, but none concentrated on the complexity weights of this measure, for the following reasons:

 Until the year 2000, there was no international organisation for gathering software metrics data. The International Software Benchmarking Standards Group only started this task at the end of the 1990s.

 Albrecht's method has been very helpful for sizing software that is to be built to bespoke requirements, which was the dominant activity of information systems departments in the 1970s and 1980s. But over the most recent decade many major companies have switched to using packaged software rather than creating their own bespoke software.

 One might have imagined that entering into a contract to outsource software development and maintenance would automatically lead to an emphasis on measuring the performance of the supplier. But such contracts are negotiated by accountants and lawyers under great time pressure and often in secrecy from the organisation being outsourced.

2.9.9 Full Function Points

In 1997, a new extension to function points, Full Function Points, was developed for measuring the functional size of real-time software and to address weaknesses of the IFPUG version. The International Software Benchmarking Standards Group (ISBSG) accepted Full Function Points as a new measurement standard for real-time software (Abran, 1999). The Software Engineering Management Research Laboratory (SEMRL) and the Software Engineering Laboratory in Applied Metrics (SELAM) made attempts to extend the capabilities of function point analysis to measure real-time software (Bootsma, 2000). Most of the counting procedure of this method follows the same steps as the function point analysis technique, with the exception of the control group, where different counting rules are applied (Abran, 1999). It is also recognised by the ISBSG as a valid functional size measurement technique. Several field tests have been conducted by organisations in the USA, Canada, Japan and elsewhere in North America and Asia with various types of software, such as real-time or embedded software, telecommunication software, power utility software, military software and some MIS software. The results indicate that Full Function Points produce a significantly different functional size when large numbers of embedded sub-processes are present, and that the effort reported when using this technique is similar to function point analysis (Maya et al., 1998).

2.9.10 COSMIC Full Function Points

COSMIC is a group established by members from six countries, Australia, Canada, Finland, the Netherlands, the UK and the USA, with the aim of achieving an international standard for software measurement. The COSMIC method is a refinement of the FFP, Mark II and FPA techniques, intended to address a variety of software domains, especially MIS and real-time systems (Symons and Rule, 1999). The COSMIC group has released the COSMIC measurement manual, which contains all the rules and the different components of this method (COSMIC, 1999). The method explicitly does not claim to measure the size of functionality that involves complex data manipulation (i.e. algorithms), and does not attempt to take into account the effect on size of technical or quality requirements (Symons, 2001). In contrast, the COSMIC function point data movement is tightly defined, and difficulties of ambiguous interpretation were not experienced in the field trials. The functional process in this method is a specific case of a use case. On the other hand, the COSMIC function point method does not concentrate on types of software with complex algorithms, nor on measuring the general system characteristics.

2.10 Limitations of Function Points (IFPUG Version)

Allan Albrecht proposes function points as a technology-independent measure of size but there are several problems with this measure, and users of the technique should be aware of its limitations. Fenton and Pfleeger (1997) demonstrate many limitations such as subjectivity in the technology factor, double counting, counter-intuitive values, accuracy, early life-cycle use, changing requirements, differentiating specified items, technology dependence, application domain, measurement theory and subjective weighting.

2.10.1 Weights Limitations and Their Effect on Software Cost Estimation

The choice of the weighting values presented earlier in Table 2.2, and the values in the rule tables of function point weights, is justified by Albrecht as reflecting "the relative value of the function to the user/customer" and as being determined by debate and trial. It is doubtful whether the weights will be appropriate for all users in all circumstances (Vickers, 1998). Function-point-based productivity measures suffer from weighting problems since their weights are based on analysing one single data set (Myrtveit and Stensrud, 1999). Albrecht's choice of these weights was determined by debate and trial. The original methodology was revised in 1984 and has since undergone continuous refinement; the weight coefficients, however, have remained unchanged (Witting et al., 1997).

The weights limitations are described precisely in this section according to previous studies and research, as follows:
1. The classification of function types into simple, average, and complex only simplifies the adjustments. This classification does not reflect the entire complexity of the work necessary to develop the user's systems (Symons, 1991; 1997).
2. The weights of the function point components were determined by Albrecht based on the analysis of projects developed in the IBM environment. Thus, they are not necessarily valid for projects developed in other vendor or user environments (Shoval and Feldman, 1996). Vickers (1998) reports that "we should remember that the weights are derived from study of projects within IBM; whatever its diversity, IBM is still only one organisation with its own culture".
3. Both more and less complex calculations have the same weight and the same number of function points, as explained by Júnior et al. (2001).
4. Symons (1991) finds that function points suffer from a weights weakness and notes that the choice of weights for calculating UFP is determined subjectively from the IBM experience.
5. The method of assigning the function point weights did not change until April 2001, because the original concepts and weighting style have not changed since 1984, as reported by Symons (2001) and Júnior et al. (2001).
6. The problem of the ordinal scale is the first problem that arises when assessing the size of a system in UFP. The classification of all system component types as simple, average and complex is not sufficient for all needs; it merely simplifies the calculations (Vickers, 1998). Abran and Maya (1995) discuss this problem in depth.
7. The problem of failure to measure large systems is another challenge for function points, especially in studying productivity. Albrecht (1983) found that productivity fell off by a factor of three as the size of systems increased from 400 to 2,000 function points. All of the problems listed above tend to suggest that function point analysis underweights large systems relative to small systems. If this is so, then Albrecht's worrying productivity trend may not be as serious as it appears (Vickers, 1998).
Consequently, the current complexity weights of function points are not accurate. These weights assign software components to three classes only, but in reality they must cover all expected states of any software project. For this reason, any software component should take its weight according to its degree of complexity (Ferens, 1999).

Software developers can bill their customers based on a given dollar-per-function-point rate. As the number of function points delivered increases, the dollars charged should also increase (Charley, 2002). According to the function type complexity matrices in the Counting Practices Manual (CPM) of IFPUG 4.1, a "high" ILF has at least 2 RET and more than 50 DET, and is valued at 15 unadjusted function points. Suppose that the value adjustment factor is calculated as 1.0; then, if a customer orders an ILF with, say, 3 RET and 60 DET, this will be counted as having 15 function points. Suppose, as a simple case, that a software developer averages five days to develop one function point and bills, say, $1,000 per function point. Now suppose another customer wants a large master file developed as an ILF, containing 180 DET and 9 RET. Since the numbers of DET and RET are each three times the amounts in the first example, one might initially expect the developer to charge a much higher price to develop this large master file. However, according to the CPM, such a large master file must still be counted as high, since it has more than 2 RET and more than 50 DET. If the developers are contracted to charge following the rules in the CPM, the developer will have to charge $15,000 for this large master file, and also promise to schedule this development in 75 days. This is clearly an impossible situation for the developer (Charley, 2002). This contradiction comes from the weakness of the existing function point complexity weights, which are established on an ordinal scale. More details are presented in Longstreet (2001; 2002). On the other hand, Meli (1997a; 1997b) reports that software applications do not have a visible "volume" or "size" allowing them to be arbitrarily dismantled and analysed according to their own constituent elements. Taking one element off the program can affect the proper functioning of the whole system. Therefore, applications cannot be put on an ordinal scale with respect to these attributes, since they are not homogeneous entities, and the very meaning of "volume" or "size" is not at all clear.
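The contradiction can be traced directly to the ordinal ILF complexity matrix. The sketch below reproduces the matrix as it is commonly documented for IFPUG 4.1 (RET rows against DET bands 1-19, 20-50 and 51+); the five-days and $1,000-per-function-point rates are the illustrative assumptions of the scenario above, not figures from the cited source.

# Sketch of how the ordinal ILF complexity matrix caps the count at 15 UFP.
# The matrix follows the commonly documented IFPUG 4.1 ILF rules; rates are example assumptions.

ILF_WEIGHTS = {"low": 7, "average": 10, "high": 15}

def ilf_complexity(ret, det):
    """Classify an ILF by its RET and DET counts (IFPUG 4.1 style matrix)."""
    det_band = 0 if det <= 19 else 1 if det <= 50 else 2
    if ret == 1:
        return ("low", "low", "average")[det_band]
    if ret <= 5:
        return ("low", "average", "high")[det_band]
    return ("average", "high", "high")[det_band]

def price_and_schedule(ret, det, dollars_per_fp=1000, days_per_fp=5):
    ufp = ILF_WEIGHTS[ilf_complexity(ret, det)]
    return ufp, ufp * dollars_per_fp, ufp * days_per_fp

if __name__ == "__main__":
    print(price_and_schedule(3, 60))    # (15, 15000, 75)  -- small "high" ILF
    print(price_and_schedule(9, 180))   # (15, 15000, 75)  -- file three times larger, same count

Both the 3 RET/60 DET file and the 9 RET/180 DET file fall into the same "high" cell, so both are priced and scheduled identically despite the threefold difference in data.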

2.10.2 Function Points Limitations with Accuracy and Scale Type

Jeffery and his colleagues (1993) have shown that the UFP seems to be no worse a predictor of resources than the Adjusted Function Points (AFP) count. The TCF, and hence the AFP, does not appear to affect the accuracy of the derived effort equations (Jeffery et al., 1993), so the TCF does not seem useful in increasing the accuracy of prediction. Recognising this, many function point users restrict themselves to the unadjusted count.

Kitchenham and Pfleeger (1995) have proposed a framework for evaluating measurements. They apply the framework to function point, noting that the function point calculation combines measures from different scales in a manner that is inconsistent with measurement theory. In particular, the weights and TCF ratings are on an ordinal scale, while the counts are on a high order scale, so the linear combinations in the function point formula are meaningless. They propose that the function point be viewed as a vector of several aspects of functionality, rather than as a single number.

2.11 Artificial Neural Networks

ANN are very sophisticated modelling techniques capable of modelling extremely complex functions. ANN are typically organised in layers made up of a number of interconnected nodes, which contain activation functions. Patterns are presented to the network via the input layer, which communicates with one or more hidden layers where the actual processing is done via a system of weighted connections. The hidden layers then link to an output layer, where the answer is output. ANN learn by example: the user gathers representative data and then invokes training algorithms to automatically learn the structure of the data. The performance of an ANN depends on the architecture and parameters of the network (Dolado, 2000; Finnie et al., 1997). Although the user needs to have some heuristic knowledge of how to select and prepare data, how to select an appropriate ANN, and how to interpret the results, the level of user knowledge needed to successfully apply ANN is much lower than would be the case using (for example) some more traditional nonlinear statistical methods. More simply, when an ANN is initially presented with a pattern, it makes a random guess as to what it may be. It then sees how far its answer is from the actual one and makes an appropriate adjustment to its connection weights (Freeman and Skapura, 1991). The initial weights and biases of the neurons are chosen randomly. Multiple runs are performed to find the correct settings of the parameters (Dolado, 2000). When neural networks (NN) have been trained and tested, they are called ANN.

2.12 General Applications of Artificial Neural Networks

The wide applicability of ANN is one of their most powerful characteristics. They can address many real-life tasks. In addition, their possible hardware implementations, due to their inherently parallel nature, make them ideal for real-time applications. ANN can be used in different applications such as:
1. Prediction
2. Diagnosis
3. Classification
4. Weather forecasting
5. Many other applications, including CAD/CAM, pattern completion tasks, image processing, medical applications, industrial measurement, and control (Vemuri, 1992; Lisboa and Taylor, 1993). Likewise, ANN are used in military systems, planning, control, search, artificial intelligence, power systems, human factors, etc. (Schalkoff, 1997).

On the other hand, Wilde (1997) lists more than 18 applications of ANN and demonstrates how the number of hidden neurons should be chosen, based on statistical estimation and non-linear function approximation.

2.13 Applications of Neural Networks in Software Engineering

ANN have many applications in the software engineering field, especially in the estimation domain. Some of these are:
1. Categorisation of programs: Kurfess and Welch (1996) use Kohonen networks, as self-organising maps, for the categorisation of programs. Programs with similar features are grouped together in a two-dimensional neighbourhood, whereas dissimilar programs are located far apart. BP networks are used for generalisation purposes based on a set of example programs whose relevant aspects have already been assessed. The categorisation and generalisation capabilities of ANN are employed to improve or verify the selection of parameters, and might even initiate the development of additional metrics.
2. Software effort estimation: Finnie and Witting (1996) examine the potential of two artificial intelligence approaches, i.e. ANN and case-based reasoning, for creating development effort estimation models. ANN can provide accurate estimates when there are complex relationships between variables and where high noise levels distort the input data. Their research examines both the performance of a BP network in estimating software development effort and the potential of case-based reasoning for development estimation using the same dataset. They conclude that ANN are successful in accurately estimating project effort in a large dataset of simulated project data, which is likely to have contained considerably less noise than typically occurs in project data. This dataset meets the requirement of sufficient observations for adequate training.
3. Predicting testability of program modules: Khoshgoftaar et al. (2000) present a case study of real-time avionics software to predict the testability of each module from static measurements of source code. The static software metrics take much less computation than direct measurement of testability. Thus, a model based on inexpensive measurements can be an economical way to take advantage of testability attributes during software development. They find that ANN are promising techniques for building such predictive models, because they are able to model non-linearities in relationships. Their goal is to predict a quantity between zero and one whose distribution is highly skewed towards zero. This is very difficult for standard statistical techniques. In other words, high-testability modules present a challenging prediction problem that is appropriate for ANN.
4. Dolado (2000) uses the structure of a feedforward ANN to learn the relationship between LOC and the independent variables number of data elements and number of relations. The hidden layer is composed of two neurons that simulate a logistic sigmoidal function.
5. Schofield (1998) combines function points and ANN in his work, building his model on the BP algorithm to estimate software project effort and development time. One of the inputs to that model is the Unadjusted Function Point value.
6. Boehm and Abts (2000) use ANN techniques for software effort estimation, where the inputs to the network are project size, complexity, languages, and skill levels. The output of the network is the effort estimate for the project in question. The network is constructed with four layers according to the BP algorithm.

2.14 Neural Networks Methods

Soucek et al. (1991) divide training algorithms into two main classes: supervised training and unsupervised training. A supervised training algorithm modifies the weights to produce the desired output pattern, whereas an unsupervised training algorithm modifies the weights to produce output patterns that are consistent. The BP algorithm, the Hopfield network, and the Boltzmann machine are typical examples of supervised training algorithms. A major example of an unsupervised training algorithm is Kohonen's self-organising feature map, as described in Lisboa and Taylor (1993). ANN can provide accurate estimates when there are complex relationships between variables and high noise levels distort the input data. The training of feedforward ANN often requires the existence of a set of input and output patterns, called the training set. This kind of training is called supervised training (Soucek et al., 1991).

2.14.1 Standard Back Propagation Algorithm

The BP network has served as a useful methodology to train multi-layer ANN for a wide variety of applications. BP is a supervised training algorithm for feed-forward networks that makes use of target values. It is basically a gradient descent method whose objective is to minimise the mean squared error between the target values and the network outputs (Lisboa and Taylor, 1993; Shamsuddin et al., 2000a; 2000b). Shamsuddin et al. (2000a; 2000b) improve this algorithm by developing the error signal of its training. The mechanism of BP begins by randomly assigning small initial weights, say in the range -0.1 to 0.1. The reason for this is to break symmetry so that the various intermediate cells can take on different roles (Gallant, 1993). Figure 2.3 presents the general steps of this algorithm (Freeman and Skapura, 1991).

1. Apply the input vector, xp = (xp1, xp2, ..., xpN)^t.
2. Calculate the net input values to the hidden layer units:
   net^h_pj = Σ_{i=1..N} w^h_ji xpi + θ^h_j
3. Calculate the outputs from the hidden layer: ipj = f^h_j(net^h_pj).
4. Move to the output layer and calculate the net input values to each unit:
   net^o_pk = Σ_{j=1..L} w^o_kj ipj + θ^o_k
5. Calculate the outputs: opk = f^o_k(net^o_pk).
6. Calculate the error terms for the output units: δ^o_pk = (ypk - opk) f^o_k'(net^o_pk).
7. Calculate the error terms for the hidden units: δ^h_pj = f^h_j'(net^h_pj) Σ_k δ^o_pk w^o_kj.
   Notice that the error terms of the hidden units are calculated before the connection weights to the output layer units have been updated.
8. Update the weights on the output layer: w^o_kj(t + 1) = w^o_kj(t) + η δ^o_pk ipj.
9. Update the weights on the hidden layer: w^h_ji(t + 1) = w^h_ji(t) + η δ^h_pj xpi.
The order of the weight updates on an individual layer is not important. Be sure to calculate the error term
   Ep = ½ Σ_{k=1..M} δ²pk,
since this quantity is the measure of how well the network is learning. When the error is acceptable for each of the training vector pairs, training can be discontinued.

Figure 2.3: The Standard Back Propagation Algorithm


In the BP network, the training rule is called the delta rule. With the delta rule, as with other types of BP, training is a supervised process that occurs in each cycle (i.e. each time the network is presented with a new input pattern) through a forward activation flow of outputs and the backward propagation of weight adjustments (Dolado, 2000). The weight adjustments are made using the delta rule, which contains the training rate η; this rate has a significant effect on the network performance. Usually, η must be a small number, on the order of 0.05 to 0.25, to ensure that the network will settle to a solution (Freeman and Skapura, 1991).

The training rate η must be chosen carefully, as it is crucial for the training speed that the algorithm can achieve. A very small η may slow down the convergence rate of the algorithm; on the other hand, a relatively large η may force the algorithm to oscillate between two points of the parameter space and never reach a minimum error value (Soucek et al., 1991). The difference between the actual and desired output is determined continuously and repetitively for each set of inputs and the corresponding set of outputs produced in response to those inputs.

A portion of this error is propagated backward through the network. At each neuron, the error is used to adjust the weights of the connections so that, in the next epoch, the error in the network response will be smaller for the same inputs. This process continues until the network performance on the test set is optimised (Finnie et al., 1997).
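To make the steps of Figure 2.3 concrete, the following C++ sketch runs the standard BP update on a tiny two-input network with one hidden layer and one output unit. The network sizes, the single training pair, and the logistic activation are assumptions chosen purely for illustration; this is a minimal sketch of the textbook algorithm, not the simulator used in this thesis.

#include <cmath>
#include <cstdlib>
#include <vector>

static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

int main() {
    const int N = 2;            // input units (e.g. DET and RET counts)
    const int H = 4;            // hidden units (an assumed size)
    const double eta = 0.1;     // training rate in the recommended 0.05-0.25 range

    double x[N] = {0.3, 0.6};   // one normalised training input (assumed values)
    double y = 0.5;             // its target output (assumed value)

    // Small random initial weights in [-0.1, 0.1] to break symmetry.
    auto rnd = []() { return 0.2 * std::rand() / RAND_MAX - 0.1; };
    std::vector<std::vector<double> > wh(H, std::vector<double>(N));
    std::vector<double> th(H), wo(H);
    double to = rnd();
    for (int j = 0; j < H; ++j) {
        th[j] = rnd();
        wo[j] = rnd();
        for (int i = 0; i < N; ++i) wh[j][i] = rnd();
    }

    for (int epoch = 0; epoch < 1000; ++epoch) {
        // Steps 1-3: forward pass through the hidden layer.
        std::vector<double> ih(H);
        for (int j = 0; j < H; ++j) {
            double net = th[j];
            for (int i = 0; i < N; ++i) net += wh[j][i] * x[i];
            ih[j] = sigmoid(net);
        }
        // Steps 4-5: forward pass through the output layer.
        double neto = to;
        for (int j = 0; j < H; ++j) neto += wo[j] * ih[j];
        double o = sigmoid(neto);

        // Step 6: error term of the output unit (logistic derivative o(1 - o)).
        double dout = (y - o) * o * (1.0 - o);
        // Step 7: error terms of the hidden units, computed before the output weights change.
        std::vector<double> dhid(H);
        for (int j = 0; j < H; ++j) dhid[j] = ih[j] * (1.0 - ih[j]) * dout * wo[j];

        // Steps 8-9: delta-rule weight updates on both layers.
        for (int j = 0; j < H; ++j) {
            wo[j] += eta * dout * ih[j];
            th[j] += eta * dhid[j];
            for (int i = 0; i < N; ++i) wh[j][i] += eta * dhid[j] * x[i];
        }
        to += eta * dout;

        double Ep = 0.5 * (y - o) * (y - o);   // error measure Ep of Figure 2.3
        if (Ep < 1e-6) break;                  // stop when the error is acceptable
    }
    return 0;
}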


2.14.2 Neural Networks Training

Assume that a certain amount of a priori information defining the desired system behaviour, such as sample input/output mappings or perhaps just sample inputs, is available to the ANN system designer. In supervised training, a set of "typical" I/O mappings forms a database which is denoted the training set. In a general sense, this set provides significant information on how to associate input data with outputs (Schalkoff, 1997).

The method used to adjust the weights in the process of network training is called the training rule as discussed in Section 2.14.1. The training can be supervised or unsupervised. The most widely used supervised training rule is the BP method. One kind of unsupervised training is self-organisation. In summary, the three essential ingredients of a computational system based on ANN are the transfer function, the architecture, and the training rule. It should be emphasised that computational models of this kind have only a metaphorical resemblance to real brains (Vemuri, 1992).

The training is one of the most important features of ANN. All knowledge in an ANN is encoded in the interconnection weights, and the training process determines the weights. A weight represents the strength of an association, i.e. the co-occurrence of connected features, concepts, propositions, or events during a training period. At the network level, a weight represents how frequently a receiving unit has been active simultaneously with the sending unit. Hence, the weight change between two units depends upon the frequency of both units having a positive output simultaneously (Vemuri, 1992). ANN can modify their behaviours in response to the environment. Given a set of inputs (perhaps with desired outputs), an ANN self-adjusts to produce consistent responses (Dracopoulos, 1997). Training in ANN means finding an appropriate set of weights (Freeman and Skapura, 1991). These weights are initially set to random values between -1.0 and +1.0. The network's inputs and outputs are usually bounded between 0 and 1. Most ANN contain some form of training rule, which modifies the weights of the connections according to the input patterns presented to the network. In a sense, ANN learn by example, as do their biological counterparts; a child learns to recognise cars from examples of cars (Lanubile et al., 1995). A more powerful way to speed up the training process is the momentum method, in which the value added to the weight change ΔWij includes a fraction α of the previous weight change. The momentum fraction α can start, for instance, at 0.4, and before tuning α the number of hidden neurons has to be adjusted first (Wilde, 1997).
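As an illustration of the momentum idea only, the following C++ fragment sketches a delta-rule weight update that adds a fraction α of the previous weight change; the function name and the layout of the weight vectors are assumptions and not part of the thesis simulator.

#include <vector>

// Sketch of a delta-rule update with momentum: each new weight change is the
// gradient term plus a fraction alpha of the previous change stored in dw_prev.
void update_with_momentum(std::vector<double>& w, std::vector<double>& dw_prev,
                          const std::vector<double>& grad_term,  // eta * delta * input per weight
                          double alpha /* e.g. 0.4 */) {
    for (std::size_t i = 0; i < w.size(); ++i) {
        double dw = grad_term[i] + alpha * dw_prev[i];  // add a fraction of the last change
        w[i] += dw;
        dw_prev[i] = dw;                                // remember it for the next epoch
    }
}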

2.14.3 Neural Networks Testing

The learned network must be tested on the testing set to verify that the ANN mapping is general to the entire process under investigation, and not limited to the training set. The inputs from the testing set are applied to the trained network, and the outputs are compared with the corresponding outputs in the testing set (Cazzanti et al., 1998).


2.15 Summary

Software measurement plays an important role in the software life-cycle. Function points and function point analysis are very important in software measurement activities such as software estimation and software control. This chapter has presented a general view of software measurement, software measures, software metrics, and function point techniques, and three essential problems of function points have been explained. A description of ANN with their applications in software engineering and software measurement has been presented. This chapter has also explained the motivation for employing ANN and the reasons for using the BP algorithm in this work.


CHAPTER 3

RESEARCH METHODOLOGY

3.1 Introduction

In this chapter, a new methodology for generating complete complexity weights for the function point measure is proposed. The chapter explains the methodology steps, which include analysis of the original complexity weights, the theoretical work, the general steps of preparing the training database, a description of the selected prediction tool, data collection, data analysis, and the measurement methods for results evaluation. The construction of this methodology depends on the nature of the subjective weights problem. The original complexity weights are used as baselines for establishing the proposed weights. This mechanism necessitates an accurate prediction tool such as the ANN; therefore, an improved BP algorithm is selected for this purpose.

3.2 General Description of Research Methodology

The steps of improving the function point weights depend on a prediction tool, logical operations, and algorithms. The tool used for predicting suitable complexity weights is the ANN, which uses an improved BP algorithm. The purpose of the logical operations and algorithms is to construct the requirements of the network, such as the training databases (DB). The network cannot be trained without these DB. In order to use the assigned network to predict the proposed complexity weights, it is necessary to build these DB (five DB for the five function point components) based on the original complexity weights, as shown in Figure 3.1. This method includes the following:

1. Analysing the original complexity weights.
2. Description of the theoretical solution.
3. Using the original weights (rule tables) as baselines for the solution.
4. Closing the open intervals of DET and RET in each rule table.
5. Applying the Mid Point rule in the open interval of each table.
6. Generating the main patterns (initial weights samples).
7. Generating more training patterns (increasing the weights samples).
8. Normalising the training patterns (the training DB).
9. Choosing a suitable architecture for the suggested network.
10. Training the network.
11. Testing the network outputs.
12. Generating the proposed weights tables.
13. Applying the proposed weights in the function point model, using suitable data sets of software projects, to find the final function point count.
14. Using the final count as the main parameter in the effort estimation model to compute project costs for comparison with the actual cost values.

[Figure 3.1 (flow diagram): the original weights (ordinal scale) are analysed and expanded by pattern-generation algorithms into the training DB; the network is trained and tested to produce the proposed complexity weights (absolute scale); both weight sets are applied to the ISBSG real data sets, sampled into seven samples, and the two function point counts and cost estimations are compared with actual values using four measurement methods to validate the proposed complexity weights.]

Figure 3.1: The General Steps of Research Methodology

3.3 Analysing the Original Weights

Analysing the original weights of the function point components can be done using the contents of their rule tables. These contents are constant values created by Albrecht (1979) and currently used by the IFPUG. They are constructed according to the ordinal scale with three classes: simple, average, and complex. These classes are assigned to different groups of Data Element Types (DET), File Types Referenced (FTR), and/or Record Element Types (RET). This section analyses the contents of these tables in depth. Contradicting information and a high percentage of discrepancies are found, because in most cases the same weight is assigned to very different groups of data. The correct way is to give each set of records and data elements a suitable weight value. There is a gap between the sets of DET and RET of the function point components and their given weights. The gap size represents the error in the weights, which reflects negatively on all types of estimation that use the function point measure. For example, when analysing the contents of the ILF rule table (Table 3.1), the following group of mathematical equations is established to discover the depth of the problem. This group consists of 14 equations, Eq. 3.1 to 3.14.

Table 3.1: Internal Logical File (ILF) Weights

                          Data Elements
Record Elements     1 - 19      20 - 50     > 50
1                   7           7           10
2 - 5               7           10          15
> 5                 10          15          15


From Table 3.1, we can explain how these equations are created. Suppose that we have 1 RET and 2 DET; then the weight assigned to this pair of DET and RET is 7. So, 1 RET with 2 DET take 7 weights, and for the purpose of comparison, we can rewrite this as: 1 RET with 2 DET = 7 weights. It is possible to replace the relation "with" by ®, where ® is any coordinating relation that can be used for this purpose.

1 RET ® 2 DET = 7 weights    (3.1)

Accordingly, the following equations are organised in the same way.

1 RET ® 19 DET = 7 weights    (3.2)
1 RET ® 20 DET = 7 weights    (3.3)
1 RET ® 50 DET = 7 weights    (3.4)
1 RET ® 51 DET = 10 weights    (3.5)

In Eq. 3.1 to 3.14, the weight values are shown on the right hand side. Eq. 3.6 to 3.14 are generated using different values of RET and DET.

2 RET ® 2 DET = 7 weights    (3.6)
2 RET ® 19 DET = 7 weights    (3.7)
5 RET ® 2 DET = 7 weights    (3.8)
5 RET ® 19 DET = 7 weights    (3.9)
2 RET ® 20 DET = 10 weights    (3.10)
2 RET ® 50 DET = 10 weights    (3.11)
5 RET ® 20 DET = 10 weights    (3.12)
5 RET ® 50 DET = 10 weights    (3.13)
6 RET ® 51 DET = 15 weights    (3.14)


Each equation consists of three elements joined by the coordinating relation ®: for example, Eq. 3.1 includes 1 record element type, 2 data element types, and the determined weight value 7.

Certainly there is a gap between the given sets of DET and RET of the ILF component and their assigned weights. The gap size represents the error in most of the weights; for instance, in Eq. 3.1 and 3.2, contradicting information is found. The comparison between these two equations shows that the right hand side of each equation equals 7, which means that the left hand sides should also be equal. This is a basic concept in mathematics, but it is not satisfied by the two equations, because 1 RET ® 2 DET ≠ 1 RET ® 19 DET: although 1 RET = 1 RET, 2 DET ≠ 19 DET. The same contradiction appears in Eq. 3.10 and 3.12, where 2 RET ® 20 DET = 10 weights and 5 RET ® 20 DET = 10 weights; the right hand sides take the same weight of 10, but the left hand sides are not equal, because 2 RET ≠ 5 RET. The other equations contain the same defect, and this defect is also found in the other components of the function point. On the other hand, Júnior et al. (2001) report that "there are at least two clear situations in function point analysis that do not accurately translate the function point measurement process", as can be observed in the data of the EI rule table; the other rule tables have the same problem. This does not mean that all Albrecht weights are inexact, but only a few of them are acceptable.
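To see the discrepancy concretely, the original ILF rule table can be written as a simple lookup, as in the C++ sketch below; the function name is hypothetical and the sketch only restates Table 3.1, showing that the widely different pairs of Eq. 3.1 and 3.9 receive the same weight of 7.

#include <cstdio>

// Sketch of the original ILF rule table (Table 3.1) as a lookup, illustrating
// how very different (RET, DET) pairs receive the same weight.
int ilf_weight(int ret, int det) {
    int row = (ret <= 1) ? 0 : (ret <= 5) ? 1 : 2;     // RET classes: 1, 2-5, >5
    int col = (det <= 19) ? 0 : (det <= 50) ? 1 : 2;   // DET classes: 1-19, 20-50, >50
    static const int table[3][3] = { {7, 7, 10}, {7, 10, 15}, {10, 15, 15} };
    return table[row][col];
}

int main() {
    // The pairs of Eq. 3.1 and Eq. 3.9 both map to a weight of 7.
    std::printf("1 RET, 2 DET  -> %d\n", ilf_weight(1, 2));   // 7
    std::printf("5 RET, 19 DET -> %d\n", ilf_weight(5, 19));  // 7
    return 0;
}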


3.4 Mathematical Representation of Complexity Weights

The original complexity weights designed by Albrecht (1979; 1983) are described in this section by the mathematical formulas 3.15, 3.16 and 3.17 for each complexity level, as follows:

D = {ds1, ds2, ds3, …, dsn}, 1 ≤ n ≤ Max_DET    (3.15)

T = {t1, t2, t3, …, tk}, 1 ≤ k ≤ RET_no    (3.16)

W = {wi}, where i is a unique element for each level of complexity    (3.17)

where D denotes the set of DET, Max_DET represents the maximum number of data elements in each level of the ILF component, and T denotes the set of RET. Likewise, RET_no denotes the maximum number of record elements in each level. DT is a set of ordered pairs, where each pair consists of two elements x and y, the first element being in D and the second in T. A relation R1 can be defined on DT as follows: x R1 y, where x and y are integer numbers, x > y and y > 0. R1 is a symmetric relation because, for any two elements x ∈ D and y ∈ T, y R1 x is true whenever x R1 y is true. W in formula 3.17 represents a set of weights, which will be associated with the DT elements through another relation R, defined as follows: (x, y) R z means that x and y are integer numbers, x > y, y > 0, and z is a positive real number in the range of the Albrecht weights, where (x, y) is an ordered pair in DT and z ∈ W. In the relation R, DT is the source (compound) set and W is the destination (range) set, as shown in formula 3.18. In this formula, the function point weights problem is represented mathematically by using the relations R and R1, where different input elements (DET, RET/FTR) in DT have the same weight. This is shown in Figure 3.2.


(D R1 T) R W = {((ds1, t1), wi), ((ds2, t1), wi), …, ((dsn, t1), wi), ((ds1, t2), wi), …, ((ds1, tk), wi), …, ((dsn, tk), wi)}    (3.18)

All formulas from 3.15 to 3.18 show the static state of the Albrecht weights. The formulas 3.17 and 3.18 are established according to Albrecht's idea of assigning weights, whereas formula 3.19 is established according to the idea of the proposed complexity weights.

Ū = {ū1, ū2, …, ūn}    (3.19)

where Ū represents a set of different complexity weight values corresponding to different values of DET and RET. These values are indexed from 1 to n, where n is an integer representing the domain of these weights, which are ordered according to the absolute scale.

Accordingly, as with formula 3.19, the relation Ŕ between DT and Ū can be defined as follows: (x, y) Ŕ z means that x and y are integer numbers, x > y, y > 0, and z is a positive real number in the range of available weights, where (x, y) is in DT and z ∈ Ū, as shown in formula 3.20. In this formula, each compound element in DT is related to a unique weight element in Ū.

(D R1 T) Ŕ Ū = {((ds1, t1), ū1), ((ds2, t1), ū2), …, ((dsn, t1), ūn), ((ds1, t2), ūn+1), …, ((dsn, t2), ū2n), …, ((ds1, tk), ūn×(k-1)+1), …, ((dsn, tk), ūn×k)}    (3.20)

[Figure 3.2 (diagram): the compound set DT is mapped by the relation R onto the weight set W.]

Figure 3.2: Mathematical Representation of the Original Complexity Weights

The relations R and Ŕ each involve three components, two inputs and one output, as presented in Figures 3.2 and 3.3. The suggested solution for improving the Albrecht complexity weights can be represented mathematically as shown in formulas 3.19 and 3.20 and as presented in Figure 3.3.

[Figure 3.3 (diagram): the compound set DT is mapped by the relation Ŕ onto the weight set Ū.]

Figure 3.3: Mathematical Representation of the Proposed Complexity Weights

The objective of this research will be satisfied according to the concept of the relation Ŕ. The formulas 3.19 and 3.20 are defined mathematically so that they can be implemented practically using ANN techniques.
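The idea behind the relation Ŕ can be sketched in C++ as a mapping in which every (DET, RET) pair receives its own weight entry, in contrast with the shared class weights of the relation R. The interval limits below follow the closed ILF intervals derived in Chapter 4, the zero entries are only placeholders for the values later predicted by the network, and the container choice is an illustrative assumption rather than part of the thesis implementation.

#include <map>
#include <utility>

int main() {
    // (DET, RET) pair -> its own weight, one entry per compound element of DT.
    std::map<std::pair<int, int>, double> proposed;
    const int maxDet = 93, maxRet = 10;          // closed ILF intervals (see Chapter 4)
    for (int ret = 1; ret <= maxRet; ++ret)
        for (int det = 1; det <= maxDet; ++det)
            proposed[{det, ret}] = 0.0;          // placeholder, to be filled by the ANN prediction
    return 0;
}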

3.5 Computer Resources

The selected ANN algorithm is the standard BP algorithm with an improved error signal using a higher-order sigmoid activation function, developed by Shamsuddin et al. (2000a). Section 3.6.2 describes the improved model of this algorithm. It is implemented using the C++ programming language. C++ was chosen mainly because of its modularity and portability; moreover, C++ supports the complex processing needed during the training steps. The statistical analysis of the results is performed with Microsoft Excel spreadsheets, and the software used for calculating the total effort values is the ISBSG reality checker tool.


3.6 Artificial Neural Networks

ANN have many applications in the software engineering and software measurement fields, especially in estimation and prediction. This research concentrates on using ANN for prediction, for several reasons.

3.6.1 Reasons for Using Artificial Neural Networks

Using ANN in this research is important because strong and highly effective prediction tools are needed for predicting the complexity weights that will be created. There are many reasons that put ANN at the front of the tools used in this research. These reasons are:

1. ANN have several features which make them attractive prospects for solving pattern recognition and prediction problems without having to build an explicit model of the system (Finnie and Witting, 1996; Finnie et al., 1997).

2. ANN include many training algorithms and prediction methods, which can be selected according to the nature of the problem (Freeman and Skapura, 1991).

3. Hwang (1992) uses ANN and function points together in his work. He implements function point analysis on the ANN. The purpose of his work is to compare the accuracy of the results of the linear regression approach with that of the ANN.

4. ANN are one of the most common software estimation methods. They have been used successfully in many software metric modelling studies, such as estimating the development effort required for software systems with a given set of requirements (Gray and Macdonell, 1996; Finnie et al., 1997).

Moreover, this study takes a new direction by using ANN as a tool to improve the complexity weights of function points. An improved BP algorithm is used as the selected tool for this purpose.

3.6.2 Using an Improved Model of Back Propagation Algorithm

This research uses a modified model of the standard BP algorithm. Shamsuddin (2000) and Shamsuddin et al. (2001a) increased the convergence rate of BP learning by modifying the error function of Barry and Kwasny (1991), defining it implicitly as the sum over the output units given in Eq. 3.21:

Σ_k ρk    (3.21)

with ρk = Ek² / (2ak(1 - ak²)), where
Ek = tk - ak is the error at output unit k,
tk is the target value of output unit k, and
ak is the activation of unit k.

By taking partial derivatives for the weight update using the chain rule, they generate an error signal for the modified BP at the output layer, as presented in Eq. 3.22:

δk = 2(Ek + ρk(1 - 3ak²)) / (1 + ak)    (3.22)

The error signal for the modified BP at the hidden layer is the same as in standard BP, δj = Σ_k δk wkj f(aj), where wkj is the weight on the connection between units j and k, and f(aj) is a sigmoid function of the form 1/(1 + e^(-2x)).

The modified BP converges faster than the standard BP, and the number of iterations is also smaller. Hence, the simulator of the modified BP model is implemented in this research for prediction purposes. This simulator is written in C++ and the full source code is given in Appendix D-1.
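For illustration, the modified output-layer error signal of Eq. 3.21 and 3.22 can be written directly as a small C++ function; the function name is hypothetical, and the code assumes 0 < ak < 1 so that the denominators do not vanish.

#include <cmath>

// Sketch of the modified output-layer error signal (Eq. 3.21-3.22).
// t is the target value of output unit k, a its activation; assumes 0 < a < 1.
double modified_output_delta(double t, double a) {
    double E   = t - a;                                // Ek = tk - ak
    double rho = (E * E) / (2.0 * a * (1.0 - a * a));  // rho_k of Eq. 3.21
    return 2.0 * (E + rho * (1.0 - 3.0 * a * a)) / (1.0 + a);  // delta_k of Eq. 3.22
}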

3.6.3 Reasons for Using the Back Propagation Algorithm

The BP algorithm is one of the most common software estimation methods and can be implemented by users who lack knowledge of optimisation methods and computational implementation techniques. Moreover, it is an excellent tool for academic teaching (Cerqueira et al., 2000). Many researchers recommend using this algorithm, which has the following advantages:

1. When implemented in parallel, it only uses the communication channels already used for the operation of the network itself (Wilde, 1997).
2. It is the most popular algorithm in ANN, and it adjusts the network weights by iteration until an error tolerance defined by the user is reached (Lanubile et al., 1995).
3. It may lead to faster convergence than other methods such as Jacobi's method (Freeman and Skapura, 1991).
4. Unknowns are represented by each layer.
5. There is no need to solve a quadratic equation.
6. It may be expanded to solve higher-order systems of equations (Langley, 2001).
7. BP is good at generalisation, which means that, given several different input vectors all belonging to the same class, a BP network will learn to key off significant similarities in the input vectors, and irrelevant data will be ignored (Freeman and Skapura, 1991).

3.6.4 Network Architecture

The general architecture of the network used is presented in Figure 3.4. This network consists of two input nodes (data elements and record elements/file type references), one hidden layer of many nodes, and one output node. The proposed weights are planned to be organised in five large tables, each associated with a single function point component. Figure 3.4 also shows how the network is fed with the input sets. Before use, the inputs must be normalised to a suitable range, such as 0 to 1 or -1 to +1.

Figure 3.4: The Architecture of the Used Network


3.7 Derivation of Training Data

ANN learn by example: the user gathers representative data and then invokes a training algorithm to automatically learn the structure of the data. In order to train the network, it is necessary to build the training DB (five DB for the five function point components) based on the original complexity weights, as presented in Figure 3.5.

[Figure 3.5 (flow diagram): the Albrecht weights tables, the Mid Point Rule, and a pattern-generating algorithm produce the general patterns, which are expanded into enough training patterns (DB), normalised, and split by the training mechanism into a training set and a testing set.]

Figure 3.5: General Steps of Establishing Training Data

The figure describes the different stages of creating the training data as a process for generating the function point complexity weights. These stages, which are explained in detail in Chapter 4, include the following:




- Using the Albrecht weights tables as baselines.
- Closing the open intervals of DETs and RETs for each rule table.
- Applying the Mid Point Rule for calculating weights samples.
- Increasing the weights samples.
- Normalisation of the training database.
- Generating the proposed complexity weights.

3.8 Normalisation of Training Data

The training process starts after data normalisation and continues for a number of iterations until the training error is small enough. The essential operation before the training process is normalisation. To apply the training algorithms correctly, all data are normalised and then renormalised afterwards for the evaluation of the predictions. The inclusion of the maximum and minimum of each variable is a necessary condition for good generalisation (Dolado, 2000). The objective of the normalisation is to obtain values in the range -1 to +1 or 0 to +1. This operation makes the ANN training quick and easy during processing. In this work, the training data are normalised using the model that was used for normalising the training patterns when improving the error signal of the BP model (Shamsuddin, 2000; Shamsuddin et al., 2001a). This process is carried out after constructing the training database in Chapter 4.
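As a sketch of the idea only, a generic min-max normalisation of one variable into the range 0 to 1 can be written as follows; the actual normalisation in this work follows the model of Shamsuddin (2000), so the function below is an illustrative assumption rather than the thesis procedure.

#include <algorithm>
#include <vector>

// Min-max normalisation of one variable into [0, 1]; the maximum and minimum
// of the variable are included, as required for good generalisation.
std::vector<double> normalise(const std::vector<double>& x) {
    double lo = *std::min_element(x.begin(), x.end());
    double hi = *std::max_element(x.begin(), x.end());
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = (hi > lo) ? (x[i] - lo) / (hi - lo) : 0.0;  // guard against a constant column
    return out;
}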


3.9 Collection of Measurement Data

This section discusses the data sets that are collected by the International Software Benchmarking Standards Group (ISBSG) organisation. The ISBSG has collected more than 1200 real software projects. These data are collected for assisting researchers. The principal purposes for compiling the project database are: a) To provide IT practitioners with industry output standards against which they may compare their aggregated or individual projects, and b) To provide accurate, current and comprehensive data about software development in many countries so that the relative delivery rates and the factors impacting upon delivery may be analysed. All data are stored with strictest confidentiality.

There were several reasons for using these data, which include: a) the collected data represent real projects from several countries around the world; b) the accumulated historical project data include the development effort together with the function point counts and all their component data sets; furthermore, in all cases the total development effort is given in working hours (actual effort). More than 200 projects that include the function point components data were used in this research. The data sets were arranged for research purposes so that results can be compared like for like against the latest version of function points, documented in the year 2000 by the IFPUG. This version is not comparable with other versions of function points such as feature points, 3D function points, full function points and other variants.


3.10 Data Sampling

The total number of projects used to validate the proposed complexity weights is 208 real industrial projects of small, medium, and large sizes. All these projects are business applications, ranging in size from 10 function points to 5684 function points. The research data are applied in two ways: firstly, all projects are used as a single sample; secondly, they are used as separate subsamples, with the whole data set divided into seven subsamples. The division of the data follows Stensrud et al. (2002), who report: "we recommended therefore the data set to be partitioned into two or more subsamples and that MMRE is reported per subsample". The reason for dividing the data into seven subsamples is to show the differences in the results analysis clearly and to assess the accuracy of the proposed method. The sampling of the data is performed as follows:

1. Calculating the function point value for each project by using both types of complexity weights (the original and the proposed weights).
2. Calculating the effort value for each project by using the function point value.
3. Comparing the total working hours (WH) of actual effort with the total WH of estimated effort to find the error values of all projects.
4. Sorting the error values in ascending or descending order and assigning the data samples by slicing this range of error into seven slices or samples (a sketch of this step follows the list).
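One possible reading of the slicing step is sketched below in C++: the absolute error values are sorted and then partitioned into seven subsamples of nearly equal size. The function name and the equal-size partitioning are assumptions made for illustration; the thesis slices the sorted error range, and other slicing choices are possible.

#include <algorithm>
#include <vector>

// Sort the absolute error values and partition them into `slices` subsamples.
// The errors are assumed to be computed beforehand as |estimated - actual| effort.
std::vector<std::vector<double>> make_subsamples(std::vector<double> errors,
                                                 int slices = 7) {
    std::sort(errors.begin(), errors.end());
    std::vector<std::vector<double>> sub(slices);
    std::size_t n = errors.size();
    for (std::size_t i = 0; i < n; ++i)
        sub[i * slices / n].push_back(errors[i]);   // nearly equal-count slices
    return sub;
}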


3.11 Data Analysis

Four types of measurement methods are applied, i.e. ratio of average error, error limits (maximum and minimum error), mean magnitude of relative error (MMRE) and correlation coefficient (R²). The purpose of using these methods is to assess the accuracy and to show how much the proposed complexity weights reduce the estimation error. These methods are standard and commonly used in software engineering, especially in software measurement (Hastings and Sajeev, 2001).

3.11.1 Mean Magnitude of Relative Error (MMRE)

The MMRE is probably the most widely used evaluation criterion for assessing the performance of software prediction models. It seems obvious that the purpose of MMRE is to assist us in selecting the best model (Foss et al., 2002). The data sets used in this research are analysed using MMRE, which is defined as follows:

MMRE = (1/N) Σ_{i=1..N} |Ei - Ēi| / Ei

where Ei is the actual effort of project i, Ēi is its estimate, and N is the number of projects. Thus, if the MMRE is small, then we have a good set of predictions (Dolado, 2000). Therefore, the model that obtains the lowest MMRE value is preferred (Stensrud et al., 2002).
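A direct C++ sketch of this definition follows; the vectors of actual and estimated effort values are assumed inputs, and the function name is hypothetical.

#include <cmath>
#include <vector>

// Mean magnitude of relative error: mean of |Ei - estimate_i| / Ei over all projects.
double mmre(const std::vector<double>& actual, const std::vector<double>& estimated) {
    double sum = 0.0;
    for (std::size_t i = 0; i < actual.size(); ++i)
        sum += std::fabs(actual[i] - estimated[i]) / actual[i];
    return sum / actual.size();
}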


3.11.2 Ratio of Average Error

Ratio of average error determines the degree to which two variables differ in average error. Therefore, the model that obtains the lowest average error is preferred. The average error of the effort values of N projects is calculated as follows:

Average error = (Σ_{i=1..N} |error rate of project i|) / N,  where N is the sample size.

The ratio of the average error of one set of N values to the average error of another set of M values is called the ratio of average error of N to M.
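The same style of sketch applies to the average error and its ratio between two methods; the function names are hypothetical and the error vectors are assumed inputs.

#include <cmath>
#include <vector>

// Average of the absolute error values of a set of projects.
double average_error(const std::vector<double>& errors) {
    double sum = 0.0;
    for (double e : errors) sum += std::fabs(e);
    return sum / errors.size();
}

// Ratio of the average error of method A to that of method B
// (e.g. original weights versus proposed weights).
double ratio_of_average_error(const std::vector<double>& a, const std::vector<double>& b) {
    return average_error(a) / average_error(b);
}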

3.11.3 Correlation Coefficient

Correlation coefficient determines the degree to which two variables are related. If two variables are highly correlated and have a cause-effect relationship, then one variable's value may be used to predict the other. In 1995, Humphrey asserted the following correlation relationships for planning purposes, where R² (R-square) is the correlation coefficient (Hastings and Sajeev, 2001):
a) R² ≥ 0.9 is considered a predictive relationship and can be used with high confidence.
b) 0.7 ≤ R² < 0.9 is considered a strong relationship and can be used with confidence.
c) 0.5 ≤ R² < 0.7 is considered an adequate relationship and should be used with caution.
d) R² < 0.5 is not reliable for planning purposes.


3.11.4 Error Limits

The error limits are used to show the error gap, which indicates whether the rate of error in estimation decreases or increases when the two methods are applied. So, if the error gap is reduced with the proposed method, this means that it is better than the original method. The error rate is calculated using Eq. 3.23.

Error rate = absolute(Estimated Values - Actual Values)    (3.23)

There are three cases:
- If Estimated Values < Actual Values, then the error is negative.
- If Estimated Values > Actual Values, then the error is positive.
- Otherwise, Estimated Values = Actual Values and the error is zero.
Negative error values just indicate that the estimated values lie under the actual values curve, so the sign is not taken into account. We ignore the negative sign by taking the absolute value of the error rate, because our concern is the width of the error between the actual and estimated values. Figure 3.6 presents an example of the error curves between the actual and estimated values using the original and proposed complexity weights, indicated by d1, d2 and c1, c2 respectively.

[Figure 3.6 (chart): actual effort plotted against the estimates of the original and proposed weights for four projects, with the error gaps d1, d2 and c1, c2 marked.]

Figure 3.6: Example of Error Gaps Between the Actual and Estimated Values


A small error value indicates that the degree of convergence is high and the accuracy is better. The upper and lower limits are two values selected from the measurement data by using Eq. 3.24 and 3.25.

Upper limit = the largest error value in the measurement data    (3.24)

Lower limit = the smallest error value in the measurement data    (3.25)

3.12 Calculating the Function Points Using the New and the Original Weights

The general process of function point analysis using the original complexity weights, as currently used in IFPUG Version 4.1 based on the ordinal scale and the rule tables, is presented in Figure 3.7.

[Figure 3.7 (diagram): the function point components, weighted with the six tables of original weights on the ordinal scale, give the unadjusted function point count, which is combined with the adjustment value to obtain the software functionality.]

Figure 3.7: The Function Point Analysis Using the Original Weights

According to the generated weights tables, the function point can be calculated using those tables. Figure 3.8 shows the essential steps of calculating the function point using the proposed complexity weights, which are organised as five tables instead of six.



Figure 3.8: Function Point Analysis Using the Proposed Weights

3.13 Results Using the Effort and Cost Models

Results of the comparisons in this research are obtained using effort and cost models. The calculations are performed according to the function point value produced by the original and the proposed weights. The function point value is employed as an essential parameter in the effort and cost estimation models presented in Eq. 3.26 and 3.27. Figure 3.9 shows the comparison using the two types of function point weights.

Total_Effort (working hours) = a × (Software functionality size)^b    (3.26)

If the software functionality size is measured by the function point count, then:

Total_Effort (working hours) = a × (Function_Points)^b
Total_Cost = Total_Effort (working hours) × cost per hour    (3.27)

where a and b are two constants, a = 29.258 and b = 0.78. These two parameters are assigned by the ISBSG and applied to its repository data sets.
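As a worked illustration of Eq. 3.26 and 3.27 with the ISBSG constants, the following C++ sketch computes the effort and cost for an assumed function point count and an assumed cost per hour; both input values are examples only, not data from this research.

#include <cmath>
#include <cstdio>

int main() {
    const double a = 29.258, b = 0.78;                       // ISBSG constants of Eq. 3.26
    double functionPoints = 400.0;                           // assumed example size
    double costPerHour    = 50.0;                            // assumed example rate
    double effortHours = a * std::pow(functionPoints, b);    // Eq. 3.26
    double totalCost   = effortHours * costPerHour;          // Eq. 3.27
    std::printf("effort = %.1f working hours, cost = %.1f\n", effortHours, totalCost);
    return 0;
}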

Shepperd and Schofield (1997) report that accurate estimation of software project effort at an early stage in the development process is a significant challenge for the software engineering community. The validity of the designed method is measured according to the development effort value of each project, because the standard IFPUG function point version developed in this work is not comparable with other versions.

[Figure 3.9 (flow diagram): the function point components are weighted with the original weights (ordinal scale) and the proposed weights (absolute scale), producing two functionality sizes and two software cost estimates whose results are compared.]

Figure 3.9: Comparison of Results Using Cost Estimation Model

3.14 Summary

This chapter has described the proposed methodology, which includes many different operations. The resources employed for weight prediction and results calculation have been explained. The data sets used in this research represent real projects of small, medium, and large business applications, ranging in size from 8 to 8485 function points. All these projects are used to prove the validity of the proposed complexity weights when they are applied in the function point count. This count is used as the main parameter to calculate the total development effort of software. Four measurement methods are selected to evaluate the results, i.e. ratio of average error, error limits, mean magnitude of relative error and correlation coefficient. The purpose of using these methods is to assess the accuracy and to show how much the proposed complexity weights reduce the error rate of software estimation.


CHAPTER 4

DEVELOPMENT OF FUNCTION POINT COMPLEXITY WEIGHTS

4.1 Introduction

In this chapter, a new methodology for generating complete complexity weights is proposed. The chapter explains the steps for generating the proposed complexity weights, which will cover all the function point components in five tables; each table will contain the new weight values of one component. The proposed methodology is based on mathematical rules, mathematical operations, and an improved ANN method. The construction of this methodology depends on the nature of the weights problem; therefore, creating the solution mechanism necessitates strong prediction tools. Accordingly, the ANN is a suitable prediction tool and is selected for this purpose. It is used as an important part of the methodology, and the original weights are used as baselines for creating the proposed complexity weights.

4.2 Detailed Steps of Establishing Training Database

The first step in improving the function point weights is using the original Albrecht weights (rule tables) as the starting point of the proposed methodology. The methodology consists of many steps, which include mathematical calculations and two special algorithms to create five DB for network training. The improved BP algorithm is applied in the proposed network.


4.2.1 Using the Albrecht Weights Tables as Baselines

A few samples of training patterns can be calculated directly from the original weights tables, which are used as baselines for generating the full training DB used to train the BP network. The calculated weights are called main samples or main patterns. These samples will be used for the ANN training (BP algorithm). The contents of the five weights tables in the IFPUG manual are the basis of this process. These tables are the EO table, the EQ table, the EI table, the ILF table, and the EIF table, as shown in Tables 4.1 to 4.5. All these tables are used as bases for the next steps.

Table 4.1: Rule Table of Internal Logical File (ILF)

                      DETs
RET             1 - 19          20 - 50         > 50
1               Low (7)         Low (7)         Average (10)
2 - 5           Low (7)         Average (10)    High (15)
> 5             Average (10)    High (15)       High (15)

Table 4.2: Rule Table of External Output (EO)

                      DETs
RET             1 - 5           6 - 19          > 19
1               Low (4)         Low (4)         Average (5)
2 - 3           Low (4)         Average (5)     High (7)
> 3             Average (5)     High (7)        High (7)

Table 4.3: Rule Table of External Inquiry (EQ)

                      DETs
RET             1 - 5           6 - 19          > 19
1               Low (3)         Low (3)         Average (4)
2 - 3           Low (3)         Average (4)     High (6)
> 3             Average (4)     High (6)        High (6)

Table 4.4: Rule Table of External Input (EI)

                      DETs
RET             1 - 4           5 - 15          > 15
1               Low (3)         Low (3)         Average (4)
2               Low (3)         Average (4)     High (6)
> 2             Average (4)     High (6)        High (6)

Table 4.5: Rule Table of External Interface File (EIF)

                      DETs
RET             1 - 19          20 - 50         > 50
1               Low (5)         Low (5)         Average (7)
2 - 5           Low (5)         Average (7)     High (10)
> 5             Average (7)     High (10)       High (10)

It is important to note that the methodology steps are applied only on the ILF table; the other components undergo the same processes. Albrecht tabulates the weights of ILF on an ordinal scale, simple, average, and complex (7, 10, 15), for the DET groups 1 - 19, 20 - 50, > 50 and the RET groups 1, 2 - 5, > 5 respectively, as shown in Table 4.1. Each table contains two open intervals: the first is associated with the DET and is placed at the end of the first row (> 50), and the second is associated with the RET and is placed at the end of the first column (> 5). These two intervals should be closed based on some mathematical rules and a logical hypothesis.

4.2.2 Closing the Open Intervals of ILF Table

The final intervals of DET and RET in each rule table are not bounded, so these intervals should be closed before starting the solution. The open intervals of the ILF rule table are of two types: an open interval of DET and an open interval of RET. Each interval will be closed (determined by two limits, a lower limit and an upper limit). The lower limits of these intervals are already determined, but the upper limits are not, as presented in Table 4.1.

4.2.3 Closing the Open Intervals of DET

The open interval of DET in the ILF table is > 50 DET; after closing this interval, it becomes 51 - 93 DET. This operation is done according to a logical hypothesis that depends on the increment in the lengths of the first and second intervals (1 - 19, 20 - 50), as follows. Suppose that:
- L1 is the length of the first interval (x1), as shown in Table 4.6.
- L2 is the length of the second interval (x2).
- L3 is the length of the final interval (x3).

L1 = End of (x1) - First of (x1)    (4.1)
L1 = 19 - 1 = 18

L2 = End of (x2) - First of (x2)    (4.2)
L2 = 50 - 20 = 30

We have to find a relation between the length of the first interval (L1) and the second interval (L2). This relation can be established from the results of Eq. 4.1 and 4.2 by using Eq. 4.3.

L2 = L1 + C    (4.3)


Table 4.6: Closed Intervals of Internal Logical File Rule Table

                      DETs
RET               X1: 1 - 19       X2: 20 - 50      X3: 51 - 93
Y1: 1 RET         Low (7)          Low (7)          Average (10)
Y2: 2 - 5 RET     Low (7)          Average (10)     High (15)
Y3: 6 - 10 RET    Average (10)     High (15)        High (15)

where C is a constant value. By substituting the values of L1 and L2 in Eq. 4.3, it is easy to find the value of C:
C = L2 - L1 = 30 - 18 = 12
It can be concluded that the increment in the interval lengths is 12. Consequently, the length of interval x2 = the length of interval x1 + 12, and the length of interval x3 = the length of interval x2 + 12. Thus,
L3 = L2 + C = 30 + 12 = 42
Hence, the first of x3 = 51 and the end of x3 = 51 + its length:
End of x3 = 51 + 42 = 93,
as presented in Table 4.6.
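The same derivation can be expressed as a small C++ sketch that closes the third interval from the first two. It follows the DET case above (the single-value RET row of Section 4.2.4 uses Ĺ1 = 1 instead), and the function name is hypothetical.

#include <cstdio>

// Close the open third interval: its length grows by the same constant C
// by which the second interval exceeded the first (Eq. 4.1-4.3).
int close_third_interval(int x1_first, int x1_last, int x2_first, int x2_last) {
    int L1 = x1_last - x1_first;     // Eq. 4.1
    int L2 = x2_last - x2_first;     // Eq. 4.2
    int C  = L2 - L1;                // increment constant of Eq. 4.3
    int L3 = L2 + C;                 // length of the final interval
    return (x2_last + 1) + L3;       // upper limit of the closed third interval
}

int main() {
    // ILF DET intervals 1-19 and 20-50 give the closed third interval 51-93.
    std::printf("ILF DET upper limit: %d\n", close_third_interval(1, 19, 20, 50));  // 93
    return 0;
}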

4.2.4 Closing the Open Intervals of RET

This interval is > 5 RET; after closing, it becomes 6 - 10 RET. This operation is done according to the lengths of the first and second intervals of the first column (1, 2 - 5) of the ILF table.


Suppose that:
- Ĺ1 is the length of the first interval (y1).
- Ĺ2 is the length of the second interval (y2).
- Ĺ3 is the length of the final interval (y3).

Ĺ1 = 1, since this interval has no limits and is a single value. According to Eq. 4.1, the length of the second interval (y2) can be calculated as follows:
Ĺ2 = 5 - 2 = 3
The relation between the lengths of the first and second intervals is calculated in accordance with Eq. 4.4.

Ĺ2 = Ĺ1 + Ć    (4.4)

where Ć is the increment constant in the lengths of intervals y2 and y3. By substituting the values of Ĺ1 and Ĺ2 in Eq. 4.4, it follows that
Ć = Ĺ2 - Ĺ1 = 3 - 1 = 2
Consequently, the length of interval y3 = the length of interval y2 + 2. Therefore,
Ĺ3 = Ĺ2 + Ć = 3 + 2 = 5
Hence, the first of interval y3 = 6, and the end of y3 = 6 + its length:
End of y3 = 6 + 5 = 11
The closed intervals of ILF are presented above in Table 4.6. The closed intervals of the other tables, after applying the same procedure, are shown in Tables 4.7 to 4.10.


Table 4.7: Closed Intervals of External Input Rule Table

                      DETs
RET               X1: 1 - 4        X2: 5 - 15       X3: 16 - 33
Y1: 1 RET         Low (3)          Low (3)          Average (4)
Y2: 2 RET         Low (3)          Average (4)      High (6)
Y3: 3 - 6 RET     Average (4)      High (6)         High (6)

Table 4.8: Closed Intervals of External Output Rule Table

                      DETs
RET               X1: 1 - 5        X2: 6 - 19       X3: 20 - 42
Y1: 1 RET         Low (4)          Low (4)          Average (5)
Y2: 2 - 3 RET     Low (4)          Average (5)      High (7)
Y3: 4 - 5 RET     Average (5)      High (7)         High (7)

Table 4.9: Closed Intervals of External Inquiry Rule Table

                      DETs
RET               X1: 1 - 5        X2: 6 - 19       X3: 20 - 42
Y1: 1 RET         Low (3)          Low (3)          Average (4)
Y2: 2 - 3 RET     Low (3)          Average (4)      High (6)
Y3: 4 - 5 RET     Average (4)      High (6)         High (6)

Table 4.10: Closed Intervals of External Interface File Rule Table

                      DETs
RET               X1: 1 - 19       X2: 20 - 50      X3: 51 - 93
Y1: 1 RET         Low (5)          Low (5)          Average (7)
Y2: 2 - 5 RET     Low (5)          Average (7)      High (10)
Y3: 6 - 10 RET    Average (7)      High (10)        High (10)


4.2.5 Applying the Mid Point Rule for Calculating Weights Samples

The objective of this section is to generate the main weights samples. After closing the open intervals of the ILF table, all the intervals of the first row and first column can be used for applying the Mid Point (MP) rule to find the interval centre (IC) of the DET and RET in each class of weights. The general steps of generating the main training patterns are presented in Figure 4.1.

Main_Training_Patterns_Algorithm (input, output)
Input: function point rule table Rk
Output: five main training patterns
Const n = number of intervals := 3;
Integer I, j;
For each rule table Rk, k = 1, 2, 3, 4, 5 Do
Begin
  Mark all values Mi that are placed on the main diameter of table Rk.
  Mark all values that are placed on the secondary diameter of table Rk.
  - Find the first and last DET and RET/FTR in each complexity level.
  - Find Pi, the interval centre of DET in interval (Xi).
  - Find Qi, the interval centre of the RET row (Yi).
  Construct the corresponding pairs of interval centres with their marked weights.
End

For the complex class of the ILF table, the closed DET interval is 51 - 93 and the closed RET interval is 6 - 10, so:
IC of DET (complex) = (51 + 93)/2 = 72.
IC of RET (complex) = (6 + 10)/2 = 8.
Hence 8 RET with 72 DET take 15 weights: 8 RET ® 72 DET = 15 weights.
Also, the IC of RET in the first row is (1 + 1)/2 = 1, and 1 RET with 72 DET take only 10 weights: 1 RET ® 72 DET = 10 weights.
The five generated samples are illustrated in Figure 4.3.c. The contents of this figure are used to generate more patterns for training the proposed network, because these generated patterns alone are not enough.
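Assuming that the five main patterns are the cells on the main and secondary diagonals of the rule table, as the marking step of the algorithm suggests, the Mid Point rule on the closed ILF intervals can be sketched in C++ as follows; this is an illustration only, and the printed pairs include the two complex-class examples worked out above.

#include <cstdio>

int main() {
    // Closed ILF intervals of Table 4.6 and their weights (rows: RET, cols: DET).
    int detIntervals[3][2] = { {1, 19}, {20, 50}, {51, 93} };
    int retIntervals[3][2] = { {1, 1},  {2, 5},   {6, 10} };
    int weights[3][3]      = { {7, 7, 10}, {7, 10, 15}, {10, 15, 15} };

    for (int r = 0; r < 3; ++r) {
        for (int d = 0; d < 3; ++d) {
            if (r != d && r + d != 2) continue;   // keep the main and secondary diagonals only
            double icRet = (retIntervals[r][0] + retIntervals[r][1]) / 2.0;  // Mid Point rule
            double icDet = (detIntervals[d][0] + detIntervals[d][1]) / 2.0;
            std::printf("%.1f RET with %.1f DET -> %d weights\n",
                        icRet, icDet, weights[r][d]);
        }
    }
    return 0;
}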


4.2.6 Increasing the Weights Samples

The five generated weights samples (patterns) are not enough for training and testing the network. For this reason, the algorithms in Figures 4.4 and 4.5 are established for the purpose of generating more training patterns from the main generated patterns.

Build_pattern_Table (integer FP_Component_k, N; Table_patterns T)
Integer i, L, j;
begin
j