
Multi-criteria decisional approach for extracting relevant association rules

Addi Ait-Mlouk*
Department of Computer Science, Laboratory of Engineering and Information Systems, Faculty of Science Semlalia, Cadi Ayyad University, Marrakech, Morocco.
Email: [email protected]
*Corresponding author

Fatima Gharnati
Department of Physics, Team of Telecommunications and Computer Networks, Faculty of Science Semlalia, Cadi Ayyad University, Marrakech, Morocco.
Email: [email protected]

Tarik Agouti
Department of Computer Science, Laboratory of Engineering and Information Systems, Faculty of Science Semlalia, Cadi Ayyad University, Marrakech, Morocco.
Email: [email protected]

Abstract: Association rule mining plays a vital role in knowledge discovery in databases. The difficult task is mining useful and non-redundant rules; in most cases, real datasets lead to a huge number of rules, which does not allow users to make their own selection of the most relevant ones. Several techniques have been proposed, such as rule clustering, informative cover methods and quality measures. As another way of selecting relevant association rules, we believe it is necessary to integrate a decisional approach within the knowledge discovery process. To solve this problem, we propose an approach for discovering a category of relevant association rules based on multi-criteria analysis (MCA), using association rules as actions and quality measures as criteria. Finally, we conclude our work with an empirical study illustrating the performance of the proposed approach.

Keywords: data mining; knowledge discovery in database; association rules; quality measurements; multi-criteria analysis; decision-making system; ELECTRE TRI.

Reference to this paper should be made as follows: Ait-Mlouk, A., Gharnati, F. and Agouti, T. (2016) 'Multi-criteria decisional approach for extracting relevant association rules', Int. J. of Computational Science and Engineering, Vol. X, No. Y, pp.000–000.

Biographical notes: Addi Ait-Mlouk is a PhD candidate in the Faculty of Science Semlalia at Cadi Ayyad University, Morocco. He received his Master's degree in Computer Science from Cadi Ayyad University. He is actively engaged in research on various aspects of information technologies, ranging from data mining algorithms to big data generation, fuzzy logic, transportation, multi-criteria analysis and machine learning. Fatima Gharnati is an Associate Professor in the Department of Physics at Cadi Ayyad University, Morocco. Her research interests lie in physics, networks and telecommunications, network security, fuzzy logic and embedded systems. Tarik Agouti is an Associate Professor in the Department of Computer Science at Cadi Ayyad University, Morocco. His research interests lie in mathematical economics, supply chain management, information systems, decision systems, data mining, GIS, spatial databases, fuzzy logic, multi-criteria analysis and distributed systems.

Copyright © 2016 Inderscience Enterprises Ltd.

1 Introduction

According to Frawley et al. (Frawley et al., 1992), data mining refers to the process of non-trivial extraction of implicit, previously unknown and potentially useful knowledge from data. The basic idea of data mining is to extract hidden knowledge from a large amount of available data, which can take the form of association rules, concepts, models, etc. Recently, data mining techniques have become a vital part of many business analytics and predictive applications, complementing systems that provide prediction techniques and the necessary analysis services. Association rules are a popular data mining technique for discovering processable knowledge, based on statistical analysis and artificial intelligence. It considers conditional interactions among the input dataset and produces association rules of the IF-THEN type. An example of an association rule extracted from a supermarket sales database is:

Cereals ∧ Sugar → Milk (Support = 7%, Confidence = 50%)

This rule means that customers who buy cereals and sugar also tend to buy milk. The support of the rule is the proportion of customers who bought the three articles, and the confidence measure defines the precision of the rule: it is the proportion of customers who bought milk among those who bought cereals and sugar. The extraction of association rules consists in extracting the rules whose support and confidence are at least equal to the minimum support and confidence thresholds defined by the user. Association rules have been used successfully in many areas, including assistance with business planning, diagnostic support, medical research, improvement of telecommunications processes, organization of and access to websites, and the analysis of images, spatial data, geographic data and statistics. However, the association rule model has a major drawback: it produces a large number of rules. The final stage of rule validation leaves the user facing a main difficulty: the extraction of the most interesting rules from the large amount of discovered rules. Therefore, it is necessary to help the user in the validation task by implementing a preliminary post-processing stage on the discovered rules. The post-processing task aims at reducing the number of rules to those potentially interesting for the user. This task must take into account both the preferences of the decision makers and the quality measures; this is the case for our proposed multi-criteria analysis (MCA) approach.


Much research has been conducted on knowledge discovery in databases (Chun-Hsien et al., 2014; Faisal et al., 2015; Ms.Shweta et al., 2013; Yasuhiko Morimoto, 2010; Dai Caiyan et al., 2016; Abhishek Sainani et al., 2012). However, only some works (Lallich et al., 2013; Benites et al., 2014; Al-Dharhani et al., 2014) cover the usefulness and relevance of the extracted association rules. In this context, we are interested in the quality of association rules, selecting the most relevant ones from the vast set extracted by integrating our proposed approach. The remainder of this paper is organized as follows: Section 2 introduces the KDD process; Section 3 presents related work; the contribution of the multi-criteria analysis approach is proposed in Section 4; Section 5 discusses our approach; Section 6 presents an empirical study illustrating the performance of the proposed approach; finally, the paper ends with a concluding section.

2 Knowledge discovery in database

2.1. The KDD process

Data mining can be defined as the non-trivial process of identifying valid, novel, potentially useful and understandable patterns in databases. It corresponds to the application of algorithms and analysis techniques issued from artificial intelligence and statistics for extracting patterns from data (Figure 1); these extraction techniques are intended to present the results as valid elements for decision making (Gouda et al., 2001).

Figure 1 Knowledge discovery process (Fayyad, Piatetsky-Shapiro, and Smyth 1996)

The process of KDD consists of five major steps, namely: definition of the extraction goal, data selection, transformation, application of data mining techniques and, finally, interpretation of the results.

• Objective of extraction: the first step of the data mining process is to understand the objective of the extraction.
• Data selection: selection of significant data samples, to minimize the amount of available data and facilitate the study of the main objectives.
• Data processing: use of the ETL (Extract-Transform-Load) process, which performs the necessary transformations between the different data sources.
• Application of data mining techniques: there are two types of models, classification models for organizing classes in the data, and regression models for determining the dependencies between variables.
• Interpretation of results: finally, the resulting information must be analyzed according to the objectives specified in the first step of the process.

2.2. The extraction of association rules

Association rule mining is a powerful technique for discovering interesting relations between variables in large databases; it was initiated by Agrawal (Agrawal R et al., 1996) to analyze transactional databases.

a. Definition 1: Association rule

An association rule (Frawley et al., 1992) is a couple (A, B), where A and B are non-empty disjoint itemsets, i.e. A ≠ ∅, B ≠ ∅ and A ∩ B = ∅. We conventionally note this rule as A → B.

b. Definition 2: The support of a rule

We define the support of an association rule as the support of the itemset A ∪ B (the percentage of transactions that contain both A and B):

\mathrm{Support}(A \to B) = \mathrm{Support}(A \cup B) = \frac{|t(A \cup B)|}{|T|} \qquad (1)

where t(X) denotes the set of transactions containing the itemset X and T the set of all transactions.

c. Definition 3: The confidence of a rule

The confidence determines how frequently the items of B appear in transactions that contain A; the formal definition is:

\mathrm{Confidence}(A \to B) = \frac{\mathrm{Support}(A \cup B)}{\mathrm{Support}(A)} \qquad (2)

d. Definition 4: The lift of a rule

We also define the lift of a rule A → B, noted Lift(A → B), as the confidence divided by the unconditional probability of the consequent, or equivalently the support of the rule divided by the product of the supports of the antecedent and the consequent:

\mathrm{Lift}(A \to B) = \frac{\mathrm{Support}(A \cup B)}{\mathrm{Support}(A) \times \mathrm{Support}(B)} \qquad (3)
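To make definitions (1)–(3) concrete, the following minimal Python sketch (ours, not part of the original paper; the transactions and item names are purely illustrative) computes the three measures over a toy transaction list:

```python
# Illustrative sketch: support, confidence and lift of a rule A -> B
# computed over a small list of transactions (sets of items).

transactions = [
    {"cereals", "sugar", "milk"},
    {"cereals", "sugar"},
    {"milk", "bread"},
    {"cereals", "sugar", "milk", "bread"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item of `itemset` (Eq. 1)."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support(A u B) / Support(A) (Eq. 2)."""
    return (support(set(antecedent) | set(consequent), transactions)
            / support(antecedent, transactions))

def lift(antecedent, consequent, transactions):
    """Support(A u B) / (Support(A) * Support(B)) (Eq. 3)."""
    return (support(set(antecedent) | set(consequent), transactions)
            / (support(antecedent, transactions) * support(consequent, transactions)))

print(support({"cereals", "sugar", "milk"}, transactions))       # 0.5
print(confidence({"cereals", "sugar"}, {"milk"}, transactions))  # 2/3
print(lift({"cereals", "sugar"}, {"milk"}, transactions))        # 0.5 / (0.75 * 0.75) = 8/9
```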

3 Related work

Many algorithms have been proposed to extract association rules and to solve the problems of response time and complexity; however, these algorithms are still limited and costly in terms of the generally large number of extracted association rules.

3.1. Extraction algorithms of association rules

The extraction algorithms of association rules can be classified into three large categories: frequent itemset algorithms such as APRIORI (Gouda et al., 2001), maximal itemset algorithms such as MAXMINER (Agrawal R et al., 1994) and closed itemset algorithms such as CLOSE (Han et al., 2000).

a. Algorithms for mining frequent itemsets

In this category, we find the basic algorithms that solved the problem of mining frequent itemsets. Among them we can mention Apriori, in which two parameters, minimum support and minimum confidence, are introduced. The Apriori algorithm is the key algorithm proposed by Agrawal et al.; it constitutes the basis of the majority of algorithms designed to extract association rules. In general, Apriori finds the attributes (items) having a support ≥ minsup in the first pass, and in the following passes (level n) it uses the frequent itemsets found at the previous level (n−1) to generate the association rules. In the literature, Han et al. (Han et al., 2000) proposed the AprioriTID algorithm to improve the performance of Apriori, which generally requires a significant number of passes over the database. The sets of candidate elements are generated in the same way as in Apriori; the two algorithms differ in the computation of the support of the candidate itemsets. In fact, AprioriTID uses a set C_k of pairs (TID, C_k), where C_k is the list of candidate itemsets contained in the transaction. Following the same logic, and to reduce the number of passes over the database, other algorithms have been proposed, such as FPGrowth (Han et al., 2000), Partition (Savasere et al., 1995), DIC (Brin et al., 1997) and Eclat (Tekin, 2014; Zaki, 1997).

b. Algorithms for mining closed itemsets

An itemset I is closed if it is frequent and there exists no frequent itemset I' such that I ⊂ I' and support(I') = support(I). The extraction of closed itemsets is a method connected with formal concept analysis, based on the closure of the Galois connection (Pasquier et al., 1999). These itemsets are frequent; the frequent closed itemsets, according to the closure operator, form a minimal non-redundant generating set for all the frequent itemsets and their supports.

c. Algorithms for mining maximal itemsets

The maximal itemsets are frequent itemsets all of whose supersets are infrequent. The problem of extracting frequent itemsets is then decomposed as follows:

• Extraction of all maximal frequent itemsets.
• Computation of the supports of the subsets of the maximal itemsets by running a single pass over the database.

The algorithms dedicated to the extraction of maximal frequent itemsets simultaneously use a bottom-up and a top-down traversal. Several algorithms have been proposed; among them are MaxMiner (Bayardo, R., 1998), Pincer Search, MaxEclat, MaxClique and GenMax.
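The level-wise principle behind Apriori described above can be summarized in a short, self-contained Python sketch (ours, for illustration only; it is not the implementation of any of the cited algorithms):

```python
# Illustrative level-wise (Apriori-style) frequent itemset mining: candidates of
# size k+1 are built from frequent itemsets of size k, then pruned by minsup.
from itertools import combinations

def frequent_itemsets(transactions, minsup):
    n = len(transactions)
    def sup(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items if sup(frozenset([i])) >= minsup}
    frequent = {}
    k = 1
    while level:
        frequent.update({s: sup(s) for s in level})
        # Generate candidates of size k+1 by joining frequent k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Keep candidates whose every k-subset is frequent and whose support >= minsup.
        level = {c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k))
                 and sup(c) >= minsup}
        k += 1
    return frequent

transactions = [frozenset(t) for t in
                [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]]
print(frequent_itemsets(transactions, minsup=0.4))
```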

3.2. Quality measurement of association rules

To evaluate the performance of the rules issued from extraction algorithms, the notion of interestingness is introduced. In the literature, many surveys deal with interestingness measures according to two different aspects: the definition of a set of principles to select a suitable interestingness measure, and their comparison with theoretical criteria or experiments on data. With the aim of establishing the principles of a good interestingness measure, Bayardo and Agrawal (Bayardo, R. and Agrawal, R., 1999) concluded that the best rules according to all the interestingness measures must reside along a support/confidence border. Piatetsky-Shapiro (Piatetsky-Shapiro et al., 1991) presented a new interestingness measure, called Rule-Interest (RI), and proposed three fundamental principles for a measure on a rule X → Y: RI = 0 when X and Y are independent, RI monotonically increases with the support of X ∪ Y, and RI monotonically decreases with the support of X or of Y. Hilderman and Hamilton (Hilderman et al., 2001) proposed five principles for ranking summaries generated from databases using sixteen diversity measures and showed that: (1) six measures satisfy the five proposed principles; (2) the nine remaining measures satisfy at least one principle. Carvalho et al. (Carvalho et al., 2005) evaluated eleven objective interestingness measures in order to rank them according to their effective interest for a decision maker. Huynh et al. (Huynh et al., 2005) proposed a clustering approach based on a correlation graph, which identifies eleven clusters among thirty-four interestingness measures. Gavrilov et al. (Gavrilov et al., 2005) studied the similarity between the measures in order to classify them. Xuan-Hiep and Fabrice (Xuan-Hiep et al., 2006) proposed a data-analysis technique for finding the objective interestingness measures best suited to a rule set or a set of rule sets.

The association rule model has the disadvantage of producing a large number of rules. The final stage of rule validation leaves the user facing a main difficulty: how to extract the most interesting rules among the large number of discovered rules. Therefore, it is necessary to help the user in the validation by preselecting a reduced number of rules potentially interesting for the user. This task must take into account the decision maker's preferences and the data structure. Many solutions based on quality measures have been proposed to solve this problem, but this large quantity of interestingness measures leads to a second problem: how to help the user choose the interestingness measures that are best adapted to his goals and his data.

In this context of a multidisciplinary approach, we propose an integrated multi-criteria decision analysis method applied to a set of extracted association rules.

4 Research methodologies

In this section, we discuss the various steps of our proposed approach, starting with the contribution of multi-criteria analysis.

4.1. Contribution and the choice of the MCDA method

MCDA methodologies are developed to deal with complex situations that involve multiple, usually conflicting decision criteria, which include qualitative and/or quantitative aspects of a decision-making process. Many multi-criteria decision analysis methods have been proposed to enable decision makers to make the right choice. These methods can be grouped into two approaches (Boutkhoum et al., 2015): the unique synthesis criterion approach, such as TOPSIS, SMART, WEIGHTED SUM, MAUT, MAVT, UTA, AHP and ANP, and the outranking methods, such as PROMETHEE, ELECTRE and ORESTE. Multi-criteria analysis provides the ability to select the most relevant association rules according to the proposed criteria. In the literature (Dias et al., 2006; Aitmlouk A. et al., 2015), we encounter three types of multi-criteria problems, namely selection, sorting and ranking, noted respectively P.α, P.β and P.γ. Given the large number of association rules generally extracted, we have to choose a method that can cover an important number of alternatives, so we are located in the sorting problematic P.β, which corresponds to the choice of the appropriate association rules; the appropriate method is therefore ELECTRE TRI.

4.2. The ELECTRE TRI method

The ELECTRE TRI method has been applied in several situations thanks to its ability to simplify and solve complex decision problems of the sorting type. Its purpose is to solve sorting problems. The principle of this method is to assign a set of m alternatives, noted A = {a1, a2, a3, …, am}, on which the decision bears, to predefined categories. We note F = {1, 2, …, n} the set of criteria indices. Each alternative of the set A is evaluated by a real function expressing the evaluation of the action for a given criterion; we note G = {g1, g2, g3, …, gn} the evaluations of the actions for the criteria considered (Roy B, 2002). The alternatives that are the subject of the decision are not compared with each other, but with thresholds reflecting the boundaries between h predefined classes, noted C = {C1, C2, C3, …, Ch}. Each alternative is compared to the borders of each category, forming a set of profiles B = {b1, b2, b3, …, bh}. Figure 2 illustrates the sorting problem.

Figure 2 Illustration of the sorting problem

The assignment of alternatives to categories is based on the concept of outranking. An action a of the set A outranks b_h, noted a S b_h, if a is at least as good as b_h on all criteria. ELECTRE TRI proceeds in two consecutive steps (Mousseau et al., 2001):

Step 1: construction of the outranking relation S by comparing each alternative a to each profile b_h (a S b_h or b_h S a).

Computation of the partial concordance index c_j(a, b_h) for criterion g_j: it expresses to which extent "a outranks b_h", i.e. a is at least as good as b_h; c_j(b_h, a) is defined similarly. Partial concordance index values vary between 0 and 1; the indifference threshold q_j and the preference threshold p_j of each criterion g_j play a major role in the computation of the partial concordance index.

c_j(a, b_h) =
\begin{cases}
0 & \text{if } g_j(b_h) - g_j(a) \geq p_j(b_h) \\
1 & \text{if } g_j(b_h) - g_j(a) \leq q_j(b_h) \\
\dfrac{p_j(b_h) + g_j(a) - g_j(b_h)}{p_j(b_h) - q_j(b_h)} & \text{otherwise}
\end{cases} \qquad (4)

Computation of the global concordance indices C(a, b_h) and C(b_h, a), respectively, considering all criteria; they vary between 0 and 1:

C(a, b_h) = \frac{\sum_{j \in F} k_j \, c_j(a, b_h)}{\sum_{j \in F} k_j} \qquad (5)

with k_j the weight of criterion j and c_j(a, b_h) the partial concordance index of criterion j.

Computation of the partial discordance index d_j(a, b_h) for criterion g_j: it expresses to which extent criterion g_j is opposed to the statement "a outranks b_h", i.e. a is at least as good as b_h; d_j(b_h, a) is defined similarly. Partial discordance index values vary between 0 and 1; the preference threshold p_j and the veto threshold v_j of each criterion g_j play a major role in the computation of the partial discordance index.

d_j(a, b_h) =
\begin{cases}
0 & \text{if } g_j(a) \geq g_j(b_h) - p_j(b_h) \\
1 & \text{if } g_j(a) \leq g_j(b_h) - v_j(b_h) \\
\in [0, 1] & \text{otherwise}
\end{cases} \qquad (6)

The computation of the credibility index is based on the global concordance index and the partial discordance indices. It presents the degree of credibility of the outranking relation a S b_h; the credibility of b_h S a is computed similarly.

\sigma(a, b_h) = C(a, b_h) \prod_{j \in \bar{F}} \frac{1 - d_j(a, b_h)}{1 - C(a, b_h)}, \quad \bar{F} = \{ j \in F : d_j(a, b_h) > C(a, b_h) \} \qquad (7)

Figure 3 The outranking relation

Step 2: exploitation of the outranking relation to assign the alternatives to categories. Two assignment procedures, namely pessimistic and optimistic, are available to assign the alternatives to the different categories:

• Pessimistic assignment: compare the alternative a successively to b_i for i = h, h−1, …, 0; b_h being the first profile such that a S b_h is satisfied, assign a to category C_{h+1} (a ∈ C_{h+1}).

• Optimistic assignment: compare the alternative a successively to b_i for i = 1, 2, …, h; b_h being the first profile such that b_h P a or b_h Q a is satisfied (P and Q representing strong and weak preference), assign a to category C_h (a ∈ C_h).
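The two steps above can be summarized in a short Python sketch (our own illustrative code and naming, following Eqs. (4)–(7); the linear interpolation used for the discordance index between the preference and veto thresholds is a common choice, stated here as an assumption):

```python
# Illustrative ELECTRE TRI sketch (our own naming, following Eqs. (4)-(7)):
# partial concordance/discordance, credibility, and pessimistic assignment.

def partial_concordance(ga, gb, q, p):
    """c_j(a, b_h), Eq. (4): 1 if a is at least as good as b_h up to q, 0 beyond p."""
    diff = gb - ga
    if diff >= p:
        return 0.0
    if diff <= q:
        return 1.0
    return (p - diff) / (p - q)

def partial_discordance(ga, gb, p, v):
    """d_j(a, b_h), Eq. (6): 0 below the preference threshold, 1 beyond the veto."""
    diff = gb - ga
    if diff <= p:
        return 0.0
    if diff >= v:
        return 1.0
    return (diff - p) / (v - p)   # linear interpolation between p and v

def credibility(a, b, weights, q, p, v):
    """sigma(a, b_h), Eq. (7), from global concordance (5) and discordances (6)."""
    c = [partial_concordance(a[j], b[j], q[j], p[j]) for j in range(len(a))]
    d = [partial_discordance(a[j], b[j], p[j], v[j]) for j in range(len(a))]
    C = sum(w * cj for w, cj in zip(weights, c)) / sum(weights)
    sigma = C
    for dj in d:
        if dj > C:                        # only criteria discording more than C count
            sigma *= (1.0 - dj) / (1.0 - C)
    return sigma

def pessimistic_assignment(a, profiles, weights, q, p, v, lam):
    """Compare a to b_h, b_{h-1}, ...; assign to C_{h+1} at the first b_h with a S b_h."""
    for h in range(len(profiles) - 1, -1, -1):
        if credibility(a, profiles[h], weights, q[h], p[h], v[h]) >= lam:
            return h + 1                  # index of category C_{h+1}
    return 0                              # lowest category
```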

5 The proposed approach

To obtain an adequate model for discovering the interesting association rules, it is important to integrate a complex decision-making method. The integration of the multi-criteria analysis concept within association rule mining has several advantages for decision support:

• It manages complex decision situations by taking into account all the objective and subjective factors.
• It selects interesting association rules according to the decision makers' preferences.
• It reduces the large number of extracted association rules.

The proposed approach is divided into four modules: the first is the data-mining module, which extracts association rules using the Apriori extraction algorithm; the second is the MCA module of multi-criteria analysis; the third is the module of quality measurement of association rules (Benites et al., 2014); and the last one is the module that combines the association rules with the MCA approach.

5.1. The methodological process

To simplify our approach, we adopt four major modules, as explained in Figure 5.

Figure 5 The proposed approach

a. Module 1: Mining association rules

Association rule mining is a powerful data mining technique for discovering processable knowledge, based on statistical analysis and artificial intelligence. It generally involves two important steps: the first is the extraction of frequent itemsets from the dataset, and the second is the extraction of association rules from the frequent itemsets previously extracted. This technique is a difficult task, costly in terms of the large number of rules. In this module, we chose the Apriori algorithm to extract a large number of interesting and uninteresting rules, see Figure 4.

Figure 4 Apriori algorithm

The major steps of this algorithm are:

• Find the attributes (items) having a support ≥ minsup.
• Use the frequent itemsets found previously to generate the association rules.

The inputs of this algorithm are the dataset, the minimum support and the minimum confidence, and as output we obtain a set of association rules. A rule A → B is considered interesting as far as its support and confidence reach the thresholds set by the user.

Figure 6 Notations
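To illustrate the second step (generating rules from the previously found frequent itemsets), here is a small Python sketch of ours; the itemset/support dictionary and the names are illustrative, not taken from the paper:

```python
# Illustrative generation of rules A -> B from a dict {itemset: support}
# (as returned, e.g., by the frequent_itemsets sketch of Section 3.1),
# keeping only the rules whose confidence reaches minconf.
from itertools import combinations

def generate_rules(frequent, minconf):
    rules = []
    for itemset, sup_ab in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                consequent = itemset - antecedent
                conf = sup_ab / frequent[antecedent]   # Conf = Supp(A u B) / Supp(A)
                if conf >= minconf:
                    rules.append((set(antecedent), set(consequent), sup_ab, conf))
    return rules

frequent = {frozenset({"a"}): 0.8, frozenset({"b"}): 0.6, frozenset({"a", "b"}): 0.5}
print(generate_rules(frequent, minconf=0.6))
# both ('a' -> 'b', conf 0.625) and ('b' -> 'a', conf ~0.83) pass the threshold
```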

b. Module 2: Decision support

Once the decision makers have specified all the criteria that will determine their choice, we first structure these criteria, then convert the appreciations assigned by the decision makers into precise values and, finally, calculate the importance weight of each criterion (Dias et al., 2003). The major steps to consider in this process are:

• Identification of the actions.
• Construction of objective (quantitative) and subjective (qualitative) criteria.
• Conversion of the appreciations assigned to each criterion into precise values.
• Determination of the importance weight of each criterion.
• Definition of all threshold values.

In this module, we study our decision problem in order to choose an appropriate method. In our case, we chose the ELECTRE TRI method, in which we consider the set of extracted association rules as actions and a set of quality measures as criteria.

c. Module 3: Quality measurement

The Apriori algorithms based on the support and confidence of the rules provide an elegant solution to the rule extraction problem, but they produce a large number of rules, selecting some rules without interest and ignoring interesting ones. Other measures are therefore needed to complement support and confidence. Interestingness measures can play a key role in filtering the automatically extracted rules according to criteria adapted to the user's needs (Lenca et al., 2006; Lallich et al., 2006). A good way to assess the eligibility of measures for the interestingness of association rules is to study how they vary with the numbers of examples and counterexamples of the rule. Let E denote a set of data and n = |E|. For a rule A → B, we use the notation of Table 1:

n_a = |A|: the number of records satisfying A;
n_b = |B|: the number of records satisfying B;
n_{ab}: the number of examples of the rule (records satisfying both A and B);
n_{a\bar{b}}: the number of counterexamples of the rule (records satisfying A but not B), with n_{a\bar{b}} = n_a − n_{ab}.

A rule A → B is thus evaluated using a monotonically decreasing function of n_{a\bar{b}}.

Table 1 Notations

A\B      0                     1             Total
0        p_{\bar{a}\bar{b}}    p_{\bar{a}b}  p_{\bar{a}}
1        p_{a\bar{b}}          p_{ab}        p_{a}
Total    p_{\bar{b}}           p_{b}         1

In the literature, many quality measures have been proposed; Table 2 lists some of them.

Table 2 Set of quality measures

Symbol   Measure                               Reference
SUP      Support                               (Frawley et al., 1992)
CONF     Confidence                            (Frawley et al., 1992)
PS       Piatetsky-Shapiro                     (Piatetsky-Shapiro et al., 1991)
LOE      Loevinger                             (Loevinger, J, 1947)
ZHANG    Zhang                                 (Terano et al., 2000)
INDIMP   Implication index                     (Lerman et al., 1981)
SURP     Surprise                              (Sergey et al., 1997)
LIFT     Lift                                  (Aze et al., 2002)
SEB      Sebag and Schoenauer                  (Aze et al., 2002)
CONV     Conviction                            (Brin et al., 1997)
IQC      Cohen quality index                   (Cohen, J, 1960)
IPD      Discriminative probabilistic index    (Lerman et al., 2003)

d. Module 4: MCA approach

The main process consists in evaluating the association rules with the selected multi-criteria analysis method, using the extracted association rules as alternatives and the quality measures as criteria. This process gives a set of chosen association rules according to the decision makers' preferences. In case of satisfaction, we obtain the final set of relevant association rules; if not, we update the set of thresholds, measures, etc. according to the user's preferences. We repeat this operation until the decision makers' needs are satisfied.
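The criteria used in Module 4 are quality measures such as those of Table 2. As an illustration (ours, not from the paper), a few of them can be computed directly from the contingency probabilities of Table 1; the formulas used below are the usual textbook definitions of these measures, stated here as an assumption:

```python
# Illustrative computation of a few measures from Table 2 using the
# contingency probabilities of Table 1 (p_a, p_b, p_ab).

def measures(p_a, p_b, p_ab):
    p_anb = p_a - p_ab          # p_{a b-bar}: probability of a counterexample
    p_nb = 1.0 - p_b            # p_{b-bar}
    return {
        "LIFT": p_ab / (p_a * p_b),
        "PS": p_ab - p_a * p_b,                                   # Piatetsky-Shapiro
        "CONV": p_a * p_nb / p_anb if p_anb else float("inf"),    # conviction
        "LOE": 1.0 - p_anb / (p_a * p_nb),                        # Loevinger
    }

# Roughly Rule 1 of Table 3: support ~ 0.41, confidence ~ 0.86, so p_a ~ 0.41/0.86;
# the consequent support (~0.80) is read off the support of rules 14/15 in Table 3.
print(measures(p_a=0.41 / 0.86, p_b=0.80, p_ab=0.41))   # LIFT comes out ~1.07, as in Table 3
```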

6 Empirical study: identification of characteristics of a set of customers who filed a credit application file.

6.1. Problem description

The problem to solve is to select the interesting association rules by applying our proposed approach to a set of rules extracted from a database (Tanagra, 2005). The dataset used identifies the characteristics of a set of customers who filed a credit application file, with the thresholds support = 0.33 and confidence = 0.75. The support defines the percentage of transactions that contain both the antecedent and the consequent of a rule: with a support threshold above 0.33 fewer than 27 rules are extracted, and with a threshold below 0.33 more than 27 rules are extracted. In addition, the confidence measure controls the precision of the rules; the higher the confidence value, the more noisy rules are eliminated.
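For illustration only, and not the authors' actual tooling, a minimal sketch of an equivalent extraction with the same thresholds, assuming the mlxtend library and a hypothetical one-hot encoded file credit_onehot.csv are available:

```python
# Minimal sketch (not the paper's pipeline): assumes mlxtend is installed and a
# hypothetical one-hot encoded credit dataset where each column is a boolean
# item such as "Income=tranche_2" or "Habitat=tenant".
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

df = pd.read_csv("credit_onehot.csv").astype(bool)

# Same thresholds as in the case study: minsup = 0.33, minconf = 0.75.
frequent = apriori(df, min_support=0.33, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.75)

# Keep the columns used later as ELECTRE TRI criteria (Cr1 = lift, Cr2 = support, Cr3 = confidence).
print(rules[["antecedents", "consequents", "lift", "support", "confidence"]])
```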

We used the set of rules previously extracted by the Apriori algorithm (Table 3) as actions to be evaluated according to the preferences of the decision makers (support, confidence, lift, etc.). The first step is the definition of the quality measures. Given the large number of proposed measures, we chose only three: lift, support and confidence, respectively as the criteria Cr1, Cr2 and Cr3. When several options are possible, it is usually worthwhile to analyze the advantages and drawbacks of each one before taking the decision; an effective way to conduct this comparative analysis is to establish a compatibility matrix that crosses all the available options with the different criteria that must be taken into account in the decision. Table 4 gives the decision matrix.

Table 3 Set of extracted rules generated by the Apriori algorithm

N    Rule                                                                     Lift    Sup (%)   Conf (%)
1    "Income=tranche_2" → "Habitat=tenant" ˄ "Grade=middle_grade"             1.07    40.91     86.17
2    "Income=tranche_2" → "Grade=middle_grade"                                1.07    43.94     92.55
3    "Habitat=tenant" ˄ "Income=tranche_2" → "Grade=middle_grade"             1.07    40.91     92.05
4    "Grade=middle_grade" ˄ "Demand=consumption" → "Habitat=tenant"           1.05    40.91     96.43
5    "Demand=consumption" → "Habitat=tenant"                                  1.05    46.97     95.88
6    "Family situation=married" → "Agreement=Yes"                             1.05    42.42     75.68
7    "Demand=consumption" → "Habitat=tenant" ˄ "Grade=middle_grade"           1.04    40.91     83.51
8    "Habitat=tenant" ˄ "Childreen=zero" → "Grade=middle_grade"               1.03    43.43     88.66
9    "Habitat=tenant" ˄ "Agreement=Yes" → "Grade=middle_grade"                1.03    59.09     88.64
10   "Income=tranche_2" → "Habitat=tenant"                                    1.02    44.44     93.62
11   "Grade=middle_grade" ˄ "Agreement=Yes" → "Habitat=tenant"                1.02    59.09     93.60
12   "Agreement=Yes" → "Habitat=tenant" ˄ "Grade=middle_grade"                1.02    59.09     81.82
13   "Grade=middle_grade" ˄ "Income=tranche_2" → "Habitat=tenant"             1.02    40.91     93.10
14   "Habitat=tenant" → "Grade=middle_grade"                                  1.02    80.30     87.85
15   "Grade=middle_grade" → "Habitat=tenant"                                  1.02    80.30     92.98
16   "Agreement=Yes" → "Grade=middle_grade"                                   1.01    63.13     87.41
17   "Grade=middle_grade" ˄ "Childreen=zero" → "Habitat=tenant"               1.01    43.43     92.47
18   "Agreement=Yes" → "Habitat=tenant"                                       1.01    66.67     92.31
19   "Habitat=tenant" ˄ "Demand=consumption" → "Grade=middle_grade"           1.01    40.91     87.10
20   "Childreen=zero" → "Grade=middle_grade"                                  1.01    46.97     86.92
21   "Demand=consumption" → "Grade=middle_grade"                              1.00    42.42     86.60
22   "Childreen=zero" → "Habitat=tenant" ˄ "Grade=middle_grade"               1.00    43.43     80.37
23   "Grade=middle_grade" ˄ "Family situation=married" → "Habitat=tenant"     1.00    41.41     91.11
24   "Childreen=zero" → "Habitat=tenant"                                      0.99    48.99     90.65
25   "Habitat=tenant" ˄ "Family situation=married" → "Grade=middle_grade"     0.97    41.41     83.67
26   "Family situation=married" → "Habitat=tenant"                            0.97    49.50     88.29
27   "Family situation=married" → "Grade=middle_grade"                        0.94    45.46     81.08

The association rules extracted by the Apriori algorithm are given in Table 3. This table presents twenty-seven extracted rules, some of which are redundant (they refer to the same signification), like rules 20, 22 and 24, while other rules are not interesting for users. Therefore, the next step is to apply our approach to these extracted rules, taking into account the decision maker's preferences.

Table 4 Decision matrix

Rules     Cr1     Cr2     Cr3
Rule1     1.07    0.41    0.86
Rule2     1.07    0.44    0.93
Rule3     1.07    0.41    0.92
Rule4     1.05    0.41    0.96
Rule5     1.05    0.47    0.96
Rule6     1.05    0.42    0.76
Rule7     1.04    0.41    0.84
Rule8     1.03    0.43    0.89
Rule9     0.92    0.32    0.90
Rule10    0.90    0.41    0.91
Rule11    1.02    0.42    0.92
Rule12    0.97    0.41    0.84
Rule13    0.97    0.49    0.88
Rule14    0.94    0.45    0.81
Rule15    0.97    0.41    0.84
Rule16    0.97    0.49    0.88
Rule17    0.94    0.43    0.81
Rule18    0.97    0.41    0.85
Rule19    0.97    0.49    0.88
Rule20    0.94    0.45    0.81
Rule21    0.90    0.41    0.91
Rule22    1.02    0.47    0.93
Rule23    0.97    0.41    0.83
Rule24    0.96    0.49    0.88
Rule25    0.94    0.42    0.82
Rule26    1.00    0.41    0.84
Rule27    0.97    0.49    0.88

The next step is to define the set of profile thresholds against which the extracted association rules are compared.

Table 5 Definition of profiles

       Cr1    Cr2    Cr3
b1     0.1    0.5    1.0
b2     1.0    1.2    1.5

The importance of each criterion in the decision making is reflected in predefined thresholds, namely the indifference threshold q_j(b_i), the preference threshold p_j(b_i) and the veto threshold v_j(b_i).

Table 6 Definition of thresholds for ELECTRE TRI

                Cr1    Cr2    Cr3
weight (k_j)    1.0    2.0    3.0
q_j(b1)         0.1    0.2    0.1
p_j(b1)         0.3    0.3    0.2
v_j(b1)         0.4    0.5    0.4
q_j(b2)         0.5    0.6    0.5
p_j(b2)         0.6    0.7    0.8
v_j(b2)         0.8    0.9    0.9

We use the default value of the λ-cut index, λ = 0.76, as the parameter that determines the preference situation between the actions a and the profiles b_h.

Table 7 Degrees of credibility

Rules     Profile 1      Profile 2
Rule1     0.800/0.000    0.421/1.000
Rule2     1.000/0.000    0.550/1.000
Rule3     1.000/0.000    0.533/1.000
Rule4     1.000/0.000    0.600/1.000
Rule5     1.000/0.000    0.600/1.000
Rule6     0.700/0.000    0.367/1.000
Rule7     0.950/0.000    0.483/1.000
Rule8     0.950/0.000    0.483/1.000
Rule9     0.950/0.000    0.783/1.000
Rule10    1.000/0.000    0.567/1.000
Rule11    1.000/0.000    0.867/1.000
Rule12    0.600/0.000    0.667/1.000
Rule13    1.000/0.000    0.550/1.000
Rule14    1.000/0.000    0.883/1.000
Rule15    1.000/0.000    0.883/1.000
Rule16    0.850/0.000    0.783/1.000
Rule17    1.000/0.000    0.533/1.000
Rule18    1.000/0.000    0.867/1.000
Rule19    0.850/0.000    0.450/1.000
Rule20    0.850/0.000    0.450/1.000
Rule21    0.850/0.000    0.450/1.000
Rule22    0.500/0.000    0.325/1.000
Rule23    1.000/0.000    0.517/1.000
Rule24    1.000/0.000    0.517/1.000
Rule25    0.700/0.000    0.367/1.000
Rule26    0.900/0.000    0.467/1.000
Rule27    0.550/0.000    0.350/1.000
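As a check on the first row of Table 7 (our own hand calculation, not reproduced from the paper's text), applying Eqs. (4), (5) and (7) to Rule1 of Table 4, g(Rule1) = (1.07, 0.41, 0.86), and to profile b1 with the thresholds of Tables 5 and 6 gives:

c_1(Rule1, b_1) = 1,  c_2(Rule1, b_1) = 1,  c_3(Rule1, b_1) = (0.2 − (1.0 − 0.86)) / (0.2 − 0.1) = 0.6

C(Rule1, b_1) = (1×1 + 2×1 + 3×0.6) / (1 + 2 + 3) = 0.80, and all d_j(Rule1, b_1) = 0, hence σ(Rule1, b_1) = 0.80.

Conversely, the veto on Cr1 (g_1(Rule1) = 1.07 ≥ g_1(b_1) + v_1(b_1) = 0.5) forces σ(b_1, Rule1) = 0, which reproduces the 0.800/0.000 pair of Table 7; since 0.80 ≥ λ = 0.76, Rule1 outranks b_1.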

The implementation of the case study provides a preference relation between each rule and the two profiles, see Table 8, where R (incomparable) means that the rule is incomparable to the profile and I (indifferent) means that the rule is indifferent to the profile (see Figure 3).

Table 8 Comparison of profiles

Rules     Profile 1    Profile 2
Rule1     >            <
Rule2     >            <
Rule3     >            <
Rule4     >            <
Rule5     >            <
Rule6     R            <
Rule7     R            <
Rule8     >            <
Rule9     >            I
Rule10    >            <
Rule11    >            I
Rule12    R            <
Rule13    >            <
Rule14    >            I
Rule15    >            I
Rule16    >            I
Rule17    >            <
Rule18    >            I
Rule19    >            <
Rule20    >            <
Rule21    >            <
Rule22    R            <
Rule23    >            <
Rule24    >            <
Rule25    R            <
Rule26    >            <
Rule27    R            <
< < < < < <