2014 7th International Symposium on Telecommunications (IST'2014)

Privacy Preserving Quantitative Association Rule Mining Using Convex Optimization Technique

Elham Hatefi, Department of Electrical & Computer Engineering, Isfahan University of Technology, 84156-83111 Isfahan, Iran, [email protected]

Abdolreza Mirzaei, Department of Electrical & Computer Engineering, Isfahan University of Technology, 84156-83111 Isfahan, Iran, [email protected]

Some interesting and nonsensitive rules available in R may not be extracted, and some rules that are not in R may be extracted; such rules are called the side effects of the association rule hiding problem. If we assume that x → y is a sensitive rule, we can hide it by decreasing its support below the MST (Minimum Support Threshold) or by decreasing its confidence below the MCT (Minimum Confidence Threshold). In this paper, hiding sensitive association rules is formulated as a convex optimization problem whose objective function minimizes the changes made in the dataset. In order to hide the sensitive rules, constraints are written that keep their support values in the changed dataset below the MST. This hides all sensitive rules; moreover, because the changes in the dataset are minimized, most of the nonsensitive rules can still be extracted from the changed dataset.

Abstract - Privacy preserving data mining (PPDM) has been an active research area over the past two decades. The aim of PPDM algorithms is to modify the data in a dataset so that sensitive data and confidential knowledge remain confidential even after the data are mined. Association rule hiding is a PPDM technique that prevents the extraction of rules recognized as sensitive, which should not be extracted and placed in the public domain. Most of the work done in the area of privacy preserving data mining is limited to binary data; however, many real-world datasets also include quantitative data. In this paper a new method is proposed to hide sensitive quantitative association rules, based on a convex optimization technique. In this method, fuzzy association rule hiding is formulated as a convex optimization problem, and experiments have been carried out on a real dataset. The results indicate that the proposed method outperforms existing methods in this field in terms of the percentage of missing rules and the changes made in the dataset.

The rest of this paper is organized as follows. Section II reviews related work. Section III describes the basic concepts of association rule mining, fuzzy association rule mining, privacy preserving association rule mining and convex optimization. Section IV describes the proposed method. Experiments are presented in Section V, and finally Section VI concludes and offers directions for future work.

Keywords—Privacy preserving data mining, Association rule hiding, Convex optimization

Mehran Safayani, Department of Electrical & Computer Engineering, Isfahan University of Technology, 84156-83111 Isfahan, Iran, [email protected]

I. INTRODUCTION

Data mining is used as a pattern and knowledge discovery tool for large collections of data. A variety of data mining algorithms have been proposed, such as decision trees, neural networks, association rule mining and clustering algorithms. The knowledge extracted by data mining techniques can be extremely beneficial, but it may also contain personal data or secret information that should be hidden. In order to achieve the two goals of data mining and data privacy protection, different algorithms have been developed that perform data mining while maintaining the confidentiality of the data. This field of data mining is called privacy preserving data mining, and it was first proposed by Lindell and Pinkas [1] and by Agrawal and Srikant [2].

II. RELATED WORK

Association rule hiding is an NP-hard problem [3]. Different methods have been proposed to hide association rules, most of which consider only binary datasets. These methods can be divided into four categories: heuristic-based, border-based, exact, and cryptography-based approaches. Heuristic-based approaches use two techniques, distortion and blocking; in both, the value of an item is changed in selected transactions in order to hide the sensitive rules. In the blocking technique, the values 0 or 1 are replaced with the value "?" in selected transactions [4,5,6]. In the distortion technique, the value 0 is replaced by the value 1 in selected transactions and vice versa [7,8,9,10]. In border-based approaches, sensitive frequent itemsets are hidden through modifications in the lattice of frequent and infrequent itemsets of the initial dataset [11,12,13,14]. In exact approaches, sensitive frequent itemsets are hidden by solving constraint satisfaction problems with linear and non-linear programming methods [15,16]. Cryptographic methods are used when multiple parties want to share their sensitive information [17,18].

Privacy preserving association rule mining is one of the techniques of PPDM. Its purpose is to hide sensitive rules with minimal side effects. If we assume that R is the set of interesting rules that can be retrieved from a dataset D and that R_H is the set of sensitive rules in R, the association rule hiding problem aims to change dataset D so that all interesting rules except those in R_H can still be retrieved. Based on the changes made in dataset D, some interesting and nonsensitive rules may be affected.

978-1-4799-5359-2/14/$31.00 ©2014 IEEE

The support measure is defined as |x ∪ y| / N, where |x ∪ y| is the number of transactions that include both x and y, and N is the total number of transactions in the dataset. This measure gives the percentage of rows in the transaction dataset that contain the itemset xy. In fact, the support measure is used to find frequent itemsets.

For quantitative data only a few works have been published, all of which are based on fuzzy logic. The ISL (Increase Support of Left-hand side) method was proposed in [19]; it decreases the confidence of sensitive rules by increasing the support of the LHS (Left-Hand Side) of the association rules. The DSR (Decrease Support of RHS) algorithm [20] is another fuzzy association rule hiding method, which decreases the confidence of sensitive rules by decreasing the support of the RHS (Right-Hand Side) of the association rules. In [21], the authors proposed two approaches to hide sensitive fuzzy association rules: decreasing the support of items in the RHS of association rules (DSR) and Particle Swarm Optimization (PSO). A method for preserving privacy in quantitative association rules using a genetic algorithm was presented in [22]; there, the sensitive rules are hidden by decreasing the support of the RHS of the sensitive rules.

The confidence measure indicates the strength of the relationship between the itemsets x and y. It is defined as |x ∪ y| / |x|, where |x| is the number of transactions that include x. This measure gives the percentage of transactions in the dataset that include y among those that include x. Association rule mining algorithms extract interesting rules whose support and confidence are greater than thresholds specified by the user. Association rule mining is performed in two stages. The first stage extracts the frequent itemsets, i.e., the set of large (frequent) itemsets whose support is above a minimum support threshold. In the second stage, all possible rules are formed from the frequent itemsets, and the rules whose confidence is above a predefined threshold are extracted. In all previous methods of privacy preserving quantitative association rule mining, association rules are extracted from 2-large itemsets; in this paper, we follow the same approach.
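The two measures can be computed directly from these definitions. The sketch below uses a toy transaction database of our own, not data from the paper:

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item of the itemset: |x U y| / N."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, lhs, rhs):
    """Support of the combined itemset divided by the support of the antecedent."""
    return support(transactions, lhs | rhs) / support(transactions, lhs)

# Toy transaction database: each transaction is a set of items
db = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "c"}]

print(support(db, {"a", "b"}))       # transactions 1 and 2 contain both -> 0.5
print(confidence(db, {"a"}, {"b"}))  # 2 of the 3 transactions with "a" also have "b"
```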

What distinguishes the method proposed in this paper from previous methods is that we make gradual changes in the values of items, whereas most previous methods made sudden changes. For example, in the DSR method, if the value of the item in the RHS is greater than 0.5 and greater than the value of the item in the LHS, the RHS value is subtracted from 1. If the value of the item in the RHS is 0.7 and the value of the item in the LHS is 0.5, the RHS value becomes 0.3 after this subtraction. In fact, large changes are made in the value of the item, which may cause many changes in the real and fuzzy datasets, even though such changes may not be needed and fewer changes could hide the sensitive rules.
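The sudden change described above can be illustrated numerically. The function below is a simplified sketch of the DSR update as paraphrased in the text, not the authors' implementation:

```python
def dsr_update(lhs, rhs):
    # DSR-style step: if the RHS membership exceeds 0.5 and the LHS membership,
    # replace it by 1 - rhs (a sudden jump)
    if rhs > 0.5 and rhs > lhs:
        return 1.0 - rhs
    return rhs

lhs, rhs = 0.5, 0.7
jumped = dsr_update(lhs, rhs)  # 0.7 becomes 0.3: a change of 0.4 in one step
# A gradual change of only 0.2 (rhs lowered to 0.5) already removes the rule's
# Lukasiewicz contribution max(lhs + rhs - 1, 0):
print(jumped, max(lhs + 0.5 - 1.0, 0.0))
```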

B. Fuzzy Association Rule Mining
In the real world, datasets are not limited to binary data; transaction values may also be real numbers. The process of mining rules from such data is called quantitative association rule mining. Before the association rule mining operation, the real values of the dataset are divided into intervals and mapped to membership values by a set of fuzzy sets. For example, if the fuzzy sets {low, medium, high} are associated with the attribute height, this attribute can be divided into intervals such as height.low = {0 - 120}, height.medium = {100 - 170} and height.high = {160 - 199}, and each value is assigned a membership value in the unit interval [0,1] by the predefined fuzzy sets. A fuzzy set A is described by a membership function μ_A : X → [0,1] mapping the elements of a universe X to the unit interval [0,1]; fuzzy sets were first introduced by Lotfi A. Zadeh in 1965 [24].

If we assume that item A exists in the real dataset and the fuzzy sets {low, medium, high} are associated with it, it is converted into three items A_low, A_medium and A_high in the fuzzified dataset. In this paper we call such items related items. If the value of one of these items is changed, the change may alter the values of the other related items. The second difference of the proposed method is that we consider the correlation between related items; previous approaches did not consider these relationships and may therefore make many changes in the real dataset.

III. BASIC CONCEPTS

The extraction of fuzzy association rules is similar to binary association rule mining, except that each item appears with several fuzzy sets. For example, if the fuzzy sets {low, medium, high} are associated with item x, it is converted into three items x_low, x_medium and x_high in the fuzzified dataset. A quantitative association rule is defined as "If x is M then y is N", where x ∈ I, y ∈ I and M (N) contains the fuzzy sets associated with item x (y). In this paper we use triangular fuzzy sets to extract association rules; they are described in (1).

A. Association Rule Mining
Association rule mining is one of the main techniques of data mining and was first introduced by Agrawal [23]. It reveals relationships among the items of a large set of items. Association rule mining can be expressed as follows: assume that I = {i_1, i_2, i_3, ..., i_n} is a set of items and T = {t_1, t_2, t_3, ..., t_m} is a set of transactions, where each transaction is a set of items. An association rule is written as x → y, where x ∩ y = ∅, x ⊆ I and y ⊆ I; x, the left-hand side of the rule, is called the antecedent, and y, the right-hand side, is called the consequent. Two measures, support and confidence, are defined for association rules.

μ(x) = max(min((x - a)/(b - a), (c - x)/(c - b)), 0),   (1)

where a is the left endpoint of the triangle, b is its peak and c is its right endpoint. The support measure of these rules is defined as |T(x, y)| / N and the confidence measure as |T(x, y)| / |x|, where |x| is the sum of the values of item x over all rows of the fuzzified dataset, N is the total number of rows in the dataset, and |T(x, y)| is the sum of the values of T(x, y) computed from items x and y over all rows of the fuzzified dataset. T is a t-norm (triangular norm); the basic continuous t-norms are the minimum, the product and the Łukasiewicz t-norm [24]. In this paper we use the Łukasiewicz t-norm for extracting quantitative association rules, which is defined as:

T(x, y) = max(x + y - 1, 0).   (2)

IV. PROPOSED METHOD

In the proposed method, in order to hide a sensitive rule x → y, we decrease the support of the itemset xy below the minimum support threshold. In this paper we use fuzzy sets consisting of three membership functions. In order to minimize the changes in the real dataset, we identify four areas in the fuzzy sets, and changes occur only within these areas; they are shown in Fig.1. For each item of a sensitive rule, we first find the appropriate area in the selected transaction, based on the real value corresponding to this item in the real dataset, and the relation between related items is determined in that area.

C. Privacy Preserving Association Rule Mining
The purpose of privacy preserving association rule mining is to avoid extracting rules that are known as sensitive. Association rule hiding is performed by decreasing either the support or the confidence of the sensitive rules: in the first approach the support of a sensitive rule is reduced below the threshold MST, and in the second approach its confidence is reduced below the threshold MCT. Because of the changes made to the initial dataset, some wrong rules may be extracted from the sanitized dataset. These rules are called side effects of the association rule hiding problem and fall into two categories, lost rules and ghost rules. Lost rules are interesting, nonsensitive rules that can be extracted from the initial dataset but not from the sanitized dataset. Ghost rules are rules that are not extracted from the initial dataset but are extracted as interesting rules from the sanitized dataset.


For example, if the real value of item A is 4 in the real dataset, it is converted into three items A_low, A_medium and A_high with membership values μ_low(4) = 0.8, μ_medium(4) = 0 and μ_high(4) = 0 by the fuzzy sets shown in Fig.1. Because the value of item A is located in the first area, its changes will also occur in this area; the figure shows that changes in μ_low of this item do not affect its μ_medium and μ_high. As another example, suppose the value of item A is 6 in the real dataset; it is converted into the three items with membership values μ_low(6) = 0.8, μ_medium(6) = 0.2 and μ_high(6) = 0. The value of item A is now located in the second area, so changes in the value of A_medium (A_low) alter the value of A_low (A_medium). If the membership value of item A_medium is μ_medium and that of A_low is μ_low, the relation between these items in Fig.1 is μ_low + μ_medium - 1 = 0. For example, changing the value of A_medium from 0.8 to 0.6 alters the value of A_low from 0.2 to 0.4.
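The complementary relation μ_low + μ_medium - 1 = 0 holds wherever two adjacent triangular sets cross symmetrically. Since Fig.1 is only partially recoverable in this copy, the breakpoints below are assumed purely for illustration:

```python
def tri(x, a, b, c):
    # Triangular membership of equation (1)
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Assumed breakpoints: "low" peaks at 0 and vanishes at 10,
# "medium" rises from 0 to its peak at 10 (symmetric overlap)
low = lambda x: tri(x, -10.0, 0.0, 10.0)
medium = lambda x: tri(x, 0.0, 10.0, 20.0)

x = 6.0  # falls in the overlap (second) area
print(low(x), medium(x), low(x) + medium(x))  # the memberships sum to 1
```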

Generally, if the real value of an item belongs to the first area, changes in the value of item A_low do not affect the values of items A_medium and A_high. In the second area, changes in the value of A_low alter the value of A_medium and vice versa; the value of A_high is not affected. If the real value of item A belongs to the third area, changes in the value of A_medium alter the value of A_high and vice versa; the value of A_low is not affected. In the fourth area, changes in the value of A_high do not affect the values of A_medium and A_low. In this paper we call such items related items.

Figure 1: Regions of the fuzzy sets

D. Convex Optimization
Convex optimization is a class of optimization problems that have a single solution: their local optimum is equal to their global optimum. A convex optimization problem has the form [25]:

minimize f_0(x)
subject to f_i(x) ≤ 0, i = 1, ..., m
a_i^T x = b_i, i = 1, ..., p,   (3)

where f_0, ..., f_m are convex functions. In fact, a convex problem has three requirements:
• The objective function must be convex.
• The inequality constraint functions must be convex.
• The equality constraint functions must be affine.

A function f : R^n → R is convex if dom f is a convex set and if for all x, y ∈ dom f and θ with 0 ≤ θ ≤ 1 we have [25]:

f(θx + (1 - θ)y) ≤ θf(x) + (1 - θ)f(y).   (4)
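As a quick numerical sanity check of definition (4): the function max(u + v - 1, 0) used later in the support constraint is a maximum of affine functions and therefore convex, so random sampling should never violate Jensen's inequality:

```python
import random

def g(u, v):
    # Positive part of an affine function: convex (a maximum of affine functions)
    return max(u + v - 1.0, 0.0)

random.seed(0)
violations = 0
for _ in range(10_000):
    x1, x2 = random.random(), random.random()
    y1, y2 = random.random(), random.random()
    t = random.random()
    lhs = g(t * x1 + (1 - t) * y1, t * x2 + (1 - t) * y2)
    rhs = t * g(x1, x2) + (1 - t) * g(y1, y2)
    if lhs > rhs + 1e-12:  # Jensen's inequality (4) would be violated
        violations += 1

print(violations)  # 0
```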

817

In this paper, fuzzy association rule hiding is performed in the form of a convex optimization problem. Before formulating the problem, we must specify its variables. If the dataset is viewed as a two-dimensional array whose rows indicate the transactions and whose columns indicate the items, then the value in row i and column j of the dataset is denoted by Data_ij, and the corresponding variable is denoted by x_ij. A vector X is used to store these variables. First, for each sensitive rule y → z, if index j refers to the column of item y and index k refers to the column of item z, the rows with max(Data_ij + Data_ik - 1, 0) > 0 are identified and their values are considered as variables; the values of these rows affect the support of the itemset yz and must be changed in order to reduce the support of the sensitive rule. These variables are stored in vector X. After identifying the variables related to the sensitive rules, the related items of the items associated with these variables are determined, and their values are also considered as variables and stored in vector X.

Any optimization problem consists of an objective function and constraints. In the proposed method, the objective function is the norm-2 difference between the values in the initial dataset and the sanitized dataset:

minimize ( Σ_{x_ij ∈ X} (x_ij - Data_ij)^2 )^(1/2),   (5)

where x_ij is the value of an item considered as a variable in the sanitized dataset and Data_ij is the value of this item in the initial dataset.

The constraints are defined as follows. The aim of the first constraint is to reduce the support values of the sensitive rules. For the m-th sensitive rule y → z, this constraint is:

Σ_{i=1}^{N} max(x_ij + x_ik - 1, 0) < MST · N,   (6)

where N is the total number of rows in the dataset, index j refers to the column of item y and index k refers to the column of item z of this rule. If x_ij (x_ik) is not in vector X, its constant value Data_ij (Data_ik) is used instead. This constraint is written for all sensitive rules.

The second constraint specifies the relationship between the values of related items. In Fig.1 this relationship is:

x_ij + x_ik - 1 = 0,   (7)

where x_ij and x_ik are the values of related items in the selected transaction and x_ij, x_ik ∈ X. This constraint is written for all related items.

The third constraint specifies that the values representing the fuzzy variables lie between zero and one:

0 ≤ x_ij ≤ 1.   (8)

This constraint is written for all variables. The norm function and the maximum of two convex functions are convex [25]; thus the proposed optimization problem is convex and its solution is unique.

For example, consider Table I, whose real values are translated to fuzzy values using the triangular fuzzy sets of Fig.1; Table II shows the result. If we assume MSV = 0.1 and MCV = 0.2, ten rules BL → AL, BL → AH, CM → AH, AL → BL, AH → BL, AH → CM, CM → BL, CH → BL, BL → CM, BL → CH are derived from the dataset. If CM → AH and CH → BL are the sensitive rules, the optimization problem is written to hide them.

TABLE I: REAL DATASET

T    A    B    C
T1   6    6    13
T2   14   4    11
T3   17   7    14
T4   4    18   3

TABLE II: FUZZIFIED DATASET

T    AL   AM   AH   BL   BM   BH   CL   CM   CH
T1   0.8  0.2  0    0.8  0.2  0    0    0.4  0.6
T2   0    0.2  0.8  0.8  0    0    0    0.8  0.2
T3   0    0    0.6  0.6  0.4  0    0    0.2  0.8
T4   0.8  0    0    0    0    0.4  0.6  0    0

First, the variables corresponding to the sensitive rules are identified; they are shown in Table III. After defining these variables, the related items of the items already in vector X are determined and their values are considered as variables too. For example, the variable x19, which corresponds to item CH, belongs to the third area, and changes in its value alter the value of item CM in row 1; therefore the value of item CM, the related item of CH, is also considered as a variable, denoted x18. As another example, the variable x34, corresponding to item BL, belongs to the second area, and changes in its value alter the value of item BM in row 3; therefore the value of item BM is also considered as a variable, denoted x35. These variables are shown in Table IV.

TABLE III: INTERMEDIATE DATASET

T    AL   AM   AH   BL   BM   BH   CL   CM   CH
T1   0.8  0.2  0    x14  0.2  0    0    0.4  x19
T2   0    0.2  x23  0.8  0    0    0    x28  0.2
T3   0    0    0.6  x34  0.4  0    0    0.2  x39
T4   0.8  0    0    0    0    0.4  0.6  0    0

TABLE IV: INTERMEDIATE DATASET

T    AL   AM   AH   BL   BM   BH   CL   CM   CH
T1   0.8  0.2  0    x14  x15  0    0    x18  x19
T2   0    x22  x23  0.8  0    0    0    x28  x29
T3   0    0    0.6  x34  x35  0    0    x38  x39
T4   0.8  0    0    0    0    0.4  0.6  0    0
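The variable-identification step described above can be sketched as follows, using the Table II values. The positions it finds match the variables of Table III, and the initial Łukasiewicz supports of both sensitive itemsets exceed MSV · N = 0.4, which is why the constraints force a change:

```python
cols = ["AL", "AM", "AH", "BL", "BM", "BH", "CL", "CM", "CH"]
data = {  # fuzzified dataset of Table II, keyed by row number
    1: [0.8, 0.2, 0.0, 0.8, 0.2, 0.0, 0.0, 0.4, 0.6],
    2: [0.0, 0.2, 0.8, 0.8, 0.0, 0.0, 0.0, 0.8, 0.2],
    3: [0.0, 0.0, 0.6, 0.6, 0.4, 0.0, 0.0, 0.2, 0.8],
    4: [0.8, 0.0, 0.0, 0.0, 0.0, 0.4, 0.6, 0.0, 0.0],
}
sensitive = [("CM", "AH"), ("CH", "BL")]  # the two sensitive rules

variables, supports = set(), {}
for lhs, rhs in sensitive:
    j, k = cols.index(lhs), cols.index(rhs)
    total = 0.0
    for i, row in data.items():
        term = max(row[j] + row[k] - 1.0, 0.0)  # Lukasiewicz t-norm of eq. (2)
        total += term
        if term > 0:  # this row contributes to the support: make its cells variables
            variables.add((i, lhs))
            variables.add((i, rhs))
    supports[(lhs, rhs)] = total

print(sorted(variables))  # the variable positions of Table III
print(supports)           # roughly {('CM','AH'): 0.6, ('CH','BL'): 0.8}
```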

After defining the variables, the objective function and the constraints of the example are:

minimize ((x14 - 0.8)^2 + (x15 - 0.2)^2 + (x18 - 0.4)^2 + (x19 - 0.6)^2 + (x22 - 0.2)^2 + (x23 - 0.8)^2 + (x28 - 0.8)^2 + (x29 - 0.2)^2 + (x34 - 0.6)^2 + (x35 - 0.4)^2 + (x38 - 0.2)^2 + (x39 - 0.8)^2)^(1/2)

subject to
max(0 + x18 - 1, 0) + max(x23 + x28 - 1, 0) + max(0.6 + x38 - 1, 0) + 0 < 0.4
max(x14 + x19 - 1, 0) + max(0.8 + x29 - 1, 0) + max(x34 + x39 - 1, 0) + 0 < 0.4
x14 + x15 - 1 = 0, x18 + x19 - 1 = 0, x22 + x23 - 1 = 0
x28 + x29 - 1 = 0, x34 + x35 - 1 = 0, x38 + x39 - 1 = 0
0 ≤ x14 ≤ 1, 0 ≤ x15 ≤ 1, 0 ≤ x18 ≤ 1, 0 ≤ x19 ≤ 1, 0 ≤ x22 ≤ 1, 0 ≤ x23 ≤ 1,
0 ≤ x28 ≤ 1, 0 ≤ x29 ≤ 1, 0 ≤ x34 ≤ 1, 0 ≤ x35 ≤ 1, 0 ≤ x38 ≤ 1, 0 ≤ x39 ≤ 1.

The final association rules become BL → AL, AL → BL, AH → BL, BL → CM, CM → BL, BL → AH. In fact, the two sensitive rules CM → AH and CH → BL are hidden, two of the nonsensitive rules are lost, and there are no ghost rules. The sanitized dataset and the real sanitized dataset are shown in Table V and Table VI; as can be seen, only a small difference is created in the fuzzy and real datasets.

TABLE V: SANITIZED DATASET

T    AL   AM    AH    BL    BM    BH   CL   CM    CH
T1   0.8  0.2   0     0.66  0.34  0    0    0.47  0.53
T2   0    0.38  0.62  0.8   0     0    0    0.78  0.22
T3   0    0     0.6   0.46  0.54  0    0    0.46  0.54
T4   0.8  0     0     0     0     0.4  0.6  0     0

TABLE VI: REAL SANITIZED DATASET

T    A     B    C
T1   6     6.7  12.56
T2   13.1  4    11.1
T3   17    7.7  13.65
T4   4     18   3

V. EXPERIMENTS AND RESULTS

In order to implement the proposed method, the CVX package was used. This package is designed to solve convex, linear and quadratic optimization problems. Experiments were performed on the breast cancer dataset, which is available in the UCI Machine Learning Repository. Six tests were carried out to assess the performance of the proposed algorithm. Our method was compared with the DSR method, which has the best results among the existing algorithms in this domain. The experiments were performed on a PC with a Core i5 CPU and 4 GB of RAM running Windows 7. The support threshold was varied between 4% and 10%, the confidence threshold was set to 10%, and 10 percent of the total rules were selected as sensitive rules.

The goal of the first test is to assess the percentage of hidden sensitive rules for a variety of support values with a constant confidence of 10%. The results show that rule hiding in the proposed method is completely accomplished, while only about 75% of the sensitive rules are hidden by the DSR method.

Fig.2 shows the number of total rules and the number of lost rules for a variety of support values with a constant confidence of 10%. As is clear, the number of lost rules is smaller in the proposed method.

Figure 2: Number of lost rules under different values of minimum support

Fig.3 shows the number of total rules and the number of ghost rules. In the DSR method there are no ghost rules, while in the proposed algorithm the number of ghost rules is not greater than 2% of the total rules.

Figure 3: Number of ghost rules under different values of minimum support

The aim of the fourth test is to evaluate the changes made to the initial dataset by the hiding operation. The difference between the quantitative values in the initial dataset and the sanitized dataset is shown in Fig.4; this difference is clearly smaller in the proposed approach. The fifth test shows the difference between the real values in the initial real dataset and the sanitized real dataset; according to Fig.5, this difference is also smaller in the proposed approach. In terms of run-time, the algorithm has an

average running time of 4 seconds. This time period is acceptable for association rule hiding algorithms.

Figure 4: Difference between the values in the initial dataset and the sanitized dataset under different values of minimum support

Figure 5: Difference between the real values in the initial real dataset and the sanitized real dataset under different values of minimum support

VI. CONCLUSION AND FUTURE WORK

In this paper, a new method was proposed to hide sensitive fuzzy association rules. In this method, fuzzy association rule hiding is formulated as a convex optimization problem whose objective function minimizes the difference between the initial and the sanitized dataset. Constraints were defined to reduce the support of the sensitive rules and to specify the relationships between related items. The performance of the proposed algorithm was measured in terms of running time, percentage of hidden rules, side effects and changes made to the dataset. The results showed that all sensitive rules were hidden by the proposed method, and that the number of lost rules and the changes to the initial dataset were significantly reduced. In all existing methods in this field, association rules are extracted from 2-large itemsets; our method can be extended to any kind of association rules. In future work we will present a method for privacy preserving quantitative association rule mining in which rules are extracted with other t-norms.

REFERENCES

[1] Y. Lindell and B. Pinkas, "Privacy preserving data mining", Journal of Cryptology, 15(3), pp. 177-206, 2002.
[2] R. Agrawal and R. Srikant, "Privacy preserving data mining", In: ACM SIGMOD Conference on Management of Data, pp. 439-450, 2000.
[3] M. Atallah, E. Bertino, A. Elmagarmid, M. Ibrahim and V.S. Verykios, "Disclosure limitation of sensitive rules", In Proc. IEEE Knowledge and Data Engineering Exchange Workshop (KDEX '99), pp. 45-52, 1999.
[4] Y. Saygin, V.S. Verykios and C. Clifton, "Using unknowns to prevent discovery of association rules", ACM SIGMOD Record, vol. 30(4), pp. 45-54, Dec. 2001.
[5] Y. Saygin, V.S. Verykios and A.K. Elmagarmid, "Privacy preserving association rule mining", In Proc. Int'l Workshop on Research Issues in Data Engineering (RIDE 2002), pp. 151-158, 2002.
[6] S.L. Wang and A. Jafari, "Using unknowns for hiding sensitive predictive association rules", In Proc. IEEE Int'l Conf. Information Reuse and Integration (IRI 2005), pp. 223-228, Aug. 2005.
[7] V.S. Verykios, A. Elmagarmid, E. Bertino, Y. Saygin and E. Dasseni, "Association rule hiding", IEEE Transactions on Knowledge and Data Engineering, 16(4), pp. 434-447, 2004.
[8] S.L. Wang, B. Parikh and A. Jafari, "Hiding informative association rule sets", Expert Systems with Applications, 33, pp. 316-323, 2007.
[9] S.L. Wang, D. Patel, A. Jafari and T.P. Hong, "Hiding collaborative recommendation association rules", published online 30 January 2007, Springer Science+Business Media, LLC, 2007.
[10] C.C. Weng, S.T. Chen and H.C. Lo, "A novel algorithm for completely hiding sensitive association rules", IEEE Intelligent Systems Design and Applications, vol. 3, pp. 202-208, 2008.
[11] H. Mannila and H. Toivonen, "Levelwise search and borders of theories in knowledge discovery", Data Mining and Knowledge Discovery, vol. 1(3), pp. 241-258, Sep. 1997.
[12] X. Sun and P.S. Yu, "A border-based approach for hiding sensitive frequent itemsets", In Proc. 5th IEEE Int'l Conf. on Data Mining (ICDM), pp. 426-433, 2005.
[13] G.V. Moustakides and V.S. Verykios, "A max-min approach for hiding frequent itemsets", In Workshop Proc. 6th IEEE Int'l Conf. on Data Mining (ICDM), pp. 502-506, 2006.
[14] G.V. Moustakides and V.S. Verykios, "A maxmin approach for hiding frequent itemsets", Data and Knowledge Engineering, 65(1), pp. 75-89, 2008.
[15] A. Gkoulalas-Divanis and V.S. Verykios, "An integer programming approach for frequent itemset hiding", In Proc. ACM Conf. Information and Knowledge Management (CIKM '06), pp. 748-757, Nov. 2006.
[16] A. Gkoulalas-Divanis and V.S. Verykios, "Exact knowledge hiding through dataset extension", IEEE Transactions on Knowledge and Data Engineering, vol. 21(5), pp. 699-713, May 2009.
[17] J. Vaidya and C. Clifton, "Privacy preserving association rule mining in vertically partitioned data", In Proc. Int'l Conf. Knowledge Discovery and Data Mining, pp. 639-644, July 2002.
[18] M. Kantarcioglu and C. Clifton, "Privacy-preserving distributed mining of association rules on horizontally partitioned data", IEEE Transactions on Knowledge and Data Engineering, vol. 16(9), pp. 1026-1037, Sept. 2004.
[19] T. Berberoglu and M. Kaya, "Hiding fuzzy association rules in quantitative data", In Proc. 3rd Int'l Conf. on Grid and Pervasive Computing Workshops, pp. 387-392, May 2008.
[20] K. Sathiyapriya, G. Sudha Sadasivam and N. Celin, "A new method for preserving privacy in quantitative association rules using DSR approach with automated generation of membership function", World Congress on Information and Communication Technologies, pp. 148-153, 2011.
[21] G. Sudha Sadasivam, S. Sangeetha and K. Sathiyapriya, "Privacy preserving with attribute reduction in quantitative association rules using PSO and DSR", Special Issue of International Journal of Computer Applications (0975-8887) on Information Processing and Remote Computing (IPRC), pp. 19-30, Aug. 2012.
[22] K. Sathiyapriya, G. Sudha Sadasivam and V.B. Karthikeyan, "A new method for preserving privacy in quantitative association rules using genetic algorithm", International Journal of Computer Applications, vol. 60, pp. 12-19, Dec. 2012.
[23] R. Agrawal, T. Imielinski and A. Swami, "Mining association rules between sets of items in large databases", In Proc. 1993 ACM SIGMOD Int'l Conf. on Management of Data, Washington, DC, pp. 207-216, May 1993.
[24] W. Pedrycz and F. Gomide, "Fuzzy Systems Engineering: Toward Human-Centric Computing", John Wiley & Sons, 2007.
[25] S. Boyd and L. Vandenberghe, "Convex Optimization", Cambridge University Press, 2004.