J Syst Sci Syst Eng (Dec 2007) 16(4): 434-449 DOI: 10.1007/s11518-007-5059-1

ISSN: 1004-3756 (Paper) 1861-9576 (Online) CN11-2983/N

KNOWLEDGE DISTANCE IN INFORMATION SYSTEMS∗

Yuhua QIAN 1,2   Jiye LIANG 1   Chuangyin DANG 2   Feng WANG 1   Wei XU 3

1 School of Computer & Information Technology, Shanxi University, Taiyuan, 030006, China
[email protected]
2 Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Hong Kong
[email protected]
3 School of Management, Graduate University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, 100080, China

Abstract In this paper, we first introduce the concepts of knowledge closeness and knowledge distance for measuring, respectively, the sameness and the difference among knowledge in an information system. The relationship between these two concepts is a strictly mutual complement relation. We then investigate some important properties of the knowledge distance and perform experimental analyses on two public data sets, which show that the presented measure is well suited to characterize the nature of knowledge in an information system. Finally, we establish the relationship between the knowledge distance and the knowledge granulation, which shows that two variants of the knowledge distance can also be used to construct the knowledge granulation. These results will be helpful for studying uncertainty in information systems.

Keywords: Information systems, knowledge, knowledge distance, knowledge granulation

1. Introduction

As a recently renewed research topic, granular computing (GrC) is an umbrella term covering any theories, methodologies, techniques, and tools that make use of granules in problem solving (Zadeh 1996, Zadeh 1997, Zadeh 1998). Basic ideas of GrC have appeared in related fields, such as interval analysis, rough set theory, cluster analysis, machine learning, databases, and many others (Zadeh 1979). Zadeh (1997) identified three basic concepts that underlie the process of human cognition, namely, granulation, organization, and causation. A granule is a clump of objects (points) in the universe of discourse, drawn together by indistinguishability, similarity, proximity, or functionality. In situations involving incomplete, uncertain, or vague information, it may be difficult to differentiate different elements, and it is instead convenient to consider granules, i.e., clumps or groups of indiscernible elements, for performing operations. Although detailed information may be available, it may be sufficient to use granules in order to have an efficient and practical solution. Very precise solutions may not be required for many practical problems, the acquisition of precise information may be too costly, and coarse-grained information reduces cost. There is clearly a need for systematic studies of granular computing.

∗ This work was supported by the National Natural Science Foundation of China under Grant Numbers 60773133, 70471003, and 60573074, the High Technology Research and Development Program of China under Grant No. 2007AA01Z165, the Foundation of Doctoral Program Research of the Ministry of Education of China under Grant No. 20050108004, and the Key Project of Science and Technology Research of the Ministry of Education of China.
© Systems Engineering Society of China & Springer-Verlag 2007

A general framework of granular computing was presented by Zadeh (1997) in the context of fuzzy set theory. Granules are defined by generalized constraints; examples of the constraints are equality, possibilistic, probabilistic, fuzzy, and veristic constraints. Many specific models of granular computing have also been proposed. Pawlak (1991), Polkowski and Skowron (1998), and Yao (2006) examined granular computing in connection with the theory of rough sets. Yao (1996, 2000) suggested the use of hierarchical granulations for the study of stratified rough set approximations. Lin (1998) and Yao (1999) studied granular computing using neighborhood systems. Klir (1998) investigated some basic issues of computing with granular probabilities. In the literature (Zhang and Zhang 2003), Zhang extended the theory of quotient space into the theory of fuzzy quotient space based on a fuzzy equivalence relation, in which they studied the topology relation among objects and provided a theoretical basis for fuzzy granular computing. Liang et al. (Liang and Shi 2004, Liang and Li 2005) gave a measure called knowledge granulation for measuring the uncertainty of knowledge in rough set theory from the view of granular computing. Liang and Qian (2005) studied rough set approximation based on dynamic granulation and its application to rule extraction. Qian and Liang (2006a) extended Pawlak's rough set model to a rough set model based on multi-granulations (MGRS), where the set approximations are defined by using multiple equivalence relations on the universe.

Recently, the rough set theory proposed by Pawlak (1991) has become a popular mathematical framework for granular computing. The focus of rough set theory is on the ambiguity caused by limited discernibility of objects in the domain of discourse. Its key concepts are those of object "indiscernibility" and "set approximation". The primary use of rough set theory has so far mainly been in generating logical rules for classification and prediction (Skowron and Rauszer 1992) using information granules, thereby making it a prospective tool for pattern recognition, image processing, feature selection, data mining, and knowledge discovery from large data sets. The use of rough set rules based on reducts has a significant role in feature selection and dimensionality reduction when mining large data sets, by discarding redundant features, and thereby has potential applications (Komorouski et al. 1999).

Knowledge base and indiscernibility relation are two basic concepts in Pawlak's rough set theory. Research on the uncertainty of knowledge in a knowledge base has become an important issue in recent years, and information entropy and knowledge granulation are two main approaches. For our further development, we briefly review some related researches. The entropy of a system, as defined by Shannon (1948), gives a measure of uncertainty about its actual structure. It has been a useful mechanism for characterizing uncertainty in various modes and applications in many diverse fields. Several authors have used Shannon's entropy and its variants to measure the uncertainty of knowledge in rough set theory (Beaubouef et al. 1998, Duntsch and Gediga 1998, Chakik et al. 2004). A new definition of information entropy in rough set theory was presented by Liang in the literature (Liang et al. 2002); unlike the logarithmic behavior of Shannon entropy, the gain function considered there possesses the complement nature. Combination entropy and combination granulation in incomplete information systems were proposed by Qian and Liang for measuring the uncertainty of knowledge (Qian and Liang 2006b); their gain function possesses the intuitionistic knowledge content nature. Especially, Wierman (1999) presented a well-justified measure of uncertainty, the measure of granularity, along with an axiomatic derivation. Its strong connections to the Shannon entropy and the Hartley measure of uncertainty (Hartley 1928) also lend strong support to its correctness and applicability. Furthermore, the relationships among information entropy, rough entropy and knowledge granulation in information systems were established (Liang and Shi 2004). In essence, knowledge granulation characterizes the average measure of information granules in a given partition or cover on the universe. Although the information entropy and knowledge granulation can effectively characterize the uncertainty of knowledge, they cannot characterize the difference among knowledge in a knowledge base. In fact, if the knowledge granulations or information entropies of two knowledge have the same value, then these two knowledge have the same discernibility ability in information systems. Therefore, this kind of measure cannot be used to characterize the difference between two knowledge on the universe. In many practical issues, however, we often need to distinguish any two knowledge for uncertain data processing. Thus, a more comprehensive and effective measure for depicting the difference between knowledge is desirable.

This paper aims to present an approach to measuring the difference among knowledge in an information system. The rest of this paper is organized as follows. In Section 2, some preliminary concepts, such as complete information systems, incomplete information systems, and the partial relation, are briefly recalled. In Section 3, the concept of knowledge closeness is introduced to measure the similarity between two knowledge in information systems. In Section 4, the concept of knowledge distance is presented for describing the difference among knowledge on the universe, and some of its important mathematical properties are derived. In Section 5, we establish the relationship between the knowledge distance and the knowledge granulation. Section 6 concludes the paper.

2. Preliminaries

In this section, some basic concepts are reviewed: complete information systems, incomplete information systems, and the partial relation among knowledge.


An information system is a pair S = (U, A), where
1) U is a non-empty finite set of objects;
2) A is a non-empty finite set of attributes;
3) for every a ∈ A, there is a mapping a : U → V_a, where V_a is called the value set of a.

For an information system S = (U, A), if, ∀a ∈ A, every element in V_a is a definite value, then S is called a complete information system. Each subset of attributes P ⊆ A determines a binary indistinguishability relation IND(P) given by

IND(P) = {(u, v) ∈ U × U | ∀a ∈ P, a(u) = a(v)}.

It is easily shown that IND(P) = ∩_{a∈P} IND({a}) and that U/IND(P) constitutes a partition of U. U/IND(P) is called a knowledge in U, and every equivalence class is called a knowledge granule or information granule (Liang et al. 2006). Information granulation, in some sense, denotes the average measure of the information granules (equivalence classes) in P. In general, we denote the knowledge induced by P ⊆ A by U/P.

Example 2.1 Consider the descriptions of several cars in Table 1. This is a complete information system, where U = {u1, u2, u3, u4, u5, u6} and A = {a1, a2, a3, a4}, with a1-Price, a2-Mileage, a3-Size, a4-Max-Speed. By computing, it follows that U/IND(A) = {{u1}, {u2, u6}, {u3}, {u4, u5}}.

Table 1 The complete information system about cars (Kryszkiewicz 1998, Kryszkiewicz 1999)

Car  Price  Mileage  Size     Max-Speed
u1   High   Low      Full     Low
u2   Low    High     Full     Low
u3   Low    Low      Compact  Low
u4   High   High     Full     High
u5   High   High     Full     High
u6   Low    High     Full     Low

It may happen that some of the attribute values for an object are missing. For example, in medical information systems there may exist a group of patients for which it is impossible to perform all the required tests. These missing values can be represented by the set of all possible values for the attribute, or equivalently by the domain of the attribute. To indicate such a situation, a distinguished value, a so-called null value, is usually assigned to those attributes. If V_a contains a null value for at least one attribute a ∈ A, then S is called an incomplete information system (Liang et al. 2006, Kryszkiewicz 1998, Kryszkiewicz 1999); otherwise it is called a complete information system. From now on, we will denote the null value by ∗.

Let S = (U, A) be an information system and P ⊆ A an attribute set. We define a binary relation on U as

SIM(P) = {(u, v) ∈ U × U | ∀a ∈ P, a(u) = a(v) or a(u) = ∗ or a(v) = ∗}.

In fact, SIM(P) is a tolerance relation on U. The concept of a tolerance relation has a wide variety of applications in classification (Liang et al. 2006, Kryszkiewicz 1998, Kryszkiewicz 1999). It can be easily shown that SIM(P) = ∩_{a∈P} SIM({a}).

Let U/SIM(P) denote the family of sets


{S_P(u) | u ∈ U}, the classification induced by P. A member S_P(u) of U/SIM(P) will be called a tolerance class or a granule of information. It should be noticed that the tolerance classes in U/SIM(P) do not, in general, constitute a partition of U. They constitute a cover of U, i.e., S_P(u) ≠ ∅ for every u ∈ U and ∪_{u∈U} S_P(u) = U. Of course, SIM(P) degenerates into an equivalence relation in a complete information system.

Example 2.2 Consider the descriptions of several cars in Table 2.

Table 2 The incomplete information system about cars (Kryszkiewicz 1998, Kryszkiewicz 1999)

Car  Price  Mileage  Size     Max-Speed
u1   High   Low      Full     Low
u2   Low    *        Full     Low
u3   *      *        Compact  Low
u4   High   *        Full     High
u5   *      *        Full     High
u6   Low    High     Full     *

This is an incomplete information system, where U = {u1, u2, u3, u4, u5, u6} and A = {a1, a2, a3, a4}, with a1-Price, a2-Mileage, a3-Size, a4-Max-Speed. By computing, it follows that U/SIM(A) = {S_A(u1), S_A(u2), S_A(u3), S_A(u4), S_A(u5), S_A(u6)}, where S_A(u1) = {u1}, S_A(u2) = {u2, u6}, S_A(u3) = {u3}, S_A(u4) = {u4, u5}, S_A(u5) = {u4, u5, u6}, and S_A(u6) = {u2, u5, u6}.

Of particular interest are the discrete classification U/SIM(A) = ω = {S_A(u) = {u} | u ∈ U} and the indiscrete classification U/SIM(A) = δ = {S_A(u) = U | u ∈ U}, written as just ω and δ if there is no confusion as to the domain set involved.

Now we define a partial order on the set of all classifications of U. Let S = (U, A) be an incomplete information system, P, Q ⊆ A, U/SIM(P) = {S_P(u1), S_P(u2), ..., S_P(u_|U|)} and U/SIM(Q) = {S_Q(u1), S_Q(u2), ..., S_Q(u_|U|)}. We define a partial relation ⪯ as follows:

P ⪯ Q ⇔ S_P(u) ⊆ S_Q(u), ∀u ∈ U.

When S is a complete information system, there are two partitions U/IND(P) = {P1, P2, ..., Pm} and U/IND(Q) = {Q1, Q2, ..., Qn}. Then the partial relation has the following property (Liang and Li 2005):

P ⪯ Q ⇔ for any Pi ∈ U/IND(P), there exists Qj ∈ U/IND(Q) such that Pi ⊆ Qj.
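The constructions above, the partition U/IND(P) and the tolerance classes S_P(u), can be sketched in a few lines of Python. The two tables are transcribed from Examples 2.1 and 2.2; the helper names are mine, not the paper's.

```python
# A sketch of U/IND(P) (equivalence classes, Table 1) and S_P(u) (tolerance
# classes, Table 2), with "*" standing for the null value.

from collections import defaultdict

# Table 1 (complete): columns are Price, Mileage, Size, Max-Speed
T1 = {
    "u1": ("High", "Low",  "Full",    "Low"),
    "u2": ("Low",  "High", "Full",    "Low"),
    "u3": ("Low",  "Low",  "Compact", "Low"),
    "u4": ("High", "High", "Full",    "High"),
    "u5": ("High", "High", "Full",    "High"),
    "u6": ("Low",  "High", "Full",    "Low"),
}

# Table 2 (incomplete): "*" marks a missing value
T2 = {
    "u1": ("High", "Low",  "Full",    "Low"),
    "u2": ("Low",  "*",    "Full",    "Low"),
    "u3": ("*",    "*",    "Compact", "Low"),
    "u4": ("High", "*",    "Full",    "High"),
    "u5": ("*",    "*",    "Full",    "High"),
    "u6": ("Low",  "High", "Full",    "*"),
}

def ind_partition(table, attrs):
    """U/IND(P): group objects with identical values on the attributes in P."""
    classes = defaultdict(set)
    for u, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(u)
    return sorted(sorted(c) for c in classes.values())

def sim_class(table, attrs, u):
    """S_P(u): objects whose values agree with u on P up to null values."""
    def ok(x, y):
        return x == y or x == "*" or y == "*"
    return {v for v, row in table.items()
            if all(ok(table[u][a], row[a]) for a in attrs)}

A = range(4)  # all four attributes a1..a4
print(ind_partition(T1, A))           # [['u1'], ['u2', 'u6'], ['u3'], ['u4', 'u5']]
print(sorted(sim_class(T2, A, "u5"))) # ['u4', 'u5', 'u6']
```

The two printed results match U/IND(A) in Example 2.1 and S_A(u5) in Example 2.2; note that the tolerance classes of Table 2 overlap, so they form a cover rather than a partition.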

3. Knowledge Closeness

In this section, we extend the concept of set closeness to the concept of knowledge closeness for measuring the closeness degree between two knowledge in an information system.

The tolerance classes induced by an attribute set A are described by a family of sets {S_A(u) | u ∈ U} in an incomplete information system. In fact, a complete information system is a special form of incomplete information system. Let S = (U, A) be a complete information system, U/SIM(A) = {S_A(u1), S_A(u2), ..., S_A(u_|U|)}, U/IND(A) = {X1, X2, ..., Xm} and Xi = {ui1, ui2, ..., ui_mi}, where |Xi| = si and ∑_{i=1}^{m} |Xi| = |U|. Then, the relationship between the elements in U/SIM(A) and the elements in U/IND(A) is as follows (Liang et al. 2006):

Xi = S_A(ui1) = S_A(ui2) = ... = S_A(ui_mi) and |Xi| = |S_A(ui1)| = |S_A(ui2)| = ... = |S_A(ui_mi)|.

Definition 3.1 (Yao 2001) Let A, B be two finite sets. The set closeness between A and B is defined as

H(A, B) = |A ∩ B| / |A ∪ B|, (A ∪ B) ≠ ∅,  (1)

where 0 ≤ H(A, B) ≤ 1, and we assume that H(A, B) = 1 if (A ∪ B) = ∅. If A = B, then the set closeness between A and B achieves the maximum value 1. If A ∩ B = ∅, then the set closeness between A and B achieves the minimum value 0. The set closeness denotes the measure of the similarity between two sets: the more the two sets overlap, the larger the value of H is, and vice versa.

In order to investigate the measure of the similarity between two knowledge and some of its properties, we here introduce the concept of the complement of knowledge. Let U/SIM(A) = {S_A(u1), S_A(u2), ..., S_A(u_|U|)} be the knowledge induced by attribute set A on the universe U; then the complement of this knowledge is defined as

~(U/SIM(A)) = {{u1} ∪ (U − S_A(u1)), {u2} ∪ (U − S_A(u2)), ..., {u_|U|} ∪ (U − S_A(u_|U|))}.

That is, ~(U/SIM(A)) = {{ui} ∪ (U − S_A(ui)) | i ≤ |U|}.

Corollary 3.1 The following properties hold:
1) ~(~(U/SIM(P))) = U/SIM(P),
2) ω = ~δ, δ = ~ω.

Definition 3.2 Let S = (U, A) be an information system, P, Q ⊆ A, and U/SIM(P), U/SIM(Q) two knowledge on the universe U, where U/SIM(P) = {S_P(u1), S_P(u2), ..., S_P(u_|U|)} and U/SIM(Q) = {S_Q(u1), S_Q(u2), ..., S_Q(u_|U|)}. The knowledge closeness between the knowledge U/SIM(P) and the knowledge U/SIM(Q) is defined as

H(P, Q) = (1/|U|) ∑_{i=1}^{|U|} |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)|,  (2)

where 1/|U| ≤ H(P, Q) ≤ 1. The knowledge closeness represents the measure of the similarity between two knowledge on U: the more the knowledge overlap, the larger the knowledge closeness H is, and vice versa.

Proposition 3.1 Let S = (U, A) be an information system, P, Q ⊆ A, and U/SIM(P), U/SIM(Q) two knowledge on the universe U. If U/SIM(P) = ~(U/SIM(Q)), then U/SIM(P ∪ Q) = ω.

Proof. From the definition of a tolerance relation, we have that, for arbitrary i ≤ |U|, the tolerance classes induced by ui in U/SIM(P), U/SIM(Q) and U/SIM(P ∪ Q) are S_P(ui), S_Q(ui) and S_{P∪Q}(ui), respectively. Since U/SIM(P) = ~(U/SIM(Q)), we have that

S_Q(ui) = {ui} ∪ (U − S_P(ui)) (i ≤ |U|).

Hence, for arbitrary i ≤ |U|, it follows that

S_{P∪Q}(ui) = S_P(ui) ∩ S_Q(ui) = S_P(ui) ∩ ({ui} ∪ (U − S_P(ui))) = {ui}.

Therefore, U/SIM(P ∪ Q) = {S_{P∪Q}(ui) = {ui} | i ≤ |U|} = ω. This completes the proof. ■
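Definition 3.2 and the complement construction can be sketched as follows; the tiny three-object knowledge is my own example, not from the paper, and the helper names are assumptions.

```python
# A sketch of Definition 3.2: the knowledge closeness H(P, Q) averages the
# Jaccard-style overlap of the tolerance classes S_P(u_i) and S_Q(u_i).

from fractions import Fraction

def closeness(classes_p, classes_q):
    """H(P,Q) = (1/|U|) * sum |S_P(u) & S_Q(u)| / |S_P(u) | S_Q(u)|."""
    n = len(classes_p)
    total = sum(Fraction(len(sp & sq), len(sp | sq))
                for sp, sq in zip(classes_p, classes_q))
    return total / n

U = {"u1", "u2", "u3"}
# A knowledge K and its complement ~K: {u_i} | (U - S(u_i)) for each u_i
K = [{"u1", "u2"}, {"u1", "u2"}, {"u3"}]
K_comp = [{u} | (U - s) for u, s in zip(["u1", "u2", "u3"], K)]

# Proposition 3.1: intersecting each class with its complement class leaves
# exactly {u_i}, i.e. the joined knowledge is the discrete classification omega.
print([s & c for s, c in zip(K, K_comp)])  # [{'u1'}, {'u2'}, {'u3'}]
# Between a knowledge and its complement, H attains its minimum 1/|U|
# (Proposition 3.3 below).
print(closeness(K, K_comp))                # 1/3
```

Using `Fraction` keeps the values exact, so the bounds 1/|U| ≤ H(P, Q) ≤ 1 can be checked without floating-point rounding.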

Proposition 3.2 (Maximum) Let S = (U, A) be an information system, P, Q ⊆ A, and U/SIM(P), U/SIM(Q) two knowledge on the universe U. If U/SIM(P) = U/SIM(Q), then the knowledge closeness between the knowledge U/SIM(P) and the knowledge U/SIM(Q) achieves its maximum value 1.

Proof. It is straightforward. ■

Proposition 3.3 (Minimum) Let S = (U, A) be an information system, P, Q ⊆ A, and U/SIM(P), U/SIM(Q) two knowledge on the universe U. If U/SIM(P) = ~(U/SIM(Q)), then the knowledge closeness between the knowledge U/SIM(P) and the knowledge U/SIM(Q) achieves its minimum value 1/|U|.

Proof. From the definition of a tolerance relation, we have that, for arbitrary i ≤ |U|, the tolerance classes induced by ui in U/SIM(P) and U/SIM(Q) are S_P(ui) and S_Q(ui), respectively. Since U/SIM(P) = ~(U/SIM(Q)), we have

S_Q(ui) = {ui} ∪ (U − S_P(ui)) (i ≤ |U|).

Hence, we have that

H(P, Q) = (1/|U|) ∑_{i=1}^{|U|} |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)|
= (1/|U|) ∑_{i=1}^{|U|} |S_P(ui) ∩ ({ui} ∪ (U − S_P(ui)))| / |S_P(ui) ∪ {ui} ∪ (U − S_P(ui))|
= (1/|U|) ∑_{i=1}^{|U|} |(S_P(ui) ∩ {ui}) ∪ (S_P(ui) ∩ (U − S_P(ui)))| / |S_P(ui) ∪ (U − S_P(ui))|
= (1/|U|) ∑_{i=1}^{|U|} |{ui} ∪ ∅| / |S_P(ui) ∪ (U − S_P(ui))|
= (1/|U|) ∑_{i=1}^{|U|} 1/|U|
= 1/|U|.

This completes the proof. ■

Corollary 3.2 H(ω, δ) = 1/|U|.

4. Knowledge Distance

In this section, we first introduce the concept of knowledge distance to measure the difference between two knowledge on the same universe. Then, some of its important mathematical properties are obtained. Finally, experimental analyses on two public data sets are performed for verifying the validity of this knowledge distance.

Definition 4.1 Let S = (U, A) be an information system, P, Q ⊆ A, and U/SIM(P), U/SIM(Q) two knowledge on the universe U, where U/SIM(P) = {S_P(u1), S_P(u2), ..., S_P(u_|U|)} and U/SIM(Q) = {S_Q(u1), S_Q(u2), ..., S_Q(u_|U|)}. The knowledge distance between the knowledge U/SIM(P) and the knowledge U/SIM(Q) is defined as

D(P, Q) = (1/|U|) ∑_{i=1}^{|U|} (1 − |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)|),  (3)

where 0 ≤ D(P, Q) ≤ 1 − 1/|U|. The knowledge distance denotes the measure of the difference between two knowledge on the same universe: the more the knowledge overlap, the smaller the knowledge distance D is, and vice versa.
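Definition 4.1 can be sketched directly from formula (3); the four-object universe below is my own example, and by construction D(P, Q) = 1 − H(P, Q).

```python
# A minimal sketch of the knowledge distance of Definition 4.1:
# D(P,Q) = (1/|U|) * sum (1 - |S_P(u) & S_Q(u)| / |S_P(u) | S_Q(u)|).

from fractions import Fraction

def distance(classes_p, classes_q):
    """Average of one minus the per-object overlap ratio."""
    n = len(classes_p)
    return sum(1 - Fraction(len(sp & sq), len(sp | sq))
               for sp, sq in zip(classes_p, classes_q)) / n

U = {"u1", "u2", "u3", "u4"}
omega = [{u} for u in sorted(U)]     # discrete classification: S(u) = {u}
delta = [set(U) for _ in sorted(U)]  # indiscrete classification: S(u) = U

# The extremes of formula (3): D = 1 - 1/|U| between omega and delta,
# and D = 0 between a knowledge and itself.
print(distance(omega, delta))   # 3/4
print(distance(omega, omega))   # 0
```

The printed value 3/4 equals 1 − 1/|U| for |U| = 4, matching the stated upper bound of D.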


Proposition 4.1 (Maximum) Let S = (U, A) be an information system, P, Q ⊆ A, and U/SIM(P), U/SIM(Q) two knowledge on the universe U. If U/SIM(P) = ~(U/SIM(Q)), then the knowledge distance between the knowledge U/SIM(P) and the knowledge U/SIM(Q) achieves its maximum value 1 − 1/|U|.

Proof. It is straightforward. ■

Corollary 4.1 D(ω, δ) = 1 − 1/|U|.

Proof. This proof is similar to that of Proposition 3.3. ■

Proposition 4.2 (Minimum) Let S = (U, A) be an information system, P, Q ⊆ A, and U/SIM(P), U/SIM(Q) two knowledge on the universe U. If U/SIM(P) = U/SIM(Q), then the knowledge distance between the knowledge U/SIM(P) and the knowledge U/SIM(Q) achieves its minimum value 0.

Proof. It is straightforward. ■

Proposition 4.3 The knowledge distance D has the following properties:
1) D(P, Q) ≥ 0 (non-negativity),
2) D(P, Q) = D(Q, P) (symmetry).

Proof. They are straightforward. ■

Proposition 4.4 Let S = (U, A) be an information system, P, Q ⊆ A, and U/SIM(P), U/SIM(Q) two knowledge on the universe U. Then, H(P, Q) + D(P, Q) = 1.

Proof. From Definitions 3.2 and 4.1, we have that

D(P, Q) = (1/|U|) ∑_{i=1}^{|U|} (1 − |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)|)
= 1 − (1/|U|) ∑_{i=1}^{|U|} |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)|
= 1 − H(P, Q).

Hence, H(P, Q) + D(P, Q) = 1. ■

Obviously, there is a strictly mutual complement relation between the knowledge distance and the knowledge closeness in terms of Definitions 3.2 and 4.1.

Example 4.1 For Table 1, let P = {Price} and Q = {Max-Speed}. Compute the knowledge distance between P and Q.

By computing, we have that

U/IND(P) = {{u1, u4, u5}, {u2, u3, u6}},
U/IND(Q) = {{u1, u2, u3, u6}, {u4, u5}}.

If we regard Table 1 as a special incomplete information system, we can obtain the following:

U/SIM(P) = {{u1, u4, u5}, {u2, u3, u6}, {u2, u3, u6}, {u1, u4, u5}, {u1, u4, u5}, {u2, u3, u6}},
U/SIM(Q) = {{u1, u2, u3, u6}, {u1, u2, u3, u6}, {u1, u2, u3, u6}, {u4, u5}, {u4, u5}, {u1, u2, u3, u6}}.

By computing, the knowledge distance between P and Q is

D(P, Q) = (1/|U|) ∑_{i=1}^{|U|} (1 − |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)|)
= (1/6)[(1 − 1/6) + (1 − 3/4) + (1 − 3/4) + (1 − 2/3) + (1 − 2/3) + (1 − 3/4)]
= 3/8,

and the knowledge closeness between P and Q is

H(P, Q) = (1/|U|) ∑_{i=1}^{|U|} |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)|
= (1/6)(1/6 + 3/4 + 3/4 + 2/3 + 2/3 + 3/4)
= 5/8.

Therefore, H(P, Q) + D(P, Q) = 5/8 + 3/8 = 1.
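The arithmetic of Example 4.1 can be re-checked in a few lines; the tolerance classes are copied from the example, and only the helper code is mine.

```python
# Re-checking Example 4.1: for Table 1 with P = {Price} and Q = {Max-Speed},
# the knowledge distance should be 3/8 and the knowledge closeness 5/8.

from fractions import Fraction

SP = {"u1": {"u1", "u4", "u5"}, "u2": {"u2", "u3", "u6"}, "u3": {"u2", "u3", "u6"},
      "u4": {"u1", "u4", "u5"}, "u5": {"u1", "u4", "u5"}, "u6": {"u2", "u3", "u6"}}
SQ = {"u1": {"u1", "u2", "u3", "u6"}, "u2": {"u1", "u2", "u3", "u6"},
      "u3": {"u1", "u2", "u3", "u6"}, "u4": {"u4", "u5"},
      "u5": {"u4", "u5"}, "u6": {"u1", "u2", "u3", "u6"}}

n = len(SP)
# Formula (3): average of (1 - overlap ratio) over all six objects
D = sum(1 - Fraction(len(SP[u] & SQ[u]), len(SP[u] | SQ[u])) for u in SP) / n
H = 1 - D  # Proposition 4.4: H(P,Q) + D(P,Q) = 1

print(D, H)  # 3/8 5/8
```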


Proposition 4.5 Let S = (U, A) be an information system and P, Q, R ⊆ A with P ⪯ Q ⪯ R. Then D(P, R) ≥ D(P, Q) and D(P, R) ≥ D(Q, R).

Proof. If we regard S as an incomplete information system, then U/SIM(P) = {S_P(u1), S_P(u2), ..., S_P(u_|U|)}, U/IND(P) = {P1, P2, ..., Pm} and Pi = {ui1, ui2, ..., ui_mi}, where |Pi| = si and ∑_{i=1}^{m} |Pi| = |U|. In other words, complete information systems and incomplete information systems can be consistently represented. Then, the relationship between the elements in U/IND(P) and the elements in U/SIM(P) is as follows:

Pi = S_P(ui1) = S_P(ui2) = ... = S_P(ui_mi),

and similarly

Qj = S_Q(uj1) = S_Q(uj2) = ... = S_Q(uj_mj),
Rk = S_R(uk1) = S_R(uk2) = ... = S_R(uk_mk).

Since P ⪯ Q ⪯ R, we have S_P(u) ⊆ S_Q(u) ⊆ S_R(u) for arbitrary u ∈ U. So,

S_P(u) ∩ S_Q(u) = S_P(u), S_P(u) ∪ S_Q(u) = S_Q(u);
S_P(u) ∩ S_R(u) = S_P(u), S_P(u) ∪ S_R(u) = S_R(u);
S_Q(u) ∩ S_R(u) = S_Q(u), S_Q(u) ∪ S_R(u) = S_R(u).

Therefore, we have that

D(P, R) − D(P, Q)
= (1/|U|) ∑_{i=1}^{|U|} (1 − |S_P(ui) ∩ S_R(ui)| / |S_P(ui) ∪ S_R(ui)|) − (1/|U|) ∑_{i=1}^{|U|} (1 − |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)|)
= (1/|U|) ∑_{i=1}^{|U|} (|S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)| − |S_P(ui) ∩ S_R(ui)| / |S_P(ui) ∪ S_R(ui)|)
= (1/|U|) ∑_{i=1}^{|U|} (|S_P(ui)| / |S_Q(ui)| − |S_P(ui)| / |S_R(ui)|)
≥ 0.

Similarly, we have D(P, R) − D(Q, R) ≥ 0. Therefore, D(P, R) ≥ D(P, Q) and D(P, R) ≥ D(Q, R) hold. ■

Proposition 4.6 Let S = (U, A) be an information system and P, Q, R ⊆ A with P ⪯ Q ⪯ R. Then D(P, R) + D(Q, R) ≥ D(P, Q), D(P, R) + D(P, Q) ≥ D(Q, R) and D(P, Q) + D(Q, R) ≥ D(P, R).

Proof. From Proposition 4.5, one can know that D(P, R) ≥ D(P, Q) and D(P, R) ≥ D(Q, R) if P ⪯ Q ⪯ R. It is clear that D(P, R) + D(Q, R) ≥ D(P, Q) and D(P, R) + D(P, Q) ≥ D(Q, R) hold. Therefore, we just need to prove D(P, Q) + D(Q, R) ≥ D(P, R). Similar to the proof of Proposition 4.5, we can get that S_P(u) ⊆ S_Q(u) ⊆ S_R(u) if P ⪯ Q ⪯ R. That is to say, |S_P(u)| ≤ |S_Q(u)| ≤ |S_R(u)|. Therefore, we have that

D(P, Q) + D(Q, R) − D(P, R)
= (1/|U|) ∑_{i=1}^{|U|} (1 − |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)|) + (1/|U|) ∑_{i=1}^{|U|} (1 − |S_Q(ui) ∩ S_R(ui)| / |S_Q(ui) ∪ S_R(ui)|) − (1/|U|) ∑_{i=1}^{|U|} (1 − |S_P(ui) ∩ S_R(ui)| / |S_P(ui) ∪ S_R(ui)|)
= (1/|U|) ∑_{i=1}^{|U|} (|S_P(ui) ∩ S_R(ui)| / |S_P(ui) ∪ S_R(ui)| + 1 − |S_P(ui) ∩ S_Q(ui)| / |S_P(ui) ∪ S_Q(ui)| − |S_Q(ui) ∩ S_R(ui)| / |S_Q(ui) ∪ S_R(ui)|)
= (1/|U|) ∑_{i=1}^{|U|} (|S_P(ui)| / |S_R(ui)| + 1 − |S_P(ui)| / |S_Q(ui)| − |S_Q(ui)| / |S_R(ui)|).

Denote p = |S_P(ui)|, q = |S_Q(ui)| and r = |S_R(ui)|. From |S_P(u)| ≤ |S_Q(u)| ≤ |S_R(u)|, it follows that 0 < p ≤ q ≤ r ≤ |U|. Consider the function f(p, q, r) = 1 + p/r − p/q − q/r. Here, we only need to prove f(p, q, r) ≥ 0. Indeed,

f(p, q, r) = 1 + p/r − p/q − q/r = (qr + pq − pr − q²)/(qr) = (r − q)(q − p)/(qr) ≥ 0.

Hence, D(P, Q) + D(Q, R) − D(P, R) = (1/|U|) ∑_{i=1}^{|U|} f(p, q, r) ≥ 0. This completes the proof. ■

In the following, through experimental analyses, we illustrate some properties of the knowledge distance in information systems. We have downloaded two public data sets with practical applications from the UCI Repository of machine learning databases: the information system dermatology with 240 objects and the information system monks-problems with 432 objects. All condition attributes in the two data sets are discrete. We analyze the knowledge distances between the knowledge induced by all attributes of an information system and the knowledge induced by various numbers of attributes. The changes of the values of the knowledge distances with the number of attributes in these two data sets are shown in Figure 1 and Figure 2.

Figure 1 Knowledge distance with the number of attributes for dermatology

Figure 2 Knowledge distance with the number of attributes for monks-problems

It can be seen from Figure 1 and Figure 2 that the value of the knowledge distance decreases as the number of selected attributes becomes bigger in the same data set. In other words, by adding attributes, the knowledge induced by these attributes approaches the knowledge induced by all attributes in the information system, i.e., the knowledge distance between them approaches zero. Therefore, we can draw the conclusion that the knowledge distance can well characterize the difference between two knowledge in the same information system.
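The monotonicity and triangle-inequality properties of Propositions 4.5 and 4.6 can be checked numerically on a small chain of nested knowledge; the chain ω ⪯ Q ⪯ δ below is my own example, not from the paper.

```python
# A numeric check of Propositions 4.5 and 4.6 on nested knowledge
# P ⪯ Q ⪯ R, i.e. S_P(u) ⊆ S_Q(u) ⊆ S_R(u) for every object u.

from fractions import Fraction

def distance(cp, cq):
    """Knowledge distance D of formula (3), computed exactly."""
    n = len(cp)
    return sum(1 - Fraction(len(a & b), len(a | b)) for a, b in zip(cp, cq)) / n

U = ["u1", "u2", "u3", "u4"]
P = [{"u1"}, {"u2"}, {"u3"}, {"u4"}]                          # omega (finest)
Q = [{"u1", "u2"}, {"u1", "u2"}, {"u3", "u4"}, {"u3", "u4"}]  # a coarser knowledge
R = [set(U)] * 4                                              # delta (coarsest)

d_pq, d_qr, d_pr = distance(P, Q), distance(Q, R), distance(P, R)
print(d_pq, d_qr, d_pr)               # 1/2 1/2 3/4
print(d_pq + d_qr >= d_pr)            # triangle inequality (Prop. 4.6): True
print(d_pr >= d_pq and d_pr >= d_qr)  # monotonicity (Prop. 4.5): True
```

Together with non-negativity and symmetry (Proposition 4.3), the triangle inequality verified here is what makes D behave like a distance on the set of knowledge over U.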

5. Relationship between Knowledge Distance and Knowledge Granulation Knowledge granulation is an important concept of granular computing proposed by Zadeh (1997). The knowledge granulation of an information

system

gives

a

measure

of

uncertainty about its actual structure. In general, knowledge

granulation

discernibility

ability

information

systems.

can of

represent

the

knowledge

in

Especially,

several

measures in an information system closely associated with granular computing such as granulation measure, information entropy, rough entropy and knowledge granulation and their relationships were discussed (Liang et a. 2004, Liang et al. 2006). In the literature (Qian and Liang 2006b), we introduced two concepts so-called combination entropy and combination granulation to measure the uncertainty of an information system. In the literature (Liang and Qian 2006), an axiom definition of knowledge granulation was given, which gives a unified description for knowledge granulation. In this section, we will discuss the relationship between

S P ( x2 ),L , S P ( x U )}, there exists a sequence K ' (Q) of K (Q) , where

knowledge distance and knowledge granulation. Let S = (U, A) be an information system and P, Q ⊆ A, with K(P) = {S_P(x_i) | x_i ∈ U} and K(Q) = {S_Q(x_i) | x_i ∈ U}. We define a partial relation ≺ with a set-size character as follows (Liang and Qian 2006): P ≺ Q if and only if, for K(P) = {S_P(x_1), S_P(x_2), ..., S_P(x_|U|)}, there exists a sequence K'(Q) = {S_Q(x'_1), S_Q(x'_2), ..., S_Q(x'_|U|)} of K(Q) such that |S_P(x_i)| ≤ |S_Q(x'_i)| for every i. If there exists a sequence K'(Q) of K(Q) such that |S_P(x_i)| < |S_Q(x'_i)|, then we say that P is strictly granulation finer than Q, denoted by P ≺' Q.

Definition 5.1 (Liang and Qian 2006) Let S = (U, A) be an information system and G a mapping from the power set of A to the set of real numbers. We say that G is a knowledge granulation in an information system if G satisfies the following conditions:

1) G(P) ≥ 0 for any P ⊆ A (non-negativity);

2) G(P) = G(Q) for any P, Q ⊆ A if there is a bijective mapping f : K(P) → K(Q) such that |S_P(u_i)| = |f(S_P(u_i))| for every i ∈ {1, 2, ..., |U|}, where K(P) = {S_P(u_i) | u_i ∈ U} and K(Q) = {S_Q(u_i) | u_i ∈ U} (invariability);

3) G(P) < G(Q) for any P, Q ⊆ A with P ≺' Q (monotonicity).

Corollary 5.1 If P ≺ Q, then D(P, ω) ≤ D(Q, ω).

Proof. Since ω = {{u_i} | u_i ∈ U} and P ≺ Q, for every u_i we have {u_i} ⊆ S_P(u_i) ⊆ S_Q(u_i), and thus 1 ≤ |S_P(u_i)| ≤ |S_Q(u_i)|. Hence, we have

D(P, ω) = (1/|U|) Σ_{i=1}^{|U|} (1 − |S_P(u_i) ∩ {u_i}| / |S_P(u_i) ∪ {u_i}|)
        = (1/|U|) Σ_{i=1}^{|U|} (|S_P(u_i)| − 1) / |S_P(u_i)|
        ≤ (1/|U|) Σ_{i=1}^{|U|} (|S_Q(u_i)| − 1) / |S_Q(u_i)|
        = (1/|U|) Σ_{i=1}^{|U|} (1 − |S_Q(u_i) ∩ {u_i}| / |S_Q(u_i) ∪ {u_i}|)
        = D(Q, ω),

i.e., D(P, ω) ≤ D(Q, ω). This completes the proof. ■

Proposition 5.1 G1(P) = D(P, ω) is a knowledge granulation in the sense of Definition 5.1.

Proof. 1) Non-negativity is obvious.

2) Let P, Q ⊆ A. In a complete information system, U/IND(P) = {P_1, P_2, ..., P_m} and U/IND(Q) = {Q_1, Q_2, ..., Q_n} can be denoted by U/SIM(P) = {S_P(u_1), S_P(u_2), ..., S_P(u_|U|)} and U/SIM(Q) = {S_Q(u_1), S_Q(u_2), ..., S_Q(u_|U|)}. Suppose that there is a bijective mapping f : U/SIM(P) → U/SIM(Q) such that |S_P(u_i)| = |f(S_P(u_i))| (i ∈ {1, 2, ..., |U|}) and f(S_P(u_i)) = S_Q(u_{j_i}) (j_i ∈ {1, 2, ..., |U|}). Then we have

D(P, ω) = (1/|U|) Σ_{i=1}^{|U|} (1 − |S_P(u_i) ∩ {u_i}| / |S_P(u_i) ∪ {u_i}|)
        = (1/|U|) Σ_{i=1}^{|U|} (|S_P(u_i)| − 1) / |S_P(u_i)|
        = (1/|U|) Σ_{i=1}^{|U|} (|S_Q(u_{j_i})| − 1) / |S_Q(u_{j_i})|
        = D(Q, ω),

i.e., G1(P) = G1(Q).

3) Let P, Q ⊆ A with P ≺' Q. Then, for the S_P(u_i) (i ≤ |U|), there exists a sequence {S_Q(u'_1), S_Q(u'_2), ..., S_Q(u'_|U|)} such that |S_P(u_i)| < |S_Q(u'_i)|. Hence, we obtain

D(P, ω) = (1/|U|) Σ_{i=1}^{|U|} (1 − |S_P(u_i) ∩ {u_i}| / |S_P(u_i) ∪ {u_i}|)
        = (1/|U|) Σ_{i=1}^{|U|} (|S_P(u_i)| − 1) / |S_P(u_i)|
        < (1/|U|) Σ_{i=1}^{|U|} (|S_Q(u'_i)| − 1) / |S_Q(u'_i)|
        = D(Q, ω),

i.e., G1(P) < G1(Q). This completes the proof. ■

Corollary 5.2 If P ≺ Q, then D(P, δ) ≥ D(Q, δ).

Proof. Since δ = {S_δ(u_i) | S_δ(u_i) = U, u_i ∈ U} and P ≺ Q, for every u_i we have S_P(u_i) ⊆ S_Q(u_i) ⊆ U, and thus |S_P(u_i)| ≤ |S_Q(u_i)| ≤ |U|. Hence, we have

D(P, δ) = (1/|U|) Σ_{i=1}^{|U|} (1 − |S_P(u_i) ∩ U| / |S_P(u_i) ∪ U|)
        = (1/|U|) Σ_{i=1}^{|U|} (|U| − |S_P(u_i)|) / |U|
        ≥ (1/|U|) Σ_{i=1}^{|U|} (|U| − |S_Q(u_i)|) / |U|
        = (1/|U|) Σ_{i=1}^{|U|} (1 − |S_Q(u_i) ∩ U| / |S_Q(u_i) ∪ U|)
        = D(Q, δ),

i.e., D(P, δ) ≥ D(Q, δ). This completes the proof. ■
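Corollaries 5.1 and 5.2 are easy to check numerically. The following Python sketch is our own illustration (the toy universe, tolerance classes, and function names are invented, not taken from the paper's experiments): it computes the knowledge distance D(P, Q) = (1/|U|) Σ (1 − |S_P(u_i) ∩ S_Q(u_i)| / |S_P(u_i) ∪ S_Q(u_i)|) from per-object tolerance classes and confirms both inequalities.

```python
def knowledge_distance(classes_p, classes_q):
    """D(P, Q) = (1/|U|) * sum_i (1 - |S_P(u_i) ∩ S_Q(u_i)| / |S_P(u_i) ∪ S_Q(u_i)|)."""
    n = len(classes_p)
    return sum(1 - len(sp & sq) / len(sp | sq)
               for sp, sq in zip(classes_p, classes_q)) / n

# Toy universe U = {0, ..., 5}; one tolerance class S_P(u_i) per object u_i.
U = range(6)
omega = [frozenset({i}) for i in U]   # finest knowledge ω: singletons
delta = [frozenset(U) for _ in U]     # coarsest knowledge δ: the whole universe
P = [frozenset(c) for c in ([0, 1], [0, 1], [2, 3], [2, 3], [4], [5])]
Q = [frozenset(c) for c in ([0, 1, 2], [0, 1, 2], [0, 1, 2, 3], [0, 1, 2, 3], [4, 5], [4, 5])]

assert all(sp <= sq for sp, sq in zip(P, Q))  # P ≺ Q: every S_P(u_i) ⊆ S_Q(u_i)
print(knowledge_distance(P, omega) <= knowledge_distance(Q, omega))  # Corollary 5.1: True
print(knowledge_distance(P, delta) >= knowledge_distance(Q, delta))  # Corollary 5.2: True
```

On this example D(P, ω) = 1/3 while D(Q, ω) ≈ 0.639, so the finer knowledge lies closer to ω and farther from δ, exactly as the two corollaries predict.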

Proposition 5.2 G2(P) = 1 − 1/|U| − D(P, δ) is a knowledge granulation in the sense of Definition 5.1.

Proof. 1) Non-negativity is obvious, since D(P, δ) ≤ 1 − 1/|U|.

2) Let P, Q ⊆ A. In a complete information system, U/IND(P) = {P_1, P_2, ..., P_m} and U/IND(Q) = {Q_1, Q_2, ..., Q_n} can be denoted by U/SIM(P) = {S_P(u_1), S_P(u_2), ..., S_P(u_|U|)} and U/SIM(Q) = {S_Q(u_1), S_Q(u_2), ..., S_Q(u_|U|)}. Suppose that there is a bijective mapping f : U/SIM(P) → U/SIM(Q) such that |S_P(u_i)| = |f(S_P(u_i))| (i ∈ {1, 2, ..., |U|}) and f(S_P(u_i)) = S_Q(u_{j_i}) (j_i ∈ {1, 2, ..., |U|}). Then we have

G2(P) = 1 − 1/|U| − D(P, δ)
      = 1 − 1/|U| − (1/|U|) Σ_{i=1}^{|U|} (1 − |S_P(u_i) ∩ U| / |S_P(u_i) ∪ U|)
      = 1 − 1/|U| − (1/|U|) Σ_{i=1}^{|U|} (|U| − |S_P(u_i)|) / |U|
      = 1 − 1/|U| − (1/|U|) Σ_{i=1}^{|U|} (|U| − |S_Q(u_{j_i})|) / |U|
      = 1 − 1/|U| − D(Q, δ)
      = G2(Q),

i.e., G2(P) = G2(Q).

3) Let P, Q ⊆ A with P ≺' Q. Then, for the S_P(u_i) (i ≤ |U|), there exists a sequence {S_Q(u'_1), S_Q(u'_2), ..., S_Q(u'_|U|)} such that |S_P(u_i)| < |S_Q(u'_i)|. Hence, we obtain

G2(P) = 1 − 1/|U| − (1/|U|) Σ_{i=1}^{|U|} (1 − |S_P(u_i) ∩ U| / |S_P(u_i) ∪ U|)
      = 1 − 1/|U| − (1/|U|) Σ_{i=1}^{|U|} (|U| − |S_P(u_i)|) / |U|
      < 1 − 1/|U| − (1/|U|) Σ_{i=1}^{|U|} (|U| − |S_Q(u'_i)|) / |U|
      = 1 − 1/|U| − D(Q, δ)
      = G2(Q),

i.e., G2(P) < G2(Q). This completes the proof. ■

Propositions 5.1 and 5.2 show that D(P, ω) and 1 − 1/|U| − D(P, δ) are both special forms of knowledge granulation in the sense of Definition 5.1, and can be used to measure the uncertainty of the knowledge induced by an attribute set P ⊆ A from the viewpoint of granular computing.

6. Conclusion

From the viewpoint of granular computing, information entropy and knowledge granulation can measure the discernibility ability of knowledge on the universe. These two kinds of measures, however, cannot felicitously characterize the difference between two knowledge structures that share the same value of information entropy or knowledge granulation. For this reason, a new measure, the so-called knowledge distance, has been introduced for information systems. We have shown how this measure characterizes the difference among knowledge through several important properties and through experimental analyses on two public data sets. Furthermore, we have pointed out the relationship between the knowledge distance and knowledge granulation. With the above discussions, we have developed a theoretical foundation for measuring knowledge distance in information systems for further research.
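The two granulation forms established in Propositions 5.1 and 5.2 can also be verified numerically. The sketch below is our own illustration (the toy data and helper names are invented, not part of the paper's experiments); it uses the closed forms that appear in the proofs and checks the monotonicity condition 3) of Definition 5.1 under a strict refinement P ≺' Q.

```python
def g1(classes, universe):
    """G1(P) = D(P, ω) = (1/|U|) Σ (|S_P(u_i)| - 1) / |S_P(u_i)| (closed form from the proof)."""
    n = len(universe)
    return sum((len(s) - 1) / len(s) for s in classes) / n

def g2(classes, universe):
    """G2(P) = 1 - 1/|U| - D(P, δ), with D(P, δ) = (1/|U|) Σ (|U| - |S_P(u_i)|) / |U|."""
    n = len(universe)
    d_delta = sum((n - len(s)) / n for s in classes) / n
    return 1 - 1 / n - d_delta

U = list(range(6))
P = [frozenset(c) for c in ([0, 1], [0, 1], [2, 3], [2, 3], [4], [5])]
Q = [frozenset(c) for c in ([0, 1, 2], [0, 1, 2], [0, 1, 2, 3], [0, 1, 2, 3], [4, 5], [4, 5])]

# P ≺' Q here: every tolerance class of P is strictly smaller than the matched class of Q.
assert all(len(sp) < len(sq) for sp, sq in zip(P, Q))
# Monotonicity (condition 3 of Definition 5.1) holds for both granulation forms:
print(g1(P, U) < g1(Q, U))  # True
print(g2(P, U) < g2(Q, U))  # True
```

Both values are also non-negative on this example (G1(P) = 1/3, G2(P) = 1/9), consistent with condition 1) of Definition 5.1.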

References

[1] Beaubouef, T., Petry, F.E. & Arora, G. (1998). Information-theoretic measures of uncertainty for rough sets and rough relational databases. Information Sciences, 109: 535-563

[2] Chakik, F.E., Shahine, A., Jaam, J. & Hasnah, A. (2004). An approach for constructing complex discriminating surfaces based on Bayesian interference of the maximum entropy. Information Sciences, 163: 275-291

[3] Duntsch, I. & Gediga, G. (1998). Uncertainty measures of rough set prediction. Artificial Intelligence, 106: 109-137

[4] Hartley, R.V.L. (1928). Transmission of information. The Bell System Technical Journal, 7: 535-563

[5] Klir, G.J. (1998). Basic issues of computing with granular computing. In: Proceedings of 1998 IEEE International Conference on Fuzzy Systems, 101-105

[6] Komorowski, J., Pawlak, Z., Polkowski, L. & Skowron, A. (1999). Rough sets: a tutorial. In: Pal, S.K. & Skowron, A. (eds.), Rough Fuzzy Hybridization: A New Trend in Decision-Making, 3-98. Springer, Singapore

[7] Kryszkiewicz, M. (1998). Rough set approach to incomplete information systems. Information Sciences, 112: 39-49

[8] Kryszkiewicz, M. (1999). Rules in incomplete information systems. Information Sciences, 113: 271-292

[9] Liang, J.Y., Chin, K.S., Dang, C.Y. & Yam, C.M. (2002). A new method for measuring uncertainty and fuzziness in rough set theory. International Journal of General Systems, 31 (4): 331-342

[10] Liang, J.Y. & Li, D.Y. (2005). Uncertainty and Knowledge Acquisition in Information Systems. Science Press, Beijing, China

[11] Liang, J.Y. & Shi, Z.Z. (2004). The information entropy, rough entropy and knowledge granulation in rough set theory. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 12 (1): 37-46

[12] Liang, J.Y., Shi, Z.Z., Li, D.Y. & Wierman, M.J. (2006). The information entropy, rough entropy and knowledge granulation in incomplete information system. International Journal of General Systems, 35 (6): 641-654

[13] Liang, J.Y. & Qian, Y.H. (2005). Rough set approximation under dynamic granulation. Lecture Notes in Artificial Intelligence, 3641: 701-708

[14] Liang, J.Y. & Qian, Y.H. (2006). Axiomatic approach of knowledge granulation in information system. Lecture Notes in Artificial Intelligence, 4304: 1074-1078

[15] Lin, T.Y. (1998). Granular computing on binary relations I: data mining and neighborhood systems, II: rough sets representations and belief functions. In: Polkowski, L. & Skowron, A. (eds.), Rough Sets in Knowledge Discovery I, pp. 107-140

[16] Pawlak, Z. (1991). Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht

[17] Polkowski, L. & Skowron, A. (1998). Towards adaptive calculus of granules, granularity of knowledge, indiscernibility and rough sets. In: Proceedings of 1998 IEEE International Conference on Fuzzy Systems, 111-116

[18] Qian, Y.H. & Liang, J.Y. (2006a). Combination entropy and combination granulation in incomplete information system. Lecture Notes in Artificial Intelligence, 4062: 184-190

[19] Qian, Y.H. & Liang, J.Y. (2006b). Rough set method based on multi-granulations. In: Proceedings of the 5th IEEE International Conference on Cognitive Informatics, I: 297-304

[20] Shannon, C.E. (1948). The mathematical theory of communication. The Bell System Technical Journal, 27 (3, 4): 373-423

[21] Skowron, A. & Rauszer, C. (1992). The discernibility matrices and functions in information systems. In: Slowinski, R. (ed.), Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory, pp. 331-362. Kluwer Academic, Dordrecht

[22] Wierman, M.J. (1999). Measuring uncertainty in rough set theory. International Journal of General Systems, 28 (4): 283-297

[23] Yao, Y.Y. (2001). Information granulation and rough set approximation. International Journal of Intelligent Systems, 16: 87-104

[24] Yao, Y.Y. (2006). Three perspectives of granular computing. In: Proceedings of the International Forum on Granular Computing from the Rough Set Perspectives

[25] Yao, Y.Y. (1996). Two views of the theory of rough sets in finite universes. International Journal of Approximate Reasoning, 15 (4): 291-319

[26] Yao, Y.Y. (2000). Granular computing: basic issues and possible solutions. In: Proceedings of the Fifth International Conference on Computing and Information, I: 186-189

[27] Yao, Y.Y. (1999). Stratified rough sets and granular computing. In: Proceedings of the 18th International Conference of the North American Fuzzy Information Processing Society, 800-804

[28] Zadeh, L.A. (1979). Fuzzy sets and information granularity. In: Gupta, N., Ragade, R., Yager, R., et al. (eds.), Advances in Fuzzy Set Theory and Application, pp. 3-18. North-Holland, Amsterdam

[29] Zadeh, L.A. (1996). Fuzzy logic = computing with words. IEEE Transactions on Fuzzy Systems, 4 (1): 103-111

[30] Zadeh, L.A. (1997). Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic. Fuzzy Sets and Systems, 90: 111-127

[31] Zadeh, L.A. (1998). Some reflections on soft computing, granular computing and their roles in the conception, design and utilization of information/intelligent systems. Soft Computing, 2 (1): 23-25

[32] Zhang, L. & Zhang, B. (2003). Theory of fuzzy quotient space (methods of fuzzy granular computing). Journal of Software (in Chinese), 14 (4): 770-776

Yuhua Qian is a doctoral student in the School of Computer and Information Technology at Shanxi University, China. His research interests include rough set theory, granular computing and artificial intelligence. He received the M.S. degree in computers with applications from Shanxi University in 2005.

Jiye Liang is a professor in the School of Computer and Information Technology and the Key Laboratory of Computational Intelligence and Chinese Information Processing of the Ministry of Education at Shanxi University. His research interests include artificial intelligence, granular computing, data mining and knowledge discovery. He received the Ph.D. degree in information science from Xi'an Jiaotong University, where he also received a B.S. in computational mathematics.

Chuangyin Dang received a Ph.D. degree in operations research/economics from the University of Tilburg, The Netherlands, in 1991, an M.S. degree in applied mathematics from Xidian University, China, in 1986, and a B.S. degree in computational mathematics from Shanxi University, China, in 1983. He is an Associate Professor at the City University of Hong Kong. He is best known for the development of the D1-triangulation of the Euclidean space and the simplicial method for integer programming. His current research interests include computational intelligence, optimization theory and techniques, and applied general equilibrium modeling and computation. He is a senior member of IEEE and a member of INFORMS and MPS.

Feng Wang is a postgraduate in the School of Computer and Information Technology at Shanxi University. Her research interests include granular computing and rough set theory.

Wei Xu is a doctoral student in the School of Management, Graduate University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, China. His research interests include rough set theory and rough prediction. He received the M.S. degree in computers with applications from Shanxi University in 2006.