Mining Hyperclique Patterns with Confidence Pruning
Hui Xiong
Pang-Ning Tan
Vipin Kumar
Computer Science Department, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA
% X /x.+^ Bjo+ $(*X + ) l%4[ &! "yYz DR{DG| DE}D z)~D9z5D_z Dq 57!:+ + "[ +?@=+ &, l *+ $(*F, / f F:% F+ 6/ * o, 6 B ,G . $/ < +)79 # $"* $% !9 l%&8:, jy | - &S " 5H Ol'M '6 $(E, / +>7
ABSTRACT
! "#" $% &'% )(* + % # , ,-*. 0/ +1, " +"234 5(*+6$+ +. " , + )798:$% " %;% + J + + 7LKM% ;, / N/-++ !+O(*+1& = $)CB /-^+ % 4z5·=7s:)HMl(+D %= 5Hn , ,G .C% + % *$# " + +&)(O , + @$(* $( ";$+! Be *N $QR+ +*& , ,- .#$(*+)7]Kq3 Be2L% J & +a &)D-% +% a 66 .^/-O/ #b&+$+$2> 5(*+^ + +. "^, j(*+.2X5HT , ,- .% + % *$ )7jKq^$B2 % C ++ 6 a +!+*)D% 9+l% $a C6 q/-M/ ', )HM)2=& .% Bp .+O% 4Be +a +^, + [" + + "=" $% !' % ;8:, IrF% [D:, .+ $2n>5H $(*+> B , ,-*.)7
}7 c ! + /GOdHF;,-+9, , ) @ BC! ";%2 ,G+ a , ` + "&) ;O ++T J T% >! 6 + J + +? B`G,G / : +C% M+ #/G^o l+=Be " $(*+O, E79µ^O% . "*Y2;$&, $+ % »¤$&¼l7LKBe !(*+.lo [% +?(*+.++@$ J +LE /l9PmRþ « ´ ooo, v ´ s ´ qp 7
2.2 Problem Statement
c M)!Be #F% M%*2,-+ +a /G@+ % @ Bq% @/ +$ " * Y% [7 c 6 &?% :, "= "&% %+ J + + % +% )!/G', , &?,G . , + "@.+,!Be j% + " * Y% &+7
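A minimal sketch of the mining task, assuming the standard definition of the h-confidence of an itemset P as supp(P) divided by the largest support of any single item in P, and assuming the goal is to report every itemset whose support and h-confidence both meet user-chosen thresholds. The function names, the toy transactions, and the brute-force enumeration are illustrative assumptions and are not the paper's algorithm.

```python
from fractions import Fraction
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return Fraction(hits, len(transactions))


def h_confidence(itemset, transactions):
    """Assumed definition: supp(P) / max support of a single item in P."""
    max_item_supp = max(support([i], transactions) for i in itemset)
    if max_item_supp == 0:
        return Fraction(0)
    return support(itemset, transactions) / max_item_supp


def hyperclique_patterns(transactions, min_supp, min_conf):
    """Brute force (illustrative only): test every itemset of size >= 2."""
    items = sorted(set().union(*transactions))
    found = []
    for k in range(2, len(items) + 1):
        for cand in combinations(items, k):
            if (support(cand, transactions) >= min_supp
                    and h_confidence(cand, transactions) >= min_conf):
                found.append(cand)
    return found


# Hypothetical toy data, not taken from the paper's experiments.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "c"},
    {"c", "d"},
    {"a", "c"},
]

strict = hyperclique_patterns(transactions, min_supp=0.2, min_conf=0.7)
loose = hyperclique_patterns(transactions, min_supp=0.2, min_conf=0.5)
print(len(strict), len(loose))
```

On this toy input, raising the confidence threshold from 0.5 to 0.7 at a fixed support threshold reduces the number of reported patterns. Exact fractions are used so that threshold comparisons are not affected by floating-point rounding.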
s:5HF(*++D ?)> % 5H
% M% +$*= ^HNþ ooo p 7VK
».z5¼
±u²³,³´+µ ®2= J $ 3z DHF@% )(*^% @Be*$5H'$ " Å
z
ooo, |$} } i» þ p ¼ ¼ |$} } »iþ ´ ooo qp ? ÿ p þ |$ »iþ ¼® o } } ´®° °
´
K +"2@ &$^=% , ,- . / \, "T +"2¹ +
/*2no $ "1Bp a +* , + n! "1" * Y% &+7UK*Bq% @%+ J + &) [+ #% +$,V, >% * [YoT) [, + )D` +6(*+.2 / @ B9% O, + @ O%2,-+. +a @, + +7
%¢ú®ìí ÁsÁ ÆÅ Ç %
»iþ ´ ¼öõ |$} } »iþ ¼lD+HF"*lOMl 9P7 »B' ¼sõVìí ÁhÁ î ¢ ð Å òh% õ j+á 7 ìí ÁsÁ î ®ò % S .+, 7%*2,-+ +$a >, + /X)©O / +P P% =Be )H' ">% +=+ $$* +Å>z5¼@ *!#A*z# / [ BXL[ 4$ O v 7»¤t'++ $?% >% [, L +,\ ´ r:7KM% % X* + +3% & + Be*?% $+!@>/G;þ8XD_® «üþN r 6»0~7 |¼lDþ8XDN r @«üþN ® 6»0~7 *¼lD* =þ®^DN r @«üþ58 8 6»0~7 ¶*¼lD H'% ?% ?( @$;% @, +% +" B )*$2T/- +!% ]+ .+ "L/ +¹ ]Be +a +*#, + )7 ++ EDR&%*2,-+ +a / ++$ + "&% +! ^ ^ l a $ = !+ $i23 + $*\y ¸0D/-+) #$6 X% =%2 ,-+ " , %#! + +, + +F + C ! " $+!)7qhi#+
4.2 Proofs of Completeness and Correctness
¤Â³²¤¥
-de
z 7¸` P¥ + ¤¦+ T¢6eGT£ *le¤¢e1¢6
±u²³,³´+µ KM% :+ &, l+ + BR% :%2,G+ a :&$ +C"
$% + >/G % )H'[/*2;% Be 5H+ J +*:%2,G +a , + )79KM% 'Be *O " @&$ +M*!, 2*7 Confidence-Pruning Effect Number of Hyperclique Patterns
6. EXPERIMENTAL EVALUATION
c @, +:o+ Y(@lo,-+ !+
[Figure: x-axis: Minimum Support Thresholds (0.01 to 0.025).]
µ: o,-+ !+9HF+
[Figure: The execution time of the hyperclique [...] on [...] data set. x-axis: Minimum Support Thresholds.]
Sj$"* ;}>% 5H! +! VrFs:8:t:u Be O% #¯*° ±G²³T Z )7T8:
% )H'Z4% J " DE% O 6/-+? B, + ? 5(*+ +Z/*2 !" $% 6(*+F + O*B¸| 7 *·#7Lt'l )qBe * Kj/ =z % ^ ) $24¸ 7 ·*Bj% 6$+!:% )(*6 , ,-*. + ?% L~7 | 7 c $%L; , ,- . % + % L" )+?% ~7 | DCrCs:8:t:u )T $2P$ +B21 + 6& "Z[(.2 &CBe P BM% =$&+7Zu4 +5(*++Dj% =lo++$*P! Be 'rFs:8:t:u M+ / Y2=% " % +F% ;%2,G +a @! ++D 6 % 5H'LLSj" ;½7 c Y%P%+ J + ;, " DqHF#) :%*2,-+ +$a :! + +)(C%2,G +a :, Fl( X , ,G . % +% 3+a 9[+ 7;Sj$ $$2*D_H&$ "! $"* $% !F ?O% +M5HI , ,- .o !, D HF^ + J +>6%*2,-+ +a ?, + ;*( $( "6+ +$2& ++ $+!q %O a: A'/ +.2* D a: A: ,+D* T a: A^þ5) " )D " ; " D / ++l 7K, , *l%Z) +5(*+C$+!M +[!& $6 , ,- .:% + % +a E#{·Ç [*/ Pz)¸ ~6Be +a :$+!'5(*+ " | |~= BC% O$+!)7!KM% +D`% +@ $ %*2,-+ " , %3+ + | |~6(*+.+: Lz)¸* ~*6%*2,-+ + " +)7Mµ^ 'o,G+ $&+*' + $ )!% % & 6/-+X BMBe +a +*OY+& $ + )+ z){*¶*½ ¸*[ L% = 6/G6 B%*2,-+ " , % X&z5·¡! 6 ¾ , ,- .@% +% E7 c $% 3& $6 ü%* J +4% +% ¹ ;;~*·=DCHM[ / z)~ |} }X%2,G+ a @, + % 7 KM% %2,-+ +$a Z, + V;P 5(*+'d2,- B + &, =H~ ~ ~ 7 y 5?t Ò 7 a:" Â D Â 7 _A% & ED È 7 s^ED- >8X7 " 7
o, .2=! "6(6+ +;Bp a +*: a + )7 hi C )G ebd çE ¨ c D, " +:}*} } } ¶D È !z)¸ ¸ ¸ 7 y { 7-µ^!+ A07 8:$+ $(*? + b&+ $"* $% Bp M+ +;$+! È 7- 7-®F)2*+ ; J ) »0+ l¼lÅq% + .2> > + $+ 7 ç'£-l^e -M ) c el¥#£G c ££ e eD , " + z5}{ -z)¶ ~DEz)¸*¸ 7 y ¶ 7-rF % +ED uP7 @+D-G7 S x.$HM D8X7R: )D q7 hi 2AGD t 7 u4dHM 0D È Â 7 :$#ED 4r:7 " 7 Sj$ " *+ + $ "&+ FH'Y% *% " %0 & E Ol)Å8 !& .2# Bq $+7 el0e3 «?¤ V ¸G* ¤£ 9¢6¢6e0i6 c £i£ w eGleD z*».z5¼lD u4 %3z)¸ ¸ ¶7 yYz+| 7 s^ EDG67 .2, $)D67 ? #+D [®:7-u[*/ % ++7 rF$ + "O/ +> [ + = @%2,G "*, % +7 hi
B. THE COMPLETE LIST OF CLUSTERS
[Figure: Number of patterns generated by the hyperclique miner and CHARM on the [...] data set. Panel title: Confidence-Pruning Effect. y-axis: Number of Hyperclique Patterns (10 to 1e+06, log scale); x-axis: Minimum Support Thresholds (0.0001 to 0.0003); series: min_conf = 90%, min_conf = 50%, CHARM.]
[Figure: The execution time of the hyperclique miner and CHARM on the [...] data set. y-axis: Execution Time (sec) (0 to 10); x-axis: Minimum Support Thresholds (0.0001 to 0.0003); series: min_conf = 90%, min_conf = 50%, CHARM.]
[Figure: Number of patterns generated by the hyperclique miner and CHARM on the [...] data set. Panel title: Confidence-Pruning Effect. y-axis: Number of Hyperclique Patterns (100 to 1e+09, log scale); x-axis: Minimum Support Thresholds (0 to 0.15); series: min_conf = 95%, min_conf = 85%, min_conf = 70%, CHARM.]
[Figure: The execution time of the hyperclique miner and CHARM on the [...] data set. y-axis: Execution Time (sec) (0.1 to 10000, log scale); x-axis: Minimum Support Thresholds (0 to 0.15); series: min_conf = 95%, min_conf = 85%, min_conf = 70%, CHARM.]
z5{
a:
z
^ +5(*+ +[rF .+ ®F$! :? DrChqa:+ "2#rF , D8:&+ +l :j)HM -D*^ A ^q5HF+ DrF + D *+ "2#rF , D :+ 9 / $8 '$ D s: .*;hi D- rhi D9rF !6 $d2Z9.2% D ?S ¡rC* , D9^ F" D _ _ -Dµ:.2o
" 2>rC D- $ @t'$ D Kqlo >hi rFCr hi* DX? ZrF D u4:)H^ !& D*K7 ®C " " D^5(*+&rF , Ds:)$% ) D c ! ^hi D /- rM+ rC + $i2$d2 -DE^" $ a , !+ D uZr rF , Dj^ 9hi . & DEhd®Mu -D`8:( ++Lu4$+ > Dq %HF+ D u[+
`2 % D 8:, , $XrF &, -D u4 * D 8: lH rF , DG+ J 8F -D Kq+A* $o=hi rC &, +8:Thi D!8: D!{*rF D!rC &, avrF &, -DO+ / $ 2 + D&rC+I2 +! D6@ r rC &6 +$* Ds^ -Dhi+_rF , 8:, , $+4& Ò D j§ h _ " DGu4$+ 4Kql% "2 D a^ j+!+ DEµ:+OrC , -DR$+ 3:, % + D [u[+ 2 +! DK_+/ rF , Â D ?8:^: * , D-+ " rF 7 ` D c $HF .%[rF , rF =rC Dq^5(*+6rF , D9^+[» c 7 t 7 ¼ D_s^ * .!:+ + DqhiKMKrC* ,¹¢» a:H^¼ Dju +9hi Du4 XrM D a:h -Â D a: .% + D t'$?8: D t')2% + rF , D KMt'qh a?µMj8frF , D c " +4rF ®ClA f^+A + D-S $' Bq% 8 _ D s: 2HF+$_hi D @ Be# m®F * D @ " c [9 -D !$,G rF , D 9 $ rF , D t