Hash Function Learning via Codewords†

Yinjie Huang∗, Michael Georgiopoulos∗ and Georgios C. Anagnostopoulos∗∗

(∗) Machine Learning Laboratory, University of Central Florida. (∗∗) ICE Laboratory, Florida Institute of Technology. (†) Presented at ECML/PKDD, Porto, Portugal, Sep 7th-11th, 2015.

ABSTRACT

Employing hash functions that transform the original data into compact binary codes can accelerate retrieval by using Approximate Nearest-Neighbor (ANN) search. We propose a novel hash function learning approach, *Supervised Hash Learning (*SHL), which offers two main advantages. First, it considers a set of learned Hamming-space codewords in order to capture the data's intrinsic similarities; thanks to these codewords, *SHL can naturally engage supervised, unsupervised and semi-supervised learning tasks. Second, the minimization problem of *SHL naturally leads to a set of Support Vector Machine (SVM) problems. Experiments on 5 benchmark datasets, in comparison with 6 other state-of-the-art methods, show that *SHL is highly competitive. *SHL code is available at http://www.eecs.ucf.edu/~yhuang/.

INTRODUCTION

Content-based image retrieval (CBIR) has attracted considerable attention in recent years. Two major challenges in CBIR are:

• Storage space complexity
  – Large images typically have big memory footprints.
  – Large image databases may require exorbitant amounts of storage.

• Run-time complexity
  – Nearest-neighbor search: exhaustively comparing each query datum to each sample in a very large database is impractical.

In this paper we introduce a novel hash learning framework that has two main distinguishing features when compared to past approaches. First, it utilizes codewords in the Hamming space as ancillary means to accomplish its hash learning task. These codewords, which are inferred from the data, attempt to capture similarity aspects of the data's hash codes. Second, and more importantly, the same framework is capable of addressing supervised, unsupervised and even semi-supervised hash learning tasks in a natural manner. A series of comparative experiments focusing on content-based image retrieval highlights its performance advantages.

Figure 1: CBIR through Hash Function Learning.

FORMULATION

*SHL attempts to reduce the distortion measure

$$E(\omega) \triangleq \sum_{n \in \mathcal{N}_L} d\big(h(\mathbf{x}_n), \boldsymbol{\mu}_{l_n}\big) + \sum_{n \in \mathcal{N}_U} \min_g \, d\big(h(\mathbf{x}_n), \boldsymbol{\mu}_g\big) \qquad (1)$$

where $\mathcal{N}_L$ and $\mathcal{N}_U$ are the index sets of the labeled and unlabeled data respectively, and $h(\mathbf{x}) \triangleq \operatorname{sgn} \mathbf{f}(\mathbf{x}) \in \mathbb{H}^B$ for a sample $\mathbf{x} \in \mathcal{X}$. Here, $\mathbf{f}(\mathbf{x}) \triangleq [f_1(\mathbf{x}) \, \ldots \, f_B(\mathbf{x})]^T$, where

$$f_b(\mathbf{x}) \triangleq \langle w_b, \phi(\mathbf{x}) \rangle_{\mathcal{H}_b} + \beta_b, \quad w_b \in \Omega_{w_b} \triangleq \{ w_b \in \mathcal{H}_b : \| w_b \|_{\mathcal{H}_b} \leq R_b \}, \; R_b > 0,$$

and $\beta_b \in \mathbb{R}$ for all $b \in \mathbb{N}_B \triangleq \{1, \ldots, B\}$.

Figure 2: *SHL learns a hash function $h : \mathcal{X} \to \mathbb{H}^B$ and a set of $G$ labeled codewords $\boldsymbol{\mu}_g$, $g \in \mathbb{N}_G$ (each codeword representing a class), so that the hash code of a labeled sample is mapped close to the codeword corresponding to the sample's class label.
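To make the objective concrete, the following minimal NumPy sketch (not the authors' released *SHL code) evaluates the distortion of Eq. (1) for a given encoder and codeword set. It assumes a linear stand-in for $\mathbf{f}$, plain Hamming distance for $d$, and the convention that unlabeled samples carry the label -1; all function names are illustrative.

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two {-1, +1}-valued codes of length B."""
    return int(np.sum(a != b))

def hash_codes(X, W, beta):
    """h(x) = sgn(f(x)), with a linear stand-in f_b(x) = w_b^T x + beta_b."""
    F = X @ W + beta                      # (N, B) real-valued scores
    return np.where(F >= 0, 1, -1)

def distortion(X, labels, codewords, W, beta):
    """Eq. (1): labeled samples are compared to their class codeword,
    unlabeled samples (label == -1 here) to their closest codeword."""
    H = hash_codes(X, W, beta)
    E = 0.0
    for h, l in zip(H, labels):
        if l >= 0:                        # labeled: d(h(x_n), mu_{l_n})
            E += hamming(h, codewords[l])
        else:                             # unlabeled: min_g d(h(x_n), mu_g)
            E += min(hamming(h, mu) for mu in codewords)
    return E

# Toy usage: 6 samples, 2 classes, B = 4 bits.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))
labels = np.array([0, 0, 1, 1, -1, -1])   # -1 marks unlabeled samples
W, beta = rng.normal(size=(5, 4)), rng.normal(size=4)
codewords = np.where(rng.normal(size=(2, 4)) >= 0, 1, -1)
print("E =", distortion(X, labels, codewords, W, beta))
```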

*SHL training minimizes the following $B$ independent cost-function majorizers (binary SVM problems):

$$\inf_{\substack{w_{b,m} \in \mathcal{H}_m,\, m \in \mathbb{N}_M,\\ \beta_b \in \mathbb{R},\, \boldsymbol{\theta}_b \in \Omega_\theta,\, \mu_{g,b} \in \mathbb{H}}} \; C \sum_g \sum_n \gamma'_{g,n} \big[ 1 - \mu_{g,b} f_b(\mathbf{x}_n) \big]_+ + \frac{1}{2} \sum_m \frac{\| w_{b,m} \|^2_{\mathcal{H}_m}}{\theta_{b,m}}, \qquad b \in \mathbb{N}_B$$

where $f_b(\mathbf{x}) = \sum_m \langle w_{b,m}, \phi_m(\mathbf{x}) \rangle_{\mathcal{H}_m} + \beta_b$ and the $\mathcal{H}_m$ are RKHSs used for Multiple Kernel Learning (MKL).

ALGORITHM

The minimization above leads to a block coordinate descent algorithm consisting of three blocks:

1. By considering $w_{b,m}$ and $\beta_b$ for each $b$ as a single block, this step leads to a set of SVM problems.
2. A closed-form solution can be obtained for the MKL coefficients.
3. Optimal codewords can be found via simple substitution.
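Below is a simplified, single-kernel sketch of this three-block scheme (labeled data only, a linear feature map, and therefore no MKL-coefficient block), using scikit-learn for the per-bit SVMs. It is not the authors' implementation: in particular, the codeword update shown, which keeps for each class and bit whichever sign yields the smaller hinge loss, is one straightforward reading of the "simple substitution" step.

```python
import numpy as np
from sklearn.svm import LinearSVC

def bcd_shl(X, y, B=8, n_iters=5, C=1.0, seed=0):
    """Simplified block coordinate descent for *SHL on labeled data.
    Block 1: one SVM per bit; Block 2 (MKL weights) omitted; Block 3: codewords."""
    rng = np.random.default_rng(seed)
    G = int(y.max()) + 1
    mu = np.where(rng.normal(size=(G, B)) >= 0, 1, -1)     # random initial codewords
    W, beta = np.zeros((X.shape[1], B)), np.zeros(B)

    for _ in range(n_iters):
        # Block 1: for each bit b, fit an SVM whose targets are the codeword bits.
        for b in range(B):
            t = mu[y, b]                                    # target bit per sample
            if np.unique(t).size < 2:                       # SVM needs both classes
                W[:, b], beta[b] = 0.0, float(t[0])
                continue
            svm = LinearSVC(C=C, loss="hinge", max_iter=10000).fit(X, t)
            W[:, b], beta[b] = svm.coef_.ravel(), float(svm.intercept_[0])

        # Block 3: per class g and bit b, keep the codeword sign with smaller hinge loss.
        F = X @ W + beta
        for g in range(G):
            Fg = F[y == g]
            loss_pos = np.maximum(0.0, 1.0 - Fg).sum(axis=0)   # mu_{g,b} = +1
            loss_neg = np.maximum(0.0, 1.0 + Fg).sum(axis=0)   # mu_{g,b} = -1
            mu[g] = np.where(loss_pos <= loss_neg, 1, -1)
    return W, beta, mu
```

A query would then be encoded as `np.where(x @ W + beta >= 0, 1, -1)` and ranked against database codes by Hamming distance.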

CONCENTRATION GUARANTEES

Theorem 1. Assume reproducing kernels of $\{\mathcal{H}_b\}_{b=1}^B$ s.t. $k_b(\mathbf{x}, \mathbf{x}') \leq r^2$, $\forall \mathbf{x}, \mathbf{x}' \in \mathcal{X}$. Then for a fixed value of $\rho > 0$, for any $\mathbf{f} \in \bar{\mathcal{F}}$, any $\{\boldsymbol{\mu}_l\}_{l=1}^G$, $\boldsymbol{\mu}_l \in \mathbb{H}^B$, and any $\delta > 0$, with probability $1 - \delta$ it holds that:

$$er(\mathbf{f}, \boldsymbol{\mu}_l) \leq \hat{er}_\rho(\mathbf{f}, \boldsymbol{\mu}_l) + \frac{2r}{\rho B \sqrt{N}} \sum_b R_b + \sqrt{\frac{\log \frac{1}{\delta}}{2N}} \qquad (2)$$

where $er(\mathbf{f}, \boldsymbol{\mu}_l) \triangleq \frac{1}{B} E\{ d(\operatorname{sgn}(\mathbf{f}(\mathbf{x})), \boldsymbol{\mu}_l) \}$, $l \in \mathbb{N}_G$ is the true label of $\mathbf{x} \in \mathcal{X}$, and $\hat{er}_\rho(\mathbf{f}, \boldsymbol{\mu}_l) \triangleq \frac{1}{NB} \sum_{n,b} Q_\rho\big(f_b(\mathbf{x}_n) \mu_{l_n,b}\big)$, where $Q_\rho(u) \triangleq \min\big\{ 1, \max\big\{ 0, 1 - \tfrac{u}{\rho} \big\} \big\}$.
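As a quick numeric illustration of how the bound in Eq. (2) is assembled (with made-up values for $r$, $\rho$, $R_b$, $N$, $B$ and the empirical margin risk, since none are reported here):

```python
import numpy as np

# Hypothetical values, only to show the arithmetic of Eq. (2).
N, B = 5000, 32                  # training samples, code length in bits
r, rho, delta = 1.0, 0.5, 0.05   # kernel bound, margin parameter, confidence level
R = np.full(B, 1.0)              # norm bounds R_b for each hash function
emp_margin_risk = 0.12           # placeholder for the empirical term er_hat_rho(f, mu_l)

complexity_term = (2 * r) / (rho * B * np.sqrt(N)) * R.sum()
confidence_term = np.sqrt(np.log(1 / delta) / (2 * N))
print("er(f, mu_l) <=", emp_margin_risk + complexity_term + confidence_term)
```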

EXPERIMENTS

We compare *SHL with Supervised Hashing with Kernels (KSH), Binary Reconstructive Embedding (BRE), single-layer Anchor Graph Hashing (1-AGH) and its two-layer version (2-AGH), Spectral Hashing (SPH) and Locality-Sensitive Hashing (LSH). Retrieval accuracy for the $s$ closest hash codes, where $s \in \{10, 15, \ldots, 50\}$, and precision-recall (PR) curves are shown below.
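For reference, here is a small NumPy sketch (an assumed evaluation protocol, not necessarily the exact script behind Figure 3) of how top-$s$ retrieval precision and a precision-recall curve can be computed from Hamming rankings of binary codes:

```python
import numpy as np

def hamming_rank(query_code, db_codes):
    """Database indices sorted by Hamming distance to the query code."""
    return np.argsort(np.sum(db_codes != query_code, axis=1), kind="stable")

def top_s_precision(query_code, query_label, db_codes, db_labels, s=10):
    """Fraction of the s closest codes that share the query's label."""
    idx = hamming_rank(query_code, db_codes)[:s]
    return float(np.mean(db_labels[idx] == query_label))

def pr_curve(query_code, query_label, db_codes, db_labels):
    """Precision and recall after each retrieved item, in ranked order."""
    idx = hamming_rank(query_code, db_codes)
    relevant = (db_labels[idx] == query_label).astype(float)
    tp = np.cumsum(relevant)
    precision = tp / np.arange(1, len(relevant) + 1)
    recall = tp / max(relevant.sum(), 1.0)
    return precision, recall
```

Averaging these quantities over all queries gives curves of the kind shown in Figure 3.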

Figure 3: Top-$s$ retrieval results and PR curves on the MNIST and CIFAR-10 datasets for *SHL and the 6 other hashing algorithms.


Figure 4: Qualitative results on CIFAR-10. The query image is "Car"; rows correspond to *SHL, KSH, LSH, SPH, BRE, 1-AGH and 2-AGH. The remaining 15 images in each row were retrieved using 45-bit binary codes generated by the different hashing algorithms.

REFERENCES

[1] Kulis, B., Darrell, T.: Learning to hash with binary reconstructive embeddings. NIPS 2009, pp. 1042-1050.
[2] Liu, W., Wang, J., Ji, R., Jiang, Y.G., Chang, S.F.: Supervised hashing with kernels. CVPR 2012, pp. 2074-2081.
[3] Weiss, Y., Torralba, A., Fergus, R.: Spectral hashing. NIPS 2008, pp. 1753-1760.
[4] Liu, W., Wang, J., Kumar, S., Chang, S.F.: Hashing with graphs. ICML 2011, pp. 1-8.
[5] Kloft, M., Brefeld, U., Sonnenburg, S., Zien, A.: lp-norm multiple kernel learning. Journal of Machine Learning Research 12 (July 2011), pp. 953-997.

ACKNOWLEDGMENTS

Y. Huang acknowledges support from a Trustee Fellowship provided by the Graduate College of the University of Central Florida. M. Georgiopoulos acknowledges partial support from NSF grants No. 0806931, No. 0963146, No. 1200566, No. 1161228 and No. 1356233. G.C. Anagnostopoulos acknowledges partial support from NSF grant No. 1263011.