2009 IEEE International Advance Computing Conference (IACC 2009) Patiala, India, 6-7 March 2009

Signature Recognition using Cluster Based Global Features

H B Kekre, V A Bharadi

NMIMS University, Mumbai, India -400056

Abstract: In this paper we discuss an off-line signature recognition system designed using clustering techniques. These cluster based features are mainly morphological features; they include Walsh coefficients of pixel distributions, Vector Quantization based codeword histogram, Grid & Texture information features and Geometric centers of a signature. In this paper we discuss the extraction and performance analysis of these features. We present the FAR and FRR achieved by the system using these features. We compare individual performance and overall system performance.

We discuss four special features. They are:
1. Grid & Texture Information features
2. Walsh Coefficients of Horizontal and Vertical pixel distributions
3. Vector Quantization based Codeword histogram
4. Geometric Centers of Signature Template
These features are used and tested in the development of an off-line signature system. In this paper we provide an overview and performance analysis of the signature recognition system based on clustering techniques.

I. INTRODUCTION

Signature verification is an important research area in the field of authentication of a person [1]. We can generally distinguish between two different categories of verification systems: online, for which the signature signal is captured during the writing process, thus making the dynamic information available, and offline, for which the signature is captured once the writing process is over and thus only a static image is available. In this paper we deal with an off-line signature verification system. We design a system capable of verifying the authenticity of a signature based on tests performed with genuine signatures (Verification Mode) and of identifying a person from a signature (Recognition Mode). This system uses a special set of features extracted for groups of signature points or signature segments; these are cluster features.

Over the years, many features have been proposed to represent signatures in verification tasks. We distinguish between local features, where one feature is extracted for each sample point in the input domain, global features [1] [6], where one feature is extracted for a whole signature, based on all sample points in the input domain, and segmental features, where the signature is subdivided into segments and one feature is extracted for each segment. Here we are using Global & Segmental features.

Signature verification can be considered as a two-class pattern recognition problem [3], where the authentic user is a class and all the forgers are the second class. Feature selection refers to the process by which descriptors (features) extracted from the input-domain data are selected to provide maximal discrimination capability between classes. Here we consider only global features for verification of signatures.

II. CONVENTIONAL GLOBAL FEATURES

Pre-processing is the first step of any signature recognition system, where we normalize the signature to make it a binary template. This normalized template is then used for feature extraction. The extracted feature vector is used for comparison and classification of signatures. The performance of a signature recognition system is greatly influenced by the feature set. The pre-processing is shown in Fig. 1. The steps in pre-processing are as follows:
1. Noise Removal
2. Intensity Normalization
3. Scaling
4. Thinning

We divide the features in two types:
1. Standard Global features
2. Segmental & Cluster based features (Special features)

The conventional features [3] that are considered for system development along with cluster features are as follows:
1. Number of pixels - Total number of black pixels in a signature template.
2. Picture height - The height in pixels of the signature template after horizontal blank spaces are removed.
3. Picture width - The width in pixels of the image with vertical blank spaces removed.
4. Maximum horizontal projection - The horizontal projection histogram is calculated and the highest value of it is considered as the maximum horizontal projection.
5. Maximum vertical projection - The vertical projection of the skeletonized signature image is calculated. The highest value of the projection histogram is taken as the maximum vertical projection.

6. Baseline shift - This is the difference between the y-coordinates of the centre of mass of the left and right parts.
7. Dominant Angle feature - Dominant angle of the signature.
8. Signature surface area - Here we consider the modified tri-area feature [4].

The features discussed above are shown in Table I for the signature shown in Fig. 1. Next we discuss the special features.

TABLE I
FEATURES EXTRACTED FROM SIGNATURE SHOWN IN FIG. 1

Features (Sr. 1 to 10): Number of pixels; Picture Height (in pixels); Picture Width (in pixels); Horizontal max Projections; Vertical max Projections; Dominant Angle-normalized; Baseline Shift (in pixels); Area1; Area2; Area3
Extracted values (row order not preserved): 547; 166; 137; 47; 12; 15; 0.694; 0.151325; 0.253030; 0.062878
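The first six conventional features can be sketched in a few lines of NumPy. This is only an illustration, not the authors' implementation: the function name `conventional_features`, the convention that 1 marks a black pixel, the use of the bounding box for height and width, and the choice to split the left and right parts at the middle of the bounding box are all assumptions made here. The dominant angle and tri-area features are left out because the text does not define them.

```python
import numpy as np

def conventional_features(template):
    """Standard global features of a binary template (assumed: 1 = black pixel)."""
    t = np.asarray(template, dtype=int)
    rows = t.sum(axis=1)  # horizontal projection: black pixels per row
    cols = t.sum(axis=0)  # vertical projection: black pixels per column
    used_rows = np.flatnonzero(rows)
    used_cols = np.flatnonzero(cols)
    if used_rows.size == 0:
        raise ValueError("template contains no signature pixels")

    def mass_center_y(part):
        # y-coordinate of the centre of mass of the black pixels in a part
        ys = np.nonzero(part)[0]
        return float(ys.mean()) if ys.size else float("nan")

    # Assumption: left and right parts are split at the middle of the bounding box.
    mid = (used_cols[0] + used_cols[-1] + 1) // 2
    return {
        "pixels": int(t.sum()),
        "height": int(used_rows[-1] - used_rows[0] + 1),
        "width": int(used_cols[-1] - used_cols[0] + 1),
        "max_horizontal_projection": int(rows.max()),
        "max_vertical_projection": int(cols.max()),
        "baseline_shift": mass_center_y(t[:, :mid]) - mass_center_y(t[:, mid:]),
    }
```

Calling `conventional_features` on the thinned template returns a dictionary that could be compared with the corresponding rows of Table I.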

III. GRID & TEXTURE INFORMATION FEATURE EXTRACTION

Grid and texture features provide information about the distribution of pixels and the distribution density of the pixels [3]. Texture features provide information about the occurrence of specific patterns in the signature template. These features are not based on a single pixel or the whole signature; they are based on groups of pixels or signature segments, hence these are cluster features. The grid feature gives information about the pixel density in a segment and the texture feature gives information about the distribution of a specific pixel pattern. These features are discussed in detail here.

A. Grid information feature

We have the pre-processed signature template. We use a resolution of 200*160 pixels. To extract the grid information feature from the signature we use the following algorithm.

Algorithm for grid feature extraction
1. Divide the skeletonized image into 10 X 10 pixel blocks. We get a total of 320 blocks.
2. For each block segment, calculate the area (the sum of foreground pixels). This gives a grid feature matrix (gf) of size 20 X 16.
3. Find the minimum and maximum (min, max) values over the pixel blocks. Ignore blocks with no pixels.
4. Normalize the grid feature matrix by replacing each nonzero element e_{i,j} by

$$e_{i,j} = \frac{e_{i,j} - \min}{\max - \min} \qquad (3.1)$$

This gives a matrix with all elements within the range of 0 and 1. The results are normalized so that the lowest value (for the rectangle with the smallest number of black pixels) would be zero and the highest value (for the rectangle with the highest number of black pixels) would be one.
5. The resulting 320 elements of the matrix (gf) form the grid feature vector.

A representation of a signature image and the corresponding grid feature vector is shown in Fig. 2. A darker rectangle indicates that for the corresponding area of the skeletonized image we had the maximum number of black pixels. On the contrary, a white rectangle indicates that we had the smallest number of black pixels.

Fig. 2 Representation of grid feature.
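The grid feature algorithm above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `grid_feature` is hypothetical, the template is assumed to be a NumPy array of 160 rows by 200 columns with 1 for black pixels, and the case where all non-empty blocks hold the same count (which Equation 3.1 leaves undefined) is mapped to zero.

```python
import numpy as np

def grid_feature(template, block=10):
    """Normalized block-density vector of a binary template (assumed: 1 = black)."""
    t = np.asarray(template, dtype=int)
    h, w = t.shape
    if h % block or w % block:
        raise ValueError("template size must be a multiple of the block size")
    # Steps 1-2: sum the foreground pixels inside every block x block cell.
    gf = t.reshape(h // block, block, w // block, block).sum(axis=(1, 3)).astype(float)
    # Step 3: min and max over non-empty blocks only.
    nonzero = gf > 0
    if nonzero.any():
        lo, hi = gf[nonzero].min(), gf[nonzero].max()
        # Step 4: Equation (3.1), applied to the nonzero elements.
        if hi > lo:
            gf[nonzero] = (gf[nonzero] - lo) / (hi - lo)
        else:
            gf[nonzero] = 0.0  # assumption: degenerate case, not covered by the text
    # Step 5: flatten the matrix into the grid feature vector.
    return gf.ravel()
```

For a 200*160 template this returns 320 values in the range 0 to 1, one per block.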

B. Texture feature [3]

The texture feature gives information about the occurrence of a specific pixel pattern. To extract the texture feature group, the co-occurrence matrices of the signature image are used. In a grey-level image, the co-occurrence matrix Pd[i, j] is defined by first specifying a displacement vector d = (dx, dy) and counting all pairs of pixels separated by d and having grey level values i and j. In our case, the signature image is binary and therefore the co-occurrence matrix is a 2 X 2 matrix describing the transition of black and white pixels. Therefore, the co-occurrence matrix Pd[i, j] is defined as

$$P_d[i, j] = \begin{bmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{bmatrix}$$

where p00 is the number of times that two white pixels occur, separated by d; p01 is the number of times that a combination of a white and a black pixel occurs, separated by d; p10 is the same as p01; and p11 is the number of times that two black pixels occur, separated by d. The image is divided into eight rectangular segments (4 X 2). For each region the P(1, 0), P(1, 1), P(0, 1) and P(-1, 1) matrices are calculated and the p01 and p11 elements of these matrices are used as texture features of the signature.
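The co-occurrence counting described above can be sketched as below. The names `cooccurrence` and `texture_features` are hypothetical, 1 is assumed to mark a black pixel, and the eight segments are assumed to be laid out as 2 rows by 4 columns; the text only says "4 X 2", so the orientation is a guess.

```python
import numpy as np

# Displacement vectors d = (dx, dy) named in the text.
DISPLACEMENTS = [(1, 0), (1, 1), (0, 1), (-1, 1)]

def cooccurrence(seg, dx, dy):
    """2 x 2 co-occurrence matrix of a binary segment for displacement (dx, dy)."""
    seg = np.asarray(seg, dtype=int)
    h, w = seg.shape
    # Keep only the pixels whose displaced partner is still inside the segment.
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    a = seg[y0:y1, x0:x1]
    b = seg[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    p = np.zeros((2, 2), dtype=int)
    np.add.at(p, (a.ravel(), b.ravel()), 1)  # p[i, j] += 1 for every pixel pair
    return p

def texture_features(template, seg_rows=2, seg_cols=4):
    """p01 and p11 of every displacement for every segment (8 * 4 * 2 values)."""
    t = np.asarray(template, dtype=int)
    features = []
    for band in np.array_split(t, seg_rows, axis=0):
        for seg in np.array_split(band, seg_cols, axis=1):
            for dx, dy in DISPLACEMENTS:
                p = cooccurrence(seg, dx, dy)
                features.extend([int(p[0, 1]), int(p[1, 1])])
    return features
```

With eight segments and four displacement vectors this yields 64 texture values per signature.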

Fig. 3 Pixel positions while scanning for the displacement vector


We use this procedure to calculate the texture feature matrix. The signature template is divided into eight segments as shown in Fig. 4.

Fig. 4 Pixel positions while scanning for the displacement vector

The corresponding texture feature matrix is shown below in Fig. 5. Thus we get two matrices, one for the grid feature and one for the texture feature, as feature vectors.

Fig. 5 Texture feature matrix (representation): rows p01 and p11 for each displacement vector, columns B1 to B8 for the eight segments

IV. WALSH TRANSFORM OF PIXEL DISTRIBUTION

Here we propose a novel parameter for signature recognition. This parameter is derived from the pixel distribution of the signature [1] [3]. This is shown in Fig. 6.

Fig. 6 Signature and its horizontal and vertical pixel distributions

We consider a normalized signature template of size 256*256 pixels; this gives horizontal and vertical pixel projections of 256 elements each. We apply the Hadamard transform to the horizontal pixel distribution points (Hi) and the vertical pixel distribution points (Vi); the Hadamard transform is fast to calculate and gives moderate energy compaction. This operation gives the horizontal Hadamard coefficients (HHi) and vertical Hadamard coefficients (VHi). We use Kekre's algorithm [4] to get the Walsh coefficients from the Hadamard coefficients. This operation yields the Walsh coefficients of the histograms (SHHi, SVHi). The coefficients are plotted and shown in Fig. 7. These coefficients are calculated for the test signature as well as the standard signature, and the Euclidean distance is evaluated to measure the similarity between the two sequences of coefficients and hence the similarity of the two signatures.

V. VECTOR QUANTIZATION BASED-CODEWORD HISTOGRAM

Next we discuss a feature based on Vector Quantization of a signature template. We segment the signature into blocks to form the vectors. These vectors are represented by codewords from the codebook. Here the codebook used for mapping the image blocks plays a very important role [2][4][6]. The objective of the vector quantization here is not to compress the image but to classify the signature and verify its authenticity. Hence we extend the approach to serve our purpose. We use the codeword distribution pattern as a characteristic of the signature. The frequency of the codewords occurring in each codeword group is calculated, the histogram is plotted for the codeword group against the number of occurrences, and finally these distributions are compared using a similarity measure [2], which is based on Euclidean distance.

The normalized signature template is taken; the image size considered is 200 x 160 pixels. This image is divided into 4 x 4 pixel blocks; each block is treated as a code vector. Overall there are 2000 blocks. The codebook initially has all the 2^16 codewords; then the invalid code vectors are removed by the thinning process as shown in Fig. 8. The remaining code vectors are sorted by their appearance in the Gray code table so that consecutive codewords have minimal change. Such similar codewords are grouped to form codeword groups. The codeword histogram is generated for the given signature template and is used for comparison. This operation is illustrated in Fig. 10.

Fig. 8 Thinning operation illustrated A is input block, B is Thinned Block

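The Walsh coefficient computation of Section IV, whose output Fig. 7 plots, can be sketched as follows. Kekre's algorithm [4] is not reproduced in the text, so this sketch swaps in a simple stand-in: the rows of a Sylvester-type Hadamard matrix are sorted by their number of sign changes, which gives the sequency (Walsh) ordering. All function names are hypothetical, and the template is assumed to be a square binary array whose side is a power of two (256 in the paper).

```python
import numpy as np

def hadamard_matrix(n):
    """Sylvester-type Hadamard matrix of order n (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

def sequency_order(h):
    """Row indices sorted by number of sign changes (stand-in for Kekre's algorithm)."""
    changes = (np.diff(h, axis=1) != 0).sum(axis=1)
    return np.argsort(changes, kind="stable")

def walsh_coefficients(projection):
    """Hadamard coefficients of a projection, rearranged into sequency order."""
    x = np.asarray(projection, dtype=float)
    h = hadamard_matrix(x.size)
    return (h @ x)[sequency_order(h)]

def walsh_distance(template_a, template_b):
    """Euclidean distance between the Walsh coefficients of two binary templates."""
    def coeffs(t):
        t = np.asarray(t, dtype=float)
        horizontal = t.sum(axis=1)  # pixels per row
        vertical = t.sum(axis=0)    # pixels per column
        return np.concatenate([walsh_coefficients(horizontal),
                               walsh_coefficients(vertical)])
    return float(np.linalg.norm(coeffs(template_a) - coeffs(template_b)))
```

A distance of zero means the two templates have identical projections; how the distance is turned into an accept or reject decision depends on the user specific thresholds discussed later.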

Fig. 7 Hadamard coefficients of horizontal pixel distribution (upper plot) and their sequenced arrangement (lower plot) of the signature mentioned in Fig. 3

The thinning operation gives 11756 valid codewords. This is the set of codewords used for group formation. The next step is to form codeword groups. To form codeword groups we start with an initial codeword chosen from the set and compare it with the other codewords to find a group of codewords with minimum Hamming distance. For the current scenario we have constructed 240 groups of 50 codewords each. We form one extra group for the codewords which do not participate in any group. The observed intra-group Hamming distance is in the range 2-6, and in extreme cases a distance such as 8-9 is observed. We get the codeword histogram as shown in Fig. 9.

Fig. 9 Codeword histogram for signature shown in Fig. 4 (frequency against codeword group)

Similarity Measure [2]

Given an encoded image having a representation similar to a text document, image features can be extracted based on codeword frequency. The feature vector for signature template I1 and the feature vector for test signature I2 are given below:

$$F_1 = \{W_{11}, W_{21}, \ldots, W_{N1}\} \qquad (2)$$

For I2, it is given by

$$F_2 = \{W_{12}, W_{22}, \ldots, W_{N2}\} \qquad (3)$$

In the histogram model, W_{ij} = F_{ij}, where F_{ij} is the frequency of group C_i appearing in I_j. Thus, the feature vectors of I1 and I2 are the codeword histograms. The similarity measure is defined as

$$s(I_1, I_2) = \frac{1}{1 + dis(I_1, I_2)} \qquad (4)$$

where the distance function is

$$dis(I_1, I_2) = \sum_{i=1}^{N} \frac{|W_{i1} - W_{i2}|}{1 + W_{i1} + W_{i2}} \qquad (5)$$

The codeword histograms can be compared using the similarity measure shown in Equation (4). For all these features we have used user specific thresholds, which give better results as compared to common thresholds. We have used a set of training signatures containing 8 signatures. The thresholds are calculated by using the training procedure discussed in [1].
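The block-to-codeword mapping and the similarity measure of Equations (4) and (5) can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, each 4 x 4 block is read row by row into a 16-bit integer (the bit order is a choice made here), and the codeword groups are passed in as a ready-made dictionary `group_of` because the grouping itself depends on the thinned codebook, which is not reproduced in the text.

```python
import numpy as np

def block_codewords(template, block=4):
    """Map every block x block cell of a binary template to an integer codeword."""
    t = np.asarray(template, dtype=np.int64)
    h, w = t.shape
    if h % block or w % block:
        raise ValueError("template size must be a multiple of the block size")
    cells = t.reshape(h // block, block, w // block, block).swapaxes(1, 2)
    bits = cells.reshape(-1, block * block)              # one row of bits per block
    weights = 1 << np.arange(block * block - 1, -1, -1)  # first pixel = highest bit
    return bits @ weights

def codeword_histogram(codewords, group_of, n_groups):
    """Frequency of each codeword group; unknown codewords go to the extra last group."""
    hist = np.zeros(n_groups + 1)
    for cw in codewords:
        hist[group_of.get(int(cw), n_groups)] += 1
    return hist

def distance(f1, f2):
    """Equation (5): sum of |W_i1 - W_i2| / (1 + W_i1 + W_i2)."""
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    return float(np.sum(np.abs(f1 - f2) / (1.0 + f1 + f2)))

def similarity(f1, f2):
    """Equation (4): s = 1 / (1 + dis)."""
    return 1.0 / (1.0 + distance(f1, f2))
```

For a 200 x 160 template `block_codewords` returns 2000 codewords; identical histograms give a similarity of exactly 1, and the value falls towards 0 as the histograms diverge.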

VI. GEOMETRIC CENTERS OF SIGNATURE TEMPLATE

We use the successive geometric centers [1] of a signature as a global parameter in the development of a signature verification system. This parameter is derived from the center of mass of an image segment. The term 'Successive' is used to describe the nature of the feature extraction process. We use the process recursively to generate the set of points called geometric centers. We find the center of mass of the given signature template and divide the template in two parts at the center of mass; this process is repeated until we get the specified number of points.

Fig. 10. Vector Quantization applied for signature template (signature template of 200 x 160 pixels, image block of 4 X 4 pixels, codeword, codebook, codewords sorted in Gray code, codeword groups, codeword histogram of number of occurrences per codeword group)

This algorithm requires splitting of the image four times, and we extract 24 points in each mode; this will be discussed in detail in the coming part. This process requires a signature template of larger size to capture the details properly. After testing various sizes we have implemented this feature using a size of 320*240 pixels.

A. Geometric center of an image

The geometric center gives an idea of the distribution of pixels. Physically it is the point where the center of mass of an object is located. For an image the geometric center is defined by (Cx, Cy), where

$$C_x = \frac{\sum_{x=1}^{x_{max}} \sum_{y=1}^{y_{max}} x \, b[x, y]}{\sum_{x=1}^{x_{max}} \sum_{y=1}^{y_{max}} b[x, y]}, \qquad C_y = \frac{\sum_{x=1}^{x_{max}} \sum_{y=1}^{y_{max}} y \, b[x, y]}{\sum_{x=1}^{x_{max}} \sum_{y=1}^{y_{max}} b[x, y]} \qquad (6)$$

For the current case we consider a normalized signature template of 320*240 pixels resolution. To develop the new set of feature points we adopt a method of splitting the template at the geometric centers and finding the center of mass of the two segments obtained after splitting. We perform the splitting in two manners, once horizontally and once vertically. We use the algorithm to generate two sets of points, based on the vertical splitting mechanism and the horizontal splitting mechanism.

B. Feature points based on vertical splitting

Six feature points are retrieved based on vertical splitting. Here the feature points are nothing but geometric centers. The procedure for finding feature points by vertical splitting is given in the Algorithm below.

Algorithm
This is the procedure for generating feature points based on vertical splitting.
Input: Static signature image after moving the signature to the center of the image
Output: v1, v2, v3, v4, v5, v6 (feature points)
1. Split the image with a vertical line at the center of the image; we get the left and right parts of the image.
2. Calculate geometric centers v1 and v2 for the left and right parts correspondingly.
3. Split the left part with a horizontal line at v1 and find geometric centers v3 and v4 for the top and bottom parts of the left part correspondingly.
4. Split the right part with a horizontal line at v2 and find geometric centers v5 and v6 for the top and bottom parts of the right part correspondingly.

Fig. 11 shows the feature points retrieved from a signature image. These features have to be calculated for every signature image in both training and testing.

Fig. 11 Feature points retrieved from signature image by vertical splitting of depth 1.
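The depth 1 vertical splitting procedure can be sketched as follows. The names `geometric_center` and `vertical_split_points` are hypothetical, coordinates are 0-based here rather than starting at 1 as in Equation (6), 1 is assumed to mark a black pixel, and the row at the rounded split position is assigned to the bottom part; the text does not say how ties on the split line are handled.

```python
import numpy as np

def geometric_center(part, x_off=0, y_off=0):
    """Center of mass (Cx, Cy) of the black pixels, or None for an empty part."""
    ys, xs = np.nonzero(part)
    if xs.size == 0:
        return None
    return (x_off + float(xs.mean()), y_off + float(ys.mean()))

def vertical_split_points(template):
    """Feature points v1..v6 from vertical splitting of depth 1."""
    b = np.asarray(template)
    mid = b.shape[1] // 2                       # step 1: vertical line at image center
    left, right = b[:, :mid], b[:, mid:]
    v1 = geometric_center(left)                 # step 2
    v2 = geometric_center(right, x_off=mid)
    points = [v1, v2]
    # Steps 3-4: split each half with a horizontal line through its own center.
    for part, x_off, center in ((left, 0, v1), (right, mid, v2)):
        if center is None:
            points.extend([None, None])
            continue
        cut = int(round(center[1]))             # assumption: cut row goes to the bottom
        points.append(geometric_center(part[:cut], x_off, 0))
        points.append(geometric_center(part[cut:], x_off, cut))
    return points
```

The horizontal splitting points h1 to h6 could be obtained, under the same assumptions, by applying the function to the transposed template and swapping the coordinates of each returned point.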

A similar sequence is generated by horizontal splitting. Next we discuss the generation of geometric centers of depth 2. The previous procedure uses a splitting depth of one to give a set of six points. We extend this concept to a splitting depth of two so that a total of 24 points are generated; for this we apply the splitting algorithm to the four subparts obtained by splitting at depth 1. This operation is illustrated in Fig. 12 for vertical splitting. By this procedure we obtain a total of 48 feature points (v1, ..., v24 and h1, ..., h24). This set of points can be used for comparing two signatures. For comparison purposes we use the coefficient of Extended Regression Square (ER2) [2]. R-squared is also called the coefficient of determination. It can be interpreted as the fraction of the variation in Y that is explained by X. R-squared can be further derived as:

$$ER^2 = \frac{\left( \sum_{j=1}^{M} \sum_{i=1}^{n} (x_{ji} - \bar{x}_j)(y_{ji} - \bar{y}_j) \right)^2}{\sum_{j=1}^{M} \sum_{i=1}^{n} (x_{ji} - \bar{x}_j)^2 \; \sum_{j=1}^{M} \sum_{i=1}^{n} (y_{ji} - \bar{y}_j)^2} \qquad (7)$$

where
M = Number of dimensions (for the current scenario we have two dimensions)
x_{ji} = Points of the first sequence
y_{ji} = Points of the second sequence
and X, Y are the two sequences to be correlated, each of two dimensions.
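Equation (7) can be evaluated with a few NumPy operations. In this sketch the function name `er2` is hypothetical, each sequence is assumed to be stored as an array with one row per feature point and one column per dimension, and the means in the equation are taken per dimension over all points.

```python
import numpy as np

def er2(x, y):
    """Extended Regression Square of two point sequences of shape (n_points, M)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("both sequences must have the same shape")
    dx = x - x.mean(axis=0)  # deviation from the per-dimension mean
    dy = y - y.mean(axis=0)
    denominator = np.sum(dx ** 2) * np.sum(dy ** 2)
    if denominator == 0:
        raise ValueError("a sequence with no variation cannot be correlated")
    return float(np.sum(dx * dy) ** 2 / denominator)
```

A value of 1 means one sequence is a shifted and uniformly scaled copy of the other; smaller values indicate weaker agreement between the two sets of geometric centers.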

Fig. 12 Feature points retrieved from signature template by vertical splitting of depth 2

VII. RESULTS

We have used a signature database consisting of 984 signatures from 75 different persons. Per person 12 signatures are collected, out of which 8 signatures are used for threshold calculation and record creation. The remaining signatures are used as genuine test signatures. From some arbitrary persons we have collected forged signatures for testing purposes. We have collected 125 skilled forgery signatures and 30 casual or unskilled forgeries. The total number of signatures used for testing is 1139 at 600 dpi. Out of the 1139 samples, 480 signatures were used for user enrollment, 232 signatures were genuine test signatures, 127 were skilled forgeries, 35 were casual or unskilled forgeries, 250 were test signatures of unenrolled users and 30 signatures were unusable due to distortion.

We have tested individual modules for each cluster based feature discussed above and the final system implementing the feature set together. The final system uses user specific thresholds and the training mechanism discussed in [1]. We have implemented a weighted comparator based classifier in the final system. The metrics FAR (False Acceptance Rate) and FRR (False Rejection Rate) [3] are evaluated. In the final program we have integrated all the modules and designed a training mechanism to decide thresholds separately for each user. For training we collect 12 signatures per person, and thresholds are calculated from the variance of the feature points. This gives better performance as compared to individual performance. We have achieved accuracy up to 95%.

Using this test bed we have performed a total of 353 tests for verification mode and 257 tests for recognition mode. The system has a decision threshold of 60% for both the signature verification and signature recognition modes. Out of 353 verification tests, 152 tests were for genuine signatures and 201 tests were for forged signatures. For the recognition mode we made 135 tests for genuine signatures and 122 tests for forged signatures. Fig. 13 shows the FAR-FRR plot of the signature recognition system for verification mode; the EER is 3.29%. At the selected threshold level of 60%, we have achieved a final FAR of 2.5% and an accuracy of 95.08%.

Fig. 13 FAR-FRR plot
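The two error rates reported above can be computed from match scores as in the following sketch. The function name `far_frr` and the example scores are hypothetical; the sketch assumes that each test yields a similarity score between 0 and 1 and that a signature is accepted when its score reaches the decision threshold (0.6 corresponds to the 60% threshold quoted in the text).

```python
def far_frr(genuine_scores, forgery_scores, threshold):
    """False Acceptance Rate and False Rejection Rate at a decision threshold."""
    if not genuine_scores or not forgery_scores:
        raise ValueError("both score lists must be non-empty")
    # FAR: fraction of forgeries that were wrongly accepted.
    false_accepts = sum(1 for s in forgery_scores if s >= threshold)
    # FRR: fraction of genuine signatures that were wrongly rejected.
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(forgery_scores),
            false_rejects / len(genuine_scores))
```

Sweeping the threshold and plotting both rates gives a FAR-FRR plot of the kind shown in Fig. 13; the threshold at which the two curves cross corresponds to the EER.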

In this paper we have discussed an off-line signature recognition system based on cluster features. We have developed an off-line signature recognition system based on these features; the system uses user specific thresholds. On the database of 1139 signatures we have performed 600 tests for both signature verification and recognition modes. The system has reported an accuracy of 95.46% (CCR). These features are easy to implement and can be explored in a deeper approach, with more sophisticated training mechanisms and classifiers like neural networks, to achieve higher recognition rates.

REFERENCES
[1] B. Majhi, Y. S. Reddy, D. Prasanna Babu (2006), "Novel Features for Off-line Signature Verification", International Journal of Computers, Communications & Control, Vol. I
[2] L. Zhu, A. Rao & A. Zhang (2002), "Theory of Keyblock-based Image Retrieval", ACM Journal, Volume V, No. N, pp. 1-32