On-Line Signature Verification by Dynamic Time ... - IEEE Xplore

On-Line Signature Verification by Dynamic Time Warping and Gaussian Mixture Models

Oscar Miguel-Hurtado, Luis Mengibar-Pozo, Judith Liu-Jimenez, Michael G. Lorenz

University Carlos III of Madrid, Dpt. Electronics Technology, c/ Butarque 15, 28911 Leganes (Madrid), SPAIN

{omiguel, mengibar, lorenz, jliu}@ing.uc3m.es

Abstract - The handwritten signature is the most widespread means of personal identification. Much work has been carried out to achieve reasonable error rates in automatic on-line signature verification. Most of the matching algorithms used work by feature extraction. This paper analyzes the discriminative power of the features that can be extracted from an on-line signature, and how that discriminative power can be increased by using Dynamic Time Warping as a step in the preprocessing of the signal coming from the tablet. It also covers the influence of this new step on the performance of the Gaussian Mixture Models algorithm, which recent studies have shown to be a successful algorithm for automatic on-line signature verification. A complete experimental evaluation of the algorithm based on Dynamic Time Warping and Gaussian Mixture Models has been conducted on 2500 genuine signature samples and 2500 skilled forgery samples from 100 users. These samples are included in the publicly available MCyT-Signature-Corpus Database.

Index Terms - On-Line Signature, Dynamic Time Warping, Gaussian Mixture Models.

I. INTRODUCTION

Automatic identification by biometric techniques is becoming more and more familiar for tasks where the user has to be authenticated. A large amount of research is being carried out, and the error rates achieved for some biometric modalities make them a truly reliable method. Biometrics can be split into two main groups: behavioural and physical. Physical biometrics are based on the measurement of biological characteristics such as iris, face, fingerprint, DNA, vascular patterns, etc. Behavioural methods, in contrast, are based on the measurement of some trait over a period of time. Handwritten signature, keystrokes and gait are some examples of these methods. Because behavioural traits are acquired over time, the differences between samples are an issue to be taken into account and, therefore, behavioural techniques normally have higher error rates than physical biometrics.

On the other hand, among all biometric techniques for personal verification, the handwritten signature is the most widespread in daily life, being the most widely accepted around the world and having legal support in the majority of countries. The literature on handwritten signatures usually splits the different methods into two main groups: off-line and on-line [1]. In contrast to off-line methods, where signatures are usually treated as grey-level images, on-line methods take into account dynamic characteristics such as pressure, tilt, position, velocity, etc. These signals are acquired while the signature is being made. Because of this, and the larger amount of information available in on-line systems, these automatic verification systems are generally more reliable than off-line ones.

1-4244-1129-7/07/$25.00 ©2007 IEEE

Fig. 1 Schema of the Automatic Identification System (blocks: Data Acquisition Module, Preprocessing, DTW Patterns, Feature Extraction, Matching, GMM Patterns, Decision Maker)

In this paper an Automatic Identification System for on-line handwritten signatures based on time alignment and Gaussian Mixture Models (GMMs) is presented. As shown in Fig. 1, the signals of the signature, acquired by the sensor (e.g. a pen tablet), are preprocessed by a Dynamic Time Warping (DTW) algorithm, whereas Gaussian Mixture Models are used in the final decision step.

Dynamic Time Warping is one of the most reliable and widely used techniques for on-line signature verification. Different signatures of the same individual usually vary in their form and characteristics, and time alignment algorithms can be used to minimize the impact of these variations [2]. Gaussian Mixture Models, on the other hand, model a continuous probability distribution by a weighted mixture of Gaussian probability density functions. GMMs have been used for speaker recognition, and some authors have demonstrated their usefulness for on-line signature verification [3].

The GMM algorithm takes as its input a set of features extracted from the different signals captured by the sensor. Once the set of features of the signature has been extracted, it is matched against the pattern stored in the knowledge base. This process produces a score "s" (the similarity between the pattern and the sample), which is used to reach a decision based on a threshold "T": if "s" is higher than "T", the system accepts the hypothesis "She/he is the person who she/he claims to be".

The Dynamic Time Warping algorithm is used as the final step of preprocessing, before the set of features of the signature is extracted. The set of features is calculated after the time axis of the sample has been aligned with the user's DTW pattern, which is built during the enrolment process. This time-axis alignment minimizes the time differences among samples from the same user. Therefore, it should increase the discriminating power of the features and improve the performance of the identification algorithm, compared to a system where only GMMs are used. A complete experimental evaluation of the algorithm based on DTW+GMM has been conducted on 2500 authentic signature samples and 2500 skilled forgery samples from 100 users.
Those samples are included in the publicly available MCyT-Signature-Corpus database [4]. The following sections present the theory of GMM and DTW (Sections II and III). The MCyT Signature Database is then introduced (Section IV). Section V describes the preprocessing steps, followed by feature extraction and analysis (Section VI). The paper ends with the experimental results (Section VII) and the conclusions derived from them (Section VIII).

II. GAUSSIAN MIXTURE MODELS

Gaussian Mixture Models (GMM) are a well-known and widely referenced technique for pattern recognition. GMM theory has been known for a long time, but it was not until the Expectation-Maximization algorithm was developed [5] that it became a useful technique for pattern recognition. One of the best-known applications to biometrics comes from Reynolds [6]. GMMs have been successfully used in biometric modalities such as Voice Recognition [6], Hand Geometry [7] and Signature Verification [3].

A GMM represents the feature space as a weighted linear combination of Gaussian probability density functions (Fig. 2). Thanks to this definition, a GMM can produce smooth models of the probability distribution of the set of features used for pattern recognition. The probability density function is defined as the weighted sum of M Gaussian density functions, as follows:

$p(\vec{x} \mid \lambda) = \sum_{i=1}^{M} c_i \, b_i(\vec{x})$    (1)

where $c_i$ are the weights applied to each Gaussian density function. For (1) to be a proper probability density function, the weights have to satisfy the following constraint:

$\sum_{i=1}^{M} c_i = 1$    (2)

Each Gaussian density function $b_i(\cdot)$ has the following form:

$b_i(\vec{x}) = \frac{1}{(2\pi)^{D/2} \, |\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (\vec{x} - \vec{\mu}_i)' \, \Sigma_i^{-1} \, (\vec{x} - \vec{\mu}_i) \right\}$    (3)

where $\vec{\mu}_i$ and $\Sigma_i$ are the mean vector and the covariance matrix, respectively, defined for each Gaussian density function, $\vec{x}$ is the feature vector that represents a handwritten signature, and D is the dimension of $\vec{x}$.

Fig. 2 Gaussian Mixture Models representation

As shown above, a GMM is defined by three elements: the mean vectors $\vec{\mu}_i$, the covariance matrices $\Sigma_i$ and the weights $c_i$, for each Gaussian component from 1 to M. Every user has his/her own model; $\lambda$ is used as the notation for each user's GMM:

$\lambda_s = \{\vec{\mu}_i, \Sigma_i, c_i\}, \quad s = 1, \ldots, S$    (4)

where S is the number of users in the system.

To obtain this model for each user, the Expectation-Maximization (EM) algorithm [5] has been used. EM provides a simple way to estimate the three elements of the GMM iteratively. The elements of the model have to be initialized: the mean vectors are set by random selection from the training data, each covariance matrix is initialized as the identity matrix, and each weight $c_i$ is set to 1/M. After initialization, the EM iterations are run until a stopping threshold is reached; the difference between the mean vectors in consecutive iterations is used as the stopping criterion.
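As an illustration of the GMM training described in this section, the following Python sketch implements a small EM loop with the stated initialization (means drawn at random from the training data, identity covariance matrices, weights 1/M) and stops when the mean vectors change by less than a tolerance. The function names, the tolerance `tol`, the iteration cap and the small regularization term `reg` added to each covariance matrix are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np


def component_log_density(X, mean, cov):
    """Log of the Gaussian density b_i(x) in (3), for each row of X."""
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def mixture_log_terms(X, weights, means, covs):
    """log(c_i * b_i(x)) for every sample (rows) and component (columns)."""
    return np.stack(
        [np.log(w) + component_log_density(X, m, c)
         for w, m, c in zip(weights, means, covs)],
        axis=1,
    )


def fit_gmm(X, n_components, tol=1e-4, max_iter=200, reg=1e-6, seed=0):
    """EM for a GMM; tol, max_iter and reg are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialization as described in the text.
    means = X[rng.choice(n, size=n_components, replace=False)].astype(float)
    covs = np.stack([np.eye(d) for _ in range(n_components)])
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(max_iter):
        # E-step: responsibility of each component for each sample.
        log_terms = mixture_log_terms(X, weights, means, covs)
        log_px = np.logaddexp.reduce(log_terms, axis=1, keepdims=True)
        resp = np.exp(log_terms - log_px)
        # M-step: re-estimate weights, means and covariances.
        nk = resp.sum(axis=0) + 1e-12
        new_means = (resp.T @ X) / nk[:, None]
        for i in range(n_components):
            diff = X - new_means[i]
            covs[i] = (resp[:, i, None] * diff).T @ diff / nk[i] + reg * np.eye(d)
        weights = nk / nk.sum()
        # Stopping rule: change of the mean vectors between iterations.
        shift = np.max(np.abs(new_means - means))
        means = new_means
        if shift < tol:
            break
    return weights, means, covs


def log_likelihood(X, weights, means, covs):
    """log p(x | lambda) from (1), for each row of X."""
    return np.logaddexp.reduce(mixture_log_terms(X, weights, means, covs), axis=1)
```

A score "s" for a questioned signature could then be taken as the log-likelihood of its feature vector under the claimed user's model and compared with the threshold "T"; how the score is actually computed in the evaluated system is not detailed here.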

III. DYNAMIC TIME WARPING

Dynamic Time Warping (DTW) has been one of the most successful techniques for on-line signature verification. The DTW algorithm originates from the field of speech recognition [8]. One of the first successful attempts to use DTW for handwriting verification comes from Sato and Kogure [2]. Consider two signals as time sequences of points:

$s(x, y) = s_1, s_2, \ldots, s_I$    (5)

$p(x, y) = p_1, p_2, \ldots, p_J$    (6)

and define the distance between two points, one from each signal, as:

$d_{i,j} = \sqrt{(s_i(x) - p_j(x))^2 + (s_i(y) - p_j(y))^2}$    (7)

DTW deals with the time-axis alignment between two utterances, one called the pattern (p) and the other the sample (s). DTW keeps the time axis of the pattern fixed, and the time axis of the sample is transformed non-linearly to minimize the distance between them. Once the distance measure between two points of the pattern and the sample has been defined (7), DTW uses an iterative procedure to fill a distance matrix between every point of the pattern and every point of the sample. Then, by backtracking through this matrix, it finds the optimal way to align the sample with the pattern. This optimal alignment is called the warping function.

In this paper, DTW has been used to build a standard DTW pattern, which is used to align the signatures of a user before extracting the features. This standard DTW pattern was built by taking one signature from the training data and time-aligning the other training signatures to it. After that, the spatial average waveforms of the coordinate functions were calculated and their time axes were transformed inversely using the averaged warping function. Hence, the features extracted from the newly aligned sample are more consistent, and they should have greater discriminative power.

Fig. 4 DTW alignment for sample data (pattern and sample)
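The distance-matrix filling and backtracking steps can be sketched in Python as follows. This is a minimal textbook DTW with the point distance of (7), offered only as an illustration: the step pattern (diagonal, vertical, horizontal), the optional band-shaped adjustment window, the averaging used to map the sample onto the pattern's time axis and the function names are all assumptions, not details confirmed by the paper.

```python
import numpy as np


def dtw_path(pattern, sample, window=None):
    """Return (total cost, warping path) between two (n, 2) point sequences.

    The path is a list of (pattern index, sample index) pairs.
    `window` is an optional band half-width (illustrative assumption).
    """
    J, I = len(pattern), len(sample)
    w = max(I, J) if window is None else max(window, abs(I - J))
    D = np.full((J + 1, I + 1), np.inf)
    D[0, 0] = 0.0
    # Fill the cumulative distance matrix.
    for j in range(1, J + 1):
        for i in range(max(1, j - w), min(I, j + w) + 1):
            d = np.linalg.norm(pattern[j - 1] - sample[i - 1])  # distance (7)
            D[j, i] = d + min(D[j - 1, i - 1], D[j - 1, i], D[j, i - 1])
    # Backtrack from the last cell to recover the warping function.
    path = []
    j, i = J, I
    while j > 0 and i > 0:
        path.append((j - 1, i - 1))
        step = int(np.argmin([D[j - 1, i - 1], D[j - 1, i], D[j, i - 1]]))
        if step == 0:
            j, i = j - 1, i - 1
        elif step == 1:
            j -= 1
        else:
            i -= 1
    path.reverse()
    return D[J, I], path


def align_to_pattern(pattern, sample, window=None):
    """Map the sample onto the pattern's time axis.

    Sample points matched to the same pattern index are averaged
    (an assumption made for this sketch).
    """
    _, path = dtw_path(pattern, sample, window)
    aligned = np.zeros(pattern.shape, dtype=float)
    counts = np.zeros(len(pattern))
    for j, i in path:
        aligned[j] += sample[i]
        counts[j] += 1
    return aligned / counts[:, None]
```

The aligned sample has the same length as the pattern, so features computed from it refer to a common time axis for all signatures of a user.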

Fig. 3 Warping function between pattern and sample (axes: pattern points and sample points; the plot marks the warping function and the slope constraint)

Some restrictions can be applied to the distance matrix to reduce the number of points to be calculated, such as adjustment windows, slope constraints, etc. [8]. Before the time alignment, some preprocessing (time, position, size) is applied to the signatures (see Section V). The warping function can then be used to obtain a new time-aligned sample, in which the time differences have been eliminated.

IV. MCYT SIGNATURE DATABASE

The MCyT-Signature Database has been used to evaluate the proposed signature identification system. This database is publicly available [4]. It comprises 100 users, each of whom contributed 25 genuine signatures; 25 forged signatures were also captured for each user. These forgeries were produced by the 5 subsequent users, who were given a static image of the signature to imitate and practised until they felt confident; hence, they are skilled forgeries.

Fig. 5 Schema of data acquired by the Pen Tablet

Signatures have been captured with a Wacom Intuos pen tablet. This tablet provides the following discrete-time dynamic sequences (Fig. 5), with the range of each sequence given in parentheses:

i) position in x-axis (0-12700)
ii) position in y-axis (0-9700)
iii) pressure p (0-1023)
iv) azimuth angle az (0-360°)
v) inclination angle in (0-90°)

Fig. 6 Example of signals acquired by the Pen Tablet

Both pen-up and pen-down movements have been captured as discrete-time sequences with a sampling frequency of 100 Hz.

V. DATA PREPROCESSING

Signals captured in the MCyT Database have been preprocessed to reduce noise and irrelevant information. In this paper the following preprocessing steps are used.

A. Filtering

First, the five temporal functions (x-axis, y-axis, pressure, azimuth and inclination) are smoothed by a low-pass filter to eliminate the noise introduced by the graphic tablet during data capture.

B. Equispacing by Linear Interpolation

After the high frequencies have been eliminated, the five temporal functions are transformed into an equispaced 256-point temporal sequence by linear interpolation.

C. Normalization

Once filtering and equispacing interpolation have been done, normalization in time, location and size is carried out:

i) Time normalization:

$s'(t) = s(tT), \quad t \in [0, 1]$    (8)

ii) Location normalization: the x-axis and y-axis temporal functions are normalized through the means of those functions, $x_G$ and $y_G$:

$x'(t) = x(t) - x_G$    (9)

$y'(t) = y(t) - y_G$    (10)

iii) Size normalization: the x-axis and y-axis are normalized through the norm of the two-dimensional vector [x, y]:

$x'(t) = x(t) / \mathrm{norm}([x, y])$    (11)

$y'(t) = y(t) / \mathrm{norm}([x, y])$    (12)

where the norm used is defined as:

$\mathrm{norm}([x, y]) = \int |[x, y]|^2(t) \, dt$    (13)

Pressure, azimuth and inclination are normalized by their maximum values, as follows:

$p'(t) = p(t) / 1023$    (14)

$az'(t) = az(t) / 360$    (15)

$in'(t) = in(t) / 90$    (16)

D. DTW Alignment

The steps above are the standard preprocessing steps for discrete-time sequences. This paper introduces a new step in order to obtain more reliable features from signatures: DTW is used to align each signature with the pre-calculated DTW pattern of its user. Therefore, the time differences between samples, which are due to the behavioural nature of signing, should be minimized.

E. Derived Signals: Speed and Acceleration

Derived speed and acceleration signals have also been computed from the five primary signals (x-axis and y-axis coordinates, pressure, azimuth and inclination) acquired by the pen tablet. The speeds are calculated as the first derivative of those signals:

$v_s(t) = \frac{s(t+1) - s(t)}{(t+1) - t}$    (17)

The accelerations of those five signals are calculated as the first derivative of their speeds:

$a_s(t) = \frac{v_s(t+1) - v_s(t)}{(t+1) - t}$    (18)

Both temporal functions (speed and acceleration for the x and y coordinates) are normalized by their norm:

$s'(t) = s(t) / \mathrm{norm}(s)$    (19)

where the definition of the norm is the same as in (13), but using a one-dimensional vector (speed or acceleration) instead of the two-dimensional vector [x, y].

VI. FEATURES EXTRACTION AND ANALYSIS

Once preprocessing has been done, features are extracted from the different discrete-time sequences (position in x-axis, position in y-axis, pressure p, azimuth az, inclination in) and from their corresponding derived speed and acceleration signals. After that, their discriminative power is analyzed using Fisher's Ratio in order to build a subset of features.
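Before features are extracted, the preprocessing chain of Section V (equispacing to 256 points, location and size normalization, scaling by maximum values, and derived speed and acceleration) can be sketched in Python as follows. This is only an illustrative sketch: the low-pass filter is omitted because its design is not specified, the norm is a discrete approximation of (13) as printed, the derived signals are normalized for any input rather than only for the x and y coordinates, and the function names are hypothetical.

```python
import numpy as np

N_POINTS = 256  # equispaced length used in Section V-B


def resample(signal, n_points=N_POINTS):
    """Linear interpolation onto n_points equispaced instants in [0, 1]."""
    signal = np.asarray(signal, dtype=float)
    old_t = np.linspace(0.0, 1.0, len(signal))
    new_t = np.linspace(0.0, 1.0, n_points)
    return np.interp(new_t, old_t, signal)


def norm13(*components):
    """Discrete approximation of (13) on the normalized time axis (assumption)."""
    dt = 1.0 / (len(components[0]) - 1)
    return sum(float(np.sum(c ** 2)) for c in components) * dt


def preprocess(x, y, p, az, inc):
    """Time, location and size normalization plus max-value scaling."""
    x, y, p, az, inc = (resample(s) for s in (x, y, p, az, inc))
    x, y = x - x.mean(), y - y.mean()      # location, (9)-(10)
    size = norm13(x, y)                    # size, (11)-(13)
    x, y = x / size, y / size
    return {"x": x, "y": y, "p": p / 1023.0, "az": az / 360.0, "in": inc / 90.0}


def speed_and_acceleration(s):
    """First differences (17)-(18) with unit time step, normalized as in (19)."""
    v = np.diff(s)
    a = np.diff(v)
    return v / norm13(v), a / norm13(a)
```

A constant input signal would give a zero norm and therefore a division by zero; a practical implementation would need to guard against that case.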

A. Features Extracted

The following features have been extracted from each discrete-time sequence:

1) Init minus min
2) End minus min
3) Average
4) Root Square Average
5) Time over 0
6) Crossing 0
7) Mean over 0
8) Standard deviation
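The eight per-sequence features listed above can be computed along the following lines. The list gives only the feature names, so several readings here are assumptions made for this sketch: "Root Square Average" is taken as the root mean square, "Time over 0" as the fraction of samples above zero, "Crossing 0" as the number of sign changes, and "Mean over 0" as the mean of the positive samples. The function name is hypothetical.

```python
import numpy as np


def local_features(s):
    """Eight features of one discrete-time sequence (interpretations assumed)."""
    s = np.asarray(s, dtype=float)
    positive = s > 0
    signs = np.sign(s)
    signs = signs[signs != 0]  # ignore exact zeros when counting crossings
    return np.array([
        s[0] - s.min(),                              # 1) init minus min
        s[-1] - s.min(),                             # 2) end minus min
        s.mean(),                                    # 3) average
        np.sqrt(np.mean(s ** 2)),                    # 4) root square average (RMS)
        positive.mean(),                             # 5) time over 0 (fraction)
        np.count_nonzero(np.diff(signs)),            # 6) crossing 0 (sign changes)
        s[positive].mean() if positive.any() else 0.0,  # 7) mean over 0
        s.std(),                                     # 8) standard deviation
    ])
```

For example, the sequence [-1, 1, -1, 1] yields the feature vector [0, 2, 0, 1, 0.5, 3, 1, 1].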

Besides those features, the following global features have also been analyzed for each signature:

1) Number of traces
2) Time of signature
3) Writing time of signature / Time of signature
4) Height / Width
5) Area
6) Length

B. Features Analysis

Fisher's Ratio has been used to analyze the discriminative power of the resulting 126 features: 8 features for each of the 15 signals (the five primary signals and their derived speeds and accelerations) plus 6 global features. Fisher's Ratio compares the within-class variability of a feature with its between-class variability. The within-class variability is taken as the mean of the standard deviation in each class (user). The between-class variability is taken as the standard deviation of the feature across all classes. The within-class variability should be small, meaning that the feature is stable across the samples of a genuine signature. On the other hand, the between-class variability (between users) should be large, which means that the feature differs considerably from one user (class) to another. The higher the Fisher's Ratio (F), the higher the discriminative power of the feature. Fisher's Ratio is calculated as follows:

$F = \frac{\frac{1}{N} \sum_{j=1}^{N} (m_j - m)^2}{\frac{1}{N} \sum_{j=1}^{N} \frac{1}{N_p} \sum_{i=1}^{N_p} (x_{ji} - m_j)^2}$    (20)

$m = \frac{1}{N} \sum_{j=1}^{N} m_j$    (21)

where m is the overall mean defined in (21), $m_j$ is the mean of the feature "x" for the class "j", $x_{ji}$ is the sample "i" of the feature "x" for the class "j", $N_p$ is the number of samples per class and N is the number of classes. Data normalization is an important concern when calculating Fisher's Ratios, because the variance is affected by the range of the features. To avoid this effect, all the features of the set have been normalized to the range [0, 1].

DTW alignment should decrease the within-class variability and increase the between-class variability, giving higher Fisher's Ratios for the features and therefore raising their discriminative power.

VII. EXPERIMENTAL RESULTS

A. Finding Out the Subset of Features

As stated before, the MCyT Signature Database has been used to evaluate the proposed signature identification system. To obtain classification rates, 2500 genuine signatures from 100 different users have been used. Classification rates have been used to find the best subset of features, seeking both a minimum number of features and a minimum error rate. The number of Gaussian components (M = 4, 8 and 16) and the number of training samples (N = 3 and 5, as a standard enrolment) have also been analyzed in terms of classification errors.

Fig. 7 Classification Error vs. Num. Features with 4 Gaussian components and 3 training data samples

As shown in Fig. 7, DTW alignment improves the performance of the GMM classification algorithm, which obtains its best result, a 5% classification error, with a subset of 26 features. These results are for 4 Gaussians and 3 training samples.
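As a rough illustration of how Fisher's Ratio can be used to rank features and keep the most discriminative ones, the sketch below computes a variance-based ratio per feature (between-class variance of the class means over the mean within-class variance) after min-max normalization to [0, 1]. The array layout (classes, samples per class, features), the function names and the top-k selection rule are assumptions made for this example; the exact selection procedure used in the experiments is not specified.

```python
import numpy as np


def min_max_normalize(X):
    """Scale every feature to [0, 1] over all classes and samples."""
    lo = X.min(axis=(0, 1))
    span = X.max(axis=(0, 1)) - lo
    return (X - lo) / np.where(span > 0, span, 1.0)


def fisher_ratio(X):
    """Fisher's Ratio per feature for X of shape (classes, samples, features)."""
    class_means = X.mean(axis=1)                   # m_j for each class
    overall_mean = class_means.mean(axis=0)        # m, mean of the class means
    between = np.mean((class_means - overall_mean) ** 2, axis=0)
    within = np.mean((X - class_means[:, None, :]) ** 2, axis=(0, 1))
    return between / within


def rank_features(X, k):
    """Indices of the k features with the highest Fisher's Ratio."""
    ratios = fisher_ratio(min_max_normalize(X))
    return np.argsort(ratios)[::-1][:k]
```

A feature that is constant within every class has zero within-class variance and would produce a division by zero; such features would need separate handling in practice.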

Fig. 8 Classification Error vs. Num. Features with 4 Gaussian components and 5 training data samples

Fig. 8 shows the results obtained with 4 Gaussian components and 5 training samples. As can be seen, using 5 training samples also improves the performance of the system, again yielding an optimal subset of 26 features, with a 3.75% classification error. Regarding the number of Gaussian components, Fig. 9 shows that there are no significant differences depending on the number of Gaussians chosen.

Fig. 9 Classification Error vs. Num. Features with 4, 8 and 16 Gaussian components and 3 training data samples

The same results are obtained with 5 training samples (Fig. 10). From this point on, 4 Gaussian components are used for the GMM algorithm.

Fig. 10 Classification Error vs. Num. Features with 4, 8 and 16 Gaussian components and 5 training data samples

B. Identification Experimental Results

At this point, the proposed identification system is tested with all the signatures included in the MCyT Signature Database: 2500 genuine signatures and 2500 skilled forgeries from 100 users. Both random and skilled forgeries have been analyzed. Random forgeries are those produced without any knowledge of the signer's name or of the shape of the signature; signatures from other users, both genuine signatures and skilled forgeries, are used as random forgeries for each user. After that, the performance of the proposed system against skilled forgeries is also tested.

Fig. 11 shows the performance of the identification system for random forgeries with a subset of 26 features. The DTW+GMM system achieves an EER of 0.6%, compared with an EER of 1.1% when only GMM is used. The proposed DTW+GMM system shows no marked difference between 3 and 5 training samples, unlike the system that uses only GMM.

Fig. 11 Error Tradeoff Curves for random forgeries, 4 Gaussian components, 5 training data samples and 40 subset features (horizontal axis: False Rejects (%); curves: DTW+GMM T5, DTW+GMM T3, GMM T5, GMM T3)

Fig. 12 shows the performance of the system for skilled forgeries with a subset of 26 features. The EER of the DTW+GMM system is around 8%, whereas the GMM system achieves an EER of around 9%. For skilled forgeries, both systems show a difference between 3 and 5 training samples.

Fig. 12 Error Tradeoff Curves for skilled forgeries, 4 Gaussian components, 5 training data samples and 40 subset features (horizontal axis: False Rejects (%); curves: DTW+GMM T5, DTW+GMM T3, GMM T5, GMM T3)
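For reference, an equal error rate (EER) of the kind reported in this section can be estimated from two sets of scores, one for genuine signatures and one for forgeries, by sweeping the decision threshold "T" and finding the point where the false accept and false reject rates are closest. The sketch below assumes that higher scores mean greater similarity and that a sample is accepted when its score is above the threshold; the function names and the threshold sweep over the observed scores are illustrative choices, not the evaluation code used for the results above.

```python
import numpy as np


def far_frr(genuine, impostor, threshold):
    """False accept and false reject rates when accepting scores > threshold."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    far = float(np.mean(impostor > threshold))   # forgeries accepted
    frr = float(np.mean(genuine <= threshold))   # genuine signatures rejected
    return far, frr


def equal_error_rate(genuine, impostor):
    """Return (EER estimate, threshold) at the closest FAR/FRR crossing."""
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    rates = np.array([far_frr(genuine, impostor, t) for t in thresholds])
    k = int(np.argmin(np.abs(rates[:, 0] - rates[:, 1])))
    return float(rates[k].mean()), float(thresholds[k])
```

With perfectly separated scores, for instance genuine scores [0.9, 0.8, 0.7, 0.6] and forgery scores [0.1, 0.2, 0.3, 0.4], the estimate is an EER of 0 at threshold 0.4.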

VIII. CONCLUSIONS

This paper presents DTW alignment used as the last step of the preprocessing block in a handwritten signature biometric system. It has been shown that DTW is a useful technique to improve the performance of the GMM algorithm used as the final decision step. DTW alignment minimizes the natural differences between samples of a signature that are due to its behavioural nature. The improvement is greater for random forgeries than for skilled forgeries. This is because, for skilled forgeries, DTW alignment might even improve the quality of the forged signatures. This paper also introduces Fisher's Ratio as a methodology for analyzing the discriminative power of a set of features. GMM has also been shown to be a successful algorithm for automatic signature identification, and the impact of the number of Gaussian components and of the number of training samples used to build the pattern has been analyzed.

IX. ACKNOWLEDGMENTS

The authors would like to thank J. Ortega-Garcia and J. Fierrez-Aguilar for the provision of the MCyT Signature Database. This work has been funded by the Spanish Ministry of Science and Education (TEC2006-12365).

X. REFERENCES

[1] R. Plamondon, S.N. Srihari, "On-Line and Off-Line Handwriting Recognition: A Comprehensive Survey", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 22, pp. 63-84, 2000.
[2] Y. Sato and K. Kogure, "On-Line Signature Verification Based on Shape, Motion, and Writing Pressure", IEEE Proc. Sixth Int'l Conf. on Pattern Recognition, pp. 823-826, 1982.
[3] J. Richiardi, A. Drygajlo, "Gaussian Mixture Models for On-line Signature Verification", Proc. ACM Multimedia, 2003.
[4] J. Ortega-Garcia, J. Fierrez-Aguilar, D. Simon, J. Gonzalez, M. Faundez-Zanuy, V. Espinosa, A. Satue, I. Hernaez, J.-J. Igarza, C. Vivaracho, D. Escudero and Q.-I. Moro, "MCYT baseline corpus: a bimodal biometric database", IEEE Proc.-Vis. Image Signal Process., Vol. 150, No. 6, December 2003.
[5] J. Bilmes, "A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models", Technical Report ICSI-TR-97-021, University of Berkeley, Apr. 1998.
[6] D. Reynolds, "Speaker identification and verification using Gaussian mixture speaker models", Speech Communication, vol. 17, pp. 91-108, 1995.
[7] R. Sanchez-Reillo, C. Sanchez-Avila, A. Gonzalez-Marcos, "Biometric identification through hand geometry measurements", IEEE Trans. on PAMI, 2000.
[8] H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition", IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-26, pp. 43-49, Feb. 1978.
[9] L. Lee, T. Berger, E. Aviczer, "Reliable On-Line Human Signature Verification Systems", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 18, Issue 6, pp. 643-647, 1996.
[10] J. Richiardi, H. Ketabdar, A. Drygajlo, "Local and Global Feature Selection for On-Line Signature Verification", Proc. ICDAR, 2005.
[11] Duda, Hart and Stork, "Pattern Classification", John Wiley & Sons, 2001.

XI. VITA

Oscar Miguel-Hurtado graduated in Industrial Engineering from University Carlos III of Madrid in 2004. He is currently working at the University Group of Identification Technologies as an R&D engineer. His PhD is focused on automatic identification systems based on on-line handwritten signatures.

Dr. Luis Mengibar-Pozo has been working at UC3M since 1996, and after gaining his PhD in 2003 he became Associate Professor at the Electronics Technology Department of UC3M. He is an expert in Microelectronics, especially in low power design. He is currently working on low-profile biometric systems, focusing on Handwritten Signature. His participation in European Projects includes AMATISTA, eEpoch and BioSec. He is also the author of multiple journal articles and conference communications.

Dr. Michael GARCIA-LORENZ obtained his PhD in Electronics Engineering in 2004 from the Universidad Carlos III de Madrid. He is currently Assistant Professor in the Electronics Technology Department of the Universidad Carlos III de Madrid (UC3M). Since 1997 he has been working in the University Group of Microelectronics in the Electronics Technology Department of UC3M, involved in project development covering a broad range of applications: low power consumption design, hardware acceleration of Digital Signal Processing and Dynamic Reconfiguration of FPGAs. He is also an expert in Security and Biometrics, with multiple published articles and conference talks.

Judith Liu-Jimenez obtained her Telecommunication Engineering degree at the Polytechnic University of Madrid in 2003. She is finalizing her PhD in Hw/Sw co-design for Iris Biometrics improvement. She is Assistant Teacher at University Carlos III of Madrid and a researcher at the University Group of Identification Technologies. Her other R&D lines are signal processing for biometric identification and biometric testing.