A Review of Dynamic Handwritten Signature Verification

Gopal Gupta and Alan McCabe
Department of Computer Science
James Cook University
Townsville, Qld 4811, Australia

September 1997

Abstract

There is considerable interest in authentication based on handwritten signature verification (HSV) because HSV is superior to many other biometric authentication techniques, e.g. finger prints or retinal patterns, which are reliable but much more intrusive and expensive. This paper presents a review of dynamic HSV techniques that have been reported in the literature. The paper also discusses possible applications of HSV, lists some commercial products that are available and suggests some areas for future research.


1 Introduction

Our society is increasingly dependent on electronic storage and transmission of information and this has created a need for electronically verifying a person's identity. Handwritten signatures have been the normal and customary way for identity verification. Although there have been occasional disputes about the authorship of handwritten signatures (Osborn, 1929; Harrison, 1958; Hilton, 1956), verification of handwritten signatures has not been a major problem, since it appears that humans are generally very good at verifying genuine signatures.

Miller (1994) and Sherman (1992) discuss the importance of handwritten signature verification (HSV), also called signature dynamics, and note the availability of commercial products that use HSV (both note a product called Sign-On). Miller claims that more than 100 patents have been granted in the field of computer verification of handwritten signatures; many of these, however, are for hardware to capture the signature. This interest in HSV is at least partly due to the fact that HSV is superior to many other biometric authentication techniques, e.g. finger prints or retinal patterns, which are reliable but much more intrusive and expensive, and therefore not as acceptable except perhaps in highly security-sensitive situations where reliability is of utmost importance. Miller and Sherman both note that although HSV is likely to become very important in the future, the technique will be widely accepted only if it is more reliable than the products that are currently on the market; in particular, the technique needs to have lower false rejection rates (FRR or Type I error).

It should be noted that the aims of authentication are going to be different for different types of applications. For example, the primary concern of verification in a credit card environment (where the card holder presents a card to make a purchase and signs on an electronic device that automatically verifies the signature) must be to have a zero or near-zero false rejection rate so the genuine customer is not annoyed by unnecessary rejections. In this environment fast verification is essential and, in addition, the information required for HSV should not require too much storage since it may need to be stored on a credit card strip or in a smart card memory. A high level of security against forgeries may not be required, and a false acceptance rate (FAR or Type II error) of 10% or even 20% might be acceptable, since even that is likely to assist in reducing credit card fraud and would be much better than the minimal checking that is done currently. On the other hand, in a security-sensitive environment that was, for example, using HSV for granting an authenticated user access to sensitive information or other valuable resources, it would be necessary to have a high level of security against intruders and a zero or near-zero FAR. A FRR of 10% or higher would be a nuisance but might be acceptable. Of course an ideal HSV system should have both the FRR and the FAR close to zero, but no technique of HSV presently appears capable of performing consistently at this high level. It should be noted that FRR and FAR are closely related and an attempt to reduce one invariably increases the other.

It should also be noted that a technique which promises a small FRR when tested on an entire database does not guarantee a small FRR for each individual. The performance figures reported in the literature are normally aggregate figures, and it is not uncommon to find some individuals that have much larger error rates than the rest of the population in the test database. Of course, it is desirable that a technique have not only good aggregate performance but also good individual performance.

Most early work in automatic HSV, in the early 1970's or before, focused on static (or off-line) HSV. Static HSV systems are those that require only an image of the signature. The advantage of these systems is that they do not need specialised hardware to capture signature information at the point of signing. Static HSV also has important areas of application, for example in automatic cheque clearing; however, there are disadvantages to the static approach. For example, static signature information is unlikely to be useful for storage on a credit card or smart card since, generally, significant storage is required to store a signature image. Also, static techniques do not take advantage of the signature dynamics, so the results generally are not as good as for dynamic techniques. In this paper the focus is on dynamic (or on-line) HSV although some static HSV techniques are also discussed. For a more detailed discussion of static HSV see Plamondon and Lorette (1989) and Leclerc and Plamondon (1994).

Early dynamic HSV was based on using specially instrumented pens since no suitable equipment for capturing signatures was available. This has changed in the last several years as graphics tablets have come into widespread use. These tablets capture a signature as samples of coordinate pairs, 100 to 200 times a second, and some also capture pen pressure and pen tilt. With data available from such equipment, it is straightforward to compute velocities and accelerations. A typical modern on-line HSV system therefore is likely to be based on a dynamic technique which uses a signature input device like a graphics tablet.

The present paper reviews dynamic HSV techniques that have so far been proposed in the literature. An attempt is made to describe important techniques and assess their performance based on published literature. The paper is organised as follows. Section 2 first discusses how HSV may be useful and gives examples of some commercial products already available. Section 3 discusses how human signature experts verify handwritten signatures. This is then followed in Section 4 by a discussion of the basic methodology of computer HSV and a discussion of signature writing dynamics. Section 5 reviews some of the existing literature in the field and is split into three subsections: point-to-point comparison, feature values comparison, and capturing shape dynamically. Section 6 explains why it is so difficult to directly compare HSV systems and concludes the paper.

2 Applications of Signature Verification

In this section it is briefly discussed how HSV might be used in applications that require user authentication.

Credit cards deal with very large amounts of funds each day. Perhaps as much as $5-$10 billion worth of purchases are charged to credit cards every day. It has been reported that credit card issuers lost $800 million in 1989, about $1 billion in 1990 and about $1.6 billion in 1991. Present credit card fraud in the United States alone has been reported to be well over $2 billion per year. Although these sums are substantial, they are quite small when expressed as percentages of the total credit card purchases, certainly well below 1%. Credit card issuers however receive only a small commission on the purchase price, and therefore the total losses due to credit card fraud are significant for the credit card issuers. It has been reported that half of all credit card fraud involves lost or stolen cards, while the rest involves counterfeit cards, the non-receipt of cards, or fraudulent credit card applications. A number of techniques have been used in an attempt to curb this fraud, some based on identifying sudden surges of spending, but it appears none of these techniques has been particularly successful. Most banks therefore do little to control credit card fraud.

A reliable HSV technique could have applications in reducing credit card fraud, although there appear to be some hurdles that must be overcome if the technology is to be useful. A stolen card has the owner's signature on the back, and this makes it particularly easy to forge the signature given the minimal checking of signatures at the place of purchase. This problem of forging the signature does not even arise if a credit card has been intercepted in the mail, since any signature could be used and put on the back of the card. One possible approach to reducing credit card fraud would be to require the owner of a new credit card to visit the bank and supply sample signatures so that information from the signatures could be put electronically on the card, making it unnecessary to have the owner's signature on the card at all, a signature that may be viewed and forged by a person who has stolen the card.

The above scheme of course has a problem. If one credit card issuer requires that a person being issued a new card must go to the bank, produce identification and give sample signatures, while other credit card issuers continue with the present procedure of requiring no such visit to the bank, the credit card issuer requiring signatures would not have many customers! The suggested approach is therefore not particularly convenient for the customer and is unlikely to be adopted in a very competitive marketplace where customers are being bombarded with offers of new credit cards almost every day.

Another approach might be possible. When the owner of a new card makes the first purchase using the card, the check-out staff could be asked (by the credit card terminal) to check the customer's identification (e.g. driver's licence) and provide the identification number, as is done when cashing a cheque. This scheme does have some merit in that it only creates a minor nuisance when the customer is asked the first time for identification, but using this approach the HSV technique may not work very well since the system has only one signature on which to base its reference signature. Further signatures will of course become available as the customer makes more purchases, but then there is no guarantee that those signatures are those of the genuine customer and not a forger.

Yet another approach might be possible. In this approach, the customer uses the credit card as he or she would normally, but his or her signatures are captured electronically and compared with a signature profile that has been built over the last few weeks or months. When the result of comparison shows a significant mismatch, a suitable action is taken, which may include either rejecting the purchase being charged to the card or, preferably, bringing the mismatch information to the attention of a human operator who can take appropriate action, e.g. contact the card owner.

HSV might also have uses in computer user authentication if a HSV technique could be designed that provided a high level of security against intruders through a zero or near-zero FAR. It might be suitable for user authentication not only at login time but also for accessing sensitive applications, e.g. sensitive databases or exclusive software. A typical dynamic HSV system will of course require that a signature input device like a graphics tablet be connected to each workstation to capture signature details. This technique has the potential to replace the password mechanisms for accessing computer systems in some situations. The major disadvantage of this approach of course is the requirement that a graphics tablet be attached to each workstation.

Reliable HSV could well have other applications. For example, it might assist in reducing the forging of passports. An application for a passport normally requires that the applicant go to some authorized office to file an application form and that signatures be certified in the presence of an authorized officer. It is thus not unreasonable that the office where a passport application is filed may require the applicant to provide a set of sample signatures which are captured electronically and used for building a reference signature. That reference signature could then be placed on a magnetic strip on the passport; moreover, passports in the future are likely to have magnetic strips for faster processing anyway. At the port of entry, at the immigration counter, the person entering the country is then required to sign his or her name on a graphics tablet and the signature is compared with the reference signature on the passport strip. Forging of passports would then be almost impossible.

A number of commercial products in HSV are already being advertised and some are listed here. The list is not comprehensive since growth in this field is quite fast and there is no simple mechanism to find a list of all products in the field.

A product called PenOp is being marketed by Peripheral Vision of New York and it is claimed that the software may be used in configuring systems so that users must log in using handwritten signatures.

Another product called Sign-On, it is claimed, allows HSV to be built in to a variety of widely used software, enabling the system to use a handwritten signature instead of a password. It uses, besides the signature image, acceleration, stroke angles, start and stop pressures (if available) and other factors. The signature information can be updated each time a successful verification occurs. The product uses six signatures plus a final verifying signature to build a reference signature. The test signature is then compared with the reference signature, resulting in one of three judgements: true, forgery or ambiguous. The product is claimed to have a 2.5% FRR and a 2.5% FAR, but details of performance evaluation are not available.

Signer Confidence is the name of a static HSV system that is marketed for HSV on cheques. It brings two images, one from the cheque and the other stored in the database, onto the screen for comparison and verification.

Yet another product is Cadix ID-007, which is claimed to be suitable for user authentication and which requires a pressure-sensitive pen and tablet for HSV. The Microsoft Windows based software examines the test signature according to three different criteria: the shape of the signature, the speed at which it was written and the pressure of the pen stroke. Verification of a signature with the ID-007 system typically takes less than one second. No details of how it performs are available.

Countermatch is the name of a HSV product from AEA Technology in the UK. The product uses three sample signatures to build a reference signature. It is claimed that the product is suitable for signatures written in any language, but no details of the techniques used are provided.

Another UK company, British Technology Group, markets a product called Kappa. Kappa uses signature shape as well as the timing and rhythm of the signature and claims to use a new high-accuracy pattern matching algorithm developed at the University of Kent, but no details were available. It uses a user-specific feature set designed for low FRR, but it is not clear how this set is selected. It also provides a 'shape-only' option that allows paper records of signatures to be computerized. The Kappa system has been tested in a public trial at a sub-Post Office where some 8500 signatures were collected. A FRR of 1.8% with one test signature and 0.85% with three test signatures has been reported for individuals that were able to provide a satisfactory enrolment model, that is, people who have a signature which does not require "special measures" for verification. The system identifies at enrolment time those people that are believed to require special measures for verification. It is not known how many individuals were rejected at enrolment time.

The company Silanis Technology Inc. has also released its system called ApproveIT, which runs on WordPerfect 6.0 (Computing Canada, 1995). Signatures are added to documents directly from a pen-based input or from a previously captured signature in a file which is safe from tampering due to a password protection feature. If a document is signed using ApproveIT, it is verified to ensure the contents have remained unchanged since the prior approval. If a document is modified after approval, the signature will not print or display on the altered document.

3 Handwritten Signature Verification

Handwritten signatures come in many different forms and there is a great deal of variability even in signatures of people that use the same language. Some people simply write their name while others may have signatures that are only vaguely related to their name and, as Brault and Plamondon (1993a) note, some signatures may be quite complex while others are simple and appear as if they may be forged easily. It is also interesting to note that the signature style of individuals relates to the environment in which the individual developed their signature. For example, people in the United States tend to use their names as their signature whereas Europeans tend away from directly using their names. Systems which rely directly on the American style of signing, such as Nagel and Rosenfeld (1977), may not perform as well when using signatures of Europeans, or signatures written in different languages.

It is well known that no two genuine signatures of a person are precisely the same, and some signature experts note that if two signatures of the same person written on paper were identical they could be considered forgery by tracing. Successive signatures by the same person will differ, both globally and locally, and may also differ in scale and orientation. In spite of these variations, it has been suggested that human experts are very good at identifying forgeries but perhaps not so good at verifying genuine signatures. For example, Herbst and Liu (1977) cite references indicating that as many as 25% of genuine signatures were either rejected or classified as no-opinion by trained document examiners, while no forgeries were accepted. Untrained personnel were found to accept up to 50% of forgeries.

Osborn (1929) notes that handwriting shows great variation in speed and muscular dexterity. He claims that forgeries vary in perfection all the way from the clumsy effort which anyone can see is spurious, up to the finished work of the adept which no one can detect. Experience shows that the work of the forger is not usually well done and in many cases is very clumsy indeed. Osborn notes that the process of forging a signature or simulating another person's writing, if it is to be successful, involves a double process requiring the forger not only to copy the features of the writing imitated but also to hide the writer's own personal writing characteristics. If the writing is free and rapid it will almost certainly show, when carefully analyzed, many of the characteristics of the natural writing of the writer no matter what disguise may have been employed.

Osborn notes that unusual conditions under which signatures are written may affect the signature. For example, hastily written, careless signatures, like those written in a delivery person's book, cannot always be used unless one has sample signatures that have been written under similar conditions. Furthermore, Osborn notes that signatures written with a strange pen and in an unaccustomed place are likely to be different from the normal signatures of an individual. When a signature is written specifically to be used for comparison, this can also produce a self-conscious, unnatural signature. Osborn further notes that the variations in handwriting are themselves habitual, and this is clearly shown in any collection of genuine signatures produced at different times and under a great variety of conditions, which when carefully examined show running through them a marked, unmistakable individuality, even in the manner in which the signatures vary as compared with one another.

To investigate signatures, Osborn recommends that several genuine signatures should always be obtained if possible; five signatures provide a more satisfactory basis for an opinion than one, and ten are better than five. To detect forgeries, Osborn gives a list of about 50 points that one needs to consider, including the following:

- Is the signature in a natural position?
- Does the signature touch other writing and was the signature written last?
- Is the signature shown in embossed form on the back of the sheet?
- Was the signature written before the paper was folded?
- Is the apparent age of the writing ink used consistent with the date of the document?
- Does the document contain abrasion, chemical, or pencil erasures, alterations, or substitutions of any kind?

These points only show that handwritten HSV is far from trivial, but clearly most of these points cannot be applied to on-line HSV where a person's signature is collected on a graphics tablet rather than on paper.

Hilton (1992) discusses what a signature is and how it is produced. Hilton notes that the signature has at least three attributes: form, movement and variation, and since signatures are produced by moving a pen on a paper, movement perhaps is the most important part of a signature. The movement is produced by muscles of the fingers, hand, wrist, and for some writers the arm, and these muscles are controlled by nerve impulses. Once a person is used to signing his or her signature, these nerve impulses are controlled by the brain without any particular attention to detail.

Hilton notes that a person's signature does evolve over time and, for the vast majority of users, once the signature style has been established the modifications are usually slight. For users whose signatures have changed significantly over time, and such cases do occur although infrequently, the earlier version is almost always completely abandoned and the current version is the only one that is used. Only in some exceptional cases has it been found that a user may recall an old form of his or her signature, perhaps for signing special documents. Liu, Herbst and Anthony (1979) found that in their experimentation with 248 users, three users continually varied between two signatures. This suggests that if a HSV system was to verify such exceptional cases of more than one signature by an individual, the system would need to maintain a list of reference signatures over time. When a user's signature varies over time, should this variation be taken into account in HSV, assuming that the user might be using elements of a former signature in the current signature? Hilton comments that it is hard to answer this question, though in the vast majority of cases the current signature is sufficient for verification purposes.

4 Computer Signature Verification

In this section, a discussion of the basic methodology of computer HSV is presented, followed by a discussion of signature writing dynamics.

4.1 The Basic Methodology

Most techniques for HSV involve the following five phases: data acquisition, preprocessing, feature extraction, comparison, and performance evaluation.

Most methods, but not all, during the first three phases (data acquisition, preprocessing and feature extraction) generate a reference signature (or a set of reference signatures) for each individual. This normally requires a number of signatures of the user to be captured at enrollment or registration time (these signatures are called sample signatures) and processed. In the discussion that follows it is assumed that only one reference signature is available. When a user claims to be a particular individual and presents a signature (we call this signature the test signature), the test signature is compared with the reference signature for that individual. The difference between the two is then computed using one of the many existing (or specially developed) distance measures. If the distance is above a predefined threshold value the user is rejected, otherwise the user is authenticated.
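To make the comparison step concrete, the following is a minimal sketch of such a feature-based verifier, assuming each signature has already been reduced to a fixed-length vector of feature values. The function names and the choice of a Euclidean norm over standard-deviation-normalized differences are illustrative only, not a prescription from any of the reviewed systems.

```python
import numpy as np

def build_reference(sample_features):
    """Build a reference from an (n_samples, n_features) array of
    sample-signature feature vectors: per-feature mean and std."""
    samples = np.asarray(sample_features, dtype=float)
    return samples.mean(axis=0), samples.std(axis=0, ddof=1)

def verify(test_features, ref_mean, ref_std, threshold):
    """Accept the test signature if its distance to the reference is
    below a predefined threshold; reject it otherwise."""
    z = (np.asarray(test_features) - ref_mean) / ref_std  # normalized deviations
    return np.linalg.norm(z) < threshold                  # one of many possible measures
```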

A performance evaluation of the proposed technique is of course important, and normally researchers use a set of genuine signatures and forgery attempts, either collected by them or by someone else, and determine the false rejection rate (FRR) and the false acceptance rate (FAR) for the technique given the signature database. Obtaining good estimates of FAR is very difficult since actual forgeries are impossible to obtain. Performance evaluations therefore rely on two types of forged signatures. A forgery is called skilled if it is produced by a person other than the individual whose signature is being forged, when the forger has had access to one or more genuine signatures for viewing and/or practice. A forgery is called zero-effort or random when either another person's genuine signature is used as a forgery, or the forger has no access to the genuine signature and is either only given the name of the person whose signature is to be forged or just asked to sign any signature without even knowing the name. Tests on random forgeries generally lead to much smaller FAR than tests on skilled forgeries.

Although performance evaluation as described above is essential, the evaluation is not always a true indicator of the performance of the technique since the test signatures often do not adequately represent the population at large. Plamondon and Lorette (1989) note that there is a great deal of variability in signatures according to country, age, time, habits, psychological or mental state, and physical and practical situations. Building a test database of signatures that is representative of real-world applications is quite a difficult task, since it is difficult enough to find people who will willingly sign 10 or 20 times. People are not always happy to have their signatures stored in a computer or given to others to practice forging. Therefore most test databases are built using signatures from volunteers from the research laboratory where the HSV research has been carried out, and as a result most test databases have very few signatures from people that are old, disabled, suffering from a common disease (for example arthritis), or poor. The percentage of such people in the population is significant, and these are the people whose signatures are likely to pose the greatest challenge for a HSV technique.

It has been reported that FAR and FRR are generally higher when the systems are used by a more representative group. The higher FRR for a more representative group is not surprising, since the signing environment generally is not as consistent as it often is when signatures for a test database are collected. Furthermore, more signatures in this wider group are likely to be obtained from people whose signatures are more variable, either due to background or age, than for the research staff, students and professors whose signatures are likely to be in a test database. The reasons for a higher FAR in this group, on the other hand, are not clear, since a more representative group could even lead to a lower FAR than that obtained by using a test database: most test databases have skilled forgeries obtained after the forgers were allowed to practice the forgeries, which may not always be possible in the real world.

It should be noted that a number of other difficulties face a person who attempts to compare the results of many studies reported in the literature.

There is no public signature test database that could be used by all researchers for performance evaluation and comparison. Most researchers have their own test signature databases with a varying number of genuine signatures; some have skilled forgeries while others do not; some have screened the signature database to remove signatures that for some reason were not acceptable while others have done no screening; the number of signatures used in building a reference signature often varies; different tests and thresholds have been used; and even different definitions of FAR and FRR have been used. Some studies use a different threshold for each individual while others use the same threshold for all individuals. This is a rather sad state of the art in HSV.

The primary issue in HSV is of course "what aspects or features of a signature are important?". There is no simple answer to this, but two different approaches are common. In the first approach, all of the collected position values (and/or velocity values and/or acceleration values) of a signature are assumed important and the test and reference signatures are compared point-to-point using one or more sets of these values. In this approach the major issue that arises is how the comparison is to be carried out. Perhaps the signatures could be compared by computing the correlation coefficient between the test signature values and the corresponding reference signature values, but point-to-point comparison does not work well since some portions of any two genuine signatures of the same person can vary significantly and the correlation may be seriously affected by translation, rotation or scaling of the signature. Another approach might be to segment the signatures being compared and then compare the corresponding segments, using some alignment of segments if necessary. This approach works somewhat better and we discuss it in more detail later.

In the second approach, not all of the available values are used. Instead, a collection of values is computed and compared. These are sometimes called statistical features (or statistical parameters) and some examples that have been used in the studies discussed below are:

- Total time taken in writing the signature.
- Signature path length: displacement in the x and y directions and the total displacement.
- Path tangent angles: profile of their variation and average or root mean square (RMS) values.
- Signature velocity: profiles of variations in horizontal, vertical and total velocities as well as their average or RMS values.
- Signature accelerations: variations in horizontal and vertical accelerations, centripetal accelerations, tangential accelerations, total accelerations, as well as their average or RMS values.
- Pen-up time: total pen-up time or the ratio of pen-up time to total time.
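As an illustration, a few of the features above can be computed directly from tablet samples. This sketch assumes equally spaced samples of pen position plus a pen-down flag; the 200 Hz sampling interval is only an example.

```python
import numpy as np

def extract_features(x, y, pen_down, dt=1/200):
    """Compute a handful of the statistical features listed above from
    equally spaced samples; dt is the sampling interval in seconds."""
    total_time = len(x) * dt
    dx, dy = np.diff(x), np.diff(y)
    path_length = np.hypot(dx, dy).sum()   # total displacement along the path
    vx, vy = dx / dt, dy / dt              # horizontal and vertical velocities
    rms_velocity = np.sqrt(np.mean(vx**2 + vy**2))
    pen_up_ratio = 1.0 - np.mean(np.asarray(pen_down, dtype=bool))  # pen-up / total time
    return np.array([total_time, path_length, rms_velocity, pen_up_ratio])
```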

The above list is far from comprehensive. For example, Crane and Ostrem (1983) propose 44 features that include some of the features listed above as well as several others. In a United States patent, Parks, Carr and Fox (1985) propose more than 90 features for consideration. Once a set of features has been selected, there may be no need to store the reference signature itself: only the features' values of the reference signature need be stored. Also, when a test signature is presented, only the features' values are needed, not the signature. This often saves on storage (storage may be at a premium if, for example, the reference signature needs to be stored on a card) and that is why representing a signature by a set of values of its features is sometimes called compression of the signature. In the methods that use feature-based comparison, selection of features, and extraction of the best subset once a set of features has been identified, are major research tasks.

Some of the problems that a HSV algorithm needs to deal with are:

- How many features are sufficient?
- Do the features require some transformation to be applied to the signature data, e.g. resizing, deslanting, time-warping?
- How many sample signatures will be used in computing the reference signature (assuming that it has been decided to have only one reference signature)?
- Would the reference signature be updated regularly? If yes, how would this be done?
- How is the distance between a test signature and the corresponding reference signature going to be computed?

Some of these issues are now briefly discussed, starting with the last one. Assume that the reference signature is based on a set of sample signatures and that for each element of the set of selected features the mean and standard deviation of the feature values have been computed. The reference signature is therefore two vectors: a vector (R) of the means of the features' values of the sample signatures and a vector (S) of the standard deviations. Clearly, to obtain good estimates of the means and the standard deviations of the features' values of the genuine signature population it is necessary to have several, perhaps a minimum of five, sample signatures. A larger set of sample signatures is likely to lead to a better reference signature and therefore better results, but it is recognised that this is not always possible. The number of sample signatures needed is discussed further in the next section.

The distance between a test signature and the corresponding reference signature may be computed in several different ways. Lee (1992) considers the following five approaches:

1. Linear discriminant function.
2. Euclidean distance classifier.
3. Dynamic programming matching technique.
4. Synthetic discriminant function.
5. Majority classifier.

Linear discriminant function. A linear discriminant function, a linear combination of the components of the feature vector x, has the general form

G(x) = w^T x + w_0

where w is a weighting vector and w_0 a constant. w_0, also called the threshold weighting, specifies the boundary between two classes. It should be noted here that algorithms such as gradient descent, which in general require a large training set, are precluded by the fact that the reference set is generally small. Two particular approaches to linear classification are proposed by Lee. The first has each feature value t_i of the test signature normalized by the reference mean r_i; the second has feature value t_i normalized by the reference standard deviation s_i. The first linear discriminant function therefore becomes

G(T) = \frac{1}{n} \sum_{i=1}^{n} \frac{t_i - r_i}{r_i}

where T denotes the test signature with feature set (t_1, t_2, ..., t_n). Lee evaluates the mean-value-normalized linear classifier and obtains an equal error rate of about 17%. The standard-deviation-normalized linear classifier obtained a similar equal error rate. Lee also showed that as either FRR or FAR approaches zero, the opposite error rate rises rapidly.
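In code, the mean-value-normalized classifier is a direct transcription of the formula above, and the standard-deviation-normalized variant differs only in the denominator; the function names are illustrative.

```python
import numpy as np

def linear_discriminant_mean(t, r):
    """G(T) = (1/n) * sum_i (t_i - r_i) / r_i, normalized by reference means."""
    t, r = np.asarray(t, float), np.asarray(r, float)
    return np.mean((t - r) / r)

def linear_discriminant_std(t, r, s):
    """Variant normalizing by the reference standard deviations instead."""
    return np.mean((np.asarray(t, float) - r) / np.asarray(s, float))
```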

Euclidean distance classifier. The Euclidean distance discriminant function is used quite widely; for example, see Crane and Ostrem (1983) and Nelson, Turin and Hastie (1994). The Euclidean distance metric has the following form:

G(T) = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{t_i - r_i}{s_i} \right)^2

where, as defined earlier, T is the test signature and r_i and s_i are, respectively, the i-th feature's reference mean and reference standard deviation. Lee's performance evaluation of the Euclidean distance classifier using 42 features yielded surprisingly poor results: an equal error rate of approximately 28%. Other researchers, for example Nelson et al. (1994), report much better results.

Synthetic discriminant function. The use of the synthetic discriminant function (SDF) in HSV was introduced by Wilkinson (1990); it consists of finding a filter impulse response w by solving a series of equations. For further details the reader is referred to Bahri and Kumar (1988). SDF performs well in comparison to the classifiers presented earlier, and Lee obtained an equal error rate of approximately 7%.

Dynamic programming matching. Put very simply, dynamic programming matching (DPM) involves minimizing the residual error between two functions by finding a warping function to rescale the time axis of one of the original functions. DPM is further explained by Lee and is investigated in a number of papers in the literature; for example, see Parizeau and Plamondon (1990). Using the DPM technique Lee obtained an equal error rate of about 13%.

Majority classifier. The main drawback of the linear classifier and the Euclidean classifier is that the FAR tends to 100% as the FRR approaches zero. Lee explains this as follows: any single feature unduly influences the decision result when deviating far from its mean value, even if the other features have values close to their means for the genuine reference set. One way of alleviating this problem is to use the so-called majority classifier which, according to Lee, is based on the "majority rules" principle. That is, it declares the signature being tested to be genuine if the number of feature values which pass a pre-determined test is larger than half the total number of tested features. According to Lee, the majority classifier achieves an equal error rate of only 3.8% using 42 features. How the majority classifier performs on zero-effort forgeries is not clear.
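The majority classifier can be sketched as follows. The review does not specify Lee's per-feature test, so the test used here (each feature falling within k reference standard deviations of its mean) is an assumption for illustration.

```python
import numpy as np

def majority_classifier(t, ref_mean, ref_std, k=2.0):
    """Declare the test signature genuine if more than half of the
    features pass a per-feature test ("majority rules")."""
    t = np.asarray(t, dtype=float)
    passed = np.abs(t - ref_mean) <= k * np.asarray(ref_std)  # assumed per-feature test
    return passed.sum() > t.size / 2
```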

The theoretical basis of HSV using the feature-based approach is now briefly considered. This discussion follows the standard multivariate normal distribution approach which is discussed in most books on multivariate analysis (for example, see Jobson (1992), Section 7.2). It is assumed that an infinitely large population of genuine signatures is available for an individual and that the signatures are represented by the values of their features. Let there be m features, which are not always independent. Let f be a random variable which is a vector of feature values representing the genuine signature population. f is assumed to be normally distributed with mean vector μ and covariance matrix V. Normally, of course, the mean and covariance matrix of the genuine signature population are not known, and so the mean vector μ is replaced by the mean reference signature vector R and the covariance matrix V is replaced by a diagonal matrix that has the squares of the reference standard deviations (that is, S) on the diagonal; all correlations between the features are ignored.

The resulting simplified procedure therefore works as follows. When a test signature is presented, the vector T of values of the features is computed. The distance vector D = |R - T| is then computed. The vector D is normalized by dividing each value in it by the corresponding standard deviation in the vector S to obtain a vector Z, whose norm is then computed. The computed norm is compared to a predefined threshold and the signature is authenticated only if the norm is smaller than the threshold.

Given that the population mean and the covariance matrix of the features' values are unknown and the correlations between the features have been ignored, the norm of the vector Z cannot be expected to closely follow the chi-squared distribution. If the distribution were chi-squared, it would be possible to make accurate predictions for the threshold given the desired value for FRR. A simple example is now presented to show that larger threshold values need to be used in practice to achieve the same FRR as those predicted by the chi-squared distribution. Consider a single-feature situation: a FRR of 1% corresponds to a chi-squared value of 6.63. Now if six features are used that have correlation coefficients of 1.0 between them, this is essentially using the single feature six times, and therefore the six correlated features correspond to six times the value 6.63, that is 39.78, which is much higher than the value of 16.8 that applies if there is no correlation. Of course, no prediction can be made about the expected FAR since forgeries are signed by many different persons; the forgeries therefore would not have a normal distribution and there is no way to obtain even an estimate of the mean and covariance matrix of the forgeries.
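The simplified procedure and the chi-squared comparison can be expressed as below. As the example above warns, correlated features make the true distribution of the squared norm heavier-tailed than chi-squared, so a real system would need a larger, empirically tuned threshold; SciPy's chi2 is used here only to reproduce the quoted critical values.

```python
import numpy as np
from scipy.stats import chi2

def verify_chi2(t, ref_mean, ref_std, frr_target=0.01):
    """Normalize D = |R - T| by the reference standard deviations and
    compare the squared norm of Z with a chi-squared critical value."""
    z = (np.asarray(t, float) - ref_mean) / ref_std
    m = z.size
    threshold = chi2.ppf(1.0 - frr_target, df=m)  # 6.63 for m=1, 16.81 for m=6
    return np.sum(z**2) <= threshold
```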

The issue of how many features a method needs to use to obtain reliable HSV is a very difficult one. There is always a temptation to include more and more features in a method in the hope of improving performance and, as was noted earlier, some researchers have proposed more than 90 features. It is believed that using many features is unlikely to lead to high performance and may lead to some difficulties. For example, if a method uses many features, the storage needed for the values of the features of the reference signature is going to be relatively large, and a credit card may not have sufficient capacity to store all the values. Also, when a test signature is compared to the reference signature, given that no two genuine signatures are identical, it is unlikely that a genuine signature will have all features' values close to the values for the reference signature. To ensure that all (or almost all) genuine test signatures are authenticated, a technique using a large number of features must either have a large threshold for the norm of the distance or use some criterion similar to the majority classifier of Lee (1992). The majority classifier is not particularly satisfactory since it cannot be easily analysed theoretically; also, using it is in effect stating that although a large number of features are considered important, it is acceptable to ignore several of them in comparing the test signature to the reference signature. This appears contradictory.

Regarding the number of sample signatures, the belief was expressed earlier that at least five signatures are needed for satisfactory performance of a feature-based technique. Gupta and Joyce (1997a) have studied this issue and their work is discussed in the next section.

Although this review does not discuss updating of reference signatures in any depth, it is believed that a reference signature should be updated regularly as the user's signature evolves over time. A simple technique of perhaps giving 90% weight to the stored reference signature and 10% to the latest authenticated signature could be used, but no studies of such an adaptive approach have been reported in the literature. Adding such a weighting will have no effect on users whose signatures do not change over time and should have a positive effect on users whose signatures do.
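Such an adaptive update amounts to an exponentially weighted moving average of the feature means. A minimal sketch of the 90%/10% weighting suggested above, applied only after a successful verification, follows; updating the stored standard deviations would need a similar rule and is omitted here.

```python
import numpy as np

def update_reference(ref_mean, authenticated_features, weight=0.9):
    """Blend the stored reference with the latest authenticated signature:
    90% weight to the old reference, 10% to the new signature."""
    return weight * np.asarray(ref_mean) + (1.0 - weight) * np.asarray(authenticated_features)
```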

Finally, is it necessary to transform a test signature before it is compared with the reference signature? A number of transformational techniques have been proposed; the simplest only suggest smoothing the data, but others suggest, for example, finding the bounding rectangle and scaling the test signature to the same size as the reference signature. While various authors have used transformations, no systematic study of the benefits of such transformations has been reported in the literature. An example of a transformational technique is that of Phelps (1982), who used a space-domain approach in which a close-fitting polygon around the signature image is formed. The signature area is then normalised and centred on a coordinate plane. Phelps shows that the area of overlap for a pair of valid signatures is consistently higher than for forgeries.

4.2 Dynamics of Signature Writing

The dynamics of a signature are captured by a graphics tablet as data given by

S(t) = [x(t), y(t), p(t)]^T,  t = 0, 1, 2, ..., n

that is, a collection of x, y location values of the pen tip and pen tip pressure values at given times (generally at equal time intervals). Many devices sample at the rate of 200 times a second (sometimes less) and the resolution of such devices is often about 1000 pixels/inch, although some have finer resolution. Typical American signatures are a writing of the person's name, and therefore for American signatures the x-values typically grow linearly with time with small oscillations on the linear curve, while the y-values show a more oscillatory variation with time, becoming positive and negative many times during a signature.

The signature data, after appropriate smoothing, may be used to compute the derivatives of x, y and even p (assuming that the pressure data is not binary) if required. The first derivatives of x and y are the velocities in the two directions (which may be combined to compute total velocity, if required) and the second derivatives are of course the two accelerations. One may also wish to compute the third derivatives, although these are rarely used in HSV; the third derivative is the rate of change of acceleration and is sometimes called jerk. Once these derivatives have been computed, the following signature data (assuming no pressure derivatives) is available:

S'(t) = [x(t), y(t), p(t), xv(t), yv(t), xa(t), ya(t), xj(t), yj(t)]^T,  t = 0, 1, 2, ..., n
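With equally spaced samples, these derivatives are straightforward finite differences. The sketch below assumes a 200 Hz tablet and already-smoothed data.

```python
import numpy as np

def add_derivatives(x, y, dt=1/200):
    """Velocity, acceleration and jerk in each direction, by central
    finite differences of the sampled pen-tip positions."""
    xv, yv = np.gradient(x, dt), np.gradient(y, dt)    # first derivatives: velocities
    xa, ya = np.gradient(xv, dt), np.gradient(yv, dt)  # second derivatives: accelerations
    xj, yj = np.gradient(xa, dt), np.gradient(ya, dt)  # third derivatives: jerk (rarely used)
    return xv, yv, xa, ya, xj, yj
```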

It is clear that every time a person signs his or her signature the number of samples obtained will be somewhat different, that is, n will have a different value. This variation in genuine signatures of the same individual makes it very difficult to compare one set of values from one genuine signature with another set from another. A signature may be considered a sequence of strokes. Dimauro, Impedovo and Pirlo (1994) define strokes as a sequence of fundamental components, delimited by abrupt interruptions.

Let us consider one such stroke in the y-direction (assume that the stroke is from a small y value to a large y value) and for the moment ignore any displacement in the x-direction, since that can be treated in a similar way. If the y-velocity during the stroke is examined, it is found that the stroke is characterized by a number of positive velocity values that grow and reach a peak, perhaps somewhere mid-way through the stroke, followed by a number of positive values that decline. The velocity profile of a stroke is therefore a waveform, perhaps resembling a bell-shaped curve, starting with zero velocity, reaching a peak and ending with zero velocity. Furthermore, if the variation of the y-acceleration corresponding to a single-peak velocity profile of a stroke is studied, it is found that the acceleration during the period of the single-peak velocity profile grows from zero to reach a peak value perhaps mid-way during the velocity climb to its peak and then declines steadily to zero at the peak velocity. On the way down, the acceleration grows in magnitude in the negative direction (as velocity reduces), reaches a negative peak around half-way down the curve and then declines steadily in magnitude (although still negative) to reach zero at the end of the curve. A stroke therefore looks like a single positive-peak curve when the velocity profile is examined, and like a two-peak curve, one peak in the positive direction and another in the negative direction, when the acceleration profile is examined.

Further study of velocity and acceleration shows that the height of the peak of the velocity will be much smaller than the length of the stroke, and the heights of the acceleration peaks will be smaller still. A rough estimate for the height of the velocity profile may be obtained if the single-peak curve is approximated by a triangle, since the area under the curve is equal to the length of the stroke. Let the length of the stroke be d; then the peak velocity v is simply given by v = 2d/t, if t is the time taken to write the stroke. By counting the number of peaks larger than some given height h in the velocity or the acceleration profile, the number of strokes that are longer than some given length can be found. A skilled forger may manage to forge the long strokes of the signature, but it is unlikely that his or her velocity and acceleration profiles for a long stroke will be the same as those of the genuine signer.
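Counting the strokes longer than a given length then reduces to counting velocity-profile peaks above the corresponding height h. A minimal sketch of such a peak count (a simple local-maximum test, which assumes smoothed velocity data) is:

```python
import numpy as np

def count_large_strokes(v, h):
    """Count peaks of the velocity profile v that exceed height h; under
    the triangle approximation v = 2d/t, a peak above h corresponds to a
    stroke longer than roughly h * t / 2."""
    v = np.asarray(v, dtype=float)
    is_peak = (v[1:-1] > v[:-2]) & (v[1:-1] > v[2:]) & (v[1:-1] > h)
    return int(is_peak.sum())
```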

The shape of the velocity profile of a stroke has been studied by Plamondon (1993), who states that the velocity profile obtained when a rapid-aimed stroke, like those in a signature, is written is approximately bell-shaped, but the shape is asymmetric, and it is claimed that the shape is almost preserved for movements that vary in duration, distance or peak velocity. It may be that this consistent shape is related to the way the central nervous system plans and controls movement. Plamondon presents a model for rapid movement involving an impulse command of amplitude D at time t_0. The model leads to asymmetric bell-shaped velocity profiles; that is, the velocity rises quickly to reach the peak velocity during the stroke writing and then reduces to zero much more gradually. It is suggested that these profiles can be explained by log-normal curves. This is related to Fitts' law, a greater discussion of which can be found in Fitts (1954). Plamondon, Alimi, Yergeau and Leclerc (1993) compare 23 different models that may be used to describe the asymmetric bell-shaped velocity profiles of rapid movements. The models were compared by using them to reproduce a set of 1052 straight lines drawn rapidly by nine individuals. Log-normal models were found to be the best.

Given the dynamics of signature writing, how does a forger forge a signature and how difficult is it to produce a good forgery? As noted earlier, it is believed that a forger cannot write another person's signature in a ballistic motion without a lot of practice, and therefore producing good forgeries is never going to be easy. However, it is also true that some signatures lend themselves more easily to forgery than others. It is this complexity of forging a signature that is studied by Brault and Plamondon (1989, 1993a) in an attempt to estimate the intrinsic risk of a signature being forged. They note that humans can only remember about seven variations of a pattern without error, so a forger cannot memorize all the details of a signature that is being forged; a process of minimization or recoding of the information that needs to be remembered takes place. For example, a forger might recode information about the signature of John Smith by first remembering the name and then remembering that John Smith's signature has a taller J than the forger would normally write himself, a rounder S, and so on. Brault and Plamondon assert that a signature is a sequence of triplets: a curvilinear stroke, an angular stroke and another curvilinear stroke, half superimposed. Based on these observations, they derive an expression for an imitation difficulty coefficient of a signature. Roughly, the expression gives the difficulty of a given signature as a function of the variation rates in length and direction of the strokes. It is claimed that people with a small value of this coefficient and high variability between their signatures are the problematic signers. They also recommend personalized thresholds based on the difficulty of a person's signature, but it is not clear how that would help HSV since large variation is already taken into consideration when the distance is normalized by the standard deviation.

5 Review of Earlier Work

Given the importance of HSV, the volume of published literature in the field is not large. This, it is believed, is primarily due to the high perceived commercial value of innovations in the field and perhaps the reluctance of industry to make public the results of their research and development. Most companies that are carrying out research in this field are keen to protect their inventions by keeping the research confidential within the company or by patenting their innovations; in either case much of the work does not get published in the learned journals. This review will include some of the patents in the field of HSV.

Some of the early work in HSV is not easily available but has been cited in the papers of Herbst and Liu (1977) and Lorette (1984). The earliest cited work on HSV appears to be that of Mauceri (1965), cited by Herbst and Liu (1977) and Lorette (1984). Herbst and Liu report that Mauceri took 50 signatures from each of 40 subjects in evaluating his method, which used power spectral density and zero-crossing features extracted from pen acceleration waveform measurements. A FRR of 37% has been reported. Another study that has been cited is that of Farag and Chen (1972), who have been reported to have used chain-encoded tablet data and tested the method on signatures from ten subjects. A FRR of 27% and a FAR of 27% have been reported. The work of Sternberg (1975), who studied HSV using handwriting pressure, has been cited by Herbst and Liu (1977) and Zimmermann and Varady (1985). Sternberg treated handwriting pressure as analog z-axis information and based HSV on it, obtaining a FRR of 0.7% and a FAR of about 2% using random forgeries. Later tests, cited by Herbst and Liu (1977), found the rates to be 6.8% FRR and 3.2% FAR for random forgeries, and 17% FAR for skilled forgeries.

Two different approaches to HSV have been used throughout the literature (as was noted earlier). The first is based on comparing position, velocity or acceleration values point-to-point, often by computing a set of correlation coefficients between the test signature values and the corresponding reference signature values. In the second approach, a number of features are used to capture the signature information and these features are then compared. There has lately been further research into a third category outside the main two, which involves capturing the shape of the signature. The HSV work under these three categories is reviewed in this section.


5.1 Point-to-Point Comparison

Point-to-point comparison is based loosely on the idea of comparing a test signature with a reference signature by comparing the different parts of the signatures separately and combining these comparisons to achieve an overall similarity measure. As noted earlier, the difficulty arises because even the genuine signatures of one person have a certain level of variation, making a direct point-to-point comparison almost impossible. In order to make more effective comparisons, a system must perform some type of alignment of the test and reference signatures in an attempt to "line up" the corresponding parts of the signatures. This alignment may in turn create problems of its own, since forgeries undergo alignment as well as genuine signatures.

Herbst and Liu (1977) describe a technique based on acceleration measurements and using regional correlations. As noted previously, they comment that signature writing is predetermined by the brain and, as such, the total time taken for writing signatures by an individual is remarkably consistent. The authors assert that the component durations should also remain quite consistent, and hence so should the zero-crossings of the pen accelerations. Herbst and Liu use two orthogonal accelerometers mounted on an experimental pen to sample a signature at the rate of two hundred times per second. Five sample signatures were given by each individual and from these one or two reference signatures were selected such that the distance between the selected signatures and the remaining signatures is at least equal to a prespecified value. These are supposed to be the 'best' reference signatures; the logic of this selection procedure has not been explained. Note that mean values of accelerations in the sample signatures were not used as a reference signature, since variations in the signatures can lead to mean accelerations close to zero.

Herbst and Liu found that most signatures were of 2-10 seconds in duration with an average time of about 5 seconds. Each signature was heuristically partitioned into segments and corresponding segments were cross-correlated after the segments were modified (aligned) based on the duration of the interval and discrepancies between the test signature and the reference signature. It was found that segments in the range of 1-2 seconds performed the best. Longer segments were more difficult to align due to minor variations within the segments. Often segmentation was based on pen-down and pen-up occurrences within the signature. Another approach to segmentation is to use equal components of the signature (equal according to duration), and examples of this type of segmentation will be discussed later.

In the Herbst and Liu system the segments were shifted in the correlation analysis by up to 20 percent of the reference signature's segment duration, but never by more than 300 ms, to find the best match. A number of adjustments needed to be made to compute the correlations, e.g. extra pen lifts were eliminated, correlation results were weighted to penalize excessive shifting, and results were weighted by the reference signature length of the segment since larger segments are believed to be harder to forge. The technique was evaluated by first collecting 350 signatures (5 x 70) from 70 users for selecting reference signatures, followed by a collection of another 695 genuine test signatures and 287 forged signatures, a total of 1332 signatures. Before applying the correlation test, the verification algorithm rejects a signature if it takes time that differs by more than 20% from the reference signature. If this test is passed, the correlation test is applied. A FRR of more than 20% was obtained with a FAR of around 1%, although, not surprisingly, a much lower FRR was obtained if a user was allowed up to three trials to achieve verification, which is likely to be unacceptable in most applications. Although not noted in the paper by Herbst and Liu, Zimmermann and Varady (1985) note that 59% of forged signatures never had to undergo a correlation test since they were rejected on the basis of the signature time being more than 20% different from the time for the reference signature.

To overcome the limitations of the regional correlation technique of Herbst and Liu, Yasuhara and Oka (1977) suggest that the reference signature and the test signature be compared using nonlinear time alignment, a technique that has been used in speech recognition. They assert that nonlinear alignment aligns better than linear time alignment of signatures, which may lead to local misalignment due to variations in the signatures. The technique involves building what they call time registration paths by plotting the template signature pen force values against the test signature values. An algorithm is designed which then searches for a path for which the Euclidean distance between the template signature and the test signature is minimized. This distance is then tested against a threshold value; if it is below the threshold the signature is verified, otherwise it is rejected. They conducted experiments using students who were asked to write the word "key" instead of their signatures. The test word was written by each of the ten subjects ten times over a period of ten days. No signatures were used. They show that using nonlinear time alignment leads to lower errors than linear time alignment. Error rates of less than 5% appear to have been obtained, although much further evaluation is obviously required to assess the performance of the proposed technique for HSV.
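The nonlinear time alignment used by Yasuhara and Oka is essentially what is now called dynamic time warping. The following is a textbook DTW sketch over one sampled waveform (e.g. pen force), not a reconstruction of their exact algorithm:

```python
import numpy as np

def dtw_distance(a, b):
    """Minimum accumulated pointwise distance over all monotonic
    alignments (time registration paths) of waveforms a and b."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])  # compared against a threshold to verify or reject
```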

In another attempt to improve the performance of the original technique of Herbst and Liu, Liu, Herbst and Anthony (1979) propose using writing pressure in addition to the two acceleration measurements used in the earlier study. The authors note that although the pressure waveforms show gross similarity from signature to signature, the absolute values depend on human factors such as the feel of the pen or how well the pen is inking. It also appears that the pressure waveform is not very hard to imitate. Correlation between raw pressure waveforms was found to show little discrimination, since the gross form of the waveform dominates the correlation values, but removing the low-frequency paper-contact components of the pressure waveform proved more effective. Using the acceleration and pressure correlations separately as well as together, they carried out experiments using signatures from 24 subjects and obtained results close to 16% FRR and well below 1% FAR.

The technique described above was further modified by increasing the number of signatures used in selecting reference signatures to six and always selecting two references for future comparisons. A procedure that appears somewhat unsatisfactory is then followed for selecting the reference signatures: the reference signatures are labeled provisional, and if they do not lead to successful verification the next time the user presents his or her signature, they are modified by the addition of the three latest signatures. Rejects from provisional users were not counted in the performance results. The experiments were conducted at a large internal data processing center at White Plains, NY, where a total of about 6000 signatures were collected. To collect skilled forgery data, 40 targets were selected and their genuine signatures given to forgers on index cards. Forty imitators from the research staff each attempted about ten different targets in two different sessions, each attempt made after practice. The signatures were analysed off-line and each user was allowed three signatures for verification. The results of this experiment were 1.7% FRR and 0.4% FAR; a zero-effort forgery FAR of 0.02% was obtained. The initial results of 16% FRR and low FAR were, not surprisingly, reduced significantly by allowing three signatures for verification (the FRR would be expected to drop from 16% to 0.4% if the three signatures are treated as independent events). As noted previously, this is not acceptable in most applications. The study found that 3 out of 248 users continually varied between two different signature styles.
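The removal of the low-frequency paper-contact component can be illustrated with a simple high-pass operation. The moving-average subtraction below is our own stand-in for whatever filter Liu, Herbst and Anthony actually used.

    import numpy as np

    def highpass_pressure(pressure, window):
        """Remove the slowly varying paper-contact component of a pressure
        signal by subtracting a moving average; what remains is the fast
        pen-pressure detail that carries the discriminating information.
        """
        kernel = np.ones(window) / window
        baseline = np.convolve(pressure, kernel, mode="same")
        return pressure - baseline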

Further improvement to the correlation technique is proposed by Lew (1983), who presents an algorithm that allows further transformations to be made in an attempt to find the best correlation between any two given segments. These transformations include translating a segment by a fraction of the sampling interval and applying a transformation that takes into account writing at slightly different uniform speeds within each segment. The performance of the proposed algorithm was not evaluated.

Another approach to computing correlations is proposed by Brault and Plamondon (1984), who describe a HSV system based on an instrumented accelerometric pen which samples the acceleration patterns of a signature. The four acceleration transducers give x and y accelerations for the top and bottom of the pen, from which the total accelerations at each end and their orientations may be computed. Suitable features with characteristics (e.g. minimum or maximum value, average, variance) dealing with accelerations in direction θ may then be computed, and each of these is represented as a histogram over θ values of 1 to 360 degrees. Verification involves building a set of n histograms for each signature, one for each feature. Comparing two signatures requires a four-step procedure. The first step, synchronization, aligns the two histograms before comparison, choosing the alignment that gives the largest correlation. The same alignment is then applied to the remaining (n - 1) histograms of the signature being compared. A preselection test is used to reject signatures that are unlikely to pass the correlation tests, using a number of global features (e.g. average acceleration values, signature duration), although the logic of this preselection is not entirely clear. Once the preselection is passed, the histograms are compared by computing a `likeness coefficient' for each value of θ and then taking the average. A decision is then made on the basis of a predetermined threshold. A limited evaluation of the technique was carried out using a database of 243 signatures from 50 people, most signing five times, with histograms of average accelerations, the number of samples, and a feature obtained by multiplying the two. A reference histogram for each person was obtained by averaging the histograms from the five (or however many were supplied) signatures for that person. There were no forgeries in the signature database, and signatures of other people were used as zero-effort forgeries. A set of threshold values that gave the best results was used, and a FRR of 1.2% and a FAR of 1.0% were obtained, but these results are meaningless because the test signatures were used in building the reference signatures. It is not clear how the threshold values were chosen, and it appears that different threshold values might have been chosen for each test of each person.
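The synchronization step, choosing the circular alignment of directional histograms that maximizes correlation, might look as follows. The review does not define the `likeness coefficient', so plain correlation is used as a stand-in, and all function names are hypothetical.

    import numpy as np

    def best_rotation(ref_hist, test_hist):
        """Circular shift of test_hist that maximizes correlation with ref_hist."""
        corrs = [np.corrcoef(ref_hist, np.roll(test_hist, k))[0, 1]
                 for k in range(len(ref_hist))]
        return int(np.argmax(corrs))

    def histogram_similarity(ref_hists, test_hists):
        """Synchronize on the first histogram pair, then average the
        correlations of all pairs under that single alignment, as the
        four-step procedure in the text describes."""
        k = best_rotation(ref_hists[0], test_hists[0])
        corrs = [np.corrcoef(r, np.roll(t, k))[0, 1]
                 for r, t in zip(ref_hists, test_hists)]
        return float(np.mean(corrs))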

The performance of correlation techniques and techniques based on statistical features is discussed by Lamarche and Plamondon (1984). They note that when a signature is described by features such as mean and peak speeds and total duration, a good compression of the signature is obtained, but better results are likely if all of the data of the signature is used (i.e. position, velocity and acceleration) and the correlation between the reference signature and the test signature is computed. As discussed earlier, the success of techniques that compute correlations requires that the signature be divided into segments, that corresponding segments be aligned, and that correlations between corresponding segments be computed. Lamarche and Plamondon study the segmentation problem using a model of handwriting which suggests that signature writing may be represented by a simple formula. The model claims that each signature consists of transient and steady-state portions, where the transient portions exhibit exponential behavior and the steady-state portions are straight lines. A sample signature database is used to analyse signatures and identify transient portions from a cumulative scalar displacement curve. A signature may then be represented by a sequence of steady and transient portions, each pair corresponding to a pulse. Each segment may be associated with a different input pulse, and rather than storing signature position data it may be possible to store the starting time, duration and amplitude of each pulse, although information on segment orientation and curvature may also be needed. The technique, it appears, does not work well with continuously curved portions of signatures, since it is then difficult to identify a segment.

The performance of a method using only the most basic information about the signature is evaluated by Zimmermann and Varady (1985), who study only the pen-up and pen-down timing information contained in a signature and use this information for verifying signatures. They assert that since the total time taken to sign is very precise and repeatable, the timing information obtained from pen-ups and pen-downs must be very discriminating. They capture the pen-up-down information from a tablet and, rather than carry out a point-by-point comparison, apply the Walsh transformation to obtain a frequency spectrum for the signature; the first 40 low-frequency harmonics are retained. To evaluate the technique, a database of nine groups of ten genuine signatures, collected as two subsets of five gathered several weeks apart, is used; it appears that no skilled forgeries were collected. A variety of tests are conducted using variable numbers of signatures in the reference set, some using all genuine signatures of an individual to build a reference and then testing the same signatures for computation of FRR. When the sample signatures were not included in the set of test signatures (a more realistic approach to performance evaluation), the FAR varied from a low of 4.6% to a high of 12.0%, while the FRR varied from a low of 33.6% to a high of 51.9%. The FAR appears to correspond to zero-effort forgeries, since no skilled forgeries had been collected.
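The Walsh transformation of a binary pen-down/pen-up time series can be sketched with a fast Walsh-Hadamard transform. This is a hedged illustration: the code below uses natural (Hadamard) ordering and simply keeps the first coefficients, whereas the sequency reordering that would make "the first 40 low-frequency harmonics" precise is omitted.

    import numpy as np

    def fwht(signal):
        """Fast Walsh-Hadamard transform (natural/Hadamard ordering).

        The length of `signal` must be a power of two; pad otherwise.
        Coefficients are normalized by the transform length.
        """
        a = np.array(signal, dtype=float)
        h = 1
        while h < len(a):
            for i in range(0, len(a), 2 * h):
                x = a[i:i + h].copy()
                y = a[i + h:i + 2 * h].copy()
                a[i:i + h] = x + y
                a[i + h:i + 2 * h] = x - y
            h *= 2
        return a / len(a)

    def pen_state_spectrum(pen_down, n_keep=40):
        """Transform a binary pen-down/pen-up series, keeping n_keep
        coefficients in the spirit of Zimmermann and Varady's 40 harmonics."""
        n = 1
        while n < len(pen_down):
            n *= 2
        padded = np.zeros(n)
        padded[:len(pen_down)] = pen_down
        return fwht(padded)[:n_keep]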

These results are interesting because they are obtained from very limited information about the signature, and they show that a low (zero-effort) FAR may be achieved using such limited information.

Some point-to-point HSV techniques proposed in the literature have used position data, while others have used velocity or acceleration data. Plamondon and Parizeau (1988) compare the performance of the three types of data, position, velocity and acceleration, using three techniques, viz. regional correlation, dynamic time warping and skeletal tree matching. The regional correlation method is similar to that proposed by Herbst and Liu (1977), presented earlier, although some of the details, for example how the weights are computed, are not given. Each 0.7-second piece of the signature was used as a segment, based on a claim that handwriting signals tend to fall out of phase beyond 0.7 seconds. Time warping is a nonlinear correlation technique borrowed from speech recognition and is used to map segments of a test signature to segments of the corresponding reference signature by removing unwanted timing differences; the distance between two samples can then be computed. Tree matching involves building a tree of the peaks and valleys of a signature. Once the trees corresponding to the test and reference signatures are available, the distance between them may be computed in terms of the minimum number of operations needed to transform one tree into the other. Few details are provided. Each of the three techniques uses all the values of position, velocity and acceleration along the x and y directions, and no statistical features are used. The study computed velocity and acceleration values (in addition to position values) using an eleven-coefficient FIR filter. The test database consisted of 50 signatures from each of 39 student and professor volunteers, equal numbers of males and females with 18% left-handed persons. Data was collected in five sessions, usually one session each day for a week. Signatures were signed along a line parallel to the horizontal axis. On average 48 signatures remained for each signer after visual inspection of the data, which resulted in some signatures being rejected (the exact reasons for rejection are not reported). No skilled forgeries were collected, and random forgeries were used to determine FAR. The procedure involved selecting one reference signature from the database and comparing it to T other signatures of the same person selected randomly, and to T signatures from T other people, one from each, selected randomly. On the assumption that global signature size is the easiest property to forge, the pen-tip position values of the forgeries were scaled to fit the same minimum bounding rectangle as the corresponding signature.

This was done before computing the velocity and acceleration values, and the impact of the scaling on these values is not clear. Total error rates (FRR + FAR) averaged over the three types of algorithms using the six sets of data values are presented. It appears that the data in the y direction were more discriminating (which could well be because, as noted earlier, most American signatures tend to have much more variation in the y-direction than in the x-direction), and FAR with random forgeries varied from an average of 1.9% to 8.1%. The use of velocity data appears to give better results than displacement, which in turn was better than acceleration. The experiments used a different threshold for each test, selected to minimize the total error rate for that test. Further experimentation shows the effect of a number of parameters used in the three methods. For example, the effect of the permitted time lags in regional correlation is not significant, while the length of the segments is more important. Better results were obtained with 1.5-second segments when y(t) values were being compared, but this could well differ for different test databases. The effect of window length in time warping was also not significant.

The results of the same experiments that were used to compare the six sets of data values are, it appears, presented again in Parizeau and Plamondon (1990) to compare the three types of HSV techniques (viz. regional correlation, dynamic time warping and skeletal tree matching). Although the earlier results included tests using only signatures, results using 50 signatures, 50 handwritten passwords and 50 sets of initials from each of 39 student and professor volunteers are now presented. Data was collected as described above, and the comparisons were based, as discussed above, on pen-tip positions, velocities and accelerations. Again the procedure described above was followed, using one reference signature, T genuine signatures and T random forgeries. Total error rates of between 3% and 17% were observed for the three techniques, and no technique was globally superior to the other two, although regional correlation was often better, and much faster, than the other two. The results are encouraging given that only one reference signature was used, but the experiments used a different threshold for each test, selected to minimize the total error rate for that test. This makes it almost impossible to compare these results with those of other studies where the same threshold is used throughout.

In a US patent, Bechet (1990) proposes a method in which a reference writing (or a reference signature) is compared with the test writing using the x and y velocities. The method involves dividing the velocity signals into discrete time segments and then comparing the corresponding velocity segments. It is noted that the signature velocities are indisputably characteristic of the individual in spite of the variations in the signatures of a person.

Another technique, described by Crane and Ostrem (1983), involves shifting and scaling the signature so that maximum correlation is obtained between the test signature and the reference signature. Bechet argues that this correlation criterion is inadequate for HSV since two people making the same signature produce similar velocity profiles; this, however, has not been found to be true in other studies. Bechet notes that each signature was observed to consist of a number of segments, where the number is predetermined for each person's signature. Random signals were found to occur between the segments, and some parts of the segments were found to be missing. The variations in segments included variations in the positions of the segments and in their durations. The proposed technique uses 600 ms segments that are standardized and then compared. It appears that an offset value is found for which the sum of the absolute differences between the velocities of the reference signature segment and the corresponding test signature segment is minimized. This is followed by a procedure that is somewhat difficult to understand but seems to involve multiplying each velocity value of the segment by a time factor in the range of 0.75 to 1.25 and selecting the factor for which the difference between the reference segment and the test segment is minimized. Once the offset and the time factor for each segment have been computed, the segment is "standardised" by shifting and scaling it using the two factors. A total distance between the test signature and the reference signature is then obtained using the reference signature and these standardised segments, and a decision on acceptance of the test signature is made on the basis of the number of segments that pass a threshold test.
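Our reading of Bechet's segment standardization can be sketched as follows: an offset search minimizing the sum of absolute velocity differences, then a scale factor in the range 0.75 to 1.25. Since the patent description is, as noted, difficult to follow, this is an interpretation rather than a faithful reproduction, and all names are our own.

    import numpy as np

    def standardize_segment(ref_seg, test, start, max_offset,
                            factors=np.linspace(0.75, 1.25, 21)):
        """Find the offset and factor that best match a test segment to ref_seg.

        First the offset minimizing the sum of absolute velocity differences
        is chosen, then the factor (0.75-1.25) scaling the segment's velocity
        values that minimizes the residual difference.
        """
        n = len(ref_seg)
        best_off, best_err = 0, np.inf
        for off in range(-max_offset, max_offset + 1):
            s = start + off
            if s < 0 or s + n > len(test):
                continue
            err = np.abs(ref_seg - test[s:s + n]).sum()
            if err < best_err:
                best_off, best_err = off, err
        seg = test[start + best_off:start + best_off + n]
        best_f, best_err = 1.0, np.inf
        for f in factors:
            err = np.abs(ref_seg - f * seg).sum()
            if err < best_err:
                best_f, best_err = f, err
        return best_off, best_f, best_err

    def verify(ref_segments, test, seg_len, max_offset, seg_threshold, min_passing):
        """Accept if enough standardized segments fall below a distance threshold."""
        passing = 0
        for k, ref_seg in enumerate(ref_segments):
            _, _, err = standardize_segment(ref_seg, test, k * seg_len, max_offset)
            passing += err <= seg_threshold
        return passing >= min_passing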

5.2 Feature Values Comparison

A large number of features are studied by Crane and Ostrem (1983), who use a strain-gauge instrumented pen to sample three forces at the writing tip (viz. the downward force and the x and y forces) and then compute the values of a number of features. As noted earlier, the experiment initially used 44 features, which included the following: scaled means, standard deviations, minimum and maximum values, average absolute values, average positive and average negative values, numbers of positive and negative samples, number of zero-crossings, maximum minus scaled mean, and maximum minus minimum. It appears, although it is not quite clear, that the signature database used for performance evaluation of the technique was first used for selecting the best subset of features, and some 23 features were then selected as the `best' subset.

To evaluate the proposed technique, a database of 5220 genuine signatures, half signed while sitting down and the other half while standing at a counter, was collected over a four-month period from 58 subjects comprising equal numbers of men and women and a representative range of left-handers as well as of height, weight and age, together with 648 forgeries from 12 forgers who were allowed to practice the signatures to be forged. Prizes were awarded for the best forgeries. The testing consisted of an enrollment phase in which 10 or 12 sample signatures were collected and a reference vector of features was formed by computing the mean and standard deviation of each feature over the given set of sample signatures. The test signature was then compared to the reference signature and the Euclidean norm of the distance vector was computed; if this was small enough the signature was authenticated, otherwise it was rejected. The authors allow up to three trials for HSV, and a false rejection occurs only if all three signatures fail the verification test. As noted earlier, this is unlikely to be acceptable in most applications. Several sets of results are presented, but they are very hard to interpret since the definitions of FRR and FAR are not those normally used in the literature. Also, some tests removed subjects that had very high variance in their signatures (3 out of 58 subjects failed enrollment!). FAR and FRR varying from 0.5% to about 3% are reported, but all experiments appear to have allowed three tries for verification. The reference signature was continually modified: when a signature was successfully verified, its values were added to the mean reference vector with a weight of 1/8. Crane and Ostrem also discuss the possibility of using personalized feature sets for each person rather than the same features for all persons, and present some evidence that personalized feature sets can improve the performance of a signature verifier.

Lorette (1984) uses the following seven geometric and dynamic features, which are claimed to be invariant under rotations and magnifications:

1. number of connected components (i.e. one plus the number of pen-ups)
2. number of loops
3. quantified cumulated phase for the signature as a whole
4. initial direction of the pen track, coded in four quadrants
5. total duration (it is not clear whether this is invariant under magnification as claimed)
6. duration of connected components (total time minus pen-up time)
7. mean and maximum velocities in connected components

All the variables were normalized to have a mean of zero and standard deviation of one, and the distance was then computed as the Euclidean norm of the differences. A database of 203 signatures from 14 volunteers (15 signatures from almost every volunteer) was used for evaluating the technique. Lorette claims that 92% of the total variance of the test signatures was accounted for by three principal component factors. The data was classified using hierarchical classification with only five signatures from each person. The classification led to 14 classes, although it is not quite clear whether the classification method was guided towards 14 classes or the system itself discovered the number of classes. The remaining ten signatures were then assigned to the nearest classes, resulting in 91.7% correct classification. An iterative process was then used to improve the classes, raising this to 93.6% correct classification. The details of the improved classification are not clear, and no forgeries were tested. The author does not discuss the benefits of classification, or whether similar or better results could have been obtained by using the five sample signatures of each person to build that person's reference signature and then comparing the test signature with it.

Parks, Carr and Fox (1985), in a US patent, discuss equipment designed to capture the signature as well as the use of features for HSV. Initially the following six features are suggested:

1. Total elapsed time
2. Time in contact
3. Number of segments
4. Sum of increments in the x direction
5. Maximum increment in the x direction (that is, maximum x-velocity)
6. Sum of increments in y

It is proposed that a reference signature consisting of the means and standard deviations of the features be used, and that at least six sample signatures be collected. It is noted that if the six sample signatures are gathered under identical conditions, the standard deviations might be too small to be accurate estimates of the standard deviations of the person's signatures.

It is therefore proposed that the standard deviations be modified to be the means of the standard deviations of the individual and the standard deviations of the population at large. The resulting values should be used if they are found to "conform to what experience shows to be realistic limits". If the values are not "realistic", further sample signatures may be obtained, and some of the previously obtained sample signatures may be discarded if they are widely inconsistent with the others. In some cases the whole attempt to obtain a reference signature may be aborted and a new set of sample signatures obtained on another occasion! Once a satisfactory reference signature has been obtained, the test signature is captured and compared and the distance between the two computed. The details of the distance computation are not clear, but one of the proposed methods recommends that each feature value be compared with the mean reference value and the difference expressed in standard deviations. It is then suggested that this be multiplied by a weighting factor of 1, 2, 4 or 8. The result is compared to two thresholds: a high score is allocated (and the test signature rejected) if the scaled difference is above the larger threshold, a zero score is given if it is below the lower threshold, and otherwise the score is the difference between the scaled difference and the lower threshold. The sum of all scores is then compared to another threshold that is used in making the acceptance decision (a sketch of this scoring scheme is given below). A number of other suggestions are made, including different threshold values for different individuals and the possibility of basing the threshold on the value of the merchandise being bought and the credit rating of the person. Suggestions are also made about updating the reference signature by applying a weighting of 90% to the stored parameters and 10% to the new ones obtained from the test signature. The patent also includes a list of 93 features that may be used in addition to the six already proposed. It is suggested that the optimum number of parameters is between 10 and 20, but a higher number may be desirable in some situations, and somewhat different parameters may be used in different instances of the same HSV system.
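A sketch of the scoring scheme just described, with the per-feature weights, the two thresholds and the 90%/10% reference update; the concrete parameter values would be placeholders in practice.

    def feature_score(value, mean, std, weight, low_thr, high_thr):
        """Score one feature: deviation in standard-deviation units times a
        weight of 1, 2, 4 or 8, clipped by the two thresholds. None signals
        immediate rejection (scaled difference above the larger threshold)."""
        scaled = weight * abs(value - mean) / std
        if scaled >= high_thr:
            return None
        return max(0.0, scaled - low_thr)   # zero below the lower threshold

    def verify(values, means, stds, weights, low_thr, high_thr, total_thr):
        """Sum the per-feature scores and compare to the decision threshold."""
        total = 0.0
        for v, m, s, w in zip(values, means, stds, weights):
            score = feature_score(v, m, s, w, low_thr, high_thr)
            if score is None:
                return False
            total += score
        return total <= total_thr

    def update_reference(means, new_values):
        """Reference update with the suggested 90%/10% weighting."""
        return [0.9 * m + 0.1 * v for m, v in zip(means, new_values)]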

De Bruyne and Forre (1985) propose a set of 18 holistic (i.e. global) features, six of them dynamic:

- total time,
- number of pen lifts,
- writing time,
- pen-up time (pen-up time + writing time = total time, so all three of these features should not be used together),

as well as the maximum writing velocity and the time at which this velocity occurs. The static features include the following:

- area,
- proportion,
- standard deviations of the x and y values,
- ratio of the total displacements in the x and y directions.

Reference signatures were computed using 10 sample signatures. The test signature feature values are compared with those of the reference signature as well as with those of the forgers, and a maximum-likelihood test is applied. Another approach to comparing signatures is also proposed, in which each feature is given a grade between 1 and 6 depending on the distance of the feature value from the reference mean, scaled by the standard deviation.

In another US patent, Asbo and Tichenor (1987) describe a pen or stylus for capturing the signature which is claimed to permit the person to grip the stylus in any position with respect to his or her hand. The patent discusses capturing the signature samples and normalizing them by the `normalization angle', which is supposed to represent the average direction of the signature; essentially this rotates the signature. Few details are provided of how HSV would be carried out once these normalized data values have been obtained, but the patent description notes that the standard procedure of building a reference signature using a number of features could be used.

Lam and Kamins (1989) study HSV based on the Fourier transform of the signature (after considerable preprocessing), using the 15 harmonics with the highest amplitudes for verification. Unfortunately the study does not evaluate the proposed technique well, since only one genuine signature writer was used and 19 forgers tried to forge his signature. This limited evaluation gives a FRR of 0% and a FAR of 2.5%. The proposal to use the 15 largest harmonics requires that information identifying those harmonics be stored in the reference signature, and it may be better to use only the ten or twenty lowest harmonics. Also, the authors make no use of the velocity and acceleration information, which should be just as useful as the positional information, if not more so. The approach has potential, but much further work is needed.

Hastie, Kishon, Clark and Fan (1991) describe a model in which a test signature is assumed to consist of a reference signature that is transformed from occasion to occasion. The authors describe a five-step method for HSV. The steps are:

1. Smoothing - a cubic spline approximation is used to average out the measurement errors.
2. Speed - speed is computed after smoothing.
3. Time warping - a time-warp function is computed to establish the correspondence between the reference signature and the signature being verified.
4. Segmentation - the signature is segmented at low-speed regions (low speed being 15% of the mean speed) into a sequence of segments called letters.
5. Averaging - estimating the reference signature.

The test signature is processed using steps 1 to 4 above. The distance between the test signature and the reference signature is computed at the end of Step 3 and, if a decision is not made there, at the end of Step 4. The method is demonstrated using data recorded from a graphics tablet which captured the (x, y) coordinates as well as the downward pressure on the pen.

Nelson and Kishon (1991) present results of using the analysis described by Hastie et al (1991). Ten samples of genuine signatures from each of 20 subjects, and a number of forgeries for four of the 20 subjects, were used for testing the proposed dynamic HSV technique. Nelson and Kishon describe how they compute pen velocities and accelerations in the x and y directions using spline smoothing, and the further computation of path velocities and path accelerations as well as path tangent angles, tangential accelerations, centripetal accelerations, jerks and curvatures. The authors note that both the pen pressure and the speed have highly repeatable patterns in the valid signatures of a person. They present results of plotting path length against signature time in the hope of finding clusters of valid signatures well separated from forgeries; it was found that such separation did not always exist. An interesting point is made that the shape and the dynamics of a signature might play complementary roles in HSV, since a forger trying to get the shape right is unlikely to get the dynamics right, and vice versa.

The features used in this study were the following (a sketch computing several of them from sampled pen data is given after the list):

- Signature time
- Signature path length
- Root mean square (RMS) speed
- RMS centripetal acceleration
- RMS tangential acceleration
- RMS total acceleration
- RMS jerk
- Average horizontal speed
- Integral of centripetal acceleration magnitude
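Several of these measures can be computed directly from sampled pen positions. The sketch below uses simple finite differences; Nelson and Kishon used spline smoothing before differentiating, so this is only an approximation of their feature computation.

    import numpy as np

    def kinematic_features(x, y, t):
        """Compute several of the listed measures from pen samples x, y, t."""
        vx, vy = np.gradient(x, t), np.gradient(y, t)
        speed = np.hypot(vx, vy)
        ax, ay = np.gradient(vx, t), np.gradient(vy, t)
        acc = np.hypot(ax, ay)
        a_tan = np.gradient(speed, t)                       # tangential: rate of change of speed
        a_cen = np.sqrt(np.maximum(acc ** 2 - a_tan ** 2, 0.0))  # normal component
        jerk = np.hypot(np.gradient(ax, t), np.gradient(ay, t))
        dt = np.diff(t)
        rms = lambda s: float(np.sqrt(np.mean(s ** 2)))
        return {
            "signature_time": float(t[-1] - t[0]),
            "path_length": float(np.hypot(np.diff(x), np.diff(y)).sum()),
            "rms_speed": rms(speed),
            "rms_centripetal_acc": rms(a_cen),
            "rms_tangential_acc": rms(a_tan),
            "rms_total_acc": rms(acc),
            "rms_jerk": rms(jerk),
            "avg_horizontal_speed": float(np.mean(np.abs(vx))),
            "integral_centripetal_acc": float(np.sum(a_cen[:-1] * dt)),
        }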

The authors describe a two-stage verification scheme in which a first screening stage is used to reject those signatures that are obviously not close to the reference signature. The four best measures from the above list were selected for each user and used for the screening test, which appears to have worked well for the given data. The second stage is not discussed in this paper.

Mighell, Wilkinson and Goodman (1989) discuss an off-line system for detecting casual (i.e. non-professional) forgeries in which the forger has had no opportunity to practice the signature being forged. The system is evaluated using 80 genuine signatures and 66 forgeries, but most results are presented for only one person. The signatures were collected on cards which were then scanned into the computer, followed by thresholding to produce binary images that are centred and normalized to fit into a 128 x 64 matrix. A backpropagation learning algorithm is employed using a training set of 10 genuine signatures and 10 forgeries for one person, and the testing is done on 70 true signatures and 56 forgeries. A FRR of 1% with a FAR of 4% is reported, and lowering the threshold resulted in 0% FRR and 7% FAR. Although the results for this one person are impressive, they cannot be expected to apply to a larger population. Also, the performance of the technique degraded markedly if no forgeries were available for training.

The above work forms part of a PhD thesis by Wilkinson (1990), in which he studies off-line HSV with a focus on detecting casual forgeries. He uses a test database of 1190 images, comprising 590 genuine signatures from nine subjects, each contributing 50 to 70 samples collected over a period of 18 months, and 396 forgeries from 44 volunteer forgers who were shown the name of the person whose signature was being forged but not the actual signature. The images were scanned into the computer and considerable preprocessing followed. One of the proposed techniques involves building a slope histogram of a signature using 20 valid signatures and 10 forgeries for each of the nine subjects. With this slope histogram, Wilkinson attempts to exploit the regularity of the length and curvature of the signature: the overall signature shape at various angles is evaluated to form a histogram. The histograms are then passed to a classifier which compares them to histograms constructed from a small number of valid signatures. Wilkinson reports an equal error rate averaging about 7%. The second approach presented in Wilkinson's thesis is the synthetic discriminant function (SDF) approach, which selects a linear filter that produces a specified output for each image of a training set. If forgeries are included in the training set, the error rate is reduced to 4%. The two methods are then combined into an integrated system, which yields an average error rate of approximately 1% on casual forgeries and 5-6% on skilled forgeries.

In another PhD thesis, Pender (1991) explores the use of neural networks for detecting casual forgeries. The training of neural networks requires genuine signatures as well as forgeries, although the signatures of other people may be used as forgeries. A database of signatures was created in which static signature features were collected from five individuals over two years. It contained 380 genuine signatures, and the same five individuals signed 265 forgeries, knowing the name of the person whose signature was being forged but without having viewed a genuine signature. As in the earlier work, the signatures were scanned and processed, reducing them to 128 x 64 images. A FRR of 3% and a zero-effort FAR of 3% are reported.

Lee (1992) in his thesis aims to design a simple on-line system yielding good performance for point-of-sale applications. Lee developed a comprehensive database of 5603 genuine signatures from 105 human subjects and 4762 forgeries of the 105 subjects. 240 of the genuine signatures were `fast signatures', in which nine subjects were asked to sign as fast as they could, resulting in writing times 10% to 50% lower than their normal writing times. Another 210 of the genuine signatures were obtained from seven subjects standing up, each providing 20 signatures when signing at a height of 90 cm and another 10 at a height of 60 cm.

The forgeries consisted of random, skilled and timing forgeries. Random forgeries were collected when the forger knew the name and spelling of the genuine signer but had no knowledge of the genuine signature. Skilled forgers were provided with two samples of genuine signatures on a piece of paper, told that the verifier uses dynamic features of the signature, and allowed to practice on paper and tablet, with practice times varying from 3 to 20 minutes. The timing forgers were given information about the shape of the signature as well as the average genuine signature duration, and were allowed to practice on the tablet with timing feedback provided during practice. Each genuine signature was forged by two forgers, each forger contributing all three types of forgeries, with six samples of each type collected from each forger. This yielded 3744 forgeries (however, 105 x 2 x 18 = 3780; it is not clear why 3780 forgeries were not obtained). Further forgeries were collected for 22 randomly selected individuals of the 105: eight different individuals provided six skilled forgeries for each of the 22, giving 792 forgeries (although there should have been 22 x 8 x 6 = 1056). It is assumed that some of the forgeries were rejected, but it is not clear on what basis. The database also included one subject who provided 1000 genuine signatures; for this person 325 skilled forgeries were collected from 13 individuals providing 25 forgeries each. A subset of the database was designated for all experimentation. This consisted of 22 subjects, each providing 11 genuine signatures (six of which were used for the reference signature and five for testing), and 704 forgeries, eight forgers for each individual, each providing four forgeries per person. It is not quite clear how this subset was selected, nor why such a small subset was chosen from the much larger set that was available. An equal error rate of 3.8% was reported. Lee describes a set of 42 features, which include 13 static features. A number of the features relate to the velocity of signature writing, e.g. the average and maximum velocities, the durations of negative and positive velocities in the x and y directions, and the averages as well as the differences between the maxima and averages for each of the four. Lee investigates a number of problems in his thesis. These include finding a small subset of the 42 features that performs better than the whole set. He studies techniques for selecting this small subset both when no forgery data is available and when it is, and how much performance improvement is obtained if forgery data is used in determining the subset. These subsets are different for different subjects, but Lee also investigates the possibility of using the same common subset for all users. Lee also discusses a number of techniques for feature selection.

In one of the techniques, which uses only genuine signatures, the mean of each feature for subject i is compared with the means of the same feature for all other subjects, and the minimum such distance is computed for each subject for each feature. The importance of each feature for an individual is then given by this minimum distance: the larger it is, the better the feature separates that subject from the closest other subject. This algorithm is used to find several subsets for each user; subsets of 34 features performed better than the full set of 42. A similar approach is used when forgery data is available for each subject, the distance now being computed between the ith feature of the subject and the corresponding feature of the forgeries for that individual. It was shown that 23 or 24 features gave the best performance. These two experiments seem to indicate that Lee did not appreciate their different aims: the first experiment tries to find a subset that minimises the zero-effort FAR, while the second minimises the skilled FAR. Lee also attempts to find a common subset of features that is identical for all subjects. The approach used involves finding the best subsets for individuals as described above and then ordering the features by the frequency of their occurrence in the individual best subsets. The basis for this approach is not provided, and it is suspected that it may not lead to the best overall subset. The performance of the best 10 common features was significantly worse than that of the best 10 individual features for each subject. The individualised subsets of features were also tested on a larger database of 1485 genuine signatures and 608 forgeries (how these were selected is not made clear), in addition to the 165 and 152 respectively that were used in the subset selection and for reference; 1% FRR and 14% FAR were obtained. The system was also evaluated on the one individual who had signed 1000 times, giving 0.3% FRR and 17% FAR. Furthermore, the system was evaluated on the signatures that were written quickly; the performance was much worse, although normalising the feature vector improved it.
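Lee's genuine-only selection heuristic, as we read it, can be sketched as a max-min criterion over the subjects' per-feature means; feature scaling and all names here are our own assumptions.

    import numpy as np

    def feature_importance(means):
        """Rank features for each subject by distance to the nearest other subject.

        means[i][k] is the mean of feature k for subject i (features assumed
        normalized to comparable scales). For each subject and feature, the
        minimum distance to any other subject's mean is that feature's
        importance: the larger, the better the feature separates the subject
        from the closest other subject.
        """
        means = np.asarray(means, dtype=float)
        importance = np.zeros_like(means)
        for i in range(means.shape[0]):
            others = np.delete(means, i, axis=0)
            importance[i] = np.abs(others - means[i]).min(axis=0)
        return importance

    def best_subset(importance_row, k):
        """Indices of the k most important features for one subject."""
        return np.argsort(importance_row)[::-1][:k]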

Chang, Wang and Suen (1993) present a technique based on a Bayesian neural network for dynamic HSV of Chinese signatures. A set of 16 features is used, which includes the following:

- Total time
- Average velocity
- Number of segments
- Average length in the eight directions of the signature
- Width/height ratio
- Left-part/right-part density ratio
- Upper-part/lower-part density ratio

The database consisted of 800 genuine signatures from 80 people and 200 simple and 200 skilled forgeries by 10 forgers. Results show about 2% FRR, 2.5% skilled-forgery FAR and 0.1% zero-effort FAR. Unfortunately the paper does not present any details of the experiments, for example how the reference signature was computed or how the data was collected.

Mohankrishnan, Paulik and Khalil (1993) propose a HSV method based on an autoregressive (AR) model that treats the signature as an ordering of curve types. A database of 58 sample signatures from each of 16 subjects, taken over six sessions, was used for testing. No skilled forgeries were available, but random forgeries were used. Each signature was divided into eight segments and each segment modeled by an AR model. Three model elements were used for each segment, giving 24 features for each signature. Total error rates, using threshold values that gave equal FAR and FRR for each user, varied from a low of 7.92% to a high of 21.83%.

Darwish and Auda (1994) present a comparative study of 210 features they claim have been previously studied in the literature for static HSV, in an attempt to find the best feature vector for HSV of Arabic signatures. The methodology followed was to calculate the mean, standard deviation and spread percentage of every feature for every person in their database and select those features that had a spread of less than 25% for all persons. Some of the selected features were found to have very small interclass variations and were excluded. The rationale for this approach is not presented. The set of 210 features was reduced to 12. A signature database of 144 signatures from 9 signers was used; 8 samples from each signer were used for building the reference feature set and the remaining 8 were used for testing. Signatures were collected on paper at different times of the day and then scanned. A fast backpropagation neural network was used as the classifier. A FRR of 1.4% is reported; no skilled forgeries were available and no zero-effort forgeries were tested.

Fairhurst and Brittan (1994) discuss issues related to the selection of the best features: generating combinations of likely features and a metric by which to assess their relative merits and thus select the best set. The authors discuss a simple approach of selecting the best features based on the individual performance of each feature, but that assumes the features are independent, which is often not true.

Another technique, called Sequential Forward Selection (SFS), is recommended. In this technique, starting from n features (n may be zero), one feature at a time is added such that the added feature improves the performance the most. The process can be terminated when a prespecified performance is achieved or a prespecified number of features is reached. The authors then describe a parallel approach to finding the best set of features, since the SFS approach and other similar approaches are computationally intensive. The paper appears to promote the use of an individual set of features and an individual threshold for each person.

Nelson, Turin and Hastie (1994) propose a set of 25 features which include two time-related features, six features related to velocities and accelerations, four shape-related features, eight features giving the distribution density of the path tangent angles, four giving angle-sector densities of the angular changes, and a feature relating to the correlation between the two components of the pen velocity. They discuss the statistical basis of HSV and then use three different methods for computing the distance between the reference signature and the test signature, viz. the Euclidean distance method, the Mahalanobis distance method and the quadratic discriminant method. A simple method of feature selection is described, which essentially consists of computing the ratio of the standard deviation to the mean for each feature and rank-ordering the features according to this ratio (a sketch of this ranking is given below). Although this appears to be quite a reasonable approach, there is no guarantee that a feature with a small normalized standard deviation will provide good discrimination between genuine signatures and forgeries. A variety of schemes are evaluated, e.g. using the individual best 8, 10, 12 and 14 of the 25 features with the Euclidean distance model. The performance of all these sets is similar, although the individual best 8 and 10 seem to perform best near a FRR of zero. The Mahalanobis distance measure is used with the individual best ten features, but the performance is no better than with the Euclidean distance. The quadratic distance model is also evaluated; it requires genuine reference signatures as well as forgery data during training, but the results are better than those of the Euclidean or Mahalanobis models. Nelson et al also test the majority-voting model that Lee (1992) found to be the best, but found it to give results worse than the Euclidean distance model. Using the Euclidean distance method with the best 10 of the 25 features, they obtain 0.5% FRR and 14% FAR, which they consider satisfactory for credit card applications.
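The std-to-mean ranking of Nelson, Turin and Hastie is straightforward to sketch; as the text cautions, stability for the genuine writer does not guarantee discrimination against forgeries. The function name and the guard against near-zero means are our own choices.

    import numpy as np

    def rank_by_stability(samples):
        """Rank features by the ratio of standard deviation to mean.

        samples[j][k] is feature k of sample signature j for one subject.
        Features with a small normalized standard deviation are the most
        repeatable for that writer; the most stable features come first.
        """
        samples = np.asarray(samples, dtype=float)
        mean = samples.mean(axis=0)
        std = samples.std(axis=0, ddof=1)
        ratio = std / np.where(np.abs(mean) > 1e-12, np.abs(mean), 1e-12)
        return np.argsort(ratio)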

Plamondon (1994) presents a multilevel HSV system that uses global features as well as point-to-point comparison with personalized thresholds. The global features used include the total pen-down time, the percentage of pen-up time, the percentage of time when the angular velocity is positive, etc., and these features are used for the first stage of the verification. The signature is normalized using rotation and scaling, and local correlations are computed between portions of the test signature velocity values and the corresponding values of the reference signature, using segment alignment by elastic matching. This second stage is followed by a third stage involving the calculation of variations between the normalized coordinate values of the test signature and the reference signature using local elastic pattern matching. For evaluation, three signatures from each of eight individuals were used, and eight other people provided three forgeries for each of the eight genuine signers in 64 sessions, after having had access to the genuine signatures; information on the dynamics of these signatures was provided as a sound sequence. Another set of genuine signatures was obtained from six other subjects, each providing nine signatures, three of which were used as reference signatures. Tests were performed with the two databases while adjusting the discriminating function to minimize errors. It therefore appears that test signatures as well as reference signatures were used in deciding the individual thresholds that minimised the errors, and the results obtained, a FAR of 0.5% and a FRR of 0.0%, cannot be considered reliable. The testing was also very limited and the number of signatures tested small.

More recently, Gupta and Joyce (1997a) have proposed HSV algorithms with the aim of using a set of features that are simple and easy to compute; invariant under most two-dimensional transformations, e.g. rotation, slant and size; more global than local in nature, since local variations in signatures are common; and few in number, certainly fewer than 10 in all. They note that it is the dynamics of handwritten signatures, rather than the shape, that is more important. They use the following six features in the initial experiments:

1. Total time
2. Number of velocity sign changes in the x-direction
3. Number of velocity sign changes in the y-direction
4. Number of acceleration sign changes in the x-direction
5. Number of acceleration sign changes in the y-direction
6. The total pen-up time

The experiment used the following procedure:

Step 1: First, for each user, a reference signature is built by computing the mean and standard deviation of each feature value. The vectors of these means and standard deviations, R and S respectively, are used as the reference signature.
Step 2: The sample signatures are then set aside and the remaining signatures in the database are tested as follows.
Step 3: For each test signature, a value for each of the features is computed. The vector of these values is T.
Step 4: The difference between the vector T and the mean reference vector R is computed. The difference vector is normalized by dividing each of its elements by the corresponding element of S. Let this normalized vector be Z.
Step 5: A norm of Z is now computed. Initially the largest absolute value (the L-infinity norm) was used.
Step 6: If the norm of the difference vector is larger than a predefined threshold, the test signature is rejected; otherwise it is accepted.

The FRR and FAR were then computed using all the genuine signatures that did not participate in building the reference signature, together with the forged signatures in the database. In the initial experiments no random forgeries were used. The aim of achieving a FRR of 0%, or very close to it, was achieved using the above features and ten sample signatures for computing the reference signature, although it may not be possible to obtain ten sample signatures in some applications; the FRR was higher with fewer sample signatures. It is shown that time by itself is the best single discriminator, providing on its own a FRR of only 2.6% with a FAR of 16.6% for the given database. The pen-up time is also a good discriminator: using this attribute by itself, they obtained a FRR of 6.1% with a FAR of 20.9%. The number of sign changes in the x-acceleration and the y-acceleration gave a FRR of about 2% with a FAR of about 28%. Further experimentation evaluated how many sample signatures need to be used and how this number affects the performance of the method. The results seem to indicate that three sample signatures are not sufficient for building the reference signature; at least five should be used, while seven or ten would be better. Including path length in the set of attributes improves the performance of the technique, and good results were obtained when path length was included and the reference signature was built using 10 sample signatures: a FRR of about half a percent with a FAR of a little more than 10%.
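Steps 1 to 6 translate almost directly into code. The sketch below follows the procedure as described, reading Step 5 as a max-norm test; the sample feature values in the usage example are invented for illustration.

    import numpy as np

    def build_reference(sample_features):
        """Step 1: mean vector R and standard-deviation vector S from the samples."""
        samples = np.asarray(sample_features, dtype=float)
        return samples.mean(axis=0), samples.std(axis=0, ddof=1)

    def verify(test_features, R, S, threshold):
        """Steps 3-6: normalized difference vector and a max-norm threshold test."""
        T = np.asarray(test_features, dtype=float)
        Z = (T - R) / S                  # Step 4: deviation in std units per feature
        return np.max(np.abs(Z)) <= threshold   # Steps 5-6

    # Hypothetical usage with the six features listed above (total time,
    # x/y velocity sign changes, x/y acceleration sign changes, pen-up time):
    samples = [[4.8, 12, 9, 20, 15, 0.9],
               [5.1, 13, 9, 21, 16, 1.0],
               [4.9, 12, 10, 19, 15, 0.8]]
    R, S = build_reference(samples)
    print(verify([5.0, 12, 9, 20, 16, 0.9], R, S, threshold=3.0))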

Details of the values of the various features for the genuine and forged signatures are presented, and it is shown that the feature values for most forged signatures are quite random; some will therefore, by chance, happen to be close to the reference mean. Fortunately, if a forged signature has a value close to the reference mean for one feature, it is often not close to the mean for another. There are two important points arising from the Gupta and Joyce study that should be stressed:

1. A surprising number of forged signatures have feature values that are more than twenty standard deviations away from the reference signature mean; many are even more than 50 standard deviations away. This would not have been surprising for random forgeries, but all these forgeries are skilled forgeries produced by volunteers who had practiced forging the signatures.

2. The feature values of many forged signatures were far from the reference signature means for the following features: total time, the acceleration sign changes, the pen-up time, path length, and the x-acceleration zero count. For the other attributes, i.e. the two velocity sign changes, the y-acceleration zero count and the segment count, many more forged signatures had feature values close to the corresponding reference signature mean.

5.3 Capturing Shape Dynamically

Most static HSV is of course based on capturing the shape, or some aspects of the shape, of the signature. Nagel and Rosenfeld (1977) study static shape-based HSV with a cheque-clearing application in mind. Signatures were input to the computer by scanning, digitizing and processing the images of real bank cheques. A number of features were then extracted: the projections of the signature image onto the coordinate axes, obtained by summing the rows and columns of the image; high and low letters, identified on the assumption that the spelling is known (this is not always possible!); and feature values measured for the signature as a whole and for the identified letters. The authors discuss a model of handwriting and conclude that there are three types of strokes, viz. long strokes going upwards or downwards, e.g. in the letters b, d, f, h, k, l, t and f, g, j, p, q, y, z; very long strokes like those in f and j; and finally short strokes in most letters other than f, l and t. The following features were used in their experiments: the ratio of signature width to short-stroke height, the ratio of signature width to long-stroke height, the ratios of the height of a long stroke to the height of the immediately preceding short stroke and, finally, the slope features of appropriate long letters.

The study clearly assumes that a signature is a name written down in English rather than just any signature. Using only a small number of signatures, it was found that the FAR was zero and the FRR was 8% to 12%, though further investigation of this technique is needed. A major limitation of the proposed approach is that the analysis assumes that a signature is a written-out name, which is often true of people who have grown up in the USA; the technique is unlikely to be effective for European signatures or signatures written in languages other than English.

Ammar, Yoshida and Fukumara (1986) discuss HSV and note that when a forgery looks very similar to the genuine signature, most static verification techniques are ineffective. In such situations one must capture the signature as a gray-level image rather than a binary image and consider features like the following (a sketch computing several of these from a gray-level image is given after the list):

- vertical position corresponding to the peak of the vertical projection of the binary image,
- vertical position corresponding to the peak of the vertical projection of the high-pressure image (an image that contains only the pressure regions above some threshold value),
- ratio of the high-pressure regions to the signature area,
- threshold value separating the high-pressure regions from the rest of the image,
- maximum gray level in the signature,
- difference between the maximum and minimum gray levels in the signature,
- signature area in number of pixels.
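Several of these gray-level features are simple array operations. The sketch below assumes a 2-D image in which nonzero values are ink and larger values mean higher pen pressure, with the high-pressure threshold supplied externally; automatic threshold selection, which the feature list itself mentions, is not shown.

    import numpy as np

    def gray_level_features(img, hp_threshold):
        """Compute several of the listed measures from a gray-level image."""
        ink = img > 0
        high = img >= hp_threshold
        v_proj_ink = ink.sum(axis=1)    # vertical projection of the binary image
        v_proj_high = high.sum(axis=1)  # vertical projection of the high-pressure image
        return {
            "peak_row_binary": int(np.argmax(v_proj_ink)),
            "peak_row_high_pressure": int(np.argmax(v_proj_high)),
            "high_pressure_ratio": float(high.sum()) / max(int(ink.sum()), 1),
            "max_gray": int(img.max()),
            "gray_range": int(img.max()) - int(img[ink].min()) if ink.any() else 0,
            "signature_area": int(ink.sum()),
        }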

Using 200 genuine signatures from 20 individuals and 200 forgeries from 10 forgers, with each signature digitized into a 256 x 1024 array of 256 gray levels, the procedure first removed noise by averaging and thresholding. The verification procedure used the "leave-one-out" method, since the number of genuine signatures was small, building the reference signatures from the remaining signatures. The authors claim that the proposed technique resulted in 9% FRR and 7.5% FAR.

Brault and Plamondon (1993b) describe an algorithm for segmenting handwritten signatures with the aim of rebuilding the signature with a minimal number of points and producing segments as close as possible to the psychomotor reality of their execution. The work is based on their earlier work, described above, in which it is claimed that handwriting is made up of curvilinear and angular strokes that are partially superimposed. For each point of the curve, the algorithm iteratively tries to construct a vertex centered at that point using several neighbouring points. A number of angles between the lines joining the neighbouring points are calculated, and those that differ only by a small angle are considered part of the same vertex. The algorithm was applied to a set of signatures, and the authors claim that the results were generally in agreement with human perception.

Sabourin, Plamondon and Beumier (1994) present a technique involving both a local matching method and a global interpretation of the scene. The local comparison involves segmenting the signature image into a collection of arbitrarily shaped primitives followed by template matching. The system was evaluated using a database of 800 handwritten signature images from 20 writers (40 signatures each) written on white sheet paper. Of the 800 signatures, 248 were used as a training database for the system. The details of this prototyping are not very clear, since 28 signatures from each of eight signers were used (it is not clear how these were selected) and four signatures from each of seven writers were also used (again the selection criteria are unknown). This adds up to 252 (224 + 28), yet the authors claim they had 248 signatures. The training signatures, it appears, are then used in deciding individual thresholds, and system performance is then measured using 20 signatures as reference signatures for each writer. The remaining 20 signatures are used as test signatures, although some would already have been used in the training database. Very impressive results of a FRR of 1.5% and a FAR of 1.37% are obtained; unfortunately such results do not really provide much information about the real performance of the technique.

Qi and Hunt (1994) discuss static HSV based on extracting global and local features of a signature image. The following global features were used:

- height and width of the signature image,
- width of the signature image with the blank spaces between horizontal elements removed,
- slant angle of the signature,
- vertical centre of gravity of the black pixels,
- maximum horizontal projection,
- area of black pixels, and
- baseline shift of the signature image.

The following local (or grid) features, which capture the structural information of image elements, were also included:

- angle of a corner,
- curvature of an arc,
- intersections between line strokes,
- number of pixels within each grid cell.

To test the technique the authors collected 450 signatures from 25 people: 15 people were selected randomly to provide 20 genuine signatures each over the course of one month; 5 were selected randomly to provide one simple forgery for each of the 15 subjects, given only the printed name of the person; and 5 provided skilled forgeries, one for each of the 15 subjects, given samples of the genuine signatures and allowed to practice. The signatures were scanned into a computer and considerable preprocessing followed. False rejection rates varied from 3% to 11.3%, while the FAR varied from 0% to 15%.

Dimauro, Impedovo and Pirlo (1994) present the idea that a signature consists of a sequence of fundamental components delimited by abrupt interruptions, which the authors claim occur at positions that are constant in the signatures of each individual, although the number of components in one signature may differ from that in another signature of the same subject. These components are called fundamental strokes, and the technique presented carries out spectral analysis of the strokes. The authors describe a technique in which two tables are built from the reference signatures, giving the components found in each reference signature and their sequence. Since the number of components in different signatures of the same subject can differ, a clustering technique is used to find which components are present in a signature, and a sequence of these components is built. Verification involves finding the components of the test signature by clustering and then checking that the components appear in the sequence derived from the reference signatures. If the sequence does not fit, the test signature is rejected; otherwise the components are compared with those of the reference signatures.

Gupta and Joyce (1997b) describe a technique which uses dynamics to capture signature shape. In its simplest form the technique may be explained using the following symbols:

- A for a peak of the x profile,
- B for a valley of the x profile,
- C for a peak of the y profile,
- D for a valley of the y profile.

A signature may now be processed to identify all the peaks and valleys in the x and y profiles (that is, the variation of x and y with time during the signing of the signature), and each profile may be represented by a string of symbols for the peaks and valleys encountered as the profile is scanned. The x profile is therefore represented by a string like ABABABAB... and the y profile by a string like CDCDCDCD..., with each peak and valley in the two profiles having a time associated with it (the time is not shown in this very simple representation). The peak and valley times may then be used to interleave the two profile representations so that the signature shape is represented by a string like ACBADBCAD, which simply indicates that the signature's x and y profiles together had only 9 peaks and valleys (typically there might be 50 or more): an x peak followed by a y peak, then an x valley, and so on. Another way to look at this representation is as a description of the pen motion: from the initial position (an x peak) the pen first moved north-west to reach a y peak, then south-west to reach an x valley, and then turned around and moved south-east to reach an x peak and a y valley, and so on. ACBADB therefore represents a pen motion similar to the letter S written from top to bottom; the representation would be reversed if the letter S were written from bottom to top. The representation does not capture the curvature or the size of the curves that make up the letter S, so ACBADB represents many curves that look somewhat similar; the representation thus provides considerable flexibility and tolerates considerable variation in the way the pen moves.
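Constructing such a string from sampled pen data is straightforward; the sketch below assumes the x and y profiles are lists sampled at a common rate, detects local extrema, and merges the resulting events by time. The handling of plateaus and end points, and the omission of any smoothing of jitter, are simplifications of ours rather than details from Gupta and Joyce (1997b).

    def extrema(profile, peak_symbol, valley_symbol):
        # (time, symbol) pairs for the local peaks and valleys of a sampled
        # profile; the strict comparison on the left neighbour ensures a
        # flat run of samples emits at most one event.
        events = []
        for t in range(1, len(profile) - 1):
            if profile[t - 1] < profile[t] >= profile[t + 1]:
                events.append((t, peak_symbol))
            elif profile[t - 1] > profile[t] <= profile[t + 1]:
                events.append((t, valley_symbol))
        return events

    def shape_string(x_profile, y_profile):
        # Interleave x peaks/valleys (A/B) with y peaks/valleys (C/D)
        # in time order, giving a string such as "ACBADB...".
        events = extrema(x_profile, "A", "B") + extrema(y_profile, "C", "D")
        events.sort(key=lambda event: event[0])
        return "".join(symbol for _, symbol in events)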

The representation thus captures the shape as well as the direction of pen movement during signature writing. Given this flexible representation, similar strings should always be obtained for the signatures of a person in spite of minor variations among the genuine signatures. To compare two signature shapes, the symbolic representation of each signature is found, the two strings are compared, and a distance between them is computed. A number of sample signatures are used as references, the test signature is compared with each of them, and either the mean distance or the smallest distance is taken. The rationale for using the smallest distance is that the sample signatures show the habitual variations in a person's signature, so the test signature should be compared with the reference signature closest to it. The distance so computed is then compared with a threshold and, if smaller, the test signature is accepted. Performance evaluation shows that the technique captures the shape of the signature well, resulting in a small FRR but a high FAR against skilled forgeries. The technique may be combined with that of Gupta and Joyce (1997a) to build a two-stage scheme, which results in a total error rate of about 15%; this is still a little high, and further work is continuing to improve the technique.
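The excerpt does not say which string distance is used, so the sketch below takes ordinary Levenshtein (edit) distance as one plausible choice and applies the smallest-distance decision rule described above; the threshold would in practice be set per writer from the reference set.

    def edit_distance(s, t):
        # Levenshtein distance between two shape strings (one plausible
        # choice; the survey does not specify the distance used).
        previous = list(range(len(t) + 1))
        for i, cs in enumerate(s, 1):
            current = [i]
            for j, ct in enumerate(t, 1):
                current.append(min(previous[j] + 1,                # delete cs
                                   current[j - 1] + 1,             # insert ct
                                   previous[j - 1] + (cs != ct)))  # substitute
            previous = current
        return previous[-1]

    def accept(test_string, reference_strings, threshold):
        # Smallest-distance rule: accept if the closest reference
        # signature lies within the (writer-specific) threshold.
        return min(edit_distance(test_string, r) for r in reference_strings) <= threshold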

6 Conclusion

As was touched on earlier, the state of the art in HSV makes it impossible to draw definitive conclusions about which techniques are best, since:

1. Performance of a HSV system that uses different features for different individuals is better than that of a system that uses the same features for all.

2. Performance of a HSV system that uses different threshold values for different individuals is better than that of a system that uses the same threshold value for all.

3. Performance of a HSV system that uses more signatures in building the reference signature is better than that of a system that uses fewer.

4. The FRR of a HSV system that uses more than one test signature to judge whether the subject is genuine is better (lower) than that of a system that uses only one signature.

5. Performance of a HSV system that uses all the genuine signatures, including those used in performance evaluation, in building the reference signature is better than that of a system that does not use any test signatures in building the reference signature.

6. Performance of a HSV system that uses the genuine signatures as well as some or all of the forgeries used in performance evaluation, whether in building the reference signature or in deciding which features and/or what threshold to select, is better than that of a system that makes no such use of the evaluation data.

7. Performance of a HSV system that is tested on only a small database of test signatures from a small number of subjects is likely to be better than that of a system tested on a larger database with signatures from more subjects.

8. Performance of a HSV system that is tested on a database of test signatures screened to eliminate subjects with problem signatures is likely to be better than that of a system without such screening.

The survey seems to indicate that any technique using statistical features is unlikely to provide a total error rate (FAR + FRR) of less than 10% if a reasonably large signature database is used. Most research that claims much better results has been found to have weaknesses in its performance evaluation. Dynamic HSV is an active research area that is leading to new commercial products. The best techniques are likely to be based on a combination of statistical features and the shape of the signatures.

References

[1] M. Ammar, Y. Yoshida and T. Fukumara (1986). A New Effective Approach for Off-line Verification of Signatures by Using Pressure Features. IEEE Trans on Systems, Man and Cybernetics, Vol SMC-16, No 3, pp 39-47.

[2] E. L. Asbo and H. Tichenor (1987). Method and Apparatus for Dynamic Signature Verification. US Patent Number 4,646,351, Feb 24.

[3] Z. Bahri and B. V. K. V. Kumar (1988). Generalized Synthetic Discriminant Functions. Journal Opt. Soc. Am. A, Vol 5, No 4, pp 562-571.

[4] L. Bechet (1990). Method of Comparing a Handwriting with a Reference Writing. US Patent 4,901,358.

[5] J. Brault and R. Plamondon (1984). Histogram Classifier for Characterization of Handwritten Signature Dynamic. Proc of 7th International Conf on Pattern Recognition, Montreal, pp 619-622.

[6] J. Brault and R. Plamondon (1989). How to Detect Problematic Signers for Automatic Signature Verification. Int Carnahan Conf on Security Technology, pp 127-132.

[7] J. Brault and R. Plamondon (1993a). A Complexity Measure of Handwritten Curves: Modeling of Dynamic Signature Forgery. IEEE Trans on Systems, Man, and Cybernetics, Vol 23, No 2, pp 400-413.

[8] J. Brault and R. Plamondon (1993b). Segmenting Handwritten Signatures at their Perceptually Important Points. IEEE Trans on Pattern Analysis and Machine Intelligence, Vol 15, No 9, pp 953-957.

[9] P. de Bruyne and R. Forre (1985). Signature Verification using Holistic Measures. Computers & Security, Vol 4, pp 309-315.

[10] H. Chang, J. Wang and H. Suen (1993). Dynamic Handwritten Chinese Signature Verification. Proc Second Int Conf on Document Analysis and Recognition, pp 258-261.

[11] Computing Canada (1995). Vol 21, No 16, p 44.

[12] H. D. Crane and J. S. Ostrem (1983). Automatic Signature Verification using a Three-axis Force-Sensitive Pen. IEEE Trans on Systems, Man and Cybernetics, Vol SMC-13, No 3, pp 329-337.

[13] A. M. Darwish and G. A. Auda (1994). A New Composite Feature Vector for Arabic Handwritten Signature Verification. Proc IEEE Int Conf on Acoustics, Vol 2, pp 613-666.

[14] G. Dimauro, S. Impedovo and G. Pirlo (1994). Component-Oriented Algorithms for Signature Verification. Int Journal of Patt Rec and Art Int, Vol 8, No 3, pp 771-793.

[15] M. C. Fairhurst and P. Brittan (1994). An Evaluation of Parallel Strategies for Feature Vector Construction in Automatic Signature Verification Systems. Int Journal of Patt Rec and Art Int, Vol 8, No 3, pp 661-678.

[16] R. F. Farag and Y. T. Chien (1972). On-line Signature Verification. Proc Int Conf on Online Interactive Computing, Brunel University, London, p 403.

[17] P. M. Fitts (1954). The Information Capacity of the Human Motor System in Controlling the Amplitude of Movement. J Exp Psych, Vol 47, pp 381-391.

[18] G. K. Gupta and R. C. Joyce (1997a). A Study of Some Pen Motion Features in Dynamic Handwritten Signature Verification. Technical Report, Computer Science Dept, James Cook University of North Queensland.

[19] G. K. Gupta and R. C. Joyce (1997b). A Study of Shape in Dynamic Handwritten Signature Verification. Technical Report, Computer Science Dept, James Cook University of North Queensland.

[20] W. Harrison (1958). Suspect Documents. Praeger Publishers, New York.

[21] T. Hastie, E. Kishon, M. Clark and J. Fan (1991). A Model for Signature Verification. Proc IEEE Int Conf on Systems, Man, and Cybernetics, Charlottesville, pp 191-196.

[22] N. M. Herbst and C. N. Liu (1977). Automatic Signature Verification Based on Accelerometry. IBM J Res Dev, pp 245-253.

[23] O. Hilton (1956). Scientific Examination of Documents. Callaghan and Co, Chicago.

[24] O. Hilton (1992). Signatures - Review and a New View. J of Forensic Sciences, JFSCA, Vol 37, No 1, pp 125-129.

[25] J. D. Jobson (1992). Applied Multivariate Data Analysis, Volume II: Categorical and Multivariate Methods. Springer-Verlag.

[26] C. F. Lam and D. Kamins (1989). Signature Verification Through Spectral Analysis. Pattern Recognition, Vol 22, No 1, pp 39-44.

[27] F. Lamarche and R. Plamondon (1984). Segmentation and Feature Extraction of Handwritten Signature Patterns. Proc of 7th International Conf on Pattern Recognition, Vol 2, Montreal, pp 756-759.

[28] F. Leclerc and R. Plamondon (1994). Automatic Signature Verification: The State of the Art - 1989-1993. Int Journal of Patt Rec and Art Int, Vol 8, No 3, pp 643-660.

[29] L. L. Lee (1992). On-line Systems for Human Signature Verification. PhD Thesis, Cornell University.

[30] J. S. Lew (1983). An Improved Regional Correlation Algorithm for Signature Verification Which Permits Small Speed Changes Between Handwriting Segments. IBM J Res and Dev, Vol 27, No 2, pp 181-185.

[31] C. N. Liu, N. M. Herbst and N. J. Anthony (1979). Automatic Signature Verification: System Description and Field Test Results. IEEE Trans on Systems, Man and Cybernetics, Vol SMC-9, No 1, pp 35-38.

[32] G. Lorette (1984). On-line Handwritten Signature Recognition based on Data Analysis and Clustering. Proc of 7th International Conf on Pattern Recognition, Montreal, pp 1284-1287.

[33] A. J. Mauceri (1965). Feasibility Studies of Personal Identification by Signature Verification. Report No SID 65 24 RADC TR 65 33, Space and Information System Division, North American Aviation Co., Anaheim, Calif.

[34] B. Miller (1994). Vital Signs of Identity. IEEE Spectrum, pp 22-30.

[35] D. A. Mighell, T. S. Wilkinson and J. W. Goodman (1989). Backpropagation and its Application to Handwritten Signature Verification. Adv in Neural Inf Proc Systems 1, D. S. Touretzky (ed), Morgan Kaufmann, pp 340-347.

[36] N. Mohankrishnan, M. J. Paulik and M. Khalil (1993). On-line Signature Verification using a Nonstationary Autoregressive Model Representation. 1993 IEEE Int Sym on Circuits and Systems, pp 2303-2306.

[37] R. N. Nagel and A. Rosenfeld (1977). Computer Detection of Freehand Forgeries. IEEE Trans on Computers, Vol C-26, No 9, pp 895-905.

[38] W. Nelson, W. Turin and T. Hastie (1994). Statistical Methods for Online Signature Verification. Int Journal of Patt Rec and Art Int, Vol 8, No 3, pp 749-770.

[39] W. Nelson and E. Kishon (1991). Use of Dynamic Features for Signature Verification. Proc IEEE Int Conf on Systems, Man, and Cybernetics, Charlottesville, pp 201-205.

[40] A. S. Osborn (1929). Questioned Documents. Boyd Printing Co., Albany, NY, 2nd Edition.

[41] M. Parizeau and R. Plamondon (1990). A Comparative Analysis of Regional Correlation, Dynamic Time Warping, and Skeletal Tree Matching for Signature Verification. IEEE Trans on Pattern Analysis and Machine Intelligence, Vol 12, No 7, pp 710-717.

[42] J. R. Parks, D. R. Carr and P. F. Fox (1985). Apparatus for Signature Verification. US Patent Number 4,495,644.

[43] D. A. Pender (1991). Neural Networks and Handwritten Signature Verification. PhD Thesis, Department of Electrical Engineering, Stanford University.

[44] R. I. Phelps (1982). A Holistic Approach to Signature Verification. Proc of the Sixth International Conf on Pattern Recognition, IEEE, p 1187.

[45] R. Plamondon and G. Lorette (1989). Automatic Signature Verification and Writer Identification - The State of the Art. Pattern Recognition, Vol 22, pp 107-131.

[46] R. Plamondon (1994). The Design of an On-line Signature Verification System: From Theory to Practice. Int Journal of Patt Rec and Art Int, Vol 8, No 3.

[47] R. Plamondon and M. Parizeau (1988). Signature Verification from Position, Velocity and Acceleration Signals: A Comparative Study. Proc of 9th International Conf on Pattern Recognition, Vol 1, Rome, Italy, pp 260-265.

[48] R. Plamondon (1993). Looking at Handwriting Generation from a Velocity Control Perspective. Acta Psychologica, Vol 82, pp 89-101.

[49] R. Plamondon, A. M. Alimi, P. Yergeau and F. Leclerc (1993). Modelling Velocity Profiles of Rapid Movements: A Comparative Study. Biological Cybernetics, Vol 69, pp 119-128.

[50] Y. Qi and B. R. Hunt (1994). Signature Verification using Global and Grid Features. Pattern Recognition, Vol 27, No 12, pp 1621-1629.

[51] R. Sabourin, R. Plamondon and L. Beumier (1994). Structural Interpretation of Handwritten Signature Images. Int Journal of Patt Rec and Art Int, Vol 8, No 3, pp 709-748.

[52] R. L. Sherman (1992). Biometric Futures. Computers & Security, Vol 11, pp 128-133.

[53] J. Sternberg (1975). Automated Signature Verification using Handwriting Pressure. 1975 WESCON Technical Papers, No 31/4, Los Angeles.

[54] T. S. Wilkinson (1990). Novel Techniques for Handwritten Signature Verification. PhD Thesis, Department of Electrical Engineering, Stanford University.

[55] M. Yasuhara and M. Oka (1977). Signature Verification Experiment Based on Nonlinear Time Alignment: A Feasibility Study. IEEE Trans on Systems, Man and Cybernetics, Vol SMC-7, No 3, pp 212-216.

[56] K. P. Zimmermann and M. J. Varady (1985). Handwriter Identification from One-bit Quantized Pressure Patterns. Pattern Recognition, Vol 18, No 1, pp 63-72.
