Multinational License Plate Recognition System: Segmentation and Classification

Vladimir Shapiro
Orbograph Ltd., P.O. Box 215, Yavne 81102, Israel
[email protected]

Georgi Gluhchev
Institute of Information Technologies, 1113 Sofia, Akad. G. Bonchev St., bl. 29A
[email protected]

Abstract

Image-based Car License Plate Recognition (CLPR) systems provide an inexpensive automatic solution for remote vehicle identification. The localization stage of the CLPR yields a gray-scale plate clip with printed characters. This paper describes a method for segmenting the plate clip into isolated characters, for feature extraction, and for classification. The method is independent of character size, thickness and illumination, and is capable of handling plates from various countries. It makes extensive use of the gray-scale information and is robust to breaks in character connectivity. It is also tolerant of character deformations such as shear and skew. Promising results have been obtained on Israeli and Bulgarian plates.

1. Introduction

While the first industrial automatic CLPR systems began emerging in the 1980s, an outburst of commercial systems has occurred in the past 10 years [1,5,6,11]. Although hundreds of CLPR systems are available on the market worldwide, research and development still continue and new sophisticated solutions appear. This is due to the growing demand for automatic vehicle identification, required for traffic and border control, calculation of parking time and payment, and the search for stolen cars or unpaid fees; the requirement for reliable identification under different lighting conditions and in the presence of random or structured noise in the plate; and nationality-specific features concerning plate size and font.

A CLPR system can be conceptually considered as containing two major components:
- License Plate Localization (LPL)
- License Plate Optical Character Recognition (LPOCR)

The LPOCR has the potential to serve as a verifier, indicating whether the image fragment clipped by the LPL, referred to as a "plate candidate", comprises the actual plate. Otherwise the LPL attempts to find a better candidate [10].

It would be tempting to adopt a commercial "off-the-shelf" OCR as the recognition "engine" of an LPOCR system. Available today at a very affordable price, such commercial OCRs are designed primarily to meet the needs of printed-page processing and recognition, where a bright background of uniform intensity is implicitly assumed. Achieving read rates of 98-99% is not unusual in that setting. Unfortunately, such OCR packages fail to segment the plate strip properly in very many cases. License plate imagery is equivalent to text scanned at very low resolution, additionally hindered by a non-homogeneous background with a multimodal intensity distribution and by significant, often unpredictable, noise factors. Various statistics useful for text segmentation, such as the text pitch [8], cannot be reliably estimated on a short plate string. All this causes OCR performance to drop abruptly.

Plate segmentation, i.e. separation into characters, seems to be the most challenging LPOCR task. Segmentation based on connected component analysis [6] would fail if characters are broken or binarization is imperfect. It is assumed in [11] that character connected components are typically well separated, as no more than two characters are connected. This assumption seems too optimistic for real life.

This paper describes the LPOCR component of the CLPR system. Its scope is defined in Section 2; Section 3 considers feature extraction; plate clip segmentation is explored in Section 4. Classification and post-processing procedures are presented in Sections 5 and 6, respectively. Discussion and conclusions follow.

2. Scope of the Method

Figure 1. Examples of a typical source image and localized plates.

A typical output of the LPL stage [10] after deskewing is shown in Figure 1. In the described system, the LPOCR has a much broader responsibility than just conventional recognition of isolated characters at the final stage of the work. The paradigm implies multi-purpose use of the recognition: at the plate localization stage, the recognition serves as a "verifier" of the "plate candidates" provided by the segmenter, which generates a number of such candidates. Once the best plate candidate has been approved, the recognition is activated in an accuracy-optimized mode to achieve optimal recognition results.

3. Training and Feature Extraction

The training process is partially supervised in the sense that character candidates segmented automatically are visually verified to contain exactly one character. Then each isolated character is manually associated with the corresponding class. At the recognition stage, when the system is already trained, the automatic classifier plays the role of the automatic verifier.

Training is performed with character templates of mixed countries of origin. Each template keeps, however, its origin-country label, which later allows determination of the plate nationality.

Size and intensity normalization is carried out. Each character is warped to a matrix of predefined size, equal for training and recognition, here 32 x 60. The gray-scale matrix is normalized through histogram stretching.
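To make the normalization concrete, here is a minimal Python sketch, assuming OpenCV for the warp and a simple linear histogram stretch; the paper does not specify the exact interpolation or stretching variant, so normalize_char and its details are illustrative:

    import cv2
    import numpy as np

    def normalize_char(clip, size=(32, 60)):
        """Warp a gray-scale character clip to a fixed 32 x 60 matrix and
        stretch its histogram to the full [0, 255] range.
        The interpolation and stretch choices are assumptions."""
        resized = cv2.resize(clip, size, interpolation=cv2.INTER_AREA)
        lo, hi = int(resized.min()), int(resized.max())
        if hi == lo:                      # flat clip: nothing to stretch
            return np.zeros_like(resized)
        stretched = (resized.astype(np.float32) - lo) * (255.0 / (hi - lo))
        return stretched.astype(np.uint8)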

3.1. Thinned Object Representation

A line-thickness-independent feature $S(X)$, called the Object Thinned Representation (OTR), is chosen as a shape descriptor. Let us assume that the character rectangle consists of an object $X$ of cardinality $K_X$, which is $X$'s area, and a background $\bar{X}$, the object's complement set, of cardinality $K_{\bar{X}}$. $S(X)$ is iteratively evaluated according to formula (1), where $B$ is a structuring element, $\circ$ is a thinning operation (see [2]), $A$ is the rectangle's area, and $w_X$ is a coefficient:

$$S(X) = X \circ B_{i \bmod 4}, \quad \text{while } K_X > w_X A, \quad i = 0, 1, \ldots \qquad (1)$$

$$B_0 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad B_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad B_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \quad B_3 = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

$w_X$ was empirically set to 0.25. Eq. (1) says that the object iteratively shrinks as $i$ increments until the area of $X$ contracts to a $w_X A$ portion of the rectangle area $A$. This technique does not preserve object connectivity, which is not critical for the classification method.

The OTR is somewhat reminiscent of a skeleton. It is sensitive to a certain degree to the boundary profile and tends to produce undesirable spurious shapes. An algorithm [3] with a proper setting of parameters to inhibit small but spiky boundary behavior could be tried as an alternative.
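A minimal Python sketch of the OTR computation under these definitions follows; scipy's directional binary erosion stands in for the thinning operator of [2], so the exact shrinking behavior is an assumption, and the structuring elements are those reconstructed in Eq. (1):

    import numpy as np
    from scipy.ndimage import binary_erosion

    # Structuring elements B0..B3 of Eq. (1): the centre pixel plus one
    # 4-neighbour (north, east, south, west), cycled via i mod 4.
    SES = [
        np.array([[0, 1, 0], [0, 1, 0], [0, 0, 0]]),  # B0
        np.array([[0, 0, 0], [0, 1, 1], [0, 0, 0]]),  # B1
        np.array([[0, 0, 0], [0, 1, 0], [0, 1, 0]]),  # B2
        np.array([[0, 0, 0], [1, 1, 0], [0, 0, 0]]),  # B3
    ]

    def otr(mask, w=0.25, max_iter=200):
        """Shrink a boolean object mask until its area K_X falls to a
        w*A portion of the rectangle area A, per Eq. (1)."""
        area = mask.size                  # rectangle area A
        x, i = mask.astype(bool), 0
        while x.sum() > w * area and i < max_iter:
            x = binary_erosion(x, structure=SES[i % 4])
            i += 1
        return x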

3.2. Characteristic Background Spots (CBS)

Sometimes a clip submitted for recognition represents a homogeneous region, mostly or fully populated by approximately uniform gray levels. Such a clip obviously needs to be rejected, and a single OTR vector would not be sufficient to do so. This is the reason we introduce another feature vector in addition to the OTR. The other potentially problematic situation, which the CBS is intended to resolve, is the case when the character in question represents a subset of the template, e.g. a "3" in question is matched to the template of "8".

Figure 2. (a) An original character; (b) its OTR template; (c) its CBS template.

The CBS describes the "shape" of the part of the character image rectangle occupied by the background. The CBS is obtained by shrinking the character background for a number of iterations (a natural way of implementing this is morphological erosion). The shrinking continues until a certain percentage of the background area remains unpopulated by object pixels. The number of iterations required to obtain the CBS depends on the initial $K_{\bar{X}}$. Formally, the CBS, denoted $S(\bar{X})$, is expressed as:

$$S(\bar{X}) = \bar{X} \circ B_{i \bmod 4}, \quad \text{while } K_{\bar{X}} > w_{\bar{X}} A, \quad i = 0, 1, \ldots \qquad (2)$$

$w_{\bar{X}}$ was set to 0.2. Eq. (2) says that the background iteratively contracts as $i$ increments until the background area $K_{\bar{X}}$ contracts to a $w_{\bar{X}} A$ portion of the rectangle area $A$. Both the OTR and CBS representations are invariant, to a large degree, to the original character thickness.
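Under the same assumptions as the OTR sketch above, the CBS of Eq. (2) is the identical shrinking routine applied to the complement mask:

    # CBS: shrink the background (complement of the boolean character
    # mask char_mask) until its area falls to 0.2 of the rectangle,
    # per Eq. (2). char_mask is an assumed input, as above.
    cbs = otr(~char_mask, w=0.2)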

4. The Segmenter

The segmenter aims at separating the plate clip into isolated character bounding boxes; zero overlap is assumed. The algorithm is based on adaptive iterative thresholding and connected component analysis. Due to non-homogeneous illumination and/or the presence of stamps, bolts and other noise on the plate, characters might occupy more than one connected component. Most frequently the components are situated one above the other. Thus we prepare an "extended" bounding box of the component by spanning it to the plate bottom. A separated component residing below might, however, belong to noise, see Figure 4f. To solve the problem of disconnectedness, the character classification algorithm (see Section 5) works on both the original and the extended connected components, see e.g. Figures 4c and 4d, and accepts the one with the higher confidence level.

Plate Character Isolation Algorithm Outline (Figure 3)

assess global threshold Tg [9] and binarize the image
while (right plate boundary not reached) begin
    find connected components         // on the rest of the plate
    if (one character) begin          // based on width and proportions
        perform classification
        jump to the next component
        reassess global threshold Tg  // on the rest of the plate
    end
    else                              // two or more characters
        reduce Tg by ΔT               // reduces connectivity
end

Note: extra boundary peeling might be required at the preliminary stage to eliminate some margins around the plate.
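A simplified Python sketch of this loop, assuming OpenCV, is given below; Otsu's method [9] supplies the initial Tg, but unlike the paper's outline the whole clip is rethresholded at each step, and the character-likeness constants are assumptions:

    import cv2
    import numpy as np

    def looks_like_one_character(stat, plate_shape):
        # Width and height tests; the constants are assumptions.
        x, y, w, h, area = stat
        return w < 0.25 * plate_shape[1] and h > 0.4 * plate_shape[0]

    def isolate_characters(plate, delta_t=3, min_tg=10):
        """plate: 8-bit gray-scale plate clip; returns (x, y, w, h) boxes."""
        tg, _ = cv2.threshold(plate, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        while tg > min_tg:
            # Dark characters on a bright plate become foreground.
            _, binary = cv2.threshold(plate, tg, 255, cv2.THRESH_BINARY_INV)
            n, _, stats, _ = cv2.connectedComponentsWithStats(binary)
            boxes = [tuple(stats[i, :4]) for i in range(1, n)
                     if looks_like_one_character(stats[i], plate.shape)]
            merged = any(stats[i, cv2.CC_STAT_WIDTH] > 0.25 * plate.shape[1]
                         for i in range(1, n))
            if boxes and not merged:   # every blob looks like one character
                return sorted(boxes)   # left-to-right order
            tg -= delta_t              # lower Tg to reduce connectivity
        return []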

As a result of segmentation, each successfully isolated and approved character's bounding box is cut out from the gray-scale plate clip and presented to the classification algorithm described below.

Figure 3. Sequential stages of plate segmentation as a function of adaptive local thresholding ($\Delta T = 3$); plate colors are inverted for better visibility. [The figure shows the gray-scale clip and successive binarizations at $T_g$ = 90, 87 and 81.]

5. The Classifier

An unknown character segmented by the algorithm described in Section 4 is normalized to a predefined size. Thus the unknown character and the templates (OTR and CBS) are of the same size and can be analyzed element-wise. Each of the two templates serves as a "mask" for extracting the gray-scale pixels from the zones in which the template "hits" the unknown character matrix; see Figure 5. When matching the unknown character against its intra-class feature vectors, it is reasonable to anticipate that the OTR would mask primarily pixels belonging to the character, and the CBS mostly background ones. Conversely, when the unknown character matrix is matched to the vectors of other classes, the OTR would hit many background pixels and the CBS many object pixels. To estimate the similarity quantitatively, we introduce the following distance measure $D_k$ between an unknown character gray-level matrix $G$ and a template $T_{C_k}$, where $C_k$ is a class label and $N$ is the number of classes, i.e. the allowed characters in the license plate alphabet:

$$D_k = w_{OTR} \, Var(G \cap T_{C_k}^{OTR}) + w_{CBS} \, Var(G \cap T_{C_k}^{CBS}), \quad k = 0, \ldots, N \qquad (3)$$

where $Var$ is the variance; $w_{OTR}$ and $w_{CBS}$ are weight coefficients set to 1.0 and 0.9, respectively. $D_k$ is expected to be minimal for the unknown character's native class $C_k$:

$$C_k = \arg\min_k (D_k) \qquad (4)$$

Plates whose minimal distance remains high do not resemble text-like information [7] and are thus rejected.
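An illustrative Python sketch of Eqs. (3)-(4) follows, interpreting $G \cap T$ as selecting the gray levels of $G$ under a boolean template mask, per the "masking" description above; the dictionary-based API is an assumption:

    import numpy as np

    def classify(g, otr_templates, cbs_templates, w_otr=1.0, w_cbs=0.9):
        """Return the class label minimizing D_k of Eq. (3).
        g: normalized gray-level matrix; *_templates: boolean masks per
        class label, same shape as g (hypothetical structures)."""
        best_label, best_d = None, float("inf")
        for label, otr_mask in otr_templates.items():
            d = (w_otr * g[otr_mask].var()                 # Var(G ∩ T_OTR)
                 + w_cbs * g[cbs_templates[label]].var())  # Var(G ∩ T_CBS)
            if d < best_d:
                best_label, best_d = label, d
        return best_label, best_d                  # Eq. (4): argmin_k D_k

A high best_d for the winning class would then trigger the rejection described above.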

Figure 4. Non-connected components belonging to the same character: (a),(b) source plates; (c),(e) connected components found; (d) extension led to proper character aggregation; (f) extension led to introducing extra noise.


Figure 5. (a) License plate gray-scale matrix; (b) character after segmentation; (c) character's bounding box clipped from (a); (d) same, after size normalization; (e) its masking by one of the OTR templates of class "9"; (f) its masking by the corresponding CBS template.

Compared to OCR carried out on a purely black-and-white image matrix of normalized size, the gray-scale recognition gives a boost of 10-12%. The above-described classification procedure is fully generic and not limited to the license plate context: it could serve for the recognition of any printed text in a noisy environment.

6. The Postprocessor

The postprocessor is responsible for refining recognition results by making use of the specific license plate context. The regular classification mode, used extensively at the segmentation phase, is the so-called "international" mode, in which templates of all countries of origin are mixed within the same pool. Although the algorithm is, to a large degree, independent of the template font used, more precise classification results are obtained when "native" country templates are applied in matching. Thus the postprocessor automatically determines the country "locale" of the current plate. The system has been tested on plates from Israel and Bulgaria, and its architecture is open to further geographic extension.

The benefit of applying the proper country locale is also due to differences in country alphabets: both contain digits, while the Bulgarian alphabet additionally incorporates several capital letters. Implementation-wise, the majority of characters recognized in the "international" mode, plus some grammar-inspired clues, determine the country of origin. An additional semantic-based refinement pass of classification picks up only the native country templates, which improves the read rate further.

Figure 6. (a),(b) Bulgarian plate characters; (c),(d) the same characters from Israeli plates.

The semantic rules vary from country to country. In Bulgaria, plates start with one or two letters, followed by a space, several digits and a couple of letters, see e.g. Figure 1. The grammar model could be schematically designated as "L[L] DDDDLL", where "L" and "D" stand for a letter and a digit, respectively. Israeli plates strictly follow the rule "DD-DDD-DD" (see the regular-expression sketch below).

On a test sample of 500 plates, 81.2% of the plate candidates were approved at the localization stage. The percentage of wrongly approved plates (false positives) was 0%; the remaining plates were rejected. The character-level read rate versus misread rate was 85.2% / 3.4%.
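The two grammars can be read as regular expressions; the Python sketch below is a hedged illustration, where the country codes and any digit counts beyond the schematic patterns are assumptions:

    import re

    # Plate grammars of Section 6 as regular expressions:
    # "L[L] DDDDLL" (Bulgaria) and "DD-DDD-DD" (Israel).
    GRAMMARS = {
        "BG": re.compile(r"^[A-Z]{1,2} \d{4}[A-Z]{2}$"),
        "IL": re.compile(r"^\d{2}-\d{3}-\d{2}$"),
    }

    def guess_locale(plate_text):
        """Return the first country code whose grammar matches, else None."""
        for country, pattern in GRAMMARS.items():
            if pattern.match(plate_text):
                return country
        return None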

7. Discussion and Conclusions

The above-described system achieves several important goals:
- It allows very reliable verification of a plate candidate generated at the localization phase.
- It adaptively segments the plate image, coping with tough illumination conditions and image distortions.
- It classifies gray-scale characters of variable size and resolution with very reasonable accuracy.

As mentioned in Section 1, OCR packages are incapable of delivering a "magic-wand" solution; however, they could be useful once the segmentation task is completed and the characters are properly separated and clipped. Then a relatively high read rate could be anticipated: at least on most plates readable by the human eye, we could expect read rates of 90% and higher. The experiments are ongoing.

Further directions of the research lie in applying approaches known in the context of conventional OCR/ICR systems as "multi-expert" combination, or "voting" [4]. Using RGB cameras with plate background/foreground colors known in advance would allow higher precision in character isolation. Applying the algorithm to multiple frames of a plate video stream would yield an additional gain in accuracy [1].

8. References

[1] Y.T. Cui and Q. Huang, "Extracting Characters of License Plates from Video Sequences", Machine Vision and Applications, vol. 10, 1998, pp. 308-320.
[2] A.K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, Englewood Cliffs, NJ, 1989.
[3] W-B. Goh and K-Y. Chan, "Part-Based Shape Recognition Using Gradient Vector Field Histograms", CAIP'03, LNCS 2756, 2003, pp. 402-409.
[4] Y.S. Huang and C.Y. Suen, "Combination of Multiple Experts for the Recognition of Unconstrained Handwritten Numerals", IEEE Trans. PAMI, vol. 17, 1995, pp. 90-94.
[5] L. Jilin, M. Hongqing, L. Peihong, "A High Performance License Plate Recognition System Based on the Web Technique", IEEE Conf. on Intelligent Transportation Systems, Oakland, CA, 2001, pp. 14-18.
[6] S-H. Lee, Y-S. Seok, E-J. Lee, "Multi-National Integrated Car-License Plate Recognition System Using Geometrical Feature and Hybrid Pattern Vector", Conf. on Circuits/Systems, Computers and Communications, Phuket, Thailand, 2002, pp. 1256-1259.
[7] R. Lienhart and W. Effelsberg, "Automatic Text Segmentation and Text Recognition for Video Indexing", ACM/Springer Multimedia Systems, vol. 8, 2000, pp. 69-81.
[8] Y. Lu, "Machine Printed Character Segmentation - An Overview", Pattern Recognition, vol. 28, 1995, pp. 67-80.
[9] N. Otsu, "A Threshold Selection Method from Gray-Level Histograms", IEEE Trans. Syst., Man, Cybern., vol. 9, 1979, pp. 62-66.
[10] V. Shapiro, G. Gluhchev, S. Bonchev, V. Velichkov, "License Plate Localization", Automatics and Informatics'03, Sofia, 2003, pp. 41-43.
[11] M. Shridhar, J.W.V. Miller, G. Houle, L. Bijnagte, "Recognition of License Plate Images: Issues and Perspectives", ICDAR, Bangalore, India, 1999, pp. 17-20.
