Handwritten Arabic Word Recognition based on Ridgelet Transform and Support Vector Machines

Hassiba Nemmour and Youcef Chibani
Signal Processing Laboratory, Faculty of Electronic, University of Houari Boumediene, Algiers, Algeria

Abstract

In this paper, we propose a system for handwritten Arabic word recognition based on the Ridgelet transform and SVM classifiers. Specifically, the One-Against-All SVM multi-class implementation is used to recognize several Arabic words. In addition, Ridgelet features are compared with Radon transform features as well as with the results obtained for normalized word images. Experiments are performed on samples extracted from the IFN/ENIT database. The results highlight the reliability of the Ridgelet-SVM combination for handwritten Arabic word recognition.

1. Introduction

In many document analysis applications, the word is the basic unit of the recognition task. However, despite impressive progress in handwriting recognition, the accuracy of word recognition systems remains far from human performance. Owing to the high variability of writing styles and tools, and to the unconstrained nature of handwriting, words are quite complex patterns [1]. There are two approaches to handwritten word recognition [2,3]. The first is analytic: the word is segmented into isolated characters and then recognized through character recognition. This approach is useful for problems with a very large vocabulary, where it becomes impossible to assign each word to its own class. The second approach is called holistic, since it considers the whole word as a single entity corresponding to a class of interest. Holistic analysis is suited to problems with small and medium vocabularies, such as literal amount reading on bank checks [2,4]. Another advantage of this approach is that it captures all co-articulation and variability effects in word images [1].

In recent years, various research works have been carried out to enhance handwritten Latin word recognition [2,5]. More recently, growing interest has been devoted to Arabic script [4,6,7,8,9]. Nevertheless, unlike Latin, Arabic has a set of particularities that make its recognition especially challenging. For instance, Arabic is semi-cursive, since a single word can include several cursive sub-words or connected components. In addition, several diacritics are used in Arabic writing. Moreover, character shapes change according to their position within the word [10].

In the present work, we are interested in Arabic word recognition. Research in this field was inspired by methods already used for Latin word recognition. Thus, Hidden Markov Models (HMMs) were used extensively, especially for problems with a very large vocabulary [2,3,4,5,6,10,11]. Over the past years, HMMs and then neural networks were the classical choice for the recognition task. Currently, Support Vector Machines (SVMs) are considered the best candidate for handwriting recognition tasks with a medium number of classes [12]. The literature reveals extensive use of SVMs, especially for handwritten digit recognition [13,14].

For feature extraction, earlier works employed statistical descriptors such as relative positions, mean and variance computed on profiles or histograms [15]. Because of the diacritics and spaces within Arabic words, structural features computed on the skeletons of connected components, such as the number of loops and concavities, are more reliable [4]. Recently, some robust transforms such as wavelet packets have shown satisfactory results [15].

Thus, in this work we propose the use of SVMs for handwritten Arabic word recognition. In addition, we investigate the applicability of the Ridgelet transform for feature extraction. The Ridgelet transform is obtained by applying a one-dimensional wavelet transform to Radon coefficients. Since there is no reference using the Radon transform for handwritten Arabic words, the Ridgelet transform is evaluated against the results obtained with Radon features.

The paper is organized as follows. Section 2 gives a brief presentation of SVM classifiers. Section 3 presents the Ridgelet transform, and the experimental results are given in Section 4. The final section gives the main conclusions of the paper.

2. Support vector machines

SVMs are designed for solving binary classification problems. Their training consists of finding the optimal separating hyperplane between two classes [16]. Specifically, let $\{P_n, y_n\} \in \mathbb{R}^M \times \{\pm 1\}$ be a set of training patterns, where $M$ is the data dimension, $n = 1, \dots, N_c$, and $N_c$ is the number of samples in the class $c$, and let $f : \mathbb{R}^M \rightarrow \{\pm 1\}$ be a set of functions. The training process selects the function $f$ which maximizes the margin between the two classes [17]. Data are then classified according to the sign of the decision function:

$$ f(P) = \mathrm{sign}\left( \sum_{i=1}^{SV} y_i \alpha_i \, k(P_i, P) + b \right) \qquad (1) $$

The optimal hyperplane corresponds to $f(P) = 0$, that is, to a vanishing argument of the sign function, while $b$ is a bias. $SV$ is the number of support vectors, which are the training data whose Lagrange multipliers $\alpha_i$ are not zero. The kernel $k(\cdot,\cdot)$ is any mathematical function respecting Mercer's conditions [17]. The Radial Basis Function (RBF) kernel commonly provides the best performance for pattern recognition applications. For two samples $P_1$ and $P$, this kernel is expressed as:

$$ k(P_1, P) = \exp\left( -\frac{1}{2\sigma^2} \, d(P_1, P) \right) \qquad (2) $$

$$ d(P_1, P) = \| P_1 - P \|^2 \qquad (3) $$

where $\sigma$ is user-defined.

For solving multi-class problems, two multi-class implementations of SVMs are commonly used [18]. The One-Against-All (OAA) approach, which is the earliest, trains C binary SVMs to solve a problem with C classes. The second approach is the One-Against-One (OAO), which employs C×(C-1)/2 SVMs, each designed to separate two classes.

3. Feature extraction

Presently, we investigate the applicability of the Ridgelet transform for describing handwritten Arabic word characteristics. The use of global transforms for handwriting recognition is not new. After the Fourier and Hough transforms, more efficient transformations such as wavelets and wavelet packets were used for handwritten Arabic word recognition [15]. Nevertheless, the wavelet transform performs well for describing point singularities, while handwritten words are composed of linear singularities. The Ridgelet transform was introduced to image processing precisely in order to improve the characterization of line singularities [19]. The Ridgelet is based on the Radon transform, which is computed according to several angular directions. Radon coefficients correspond to projections representing the shadow of the shape at each angle [20]. Consequently, linear singularities in any direction are expressed by high magnitudes. Thereafter, in order to better characterize linear singularities, the one-dimensional wavelet transform is applied to the Radon slices to yield the Ridgelet coefficients. Hence, the Ridgelet is constant along the lines $x_1 \cos\theta + x_2 \sin\theta = \mathrm{const}$ (the ridges), while in the direction orthogonal to these ridges it is a wavelet [19].

For an image $f(x_1, x_2)$, the Ridgelet transform can be computed according to the following steps:

1) Compute the Radon transform such as [19]:

$$ RA(\theta, r) = \iint f(x_1, x_2) \, \delta(x_1 \cos\theta + x_2 \sin\theta - r) \, dx_1 \, dx_2 \qquad (4) $$

where $\delta$ is the Dirac distribution, $\theta$ the angular variable and $r$ the radial variable.

2) Apply the 1D wavelet transform on each Radon slice in order to obtain the Ridgelet coefficients $R(\theta, r)$.
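The two steps above can be sketched in plain Python, written for clarity rather than speed. The paper does not specify the discretization of Eq. (4) or the wavelet family, so everything concrete here is an assumption made for illustration: the function names `radon_matrix`, `haar_step` and `ridgelet_features`, the binning of pixels into radial cells, the choice of the Haar wavelet, a single decomposition level, and the toy image.

```python
import math


def radon_matrix(image, n_angles=32, n_radial=32):
    """Discrete approximation of Eq. (4) (assumed binning scheme).

    image is a list of rows (index x2) of pixel values (index x1).
    Returns n_angles slices, each a list of n_radial projection sums.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0   # centre the coordinates
    r_max = math.hypot(w, h) / 2.0          # |r| never exceeds this bound
    slices = []
    for a in range(n_angles):
        theta = math.pi * a / n_angles      # directions in [0, pi)
        c, s = math.cos(theta), math.sin(theta)
        proj = [0.0] * n_radial
        for i2, row in enumerate(image):
            for i1, value in enumerate(row):
                if value:
                    # The Dirac term keeps the pixels on x1*cos + x2*sin = r.
                    r = (i1 - cx) * c + (i2 - cy) * s
                    k = int((r + r_max) / (2.0 * r_max) * n_radial)
                    proj[min(k, n_radial - 1)] += value
        slices.append(proj)
    return slices


def haar_step(signal):
    """One level of the orthonormal Haar transform of an even-length list."""
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def ridgelet_features(image, n_angles=32, n_radial=32):
    """Step 1 then step 2: Radon slices, then a 1D wavelet on each slice."""
    features = []
    for radon_slice in radon_matrix(image, n_angles, n_radial):
        approx, detail = haar_step(radon_slice)
        features.extend(approx + detail)
    return features


# Toy 20x70 "word image" containing a single horizontal stroke.
image = [[0.0] * 70 for _ in range(20)]
image[10] = [1.0] * 70
radon = radon_matrix(image)
features = ridgelet_features(image)
```

In this toy case the whole stroke falls into a single radial cell of the slice at θ = π/2, whereas at θ = 0 it is spread over many cells. This is the behaviour described above, where a linear singularity produces a high magnitude in the direction aligned with it. The feature vector length produced by this sketch depends on the assumed wavelet layout and need not match the sizes reported in the experiments.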

The Ridgelet transform has been successfully used in various image processing applications such as texture classification [20]. In character recognition, it was employed for printed Chinese character recognition [19]. To our knowledge, however, there are no references on the use of the Ridgelet transform for handwriting recognition. Nevertheless, other transforms such as the curvelet have been employed for handwritten Bangla character recognition [21], while the Radon transform was used for signature verification [22]. For our part, we propose the use of the Ridgelet transform for improving handwritten Arabic word recognition in a medium-vocabulary problem. We also investigate the performance of SVMs for the recognition stage.
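For the recognition stage, the decision rule of Section 2 (the sign of a kernel expansion over the support vectors, with an RBF kernel) and the One-Against-All scheme can be illustrated with a minimal sketch. Training, which determines the support vectors, multipliers and bias, is not shown. The function names, the toy models and the winner-takes-all rule used to combine the binary outputs are assumptions for illustration; the paper does not state how the OAA outputs are combined.

```python
import math


def rbf_kernel(p1, p, sigma):
    """RBF kernel: exp(-||p1 - p||^2 / (2 * sigma^2))."""
    d = sum((a - b) ** 2 for a, b in zip(p1, p))
    return math.exp(-d / (2.0 * sigma ** 2))


def decision_value(p, support_vectors, labels, alphas, bias, sigma):
    """Argument of the sign function: sum_i y_i * alpha_i * k(P_i, p) + b."""
    return sum(y * a * rbf_kernel(sv, p, sigma)
               for sv, y, a in zip(support_vectors, labels, alphas)) + bias


def classify_binary(p, model, sigma):
    """Binary decision: +1 or -1 according to the sign of the decision value."""
    return 1 if decision_value(p, *model, sigma) >= 0 else -1


def classify_oaa(p, models, sigma):
    """One-Against-All with an assumed winner-takes-all rule:
    one binary model per class, the largest decision value wins."""
    scores = [decision_value(p, *m, sigma) for m in models]
    return max(range(len(models)), key=scores.__getitem__)


# Hypothetical already-trained models for 3 classes in 2D: each model is
# (support_vectors, labels, alphas, bias), with +1 for its own class centre.
centres = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
models = [
    (centres, [1 if j == c else -1 for j in range(3)], [1.0, 1.0, 1.0], 0.0)
    for c in range(3)
]
predicted = classify_oaa((2.8, 0.2), models, sigma=1.0)

# Number of binary SVMs needed by each multi-class scheme for C classes.
C = 24
n_oaa, n_oao = C, C * (C - 1) // 2
```

With C = 24 classes, the number used in the experiments of Section 4, OAA requires 24 binary SVMs whereas OAO would require 276.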

4. Experimental results

The Ridgelet-SVM recognition system was evaluated on data extracted from the well-known IFN/ENIT database [23]. This database contains 26400 images of Tunisian town names, written by more than 411 writers using different writing tools. For our experiments, we selected 24 classes of interest, corresponding to the names that appear most frequently in the IFN/ENIT database. This data set has several particularities that make its training a challenging task. As shown in figure (1), words of classes 10 and 11 differ only in the first letter. In addition, samples of class 13 are composed of a word and digits, while those of class 22 are composed of two words. Another particularity of this database is the size of the images, which is quite variable.

Figure 1. Examples from the selected database (panels: Class 10, Class 11, Class 13, Class 22)

The SVM implementation is based on the OAA approach using the RBF kernel. Several runs were carried out in order to find the best parameter selection. The Radon transform was first computed without size normalization. To obtain feature vectors of a uniform size, the number of radial projections and the number of angular directions of the Radon transform were both fixed. However, the results in terms of Recognition Rate (RR) showed that the resulting Radon features provide weak discrimination ability. This outcome indicates the necessity of normalizing the data size before computing the Ridgelet transform. Several sizes were then tested by considering normalized images as input vectors for the SVMs. The most meaningful results are listed in table (1).

Table 1. Overall Recognition Rates obtained for normalized data

Normalization      RR(%)
20×70 pixels       80,3
30×80 pixels       79,7
30×100 pixels      80,2

The best recognition rate is obtained when images are normalized to a size of 20×70 pixels, which also gives the smallest feature vector. Therefore, the Radon transform was computed using this normalization. Nevertheless, to reach a successful recognition with the Ridgelet transform, the related Radon matrix should precisely reflect the linear singularities in word images. Thus, before computing Ridgelets, several parameter selections were tested for the Radon transform. Specifically, different choices for the number of angular directions (the number of values of $\theta$) and of radial projections (the size of $r$ in pixels) were evaluated in terms of Recognition Rate. Note that if the Radon matrix is too large, it leads to a very large feature vector and the corresponding recognition rate is weak. Moreover, the best results were obtained when we used square Radon matrices. Variations of the Recognition Rate according to the size of the Radon matrix are plotted in figure (2). A 32×32 Radon matrix offers the best compromise between recognition accuracy and the size of the feature vector, which has 1024 components in this case.

Figure 2. Variations of the recognition rate according to the size of Radon matrix (x-axis: Radon matrix size, 16 to 128; y-axis: Recognition Rate (%), 81,5 to 84,0)

Once the Radon parameters are selected, the wavelet transform is applied on each Radon slice to obtain the Ridgelet coefficients. Table (2) summarizes the results obtained for Ridgelet features. For each Radon matrix, this table gives the size of the feature vector, the Recognition Rate (RR) and the RunTime (RT) in seconds. The results obtained for Radon features are given for comparison.

Table 2. Performance evaluation of Ridgelet transform

                 Radon                        Ridgelet
Radon matrix     Size     RR(%)    RT(s)      Size     RR(%)    RT(s)
128×128          16384    83,3     695        8192     80,9     355
64×64            4096     83,5     199        2048     83,5     105
32×32            1024     83,3     56         512      84       35
16×16            256      81,8     42         128      82,1     26

Roughly speaking, Ridgelet features computed on small Radon matrices (smaller than 64×64) improve both the Recognition Rate and the runtime. Specifically, the best performance is obtained when using a 32×32 Radon matrix. In this case, the Ridgelet transform provides a gain of 0,7% over Radon features and more than 3% compared to normalized data. In addition, the Ridgelet requires smaller runtimes because it significantly reduces the feature vector size.

Note that Ridgelet features are composed of both the approximation and the details of the wavelet decomposition. Therefore, in the last test we evaluated feature selection within the Ridgelet transform. Since we are using the one-dimensional wavelet, the number of details is controlled by the number of wavelet decomposition levels, which itself depends on the Radon matrix size. Thus, with a matrix of 32×32 Radon coefficients, the number of decomposition levels can go up to 5. The results in terms of Recognition Rate showed that beyond the 3rd decomposition level, the Ridgelet keeps approximately the same performance. In contrast, when the wavelet coefficient selection is changed, the Recognition Rate varies significantly. There are three possible configurations: the approximation with all details (APP+ALL DET), which provides a feature vector of 512 components; the approximation alone (APP), which yields a feature vector of 128 elements; and all details (ALL DET), with vectors of 384 features. As shown in figure (4), if we consider only the approximation, the recognition accuracy drops to 70%. On the other hand, the best performance is obtained when considering all details. The most important remark is that the approximation does not improve the precision when it is added to the details. Hence, the all-details configuration constitutes the best selection for the Ridgelet transform, since it improves the recognition rate while reducing the size of the data.
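The three coefficient selections discussed above can be illustrated with a short sketch. The paper does not name the wavelet or the exact coefficient layout, so the Haar wavelet, the function names `haar_wavedec` and `select_coefficients`, and the small toy Radon matrix are assumptions chosen only to show the mechanism. The sizes printed by this toy example are not those of the experiments.

```python
import math


def haar_wavedec(signal, levels):
    """Multi-level orthonormal Haar decomposition of a list whose length is
    divisible by 2**levels. Returns (approximation, details), with the
    details ordered from the coarsest level to the finest."""
    root2 = math.sqrt(2.0)
    approx, details = list(signal), []
    for _ in range(levels):
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.insert(0, [(a - b) / root2 for a, b in pairs])
        approx = [(a + b) / root2 for a, b in pairs]
    return approx, details


def select_coefficients(radon_slices, levels, config):
    """Feature vector for config in {'APP', 'ALL DET', 'APP+ALL DET'}."""
    features = []
    for radon_slice in radon_slices:
        approx, details = haar_wavedec(radon_slice, levels)
        flat_details = [v for d in details for v in d]
        if config == "APP":
            features.extend(approx)
        elif config == "ALL DET":
            features.extend(flat_details)
        elif config == "APP+ALL DET":
            features.extend(approx + flat_details)
        else:
            raise ValueError(f"unknown configuration: {config}")
    return features


# Hypothetical toy Radon matrix: 4 slices of 8 samples, 2 decomposition levels.
radon_slices = [[float((3 * i + j * j) % 5) for j in range(8)] for i in range(4)]
sizes = {c: len(select_coefficients(radon_slices, 2, c))
         for c in ("APP", "ALL DET", "APP+ALL DET")}

# Largest number of dyadic Haar levels for a slice of 32 samples.
max_levels = int(math.log2(32))
```

In every such decomposition the approximation and detail coefficients partition the full set, which is consistent with the reported vector sizes: 128 + 384 = 512. For a slice of 32 samples, a dyadic Haar decomposition admits at most 5 levels, in agreement with the limit stated above.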

Figure 4. Recognition Rate for different selections of the Ridgelet coefficients (x-axis: APP, ALL DET, APP+ALL DET; y-axis: RR (%), 40 to 85)

5. Conclusion

In this paper we proposed a Ridgelet-SVM combination for handwritten Arabic word recognition. The Ridgelet transform is employed in order to highlight all linear singularities within handwritten words. Using data extracted from the IFN/ENIT database, the experimental design focused on finding the best Ridgelet configuration for Arabic word characterization. The results obtained in the different tests highlight the efficiency of the proposed recognition system. In fact, the Ridgelet outperforms both Radon features and normalized data. Nevertheless, it has been shown that an appropriate parameter selection should be carried out to reach satisfactory precision. Therefore, further tests should be conducted, since choices such as the wavelet function and its parameters could further improve the recognition rates.

References

[1] Vinciarelli, A., 2002. A survey on off-line cursive word recognition, Journal of the Pattern Recognition Society, Vol. 35, 1433-1446.
[2] Knerr, S., Augustin, E., Baret, O., Price, D., 1998. Hidden Markov model based word recognition and its application to legal amount reading on French checks, Journal of Computer Vision and Image Understanding, Vol. 70, 404-419.
[3] Plötz, T., Fink, G.A., 2009. Markov models for offline handwriting recognition: A survey, International Journal of Document Analysis and Recognition, Vol. 12, 269-298.
[4] Farah, N., Souici, L., Sellami, M., 2005. Classifiers combination and syntax analysis for Arabic literal amount recognition, Engineering Applications of Artificial Intelligence, Vol. 19, 29-39.
[5] Chen, M.Y., Kundu, A., Zhou, J., 1994. Off-line handwritten word recognition using a hidden Markov model type stochastic network, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 16, 481-496.
[6] Al Badr, B., Mahmoud, S.A., 1995. Survey and bibliography of Arabic optical text recognition, Signal Processing, Vol. 41, 49-77.
[7] Amin, A., 1998. Off-line Arabic character recognition: the state of the art, Pattern Recognition, Vol. 31 (5), 517-530.
[8] Essoukhri Ben Amara, 2002. Sur la problématique et les orientations en reconnaissance de l'écriture arabe, CIFED2002, pp. 1-8.
[9] Benouareth, A., Ennaji, A., Sellami, M., 2008. Arabic handwritten word recognition using HMM with explicit state duration, EURASIP Journal on Advances in Signal Processing, Vol. 2008.
[10] Assma, O.H., Khalifa, O.O., Hassan, A., 2008. Handwritten Arabic word recognition: A review of common approaches, Proceedings of the International Conference on Computer and Communication Engineering, Kuala Lumpur, 801-805.
[11] Mohamed, M.A., Gader, P., 2000. Generalized hidden Markov models - Part II: Application to handwritten word recognition, IEEE Transactions on Fuzzy Systems, Vol. 8, 82-94.
[12] Justino, E.J.R., Bortolozzi, F., and Sabourin, R., 2005. A comparison of SVM and HMM classifiers in the off-line signature verification, Pattern Recognition Letters, Vol. 26, 1377-1385.
[13] Schölkopf, B., 1997. Support vector learning, PhD thesis, University of Berlin, 173 pages.
[14] Keysers, D., Paredes, R., Ney, H., and Vidal, E., 2001. Combination of tangent vectors and local representations for handwritten digit recognition, Lecture Notes in Computer Science, Vol. 2396, 538-547.
[15] Broumandnia, A., Shanbehzadeh, J., and Varnoosfaderani, M.R., 2008. Persian/Arabic handwritten word recognition using M-band packet wavelet transform, Image Vision and Computing Journal, Vol. 26, 829-842.
[16] Vapnik, V., 1995. The nature of statistical learning theory, Springer-Verlag, New York.
[17] Burges, C.J.C., 1998. A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery, edited by Ussama Fayyad, Vol. 2, 121-167.
[18] Hsu, C-W., and Lin, C-J., 2002. A comparison of methods for multi-class support vector machines, IEEE Transactions on Neural Networks, Vol. 13, 415-425.
[19] Chen, G.Y., Bui, T.D., and Krzyzak, A., 2006. Rotation invariant feature extraction using Ridgelet and Fourier transforms, Journal of Pattern Analysis and Application, Vol. 9, 83-93.
[20] Arivazhagan, S., Ganesan, L., Subash Kumar, T.G., 2006. Texture classification using Ridgelet transform, Pattern Recognition Letters, Vol. 27, 1875-1883.
[21] Majumdar, A., 2007. Bangla basic character recognition using digital curvelet transform, Journal of Pattern Recognition Research, Vol. 1, 17-26.
[22] Coetzer, J., Herbst, B.M., and du Preez, J.A., 2004. Offline signature verification using the discrete Radon transform and a hidden Markov model, EURASIP Journal on Applied Signal Processing, 559-571.
[23] www.ifnenit.com