Object Signature Features Selection for Handwritten Jawi Recognition

Mohammad Faidzul Nasrudin, Khairuddin Omar, Choong-Yeun Liong, and Mohamad Shanudin Zakaria

Mohammad Faidzul Nasrudin, Khairuddin Omar, and Mohamad Shanudin Zakaria: Centre for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor D.E., Malaysia; e-mail: {mfn,ko}@ftsm.ukm.my, [email protected]

Choong-Yeun Liong: Centre for Modelling and Data Analysis (DELTA), School of Mathematical Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor D.E., Malaysia; e-mail: [email protected]

Abstract. The trace transform allows one to construct an unlimited number of image features that are invariant to a chosen group of image transformations. The object signature, a string of numbers, is one kind of feature derived from the transform. In this paper, we demonstrate a wrapper method, combined with two ranking evaluation measures, for selecting useful features for the recognition of handwritten Jawi images. We compare the recognition results with those obtained when features are selected randomly or not selected at all. The proposed method appears the most promising.

Keywords: feature selection, object signature, handwritten Jawi recognition, trace transform.

1 Introduction

The object signature feature developed by [1][2] has shown its usefulness in various applications, such as face recognition [3][4], Korean character recognition [5] and image database retrieval [2]. The object signature feature is invariant to affine distortion. It is based on the trace transform, which theoretically allows an unlimited number of features to be constructed; these are mathematically well defined but perceptually indescribable. Although most of them will not be useful, one can investigate the features and, with the help of experimentation, make an appropriate choice for the specific task.

Feature selection methods in classification can be divided into three categories [6]. The first category, referred to as filter, is defined as a preprocessing step and can be independent of learning.


A filter method assesses the relevance of features by looking at the intrinsic properties of the data [7]. This method calculates a score for each feature and then selects features according to the scores [8]. Information gain and chi-square, according to [9][10], are among the most effective filter methods for classification. The second category, named wrapper [11], utilizes the learning system as a black box to score subsets of features. In a wrapper, a search procedure over the space of possible feature subsets is defined, and the generated subsets are evaluated by a specific classification algorithm. The third category, called the embedded method [12], performs feature selection within the training process itself.

Concerning the selection of object signatures, to the best of our knowledge no work has been dedicated to it. The challenge in this study is that the object signature descriptor is a string of numbers, which acts like a signature of the object. An object is identified by comparing two strings of numbers (one from the test image and the other from the reference image) that are circularly shifted and possibly scaled versions of each other. The only reported method for comparing object signatures is to compute their correlation coefficient for all possible shifts [1]. To express this as a distance, the inverse cosine of the maximum value of the correlation coefficient is taken. The distances are ranked, and the smallest value indicates the two most similar signatures. Since the shift value varies from image to image, existing filter feature selection methods for classification are not suitable for object signatures. A wrapper method, which is "wrapped" around the classification model, is therefore the preferred choice. In this case, the classification model is the similarity measurement between two signatures.

Jawi is a cursive script derived from the Arabic alphabet and adopted for writing the Malay language. Jawi can largely be found in old Malay manuscripts that have not yet been fully digitized. Features based on the trace transform have shown their effectiveness for printed Jawi character recognition [13]. In that study, features were selected manually by trial and error. In this study, we instead propose a wrapper that evaluates the usefulness of feature subsets based on recognition performance, as determined by two ranking evaluation measures: the mean average precision (MAP) [14] and the normalized discounted cumulative gain (NDCG) [15]. The recognition performance of the feature subset obtained by the wrapper method is compared to that of other selection methods. In terms of data, we use handwritten Jawi images instead of printed ones.

In Section 2 we present the background to the object signature from the trace transform. In Sections 3 and 4, we elaborate on the experiments and on the results and discussion, respectively. We conclude in Section 5.

2 Object Signature from the Trace Transform

Let us imagine an image f(x, y) criss-crossed by all possible lines l(r, θ, t) that one can draw on it (see Fig. 1). Let L(r, θ) denote the set of all lines. The trace transform is a function g(T, f, r, θ) defined on L(r, θ) with the help of a trace functional T (some functional of the image function f(x, y) when it is considered along the line l(r, θ), as a function of parameter t):

$$g(T, f, r, \theta) = T[f(r, \theta, t)] \qquad (1)$$

One then calculates another functional, P, along the columns of the transform, i.e. over parameter r, and finally a functional Φ over the string of numbers created this way, i.e. over parameter θ [16][17][18]. The result is a single number, called the triple feature Π, defined as:

$$\Pi(f) = \Phi\big[P[T[f(r, \theta, t)]]\big] \qquad (2)$$

where Π represents the extracted triple feature of image f(x, y).

Fig. 1 Definition of the parameters of an image f(x, y) and tracing line l(r, θ, t)
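For illustration only, Eqs. (1) and (2) can be sketched as follows. The sampling scheme (rotating the image so that tracing lines become image rows) and all names here are our assumptions, not the implementation used in this work.

```python
import numpy as np
from scipy.ndimage import rotate

def trace_transform(image, n_angles=48, T=np.sum):
    """Sample g(T, f, r, theta): for each orientation theta, rotate the
    image so that tracing lines become rows, then apply the trace
    functional T along each row (parameter t)."""
    thetas = np.linspace(0.0, 360.0, n_angles, endpoint=False)
    g = np.empty((n_angles, image.shape[0]))
    for k, theta in enumerate(thetas):
        rotated = rotate(image, theta, reshape=False, order=1)
        g[k] = [T(row) for row in rotated]   # one value per line offset r
    return g                                 # shape: (n_angles, n_offsets)

def triple_feature(image, T, P, Phi):
    """Equation (2): cascade T along t, then P along r, then Phi along theta."""
    g = trace_transform(image, T=T)
    h = np.array([P(g_theta) for g_theta in g])   # circus function h(theta)
    return Phi(h)

# Example with simple functional choices (ours, purely illustrative):
# Pi = triple_feature(img, T=np.sum, P=np.max, Phi=np.median)
```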

The extracted triple feature is influenced strongly by the properties of the chosen functionals T, P and Φ. For practical applications, such as feature extraction, these functionals may be chosen so that the triple feature has intended properties, such as invariance to affine distortion. Using appropriate combinations of the functionals T, P and Φ, selected by a suitable feature selection method, thousands of triple features can be generated. For example, [11] and [19] proposed functionals that produce features invariant to rotation, translation and scaling, for fish image database retrieval and insect footprint recognition respectively.

In this paper we are not going to characterize an object by a single number produced by the cascaded application of three carefully chosen functionals T, P and Φ; instead we use only the first two functionals. Using only functionals T and P allows us to characterize an object by a string of numbers. The object signature is a function, called the associated circus, h_a(φ), defined in terms of the function h(φ) produced by applying functionals T and P:

$$h_a(\varphi) \equiv h(\varphi)^{-1/(\lambda_P K_T - K_P)} \qquad (3)$$

Parameters λ_P, K_T and K_P are real-valued numbers that characterize functionals T and P (for details refer to [2]). If λ_P K_T − K_P = 0, the associated circus is defined as:

$$h_a(\varphi) \equiv \frac{dh(\varphi)}{d\varphi} \qquad (4)$$
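A minimal sketch of Eqs. (3) and (4), assuming the circus function h(φ) has been sampled uniformly over [0, 2π) and that the constants λ_P, K_T and K_P of the chosen functionals are known (see [2]); the function name and numerical choices are ours:

```python
import numpy as np

def associated_circus(h, lambda_p, k_t, k_p, eps=1e-12):
    """Map a sampled circus function h(phi) to the associated circus
    h_a(phi) following Eqs. (3) and (4)."""
    h = np.asarray(h, dtype=float)
    denom = lambda_p * k_t - k_p
    if abs(denom) > eps:
        # Eq. (3): h_a(phi) = h(phi) ** (-1 / (lambda_P * K_T - K_P))
        return np.power(h, -1.0 / denom)
    # Eq. (4): degenerate case, take dh/dphi numerically instead
    dphi = 2.0 * np.pi / len(h)   # uniform sampling over [0, 2*pi) assumed
    return np.gradient(h, dphi)
```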

If we plot in polar coordinates the associated circus function of the original image, h_a1(φ), and the associated circus function of the affinely distorted image, h_a2(φ), the functions produce two closed shapes connected by a linear transformation. In order to compare these two shapes, they have to be normalized so that their principal axes coincide. This can be done by a linear transformation applied to each shape separately, as described in detail in [1]. The normalized shapes h_n1(φ) and h_n2(φ) are the signatures of the two images. For practical applications, the task of identifying an object then reduces to comparing two strings of numbers, h_n1(φ) and h_n2(φ), that are circularly shifted and possibly scaled versions of each other. Figure 2 shows, in polar coordinates, two signatures of a Jawi character in two different font types; the signatures are very similar in shape and differ mainly by rotation and scaling.

Fig. 2 The signatures of Jawi character “Pa” in two different font types
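The normalization itself is specified in [1]; as a rough illustration of the underlying idea only (whitening each closed shape so that its principal axes coincide with the coordinate axes), one might proceed as follows. This is our paraphrase, not the procedure of [1]:

```python
import numpy as np

def normalize_signature(h_a):
    """Whiten the closed shape traced by h_a(phi) in polar coordinates so
    that its principal axes coincide with the coordinate axes."""
    h_a = np.asarray(h_a, dtype=float)
    phi = np.linspace(0.0, 2.0 * np.pi, len(h_a), endpoint=False)
    pts = np.stack([h_a * np.cos(phi), h_a * np.sin(phi)])   # 2 x N points
    pts = pts - pts.mean(axis=1, keepdims=True)
    cov = pts @ pts.T / pts.shape[1]
    vals, vecs = np.linalg.eigh(cov)
    whiten = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T    # C^{-1/2}
    normed = whiten @ pts
    # Return the radii of the whitened shape (re-sampling back onto the
    # original phi grid is only approximate here).
    return np.hypot(normed[0], normed[1])
```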

3 Experiments on Handwritten Jawi Recognition

We divided the experiments into three steps. First, we performed feature selection based on the proposed ranking evaluation measures. Then, based on the selected feature subsets, we ran another experiment to compare their recognition performance with that of four other methods of choosing feature subsets. Lastly, we ran an experiment to recognize all collected handwritten images using the best feature subset.


3.1 Data

We collected nine sets of scanned articles written by nine different writers. Each article was designed such that all possible combinations of the 36 Jawi characters occur at least once in the text. This ensured that all kinds of character combinations were tested, since a character's shape changes depending on its position in a word. All the scanned pages were decomposed into a set of 4835 sub word images using a connected-component segmentation algorithm. Each article contains 213 distinct sub words. To reduce the time and magnitude of the computation for feature selection, we randomly selected only two sets of images, each containing those 213 distinct sub word images: one as the test set and the other as the reference set. For the comparison experiments, we randomly generated another 16 sets of images; eight were used for testing and the rest as references. For the final experiment, all 4835 sub word images were randomly divided into nine sets for a cross-validation procedure.
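For illustration, the decomposition step might look as follows, assuming binarized page images; scipy's connected-component labelling is our stand-in, not necessarily the tool used in this work, and grouping detached diacritics with their base strokes would need extra logic:

```python
import numpy as np
from scipy import ndimage

def extract_subwords(binary_page, min_pixels=20):
    """Split a binarized page (foreground pixels = True) into sub word
    images via connected-component labelling."""
    labels, _ = ndimage.label(binary_page)
    crops = []
    for i, box in enumerate(ndimage.find_objects(labels), start=1):
        component = (labels[box] == i)
        if component.sum() >= min_pixels:   # drop specks and scanner noise
            crops.append(component.astype(np.uint8))
    return crops
```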

3.2 Feature Extraction

For the trace transform method we computed the object signature, h_a, by applying the functionals T and P. We tested seven different T functionals and eleven P functionals. The T functionals were:

− T1: integral of f(t), where f(t) is the value of the image function along the tracing line;
− T2: max of f(t);
− T3: integral of |f′(t)|;
− T4: integral of |f″(t)|;
− T5: L_p quasi-norm (p = 0.5) = q², where q = integral of √|f(t)|;
− T6: median over R+ of f(t − c), where c is the median abscissa;
− T7: weighted median over R+ of f(t − c), where c is the median abscissa and the weights are f(t)(t − c).

The first seven P functionals are the same as T1 to T7, called P1 to P7 respectively. In addition, the following four P functionals were used:

− P8: t − (median index dividing the integral of f(t));
− P9: (average of t) − (index of the max of f(t));
− P10: t − (gravity center of f(t));
− P11: t − (median index dividing the integral of √|f(t)|);

where f(t) is the value of the image at sample point t along the tracing line. In the definitions of the T functionals, R+ means that the integration is over the positive values of the variable of integration. The explanation of the weighted median can be found in [1].
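Some of these functionals are straightforward to state in code. The following minimal sketch implements a few of them on a sampled line f(t), with unit sample spacing assumed; the absolute values and square roots follow our reconstruction of the definitions above:

```python
import numpy as np

def t1(f):
    """T1: integral of f(t) (unit sample spacing assumed)."""
    return f.sum()

def t2(f):
    """T2: max of f(t)."""
    return f.max()

def t3(f):
    """T3: integral of |f'(t)|, with a first difference as the derivative."""
    return np.abs(np.diff(f)).sum()

def t5(f):
    """T5: L_p quasi-norm with p = 0.5, i.e. (integral of sqrt(|f(t)|)) ** 2."""
    return np.sqrt(np.abs(f)).sum() ** 2

def p9(f):
    """P9: (average of t) - (index of the max of f(t))."""
    t = np.arange(len(f))
    return t.mean() - np.argmax(f)
```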


We then generated all possible pairs of these seven T functionals and eleven P functionals. In total, there were 7 × 11 = 77 candidate circus functions, or signatures, to characterize an image. Each image was traced by lines one pixel apart, i.e. the value of parameter r for two successive parallel lines differed by 1. For each value of r, 48 different orientations were used, which means that the orientations of lines with the same r differed by 7.5 degrees. Each line was sampled with points one pixel apart, i.e. parameter t took discrete values with steps equal to one inter-pixel distance.

For the comparison of two signature values, we computed a novel distance measure called the normalized circular cross-distance, a modified version of the normalized circular cross-correlation function used in [20]: the multiplicative expression in the original normalized circular cross-correlation function is replaced by a distance measure. The normalized circular cross-distance, NCXD, is defined as:

$$\mathrm{NCXD}(d) = \sum_{i=1}^{N} \left| \frac{h_{at}(i)}{\sqrt{\sum_{i=1}^{N} h_{at}(i)^2}} - \frac{h_{al}(i-d)}{\sqrt{\sum_{i=1}^{N} h_{al}(i)^2}} \right| \qquad (5)$$

where h_at and h_al are the signature values of the test image and the reference image respectively, N is the length of the signatures, and d is the shift. Two signatures are most similar when the NCXD is minimal; we chose the minimum value over 48 shifts (equal to the 48 orientations used). We then used the sum of these minimal NCXD values across all signatures as the measure of similarity of two images, and ranked these sums: the smallest sum indicates the two most similar images.
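A direct sketch of Eq. (5) and of the image-level similarity described above; the absolute-difference form follows our reconstruction of the formula, and the names are ours:

```python
import numpy as np

def ncxd_min(h_test, h_ref):
    """Minimal NCXD over all circular shifts of two equal-length signatures;
    the smaller the value, the more similar the signatures."""
    a = np.asarray(h_test, dtype=float)
    b = np.asarray(h_ref, dtype=float)
    a = a / np.linalg.norm(a)            # scale-normalize each signature
    b = b / np.linalg.norm(b)
    return min(np.abs(a - np.roll(b, d)).sum() for d in range(len(a)))

# Image-level similarity: sum ncxd_min over all selected signatures of the
# two images, then rank the reference images by this sum (smallest = best).
```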

3.3 Feature Selection

Not all signatures are useful: features whose values are all zero or all one fixed value are discarded. For the rest, we applied a hill-climbing search on a test set to select feature subsets. Given a set of features S1 = {f1, …, fn}, the algorithm works as follows:

1. Start with the feature fs that individually performs best on the test set and put it into the set of best features B1; then set S2 = S1 \ {fs}.
2. For k = 2, …, n do:
   2.1 Evaluate the ranking performance of Bk−1 ∪ {fk} on the test set.
   2.2 If the set produces a lower MAP or NDCG value, then fk is discarded. Otherwise, fk is added to the set of best features, Bk = Bk−1 ∪ {fk}, and Sk+1 = Sk \ {fk}.
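A compact sketch of this greedy search; the evaluate callback, which scores a subset by MAP or NDCG on the test set, is a placeholder for the experiment-specific machinery:

```python
def hill_climb(features, evaluate):
    """Greedy forward selection as described above. `features` is a list of
    candidate signature indices; `evaluate(subset)` must return the ranking
    score (MAP or NDCG) of that subset on the test set."""
    remaining = list(features)
    # Step 1: seed with the individually best feature.
    best = max(remaining, key=lambda f: evaluate([f]))
    selected, score = [best], evaluate([best])
    remaining.remove(best)
    # Step 2: try every remaining feature once, keeping it only if it helps.
    for f in remaining:
        trial = evaluate(selected + [f])
        if trial >= score:          # discard features that lower MAP/NDCG
            selected.append(f)
            score = trial
    return selected
```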

Object Signature Features Selection for Handwritten Jawi Recognition

695

The search terminates when all object signatures have been tested. The feature subset Bk finally selected is the one that performs best among all subsets considered by the algorithm.

Each run produces results in the form of ranks. To evaluate them, we adopted two measures widely used in the evaluation of ranking methods for information retrieval: MAP and NDCG. MAP measures the precision of a ranking under the assumption that there are two types of item, positive (relevant) and negative (irrelevant). Precision at n measures the accuracy of the top n sorted items and is defined as:

$$P(n) = \frac{I_n}{n} \qquad (6)$$

where I_n is the number of positive items within the top n. The average precision of a query, AP, is defined as:

$$AP = \sum_{n=1}^{N} \frac{P(n) \times pos(n)}{I} \qquad (7)$$

where n represents position, N is the number of results, pos(n) is a binary function indicating whether the item at position n is positive, and I is the number of positive items. MAP is AP averaged over all queries. NDCG is designed for measuring ranking accuracy when there are multiple levels of relevance judgment. NDCG at position n in a sorted list is defined as:

$$NDCG(n) = Z_n \sum_{j=1}^{n} \frac{2^{R(j)} - 1}{\log(1 + j)} \qquad (8)$$

where n denotes position, R(j) denotes the relevance score at rank j, and Z_n is a normalization factor that guarantees NDCG equals 1 for a perfect ranking. In evaluation, NDCG is further averaged over all queries.

After obtaining the feature subsets selected with MAP and NDCG, we compared them with four other feature subsets. The first subset, called All-77, consists of all 77 possible signatures. The signatures of the other three subsets, Random-1, Random-2 and Random-3, were generated randomly, with subset sizes chosen to be roughly equal to the numbers of features selected by the MAP and NDCG methods: 25, 23 and 23 features respectively.
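For reference, both measures can be sketched directly from Eqs. (6)–(8); the function names and the natural logarithm in the discount are our own choices:

```python
import numpy as np

def average_precision(relevant, n_positive):
    """Eqs. (6)-(7): `relevant` is a 0/1 list over the ranked items."""
    hits, ap = 0, 0.0
    for n, rel in enumerate(relevant, start=1):
        hits += rel
        if rel:                      # pos(n) = 1
            ap += hits / n           # P(n) = I_n / n
    return ap / n_positive           # divide by I, the number of positives

def ndcg(scores):
    """Eq. (8): `scores` are graded relevance values R(j) in ranked order."""
    gains = 2.0 ** np.asarray(scores, dtype=float) - 1.0
    discounts = np.log(1.0 + np.arange(1, len(gains) + 1))
    dcg = (gains / discounts).sum()
    ideal = (np.sort(gains)[::-1] / discounts).sum()  # Z_n: ideal ranking
    return dcg / ideal if ideal > 0 else 0.0

# MAP is then the mean of average_precision over all queries, and NDCG is
# likewise averaged over queries.
```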

4 Results and Discussion

Each test image was compared against each of the 213 reference images, and the resulting ranking was evaluated with MAP and NDCG. A feature was selected if it increased the MAP or NDCG value. The feature selection experiment showed that 20 and 23 of the 77 features were selected by the MAP and NDCG methods respectively. Based on those selected feature subsets, we then ran comparison experiments with the other four feature subsets.


Table 1 Average percentage of correct sub word recognition (by rank position) and the number of features used, based on the MAP, NDCG, All-77, Random-1, Random-2 and Random-3 feature subsets

Method    | Top-1 | 1–5   | 6–10 | 11–15 | 16–   | Number of features
----------|-------|-------|------|-------|-------|-------------------
MAP       | 24.46 | 41.58 | 8.92 | 5.89  | 43.61 | 20
NDCG      | 24.99 | 42.15 | 8.08 | 6.10  | 43.66 | 23
All-77    | 21.13 | 36.36 | 9.34 | 7.67  | 46.64 | 77
Random-1  | 13.04 | 22.43 | 7.67 | 7.67  | 62.23 | 25
Random-2  | 14.82 | 27.07 | 7.25 | 7.62  | 58.06 | 23
Random-3  | 12.99 | 22.80 | 6.88 | 6.73  | 63.59 | 23

Table 2 Average percentage of correct sub word recognition (by rank position) based on MAP and NDCG, using 9-fold cross-validation on all sub word images

Method | Top-1 | 1–5   | 6–10 | 11–15 | 16–
-------|-------|-------|------|-------|------
MAP    | 60.19 | 78.34 | 6.18 | 3.66  | 11.82
NDCG   | 60.16 | 78.20 | 6.02 | 3.64  | 12.14

Each method produced nine results, and the average was computed. The results of the comparison experiments on sub word recognition, based on the feature subsets selected by MAP, NDCG and the four other methods, are presented in Table 1. The result of the final experiment, using 9-fold cross-validation on the full sub word image set, is presented in Table 2.

From the results in Table 1, we can see that the best recognition is obtained with the features selected by MAP and NDCG. Using all 77 possible features (All-77) from the combinations of trace transform functionals produced a lower result, and the random approaches performed worse still. From Table 2, the recognizer based on the features selected by MAP and NDCG shows decent results on the handwritten Jawi images in the test set, recognizing up to 78.34% of all sub word images within the top-5 ranks.

5 Conclusion

Feature selection of object signatures based on the trace transform for the recognition of handwritten Jawi images has been demonstrated. The object signature theoretically allows us to use an unlimited number of features, and with a proper feature selection method, relevant features can be selected from a large pool of candidates. We showed that using a wrapper method together with ranking evaluation measures (MAP and NDCG) to select useful features is better than random selection or no feature selection at all. These feature selection schemes greatly enhance the applicability of the trace transform based method.

Acknowledgments. The authors would like to thank the University for the Research Grants No. UKM-GUP-TMK-07-02-034 and UKM-AP-ICT-17-2009.

References

1. Kadyrov, A., Petrou, M.: Object Signatures Invariant to Affine Distortions Derived from the Trace Transform. Image and Vision Computing 21(13-14), 1135–1143 (2003)
2. Kadyrov, A., Petrou, M.: Object Descriptors Invariant to Affine Distortions. In: Proceedings BMVC 2001, Manchester, UK, vol. 2, pp. 391–400 (2001)
3. Srisuk, S., Petrou, M., Kurutach, W., Kadyrov, A.: Face Authentication using the Trace Transform. In: Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 305–312 (2003)
4. Srisuk, S., Petrou, M., Kurutach, W., Kadyrov, A.: A Face Authentication System using the Trace Transform. Pattern Analysis and Applications 8(1-2), 50–61 (2005)
5. Kadyrov, A., Petrou, M., Park, J.: Korean Character Recognition with the Trace Transform. In: Proceedings of the International Conference on Integration of Multimedia Contents, ICIM 2001, Chosun University, Gwangju, South Korea, November 15, pp. 7–12 (2001)
6. Guyon, I., Elisseeff, A.: An Introduction to Variable and Feature Selection. Journal of Machine Learning Research 3, 1157–1182 (2003)
7. Saeys, Y., Inza, I., Larranaga, P.: A Review of Feature Selection Techniques in Bioinformatics. Bioinformatics 23(19), 2507–2517 (2007)
8. Mladenic, D., Grobelnik, M.: Feature Selection for Unbalanced Class Distribution and Naïve Bayes. In: Proceedings of the Sixteenth International Conference on Machine Learning (ICML), pp. 258–267 (1999)
9. Yang, Y., Pedersen, J.O.: A Comparative Study on Feature Selection in Text Categorization. In: Proceedings of the 14th International Conference on Machine Learning, pp. 412–420 (1997)
10. Forman, G.: An Extensive Empirical Study of Feature Selection Metrics for Text Classification. Journal of Machine Learning Research 3, 1289–1305 (2003)
11. Kohavi, R., John, G.H.: Wrappers for Feature Selection. Artificial Intelligence 97, 273–324 (1997)
12. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth and Brooks, Pacific Grove (1984)
13. Nasrudin, M.F., Omar, K., Liong, C.-Y., Zakaria, M.S.: Invariant Features from the Trace Transform for Jawi Character Recognition. In: Omatu, S., Rocha, M.P., Bravo, J., Fernández, F., Corchado, E., Bustillo, A., Corchado, J.M. (eds.) IWANN 2009. LNCS, vol. 5518, pp. 256–263. Springer, Heidelberg (2009)
14. Yates, R.B., Neto, B.R.: Modern Information Retrieval. Addison Wesley, Redwood City (1999)
15. Jarvelin, K., Kekalainen, J.: Cumulated Gain-based Evaluation of IR Techniques. ACM Transactions on Information Systems 20(4), 422–446 (2002)
16. Kadyrov, A., Petrou, M.: The Trace Transform and Its Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 23, 811–828 (2001)


17. Kadyrov, A., Petrou, M.: The Trace Transform as a Tool to Invariant Feature Construction. In: Proceedings ICPR 1998, Brisbane, Australia, pp. 1037–1039 (1998)
18. Kadyrov, A., Fedotov, N.: Triple Features. Pattern Recognition and Image Analysis: Advances in Mathematical Theory and Applications 5(4), 546–556 (1995)
19. Shin, B.S., Cha, E.Y., Cho, K.W., Klette, R., Woo, Y.W.: Effective Feature Extraction by Trace Transform for Insect Footprint Recognition. MI-tech Report Series, Multimedia Imaging Report 12, Computer Science Department, The University of Auckland, New Zealand (2008)
20. Azarnasab, E.: Robot-in-the-loop Simulation to Support Multi-Robot System Development: A Dynamic Team Formation Example. M.Sc. Thesis, College of Arts and Sciences, Georgia State University (2007)