Isolated Handwritten Digit Recognition Using oBIFs


Isolated Handwritten Digit Recognition Using oBIFs and Background Features

Abdeljalil Gattal1,2, Chawki Djeddi1
1 Department of Mathematics and Computer Science, Larbi Tebessi University, Tebessa, Algeria
2 National Higher School of Informatics (ESI), Oued Smar, Algiers, Algeria
[email protected], [email protected]

Youcef Chibani
LISIC Lab., Faculty of Electronics and Computer Science, University of Sciences and Technology Houari Boumédiene (USTHB), Bab-Ezzouar, Algiers, Algeria
[email protected]

Imran Siddiqi
Department of CS, Bahria University, Islamabad, Pakistan
[email protected]

Abstract: This study demonstrates how the combination of oriented Basic Image Features (oBIFs) with the background concavity features can be effectively employed to enhance the performance of isolated digit recognition systems. The features are extracted without any size normalization from the complete image as well as from different regions of the image by applying a uniform grid sampling to the image. Classification is carried out using one-against-all support vector machine (SVM) while the experimental study is conducted on the standard CVL single digit database. A series of evaluations using different feature configurations and combinations realized high recognition rates which are compared with the state-of-the-art methods on this subject.

Keywords: Isolated handwritten digits; oBIFs; Background Features; Support Vector Machine.

I. INTRODUCTION

Handwriting recognition is one of the most researched pattern classification problems, with a wide variety of applications including automatic transcription of handwritten documents, processing of postal mail, recognition of courtesy amounts on bank checks, extraction of information from forms and many more [1,2,3]. Despite more than three decades of extensive research, the problem remains open, mainly due to the challenges involved in the recognition. These include segmentation of handwriting into basic units (words, characters or graphemes), writer-dependent allographic variations, variations in size and writing style, and the presence of various kinds of noisy elements. Handwriting recognition has been carried out at line/paragraph level, word level and character level. A classical problem within the broader umbrella of handwriting recognition is the recognition of digits [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14], which is the subject of our present study and finds applications in problems like recognition of postal codes and financial amounts. Research on digit recognition has matured significantly over the years, and systems realizing high recognition rates have been reported in the literature. Recent research on this problem aims to enhance the recognition rates by improving the feature extraction and/or classification techniques, the two most significant components of any pattern recognition system.

Combinations of different types of features followed by feature selection, and combinations of different classifiers, have been investigated in this regard to enhance the recognition rates. The focus of this study lies on the former of the two, i.e. enhancement of the feature extraction step to improve the overall recognition rates in the isolated digit recognition problem. This enhancement is carried out by combining the oriented Basic Image Features (oBIFs) and the background features computed from images of isolated digits using a uniform grid sampling. The feature extraction step does not require size normalization of the digits. Classification is carried out using a one-against-all Support Vector Machine (SVM) and the realized results are compared with a number of state-of-the-art methods.

Among well-known digit recognition systems, gradient features [14], a combination of statistical and structural features [12] and the search for the most effective combination of features from a large pool of features using a genetic algorithm [13] are a few of the notable investigations on this subject. From the viewpoint of classification, classifiers like the Modified Quadratic Discriminant Function (MQDF) [4], Support Vector Machines [5, 6], Neural Networks [7,8,9], Hidden Markov Models (HMM) [10] and fuzzy logic [11] have been explored.

The objective of the present study is to demonstrate how the combination of oriented Basic Image Features (oBIFs) and background features can be exploited to achieve high recognition rates on non-normalized isolated handwritten digits. We first discuss the features employed in our study in the next section, followed by the classification mechanism in Section III. Section IV details the experiments conducted along with a comparative analysis and discussion of the realized results. Finally, we conclude the paper with a discussion on future perspectives on the subject.

II. FEATURE EXTRACTION

Feature extraction is the most critical step in any pattern classification task. The key idea of feature extraction is to find an effective, discriminatory representation of the patterns under study (digits in our case) that minimizes the intra-class variability and maximizes the inter-class variability [16, 17, 18]. In the case of quantitative features, the feature extraction step projects each pattern under study as a point in an n-dimensional feature space, n being the dimensionality of the feature vector. Ideally, if the chosen features are relevant, patterns belonging to the same class appear as clusters in the feature space, allowing an effective classification. In our study on isolated digit recognition, we have chosen to employ a combination of oBIFs and background features. The oBIFs capture the textural information in digits while the background features exploit the geometrical and topological properties of digits for a discriminatory representation. Each of these features is discussed in the following sub-sections followed by their computational details for our specific problem.

A. The oriented Basic Image Features (oBIFs)

The oriented Basic Image Features (oBIFs) [19,20] represent an effective texture descriptor that has been successfully applied to problems like character recognition [21], texture classification [22] and writer identification [23]. Since each handwritten digit can be viewed as a different texture, the oBIFs are likely to be effective for the digit recognition problem as well. The oBIFs are an extension of the Basic Image Features (BIFs) [19,20] and combine the local orientation with the local symmetry information. Every location in the image is categorized into one of seven local symmetry classes: dark line on light, light line on dark, dark rotational, light rotational, slope, saddle-like or flat. The classification is based on the response of a bank of six Derivative-of-Gaussian (DoG) filters (up to second order) whose size is determined by the scale parameter σ. A supplementary parameter ε determines whether a location is to be classified as flat. To combine the local orientation information with the local symmetry information, the possible orientations of a location are identified depending upon its symmetry class.
If the location is attributed to the dark rotational, light rotational or flat class, no orientation is assigned. For the dark line on light, light line on dark and saddle-like classes, n possible orientations can be assigned, while the slope class has a total of 2n possible orientations. This results in a feature vector of dimension 5n+3 (3n for the two line classes and the saddle-like class, 2n for the slope class and 3 for the classes without orientation). In our implementation of the oBIFs on isolated handwritten digits, we fix the orientation quantization parameter to n=4. This results in a total of 23 entries in the oBIFs dictionary. Finally, the (normalized) histogram of the oBIFs computed from the image of the digit is used as the descriptor. Figure 1 illustrates a sample digit encoded using the oBIFs and the corresponding histogram.
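
The dictionary size and the histogram descriptor described above can be illustrated with a minimal sketch. It assumes that each pixel has already been assigned an integer oBIF code; the filter bank and the symmetry classification are not reproduced here, and the function names and the toy code list are hypothetical.

```python
from collections import Counter

def obif_dictionary_size(n):
    """Number of oBIF codes for orientation quantization n, i.e. 5n + 3."""
    unoriented = 3      # dark rotational, light rotational, flat
    line_like = 3 * n   # dark line, light line, saddle-like: n orientations each
    slope = 2 * n       # slope: 2n orientations
    return unoriented + line_like + slope

def normalized_histogram(labels, size):
    """Normalized histogram of integer codes in the range [0, size)."""
    total = len(labels)
    if total == 0:
        return [0.0] * size
    counts = Counter(labels)
    return [counts.get(code, 0) / total for code in range(size)]

size = obif_dictionary_size(4)
# Hypothetical per-pixel codes of a tiny 2x3 image, flattened row by row.
codes = [0, 0, 5, 22, 5, 0]
hist = normalized_histogram(codes, size)
print(size, len(hist))
```

Normalizing by the number of pixels keeps the descriptor comparable across digits of different sizes, which matters here because no size normalization is applied.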

Figure 1. Example of oBIFs computation for the handwritten digit image '9' with σ = 1 and ε = 0.1 (panels: binarized image, image encoded into oBIFs, normalized histogram).

B. Background Features

The background features (BF) [5, 24] are based on the concavity information and are aimed at capturing the topological properties of a digit. These features represent the number of white pixels that have a specific concavity configuration. The label of each concavity configuration is chosen based on the Freeman code with four directions for each white pixel. Each direction is explored until a black pixel or the limit of the digit is met. In addition to the nine standard concavity configurations, we also consider five additional configurations to model more precisely the presence of loops in digits. These configurations are illustrated in Figure 2.
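
The directional exploration behind the concavity labels can be sketched as follows. This is only an illustration under stated assumptions: the image is a list of rows with 1 for stroke (black) and 0 for background (white), and each background pixel receives a 4-bit code recording which of the four Freeman directions meets a stroke pixel. The mapping of these codes to the 14 configurations of [5, 24], including the five loop-related ones, is not given in the text and is not reproduced; all names below are hypothetical.

```python
# Freeman 4-direction steps as (row, col) offsets: 0=right, 1=up, 2=left, 3=down.
DIRECTIONS = [(0, 1), (-1, 0), (0, -1), (1, 0)]

def hits_stroke(img, r, c, dr, dc):
    """Walk from (r, c) in one direction until a stroke pixel or the border."""
    rows, cols = len(img), len(img[0])
    r, c = r + dr, c + dc
    while 0 <= r < rows and 0 <= c < cols:
        if img[r][c] == 1:
            return True
        r, c = r + dr, c + dc
    return False

def closure_code(img, r, c):
    """4-bit code: bit d is set when direction d is closed by a stroke pixel."""
    code = 0
    for d, (dr, dc) in enumerate(DIRECTIONS):
        if hits_stroke(img, r, c, dr, dc):
            code |= 1 << d
    return code

def closure_histogram(img):
    """Normalized 16-bin histogram of closure codes over background pixels."""
    counts = [0] * 16
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value == 0:
                counts[closure_code(img, r, c)] += 1
    total = sum(counts)
    return [x / total for x in counts] if total else [0.0] * 16

# Toy ring: every background pixel is enclosed in all four directions.
ring = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
ring_hist = closure_histogram(ring)
corner = [[0, 0], [0, 1]]
```

A pixel closed in all four directions is a loop candidate; deciding whether it truly lies inside a loop needs a further check, which is presumably what the additional configurations address.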

Figure 2. Concavity configurations [5, 24].

The concavity labels of the background pixels for a sample of the digit '9' are illustrated in Figure 3, where each color represents a different concavity configuration. The 14-bin (normalized) histogram of the concavity labels is computed from a handwritten digit and is used as the feature.

Figure 3. Concavity labels for the handwritten digit image '9'.

The oBIFs and the background features are extracted from the complete image of the digit as well as by first applying the Uniform Grid Sampling (UGS) [25] to the digit image. This allows extracting features from different regions of the image separately. A uniform grid creates rectangular regions for sampling where each region is of the same size and has the same shape. Figure 4 shows an example of a digit split into a 2x1 grid.

Figure 4. Splitting the digit image using a uniform grid of 2x1.

III. CLASSIFICATION

The classification is based on a multi-class Support Vector Machine (SVM) using the one-against-all implementation. The one-against-all technique [26, 27, 28] builds 10 binary classifiers to solve the 10-class digit recognition problem. Each of the 10 binary classifiers is trained to distinguish one class from all others. The classifiers are trained using the features discussed in the previous section. During the recognition phase, features extracted from the query image are fed to the trained SVMs and the class for which the decision function reports the maximum value is chosen:

$$ c(\mathbf{x}) = \arg\max_{i} f_i(\mathbf{x}), \quad i = 0, \dots, 9 \qquad (1) $$

where $f_i$ is the decision function of the $i$-th binary classifier and $\mathbf{x}$ is the feature vector of the query digit.

Two important parameters required for training the SVM are the regularization parameter (C) and the Radial Basis Function (RBF) kernel parameter (σ). These parameters are empirically determined on a validation set and are fixed to C = 205 and σ = 12. In the next section, we present the experimental settings and the corresponding results.
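
The one-against-all decision rule can be illustrated with a small sketch. It assumes the common RBF form K(u, v) = exp(-||u - v||^2 / (2σ^2)), which the text does not spell out, and it uses three hand-made toy classifiers instead of the ten trained ones; the model layout (`coefs`, `support_vectors`, `bias`) is hypothetical.

```python
import math

def rbf_kernel(u, v, sigma):
    """Assumed RBF form: exp(-||u - v||^2 / (2 * sigma^2))."""
    squared = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared / (2.0 * sigma ** 2))

def decision_value(model, x, sigma):
    """Decision value of one binary SVM: sum_j coef_j * K(sv_j, x) + bias."""
    total = model["bias"]
    for coef, sv in zip(model["coefs"], model["support_vectors"]):
        total += coef * rbf_kernel(sv, x, sigma)
    return total

def classify(models, x, sigma):
    """Return the index of the classifier with the largest decision value."""
    scores = [decision_value(m, x, sigma) for m in models]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy setup: one prototype per class; each classifier weights its own
# prototype positively and the other two negatively.
prototypes = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
models = [
    {
        "support_vectors": prototypes,
        "coefs": [1.0 if j == i else -0.5 for j in range(len(prototypes))],
        "bias": 0.0,
    }
    for i in range(len(prototypes))
]
print(classify(models, (9.0, 1.0), sigma=12.0))
```

Taking the maximum of the raw decision values avoids the ambiguity that arises when several binary classifiers, or none of them, claim the query digit.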

IV. EXPERIMENTAL RESULTS

We carried out a series of experiments to evaluate the effectiveness of the proposed features for digit recognition. The oBIFs are generated using different values of the scale parameter σ (1, 2, 4 and 8) from the complete image as well as using the uniform grid sampling. The parameter ε is fixed to a small value of 0.1. Likewise, the background features are also extracted from the complete image as well as after sampling the image. The different features extracted in our study along with their respective dimensionality are summarized in Table I.

TABLE I. SUMMARY OF USED FEATURES

| Feature | Parameters (σ) | Using UGS | Dimension |
|---|---|---|---|
| oBIFs1 | 1 (complete image) | No | 23 |
| oBIFs2 | 2 (complete image) | No | 23 |
| oBIFs3 | 4 (complete image) | No | 23 |
| oBIFs4 | 8 (complete image) | No | 23 |
| oBIFs12 | 2,8 for 1st region (1,1); 1,4 for 2nd region (1,2) | 1x2 grid | 92 |
| oBIFs21 | 2,8 for 1st region (1,1); 2,8 for 2nd region (2,1) | 2x1 grid | 92 |
| oBIFs22 | 2,8 for 1st region (1,1); 2,8 for 2nd region (1,2); 1,4 for 3rd region (2,1); 1,4 for 4th region (2,2) | 2x2 grid | 184 |
| oBIFs23 | 2,8 for 1st region (1,1); 2,8 for 2nd region (1,2); 2,8 for 3rd region (1,3); 1,4 for 4th region (2,1); 1,4 for 5th region (2,2); 1,4 for 6th region (2,3) | 2x3 grid | 276 |
| BF1 | - | No | 14 |
| BF12 | - | 1x2 grid | 28 |
| BF21 | - | 2x1 grid | 28 |
| BF22 | - | 2x2 grid | 56 |
| BF23 | - | 2x3 grid | 84 |
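
The dimensions in Table I follow from the number of grid regions, the number of scales per region and the histogram size (23 bins for oBIFs, 14 for the background features). The sketch below shows one way to split an image into a uniform grid and to check this arithmetic; the handling of image sizes that do not divide evenly is an assumption, and the function names are hypothetical.

```python
def split_grid(img, n_rows, n_cols):
    """Split a 2D list into n_rows x n_cols rectangular regions, row by row."""
    height, width = len(img), len(img[0])
    regions = []
    for i in range(n_rows):
        r0, r1 = i * height // n_rows, (i + 1) * height // n_rows
        for j in range(n_cols):
            c0, c1 = j * width // n_cols, (j + 1) * width // n_cols
            regions.append([row[c0:c1] for row in img[r0:r1]])
    return regions

def feature_dimension(n_rows, n_cols, bins, n_scales=1):
    """Length of the concatenated per-region, per-scale histograms."""
    return n_rows * n_cols * n_scales * bins

# Dummy 4x6 image whose pixel values encode their position.
img = [[r * 6 + c for c in range(6)] for r in range(4)]
regions = split_grid(img, 2, 3)

# oBIFs use two scales per region in Table I; background features use none.
print(feature_dimension(2, 2, bins=23, n_scales=2))  # oBIFs22
print(feature_dimension(2, 3, bins=14))              # BF23
```

Concatenating per-region histograms keeps coarse spatial layout in the descriptor, at the cost of a dimension that grows linearly with the number of regions.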

The experiments were carried out on the CVL Single Digit database [15]. The database comprises 7,000 digits for training, a validation set of the same size and an evaluation set containing 21,780 digits. All digit images are binarized using the Kittler minimum error thresholding (KittlerMet) method [29] prior to feature extraction. The features are extracted directly from the binary digit images without any size normalization.

The performance of the system is quantified using the standard precision and recall measures, computed in the same fashion as in the ICDAR 2013 digit recognition competition [15]. The recognition rates of these experiments are summarized in Table II.

TABLE II. RECOGNITION RESULTS ON DIFFERENT FEATURES

| Group | Feature | Recall (%) | Precision (%) |
|---|---|---|---|
| oBIFs | oBIFs1 | 59.90 | 60.72 |
| oBIFs | oBIFs2 | 67.42 | 67.68 |
| oBIFs | oBIFs3 | 76.53 | 76.57 |
| oBIFs | oBIFs4 | 73.25 | 73.88 |
| oBIFs | oBIFs12 | 88.95 | 88.93 |
| oBIFs | oBIFs21 | 91.52 | 91.51 |
| oBIFs | oBIFs22 | 93.63 | 93.70 |
| oBIFs | oBIFs23 | 93.25 | 93.36 |
| BF | BF1 | 84.66 | 84.82 |
| BF | BF12 | 80.03 | 80.55 |
| BF | BF21 | 90.29 | 90.25 |
| BF | BF22 | 88.15 | 88.19 |
| BF | BF23 | 86.12 | 86.08 |

It can be seen from Table II that the performance of the different feature configurations varies significantly. The oriented Basic Image Features (oBIFs22) extracted from the four regions of the digit image outperform all other features, reporting a recall of 93.63% and a precision of 93.70%. Among the background features, BF21 extracted from the 2x1 grid realizes the highest recognition rate, with a recall of 90.29%.

In addition to individual features, we also evaluated some feature combinations to improve the overall recognition rates. Table III summarizes the performance of these combinations. The highest recall of 95.19%, with a precision of 95.21%, is achieved when combining BF1, BF22, oBIFs22 and oBIFs3 (or oBIFs4), making a feature vector of dimension 277.

TABLE III. RECOGNITION RESULTS ON FEATURES COMBINATION

| Features combination | Recall (%) | Precision (%) |
|---|---|---|
| BF1, BF21 | 93.09 | 93.12 |
| BF1, BF22 | 91.71 | 91.72 |
| BF1, BF21, oBIFs23 | 94.68 | 94.74 |
| BF1, BF21, oBIFs22 | 94.92 | 94.95 |
| BF1, BF22, oBIFs22 | 94.94 | 94.96 |
| BF1, BF21, oBIFs22, oBIFs4 | 95.06 | 95.08 |
| BF1, BF22, oBIFs22, oBIFs4 | 95.19 | 95.21 |
| BF1, BF21, oBIFs22, oBIFs3 | 95.06 | 95.08 |
| BF1, BF22, oBIFs22, oBIFs3 | 95.19 | 95.21 |
| BF1, BF21, oBIFs22, oBIFs3, oBIFs4 | 95.06 | 95.08 |
| BF1, BF22, oBIFs22, oBIFs3, oBIFs4 | 95.19 | 95.21 |
| BF1, BF21, oBIFs23, oBIFs4 | 94.71 | 94.77 |
| BF1, BF21, oBIFs23, oBIFs3 | 95.05 | 95.10 |
| BF1, BF21, oBIFs23, oBIFs3, oBIFs4 | 95.03 | 95.09 |

We also computed the precision and recall for each of the digit classes separately to find the classes that are challenging from the viewpoint of recognition. Class-wise recognition rates of these experiments are presented in Table IV. In general, the precision and recall values are more or less consistent across the different digit classes. Relatively low recall is realized on some digits (2, 6 and 8). In addition, it should be noted that some pairs like ('3', '8'), ('9', '7') and ('6', '8') offer a relatively more challenging recognition problem due to low inter-class variation.

TABLE IV. RECOGNITION RESULTS ON INDIVIDUAL CLASSES

| Class | Recall (%) | Precision (%) |
|---|---|---|
| 0 | 98.76 | 92.84 |
| 1 | 97.38 | 97.43 |
| 2 | 92.01 | 95.70 |
| 3 | 95.59 | 94.81 |
| 4 | 97.43 | 95.37 |
| 5 | 96.56 | 95.68 |
| 6 | 94.03 | 95.97 |
| 7 | 93.76 | 92.82 |
| 8 | 89.03 | 94.08 |
| 9 | 97.38 | 97.38 |
| Average | 95.19 | 95.21 |
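
The averages in Table IV are consistent with unweighted means of the per-class values. The sketch below computes per-class recall and precision from a confusion matrix and reproduces the two averages from the tabulated figures; whether the competition protocol of [15] is defined exactly this way is an assumption, and the toy confusion matrix is hypothetical.

```python
def per_class_metrics(confusion):
    """confusion[t][p] is the number of samples of true class t predicted as p."""
    n = len(confusion)
    recalls, precisions = [], []
    for k in range(n):
        true_positive = confusion[k][k]
        actual = sum(confusion[k])                        # samples of class k
        predicted = sum(confusion[t][k] for t in range(n))  # samples labeled k
        recalls.append(true_positive / actual if actual else 0.0)
        precisions.append(true_positive / predicted if predicted else 0.0)
    return recalls, precisions

def macro_average(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)

# Per-class values copied from Table IV (classes 0 to 9).
table_iv_recall = [98.76, 97.38, 92.01, 95.59, 97.43, 96.56, 94.03, 93.76, 89.03, 97.38]
table_iv_precision = [92.84, 97.43, 95.70, 94.81, 95.37, 95.68, 95.97, 92.82, 94.08, 97.38]
print(round(macro_average(table_iv_recall), 2))     # 95.19
print(round(macro_average(table_iv_precision), 2))  # 95.21

# Hypothetical two-class confusion matrix for illustration.
toy_recalls, toy_precisions = per_class_metrics([[8, 2], [1, 9]])
```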

We also compare the performance of the proposed system with the state-of-the-art digit recognition systems submitted to the Handwritten Digit Recognition Competition (HDRC) held in conjunction with ICDAR 2013. A total of 7 teams submitted 9 different systems to HDRC 2013. Only 2 of these systems do not require any size normalization (Jadavpur and Tébessa I). The database (CVL) and the evaluation protocol considered in our experiments are the same as those of the competition to allow a meaningful comparison, as summarized in Table V.

TABLE V. COMPARISON OF PROPOSED METHOD WITH STATE-OF-THE-ART METHODS

| Rank | Method | Precision (%) | Normalized Digits |
|---|---|---|---|
| 1 | Salzburg II | 97.74 | Yes |
| 2 | Salzburg I | 96.72 | Yes |
| 3 | Orand | 95.44 | Yes |
| 4 | Proposed Method | 95.21 | No |
| 5 | Jadavpur | 94.75 | No |
| 6 | Paris Sud | 94.24 | Yes |
| 7 | François Rabelais | 91.66 | Yes |
| 8 | Hannover | 89.58 | Yes |
| 9 | Tébessa II | 78.43 | Yes |
| 10 | Tébessa I | 77.53 | No |

It can be seen from Table V that the proposed method realizes a precision of 95.21%, which is comparable to the performance of the top 3 participants of the competition. It should however be noted that the proposed system does not require any size normalization and the features are directly extracted from binarized images of isolated digits. Comparing only the performance of systems which do not require size normalization, our system realizes the highest recognition rate. These results validate the effectiveness of the oriented Basic Image Features and the background features for recognition of isolated digits.

V. CONCLUSIONS AND FUTURE WORKS

This study was aimed at enhancing the feature extraction stage of isolated digit recognition systems to improve the overall recognition rates. We investigated the oriented Basic Image Features (oBIFs) and the background (concavity) features for this purpose. The features are extracted using different parameter settings from the complete image of the digit as well as by using a uniform grid sampling, while the classification is carried out using a one-against-all SVM. The system, evaluated on the standard CVL single digit database using the same experimental protocol as that of the Handwritten Digit Recognition Competition (HDRC 2013), realized high precision and recall rates. Although high recognition rates are realized, there are certain digit classes which offer a challenging recognition problem. We intend to address these confusing pairs by incorporating additional features in our further study. Moreover, the classification step of the present study is fairly traditional. We plan to enhance the classification module by using a combination of different classifiers to arrive at the final decision about the digit class.

REFERENCES

[1] G. Dimauro, S. Impedovo, G. Pirlo and A. Salzo, "Automatic Bankcheck Processing: A New Engineered System," International Journal of Pattern Recognition and Artificial Intelligence, vol. 11, no. 4, pp. 467-504, 1997.
[2] M. Cheriet, Y. Al-Ohali, N. E. Ayat and C. Y. Suen, "Arabic Cheque Processing System: Issues and Future Trend," in Advances in Pattern Recognition, B. B. Chaudhuri, Ed. Springer Verlag, 2007, pp. 213-232.
[3] M. Suwa, "Segmentation of connected handwritten numerals by graph representation," in Proc. Eighth International Conference on Document Analysis and Recognition (ICDAR'05), Seoul, 2005, vol. 2, pp. 750-754.
[4] G. S. Lehal and N. Bhatt, "A Recognition System for Devnagri and English Handwritten Numerals," in Proc. of ICMI, 2000.
[5] A. Gattal, Y. Chibani, C. Djeddi and I. Siddiqi, "Improving Isolated Digit Recognition using a Combination of Multiple Features," in Proc. of 14th International Conference on Frontiers in Handwriting Recognition (ICFHR-2014), Crete Island, Greece, 2014, pp. 446-451.
[6] D. DeCoste and B. Schölkopf, "Training invariant support vector machines," Machine Learning, vol. 46, pp. 161-190, 2002.
[7] Y. Hwang and S. Bang, "Recognition of Unconstrained Handwritten Numerals by a Radial Basis Function Neural Network Classifier," Pattern Recognition Letters, vol. 18, pp. 657-664, 1997.
[8] J. Cao, M. Ahmadi and M. Shridhar, "A Hierarchical Neural Network Architecture for Handwritten Numeral Recognition," Pattern Recognition, vol. 30, pp. 289-294, 1997.
[9] U. Meier, D. C. Ciresan, L. M. Gambardella and J. Schmidhuber, "Better digit recognition with a committee of simple neural nets," in Proc. 11th International Conference on Document Analysis and Recognition (ICDAR), 2011, pp. 1250-1254.
[10] S. Awaidah and S. Mahmoud, "A Multiple Feature/Resolution Scheme to Arabic (Indian) Numerals Recognition Using Hidden Markov Models," Signal Processing, vol. 89, pp. 1176-1184, 2009.
[11] M. Sadok and A. T. Alouani, "A Fuzzy Logic Based Handwritten Numeral Recognition Expert System," in Proc. 29th IEEE Southeastern Symposium on System Theory, Cookeville, TN, March 1997, pp. 34-38.
[12] L. Heutte, T. Paquet, J. V. Moreau, Y. Lecourtier and C. Olivier, "A structural/statistical feature based vector for handwritten character recognition," Pattern Recognition Letters, vol. 19, no. 7, pp. 629-641, 1998.
[13] Y. Kimura, A. Suzuki and K. Odaka, "Feature selection for character recognition using genetic algorithm," in IEEE Fourth International Conference on Innovative Computing, Information and Control (ICICIC), Kaohsiung, Dec. 2009, pp. 401-404.
[14] J. X. Dong, A. Krzyzak and C. Y. Suen, "A multi-net learning framework for pattern recognition," in Proceedings of the Sixth International Conference on Document Analysis and Recognition, Seattle, 2001, pp. 328-332.
[15] M. Diem, S. Fiel, A. Garz, M. Keglevic, F. Kleber and R. Sablatnig, "ICDAR 2013 Competition on Handwritten Digit Recognition (HDRC 2013)," in Proc. of the 12th Int. Conference on Document Analysis and Recognition (ICDAR), 2013, pp. 1454-1459.
[16] P. A. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach, Prentice Hall, London, pp. 480, 1982.
[17] K. K. Kim, J. H. Kim and C. Y. Suen, "Segmentation-based recognition of handwritten touching pairs of digits using structural features," Pattern Recognition, vol. 23, no. 1, pp. 13-24, 2002.
[18] A. Gattal and Y. Chibani, "SVM-Based Segmentation-Verification of Handwritten Connected Digits Using the Oriented Sliding Window," International Journal of Computational Intelligence and Applications (IJCIA), vol. 14, no. 1, pp. 1-17, 2015.
[19] L. D. Griffin, M. Lillholm, M. Crosier and J. Sande, "Basic Image Features (BIFs) Arising from Approximate Symmetry Type," in Scale Space and Variational Methods in Computer Vision, vol. 5567, X.-C. Tai, K. Mørken, M. Lysaker and K.-A. Lie, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 343-355.
[20] L. D. Griffin and M. Lillholm, "Symmetry Sensitivities of Derivative-of-Gaussian Filters," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 6, pp. 1072-1083, 2010.
[21] A. J. Newell and L. Griffin, "Natural image character recognition using oriented basic image features," in Proceedings of the International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2011, pp. 191-196.
[22] A. J. Newell, L. D. Griffin, R. M. Morgan and P. A. Bull, "Texture-based estimation of physical characteristics of sand grains," in Proceedings of the International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2010, pp. 504-509.
[23] A. J. Newell and L. D. Griffin, "Writer identification using oriented basic image features and the delta encoding," Pattern Recognition, vol. 47, no. 6, pp. 2255-2265, 2013.
[24] A. Britto, R. Sabourin, F. Bortolozzi and C. Y. Suen, "Complementary features combined in an HMM-based system to recognize handwritten digits," in 12th International Conference on Image Analysis and Processing (ICIAP), Mantova, Italy, 2003, pp. 670-675.
[25] J. Favata and G. Srikantan, "A Multiple Feature/Resolution Approach To Handprinted Digit and Character Recognition," International Journal of Imaging Systems and Technology, vol. 7, no. 4, pp. 304-311, 1996.
[26] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, London, UK, 1995.
[27] C. Hsu and C. Lin, "A comparison of methods for multi-class support vector machines," IEEE Trans. on Neural Networks, vol. 13, pp. 415-425, 2002.
[28] Y. Guermeur, A. Elisseeff and H. Paugam-Moisy, "A new multiclass SVM based on a uniform convergence result," IJCNN, vol. 4, pp. 183-188, 2000.
[29] J. Kittler and J. Illingworth, "Minimum Error Thresholding," Pattern Recognition, vol. 19, no. 1, pp. 41-47, 1986.
