Recognizing Handwritten Devanagari Words Using Recurrent Neural Network

Sonali G. Oval and Sankirti Shirawale

MMCOE, Pune, Maharashtra
[email protected], [email protected]

Abstract. Recognizing lines of handwritten text is a difficult task. Most recent progress in the field has come either from better preprocessing or from advances in language modeling. Most systems rely on hidden Markov models (HMMs), which have been used for decades in speech and handwriting recognition. This paper therefore proposes an approach based on a type of recurrent neural network designed specifically for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies. Recurrent neural networks (RNNs) have been successfully applied to the recognition of cursive handwritten documents in scripts such as English and Arabic. A regular RNN is extended to a bidirectional recurrent neural network (BRNN).

Keywords: Devanagari script, Handwriting recognition, Hidden Markov model, Recurrent neural networks.

1 Introduction

Handwriting is the most natural mode of collecting, storing, and transmitting information. It serves not only for communication among humans but also for communication between humans and machines [6]. Handwritten word recognition, one of the challenging problems in pattern recognition, has been studied for several decades. The reasons behind this difficulty are the large variability in handwriting style, the large variety of pen types, overlapping wide strokes, and the lack of stroke-order information [8]. Handwriting recognition is traditionally divided into online and offline recognition [21]. In online recognition, a time series of coordinates representing the movement of the pen tip is captured, while in the offline case [20] only an image of the text is available. Because of the greater ease of extracting relevant features, online recognition generally yields better results [1].

HMMs are able to segment and recognize at the same time, which is one reason for their popularity in unconstrained handwriting recognition. The idea of applying HMMs to handwriting recognition was originally motivated by their success in speech recognition, where a similar conflict exists between recognition and segmentation. Over the years, numerous refinements of the basic HMM approach have been proposed, such as a writer-independent system that combines point-oriented and stroke-oriented input features [1]. HMMs have several well-known drawbacks. One of these is that they assume that the probability of each observation depends only on the current state, which makes contextual effects difficult to model. Another is that HMMs are generative, while discriminative models generally give better performance in labeling and classification tasks. Recurrent neural networks (RNNs) do not suffer from these limitations and would therefore seem a promising alternative to HMMs. However, their application to unsegmented handwriting has been limited, mainly because traditional neural network objective functions require a separate training signal for every point in the input sequence, which in turn requires presegmented data [15].

The literature survey of different recognition methods for handwritten words is given in Section 2. Section 3 describes the Devanagari script. The implementation details of the proposed system, namely preprocessing and normalization, feature extraction, and classification, are explained in Section 4. Section 5 describes the data set and the results, and the paper is concluded in Section 6.

© Springer International Publishing Switzerland 2015 S.C. Satapathy et al. (eds.), Proc. of the 3rd Int. Conf. on Front. of Intell. Comput. (FICTA) 2014 – Vol. 2, Advances in Intelligent Systems and Computing 328, DOI: 10.1007/978-3-319-12012-6_45

2 Literature Survey

In recent years, research on Indian handwritten character recognition has received increasing attention, although the first research report on offline handwritten Devanagari characters was published in 1977 [4]. Many approaches have been proposed for handwritten Devanagari numeral, character, and word recognition in the past decade. Two approaches are mainly used in handwritten character recognition: the segmentation-based approach and the segmentation-free (holistic) approach. In the first approach, the words are initially segmented into characters or pseudocharacters, which are then recognized. As a result, the success of the recognition module depends on the performance of the segmentation technique. The second approach treats the whole word as a single entity and recognizes it without explicit segmentation.

Three different classifiers, namely nearest neighbor, k-NN, and SVM, were tested independently for recognizing handwritten Devanagari numerals in [12]. The accuracy of the SVM was better than that of the other two classifiers. In [17], the feature vector is fed to a feedforward backpropagation neural network for the classification of handwritten Devanagari characters. Kumar [9] compared the performance of SVM and MLP classifiers with six different features on handwritten characters and found that the SVM classifier was superior to the MLP in all six cases, although the classification time required by the SVM was greater than that of the MLP. A modified quadratic classifier is applied by Pal et al. [10] to the features of handwritten characters for recognition. In [11], two classifiers are combined to obtain higher character recognition accuracy with the same features. SVM and MQDF are combined for the same purpose in [14]. The work reported in [16] presents a two-stage classification approach for handwritten Devanagari characters. The first stage uses structural properties of a character, such as the shirorekha and the spine. The second stage exploits intersection features of characters, which are then fed to a feedforward neural network (FFNN) for further classification.

A segmentation-based approach to handwritten Devanagari word recognition is proposed by Shaw et al. [6]. On the basis of the header line, a word image is segmented into pseudocharacters. HMMs are proposed to recognize the pseudocharacters, and word-level recognition is done on the basis of string edit distance. A continuous-density HMM is also proposed by Shaw et al. [8] to recognize handwritten word images. The states of the HMM are not determined a priori but are determined automatically from a database of handwritten word images. An HMM is constructed for each word. To classify an unknown word image, its class-conditional probability under each HMM is computed, and the class that gives the highest probability is selected.

3 Devanagari Script

This script emerged from the Siddham script, an immediate descendant of the Gupta script, which ultimately derives from the Brahmi script. It is written from left to right [8]. The Devanagari alphabet is used for writing Hindi, Sanskrit, Marathi, and Nepali, and it is closely related to many of the scripts in use today in South Asia, Southeast Asia, and Tibet. The script is cursive in nature. Devanagari has 13 independent vowels ("svara"), 33 independent consonants ("vyanjana"), and 12 dependent vowel signs, shown in Fig. 1(a). Most of the consonants can be joined to one or two other consonants so that the inherent vowel is suppressed. The resulting conjunct form is called a ligature or a compound character. Commonly used compound characters appearing in our lexicon of words are shown in Fig. 1.

Fig. 1. (a) Devanagari Character Set; (b) Three Strips of a Devanagari word


4 Classification

4.1 Preprocessing and Normalization

First, the image was rotated to account for the overall skew of the document, and the handwritten part was extracted from the form. Then, a histogram of the horizontal black/white transitions was calculated, and the text was split at the local minima to give a series of horizontal lines. Once the line images were extracted, the next stage was to normalize the text with respect to writing skew, slant, and character size.

4.2 Feature Extraction

To extract the feature vectors from the normalized images, a sliding window approach is proposed. The width of the window is one pixel, and nine geometrical features are computed at each window position. Each text line image is therefore converted to a sequence of nine-dimensional vectors. The nine features, grouped into seven items, are the following:

• The mean gray value of the pixels.
• The center of gravity of the pixels.
• The second-order vertical moment of the center of gravity.
• The positions of the uppermost and lowermost black pixels (two features).
• The rate of change of these positions with respect to the neighboring windows (two features).
• The number of black-white transitions between the uppermost and lowermost pixels.
• The proportion of black pixels between the uppermost and lowermost pixels.

4.3 Classification
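As an illustration of the sliding-window feature extraction described in Section 4.2, the following is a minimal sketch and not the authors' implementation. It assumes a binarized line image stored as a list of rows with 1 for ink and 0 for background; the function names `column_features` and `line_features`, the unnormalized moments, and the handling of blank columns are assumptions made for this example.

```python
def column_features(col, prev_top=None, prev_bottom=None):
    """Nine geometric features of one 1-pixel-wide window (a single column).

    `col` lists the pixels from top to bottom, 1 = ink, 0 = background
    (an assumed convention).
    """
    ink = [y for y, v in enumerate(col) if v]          # row indices holding ink
    if not ink:
        # Blank column: carry the previous extent forward (assumed policy).
        top = 0 if prev_top is None else prev_top
        bottom = 0 if prev_bottom is None else prev_bottom
        return [0.0, 0.0, 0.0, top, bottom, 0, 0, 0, 0.0]
    total = len(ink)
    top, bottom = ink[0], ink[-1]
    span = col[top:bottom + 1]                         # pixels between the extremes
    return [
        total / len(col),                              # mean gray value
        sum(ink) / total,                              # center of gravity
        sum(y * y for y in ink) / total,               # second-order vertical moment
        top,                                           # uppermost black pixel
        bottom,                                        # lowermost black pixel
        0 if prev_top is None else top - prev_top,     # rate of change of top
        0 if prev_bottom is None else bottom - prev_bottom,  # rate of change of bottom
        sum(1 for a, b in zip(span, span[1:]) if a != b),    # black-white transitions
        total / len(span),                             # proportion of black pixels
    ]


def line_features(image):
    """Slide a 1-pixel window from left to right over a line image."""
    feats, prev_top, prev_bottom = [], None, None
    for col in zip(*image):                            # iterate over columns
        f = column_features(list(col), prev_top, prev_bottom)
        feats.append(f)
        prev_top, prev_bottom = f[3], f[4]
    return feats
```

Each column yields one nine-dimensional vector, so a line image of width W becomes a sequence of W vectors that can be fed to the network one time step at a time.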

4.3.1 Recurrent Neural Network (RNN)

A recurrent neural network (RNN) is a connectionist model containing a self-connected hidden layer [1]. RNNs provide an elegant way of dealing with (time-)sequential data in which data points that are close in the sequence are correlated. Fig. 2 shows a basic RNN architecture with a delay line, unfolded in time for two time steps. In this structure, the input vectors are fed one at a time into the RNN [15]. One of the key benefits of RNNs is their ability to make use of previous context. However, for standard RNN architectures, the range of context that can be accessed in practice is limited. The problem is that the influence of a given input on the hidden layer, and therefore on the network output, either decays or blows up exponentially as it cycles around the recurrent connections. This is often referred to as the vanishing gradient problem [15], [1].
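The recurrence described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the network used in the paper: a single tanh hidden layer, no bias terms, and a linear output layer; the name `rnn_forward` and the weight layout (one row of weights per unit) are hypothetical.

```python
import math


def rnn_forward(inputs, w_in, w_rec, w_out):
    """Forward pass of a simple recurrent network.

    h_t = tanh(W_in x_t + W_rec h_{t-1}),  y_t = W_out h_t
    Each weight matrix is a list of rows, one row per receiving unit.
    """
    hidden = [0.0] * len(w_rec)                 # h_0 = 0
    outputs = []
    for x in inputs:                            # input vectors are fed one at a time
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hp for w, hp in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        outputs.append([sum(w * h for w, h in zip(row, hidden)) for row in w_out])
    return outputs
```

With a recurrent weight of magnitude below one, the contribution of an early input to later hidden states shrinks at every step, which gives an intuition for why the usable context range of a standard RNN is limited.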


Fig. 2. Structure of RNN

4.3.2 Bidirectional RNN

A bidirectional recurrent neural network (BRNN) is proposed that can be trained using all available input information in the past and future of a specific time frame [18]. Fig. 3 shows a basic BRNN architecture with a delay line, unfolded in time for three time steps. For many tasks, it is useful to have access to future as well as past context. In handwriting recognition, for example, the identification of a given letter is helped by knowing the letters both to the right and to the left of it. BRNNs are able to access context in both directions along the input sequence. They contain two separate hidden layers, one of which processes the input sequence forward, while the other processes it backward [15]. Both hidden layers are connected to the same output layer, providing it with access to the past and future context of every point in the sequence. Combining BRNNs and LSTM gives BLSTM [2].

Fig. 3. Structure of BRNN

Long Short-Term Memory (LSTM) [19] is an RNN architecture designed to address the vanishing gradient problem. An LSTM layer consists of multiple recurrently connected subnets, known as memory blocks [3]. Each block contains a set of internal units, known as cells, whose activation is controlled by three multiplicative gate units [5], [1].
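To make the role of the three multiplicative gates concrete, the following sketch shows one time step of a memory block with a single cell and scalar input. It is a simplified, commonly used formulation (input, forget, and output gates, no peephole connections) and is not necessarily the exact variant used in the paper; the function `lstm_step` and the parameter names in `p` are assumptions for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-cell LSTM memory block (scalar signals).

    `p` maps hypothetical parameter names to floats: w* act on the input,
    u* on the previous output, b* are biases.
    """
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate cell input
    c = f * c_prev + i * g          # gated update of the cell state
    h = o * math.tanh(c)            # gated block output
    return h, c
```

When the forget gate is close to one and the input gate close to zero, the cell state is carried over almost unchanged from step to step, which is how the block can retain information over long intervals.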


Bidirectional RNNs obtain access to both past and future context by presenting the input data forwards and backwards to two separate hidden layers, both of which are connected to the same output layer.
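The forward and backward presentation of the input can be sketched as follows. This is a toy illustration with one scalar tanh unit per direction, not the architecture of the paper; `hidden_states`, `brnn_outputs`, and the parameter tuples are hypothetical names.

```python
import math


def hidden_states(inputs, w_in, w_rec):
    """Scalar recurrent layer: h_t = tanh(w_in * x_t + w_rec * h_{t-1})."""
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def brnn_outputs(inputs, fwd, bwd, w_out):
    """Combine a forward and a backward hidden layer in one output layer.

    `fwd` and `bwd` are (w_in, w_rec) pairs; `w_out` holds the output
    weights for the forward and backward hidden state, respectively.
    """
    forward = hidden_states(inputs, *fwd)
    # Process the reversed sequence, then re-align the states to input order.
    backward = hidden_states(inputs[::-1], *bwd)[::-1]
    return [w_out[0] * f + w_out[1] * b for f, b in zip(forward, backward)]
```

For the input sequence `[0.0, 0.0, 1.0]`, the forward layer alone gives a zero output at the first position, whereas the backward layer already carries information about the final input there, so the combined output at each position depends on the whole sequence.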

5 Results

5.1 Data Set

For the offline recognition of handwritten words in the Marathi and Hindi languages, this paper presents a technique that combines two approaches in a single-writer environment. The data set used in this project consists of legal amount words. It was taken from the paper “Database Development and Recognition of Handwritten Devanagari Legal Amount Words” by R. Jayadevan, S. R. Kolhe, and Umapada Pal. The data set contains 26,720 handwritten legal amount words written in Hindi and Marathi. The database was constructed from handwritten data collected from ninety writers. Sample words are shown in Fig. 4.

Fig. 4. Sample handwritten words

5.2 Results

The results of the preprocessing steps are shown in Fig. 5. Fig. 5(a) shows the original image. Fig. 5(b) shows the filtered image; a median filter is used for filtering. Fig. 5(c) shows the output of binarization (Otsu's binarization algorithm is used) together with the skew angle. Fig. 5(d) shows the image after skew correction. Fig. 5(e) shows the vertical projection of the image, and Fig. 5(f) shows the horizontal projection.
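Two of the preprocessing steps named above, Otsu binarization and the projection profiles, can be sketched as follows. This is an illustrative sketch only: it assumes 8-bit gray values with dark ink on a light background, omits median filtering and skew correction, and the function names are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(levels):
        w0 += hist[t]                      # pixels with value <= t
        sum0 += t * hist[t]
        w1 = total - w0                    # pixels with value > t
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2     # proportional to between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray):
    """Map a gray image (list of rows) to 1 = ink, 0 = background."""
    t = otsu_threshold([p for row in gray for p in row])
    return [[1 if p <= t else 0 for p in row] for row in gray]


def projections(binary):
    """Horizontal (per-row) and vertical (per-column) ink counts."""
    horizontal = [sum(row) for row in binary]
    vertical = [sum(col) for col in zip(*binary)]
    return horizontal, vertical
```

Rows with a low horizontal projection value are candidates for the gaps between text lines, and columns with a low vertical projection value are candidates for the gaps between words or characters.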

Fig. 5. (a) Original image; (b) Filtered image; (c) Binarized image; (d) Skew correction; (e) Vertical projection; (f) Horizontal projection

6 Conclusion and Future Work

In this paper we described an approach for recognizing offline handwritten text using a single recurrent neural network (RNN). The key advance is a recently introduced RNN objective function known as Connectionist Temporal Classification (CTC). CTC uses the network to label the entire input sequence at once. In this way, the network can be trained with unsegmented input data, and the final label sequence can be read directly from the network output.
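One simple way of reading a label sequence from a CTC-trained network is best-path decoding: take the most probable output at every time step, merge consecutive repeats, and drop the blank symbol. The sketch below illustrates this approximation under the assumption that output index 0 is the CTC blank; the function name `ctc_best_path` and the example labels are hypothetical, and more accurate decoders (for example, with a dictionary) exist.

```python
def ctc_best_path(frame_probs, labels, blank=0):
    """Best-path CTC decoding.

    `frame_probs` holds one probability vector per time step;
    `labels[k]` is the symbol for output index k (index `blank` is unused).
    """
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded, prev = [], None
    for k in best:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if k != prev and k != blank:
            decoded.append(labels[k])
        prev = k
    return decoded
```

Because a blank between two identical outputs separates them, a frame sequence such as a, a, blank, a, b, b decodes to a, a, b, so repeated characters can still be represented without any segmentation of the input.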


In future work, HMM and RNN will be compared using the following parameters:

• database size
• test set coverage

It is also planned to overcome the problem of Out-Of-Vocabulary words (OOV). This could be done by using the network probability and the edit distance to the nearest vocabulary word.
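The edit-distance part of this idea can be sketched as follows. The example covers only the lookup of the nearest vocabulary word; how the distance would be combined with the network probability is left open, as in the text. The function names and the romanized example words are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two label sequences or strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def nearest_word(hypothesis, vocabulary):
    """Return the closest vocabulary word and its distance to the hypothesis."""
    best = min(vocabulary, key=lambda w: edit_distance(hypothesis, w))
    return best, edit_distance(hypothesis, best)
```

A large distance to the nearest vocabulary word could then serve as one indicator that the recognized word is out of vocabulary.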

References

1. Graves, A., Liwicki, M., Fernandez, S., Bertolami, R., Bunke, H., Schmidhuber, J.: A novel connectionist system for unconstrained handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 31, 855–868 (2009)
2. Vinciarelli, A.: Online and offline handwriting recognition: A comprehensive survey. Pattern Recognition 35, 1433–1446 (2002)
3. Graves, A., Fernández, S., Schmidhuber, J.: Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition. In: Duch, W., Kacprzyk, J., Oja, E., Zadrożny, S. (eds.) ICANN 2005. LNCS, vol. 3697, pp. 799–804. Springer, Heidelberg (2005)
4. Graves, A., Fernandez, S., Liwicki, M., Bunke, H., Schmidhuber, J.: Unconstrained Online Handwriting Recognition with Recurrent Neural Networks. Advances in Neural Information Processing Systems 20, 1–7 (2008)
5. Shaw, Parui, S.K., Shridhar, M.: A segmentation based approach to offline handwritten Devanagari word recognition. In: Proc. IEEE Int. Conf. Inf. Technol., pp. 256–257 (2008)
6. Shaw, Parui, S.K., Shridhar, M.: Off-line handwritten Devanagari word recognition: A holistic approach based on directional chain code feature and HMM. In: Proc. Int. Conf. Inf. Technol., pp. 203–208 (2008)
7. Rajput, G.G., Mali, S.M.: Fourier descriptor based isolated Marathi handwritten numeral recognition. Int. J. Comput. Appl. 3(4), 9–13 (2010)
8. Liwicki, M., Graves, A., Bunke, H., Schmidhuber, J.: A Novel Approach to On-Line Handwriting Recognition Based on Bidirectional Long Short-Term Memory Networks. In: ICDAR 2007, pp. 367–371 (2007)
9. Hanmandlu, M., Agrawal, P., Lall, B.: Segmentation of handwritten Hindi text: A structural approach. Int. J. Comput. Process. Lang. 22(1), 1–20 (2009)
10. Schuster, M., Paliwal, K.K.: Bidirectional Recurrent Neural Networks. IEEE Trans. Signal Processing 45, 2673–2681 (1997)
11. Morillot, O., Likforman-Sulem, L., Grosicki, E.: Comparative study of HMM and BLSTM segmentation-free approaches for the recognition of handwritten text-lines, pp. 783–787. IEEE (2013)
12. Agrawal, P., Hanmandlu, M., Lall, B.: Coarse classification of handwritten Hindi characters. Int. J. Advanced Sci. Technol. 10, 43–54 (2009)
13. Jayadevan, R., Kolhe, S.R., Patil, P.M., Pal, U.: Offline Recognition of Devanagari Script: A Survey. IEEE Transactions on Systems, Man, and Cybernetics Part C: Applications and Reviews, 782–796 (November 17, 2011)
14. Arora, S., Bhatcharjee, D., Nasipuri, M., Malik, L.: A two stage classification approach for handwritten Devanagari characters. In: Proc. Int. Conf. Comput. Intell. Multimedia Appl., pp. 399–403 (2007)
15. Kaur, S.: Recognition of handwritten Devanagari script using features based on Zernike moments, zoning and neural network classifier. M.Tech Thesis, Dept. Comput. Sci. Eng., Punjabi University, Patiala, India (2004)
16. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9, 1735–1780 (1997)
17. Kumar, S.: Performance comparison of features on Devanagari handprinted dataset. Int. J. Recent Trends 1(2), 33–37 (2009)
18. Steinherz, T., Rivlin, E., Intrator, N.: Offline cursive script word recognition - a survey. International Journal of Document Analysis and Recognition 2(2), 90–110 (1999)
19. Pal, U., Sharma, N., Wakabayashi, T., Kimura, F.: Off-line handwritten character recognition of Devanagari script. In: Proc. 9th Conf. Document Anal. Recognit., pp. 496–500 (2007)
20. Pal, U., Chanda, S., Wakabayashi, T., Kimura, F.: Accuracy improvement of Devanagari character recognition combining SVM and MQDF. In: Proc. 11th Int. Conf. Frontiers Handwrit. Recognit., pp. 367–372 (2008)
21. Frinken, V., Fornés, A., Lladós, J., Ogier, J.-M.: Bidirectional language model for handwriting recognition. In: Gimel’farb, G., Hancock, E., Imiya, A., Kuijper, A., Kudo, M., Omachi, S., Windeatt, T., Yamada, K. (eds.) SSPR & SPR 2012. LNCS, vol. 7626, pp. 611–619. Springer, Heidelberg (2012)