A Handwritten Bengali Consonants Recognition


A Handwritten Bengali Consonants Recognition Scheme based on the Detection of Pattern Primitives

Priyanka Das
Dept. of Electronics and Communication Engg., Techno India University, Kolkata, India
Email: [email protected]

Tanmoy Dasgupta and Samar Bhattacharya
Dept. of Electrical Engineering, Jadavpur University, Kolkata, India
Email: [email protected], samar [email protected]

Abstract—The present work demonstrates a novel scheme for recognising handwritten Bengali consonants by exploring the primitive set of strokes that construct the characters. The Bengali consonants are first manually analysed in order to decompose them into their constituent pattern primitives. Once an exhaustive list of such primitives is prepared, a scheme based on mathematical morphology is devised to identify their existence in the scanned images of the handwritten characters. The characters are identified on the basis of the detected set of primitives. Although the scheme involves multiple iterations, it runs reasonably fast and does not require any kind of training.

Index Terms—Handwritten character recognition, pattern primitives, pattern recognition

I. INTRODUCTION

Bengali is an ancient Indo-Aryan language and the second most widely spoken language in the Indian sub-continent. There are more than a hundred million Bengali speakers in the world [1]. The script used in Bengali originated from ‘Siddham’, an ancient branch of the ‘Brahmi’ script, which is the origin of almost all Indic scripts. The early Brahmi script split into two major branches, the north Indian script and the south Indian or Dravidian script. Bengali falls under the north Indian script. The Bengali script consists of eleven vowels, thirty-nine consonants, ten numerals and various punctuation marks. Bengali is written from left to right and there is no upper case or lower case in Bengali script.

Though an enormous amount of research has been done on Latin scripts [2], [3], non-Latin scripts have gained attention in recent years as well [4], [5]. Bengali, being one of the major languages with a non-Latin script, has attracted the interest of many researchers working in the domain of character recognition. In 1998, Chowdhury and Pal published pioneering work on a complete printed OCR system [6]. A novel multi-stage approach, which used the concept of the headline (matra), the upper and lower parts of the character, disjoint sections and vertical lines, was suggested for the recognition of Bengali handwritten text [7]. In 2006, Pal et al. [8] developed a simple yet accurate scheme for the recognition of Bengali handwritten digits in which the characters were considered to be water reservoirs. In recent years, syntagmatic and paradigmatic methods have been used by some semioticians for the analysis of the anatomy of Bengali letterforms [1]. A view-based approach was also suggested by Barman et al. [9], in which Bengali characters are recognised with the help of an artificial neural network (ANN) instead of traditional OCR techniques. Other types of feed-forward ANN models, such as the multilayer perceptron (MLP) classifier based on stroke features, had been in use earlier [10], [11]. In 2009, Basu et al. suggested a technique using a convex hull based feature set for the recognition of handwritten Bengali basic characters and digits [12]. Languages like Bengali and Odia were considered for the development of consonant-vowel recognition systems (CVRs) using a read speech corpus [13]. A robust algorithm was also applied for the recognition of strokes present in Bengali characters [14]. Recently, handwritten Bengali numerals have been recognised using mathematical morphology and auto-generation of structuring elements (SEs) [15].

The present paper is influenced by a study by Ghosh on the type design and text composition used in various Indian scripts [16]. In the present work, handwritten Bengali consonants are recognised by analysing their constituent pattern primitive sets. A pattern primitive set exists for each character; it comprises the fundamental simple strokes that construct the character.

The rest of the paper is organised as follows. Section II introduces the pattern primitive set. The recognition process is elaborated in Section III. Section IV outlines the experimental results and the concluding remarks are presented in Section V.

978-1-5090-1047-9/16/$31.00 © 2016 IEEE

II. PATTERN PRIMITIVE SET

Pattern primitive sets are widely used for character recognition in various Indic scripts. Such a set mainly captures the different types of strokes present in the characters. Most characters of the English alphabet are composed of one or two strokes, whereas Bengali characters, owing to their anatomical complexity, are written with two or more strokes. In addition, a horizontal line known as the headline or matra is attached at the top of the base character. Fig. 1 introduces the pattern primitive set considered in the present work. An extensive study shows that a set of 30 strokes is sufficient to recognise all the consonants in the Bengali alphabet. These strokes are called the primitives.


Fig. 1. Pattern primitive set for Bengali consonants

Fig. 3. Primitives for the next 14 Bengali consonants

III. RECOGNITION PROCESS

Once the set of all pattern primitives is constructed, the individual Bengali consonant letterforms are analysed to decompose them into their constituent pattern primitives. Figs. 2–4 show the decomposition for the 39 Bengali consonants. The scheme for recognising the characters involves detecting the presence of certain primitives in the scanned images. However, since different characters have some distinctive anatomical features, the presence of some primitives is weighted more heavily than that of others. For example, the presence of pattern primitive no. 12 in the letterform of a character almost certainly identifies it as the Bengali consonant /NO/.

A. Detection of primitive straight lines

The most common pattern primitives are straight lines. They are depicted in pattern primitives 1 to 5 and 17. The presence of any of them can be recognised by analysing the scanned image of the character and calculating the slope of possible line segments. Any line segment in an image can be described by x = a + by, where (x, y) denotes the locations of the pixels on the line segment, and a and b are obtained by least-squares straight-line fitting, as in (1).

Fig. 2. Constituent pattern primitives for the first 13 Bengali consonants


B. Detection of other curved primitives

The remaining curved primitives are detected by calculating a distance score that compares portions of the normalised scanned character image with the predefined shape primitives. To calculate the distance score, three shift operations on the image pixels are defined. The detailed methodology is presented in Algorithms 1 and 2. Fig. 5 shows a handwritten Bengali consonant /gO/ and its constituent pattern primitives. This character is constructed using four pattern primitives, namely primitives 02, 05, 10 and 11. Thus the presence of these pattern primitives in a character implies that it is the Bengali consonant /gO/.

Algorithm 1: Calculation of the matching matrix S that represents the percentage of match between a character image and a pattern primitive set
Input: The input character image E and the reference set of shapes R;
Output: The matching matrix S;
begin
    represent the input shape as E = {E_1, E_2, ..., E_k};
    create a set containing possible primitive shape matches;
    while number of primitive sets > 0 do
        create a matching matrix S containing the match scores;
        reference the row locations of S by E;
        reference the column locations of S by R;
        represent the matching score between the ith element of E and the jth element of R by S_ij;
        for each element E_i of E and each element R_j of R do
            if E_i closely matches with R_j then
                assign S_ij = 1.0;
            else if E_i partially matches with R_j then
                assign S_ij = 0.5;
            else
                assign S_ij = 0.0;
            end
        end
        decrement the number of available primitive sets;
    end
    return the matching matrix S;
end
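Algorithm 1 does not specify what counts as a close or a partial match. The sketch below is one hypothetical way to fill the matrix S in Python with NumPy. It assumes that input segments and reference primitives are equally sized binary masks, scores each pair by intersection-over-union, and maps the score to 1.0, 0.5 or 0.0. The function names, the overlap measure and the thresholds 0.8 and 0.5 are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def overlap(segment, reference):
    """Intersection-over-union of two equally sized binary masks (assumed measure)."""
    seg = np.asarray(segment, dtype=bool)
    ref = np.asarray(reference, dtype=bool)
    union = np.logical_or(seg, ref).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(seg, ref).sum() / union)


def matching_matrix(segments, references, close=0.8, partial=0.5):
    """Build S with S[i, j] in {1.0, 0.5, 0.0}, following Algorithm 1.

    Rows are indexed by the input segments E_i and columns by the reference
    primitives R_j. Both thresholds are illustrative assumptions.
    """
    S = np.zeros((len(segments), len(references)))
    for i, seg in enumerate(segments):
        for j, ref in enumerate(references):
            score = overlap(seg, ref)
            if score >= close:
                S[i, j] = 1.0  # close match
            elif score >= partial:
                S[i, j] = 0.5  # partial match
    return S


# Toy 5x5 masks: a full vertical bar, a horizontal bar and a shortened bar.
vertical = np.zeros((5, 5), dtype=bool)
vertical[:, 2] = True
horizontal = np.zeros((5, 5), dtype=bool)
horizontal[2, :] = True
short_bar = np.zeros((5, 5), dtype=bool)
short_bar[:3, 2] = True

S = matching_matrix([vertical, short_bar], [vertical, horizontal])
print(S)
```

In this toy case the full bar matches the vertical reference closely, while the shortened bar only overlaps three of its five pixels and therefore receives the partial score.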

Fig. 4. Primitives for the last 12 Bengali consonants

As depicted in Fig. 6, some characters may be written ambiguously. This happens because a few pattern primitives are interchangeable and their usage varies only slightly between characters. Possible candidates for pattern ambiguity are /kʰO/ and /gʰO/; /ḏO/ and /óO/; /ḏʰO/ and /óʰO/; the two /nO/; /bO/ and /rO/; and /e̯O/, /ùO/ and /dzO/.

For the straight-line primitives, the coefficients a and b of the line x = a + by follow from the theory of least-squares straight-line fitting, as in (1):

$$
a = \frac{\begin{vmatrix} \sum x & \sum y \\ \sum xy & \sum y^2 \end{vmatrix}}{\begin{vmatrix} N & \sum y \\ \sum y & \sum y^2 \end{vmatrix}}, \qquad
b = \frac{\begin{vmatrix} N & \sum x \\ \sum y & \sum xy \end{vmatrix}}{\begin{vmatrix} N & \sum y \\ \sum y & \sum y^2 \end{vmatrix}}. \tag{1}
$$

Here, N represents the number of pixels present in the straight line segment, which can be calculated after skeletonising the character image. The number N can also be used as a measure of the length of the line segment under consideration. In this way, the presence of a vertical straight line in the scanned character can be determined by checking whether

$$
\begin{vmatrix} N & \sum y \\ \sum y & \sum y^2 \end{vmatrix} \approx 0,
$$

and the presence of a horizontal line segment can be detected by checking whether

$$
\begin{vmatrix} N & \sum x \\ \sum y & \sum xy \end{vmatrix} \approx 0
$$

over a certain range of pixels on the scanned character image. In a similar fashion, any line segment that makes an angle θ with the positive x-axis of the image can be detected by checking whether

$$
\frac{\begin{vmatrix} N & \sum y \\ \sum y & \sum y^2 \end{vmatrix}}{\begin{vmatrix} N & \sum x \\ \sum y & \sum xy \end{vmatrix}} = \tan\theta
$$

over a certain range of pixels.
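The determinant tests for the straight-line primitives can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions: the function names and the tolerance `tol` are hypothetical, the pixel coordinates are assumed to come from an already skeletonised segment, and the labels simply follow the paper's statements (denominator determinant near zero for a vertical line, numerator determinant of b near zero for a horizontal line). Which image axis x and y refer to is not stated in the text and is not assumed here.

```python
import math

import numpy as np


def fit_determinants(xs, ys):
    """Return (num_a, num_b, den), the three determinants appearing in (1)."""
    x = np.asarray(xs, dtype=float)
    y = np.asarray(ys, dtype=float)
    n = x.size
    sx, sy = x.sum(), y.sum()
    sxy, syy = (x * y).sum(), (y * y).sum()
    num_a = sx * syy - sy * sxy  # | sum(x)  sum(y)  ; sum(xy) sum(y^2) |
    num_b = n * sxy - sx * sy    # | N       sum(x)  ; sum(y)  sum(xy)  |
    den = n * syy - sy * sy      # | N       sum(y)  ; sum(y)  sum(y^2) |
    return num_a, num_b, den


def classify_segment(xs, ys, tol=1e-9):
    """Label a pixel run 'vertical', 'horizontal' or 'oblique' with the paper's tests."""
    _, num_b, den = fit_determinants(xs, ys)
    n2 = float(len(xs)) ** 2     # normalise so that tol does not depend on N
    if den / n2 <= tol:          # denominator determinant is approximately zero
        return "vertical"
    if abs(num_b) / n2 <= tol:   # numerator determinant of b is approximately zero
        return "horizontal"
    return "oblique"


def segment_angle_deg(xs, ys):
    """Angle theta in degrees, with tan(theta) = den / num_b."""
    _, num_b, den = fit_determinants(xs, ys)
    return math.degrees(math.atan2(den, num_b))


# Pixels lying exactly on x = 1 + 2y recover a = 1 and b = 2.
num_a, num_b, den = fit_determinants([1, 3, 5, 7], [0, 1, 2, 3])
a, b = num_a / den, num_b / den
print(a, b)
```

Dividing by N squared turns the two determinants into the variance of y and the covariance of x and y, which keeps a single tolerance usable for segments of different lengths.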


Algorithm 2: Computation of optimal score
Input: Matrix S containing the matching scores;
Output: The optimal path through S that maximises the score;
begin
    define a shift down (S↓), a shift right (S→) and a diagonal shift (S↘) operator as follows:
        S↓ = A_{i,j} + max{(A_{i+1,j} + A_{i+2,j}), (A_{i+1,j} + A_{i+2,j+1})};
        S→ = A_{i,j} + max{(A_{i,j+1} + A_{i,j+2}), (A_{i,j+1} + A_{i+1,j+2})};
        S↘ = A_{i,j} + (A_{i+1,j+1} + max{A_{i+1,j+2}, A_{i+2,j+2}, A_{i+2,j+1}});
    create an empty matrix L of order n × k;
    for i = 0, ..., n − 1 do
        for j = 0, ..., k − 1 do
            if max{S↓, S→, S↘} = S↓ then
                move downwards;
            else if max{S↓, S→, S↘} = S→ then
                move right;
            else if max{S↓, S→, S↘} = S↘ then
                move down diagonally towards the right;
            end
        end
        store the computed optimum scores to the column elements of the ith row of L;
    end
    return L;
end
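The three shift operators of Algorithm 2 can be illustrated with a short Python sketch. This is a simplified reading rather than the authors' code: it walks greedily from the top-left entry of the score matrix, moving in the direction whose look-ahead score is largest, instead of filling the matrix L. Treating entries outside the matrix as 0.0, starting at entry (0, 0) and breaking ties in the order down, right, diagonal are assumptions, as are the function names.

```python
import numpy as np


def _at(A, i, j):
    """Entry A[i, j], or 0.0 outside the matrix (boundary handling is an assumption)."""
    n, k = A.shape
    return float(A[i, j]) if 0 <= i < n and 0 <= j < k else 0.0


def shift_scores(A, i, j):
    """Look-ahead scores (down, right, diagonal) as defined in Algorithm 2."""
    here = _at(A, i, j)
    down = here + max(_at(A, i + 1, j) + _at(A, i + 2, j),
                      _at(A, i + 1, j) + _at(A, i + 2, j + 1))
    right = here + max(_at(A, i, j + 1) + _at(A, i, j + 2),
                       _at(A, i, j + 1) + _at(A, i + 1, j + 2))
    diag = here + (_at(A, i + 1, j + 1)
                   + max(_at(A, i + 1, j + 2), _at(A, i + 2, j + 2), _at(A, i + 2, j + 1)))
    return down, right, diag


def greedy_path(scores):
    """Walk from (0, 0) to the bottom-right entry, following the largest shift score."""
    A = np.asarray(scores, dtype=float)
    n, k = A.shape
    i = j = 0
    path, total = [(0, 0)], float(A[0, 0])
    while i < n - 1 or j < k - 1:
        down, right, diag = shift_scores(A, i, j)
        moves = []  # only moves that stay inside the matrix are allowed
        if i < n - 1:
            moves.append((down, (1, 0)))
        if j < k - 1:
            moves.append((right, (0, 1)))
        if i < n - 1 and j < k - 1:
            moves.append((diag, (1, 1)))
        _, (di, dj) = max(moves, key=lambda m: m[0])  # first maximum wins on ties
        i, j = i + di, j + dj
        path.append((i, j))
        total += float(A[i, j])
    return path, total


# A score matrix whose perfect matches lie on the main diagonal.
path, total = greedy_path(np.eye(3))
print(path, total)
```

For a matrix with perfect matches along the diagonal, the diagonal shift dominates at every step, so the walk follows the diagonal and accumulates one point per matched primitive.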

Fig. 5. The Bengali consonant /gO/ and its constituent pattern primitives: (a) character /gO/, (b) primitive 02, (c) primitive 05, (d) primitive 10, (e) primitive 11

Fig. 6. List of possible ambiguous characters

Fig. 7. (a) A reference constituent primitive of /gO/ and (b)–(e) matching scores for different similar input segments. Panel scores: (a) 1.0, (b) 0.5, (c) 0.5, (d) 0.0, (e) 0.0

Fig. 7 shows a sample reference segment created from primitive 10 and the corresponding matching scores of similar but differently oriented input segments. After the matching scores have been generated for all 30 pattern primitives, the primitives are further approximated by a set of 8 directed straight line segments, so that they can be represented by Freeman’s chain code [14]. Fig. 8 shows the directed Freeman’s chain code used for approximating the character primitives by line segments. For instance, primitive 10 can be regarded as a piecewise straight line approximation consisting of the chain 2-4-2. Similarly, primitives 11, 05 and 02 can be approximated by the chains 8, 7-7 and 1 respectively.

Fig. 8. Directed Freeman’s chain code
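The chain-code representation lends itself to a simple lookup, sketched below in Python. The chains for primitives 10, 11, 05, 02 and 29 and the primitive set of /gO/ are taken from the text; everything else is a hypothetical illustration, including the function names, the longest-match-first decoding strategy and the restriction of the character table to a single entry. The mapping from the eight codes to directions is given only in Fig. 8 and is not assumed here.

```python
# Chain codes quoted in the text for a few primitives (the full set has 30).
PRIMITIVE_CHAINS = {
    "10": (2, 4, 2),
    "11": (8,),
    "05": (7, 7),
    "02": (1,),
    "29": (1, 6),
}

# Primitive set of /gO/ as stated in the text; other characters are omitted.
CHARACTER_PRIMITIVES = {
    "/gO/": frozenset({"02", "05", "10", "11"}),
}


def decode_chain(chain):
    """Split a chain code into primitive labels, trying longer chains first.

    Returns None when some part of the chain matches no known primitive.
    Greedy longest-match decoding is an assumption of this sketch.
    """
    by_length = sorted(PRIMITIVE_CHAINS.items(), key=lambda kv: len(kv[1]), reverse=True)
    labels, pos = [], 0
    while pos < len(chain):
        for label, codes in by_length:
            if tuple(chain[pos:pos + len(codes)]) == codes:
                labels.append(label)
                pos += len(codes)
                break
        else:
            return None  # no primitive starts at this position
    return labels


def recognise(chain):
    """Return the character whose primitive set equals the decoded set, if any."""
    labels = decode_chain(chain)
    if labels is None:
        return None
    found = frozenset(labels)
    for character, required in CHARACTER_PRIMITIVES.items():
        if found == required:
            return character
    return None


print(decode_chain([2, 4, 2, 8, 7, 7, 1]))
print(recognise([2, 4, 2, 8, 7, 7, 1]))
```

Trying longer chains first keeps primitive 29 (1-6) from being mistaken for primitive 02 (1) followed by an unknown code.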


Fig. 9. Character string for the Bengali consonants /kO/, /gO/ and /tCʰO/

TABLE I. Results of running the detection algorithms on 100 samples of each character

Char.     Time (ms)  Accu. (%)    Char.     Time (ms)  Accu. (%)
/kO/      23         88           /kʰO/     18         79
/gO/      13         90           /gʰO/     28         76
/NO/      18         78           /tCO/     20         96
/tCʰO/    22         94           /dZO/     25         90
/dZʰO/    36         70           /nO/      35         72
/ṯO/      15         79           /ṯʰO/     18         86
/ḏO/      18         90           /ḏʰO/     26         87
/nO/      20         74           /t̪O/      13         84
/t̪ʰO/     36         80           /d̪O/      10         88
/d̪ʰO/     20         86           /nO/      16         70
/pO/      20         85           /FO/      30         86
/bO/      14         96           /bʰO/     20         76
/mO/      26         80           /dzO/     26         77
/rO/      29         88           /lO/      32         83
/SO/      35         70           /ùO/      28         80
/sO/      20         86           /HO/      15         90
/óO/      36         76           /óʰO/     30         80
/e̯O/      26         80           /t̪/       12         90
/N/       17         91           /ḥ/       8          95
/ñ/       8          98


Thus the presence of the chain 2-4-2-8-7-7-1 in a character represents the presence of primitives 10, 11, 05 and 02. Hence the character can be recognised as the consonant /gO/. This is depicted in Fig. 9. In this way, the chain code provides a method for cross-checking the correctness of the recognition process.

IV. EXPERIMENTAL RESULTS

The schemes illustrated in this paper have been implemented using the Python binding of OpenCV. The scheme is tested on a set of handwritten sample Bengali consonants collected from a few faculty members and students of the institution the first author is affiliated with. The algorithm first tries to determine the presence of the primitives in a scanned character image and then repeats the whole process using the piecewise linear approximations of the primitives. Two such approximations are depicted in Fig. 10. This helps improve the accuracy of recognition. Moreover, it gives the user two possible ways of running the programme. The user can opt for a faster but slightly less accurate recognition method by running the approximation based algorithm only. When higher accuracy is required, the user can run the finer method and cross-check the results against those obtained by the piecewise linear approximation based method, at some expense of recognition speed. However, it is found that the program runs at most 17 times for any given character.

Fig. 10. Piecewise linear approximation of some primitives: (a) primitive 10 produces the string 2-4-2 and (b) primitive 29 produces 1-6

This, together with the fact that all the operations and matching score generation processes are vectorised, makes the overall character recognition process very fast. Most characters are recognised in less than 40 ms, with an accuracy of over 80% for most of the characters, on a computer running Ubuntu GNU/Linux version 16.04 on an Intel(R) Celeron(R) CPU 1007U @ 1.50 GHz processor. Table I provides the results of running the detection algorithms on 100 samples of each of the characters.

V. CONCLUSION

Developing a character recognition algorithm with 100% accuracy is impossible, because even human beings often make mistakes while reading their own handwriting. The character recognition method developed here describes the shapes of the characters to the recognition algorithm instead of having the algorithm learn them. This is similar to how a child learns to read and write. Although this method is not entirely foolproof, it serves the purpose with good speed and accuracy even on a low-end machine.

REFERENCES

[1] S. Chandra, P. Bokil, and D. Udaya Kumar, “Anatomy of Bengali letterforms: A semiotic study,” in ICoRD'15 Research into Design Across Boundaries Volume 1, ser. Smart Innovation, Systems and Technologies, A. Chakrabarti, Ed. Springer India, 2015, vol. 34, pp. 237–247.
[2] F. Ross and G. Shaw, Non-Latin scripts: From metal to digital type. St. Bride Foundation, London, 2012.
[3] R. Plamondon and S. Srihari, “Online and off-line handwriting recognition: a comprehensive survey,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 22, no. 1, pp. 63–84, Jan 2000.
[4] R. Sinha, “A journey from Indian scripts processing to Indian language processing,” Annals of the History of Computing, IEEE, vol. 31, no. 1, pp. 8–31, Jan 2009.
[5] V. Lajish and S. Kopparapu, “Online handwritten Devanagari stroke recognition using extended directional features,” in Signal Processing and Communication Systems (ICSPCS), 2014 8th International Conference on, Dec 2014, pp. 1–5.
[6] B. B. Chowdhury and U. Pal, “A complete printed OCR system,” Pattern Recognition (Elsevier), vol. 31, pp. 531–549, March 1998.
[7] A. Rahman, R. Rahman, and M. Fairhurst, “Recognition of handwritten Bengali characters: A novel multistage approach,” Pattern Recognition, vol. 35, pp. 997–1006, 2002.
[8] U. Pal, B. B. Chaudhuri, and A. Belaïd, “A system for Bangla handwritten numeral recognition,” IETE Journal of Research, Institution of Electronics and Telecommunication Engineers, vol. 52, no. 1, pp. 27–34, November 2006.
[9] S. Barman, A. K. Samanta, T.-H. Kim, and D. Bhattacharyya, “Design of a view based approach for Bengali character recognition,” International Journal of Advanced Science and Technology, vol. 16, pp. 49–62, February 2010.
[10] T. Bhowmik, U. Bhattacharya, and S. K. Parui, “Recognition of Bangla handwritten characters using an MLP classifier based on stroke features,” in Proceedings of ICONIP, Kolkata, India, 2004, pp. 814–819.
[11] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, “Handwritten Bangla alphabet recognition using an MLP based classifier,” in Proceedings of 2nd National Conference on Computer Processing of Bangla, Dhaka, February 2005, pp. 285–291.
[12] S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, P. K. Saha, and S. Pramanik, “Recognition of handwritten Bangla basic characters and digits using convex hull based feature set,” in 2009 International Conference on Artificial Intelligence and Pattern Recognition (AIPR-09), 2009, pp. 380–386.
[13] K. Manjunath, S. Kumar, D. Pati, B. Satapathy, and K. Rao, “Development of consonant-vowel recognition systems for Indian languages: Bengali and Odia,” in India Conference (INDICON), 2013 Annual IEEE, Dec 2013, pp. 1–6.
[14] M. B. Islam, M. M. B. Azadi, M. A. Rahman, and M. M. A. Hashem, “Bengali handwritten character recognition using syntactic method,” in NCPB, Independent University of Bangladesh, 2005, pp. 264–275.
[15] P. Das, T. Dasgupta, and S. Bhattacharya, “A novel scheme for Bengali handwriting recognition based on morphological operations with adaptive auto-generated structuring elements,” in 2nd International Conference on Control, Instrumentation, Energy and Communication (CIEC16), 2016, pp. 211–215.
[16] P. K. Ghosh, “An approach to type design and text composition in Indian scripts,” Stanford, CA, USA, Tech. Rep., 1983.