A Two-stage Online Handwritten Chinese Character Segmentation Algorithm Based on Dynamic Programming

Xue Gao1, Pierre Michel Lallican2, Christian Viard-Gaudin3
1 South China University of Technology, Guangzhou 510641, P.R. China
2 Vision Objects, Nantes 44980, France
3 Ecole Polytechnique de Nantes, Nantes 44306, France

Abstract

In this paper, an online handwritten Chinese character segmentation method is proposed. It is based on a dynamic programming algorithm that uses geometrical features extracted from the handwritten strokes. The algorithm is carried out in two stages: pre-segmentation and recognition-based segmentation. Experimental results on 2363 sentences, representing nearly 70,000 characters and more than 370,000 strokes, show that the pre-segmentation stage keeps the incorrect segmentation rate below 1% with an over-segmentation rate limited to 11%. The final correct segmentation rate is about 88% without using any language model, indicating the effectiveness of the proposed approach.

1. Introduction

Character segmentation decomposes a continuous handwriting signal into sub-patterns of individual symbols intended for the recognition system. It is essential because most recognition systems follow an analytical approach with a character recognizer at their core. Segmentation has long been a critical area of OCR systems, particularly for the recognition of cursive script [1,2]. The crucial role played by segmentation is illustrated by the difference in recognition performance between segmented and unsegmented handwriting [2]. The problem remains open today, especially for handwritten Chinese characters [3, 4]. It has been addressed less for Chinese than for western characters, although segmentation is even more important for the recognition system in this case [5]. Up to now, only a few results have been reported, and most of them focus on offline handwritten Chinese character segmentation.

For offline handwritten Chinese character segmentation, the reported approaches vary greatly, but all of them seek to use prior knowledge about the structure of Chinese characters, as shown in [5-14]. For online handwritten Chinese character segmentation, we have found very few directly related papers. References [15-17] proposed segmentation approaches based on projection analysis and dynamic programming (DP). In a first step, projection analysis is used to find candidate segmentation points. Then, using recognition and language model information, a DP algorithm is applied to find the most reliable segmentation paths. Another possible way to handle online Chinese character segmentation is to deal with it directly in contextual processing, as shown in [4]. In that case, every stroke and every combination of strokes can be a candidate segmentation hypothesis, and contextual and recognition information may be used to verify the candidate segmentation paths. However, this places a heavy burden on the recognizer, since some Chinese characters contain up to 48 strokes and there are nearly 7,000 Chinese character categories in the GB2312-80 set. A tradeoff is therefore needed between the number of segmented character candidates fed into the classifier and the recognition speed.

To alleviate this difficulty, we propose a two-stage segmentation method based on dynamic programming. It is applied to online Chinese character segmentation and uses geometrical features extracted from the handwritten strokes. In a first stage, referred to here as pre-segmentation, the sequences of online handwritten strokes are grouped based on geometrical information. The pre-segmented characters are then fed into a neural network for the recognition-based segmentation process, where the class probability and some geometrical information for each candidate character category are combined inside the cost function of the dynamic programming scheme to find the best segmentation path. The recognition engine is a neural network classifier that can process all the symbols of GB2312-80. With it, the recognition rate obtained on the characters segmented by this method is quite comparable to that obtained on isolated Chinese characters.

In the following, we first address the pre-segmentation stage, then discuss the recognition-based segmentation process and give experimental results that show the effectiveness of the proposed method. Finally, some concluding remarks are made.

Proceedings of the 2005 Eighth International Conference on Document Analysis and Recognition (ICDAR’05) 1520-5263/05 $20.00 © 2005 IEEE

2. Pre-segmentation

Dynamic programming is an extensively studied and widely used tool in operations research for solving sequential decision problems. Online handwritten Chinese character segmentation amounts to grouping sequential strokes into characters, so a DP algorithm is a natural choice, provided that a reasonable cost function is defined. At the pre-segmentation stage, the aim is to reduce the number of word candidate hypotheses to an acceptable level while keeping a low segmentation error rate. In the proposed approach, only the shape and position information of the handwritten strokes is used to compute the cost function for pre-segmentation. These visual clues seem sufficient at this stage, without any language information, just as they would be for a human reader who does not know the language.

As shown in Figure 2, a segmentation network is built from the stroke sequence, which is ordered by the time index of the strokes. The square boxes represent the segmentation points and the arcs represent the costs associated with them. The DP algorithm finds the segmentation path with minimum cost, that is, the optimal segmentation path. Given a sequence of strokes $\{s_1, \ldots, s_N\}$, the DP algorithm can be formulated as:

$$Q(t, q) = \min_{q'} \{ Q(t', q') + c(t, q, t', q') \} \qquad (1)$$

$$Q_{opt} = \min_{q} \{ Q(t, q) \mid t = N \} \qquad (2)$$

with

$$t' = t - q - 1 \qquad (3)$$

and

$$q, q' \le M \qquad (4)$$

where $c(t, q, t', q')$ is the cost for the segmentation hypothesis $\{s_{t-q}, \ldots, s_t\}$ associated with the previous hypothesis $\{s_{t'-q'}, \ldots, s_{t'}\}$, $Q(t, q)$ denotes the minimum accumulated cost for the current hypothesis $\{s_{t-q}, \ldots, s_t\}$, $Q_{opt}$ is the cost of the optimal segmentation path, and $M$ is the upper bound on the number of strokes per Chinese character. The maximum stroke number in the GBK Chinese character set is 48 (“啬”); in our work this value is limited to M = 30.

The cost function plays an important role in DP-based handwritten character segmentation algorithms. In the proposed approach, based on the structural characteristics of Chinese handwriting and on experimental results, the following three measurements are used in the cost function and have proven effective for Chinese handwriting segmentation.

Size: Most characters within a line of script have about the same size, so the height and width of the segmented Chinese characters can be used as a constraint in finding the best segmentation paths.

Space: The spaces between characters are generally larger than those within characters, because writers normally leave some inter-character space to make the script easy for others to read. This information is introduced in the cost function to “punish” segmented characters with little inter-character space.

Point radius: We define the point radius of each sampled ink point as its distance from the ink gravity center of the segmented character to which the point belongs, as shown in Figure 1. The average point radius of a correctly segmented sentence will normally be smaller than that of an incorrectly segmented one. Experimental results show that this measure deals effectively with both horizontal and vertical handwriting.

Figure 1. Point radius
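The point radius measure can be illustrated with a minimal Python sketch. The data layout (each candidate character as a list of (x, y) ink points), the function name `average_point_radius`, and the use of the unweighted mean of the sampled points as the ink gravity center are assumptions made for illustration, not details given in the paper.

```python
import math

def average_point_radius(characters):
    """Mean distance of ink points to the centroid of their own character.

    `characters` is a list of candidate characters, each a list of (x, y)
    sampled ink points (hypothetical layout). The gravity center is taken
    as the unweighted mean of the points, which is an assumption.
    """
    total, count = 0.0, 0
    for points in characters:
        if not points:
            continue
        cx = sum(x for x, _ in points) / len(points)
        cy = sum(y for _, y in points) / len(points)
        total += sum(math.hypot(x - cx, y - cy) for x, y in points)
        count += len(points)
    return total / count if count else 0.0

# Two well-separated blobs of ink standing for two characters.
left = [(0, 0), (0, 2), (2, 0), (2, 2)]
right = [(10, 0), (10, 2), (12, 0), (12, 2)]

correct = average_point_radius([left, right])   # segmented as two characters
merged = average_point_radius([left + right])   # wrongly merged into one
```

In this toy case the merged grouping yields a larger average point radius than the correct one, which is the behaviour the measure relies on.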

To adapt the segmentation algorithm to different handwriting styles, the average height and width of the segmented Chinese characters are needed to normalize the cost function. A simple way to estimate these parameters is projection. However, such an estimate can be quite imprecise, especially when the characters are written cursively and/or slanted. In the proposed approach, a reference height and a reference width are estimated from the strokes rather than from the characters, which has produced satisfying segmentation results on different character sizes.
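The DP recurrence of this section can be sketched in Python as follows. This is only an illustration under stated assumptions: strokes are indexed from 0, a hypothesis (t, q) covers strokes t-q to t with its predecessor ending at t' = t - q - 1, and `toy_cost`, `REF_WIDTH` and the one-dimensional stroke extents are hypothetical stand-ins for the size and space terms, not the cost function used in the paper.

```python
def segment(n_strokes, cost, max_q):
    """Minimum-cost grouping of strokes 0..n_strokes-1 (0-based indices).

    A hypothesis (t, q) covers strokes t-q..t, so its predecessor must end
    at t' = t - q - 1; max_q plays the role of the bound M.
    cost(cur, prev) takes (first, last) index pairs; prev is None for the
    first character. Returns (optimal cost, list of (first, last) groups).
    """
    if n_strokes == 0:
        return 0.0, []
    Q, back = {}, {}
    for t in range(n_strokes):
        for q in range(min(t, max_q) + 1):
            cur, tp = (t - q, t), t - q - 1
            if tp < 0:  # first character of the line
                Q[t, q], back[t, q] = cost(cur, None), None
                continue
            # Recurrence: best predecessor hypothesis ending at t'
            best_qp = min(
                range(min(tp, max_q) + 1),
                key=lambda qp: Q[tp, qp] + cost(cur, (tp - qp, tp)),
            )
            Q[t, q] = Q[tp, best_qp] + cost(cur, (tp - best_qp, tp))
            back[t, q] = (tp, best_qp)
    # Termination: best hypothesis ending at the last stroke
    last = n_strokes - 1
    best_q = min(range(min(last, max_q) + 1), key=lambda q: Q[last, q])
    node, groups = (last, best_q), []
    while node is not None:
        t, q = node
        groups.append((t - q, t))
        node = back[node]
    return Q[last, best_q], groups[::-1]

# Hypothetical toy data: horizontal extent (x_min, x_max) of each stroke.
strokes = [(0, 4), (1, 5), (10, 14), (11, 15), (12, 13)]
REF_WIDTH = 5.0  # assumed reference character width

def toy_cost(cur, prev):
    """Illustrative cost: width deviation plus a penalty for small gaps."""
    left = min(s[0] for s in strokes[cur[0]:cur[1] + 1])
    right = max(s[1] for s in strokes[cur[0]:cur[1] + 1])
    size_term = abs((right - left) - REF_WIDTH) / REF_WIDTH
    if prev is None:
        return size_term
    prev_right = max(s[1] for s in strokes[prev[0]:prev[1] + 1])
    space_term = max(0.0, 1.0 - (left - prev_right) / REF_WIDTH)
    return size_term + space_term

total, groups = segment(len(strokes), toy_cost, max_q=3)
```

With this toy input the minimum-cost path groups the first two strokes into one character and the last three into another. A real cost function would also include the point radius term and the stroke-based normalization described above.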

Results concerning the pre-segmentation stage are presented in Section 4, Table 1.

Figure 2. (a) Candidate segmentation network (b) segment of a sentence (c) pre-segmentation result

3. Recognition based segmentation

As stated in [4], characters cannot be segmented unambiguously prior to recognition. In the proposed approach, a neural network classifier is used to verify the segmentation paths and to produce recognition results for the sentences. As in pre-segmentation, a candidate segmentation network is constructed from the pre-segmentation results, as shown in Figure 3, and a DP algorithm is used to find the best segmentation path.

At this stage, each possible group of the given hypotheses is sent to the neural network classifier, and the class probability distribution returned by the classifier is used to define the cost function for that grouping. However, the experimental results on our sentence database show that class probabilities alone are not enough to find the correct segmentation path. The reasons are manifold and include the classifier itself, since every part of the recognition process may affect the segmentation results. In our situation, the complexity of the stroke elements, the variability of handwriting styles and the existence of similar characters among Chinese characters may all affect segmentation. For example, the handwritten Chinese character “茶” can sometimes be segmented as a pattern without the stroke “Њ” and still be recognized with a high class probability by the classifier. From a visual point of view, the stroke “Њ” accounts for only a very small part of the character ink. This may lead to incorrect segmentation and recognition.

Therefore, we combine some geometrical information with the class probability in the cost function, which includes: (a) the overlap ratio between consecutive characters; (b) geometrical features concerning the height, width and relative position of a candidate segmentation in a line of handwriting. The overlap ratio refers to the overlap normalized by the size of the overlapped characters. Normally, a higher overlap ratio means a higher possibility that two parts should be merged into one character. Instead of finding a threshold on the overlap ratio to guide the segmentation, the proposed approach combines this parameter into the cost function. In fact, some characters that are easily confused by the classifier may differ with respect to geometrical information, as shown for example in (b), with the characters “螐” “-” “_”. This combination is therefore also useful for improving the recognition results of the handwritten sentences.
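One way such a combination could look is sketched below in Python. The paper does not give the form of the cost function, so everything here is an assumption for illustration: the overlap is measured horizontally and normalized by the narrower box, and `combined_cost`, its weights `w_overlap` and `w_geom`, and the negative log-probability term are hypothetical choices.

```python
import math

def overlap_ratio(box_a, box_b):
    """Horizontal overlap of two extents (x_min, x_max), divided by the
    width of the narrower one. This normalization is an assumption."""
    overlap = min(box_a[1], box_b[1]) - max(box_a[0], box_b[0])
    if overlap <= 0:
        return 0.0
    narrower = min(box_a[1] - box_a[0], box_b[1] - box_b[0])
    return overlap / narrower

def combined_cost(class_prob, overlap, geometry_penalty,
                  w_overlap=1.0, w_geom=1.0):
    """Hypothetical cost of one candidate character: a low class
    probability, a large overlap with its neighbour and atypical
    geometry all increase the cost."""
    recognition_term = -math.log(max(class_prob, 1e-12))
    return recognition_term + w_overlap * overlap + w_geom * geometry_penalty

# Same classifier confidence, different overlap with the previous candidate.
apart = combined_cost(0.9, overlap_ratio((0, 10), (12, 20)), 0.0)
touching = combined_cost(0.9, overlap_ratio((0, 10), (8, 20)), 0.0)
```

Because the overlap enters the cost directly, a split between two strongly overlapping parts becomes more expensive without any hard threshold, which matches the motivation given above.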

Figure 3. (a) Candidate segmentation network (b) segment of a pre-segmented sentence (c) segmentation result.

4. Experimental results

The database used in our experiments is part of our newly collected set of sentences written with digital pens, which allow writing on traditional paper and therefore yield more natural handwriting. The database contains 2363 Chinese sentences extracted from newspapers and web pages. The experimental results reported below are based on all the sentences of this database (Figure 4 gives some examples), whereas all the data used to train the classifier come from another database. The neural network classifier can recognize all 6763 character categories of the GB2312-80 set as well as 131 western characters and other symbols.

Figure 4. Some examples of test set sentences

For pre-segmentation, the incorrect segmentation rate (R1) and the over-segmentation rate (R2) are used to evaluate the performance of the algorithm. We define as a reference segmentation point each pen-up point where a character is finished in the sequence of sampled ink points. All reference segmentation points (manually segmented) make up the set P. R1 is the ratio of reference segmentation points in P that are not found by the algorithm. R2 is the ratio of segmentation points given by the algorithm but not in P among all stroke endpoints that are not in P. Table 1 gives the pre-segmentation results.

Table 1. Pre-segmentation results
R1 (%)    R2 (%)    Stroke number    Character number
0.57      11.1      377238           69783

Table 2. Segmentation and recognition rate (%)
Segmentation method        Segmentation    Recognition
Manual segmentation        100             79.7
Class probability only*    58.9            52.97
Proposed method            88.2            75.7
*: “Class probability only” means that the cost function for segmentation is only based on the neural network classifier output
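The two pre-segmentation measures R1 and R2 can be computed as in the following Python sketch. Representing segmentation points as sets of stroke indices (a cut after stroke i) and the name `presegmentation_rates` are assumptions for illustration; the example values are made up and are not data from the experiments.

```python
def presegmentation_rates(reference, predicted, n_strokes):
    """Return (R1, R2) as fractions.

    reference: set P of stroke indices that end a character (manual labels)
    predicted: set of stroke indices after which the algorithm cuts
    n_strokes: number of strokes, so candidate cut points are 0..n_strokes-1
    """
    non_reference = set(range(n_strokes)) - reference
    # R1: reference points missed by the algorithm, relative to |P|
    r1 = len(reference - predicted) / len(reference)
    # R2: extra cuts, relative to the stroke endpoints that are not in P
    r2 = len(predicted - reference) / len(non_reference) if non_reference else 0.0
    return r1, r2

# Hypothetical sentence of 10 strokes forming 3 characters.
reference = {2, 5, 9}
predicted = {2, 4, 5, 7, 9}   # all reference points found, two extra cuts
r1, r2 = presegmentation_rates(reference, predicted, n_strokes=10)
```

Here no reference point is missed, so R1 is 0, while 2 of the 7 non-reference stroke endpoints are cut, so R2 is 2/7. A low R1 matters most at this stage, since a missed reference point cannot be recovered by merging in the second stage.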

Table 1 shows that pre-segmentation greatly reduces the number of candidate characters that have to be sent to the neural network classifier while keeping R1 below 1%, which indicates that the proposed pre-segmentation algorithm is effective. Table 2 gives the character segmentation and recognition performance on our sentence database after the recognition-based segmentation stage.

Table 2 shows that the recognition rate of the proposed approach is only 4 percentage points below that obtained with manual segmentation (75.7% versus 79.7%), while the correct segmentation rate, i.e. the ratio of correctly segmented characters among all reference segmented characters, is 88.2%. Compared with the results obtained when only the class probability is used, the proposed algorithm greatly improves both segmentation and recognition performance. Handwritten sentence segmentation and recognition remains an open task in the character recognition field because of the variety of writing styles. Some cursive characters in sentences may be misrecognized even when manually segmented. The results given in this paper are satisfying, considering that no language model is used.

5. Conclusion

In this paper, an online handwritten Chinese character segmentation approach has been proposed. It is based on a double dynamic programming algorithm. The experimental results show that the approach is feasible and effective, although some segmentation errors remain. It can be concluded that geometrical information is useful in segmentation when combined effectively with the recognition information given by the classifier. As stated in [4], we believe that a language model is another important resource for resolving ambiguities in handwritten Chinese character segmentation. In our opinion, however, solving segmentation completely requires combining all the useful information in the handwriting, and it is this opinion that led to the present work. In future work, a language model will be introduced into our system to improve segmentation.

Acknowledgments

This work has been supported by a scholarship of the French Foreign Office and by the Natural Science Foundation of Guangdong (No. 04300098).


References

[1] R. Plamondon and S.N. Srihari, "On-line and Off-line Handwriting Recognition: A Comprehensive Survey", IEEE Trans. PAMI, vol.22, no.1, pp.63-84, 2000.
[2] R.G. Casey and E. Lecolinet, "A Survey of Methods and Strategies in Character Segmentation", IEEE Trans. PAMI, vol.18, no.7, pp.690-706, 1996.
[3] R.W. Dai, H.W. Hao, and X.H. Xiao, "System and Integration of Chinese Character Recognition", ZheJiang Science & Technology Publishing House, 1998.
[4] C.L. Liu, S. Jaeger, and M. Nakagawa, "Online Recognition of Chinese Characters: The State-of-the-Art", IEEE Trans. PAMI, vol.26, no.2, pp.198-213, 2004.
[5] Y. Lu, P.F. Shi, and K.H. Zhang, "Segmentation of Free Format Handwritten Chinese Characters based on Structure Features of Characters", Acta Electronica Sinica, vol.28, no.5, pp.102-104, 2000.
[6] B. Zhou, S.P. Ma, and J. Zhe, "Modification of the Chinese Character Segmentation Method based on Units Amalgamation", Journal of Chinese Information Processing, vol.13, no.2, pp.33-39, 1999.
[7] D.L. Ming, J. Liu, J.Z. Hu, et al., "An Improved Algorithm for Handwritten Chinese Characters Text", Journal of Huazhong University of Science & Technology, vol.28, no.2, pp.87-89, 2000.
[8] Y.H. Tseng and H.J. Lee, "Recognition-based Handwritten Chinese Character Segmentation Using Probabilistic Viterbi Algorithm", Pattern Recognition Letters, vol.20, no.8, pp.791-806, 1999.
[9] L.Y. Tseng and R.C. Chen, "Segmenting Handwritten Chinese Characters based on Heuristic Merging of Stroke Bounding Boxes and Dynamic Programming", Pattern Recognition Letters, vol.19, no.8, pp.963-973, 1998.
[10] J.L. Xue, X.Q. Ding, C.S. Liu, et al., "Location and Interpretation of Destination Addresses on Handwritten Chinese Envelopes", Pattern Recognition Letters, vol.22, no.6/7, pp.639-656, 2001.
[11] S. Zhao, Z. Chi, P. Shi, et al., "Handwritten Chinese Character Segmentation Using A Two-stage Approach", Proc. 6th Int. Conf. Document Analysis and Recognition, pp.179-183, 2001.
[12] K.Q. Wang, J.A. Kangas, and W.W. Li, "Character Segmentation of Color Images from Digital Camera", Proc. 6th Int. Conf. Document Analysis and Recognition, pp.210-214, 2001.
[13] T. Yamaguchi, T. Yoshikawa, T. Shinogi, et al., "A Segmentation Method for Touching Japanese Handwritten Characters based on Connecting Condition of Lines", Proc. 6th Int. Conf. Document Analysis and Recognition, pp.837-841, 2001.
[14] J. Gao, X.Q. Ding, and Y.S. Wu, "A Segmentation Algorithm for Handwritten Chinese Character Strings", Proc. 6th Int. Conf. Document Analysis and Recognition, pp.633-636, 2001.
[15] H. Murase, "Online Recognition of Free-format Japanese Handwritings", Proc. 9th Int. Conf. Pattern Recognition, pp.1143-1147, 1988.
[16] T. Fukushima and M. Nakagawa, "On-line Writing-box-free Recognition of Handwritten Japanese Text Considering Character Size Variations", Proc. 15th Int. Conf. Pattern Recognition, pp.359-363, 2000.
[17] C. Hong, L. Gareth, Y.M. Wu, et al., "Segmentation and Recognition of Continuous Handwriting Chinese Text", Int. J. Pattern Recognition and Artificial Intelligence, vol.12, no.2, pp.223-232, 1998.
