
Republic of Iraq
Ministry of Higher Education and Scientific Research
University of Technology
Department of Computer Science

Cuneiform Symbols Recognition Using Pattern Recognition Techniques

A Thesis Submitted to the Department of Computer Science of the University of Technology in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science

By
Ali Adel Saeid AL-Tammeme

Supervised by
Prof. Dr. Abdul Monem S. Rahma
Asst. Prof. Dr. Abdul Mohssen J. Abdul Hossen


In the name of Allah, the Most Gracious, the Most Merciful.

"Read in the name of your Lord who created (1); created man from a clinging clot (2). Read, and your Lord is the Most Generous (3), who taught by the pen (4); taught man that which he knew not (5)."

Almighty Allah has spoken the truth.
(Surah Al-Alaq: verses 1-5)


All my thanks, first of all, are addressed to Almighty Allah, who has guided my steps towards the path of knowledge; without His help and blessing, this thesis would not have progressed or seen the light. My sincere appreciation is expressed to my supervisor, Prof. Dr. Abdul Monem S. Rahma, for providing me with support and ideas. I am extremely grateful to all members of the Computer Science Department of the University of Technology for their general support. Finally, I would never have been able to finish my thesis without the help of my friends and the support of my family.

Thank you all.


Supervisor's Certification

We certify that this dissertation, entitled "Cuneiform Symbols Recognition Using Pattern Recognition Techniques", was prepared by Ali Adel Saied AL-temmeme under our supervision at the Computer Science Department, University of Technology, in partial fulfillment of the requirements for the degree of Ph.D. in Computer Science.

Signature:
Name: Prof. Dr. Abdul Monem S. Rahma
Date: /

Signature:
Name: Asst. Prof. Dr. Abdul Mohssen J. Abdul Hossen
Date: /


Table of Contents

Abstract
List of Contents
List of Tables
List of Figures
List of Abbreviations
List of Algorithms

Chapter One: General Introduction
1.1 Introduction
1.2 Pattern Recognition
1.2.1 Pattern Recognition Approaches
1.2.2 Pattern Recognition and Cuneiform Writing
1.3 Literature Survey
1.4 Aim of Thesis
1.5 Thesis Contributions
1.6 Organization of Thesis

Chapter Two: Theoretical Background
2.1 Introduction
2.2 Character Recognition Types
2.2.1 Online Character Recognition
2.2.2 Offline Character Recognition
2.3 Preprocessing
2.3.1 Image Enhancement
2.3.2 Image Enhancement Filters
2.4 Image Segmentation
2.4.1 Image Thresholding
2.4.2 Image Thresholding Techniques
2.4.3 Image Connected-Component Labeling
2.5 Image Distance Transform
2.5.1 Distance Transforms with Sampled Functions
2.6 Feature Extraction
2.6.1 Statistical Features
2.6.2 Global Transformation and Series Expansion Features
2.6.3 Structural Features
2.7 Classification
2.7.1 Probabilistic Neural Networks (PNN)
2.7.2 Support Vector Machine (SVM)
2.8 Post-Processing
2.9 Image Fusion
2.10 Evaluation Measures of the Cuneiform Recognition System
2.11 Proposed Learning Dataset

Chapter Three: Proposed Assyrian Cuneiform Recognition System
3.1 Introduction
3.2 Architecture of the Proposed System
3.3 Assyrian Cuneiform Recognition System (ACRS) Model 1
3.3.1 Image Acquisition Stage
3.3.2 Preprocessing Stage (1)
3.3.3 Image Thresholding
3.3.4 Preprocessing Stage (2): The Elimination of Rejected
3.3.5 Feature Extraction
3.3.6 Training Stage
3.3.7 Classification Stage
3.4 Post-Processing
3.4.1 Proposed Solution for Duplicated Problems
3.4.2 Locating the Density Centroid
3.4.3 Training Stage
3.4.4 Test Feature Extraction Stage
3.4.5 Classification Stage
3.5 Accuracy Support for Cuneiform Symbol Recognition by an Image Fusion Approach

Chapter Four: Experiments and Results Discussion
4.1 Introduction
4.2 Cuneiform Tablet Image Dataset
4.3 Cuneiform Tablet Image Preprocessing
4.3.1 Image Enhancement
4.3.2 Removing Spots
4.3.3 Removing Writing Lines
4.4 Image Thresholding
4.5 Feature Extraction
4.5.1 Elliptical Fourier Descriptors (EFD)
4.5.2 Projection Histograms
4.5.3 Hu's Moments
4.5.4 Zernike Moments (ZM)
4.5.5 Polygon Approximation
4.6 Classification
4.7 Results and Discussion
4.8 Analytical Comparison

Chapter Five: Conclusions and Suggestions
5.1 Conclusions
5.2 Suggestions for Future Work
References

Table of Tables

Table 2.1: SVM kernel discriminant functions
Table 3.1: Threshold values computed against the skewness metric
Table 3.2: MSE values for each segment against its length
Table 3.3: Break points generated from boundary points
Table 3.4: Computed AEV values for each dominant point
Table 3.5: Updated AEV values
Table 3.6: Approximated points
Table 3.7: Structure points
Table 4.1: Recognition accuracy after applying LPFs of different sizes
Table 4.2: Results against each cut-off frequency value of the ideal filter
Table 4.3: Results according to each threshold value
Table 4.4: Results of the first line-removal algorithm for different threshold values
Table 4.5: Thresholding methods compared by time consumption
Table 4.6: Feature vector constructed by EFD
Table 4.7: Recognition results with EFD for each classifier
Table 4.8: Recognition results with projection histograms for each classifier
Table 4.9: Recognition results with Hu's moments for each classifier
Table 4.10: Recognition results with ZM for each classifier
Table 4.11: Experimental results for each diversity value with the polygon approximation algorithm
Table 4.12: Experimental results for each diversity value with the proposed polygon approximation algorithm
Table 4.13: Recognition results and average classification time for each classifier
Table 4.14: Recognition results for different image sizes
Table 4.15: Recognition accuracy for different standard deviation values δ
Table 4.16: Recognition accuracy for the duplicated recognition state
Table 4.17: Recognition accuracy for different image sizes
Table 4.18: Recognition accuracy and average classification time for each feature extraction method

Table of Figures

Figure 1.1: Pattern recognition system
Figure 2.1: Image enhancement frequency transform steps
Figure 2.2: Gaussian kernel mask values
Figure 2.3: Frequency domain technique
Figure 2.4: Image enhancement with low frequency
Figure 2.5: Ideal low-pass filter
Figure 2.6: Image segmentation methods
Figure 2.7: Image connected-component labeling
Figure 2.8: Structure element
Figure 2.9: Dilation and erosion process
Figure 2.10: Regenerated connected-component segment
Figure 2.11: Image distance transform
Figure 2.12: The lower envelope of n parabolas
Figure 2.13: Distance transform with its states
Figure 2.14: Projection histograms feature extraction method
Figure 2.15: The computation of the unit disk
Figure 2.16: Polygon approximation
Figure 2.17: Polygon approximation approaches
Figure 2.18: Associated approximation error
Figure 2.19: Break points
Figure 2.20: General architecture of a PNN
Figure 2.21: SVM with hard margin
Figure 2.22: Hyperplane H
Figure 2.23: Support vectors and maximum margin
Figure 2.24: SVM model with soft margin
Figure 2.25: Changing the data space
Figure 2.26: Wavelet-based image fusion
Figure 2.27: Three-dimensional shape model of cuneiform symbols
Figure 2.28: Virtual dataset
Figure 2.29: Cuneiform symbol versus its virtual symbol probabilities
Figure 3.1: Proposed ACRS system architecture
Figure 3.2: Architecture of module 1
Figure 3.3: Cuneiform tablet images
Figure 3.4: Image enhanced in the frequency domain
Figure 3.5: Enhanced images in the frequency domain with different cut-off frequency values
Figure 3.6: Image binarization methods
Figure 3.7: Binarized image with low quality
Figure 3.8: Binarized image after applying histogram equalization
Figure 3.9: Cuneiform images of the same character with different features
Figure 3.10: Extracted connected components
Figure 3.11: Image labeling
Figure 3.12: Spot-free cuneiform image
Figure 3.13: The effect of the thresholding value
Figure 3.14: Cuneiform image segmentation
Figure 3.15: Distance transform
Figure 3.16: Preface 1
Figure 3.17: Erasing cuneiform lines
Figure 3.18: Lines-off binary cuneiform image
Figure 3.19: Problem of the statistical algorithm
Figure 3.20: The cuneiform writing line removed after selecting a suitable threshold value
Figure 3.21: Preface 2
Figure 3.22: MSE line removal
Figure 3.23: Boundary extraction
Figure 3.24: Edge thinning
Figure 3.25: Freeman's chain code
Figure 3.26: Break points
Figure 3.27: Approximate boundary figure
Figure 3.28: Polygon approximation
Figure 3.29: Quality of approximation
Figure 3.30: Cuneiform patterns
Figure 3.31: Approximated feature points
Figure 3.32: Approximate points
Figure 3.33: Feature vector
Figure 3.34: Cuneiform character
Figure 3.35: Classification process
Figure 3.36: Color cuneiform image
Figure 3.37: Enhanced state
Figure 3.38: Binarized cuneiform image
Figure 3.39: Spot-off cuneiform image
Figure 3.40: Cleared cuneiform binary image
Figure 3.41: Labeled cuneiform image
Figure 3.42: Classification results against each symbol
Figure 3.43: Cuneiform character matching-code problem
Figure 3.44: Cuneiform patterns with their centroids
Figure 3.45: Virtual cuneiform patterns
Figure 3.46: Density centroid pixel
Figure 3.47: Separated training binary cuneiform symbols
Figure 3.48: Extracted search point
Figure 3.49: Model (2) post-processing
Figure 3.50: Illumination problem
Figure 3.51: Classification problem
Figure 3.52: Proposed cuneiform image fusion diagram
Figure 3.53: Approximated figures output by the proposed method
Figure 4.1: Deformation problem
Figure 4.2: Cuneiform images with their corresponding enhanced images
Figure 4.3: Output of different LPFs with different sizes
Figure 4.4: Removing spots
Figure 4.5: Removing writing lines
Figure 4.6: Image binarization methods
Figure 4.7: Learning patterns with the same direction
Figure 4.8: Feature vector of EFD
Figure 4.9: Elliptical Fourier descriptors
Figure 4.10: Hu's moments feature vector
Figure 4.11: Four ZM values for each square zone
Figure 4.12: Polygon approximation
Figure 4.13: Approximation steps
Figure 4.14: Symbol classification results
Figure 4.15: Cuneiform symbol deformation
Figure 4.16: Cuneiform character recognition

Table of Algorithms

Algorithm 2.1: Image enhancement by frequency domain
Algorithm 2.2: Iterative threshold algorithm
Algorithm 2.2.1: Iterative threshold algorithm (applied locally)
Algorithm 2.3: Connected components extraction
Algorithm 2.4: Sampled distance transform
Algorithm 2.5: Reverse polygonization algorithm
Algorithm 3.1: Cuneiform image thresholding
Algorithm 3.2: Spot elimination
Algorithm 3.3: Isolation algorithm
Algorithm 3.4: Statistical line-removal method
Algorithm 3.5: MSE line-removal method
Algorithm 3.6: Cuneiform symbol approximation algorithm
Algorithm 3.7: Generate structure features
Algorithm 3.8: Training the virtual symbol learning set
Algorithm 3.9: PNN cuneiform symbol classification
Algorithm 3.10: Training post-processing
Algorithm 3.11: Test features post-processing
Algorithm 3.12: Cuneiform image fusion algorithm


List of Abbreviations

Ant Colony Optimization (ACO)
Assyrian Cuneiform Character Recognition System (ACRS)
Break Points
Back Propagation (BP)
City-Block Distance Transform
Convolutional Neural Network (CNN)
Connected-Component Labeling (CCL)
Compression Ratio (CR)
Dominant Points (DP)
Distance Transform (DT)
Euclidean Distance Transform (EDT)
Elliptical Fourier Descriptor (EFD)
Feature Extraction (FE)
Fourier Transform (FT)
Gabor Fourier Transform
High-Pass Filter (HPF)
Ideal Low-Pass Filter (ILPF)
k-Nearest Neighbor (KNN)
Low-Pass Filter (LPF)
Matching Score (MS)
Neural Network (NN)
Optical Character Recognition (OCR)
Probability Density Function (PDF)
Particle Swarm Optimization (PSO)
Probabilistic Neural Network (PNN)
Pattern Recognition (PR)
Recognition Ratio (RR)
Support Vector Machine (SVM)
Symbol Structure Vector (SSV)
True Positive (TP)
Tabu Search (TS)
Wavelet Transform (WT)
Zernike Moments (ZM)



Abstract

Writing is one of the oldest and most important inventions of humanity, and it began in the land of Mesopotamia. Writing passed through many stages of development; cuneiform writing took the form of patterns engraved on stone or pressed into clay tablets to form the cuneiform characters. International museums, such as the Iraqi Museum, hold thousands of cuneiform tablets, a large proportion of which have not been translated because of the shortage of translators and the difficulty of this language. This thesis therefore presents a proposed cuneiform recognition system as a solution to this problem, based on pattern recognition techniques and in particular optical character recognition (OCR). It also offers a new approach, based on image morphology and the distance transform, to the problem of removing unwanted objects such as spots and writing lines. The proposed training dataset in this thesis consists of virtual rectangular shapes, as a new approach, distributed over four patterns forming a number of classes; it takes into consideration the shadow probabilities associated with the cuneiform geometric features (a newly proposed factor) that result from reflected light.

This thesis presents a comparison between a proposed feature extraction algorithm based on the polygon approximation principle and classical feature extraction methods such as elliptic Fourier descriptors, Zernike moments, Hu's moments, and projection histograms. The classification task is implemented with more than one classifier, namely a probabilistic neural network (PNN) and a support vector machine (SVM) with multiple kernel discriminant functions, in order to reach a reliable decision when evaluating the newly proposed feature extraction methods.

The thesis also offers a proposed post-processing algorithm to solve the duplicated state, in which cuneiform characters have the same classification features, depending on approximated points computed with the distance transform. To evaluate the system performance and the comparison of feature extraction techniques, the accuracy results obtained were as follows: 95% for the proposed approximation technique with a PNN, 70% for EFD with an SVM using a polynomial kernel, 57% for the projection histogram with an SVM using an RBF kernel, 35% for Hu's moments with an SVM using an RBF kernel, and 26% for ZM with a PNN. After adopting the proposed feature extraction algorithm, the recognition accuracy of the proposed system is 94%. Finally, the accuracy achieved by the proposed preprocessing algorithms for removing unwanted objects, spots and writing lines, is 95% and 92% respectively.

Chapter one General Introduction

1.1 Introduction
Pattern recognition is one of the important branches of artificial intelligence. It is the science that tries to make machines as intelligent as humans at recognizing patterns and classifying them into desired categories in a simple and reliable way, in order to make the right decisions in various applications such as remote sensing and computer vision, through methods like statistical estimation and recognition, clustering, fuzzy sets, syntactic recognition, approximate reasoning, and neural networks (NN). Humans want to take advantage of this development to automate their applications for searching, manipulating, accessing, analyzing, and decision making. Writing has been and remains an essential aspect of humanity because it reflects all aspects of human communication and documentation over time. This motivates attention to the historical stages of writing development, which is important when automating writing, archiving, and translation.

1.2 Pattern Recognition
Pattern recognition (PR) is a field of machine learning that achieves the learning process through theories and methods for designing machines that recognize the pattern of an object, leading finally to assigning the correct pattern to the object. The structure of a PR system can be summarized as follows [Pri13].



- Data acquisition and preprocessing: raw data are taken from the environment and preprocessed to remove noise and unwanted features.
- Feature extraction: the relevant data are extracted from the processed data to create the classification features, represented by a feature vector.
- Decision making: the decision is made by a classifier or descriptor from the extracted features.

The block diagram of a PR system is shown in figure (1.1) [Pri13], [Vin12].

Figure 1.1 pattern recognition system

Generally, PR can be categorized by training state into two types [Pri13]:

Supervised learning: the training set is provided with each instance labeled according to the correct output; the learning procedure therefore generates a function model that attempts to map the input patterns to the output target patterns.

Unsupervised learning: the learning set is not labeled according to target patterns; the model attempts to find inherent patterns in the dataset that can then be used to assign the correct pattern to new testing data.
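The supervised case can be illustrated with a deliberately small sketch (ours, not the thesis's): a nearest-centroid model that learns one mean vector per labeled class and assigns a test pattern to the closest centroid.

```python
import numpy as np

# Illustrative sketch of supervised learning: store the mean feature vector
# (centroid) of each labeled class, then assign a new pattern to the class
# whose centroid is nearest. All names here are ours.

def fit_centroids(X, y):
    """X: (n_samples, n_features) training patterns, y: class labels."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}

def predict(centroids, x):
    """Return the label whose centroid is closest to pattern x."""
    return min(centroids, key=lambda label: np.linalg.norm(x - centroids[label]))

# Two toy classes in a 2-D feature space.
X = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.8, 5.0]])
y = np.array([0, 0, 1, 1])
model = fit_centroids(X, y)
print(predict(model, np.array([0.1, 0.2])))  # near class 0
print(predict(model, np.array([5.0, 4.9])))  # near class 1
```

An unsupervised method would instead discover the two clusters itself (e.g. k-means) without being given `y`.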


1.2.1 Pattern Recognition Approaches [Ani00]
a- Template matching approach. One of the simplest and earliest approaches to pattern recognition is based on template matching. The pattern to be recognized is matched against stored templates while taking into account all allowable pose (translation and rotation) and scale changes; the similarity measure is often a correlation.
b- Statistical approach. In the statistical approach, each pattern is represented in terms of features or measurements and is viewed as a point in a d-dimensional space.
c- Syntactic approach. In many recognition problems involving complex patterns, it is more appropriate to adopt a hierarchical perspective, where a pattern is viewed as being composed of simple sub-patterns which are themselves built from yet simpler sub-patterns.
d- Neural networks approach. Neural networks can be viewed as massively parallel computing systems consisting of an extremely large number of simple processors with many interconnections. Neural network models attempt to use organizational principles (such as learning, generalization, adaptivity, fault tolerance, distributed representation, and computation) in networks of weighted directed graphs in which the nodes are artificial neurons and the directed edges (with weights) are connections between them.

Pattern recognition today is applied across a wide area of science and engineering, with applications in manufacturing, healthcare, and the military. Some important applications of PR are:

- Optical character recognition (OCR)
- Automatic speech recognition
- Personal identification systems
- Object recognition
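Approach (a), template matching, can be sketched as follows; this is an illustrative implementation with invented names, using the correlation coefficient as the similarity measure and omitting the search over rotation and scale for brevity.

```python
import numpy as np

# Slide a small template over every position of an image and report the
# best-scoring location, using the correlation coefficient as similarity.

def match_template(image, template):
    th, tw = template.shape
    best_score, best_pos = -1.0, None
    for i in range(image.shape[0] - th + 1):
        for j in range(image.shape[1] - tw + 1):
            window = image[i:i+th, j:j+tw].astype(float)
            w, t = window - window.mean(), template - template.mean()
            denom = np.sqrt((w**2).sum() * (t**2).sum())
            score = (w * t).sum() / denom if denom > 0 else 0.0
            if score > best_score:
                best_score, best_pos = score, (i, j)
    return best_pos, best_score

# Plant an exact copy of the template in an empty image and find it.
template = np.ones((3, 3)); template[1, 1] = 0.0
image = np.zeros((8, 8)); image[2:5, 3:6] = template
pos, score = match_template(image, template)
print(pos)  # (2, 3)
```

In practice the nested loops are replaced by FFT-based correlation, but the decision rule (take the position with the highest similarity) is the same.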


1.2.2 Pattern Recognition and Cuneiform Writing
Despite the extensive applications of PR in different directions, its use is still limited in the field of cuneiform writing, especially with respect to research and scientific theses. This thesis therefore supports this trend in the recognition field, especially with the OCR technique, for the Assyrian cuneiform writing of the first millennium BC.

1.3 Literature Survey
This section reviews various methods and approaches that have been used for cuneiform writing: recognition, retrieval, and preprocessing.

In 2018, Nils M. Kriege et al. proposed two methods for recognizing cuneiform symbols. The first is a graph model based on the graph edit distance, computed by an efficient heuristic. The second is a convolutional neural network (CNN), presented to overcome the computational cost of the learning phase; the recognition accuracy of this model increases with the size of the dataset. The recognition rate achieved was 90.23% [Nil18].

In 2016, Khalid Fardousse et al. introduced a simple and efficient motif recognition system using features extracted from motif images, based on polygonal approximation and normalized chain codes. The system was evaluated on their polygonal forms database and basic motifs database, and a system for recognizing off-line handwritten craft motifs was developed; different preprocessing and segmentation techniques and neural classifiers with different features were also discussed. The maximum recognition rate achieved was 90% with a radial basis function classifier and 94% with a feed-forward NN [Kha16].

In 2014, Fahimeh Mostofi et al. proposed an intelligent recognition system for Ancient Persian cuneiform characters based on a supervised learning model, a back-propagation (BP) neural network, where the testing dataset was created by subjecting the original learning set to a Gaussian filter with different values of standard deviation. Otsu's binarization model was adopted for computing the global threshold value. The accuracy achieved was 89-100% [Fah14].

In 2013, Naktal M. Edan proposed methods for recognizing cuneiform symbols depending on statistical and structural features derived from projection histograms, the center of gravity, and connected-component features. To separate the distinguishing features of each class of symbols, k-means clustering was used. A multilayer perceptron (MLP) was applied for the classification task, where the recognition accuracy differed per class, from 83.3% to 90.4% [Nak13].

In 2012, Kawther K. Ahmed proposed more than one method for extracting recognition information from cuneiform tablet images, in both offline and online approaches, and evaluated them. The work depended on structural skeleton features, defined as a Symbol Structure Vector (SSV), with the k-nearest neighbor (KNN) model for classification. The recognition rate achieved was 97% [Kaw12].

In 2006, R. Sanjeev Kunte et al. presented an OCR system developed for the recognition of basic characters (vowels and consonants) in printed Kannada text, which can handle different font sizes and font types. Hu's invariant moments and Zernike moments, which have been progressively used in pattern recognition, are used in this system to extract the features of printed Kannada characters, and neural classifiers were effectively used for classifying the characters based on the moment features. A recognition rate of 96.8% was obtained [San06].

Also in 2006, Hilal Yousif et al. suggested a recognition method for handwritten images of cuneiform text. The method makes use of the fact that there is a finite number of images for the symbols and tries to differentiate between them depending on intensity profile curves, which represent the intensities of selected pixels in the image. The accuracy rate achieved was above 90% [Hil06].
In 2001, Al-Aai proposed a recognition approach for cuneiform symbols depending on recognition features generated from a binary cuneiform tablet image by applying seven suggested transform forms to each pixel and its neighbors for each cuneiform symbol. Each cuneiform character thus has distinguishing features, related to the number of symbols and their directions, which are used for the recognition task. The classification process was implemented through association rules, as an indexing process distributed over a tree structure [Ani01].

The results published in the literature show that this research did not take into consideration the geometric shape of the cuneiform symbols, which therefore affects the generation of different segmented patterns depending on the angle of reflected light. Another gap is the lack of interest in treating unwanted objects, such as spots and cuneiform writing lines, associated with the cuneiform patterns, where the spots are a result of image distortion. Finally, a duplicated recognition state occurs when the recognition features of different cuneiform characters correspond. The problems related to shadows and spots are illustrated in Appendix A. This thesis addresses these problems and solves them by proposing algorithms together with a new proposed virtual training dataset.

1.4 Aim of Thesis
The aim of this thesis is to develop an approach to recognize Assyrian cuneiform tablet images by applying pattern recognition techniques, especially optical character recognition (OCR), depending on a proposed virtual training dataset of regular triangle patterns, adopting a polygon approximation technique as the feature extraction method, and proposing a post-processing algorithm to solve the duplicated recognition state.

1.5 Thesis Contributions
The main contributions of this thesis are as follows:
1- Proposing an Assyrian cuneiform character recognition system (ACRS) that applies the principles of optical character recognition (OCR), creating an evident advantage in supporting search engine processes.
2- Proposing an efficient virtual training dataset consisting of trigonometric forms reflecting all the possibilities in which the cuneiform symbols are formed as patterns, depending on an analysis of the three-dimensional geometry of the cuneiform symbol and the shadows formed by the effect of the light angle.
3- Proposing preprocessing algorithms for erasing interfering objects such as spots and writing lines: two algorithms for erasing writing lines, depending on the distance transform with a sampled function as a new approach, and a proposed spot-erasing algorithm depending on image labeling through image morphology.
4- Proposing a new approach for extracting features to create the feature vector by polygon approximation with dominant points (DP), through a proposed algorithm that combines the two approximation approaches.
5- Adopting a probabilistic neural network (PNN) as a new classification approach to classify the cuneiform symbol direction as horizontal, vertical, or diagonal.
6- Comparing the recognition accuracy achieved on cuneiform symbols between the common feature extraction methods and the proposed method, employing the PNN and a support vector machine (SVM) for the multi-classification process.
7- Proposing a new post-processing technique for solving the duplicated recognition state, depending on features extracted from specific approximated points for each pattern and the distance transform.
8- Adopting an image fusion technique with a transform approach, the wavelet transform, to increase the recognition accuracy results.
9- Proposing a thresholding algorithm that implements the segmentation stage depending on the Niblack and Sauvola methods, where the selection criterion between them is based on the statistical skewness measure.
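The thresholding idea in contribution 9 (choosing between Niblack and Sauvola by a skewness measure) can be sketched as follows. The two formulas are the standard Niblack and Sauvola local thresholds; the selection rule, window size, and parameter values are our illustrative assumptions, not the thesis's tuned choices.

```python
import numpy as np

def local_stats(img, i, j, w):
    """Mean and std of the w-by-w window centered at (i, j), clipped at edges."""
    half = w // 2
    win = img[max(0, i - half):i + half + 1, max(0, j - half):j + half + 1]
    return win.mean(), win.std()

def niblack_threshold(m, s, k=-0.2):
    return m + k * s                         # Niblack: T = m + k*s

def sauvola_threshold(m, s, k=0.5, R=128.0):
    return m * (1 + k * (s / R - 1))         # Sauvola: T = m*(1 + k*(s/R - 1))

def skewness(img):
    """Third standardized moment of the intensity distribution."""
    x = img.astype(float).ravel()
    sd = x.std()
    return 0.0 if sd == 0 else ((x - x.mean()) ** 3).mean() / sd ** 3

def binarize(img, w=15):
    # Illustrative selection rule (assumption): use Sauvola for strongly
    # skewed intensity histograms, Niblack otherwise.
    use_sauvola = abs(skewness(img)) > 1.0
    out = np.zeros_like(img, dtype=np.uint8)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            m, s = local_stats(img, i, j, w)
            t = sauvola_threshold(m, s) if use_sauvola else niblack_threshold(m, s)
            out[i, j] = 255 if img[i, j] > t else 0
    return out
```

The k values shown (-0.2 for Niblack, 0.5 with R = 128 for Sauvola) are the commonly cited defaults for these methods, not necessarily the parameters used in chapter three.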



1.6 Organization of Thesis
This thesis is organized into five chapters; a brief description of their contents is given here:
Chapter two briefly describes the theoretical background: character recognition and its types, and the steps and algorithms on which the design of the recognition system is based, in addition to the historical emergence of cuneiform writing and the problems associated with the cuneiform recognition process.
Chapter three presents the proposed algorithms used to design the proposed system and the implementation of each one.
Chapter four discusses the experiments and the results obtained from implementing the proposed recognition system steps and compares the results with traditional methods.
Chapter five presents the conclusions and illustrates a number of suggestions for future work.


Chapter two Theoretical Background

2.1 Introduction
Optical character recognition (OCR) is considered one of the main branches of pattern recognition. Starting from the middle of the twentieth century, specifically in 1950, and until today, this field has been subject to research and development, as a result of its support for institutional and government applications, which can easily be seen in financial, banking, and archiving applications.

There are many definitions of OCR. Some define it as the process of selecting an image segment from a scanned image file and determining the corresponding text character [You12], [Pri17], or as the process of choosing the right pattern for the image segment. Typically, the framework of an optical character recognition system consists of the following steps [Roh12], [Pri17]:

1- Preprocessing
2- Segmentation
3- Feature extraction
4- Classification
5- Post-processing
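The five steps can be chained into a pipeline sketch; every stage below is a deliberately trivial stand-in with invented names, shown only to make the data flow concrete. A real system plugs in concrete algorithms (thresholding, connected components, EFD or moment features, a PNN or SVM, and so on).

```python
# Illustrative skeleton of the five-stage OCR framework listed above.

def preprocess(image):
    return [[1 if px > 127 else 0 for px in row] for row in image]  # crude binarization

def segment(binary):
    return [binary]  # stand-in: treat the whole image as one symbol

def extract_features(symbol):
    return sum(map(sum, symbol))  # stand-in feature: foreground pixel count

def classify(feature):
    return "symbol-A" if feature > 2 else "symbol-B"  # stand-in 1-feature classifier

def postprocess(labels):
    return labels  # a real system resolves ambiguous or duplicated results here

def recognize(image):
    binary = preprocess(image)                    # 1. preprocessing
    parts = segment(binary)                       # 2. segmentation
    feats = [extract_features(p) for p in parts]  # 3. feature extraction
    labels = [classify(f) for f in feats]         # 4. classification
    return postprocess(labels)                    # 5. post-processing

print(recognize([[0, 200, 255], [0, 180, 0]]))  # → ['symbol-A']
```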



2.2 Character Recognition Types
Character recognition is mainly classified into two types: online and offline character recognition [Mus16], [Dew09].

2.2.1 Online Character Recognition
This is an automatic conversion process that converts the digital information (digital ink) generated by a personal digital assistant (PDA) or tablet to suitable text in real time [Dew09]. The process depends on spatial similarity metrics related to the features of the different strokes, such as their number, directions, and order. Frequently these features are digital information translated as a dynamic representation of the sensed states of an electronic pen tip (up, down, and movement), which reduces the time consumed and improves accuracy. Generally, online character recognition is less difficult than offline recognition because of the available dynamic information [Mus16].

2.2.2 Offline Character Recognition
This technique is applied to scanned images of typewritten or handwritten text to recognize its characters. Generally, all offline character recognition techniques start by submitting the image to an enhancement process to generate suitable features that agree with the classification model [Dew09]. Offline recognition is more difficult than online techniques because of problems related to noise, distortion, and different styles of handwriting [Mus16].

2.3 Preprocessing
Preprocessing represents the essential step in recognition systems after the image acquisition process. Basically, the preprocessing step is designed and applied so that the next analysis step receives a reduced amount of noise while maintaining as much of the significant information as possible. Generally, preprocessing operations include image thinning, edge detection, noise removal, and image normalization [Pov14].



2.3.1 Image Enhancement
Image enhancement is one of the most important image processing techniques; through it, the digital image features are reconstructed to suit the nature of the application, whether medical, military, or satellite imaging. The primary objective is to treat the problems related to blurring, contrast, and noise [Jan15]. The enhancement task takes two directions: the first uses human vision as the criterion for evaluation, and the second supports and improves the image qualities used by machine vision in the identification process [Moh17]. Image enhancement techniques can be classified into two categories [Jan15], [Moh17]:

A. Spatial Domain Techniques
In these techniques, each pixel in the image is treated together with its neighboring pixels. Methods used in this direction include histogram equalization and the power-law and logarithmic transforms. The advantage of these techniques is that they are easy to apply and understand; the disadvantage is that the implementation treats all components of the image uniformly, which is not useful when the processing should be limited to specific areas of the image [Gur14], [Sne12]. This technique can be formulated according to the mathematical formula:

g(x,y) = T[f(x,y)]    ... (2.1)

where f(x,y) represents the input image, g(x,y) the output image, and T is the spatial operator.

B. Frequency Domain Techniques
In frequency-domain techniques, the image enhancement process is implemented by transforming the image to the frequency domain with a discrete transform, such as the discrete Fourier transform (DFT), the discrete cosine transform (DCT), or the discrete wavelet transform (DWT), manipulating the image's transform coefficients with a selected operator (filter), and then applying the inverse transform. The orthogonal transform of the image has two parts, phase and magnitude; the first one is used to restore the real

Chapter Two

Theoretical Background

image values. The transformed technique can be represented by following equation form[Raf02], [pin14]: g(x,y)=h(x,y)*p(x,y)

... (2.2)

Where g(x,y) is transformed image ,p(x,y) original image and h(x,y) is transformation function ,the block diagram steps for this technique can be summarized as follows [pin14] [32]:

Figure 2.1: Image enhancement by frequency-domain transform steps.

Regarding its advantages, this technique gives excellent results for smoothing and for eliminating high-frequency noise compared with the spatial approach; its drawback is that it cannot treat all image regions simultaneously when a locally adapted result is required [Moh17]. Cutting off the high frequencies produces a smoothed version of the original image, while cutting off the low frequencies sharpens the image features.
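To make the spatial formulation g(x,y) = T[f(x,y)] of equation (2.1) concrete, the following is a minimal NumPy sketch of two classic spatial operators T, the logarithmic and power-law (gamma) transforms; the function names and the rescaling back to the [0, 255] range are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

def log_transform(img, c=1.0):
    """g = c * log(1 + f): expands dark tones, compresses bright ones."""
    g = c * np.log1p(img.astype(np.float64))
    return g / g.max() * 255.0          # rescale back to [0, 255]

def power_transform(img, gamma=0.5, c=1.0):
    """Power-law g = c * f^gamma on a [0, 1]-normalized image;
    gamma < 1 brightens the image, gamma > 1 darkens it."""
    f = img.astype(np.float64) / 255.0
    return c * (f ** gamma) * 255.0

img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
brightened = power_transform(img, gamma=0.5)   # mid-tones are pushed up
```

Both operators act on every pixel independently, which is exactly the uniform behaviour described above as the main limitation of the spatial approach.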

2.3.2 Image Enhancement Filters
Filters in the image enhancement paradigm can be categorized into two types: low-pass filters (LPF) and high-pass filters (HPF).

1. Low-Pass Filters in the Spatial Domain
Spatial low-pass filters take one of two forms, linear or non-linear. In a linear filter the new pixel value is computed from the pixel together with its participating neighbor pixels, whereas a non-linear filter depends on a predefined selection criterion that chooses the optimal value from among the neighbors [Pin14]. The common spatial low-pass filters are reviewed below. A linear filter is generally defined as [Man15], [Raf02]:

k(x,y) = Σs Σt w(s,t) f(x+s, y+t)                          ... (2.3)

where w is the kernel filter with coordinates (s,t) and f(x,y) is the image pixel at row and column (x,y).

 Mean (Averaging) Filter: the average filter divides the sum of the image pixels inside a predefined local window W by the number of pixels in the window, and the computed value becomes the new pixel value:

F(x,y) = (1/N) Σ(s,t)∈W f(s,t)                             ... (2.4)

where N is the number of pixels in the window.

 Median Filter: the median filter replaces each pixel with the median of the ranked pixel values inside a window of size 2k+1:

y(n) = med[x(n-k), ..., x(n), ..., x(n+k)]                 ... (2.5)

where y(n) is the output pixel and x(n-k), ..., x(n+k) are the ranked pixel values in the specific window.

 Gaussian Filter: the Gaussian filter smooths the image's edges by attenuating the high frequencies of the image's color. The kernel mask approximates the Gaussian function below and is applied by the convolution process of form (2.3):

G(x,y) = (1 / (2πσ²)) e^(-(x² + y²) / (2σ²))               ... (2.6)

where σ is the standard deviation.


Figure 2.2: Gaussian kernel mask values
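The three smoothing filters above can be sketched as a minimal NumPy implementation; the function names, the reflect/edge border-padding policy, and the test signals are illustrative assumptions rather than the thesis's own code:

```python
import numpy as np

def mean_filter(img, size=3):
    """Averaging filter (eq. 2.4): each output pixel is the mean of its
    size x size neighborhood; borders are reflect-padded."""
    pad = size // 2
    p = np.pad(img.astype(np.float64), pad, mode='reflect')
    h, w = img.shape
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = p[y:y + size, x:x + size].mean()
    return out

def median_filter_1d(row, k=1):
    """Median filter (eq. 2.5): y(n) = med[x(n-k), ..., x(n), ..., x(n+k)]."""
    p = np.pad(np.asarray(row, dtype=np.float64), k, mode='edge')
    return np.array([np.median(p[i:i + 2 * k + 1]) for i in range(len(row))])

def gaussian_kernel(size=3, sigma=1.0):
    """Sample G(x,y) ~ exp(-(x^2+y^2)/(2*sigma^2)) on a size x size grid and
    normalize so the mask sums to 1 (a brightness-preserving convolution)."""
    half = size // 2
    ax = np.arange(-half, half + 1)
    xx, yy = np.meshgrid(ax, ax)
    kern = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kern / kern.sum()

noisy = np.array([[10., 10., 10.],
                  [10., 100., 10.],
                  [10., 10., 10.]])
smoothed = mean_filter(noisy)                        # centre becomes the 3x3 mean, 20.0
despiked = median_filter_1d([10, 10, 200, 10, 10])   # single impulse is removed
kern = gaussian_kernel(3, 1.0)
```

The example shows the characteristic behaviour of each filter: the mean filter spreads the impulse over its neighborhood, while the median filter removes it entirely, which is why the median is preferred for salt-and-pepper noise.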

2. Low-Pass Filters in the Frequency Domain
In the frequency domain the gray tones of an image are distributed over two bands, low and high frequencies, and each band carries part of the image's components. Most of the image's gray-level tones occupy the low frequencies, while the edges and the noise take the high ones. In the centred frequency-domain representation the low frequencies lie near the origin of the axes and the high frequencies lie away from it, figure (2.3). Therefore, to eliminate or attenuate the noise, the suitable choice for this problem is a low-pass filter: it cuts off the high distributed frequencies and allows only the low frequencies into the newly generated image, figure (2.4.d), following the algorithm steps below [Raf02], [Mil08].



Figure 2.3: Frequency-domain filtering. a) high color frequencies cut off, b) high color frequencies allowed.







Figure 2.4: Image enhancement in the frequency domain. a) original image, b) histogram after cutting off the high color frequencies, c) histogram after cutting off the low color frequencies, d) blurred (smoothed) enhanced image, e) sharpened enhanced image.

Algorithm (2.1): Image Enhancement in the Frequency Domain
Input: gray image. Output: enhanced image.
Begin
  Step 1: read the gray image (Igray).
  Step 2: compute the Fourier image F(u,v) by applying the DFT to Igray.
  Step 3: multiply F(u,v) by the low-pass filter H(u,v): K(u,v) = F(u,v) * H(u,v).
  Step 4: compute the inverse DFT f(x,y) of K(u,v).
  Step 5: take the real part of the previous step to create the enhanced image (Im).
  Step 6: return (Im).
End

where F(u,v) is the image's Fourier transform and f(x,y) is the inverse Fourier transform:

F(u,v) = Σx Σy f(x,y) e^(-j2π(ux/M + vy/N))                ... (2.7)

f(x,y) = (1/MN) Σu Σv F(u,v) e^(j2π(ux/M + vy/N))          ... (2.8)

3. Smoothing Filters in the Frequency Domain
1- Ideal low-pass filter (ILPF).
2- Butterworth low-pass filter (BLPF).
3- Gaussian low-pass filter (GLPF).

4. Ideal Low-Pass Filter (ILPF)
The ideal low-pass filter can be defined as:

H(u,v) = 1   if D(u,v) ≤ D0
H(u,v) = 0   if D(u,v) > D0                                ... (2.9)

where D0 is a nonnegative value representing the radius of the cutoff frequency and D(u,v) is the distance from the point (u,v) to the centre of the frequency rectangle. The ideal filter passes all low frequencies at a distance less than or equal to D0, while the others (outside) are attenuated, figure (2.5) [Raf02].

Figure 2.5: Ideal low-pass filter.
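The steps of Algorithm (2.1) with the ideal transfer function (2.9) can be sketched with NumPy's FFT routines; the function name and the random test image are assumptions made for demonstration:

```python
import numpy as np

def ideal_lowpass(img, d0):
    """Frequency-domain smoothing with an ideal low-pass filter:
    H(u,v) = 1 if D(u,v) <= D0, else 0, with D measured from the
    centre of the shifted spectrum (Algorithm 2.1 steps)."""
    F = np.fft.fftshift(np.fft.fft2(img))          # centred Fourier spectrum
    rows, cols = img.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    V, U = np.meshgrid(v, u)
    D = np.sqrt(U ** 2 + V ** 2)                   # distance from spectrum centre
    H = (D <= d0).astype(np.float64)               # ideal transfer function (2.9)
    G = F * H                                      # cut off the high frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))

rng = np.random.default_rng(0)
img = rng.random((32, 32)) * 255
smooth = ideal_lowpass(img, d0=8)                  # blurred version of img
```

Because the DC term at the spectrum centre always satisfies D(u,v) ≤ D0, the mean brightness is preserved while the variance drops, which is exactly the blurring effect shown in figure (2.4.d).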



2.4 Image Segmentation
Image segmentation techniques partition the image's pixels into segments or regions, each of which carries a distinguishing label. This simplification turns the image features into an easier, more meaningful form that supports the advanced analysis and recognition stages [Suj14], [Anj17]. Image segmentation techniques are categorized into two branches, block-based and layer-based segmentation, as seen in the following diagram [Nid15].

Figure 2.6: Image segmentation methods.

2.4.1 Image Thresholding
Thresholding is a popular image segmentation technique adopted by a large number of binarization methods. It separates the image into two sets of regions based on a selected threshold value (T): if a pixel's intensity value is larger than the threshold it represents a foreground region, and in the opposite case it is considered background, as in the mathematical formula below [Anj17]:

G(x,y) = 1   if f(x,y) > T
G(x,y) = 0   if f(x,y) ≤ T                                 ... (2.10)

Thresholding is implemented in one of two ways, locally or globally. In the first, the value of the threshold is determined at every window position of the image, whereas the global threshold is computed once, depending on the information of the whole image. The global method is adopted where the image has an evident separation between the characters and the background; on the contrary, the local method gives clearer results on images whose color features vary locally [Anj17]. The various thresholding techniques are reviewed below.

2.4.2 Image Thresholding Techniques
A. Niblack's Method
In this method the threshold is computed locally in every rectangular window of the image, by calculating the mean and standard deviation of all intensity values in each window [Pra06]:

T = m + k·σ                                                ... (2.11)

where k takes a constant value in [0, 1] and m and σ represent the mean and standard deviation respectively. Commonly the size of the local window is 15×15. The disadvantage of this method is that it gives poor results precisely where the original image is degraded by noise.

B. Sauvola's Method
This method was proposed to solve the noise problem of Niblack's method. Depending on the same parameters reviewed above, the mean and standard deviation, the threshold is computed as:

T = m · (1 - k·(1 - σ/r))                                  ... (2.12)

where k and r are constants, set to 0.5 and 128 respectively. Like the previous method, the window size must be determined in advance, and it produces poor results where the edge between the background and the character has low contrast [Pra06].
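Equations (2.11) and (2.12) can be sketched directly; the brute-force window statistics, the reflect padding at the borders, and the bright-square test image below are illustrative assumptions:

```python
import numpy as np

def local_stats(img, size=15):
    """Mean and standard deviation of each size x size window (brute force)."""
    pad = size // 2
    p = np.pad(img.astype(np.float64), pad, mode='reflect')
    h, w = img.shape
    m = np.empty((h, w))
    s = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            win = p[y:y + size, x:x + size]
            m[y, x] = win.mean()
            s[y, x] = win.std()
    return m, s

def niblack(img, size=15, k=0.2):
    """Niblack binarization: per-window threshold T = m + k*sigma (eq. 2.11)."""
    m, s = local_stats(img, size)
    return img > (m + k * s)

def sauvola(img, size=15, k=0.5, r=128.0):
    """Sauvola binarization: T = m * (1 - k*(1 - sigma/r)) (eq. 2.12)."""
    m, s = local_stats(img, size)
    return img > (m * (1.0 - k * (1.0 - s / r)))

img = np.full((20, 20), 30.0)      # dark background
img[5:15, 5:15] = 220.0            # bright object
bn = niblack(img)
bs = sauvola(img)
```

Both methods classify the bright square as foreground and the dark border as background on this synthetic image; Sauvola's σ/r term mainly changes how the threshold reacts in low-contrast windows.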



C. Otsu's Method
Otsu's method is a global thresholding method that converts a gray image into a binary image. It is a linear discriminant statistical method that separates the image features into two homogeneous color bands, the first related to the foreground (objects, symbols) and the other to the background. Starting from the image histogram, an iterative procedure separates the gray levels at each candidate threshold I into two color intervals (I0 = dark, I1 = light), with densities I0 = {0, 1, 2, ..., I} and I1 = {I+1, I+2, ..., K-1}. The global threshold value is the one minimizing the within-class variance [Jam11]:

σ²w(I) = wb(I)·σ²b(I) + wf(I)·σ²f(I)                       ... (2.13)

where

wb(I) = Σ(i=0..I) p(i)                                     ... (2.14)
wf(I) = Σ(i=I+1..K-1) p(i)                                 ... (2.15)
µb(I) = Σ(i=0..I) i·p(i) / wb(I)                           ... (2.16)
µf(I) = Σ(i=I+1..K-1) i·p(i) / wf(I)                       ... (2.17)
σ²b(I) = Σ(i=0..I) (i - µb(I))²·p(i) / wb(I)               ... (2.18)
σ²f(I) = Σ(i=I+1..K-1) (i - µf(I))²·p(i) / wf(I)           ... (2.19)

Here p(i) is the normalized histogram value of gray level i, and wα, µα, and σ²α are the weight, mean, and variance of class α. The separation process is repeated iteratively, shifting the candidate threshold one density level at a time and recalculating equations (2.14)-(2.19), until the minimum value of (2.13) is satisfied as the selection criterion.

D. Iterative Threshold
The global threshold value can be determined by the iterative threshold technique using the following algorithm steps [She13]:
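Otsu's criterion (2.13)-(2.19) can be sketched as an exhaustive search over all 256 candidate thresholds; the function name and the two-tone test image are assumptions for demonstration:

```python
import numpy as np

def otsu_threshold(img):
    """Exhaustive Otsu: pick the gray level I minimizing the within-class
    variance  sigma_w^2 = w_b*sigma_b^2 + w_f*sigma_f^2  (eq. 2.13)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()                  # normalized histogram p(i)
    levels = np.arange(256)
    best_t, best_var = 0, np.inf
    for t in range(1, 256):
        wb, wf = prob[:t].sum(), prob[t:].sum()      # class weights (2.14-2.15)
        if wb == 0 or wf == 0:
            continue
        mb = (levels[:t] * prob[:t]).sum() / wb      # class means (2.16-2.17)
        mf = (levels[t:] * prob[t:]).sum() / wf
        vb = (((levels[:t] - mb) ** 2) * prob[:t]).sum() / wb   # variances (2.18-2.19)
        vf = (((levels[t:] - mf) ** 2) * prob[t:]).sum() / wf
        within = wb * vb + wf * vf                   # criterion (2.13)
        if within < best_var:
            best_var, best_t = within, t
    return best_t

img = np.concatenate([np.full(100, 40), np.full(100, 200)]).astype(np.uint8).reshape(10, 20)
t = otsu_threshold(img)                       # lands between the two modes
```

On this bimodal image the within-class variance is zero for any threshold between the two gray levels, so the search stops at the first such value and cleanly separates the two populations.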



Algorithm (2.2): Iterative Threshold
Input: gray image. Output: binary image.
Begin
  Step 1: compute the initial threshold value (T) as the average image intensity.
  Step 2: using the threshold value, separate the image into two groups of regions R1 and R2.
  Step 3: compute the mean values M1 and M2 of each group.
  Step 4: update the threshold: T = (M1 + M2) / 2.
  Step 5: repeat steps 2-4 iteratively until M1 and M2 no longer change.
End

A new, locally applied version of the iterative threshold proceeds as in the following algorithm (2.2.1) steps [Par97]:

Algorithm (2.2.1): Iterative Threshold (locally applied)
Input: gray image. Output: binary image.
Begin
  Step 1: compute the global threshold T by the iterative threshold algorithm (2.2).
  Step 2: for each pixel, together with its eight neighbors, compute an adaptive threshold locally as in steps 3-4:
  Step 3: if the difference between the neighborhood maximum and minimum is less than T, assign the new pixel value according to the pixel's own color density (bright or dark).
  Step 4: if the difference between the maximum and minimum is greater than or equal to T, set the new pixel white if the old value is nearer to the maximum, or black if the old value is nearer to the minimum.
End
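The global procedure of Algorithm (2.2) can be sketched in a few lines; the function name, the convergence tolerance eps, and the synthetic two-level image are assumptions made for illustration:

```python
import numpy as np

def iterative_threshold(img, eps=0.5):
    """Global iterative threshold (Algorithm 2.2): start from the mean
    intensity, split the pixels into two groups, move T to the midpoint
    of the two group means, and repeat until T stabilizes."""
    t = img.mean()                                   # Step 1: initial T
    while True:
        low, high = img[img <= t], img[img > t]      # Step 2: groups R1, R2
        if low.size == 0 or high.size == 0:
            return t
        new_t = (low.mean() + high.mean()) / 2.0     # Steps 3-4: T = (M1+M2)/2
        if abs(new_t - t) < eps:                     # Step 5: stop on no change
            return new_t
        t = new_t

img = np.concatenate([np.full(50, 20.0), np.full(50, 180.0)])
t = iterative_threshold(img)                         # converges to 100.0
```

On this two-level signal the initial mean already equals the midpoint of the group means, so the algorithm converges immediately; on real images a few iterations are typically needed.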

2.4.3 Image Connected-Component Labeling
Image connected-component labeling (CCL) represents an important field in pattern recognition, computer vision, and machine intelligence. Using this technique, each connected segment in a binary image receives a unique characteristic label distinguishing it from the other labels [Kua03], [Kur15], figure (2.7). The technique is required in different applications such as target identification, diagnosis applications, and biometric applications. Many theories and algorithms have contributed to the evolution of this technique, especially to its speed of performance in real-time applications; generally these can be classified into four classes [Lif09]: multi-scan algorithms, two-scan algorithms, hybrid algorithms, and tracing-type algorithms. The first category can be implemented with image morphology [Raf02], depending on the dilation principle, as follows.



Figure 2.7: Image connected-component labeling. a) binary image, b) labeled image.

1. Image Morphology
Image morphology represents an important field of image processing. Its theoretical side was introduced in 1964 by the two French researchers Matheron and Serra, who presented a set of formulas for image analysis. Image morphology is a combination of non-linear operations relevant to the form of binary image features: it depends on the structural pixel values (geometry and topology features) instead of their color density [Mil08]. The result of morphological processing is therefore an image with new features, which supports pattern recognition and image analysis techniques.

2. Structure Elements
The key factor in applying a morphological process to a binary image is a rectangular array structure or kernel mask, the structure element, which takes different patterns of zeros and ones, figure (2.8) [Rav13]; the choice of a suitable pattern depends on the particular problem. As the mask slides over the image, the morphological process can take two states. The first, when all the one-valued pixels of the structure element match the corresponding neighborhood pixels of the image, is called the fit state; when the match condition is satisfied for at least a single pixel, the hit state is found [Jan12].

Figure 2.8: Structure element.

3. Dilation and Erosion [Rav13]
Dilation and erosion are the two most important operators in image morphology; they are obtained by applying the structure element as a convolution kernel over the image. In dilation, when an image pixel produces the hit state, the corresponding pixel of the new image is set to one; this causes the object to grow, figure (2.9.c). In erosion, the new pixel value equals one only if the fit state is satisfied (the reverse of the first operator), which causes the object to shrink, figure (2.9.b).




Figure 2.9: Dilation and erosion process. a) binary image, b) eroded image, c) dilated image.



Dilation:  A ⊕ B = { z | (B̂)z ∩ A ≠ ∅ }                    ... (2.20)

Erosion:   A ⊖ B = { z | (B)z ⊆ A }                        ... (2.21)

where A is the binary image, B is the structure element, ⊕ and ⊖ are the dilation and erosion operators, and (B)z is B translated by z.

4. Extraction of Connected Components
To reach a labeled image, the first strategy (multi-scan algorithms), represented by the extraction of connected components, is applied to a binary image A whose foreground pixels have the label value 1 and whose background pixels have the label value 0. The process is implemented iteratively under a restricting condition depending on the dilation concept (2.20). The initial step locates the first foreground pixel p, which serves as the seed point of the reconstructed matrix Xk; the structure element B then scans the image A while computing the following form [Mil08], [Raf02], [Shi09]:

Xk = (Xk-1 ⊕ B) ∩ A,   with X0 = p and k = 1, 2, ..., n     ... (2.22)

This iterative process terminates when the condition Xk = Xk-1 is satisfied. Note: to apply the image labeling concept, each regenerated connected component is assigned its own distinguishing label.



Applying the algorithm below regenerates the original binary image, figure (2.10); image labeling is achieved by subjecting each connected object to the algorithm with its own distinguishing label value [Mil08], [Raf02].

Algorithm (2.3): Connected-Component Extraction
Input: binary image. Output: connected-component segment.
Begin
  Step 1: read the input binary image IB.
  Step 2: locate the first foreground pixel p and its location p(x,y).
  Step 3: initialize the structure element B.
  Step 4: k = 0; initialize the connected-component matrix Xk to zeros.
  Step 5: set Xk(x,y) = p(x,y).
  Step 6: repeat
            Y = Xk
            apply the dilation process to Xk and intersect the result with the original image IB: Xk+1 = dilation(B, Xk) ∩ IB
            k = k + 1
          until (Y == Xk)
  Step 7: set CC_MATRIX = Y.
  Step 8: return (CC_MATRIX).
End

Figure 2.10: Regenerated connected-component segment.
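The growth rule Xk+1 = dilation(B, Xk) ∩ IB of Algorithm (2.3) can be sketched as follows, extended with a loop over seeds so that every component receives its own label; the brute-force dilation, the 8-connectivity structure element, and the function names are illustrative assumptions:

```python
import numpy as np

def dilate(img, se):
    """Binary dilation: the output pixel is 1 when the structure element,
    centred there, hits (overlaps) at least one foreground pixel."""
    ph, pw = se.shape[0] // 2, se.shape[1] // 2
    p = np.pad(img, ((ph, ph), (pw, pw)))
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            out[y, x] = np.any(p[y:y + se.shape[0], x:x + se.shape[1]] & se)
    return out

def label_components(binary):
    """Connected-component labeling by iterated dilation (Algorithm 2.3):
    grow X <- dilate(X, B) & A from each unlabeled seed until X stabilizes."""
    se = np.ones((3, 3), dtype=np.uint8)         # structure element B, 8-connectivity
    a = binary.astype(np.uint8)
    labels = np.zeros(a.shape, dtype=int)
    next_label = 0
    while True:
        seeds = np.argwhere((a == 1) & (labels == 0))
        if seeds.size == 0:
            return labels
        next_label += 1
        x = np.zeros_like(a)
        x[tuple(seeds[0])] = 1                   # seed point p
        while True:
            grown = dilate(x, se) & a            # X_{k+1} = dilation(X_k) ∩ A
            if np.array_equal(grown, x):         # terminate when X_k = X_{k-1}
                break
            x = grown
        labels[x == 1] = next_label

img = np.array([[1, 1, 0, 0],
                [1, 0, 0, 1],
                [0, 0, 0, 1]], dtype=np.uint8)
lab = label_components(img)                      # two components -> labels 1 and 2
```

The test image contains two 8-connected components; each stabilized reconstruction X receives one distinct label, exactly as the note under equation (2.22) describes.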



2.5 Image Distance Transform
The image distance transform (DT) plays an essential role in many applications such as pattern recognition, computer vision, robotics, and image matching, particularly for binary image matching using suitable features created by the matching approaches [Muh00]. The distance transform is a conversion process applied to a binary image that produces a gray-level image in which each pixel holds a real value corresponding to the minimum distance between the object pixel (Ob) and the background pixels (Bg), figure (2.11). With I(x,y) ∈ {Ob, Bg}, it can be defined as [Don04]:

Id(x,y) = 0                                              if I(x,y) ∈ Bg
Id(x,y) = min { dist((x,y), (u,v)) : I(u,v) ∈ Bg }       if I(x,y) ∈ Ob      ... (2.23)

where Ob denotes an object pixel and Bg a background pixel.



Figure 2.11: Image distance transform. a) binary image, b) binary value representation, c) image distance transform.

Distance transform algorithms use different distance metrics to compute the real distances, ranging from non-Euclidean metrics such as the city-block distance transform (CBDT) and the chamfer distance to the Euclidean distance transform (EDT). Each of them affects, positively or negatively, the quality of the output in terms of time consumption and precision.
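Definition (2.23) can be sketched directly as a brute-force Euclidean distance transform; the O(n²) pixel-pair search, the function name, and the small test image are assumptions chosen for clarity rather than speed:

```python
import numpy as np

def distance_transform(binary):
    """Brute-force Euclidean DT (eq. 2.23): for each object pixel, the
    minimum distance to any background pixel; background pixels get 0."""
    obj = np.argwhere(binary == 1)
    bg = np.argwhere(binary == 0)
    dt = np.zeros(binary.shape, dtype=np.float64)
    for (y, x) in obj:
        d = np.sqrt(((bg - [y, x]) ** 2).sum(axis=1))   # distances to all Bg pixels
        dt[y, x] = d.min()
    return dt

img = np.zeros((5, 5), dtype=np.uint8)
img[1:4, 1:4] = 1                 # 3x3 object in the middle
dt = distance_transform(img)      # grows from 1.0 at the rim to 2.0 at the centre
```

The quadratic cost of this sketch is exactly what the efficient algorithms of the next subsection avoid.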



2.5.1 Distance Transforms of Sampled Functions [Ped12]
The distance transform of a sampled function generalizes the distance transform of a binary image on a grid (rows, columns): instead of a binary value, each pixel carries a cost value expressing the appearance or absence of a feature at that pixel. Let £ = {1, 2, 3, ..., n} be a uniform one-dimensional grid and f: £ → R a function on it; then the distance transform of the sampled function is defined as:

Df(p) = min q∈£ ( (p - q)² + f(q) )                        ... (2.24)

where Df(p) is the (squared) Euclidean distance transform value, p is the query point, and q ranges over the grid points. For every point q ∈ £, the distance transform is bounded from above by a parabola rooted at the position (q, f(q)). The distance transform is therefore realized by the lower envelope of these parabolas, figure (2.12), and its value at p corresponds to the height of the lower envelope there.

Figure 2.12: The lower envelope of n parabolas.

Computing the image distance transform therefore takes two steps: 1) calculating the lower envelope of the n parabolas, and 2) evaluating equation (2.24) by substituting the lower envelope's height at each grid position. Two parabolas rooted at grid positions q and r intersect at a single point s, given by equation (2.25):

s = ( (f(r) + r²) - (f(q) + q²) ) / (2r - 2q)              ... (2.25)
The lower envelope is calculated by considering the parabolas sequentially, ordered by their horizontal (grid) positions. When the parabola rooted at q is considered, its intersection s with the rightmost parabola of the current envelope, rooted at v[k], is computed. Two states can then occur. First, figure (2.13.a), if the intersection position lies after z[k], the new parabola is appended and the envelope boundaries are adjusted as in the algorithm steps below. The second, opposite state, figure (2.13.b), is the deletion state: the k-th parabola rooted at v[k] is hidden by the new one and is no longer contained in the new lower envelope.



Figure 2.13: Building the lower envelope. a) state 1, b) state 2.

The one-dimensional distance transform of a sampled function on a grid is computed by the following algorithm:



Algorithm (2.4): Sampled-Function Distance Transform
Input: row of image costs f(0..n-1). Output: distance values D(0..n-1).
Begin
  Step 1: k = 0.
  Step 2: v[0] = 0.
  Step 3: z[0] = -∞.
  Step 4: z[1] = +∞.
  Step 5: for q = 1 to n-1:
  Step 6:   compute the intersection point s = ((f(q) + q²) - (f(v[k]) + v[k]²)) / (2q - 2v[k]).
  Step 7:   if s ≤ z[k], set k = k - 1 and return to step 6; otherwise:
              k = k + 1
              v[k] = q
              z[k] = s
              z[k+1] = +∞.
  Step 8: k = 0.
  Step 9: for q = 0 to n-1:
  Step 10:  while z[k+1] < q: k = k + 1.
  Step 11:  D(q) = (q - v[k])² + f(v[k]).
End
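Algorithm (2.4) can be sketched directly in Python; the function name and the use of a large constant in place of +∞ for the binary-image cost function are illustrative assumptions:

```python
import numpy as np

def dt_1d(f):
    """1-D distance transform of a sampled function (Algorithm 2.4):
    D(p) = min_q ((p - q)^2 + f(q)), computed in linear time by
    maintaining the lower envelope of the parabolas rooted at (q, f(q))."""
    n = len(f)
    d = np.zeros(n)
    v = np.zeros(n, dtype=int)    # grid positions of the envelope parabolas
    z = np.zeros(n + 1)           # boundaries between envelope segments
    k = 0
    v[0], z[0], z[1] = 0, -np.inf, np.inf
    for q in range(1, n):         # pass 1: build the lower envelope
        s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        while s <= z[k]:          # state 2: parabola v[k] is hidden; delete it
            k -= 1
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        k += 1                    # state 1: append the new parabola
        v[k], z[k], z[k + 1] = q, s, np.inf
    k = 0
    for q in range(n):            # pass 2: read the envelope height at each point
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d

# binary-image use: cost 0 at background positions, "infinite" elsewhere
f = np.array([1e9, 1e9, 0.0, 1e9, 1e9])
squared_dists = dt_1d(f)          # squared distances to the single zero at index 2
```

Applied row-wise and then column-wise, this one-dimensional pass yields the two-dimensional squared Euclidean distance transform, which is the main practical appeal of the sampled-function formulation.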