Hand Written Hindi Numerals Recognition System

Republic of Iraq Ministry of Higher Education and Scientific Research University of Baghdad College of Science

Hand Written Hindi Numerals Recognition System A Thesis Submitted to University of Baghdad College of Science in Partial Fulfillment of the Requirements for the Degree of Higher Diploma of Science in Computer Sciences By

Yousif Moshtak Latif (B.Sc. 2006)

Supervisor

Dr. Loay E. George

October 2011

Dhu al-Qi'dah 1432

‫ﺑَﺴﻢ ﺍﷲ ﺍﻟﺮﺣﻤﻦ ﺍﻟﺮﺣﻴﻢ‬

‫ﺍﻗْﺮَﺃْ ﻭَﺭَﺑﱡﻚَ ﺍﻟْﺄَﻛْﺮَﻡُ‬

‫ﺍﻟﱠﺬِﻱ ﻋَﻠﱠﻢَ ﺑﺎﻟْﻘَﻠَﻢ‬

‫ﻋَﻠﱠﻢَ ﺍﻟْﺈِﻧﺴَﺎﻥَ ﻣَﺎ ﻟَﻢْ ﻳَﻌْﻠَﻢْ‬ ‫ﺻﺪﻕ ﺍﷲ ﺍﻟﻌﻠﻲ ﺍﻟﻌﻈﻴﻢ‬ ‫ﺳﻮﺭﺓ ﺍﻟﻌﻠﻖ‬

Abstract
In this thesis, a simple recognition system based on the relative density distribution of each Hindi numeral object is utilized to recognize it. The recognition technique consists of the following stages: a preprocessing stage comprising image de-colorization, thresholding, and clipping of the document image; then a feature extraction stage, in which, instead of using the raw image data to represent the numeral object, a small set of geometrical features called moments is used. In the last step of the feature extraction stage, the moments are calculated and their values are added to the database; two sets of moments were determined, one for the horizontal distribution and the second for the vertical distribution. Three experiments were conducted to test different images containing numerals written with pens of different thicknesses. The results show that the line thickness is directly proportional to the error rate: every time the thickness of the written number is reduced, the error percentage is also reduced. The system is implemented using Visual Basic, version 6.0.

Supervisor Certification
I certify that this thesis was prepared under my supervision at the Department of Computer Science / College of Science / University of Baghdad, by Yousif Moshtak Latif, in partial fulfillment of the requirements for the degree of Higher Diploma of Science in Computer Science.

Signature:
Name: Dr. Loay E. George
Title: Assist. Prof.
Date:    /    / 2011

In view of the available recommendations, I forward this thesis for debate by the examination committee.

Signature:
Name: Dr. Loay E. George
Title: Head of Computer Science Department / College of Science / University of Baghdad
Date:    /    / 2011

Certification of the Examination Committee
We, the chairman and members of the examination committee, certify that we have studied this thesis, "Hand Written Hindi Numerals Recognition System", presented by the student Yousif Moshtak Latif, and examined him in its contents. We have found it worthy of acceptance for the degree of Higher Diploma of Science in Computer Science.

Signature:
Name: Dr. Sarab M. Hameed
Title: Assist. Prof.
Date:    /    / 2011
(Member)

Signature:
Name: Dr. Baraa A. Attea
Title: Assist. Prof.
Date:    /    / 2011
(Member)

Signature:
Name: Dr. Loay E. George
Title: Assist. Prof.
Date:    /    / 2011
(Supervisor)

Approved by the Dean of the College of Science, University of Baghdad.

Signature:
Name: Dr. Saleh Mehdi Ali
Title: Assist. Prof.
Date:    /    / 2011
(Dean of the College of Science)

Dedicated To My Dear Parents . . . My Dear Wife . . . My Lovely Brothers and Sisters . . . For Their Support and Help

And to everyone who gave me help and advice . . .

Yousif

Contents

Abstract  I
Contents  II
List of Abbreviations  IV

Chapter One: Introduction
1-1 Introduction  1
1-2 Literature Survey  3
1-3 Aim of Thesis  8
1-4 Organization  9

Chapter Two: Theoretical Background
2-1 Image Representation  10
2-2 Types of Digital Image  11
2-2-1 Binary Images  11
2-2-2 Gray-Scale Images  11
2-2-3 Color Images  12
2-2-4 Multispectral Images  12
2-3 Pattern Recognition System  12
2-3-1 Typical Pattern Recognition System  13
2-4 Preprocessing Operation  14
2-4-1 Image De-Colorization Preprocess  15
2-4-2 Thresholding  15
2-5 Moments  16

Chapter Three: Handwritten Hindi Numerals Recognition System Design & Implementation
3-1 Introduction  19
3-2 System Layout  19
3-2-1 The Enrollment Phase  19
3-2-2 Identification Phase  21
3-3 Implementation Steps  21
3-3-1 Input Image File  22
3-3-2 Preprocessing Stage  24
3-3-3 Feature Extraction Stage  28
3-3-4 Database  31
3-3-5 Matching Process  32
3-4 Implemented System  33

Chapter Four: Experiments
4-1 Introduction  35
4-2 Experiment 1  35
4-3 Experiment 2  36
4-4 Experiment 3  36

Chapter Five: Conclusions & Suggestions
5-1 Conclusions  40
5-2 Future Work Suggestions  41

References  42

List of Abbreviations

OCR       Optical Character Recognition
WHD       Weighted Hausdorff Distance
LVQ       Learning Vector Quantization
PNNs      Probabilistic Neural Networks
DN        Discrete Numbers
RGB       Red, Green, Blue
NRS       Handwritten Hindi Numerals Recognition System
Hm        Image height
Wm        Image width
XL        Left side of image
XR        Right side of image
YU        Top side of image
YD        Bottom side of image
MOM       Moments
T         Threshold gray scale value
g(x, y)   Gray value at pixel (x, y)

Acknowledgment
First of all, great thanks are due to Allah, who helped me and gave me the ability to carry this research from the first to the last step. I would like to express my deep gratitude and sincere thanks to my supervisor Dr. Loay E. George for his guidance, assistance and encouragement during the course of this project. Special thanks to the staff of the University of Baghdad for their general support and help. Special thanks go to all the people who provided me with any kind of help, and to all my friends for giving me advice.

Yousif Moshtak Latif 2011

Chapter One
Introduction

1.1 Introduction
Handwritten numerals recognition is widely used, for example in office automation, check verification, a large variety of banking businesses, postal address reading, and data entry applications.

The calligraphic nature of the Arabic script distinguishes it from other scripts in several ways. For example, Arabic text is written from right to left, with 28 basic characters, of which 16 have from one to three dots. Those dots differentiate otherwise similar characters. The shape of an Arabic character depends on its position in the word; a character might have up to four different shapes depending on whether it is isolated, connected from the right, connected from the left, or connected from both sides. On the other hand, Arabic (Indian) numerals are not cursive. Indian numerals are used in Arabic writing, while Arabic numerals are used in Latin languages. The term 'Arabic numerals' is used here to refer to the Indian numerals used in Arabic writing. Although Arabic text is written from right to left, Arabic (Indian) numbers are written left to right, with the most significant digit being the left-most one and the least significant digit the right-most; this is similar to Latin numerals. Figure (1.1) shows the Arabic (Indian) and Latin digits 0 to 9 (from right to left). The digit "1" is similar in Arabic and Latin. The Arabic digit "5" is similar to the Latin digit "0". The Arabic and Latin digit "9" are similar, with the lower stroke projecting down-right in Arabic and down-left in Latin. There are two ways to write the digit "4" in Latin and two ways to write the digit "3" in Arabic, as shown in Figure (1.2).


Fig. (1.1) Arabic (Indian) and Latin handwritten numerals (from 0 to 9)

Fig. (1.2) Different methods to write digit 4 (Latin) and 3 (Arabic)

An Arabic number may consist of an arbitrary number of digits. The recognition system classifies each digit independently, preserving its relative position with respect to the other digits, in order to obtain the actual value of the number after recognition. Traditional OCR (Optical Character Recognition) systems for machine-printed text segment the text image into sub-images of the individual characters comprising the text. These character images are recognized separately and later reassembled into words. Unfortunately, reliable character segmentation is not always possible, even for machine-printed text. Low-quality images may exhibit both joined and broken characters, caused by noise or by sampling at a coarse resolution. Certain types of machine-printed text are inherently difficult to segment regardless of image quality. The Arabic language uses a


script character set, so that adjacent characters in a word are usually joined, in a manner similar to European cursive handwriting. The script nature of Arabic text makes character segmentation difficult even for high-quality images of machine-printed text [1].

1.2 Literature Survey
Various methods have been proposed, and high recognition rates reported, for the recognition of English handwritten digits. Many researchers have addressed the recognition of Arabic text, including Arabic numerals. In addition, several researchers have reported on the recognition of Persian (Farsi) handwritten digits. However, the reported recognition rates for Arabic/Farsi digits need further improvement to be practical. Several researchers have published articles on Arabic optical text recognition, as well as bibliographies, surveys of advances in Arabic text recognition, assessments of handwriting recognition technologies, and reviews of the state of the art and future trends of Arabic text recognition [2]. Among the published works, the following articles were selected:
• Amin (1980) used numerous statistical and grammatical approaches to recognize isolated characters. He used a graphic tablet as the input unit; he then classified the characters depending on their shapes and recognized some of their basic determinants, such as the main line shape of the character, the number of its dots, the position of the dots relative to the character, the number of sections that form the character, and the zigzag nature existing in some characters. The second type of determinants are the start point and the quality of convexity [3].
• Ameer H. Murad (1984) designed a system for recognizing printed Arabic characters written in the 'Alruqah' font. He used two approaches: the first uses template matching and the second the projection profile. In both of the


two approaches, the Arabic characters are classified into groups, where the characters of each group are similar in the mantling attribute for the first approach. A set of recognition operations was performed in order to determine the group that a character belongs to. After that, the character is recognized by comparing its basic distinguishing attributes with those of the other characters in the group (number of dots and their positions) [4].
• El-Wakel and Shoukry (1989) presented a system to recognize isolated Arabic characters written using a graphic tablet as the input unit. They classified the characters by representing each character as a vector, where each vector is represented by: (i) the number of dots [0, 1, 2, 3]; (ii) the dots' position; (iii) the number of parts that form the character; and (iv) the slant of the secondary part of the character [5].
• Al-Yosefi and Udaba (1992) used a statistical approach to recognizing Arabic characters; they divided the character into two parts, primary and secondary, and isolated the secondary parts of the character to recognize them separately [6].
• Shahrezea et al. (1995) used the shadow coding method for the recognition of Persian handwritten digits. In this method, a segment mask is overlaid on the digit image, and the features are calculated by projecting the image pixels onto these segments. The Persian digit images are represented by line segments, which are used to model and recognize the digits. Additional features and classifiers are needed for discriminating the digit pairs ''0–5'', ''7–8'', and ''4–6'' [7].



• Mohamed (1996) produced a new approach for recognizing some cursive Arabic words by using statistical methods to determine the attributes that distinguish the pattern without segmenting the word. The research goal was to recognize a number of words using the ART3 neural network [8].
• Manmatha et al. (1996, 1997) initially proposed the word spotting idea for handwritten manuscripts. They presented preliminary work on matching techniques and "pruning" methods, which can quickly discard unlikely matches for a given word using simple word features such as the aspect ratio of a word's bounding box [9, 10]. Extensions were made to the matching algorithm, and it was tested on three data sets of 10 pages each, picked from the George Washington collection; details can be found in [11].
• Rashid A. A. Al-Zubaidy (1997) investigated the pattern classification of printed Arabic characters using a back-propagation neural network. The learning data was obtained from a group of characteristics allocated to each character using four methods; the test results indicated that the network is capable of recognizing the characters [12].
• Bouslama (1999) presented a fuzzy logic based algorithm that utilizes structural techniques for extracting local features from the geometric and topological properties of Arabic characters [13].
• Said et al. (1999) fed the pixels of the normalized digit image into a neural network for classification purposes. The number of hidden units for the neural network classifier is determined dynamically [14].
• Kolcz et al. (2000) presented an approach in which matches for a user-provided template word image are searched for in each line of several pages using dynamic time warping on a number of features. This line-based approach is expensive, since the line is not segmented into words and the word has to be searched for at every possible position in the line. A number of heuristics are used to limit the search along


the lines and also to re-orient portions of lines for matching. In addition, the matching algorithm aligns each feature using a separate dynamic time warp and combines the results heuristically. This means that, for the same word-line pair, each feature may produce a different alignment. They provide results for 4 hand-picked queries with multiple templates, noting that the best result for any individual word template has a precision of 0.4 or less [15].
• Y. Lu et al. (2001) presented an approach to word image matching based on the weighted Hausdorff distance (WHD) to facilitate the detection and location of user-specified words in document images. Their proposed method eliminates the spaces between adjacent characters in the word images and applies scale normalization; then the WHD is determined to measure the distance between the template image and the word image extracted from the document image. They used English and Chinese document images in their experiments [16].
• Amir S. Almallah (2001) introduced a new manner of recognizing Arabic handwriting. Two levels are proposed for classification; the first one starts after the preprocessing operations (noise removal, smoothing, and thinning). The input word is then segmented into the characters that form it. Then, some of the characteristics of the Arabic character (complementary parts and the closed shape) are extracted to reduce the range of the search. After that, the second classification stage starts by representing the character to be recognized with a chain code, and a set of statistical classification formulas is used to compare the chain with the stored chain codes of the characters [17].
• Al-Taa'i (2001) proposed a system that recognizes handwritten numerals. The system extracts features from the patterns, and then uses these features to train a Learning Vector Quantization (LVQ) neural network in the training phase. The weights that are obtained from the training phase are



stored to be used later, in the recognition phase, to recognize the input patterns [18].
• Salah et al. (2002) developed a serial model for visual digit classification based on a primitive selective attention mechanism. The introduced technique is based on parallel scanning of a down-sampled image to find interesting locations through a saliency map, and on extracting key features of those locations at high resolution [19].
• Al-Zubaidy (2002) designed a system to recognize printed and cursive Arabic characters using a neural network. She dealt with the Arabic characters as 119 characters (assuming that a character's shape in a certain location is not the same as in other locations), together with the numbers and symbols, in the system she proposed. A set of 39 features was extracted (such as the closed shape, lines, and complementary parts); this set is fed as input to a multilayer artificial neural network to classify the characters [20].
• Belongie et al. (2002) proposed the shape context approach. For shape matching, they considered this kind of approach the best existing classifier for handwritten digits. Two shapes are matched by establishing the correspondences between their outlines. The outlines are sampled and shape context histograms are generated for each sample point; each histogram describes the distribution of sample points in the shape with respect to the sample point at which it is generated. Points with similar histograms are deemed correspondences, and a warping transform between the two shapes is calculated and performed [21].
• Tomai et al. (2002) presented a paper showing the difficulty of historical handwritten manuscript recognition. Their goal was to produce a word-by-word mapping between a scanned document image and a manual transcript of that document. This would allow transcription words to be exactly


located on a page. For each line of the document, multiple segmentation hypotheses are generated and the produced segments are recognized by an OCR method. The recognizer uses a limited lexicon, which is obtained from the perfect transcript. Even at a lexicon size of at most 11 words, the recognition performance was found to be poor. This clearly shows that OCR is not a viable option for historical manuscript recognition [22].
• Clocksin and Khorsheed (2002) proposed a new approach to recognizing Arabic handwriting; this approach does not segment the word into characters, because of the difficulty of character segmentation, but instead converts each word into a normalized polar map [23].
• Sasi et al. (2003) proposed a system that automatically recognizes handwritten characters. The system combines the wavelet packet transform with a neuro-fuzzy approach. Characteristic features are extracted by taking the wavelet packet transform; these features are then saved to be used later as input to the fuzzy classifier, where they are fuzzified and then given to a neural network recognition system [24].
• Al-Omari et al. (2004) presented a recognition system for online handwritten Indian numerals from one to nine. The system skeletonizes the digits, and then their geometrical features are extracted. Probabilistic neural networks (PNNs) are used for classification. The authors claim that the system may be extended to Arabic characters [25].

1.3 Aim of Thesis
This thesis aims to design a simple and fast recognition system that deals with Hindi numeral images. It segments each numeral image and extracts its moment features, taking into consideration the normalization



requirements for both the width of the character lines and their size. The Euclidean distance measure is used to perform the matching task.

1.4 Organization
This thesis is organized into five chapters as follows:
Chapter One: Introduction. This chapter contains introductory information and the literature survey.
Chapter Two: Theoretical Background. This chapter deals with the definitions, formulas and theories relevant to the topics used in subsequent chapters.
Chapter Three: Handwritten Hindi Numerals Recognition System Design & Implementation. This chapter presents the description of the designed numerals recognition and identification system.
Chapter Four: Experiments. This chapter discusses the experimental environment in some detail.
Chapter Five: Conclusions & Suggestions. This chapter highlights the significant features of the results and provides some suggestions for future work.


Chapter Two
Theoretical Background

2.1 Image Representation
The human visual system receives an input image as a collection of spatially distributed light energy; this form is called an optical image. Optical images are the type that humans deal with every day; cameras and scanners capture them, monitors display them, and we see them. A digital image can be described as a combination of Discrete Numbers (DN), arranged in matrix form (columns and rows); each DN is called an image element and is briefly referred to as a pixel. An image array of size M×N is presented by:

f = | f(0,0)     f(0,1)     …  f(0,N-1)   |
    | f(1,0)     f(1,1)     …  f(1,N-1)   |          ….. (2.1)
    | …                                   |
    | f(M-1,0)   f(M-1,1)   …  f(M-1,N-1) |

Where f(i, j); i = 0, 1, 2, … M-1, and j = 0, 1, 2, … N-1, represents the brightness at the point of coordinates (i, j). The range of f( ) assigned to each image element defines the type of the image (i.e. binary, gray scale, or color) [26, 27].
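Although the thesis system itself was implemented in Visual Basic 6.0, the matrix representation above can be illustrated with a short sketch; Python with NumPy is used here purely as an illustrative assumption:

```python
import numpy as np

# A tiny 3x4 (M=3 rows, N=4 columns) digital image: each entry is one
# discrete number (DN), i.e. the pixel brightness f(i, j).
f = np.array([
    [0,   64, 128, 255],
    [32,  96, 160, 224],
    [255, 0,  255, 0],
], dtype=np.uint8)

M, N = f.shape          # image height and width
print(M, N)             # 3 4
print(f[1, 2])          # brightness at row i=1, column j=2 -> 160
```

The range of values stored in the array (two levels, 0–255, or three color bands) is what determines whether the image is binary, gray scale, or color.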

2.2 Types of Digital Images


Digital images are presented in different types, depending on the required application area. For example, simple binary images are useful for the recognition process, while a gray scale image is required to visualize the differences between image regions; colored images are mostly used to bring out faint regions and make the image behavior more apparent. However, multispectral images have specific applications (usually remote sensing), as described below [26]:

2.2.1 Binary Images
Binary images are the simplest type of image and can take on two values, typically black and white, or '0' and '1'. A binary image is referred to as a 1 bit/pixel image because only 1 binary digit is needed to represent each pixel. These types of images are most frequently used in computer vision applications where the only information required for the task is general shape, or outline, information. Binary images are often created from gray-scale images via a threshold operation [28].

2.2.2 Gray-scale Images
Gray-scale images are referred to as monochrome, or one-color, images. They contain brightness information only, no color information. The number of bits used for each pixel determines the number of different brightness levels available. The typical image contains 8 bit/pixel data, which allows 256 (0-255) different brightness (gray) levels. This representation provides more than adequate brightness resolution, in terms of the human visual system's requirements, and provides a "noise margin" by allowing for approximately twice as many gray levels as required. This noise margin is useful in real-world applications because of the many different types of noise (false information in the signal) inherent in real systems. Additionally,


the 8-bit representation is typical due to the fact that the byte, which corresponds to 8 bits of data, is the standard small unit in the world of digital computers. In certain medical or astronomy applications, 12 or 16 bits/pixel representations are needed. These extra brightness levels become useful only when the image is "blown up", that is, when a small section of the image is made much larger [28].

2.2.3 Color Images Color images can be modeled as three band monochrome image data where each band of data corresponds to a different color. Typical color images are represented as red, green and blue, or RGB images [28].

2.2.4 Multispectral Images
Multispectral images typically contain information outside the normal human perceptual range. This may include infrared, ultraviolet, x-ray, acoustic, or radar data. These are not images in the usual sense, because the information represented is not directly visible to the human visual system. However, the information is often represented in visual form by mapping the different spectral bands to RGB components [28].

2.3 Pattern Recognition System
A pattern is defined as "the opposite of a chaos; it is an entity, vaguely defined, that could be given a name". For example, a pattern could be a fingerprint image, a handwritten cursive word, a human face, an iris, or a speech signal [29]. Recognition is regarded as a basic attribute of human beings, as well as of other living organisms. Automatic (machine) recognition, description, classification and grouping of patterns are important problems in a variety of engineering and scientific disciplines such as biology, psychology,


medicine, marketing, computer vision, artificial intelligence, and remote sensing [30]. A human being is a very sophisticated information system, partly because he has a superior pattern recognition capability. According to the nature of the patterns to be recognized, recognition may be divided into two major types: (i) the recognition of concrete items, and (ii) the recognition of abstract items. The recognition of concrete items covers characters, pictures, music, and the objects around us. This recognition process involves the identification and classification of spatial and temporal patterns. Examples of spatial patterns are characters, irises, fingerprints, weather maps, physical objects and pictures. Temporal patterns include speech waveforms, electrocardiograms, target signatures, and time series. On the other hand, the recognition of abstract items covers recognizing an old argument, or a solution to a problem, with our eyes and ears closed [31]. The study of pattern recognition problems may be logically divided into two major categories: 1. The study of the pattern recognition capability of human beings and other living organisms. 2. The development of theory and techniques for the design of devices capable of performing a given recognition task for a specific application. The first subject area is concerned with such disciplines as psychology, physiology, and biology, while the second deals primarily with engineering, computer and information science [31].

2.3.1 Typical Pattern Recognition System
The typical pattern recognition process is shown in Figure (2.1). The operations directly applied to the input data are referred to as preprocessing operations. The preprocessing step includes a number of image processing operations


that transform the input image into another form [30]. After proper preprocessing, several pattern features are extracted and assembled into a pattern vector, which is plotted in the pattern space; the whole process applied for this extraction purpose is considered the feature extraction step. The application of a proper decision theory sketches the optimum boundaries in the recognition space during the classification step.

Input → Preprocessing → Feature Extraction → Classification → Output

Fig. (2.1) Block diagram of a typical pattern recognition system

2.4 Preprocessing Operation
Generally, the preprocessing operations involve all the procedures that are performed to simplify the image format and make it ready to generate simple, clear and understandable features. These operations may involve different image processing methods: (i) image partitioning into small blocks; (ii) image binarization by applying a certain threshold or by utilizing a certain edge detector; and (iii) image enhancement methods (to remove undesired redundancies, to bring out the image features, etc.). In the following subsections, two of the preprocessing operations commonly used for recognition tasks are clarified:

2.4.1 Image De-Colorization Preprocess


Mostly, number images are presented in colored form (i.e. RGB bands). For character and numeral recognition purposes, specifically for simplification, the processed images are adopted to be in gray scale form. Therefore, all colored images accessed by the system are converted into a gray scale version, simply by averaging, pixel-to-pixel, the RGB image bands [27]. The following equation illustrates the color conversion operation given by [32]:

gray(x, y) = (red(x, y) + green(x, y) + blue(x, y)) / 3          ….. (2.2)
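Equation (2.2) can be sketched in a few lines. Python with NumPy is used here only for illustration; the thesis system itself was written in Visual Basic 6.0:

```python
import numpy as np

def decolorize(rgb):
    """Convert an RGB image (H x W x 3) to gray scale by the
    pixel-to-pixel average of the three bands, as in equation (2.2)."""
    rgb = rgb.astype(np.float64)
    gray = (rgb[..., 0] + rgb[..., 1] + rgb[..., 2]) / 3.0
    return gray.round().astype(np.uint8)

# A single pure-red pixel (255, 0, 0) averages to gray level 85.
img = np.array([[[255, 0, 0]]], dtype=np.uint8)
print(decolorize(img)[0, 0])   # 85
```

The cast to float before summing avoids 8-bit overflow when the three band values are added together.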

2.4.2 Thresholding
This method is based on a simple concept. A brightness parameter (T), which is called the brightness threshold, is chosen and applied to the image img(x, y) according to the following criterion:

If img(x, y) > T then img(x, y) = object = 1
Else img(x, y) = background = 0

This kind of thresholding implies interest in light objects appearing inside a dark background. For dark objects appearing within a light background, the following criterion is used:

If img(x, y) < T then img(x, y) = object = 1
Else img(x, y) = background = 0

The output is the label "object" or "background" which, due to its dichotomous nature, can be represented as a Boolean variable "1" or "0". While there is no universal procedure for threshold selection that is guaranteed to work on all images, there are a variety of alternatives; one of them is fixed thresholding


[28]. Fixed thresholding can be defined as the use of a threshold value that is chosen independently of the image data [32].
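A minimal sketch of the dark-object criterion above (again in Python/NumPy purely for illustration; the function name is hypothetical):

```python
import numpy as np

def binarize_dark_objects(gray, T):
    """Label pixels darker than the threshold T as object (1) and the
    rest as background (0) -- the second criterion above."""
    return np.where(gray < T, 1, 0).astype(np.uint8)

gray = np.array([[10, 200],
                 [90, 240]], dtype=np.uint8)

# With a fixed threshold T=128, chosen independently of the image data,
# the two dark pixels in the left column become object pixels.
print(binarize_dark_objects(gray, 128))
```

For dark ink on light paper, as in scanned numeral documents, this is the natural choice of the two criteria.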

2.5 Moments
The ability to extract invariant features from an image is important in the field of pattern recognition. An object within an image can be identified regardless of its position, size and orientation. One of the most promising features for extracting such invariance is the invariant moment. As recognition features, moments have been applied to a variety of image processing problems, including aircraft identification, character recognition, etc. In general, moments describe the distribution of arranged numeric quantities at some distance from a reference point or axis. Image feature vector extraction by moments is one of the common techniques used these days, where each moment order carries different information about the same image. Moments are divided into orthogonal moments (e.g. Zernike and Legendre) and non-orthogonal moments, such as regular moments and complex moments. The theory of moments provides an interesting and sometimes useful alternative to the series expansions used for representing objects and shapes. The use of moments for image analysis is straightforward if we consider a binary or gray level image segment as a two-dimensional density distribution function. In this way, moments may be used to characterize an image segment and extract properties that have analogies in statistics. Operations such as rotation, translation, scaling, and reflection may apply to each object appearing in an image; they are also called transformations. These transformations cause changes in each order of the image moments. Many solutions were introduced to keep the moments constant, or invariant; one of the most common solutions is called moment invariants.


Various types of moment theory have been used to recognize image patterns in a number of applications. Moment descriptors (moment invariants) are used in many pattern recognition applications. The image or shape feature invariants remain unchanged if that image or shape undergoes any combination of the following changes: 1. Change of size (scale), 2. Change of position (translation), 3. Change of orientation (rotation), and 4. Reflection.

The moment invariants are a very useful way of extracting features from two-dimensional images. They can be subdivided into skew and true moment invariants, where the skew moment invariants are invariant under changes of size, translation, and rotation only, while the true moment invariants are invariant under all of the previous changes, including reflection [33]. Figure (2-2) shows the classification of the moment invariants. The determination of the moments was accomplished by using the following equations:

Momx(n) = Σ_{i=0}^{width−1} B(i)·(i − X̄)^n ….. (2.3)

Momy(n) = Σ_{i=0}^{height−1} B(i)·(i − Ȳ)^n ….. (2.4)

where B(i) is the number of object pixels at column (or row) i, and X̄, Ȳ are the mean coordinates of the horizontal and vertical distributions, respectively.
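Equations (2.3) and (2.4) can be sketched as follows. This is an illustrative Python version, not the thesis implementation (which is written in Visual Basic 6); the function name `central_moment` and the interpretation of X̄/Ȳ as the centroid of the projection profile are assumptions made for this sketch.

```python
def central_moment(profile, n):
    """n-th central moment of a 1-D projection profile B(i),
    taken about the profile's mean coordinate (its centroid)."""
    total = sum(profile)
    if total == 0:
        return 0.0
    # mean coordinate of the distribution (X-bar / Y-bar in eqs. 2.3, 2.4)
    mean = sum(i * b for i, b in enumerate(profile)) / total
    return sum(b * (i - mean) ** n for i, b in enumerate(profile))

profile = [0, 2, 5, 5, 2, 0]     # example column counts of object pixels
m2 = central_moment(profile, 2)  # second-order moment of the distribution
```

Note that the first central moment of any profile is zero by construction, which is why the useful shape information starts at order two.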

Fig. (2-2) Classification of the moment invariants

Chapter Three
Handwritten Hindi Numerals Recognition System Design & Implementation

3.1 Introduction
In this chapter, the digital processing techniques involved in the established numeral recognition system will be explained and discussed in detail. Generally, the designed system implies digital processing methods. The adopted digital techniques have been coded as program sub-modules, and all sub-modules have been assembled into a single computer application named the Handwritten Hindi Numerals Recognition System "NRS".

3.2 System Layout
NRS consists of two phases: (i) Enrollment and (ii) Identification. The input to the system is the numeral images and the output is the ID of the tested numeral. The system's design is shown in Figure (3.1).

3.2.1 The Enrollment Phase
In this phase the system is trained to recognize the images of numerals. The features extracted from the training images are registered in the system database to be used in the identification phase. The first stage in this system unit performs the following preprocessing tasks:
1. Image de-colorizing.


2. Image binarization by thresholding.
3. Cropping.

[Flowchart: Start → Input number image → Preprocessing → Feature Extraction Stage → (Enrollment: store in numbers database | Identification: matching stage → Is identified? Yes: print the number; No: print matching failure) → End]
Fig. (3.1) The Numbers Recognition System design

The second stage is the feature extraction stage, which involves:


1. Scan the cropped image horizontally to determine the number of object pixels at each column.
2. Scan the image vertically to determine the number of object pixels at each row.
3. Use the two scanning arrays to determine their corresponding moments.
The result of this stage is a moment feature set for each training input image; each moment set is stored in the database.
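The two scans in steps 1 and 2 can be sketched as follows. This is a rough Python illustration (the thesis implementation is in Visual Basic 6); the function and variable names are illustrative, not the thesis's identifiers.

```python
def projection_profiles(img):
    """img: 2-D list of 0/1 pixels (cropped binary numeral image).
    Returns (column_counts, row_counts): object pixels per column and per row."""
    height = len(img)
    width = len(img[0])
    # horizontal scan: count object pixels in each column
    col_counts = [sum(img[y][x] for y in range(height)) for x in range(width)]
    # vertical scan: count object pixels in each row
    row_counts = [sum(row) for row in img]
    return col_counts, row_counts

img = [[0, 1, 0],
       [1, 1, 1],
       [0, 1, 0]]
cols, rows = projection_profiles(img)  # cols == [1, 3, 1], rows == [1, 3, 1]
```

The two returned arrays are exactly the distributions whose moments are computed in step 3 (equations 2.3 and 2.4).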

3.2.2 Identification Phase
The input to this phase is, also, an image of an unknown numeral. This phase carries out the classification decision. Besides the preprocessing stage and feature extraction stage mentioned in the enrollment phase, paragraph (3.2.1), the identification process has an additional stage (i.e., the matching stage) for the identification decision. The features of the input number image fed to this phase are extracted and matched against the stored number images in the database to decide whether it is similar to one of the stored patterns or not. The result of this phase is the identity number of the input numeral image.
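A minimal sketch of the matching stage is given below, assuming a nearest-neighbor comparison of moment feature vectors. The thesis does not specify the distance measure at this point, so the squared Euclidean distance used here is an assumption, as are all names in the code.

```python
def match(features, database):
    """features: moment feature vector of the unknown numeral.
    database: dict mapping numeral ID -> stored feature vector.
    Returns the ID of the nearest stored pattern."""
    best_id, best_dist = None, float("inf")
    for numeral_id, stored in database.items():
        # squared Euclidean distance between feature vectors (assumed metric)
        dist = sum((a - b) ** 2 for a, b in zip(features, stored))
        if dist < best_dist:
            best_id, best_dist = numeral_id, dist
    return best_id

db = {3: [11.5, 0.2], 7: [4.0, 1.1]}   # hypothetical stored moment sets
result = match([11.0, 0.3], db)        # ID of the nearest stored vector
```

A real system would additionally reject the match (print "matching failure", as in Figure 3.1) when the best distance exceeds some threshold.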

3.3 Implementation Steps
In the following sub-sections the implementation steps of each stage of the system are clarified. These steps were implemented using the Visual Basic language (version 6).

3.3.1 Input Image File



The process of numeral recognition begins by scanning a paper that contains the numeral image that we want to recognize.

3.3.2 Preprocessing Stage
Some preprocessing operations are used to make the data of the numeral image more suitable for data reduction and to make the analysis task easier. Preprocessing is a necessary stage when the requirements are typically obvious and simple, such as ignoring all information that is not required for the application. These operations are:

A. Image De-colorization
This step converts the color image to gray scale. Each pixel in a color image has three color components (i.e., red, green, and blue), and the value of each component is represented by one byte. The gray value of each pixel is computed by equation (2.2). Applying this equation to all image pixels converts the color image to a gray image, as shown in Figure (3.3). Algorithm (3.1) illustrates the implemented steps of reading the bitmap image file and de-colorizing the number image into a gray-scale image.

(a) Colored image (b) Gray-scale image
Fig. (3.3) Sample of an input numeral image



Algorithm (3.1) Input Image File & Image De-colorization
Input: Color image
Output: Gray image img(x, y)
Begin
  Open the bitmap image file for reading
  Read the bitmap image file
  Put the bitmap image in image1 and image2
  Close the image file
  Initialize an array img(0 to y−1, 0 to x−1)
  For all i ∈ [0, y−1] and j ∈ [0, x−1]
    Set img(i, j) = gray-scale value obtained from equation (2.1):
      gray(x, y) = (red(x, y) + green(x, y) + blue(x, y)) / 3
  End loop j; End loop i
End.

B. Thresholding
In the thresholding process, a threshold value is assigned and the gray image is converted to a black-and-white image, see Figure (3.4); this makes the determination of the number's features easier. Algorithm (3.2) presents the implemented steps of the thresholding process.



Algorithm (3.2) Thresholding
Input: Gray image img(x, y)
Output: Black & white image img(x, y)
Begin
  Read the bitmap image file
  For Y = 0 To image height
    For X = 0 To image width
      If img(x, y) > T Then img(x, y) = 1 Else img(x, y) = 0
    Next X
  Next Y
End.
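The comparison in Algorithm (3.2) can be sketched in Python as below (an illustrative version, not the thesis's VB6 code; it follows the algorithm as written, mapping pixels brighter than T to 1 and the rest to 0):

```python
def threshold(gray, t):
    """gray: 2-D list of gray values; t: threshold value T.
    Returns a black & white image: 1 where gray > t, else 0."""
    return [[1 if v > t else 0 for v in row] for row in gray]

bw = threshold([[10, 200], [128, 127]], 127)  # [[0, 1], [1, 0]]
```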

(a) Gray-scale image (b) Thresholded image
Fig. (3.4) An example of thresholding a numeral image



C. Clipping
The process of number edge calculation begins by scanning the numeral image that we want to recognize. The scanning process consists of the following steps:
1. Calculate the left edge clip.
2. Calculate the right edge clip.
3. Calculate the top edge clip.
4. Calculate the bottom edge clip.

Algorithm (3.3) presents the implemented steps of the clipping task. Figure (3.5) shows an example illustrating the clipping edges.

Algorithm (3.3) Clipping
Input: Black & white image img(x, y)
Output: Image with calculated edge numbers img(x, y)
Begin
  Read the bitmap image file
  ''''''''' Calculate the Left Edge
  XL = -1
  Do
    XL = XL + 1: I = 0
    For Y = 0 To image height: I = I + Img(XL, Y): Next Y
  Loop Until (I > 1 Or XL = Wm)
  ''''''' Calculate the Right Edge
  XR = Width
  Do
    I = 0: XR = XR - 1
    For Y = 0 To image height: I = I + Img(XR, Y): Next Y
  Loop Until (I > 1 Or XR = 0)


  ''''''' Calculate the Top Edge
  YU = -1
  Do
    YU = YU + 1: I = 0
    For X = 0 To image width: I = I + Img(X, YU): Next X
  Loop Until (I > 1 Or YU = Hm)
  ''''''' Calculate the Bottom Edge
  YD = Height
  Do
    YD = YD - 1: I = 0
    For X = 0 To image width: I = I + Img(X, YD): Next X
  Loop Until (I > 1 Or YD = 0)
End.
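The four edge scans of Algorithm (3.3) amount to finding the bounding box of the object pixels. A compact Python sketch is shown below; note that the VB6 original loops until a scan line contains more than one object pixel, while this sketch accepts any nonzero pixel, a simplification made here for clarity.

```python
def bounding_box(img):
    """img: 2-D list of 0/1 pixels.
    Returns (x_left, x_right, y_top, y_bottom): the extreme columns
    and rows that contain object pixels."""
    rows = [y for y, row in enumerate(img) if any(row)]
    cols = [x for x in range(len(img[0])) if any(row[x] for row in img)]
    return cols[0], cols[-1], rows[0], rows[-1]

img = [[0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 0]]
box = bounding_box(img)  # (1, 2, 1, 2)
```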

Fig. (3.5) Sample of number edge calculation: (a) black & white image with edges X1, X2, Y1, Y2; (b) clipped image


After determining the bounding edges of the numeral object, the dimensions of the original numeral image are reduced so that it is confined by Width = X2 − X1 and Height = Y2 − Y1, as illustrated in Figure (3.5). Algorithm (3.4) presents the implemented steps of the numeral body confinement task.

Algorithm (3.4) Numeral Body Array
Input: Image with calculated edge numbers img(x, y)
Output: Clipped image img(x, y)
Begin
  Read the bitmap image file
  ……… Make Clipping
  If XL