Implementation of a Normalized Cross-correlation Coefficient-based Template Matching Algorithm in Number System Conversion

Francisco Emmanuel T. Munsayac Jr III, Lea Monica B. Alonzo, Delfin Enrique G. Lindo, Renann G. Baldovino* and Nilo T. Bugtai
Manufacturing Engineering and Management (MEM) Department
Gokongwei College of Engineering, De La Salle University
2401 Taft Avenue, 0922 Manila, Philippines
Corresponding author: *[email protected]

Abstract: In digital image processing, template matching is a technique for finding areas of an image that match or are similar to a template image. In this study, an algorithm that uses Python and the OpenCV library for template matching in number system conversion was successfully demonstrated. Images containing binary numbers were processed by template matching and converted to strings. These strings were then converted to their respective decimal equivalents. It was found that OpenCV offers a tool that is easy to use for systems that require recognizing patterns in an image. Furthermore, it was observed that this ease of use is accompanied by various limitations, such as dependence on pre-processing and the need for a fixed scale, rotation, font, and background color.

Index Terms: computer vision, normalized cross-correlation, number system conversion, template matching

I. INTRODUCTION
In its simplest form, template matching represents images as two-dimensional arrays of intensity values and compares them using metrics such as the Euclidean distance [1]. The intensity-based method is regarded as an optimization process that finds the maximum degree of similarity between the base image and the templates. It matches based on border, uniqueness, texture, and entropy. Other methods include assessing gray levels before the actual image processing. A similar method is feature-based matching, which can usually be used even when some of the features are missing. Other known applications of template matching are road vehicle detection, handwriting identification, and pulmonary nodule detection.

Problems with simple template matching include its lack of scale and rotation invariance. A slight difference in the image can cause detection to fail, making the algorithm inflexible to changes. This problem can be addressed by creating several templates for a single object. Moreover, template matching is computationally expensive, especially when it must correlate a massive number of features [2]. To speed up the computation, the complexity of the filtering can be reduced using the convolution theorem. Another technique is the image pyramid, which is formed by repeated filtering and subsampling of the original image to produce a sequence of reduced resolutions [3].

II. COMPUTER VISION
A. Template Matching
Medical imaging, video tracking, and motion analysis are a few of the computer vision applications in which template matching plays a crucial role [4]. OpenCV is a library of programming functions designed for computer vision. It is a cross-platform library that features both 2D and 3D toolkits written in C++ [5]. One feature of OpenCV is template matching, which slides the template over the given input image. Figure 1 depicts a sample output of finding the windows in an image of a structure.

978-1-5386-0912-5/17/$31.00 ©2017 IEEE

Fig 1. A sample window detection using template matching

The template t[x, y] and the search image s[x, y] are both indexed by pixel location. The center of the template moves over each point in the search image, and the sum of the products between the coefficients of t[x, y] and s[x, y] is calculated at each position. When all positions have been considered, the best match is computed and the corresponding position is taken as the result of the linear spatial filtering. Gallery creation is a critical part of template matching applications. It requires more than one object to detect or to match with [6]. The concept is also crucial in optical character recognition (OCR). In character recognition, the gallery should contain all possible text fonts and, if needed, all possible orientations of the text.
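The sliding sum of products described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `cross_correlate` and the toy arrays are hypothetical, and the offset is measured from the template's top-left corner rather than from its center.

```python
import numpy as np

def cross_correlate(search, template):
    """Slide `template` over `search`; return the sum of products at each offset."""
    h, w = template.shape
    out = np.zeros((search.shape[0] - h + 1, search.shape[1] - w + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # Element-wise product of the template and the window under it, then summed
            out[y, x] = np.sum(search[y:y + h, x:x + w] * template)
    return out

# Toy data (assumed): a dark image with one copy of the template planted in it
template = np.array([[1.0, 2.0], [3.0, 4.0]])
search = np.zeros((5, 6))
search[2:4, 3:5] = template          # top-left corner at row 2, column 3

scores = cross_correlate(search, template)
best = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
print(best, float(scores[best]))     # (2, 3) 30.0
```

The raw sum of products grows with local brightness, so a uniformly bright region can outscore the true match. This is one motivation for the normalized form discussed in the next subsection.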

If we consider the general case of matching N templates with an image, this variant of template matching highlights edge points, corner points, and even high-variance areas of the image [7].

B. Normalized Cross-correlation (NCC)
Correlation between two signals is a standard approach to feature detection. In its normalized form, however, it does not have a simple, efficient frequency-domain expression to which the fast Fourier transform (FFT) can be applied [8]. NCC has been used extensively as a metric to evaluate the degree of similarity between compared images [9]. Its main advantage over ordinary cross-correlation is that it is less sensitive to changes in background illumination. NCC is effective for similarity tests, but it falls short in time-critical applications: compared with the sum of absolute differences (SAD) or the sum of squared differences (SSD), it is more accurate but computationally heavier [10]. Previous papers also suggest that the premature elimination of candidate locations improves the speed of the correlation [11]. Given a template image t whose position is to be determined in an image f, the value of the normalized cross-correlation (NCC), called TM_CCOEFF_NORMED in OpenCV, is calculated using the formula in Eq. 1:

$$\gamma(u, v) = \frac{\sum_{x,y} \left[ f(x, y) - \bar{f}_{u,v} \right] \left[ t(x - u, y - v) - \bar{t} \right]}{\sqrt{\sum_{x,y} \left[ f(x, y) - \bar{f}_{u,v} \right]^{2} \, \sum_{x,y} \left[ t(x - u, y - v) - \bar{t} \right]^{2}}} \qquad \text{(Eq. 1)}$$

where $\bar{t}$ is the mean of the template and $\bar{f}_{u,v}$ is the mean of f over the region under the template when it is positioned at (u, v).
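As a hedged illustration of Eq. 1, the sketch below computes the correlation coefficient between a template t and every same-size window of an image f with plain NumPy. The function name `ncc_map` and the toy arrays are assumptions made for this example. In OpenCV, the corresponding call is cv2.matchTemplate() with the cv2.TM_CCOEFF_NORMED method, whose internals are not reproduced here.

```python
import numpy as np

def ncc_map(f, t):
    """Correlation coefficient between template t and every same-size window of f."""
    h, w = t.shape
    t_zero = t - t.mean()                      # template with its mean removed
    t_energy = np.sum(t_zero ** 2)
    out = np.zeros((f.shape[0] - h + 1, f.shape[1] - w + 1))
    for v in range(out.shape[0]):              # v: row offset of the window
        for u in range(out.shape[1]):          # u: column offset of the window
            window = f[v:v + h, u:u + w]
            f_zero = window - window.mean()    # window with its local mean removed
            denom = np.sqrt(np.sum(f_zero ** 2) * t_energy)
            # Flat windows have zero variance; score them 0 instead of dividing by zero
            out[v, u] = np.sum(f_zero * t_zero) / denom if denom > 0 else 0.0
    return out

t = np.array([[1.0, 2.0], [3.0, 4.0]])
f = np.full((4, 6), 5.0)                       # flat background (assumed toy data)
f[1:3, 2:4] = 2.0 * t + 10.0                   # same pattern, brighter and higher contrast

gamma = ncc_map(f, t)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(gamma), gamma.shape))
print(peak, round(float(gamma[peak]), 6))      # (1, 2) 1.0
```

Because the window mean is subtracted and the result is divided by both norms, the planted pattern still scores 1.0 even though its brightness and contrast differ from the template. This is the insensitivity to illumination noted above.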

As the formula also shows, NCC is obtained by summing the weighted correlation functions of the basis functions [12]. Template matching is typically applied to problems that involve finding and estimating the position of an image pattern, as in feature-finding applications. This only works if the image pattern is given and, preferably, the same pattern is the one to be found. The formula was derived from the sum expansion of a given template image and the rectangular function of Image (x, y). NCC is applied here because the template and the input image do not change. In addition, the images involve no changes in illumination or background noise that would need to be handled.

III. BINARY SYSTEM
The binary number system is considered to be the most fundamental of all number systems in digital systems [13]. It is represented by a series of 1's and 0's and has a base of 2. A binary number such as 101100101 is weighted from right to left. When a binary number is converted to its decimal counterpart, each digit carries a positional weight: the rightmost digit is the least significant bit (LSB) and the leftmost digit is the most significant bit (MSB), as represented in Table 1.

TABLE I. BINARY NUMBER REPRESENTATION

Binary digit   MSB                                                LSB
Power of 2     2^8   2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
Weight         256   128   64    32    16    8     4     2     1
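The positional weights in Table I can be applied with a short loop. The sketch below is a minimal illustration and not the authors' program. The function name `binary_to_decimal` is hypothetical. The string is reversed so that each digit's index equals its power of 2, in line with the inversion step described later in the methodology.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum the weight 2**position for every '1', counting positions from the LSB."""
    total = 0
    for position, digit in enumerate(reversed(bits)):
        if digit == "1":
            total += 2 ** position
    return total

# 101100101 = 256 + 64 + 32 + 4 + 1
print(binary_to_decimal("101100101"))   # 357
```

Python's built-in int("101100101", 2) gives the same result and would be the usual choice in practice. The explicit loop is shown only to mirror the weights in the table.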

On the other hand, to convert a decimal number to the binary system, the number is repeatedly divided by 2 and each remainder, either 0 or 1, is recorded. Table 2 illustrates the detailed process of converting the decimal value 294 to the binary number 100100110.

TABLE II. DECIMAL-TO-BINARY CONVERSION PROCESS

Number   Divide by 2 result   Remainder
294      147                  0 (LSB)
147      73                   1
73       36                   1
36       18                   0
18       9                    0
9        4                    1
4        2                    0
2        1                    0
1        0                    1 (MSB)
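The repeated division in Table II translates directly into a loop. The following is a minimal sketch with a hypothetical function name, `decimal_to_binary`. The remainders are collected LSB first and reversed at the end so that the MSB is printed first.

```python
def decimal_to_binary(number: int) -> str:
    """Repeatedly divide by 2; the remainders, read in reverse, are the binary digits."""
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, 2)   # quotient feeds the next step
        remainders.append(str(remainder))        # first remainder is the LSB
    return "".join(reversed(remainders))

print(decimal_to_binary(294))   # 100100110
```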

The main objective of this research is to develop an algorithm that uses the template matching function of OpenCV through Python programming to convert a raw digital number image to its binary and decimal number equivalents.

IV. METHODOLOGY
A. Tools and Software
In this paper, the OpenCV library was used as the main tool and Python as the main programming language [14]. The whole program revolves around the cv2.matchTemplate() function in OpenCV [15]. This function slides the template image over the input image and compares the template with the input image at each position. The program's output is the input image with the detected template instances boxed. The two important components are the source image (s) and the template image (t).

B. Program Framework
As displayed in Figure 2, the first step in the framework is image acquisition for both the template image and the test image. Pre-processing includes resizing the template image to the desired size, that is, the size at which the template appears in the test images. The next step converts the image to grayscale so that any font color can be handled. Figure 3 shows the two templates used in this study. Note that the two template images were used at 1:1 size in the program. They can be inserted manually on the image.

Fig 2. Program framework

Fig 3. Template images

The command used to convert the images to grayscale was cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY). Grayscale conversion reduces the number of templates required, thus making processing faster [9].

C. Pre-processing and Image Acquisition
In segmentation [16], selected elements of an image are determined and separated. They are recognized individually, but trouble occurs when overlapping characters are present. Noise in images that contain characters and graphics is another problem that can cause the program to fail to get the text [17]. In template matching, the so-called slider is moved across the image and each position is checked, pixel by pixel, until a match is found. For every match, a red rectangle is drawn around the object found. Moreover, the location of the object and the kind of template image that matched are appended to an array. The threshold is different for every template and was identified through trial and error, by modifying the value and checking whether all instances of the template in the test images were boxed.

D. Sorting, Text Creation and Conversion
After template matching, the identified objects are sorted according to their vertical locations. Sorting is based on the top-left location of the template in the image, given by the template matching function of OpenCV. Those on the same row are grouped together and further sorted according to their horizontal locations. Once they are sorted from left to right and then from top to bottom, they are immediately converted into strings. The output string is then inverted prior to number conversion. The last step is decimal conversion, which is performed using a loop that searches for every occurrence of 1 and its location index in the string.

V. RESULTS AND DISCUSSIONS
A. Threshold Values
It was observed that the two templates require different threshold values. As shown in Table 3, the template for 0 requires a threshold value of 0.90, while the template for 1 requires 0.91.

TABLE III. TESTS FOR IDENTIFYING THE OPTIMAL THRESHOLD VALUE

Threshold   Detected 0   Actual 0   Detected 1   Actual 1
0.89        24           16         50           20
0.90        16           16         30           20
0.91        8            16         20           20

As shown in the table above, increasing the threshold value decreases the number of detected objects, thus increasing false negative detections, whereas decreasing the threshold value increases the number of detected objects, thus increasing false positive detections.

B. Single- and Multi-line Binary Image to Text Conversion
The identified threshold values were tested on different images of single-line binary numbers. Figure 4 shows an example of this result.

Fig. 4 A sample single-line binary number image to text conversion
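The effect of the threshold discussed in Section V.A can be illustrated with a small NumPy sketch. Everything here is assumed for illustration: the score map is made up (it is not data from the study), and `detect` is a hypothetical helper that keeps every location whose matching score is at or above the threshold, which is the usual way a score map such as the one returned by cv2.matchTemplate() is filtered.

```python
import numpy as np

def detect(score_map, threshold):
    """Return (x, y) top-left corners whose score is at or above `threshold`."""
    ys, xs = np.where(score_map >= threshold)
    return list(zip(xs.tolist(), ys.tolist()))

# Made-up score map: two strong matches (0.95, 0.93) and one near miss (0.895)
scores = np.array([
    [0.10, 0.95, 0.20, 0.895],
    [0.05, 0.30, 0.93, 0.40],
])

for threshold in (0.89, 0.90, 0.94):
    print(threshold, detect(scores, threshold))
```

With these values, the lowest threshold admits the near miss (a false positive), the middle one keeps only the two strong matches, and the highest one drops a strong match (a false negative).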

Template matching for images with multi-line binary numbers is similar to that for single-line binary numbers. However, an additional step of identifying which row each digit belongs to is required for sorting. Nevertheless, the algorithm is able to perform the conversion for these kinds of images, as depicted in Figure 5.

Fig. 5 A sample multi-line binary number image to text conversion
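The row identification and sorting step can be sketched as follows. This is one possible approach under stated assumptions, not the authors' implementation: detections are assumed to be (x, y, digit) tuples holding the top-left corner and the matched template's label, and `row_tolerance` is a hypothetical parameter for how many pixels two detections may differ vertically and still count as the same row.

```python
def detections_to_rows(detections, row_tolerance=5):
    """Group (x, y, digit) detections into rows (top to bottom), then sort left to right."""
    rows = []
    for x, y, digit in sorted(detections, key=lambda d: d[1]):
        # Same row if the vertical gap to the previously placed detection is small
        if rows and abs(y - rows[-1][-1][1]) <= row_tolerance:
            rows[-1].append((x, y, digit))
        else:
            rows.append([(x, y, digit)])
    return ["".join(d for _, _, d in sorted(row, key=lambda d: d[0])) for row in rows]

# Made-up detections for a two-line image, listed in arbitrary order
detections = [(30, 11, "1"), (10, 10, "1"), (20, 12, "0"), (10, 40, "0"), (20, 41, "1")]
lines = detections_to_rows(detections)
print(lines)                          # ['101', '01']
print([int(s, 2) for s in lines])     # [5, 1]
```

The built-in int(s, 2) is used here as a shortcut for the decimal conversion. The paper describes an explicit loop over the inverted string instead.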

C. Colored Font Image to Text Conversion
Aside from single- and multi-line images, the algorithm was also tested on a binary number image with a colored font, as shown in Figure 6.

Fig. 6 Colored font image to text conversion
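To see why grayscale conversion lets one template serve several font colors, the sketch below applies the standard luma weights (0.299 R + 0.587 G + 0.114 B) that OpenCV documents for its BGR-to-gray conversion. The helper `bgr_to_gray` and the three-pixel image are assumptions for illustration. The helper returns floating-point values, whereas cv2.cvtColor on 8-bit images returns rounded integers.

```python
import numpy as np

def bgr_to_gray(img_bgr):
    """Weighted sum of the B, G, R channels (channel order as OpenCV loads images)."""
    b = img_bgr[..., 0].astype(float)
    g = img_bgr[..., 1].astype(float)
    r = img_bgr[..., 2].astype(float)
    return 0.114 * b + 0.587 * g + 0.299 * r

# Made-up 1x3 image: a white background pixel, a red pixel, and a blue pixel
img = np.array([[[255, 255, 255], [0, 0, 255], [255, 0, 0]]], dtype=np.uint8)
gray = bgr_to_gray(img)
print(np.round(gray, 2))
```

Both colored pixels map to intensities well below the white background, so colored digits on white still appear as dark shapes on a light field after conversion.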

Other limitations of the algorithm include the inability to extract values from images with non-white backgrounds. Detecting numbers in images with fonts other than the Arial font used is also not possible.

VI. CONCLUSION
OpenCV and Python programming were successfully demonstrated in template matching for images containing binary numbers. It was found that OpenCV's template matching function allows easy detection of multiple occurrences of the templates in images. There is, however, a need to identify the optimal threshold value for each template, as it varies with the template image. The locations of detected templates were appended to arrays and sorted based on their vertical and horizontal locations. A string is then made from the sorted array. This string is converted to decimal using a loop that detects the occurrences of 1 and their indices in the string.

Although easy to use, the algorithm and OpenCV's template matching function have several limitations. One is the dependence on the Arial font size used in the template image. Another is the dependence on scale: the system cannot properly detect occurrences of the pattern when it is even slightly scaled, although the pattern is technically in the image. A further limitation is that the system can only recognize the templates in images with a white background.

These limitations lead to the following recommendations. First, pre-processing should be improved by adding color manipulation, in which the brightness and contrast of the background and text are modified so that the background can be transformed into white. Pre-processing techniques that scale the template as required by the image could also be added. Another modification is to have more templates in different font styles so that the numbers can be detected regardless of the font.
Lastly, the algorithm can be further developed to achieve conversion to other number systems such as hexadecimal and octal.

REFERENCES
[1] R. Brunelli and T. Poggio, “Face recognition: feature versus templates,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 15, no. 10, pp. 1042–1052, 1993. doi:10.1109/34.254061.
[2] X. Peng and J. Xu, “Hash-Based Line-by-Line Template Matching for Lossless Screen Image Coding,” IEEE Trans. Image Process., vol. 25, no. 12, pp. 5601–5609, 2016. doi:10.1109/tip.2016.2612884.
[3] Y. Lou and J. T. Yen, “Improved contrast for high frame rate imaging using coherent compounding combined with spatial matched filtering,” Ultrasonics, vol. 78, pp. 152–161, 2017. doi:10.1016/j.ultras.2017.03.015.
[4] S. K. Sahani, G. Adhikari, and B. K. Das, “A fast template matching algorithm for aerial object tracking,” ICIIP 2011 Proc. 2011 Int. Conf. Image Inf. Process., no. Iciip, 2011.
[5] “OpenCV template matching.” [Online]. Available: http://docs.opencv.org.
[6] M. Emambakhsh, Y. He, and I. Nabney, “Handwritten and Machine-Printed Text Discrimination Using a Template Matching Approach,” Proc. - 12th IAPR Int. Work. Doc. Anal. Syst. DAS 2016, no. 101779, pp. 399–404, 2016. doi:10.1109/das.2016.22.
[7] I. Symposium, C. Applications, and I. Electronics, “D3 2014,” pp. 152–157, 2014.
[8] J. P. Lewis, “Fast Normalized Cross-Correlation.” [Online]. Available: http://scribblethink.org/Work/nvisionInterface/nip.html.
[9] D. M. Tsai and C. T. Lin, “Fast normalized cross correlation for defect detection,” Pattern Recognit. Lett., vol. 24, no. 15, pp. 2625–2631, 2003.
[10] F. Y.M., “One-Dimensional Vector Based Pattern Matching,” Int. J. Comput. Sci. Inf. Technol., vol. 6, pp. 1–12, 2014.
[11] A. Mahmood and S. Khan, “Correlation-coefficient-based fast template matching through partial Elimination,” IEEE Trans. Image Process., vol. 21, no. 4, pp. 2099–2108, 2012.
[12] K. Briechle and U. D. Hanebeck, “Template matching using fast normalized cross correlation,” Proc. SPIE, vol. 4387, pp. 95–102, 2001. doi:10.1117/12.421129.
[13] “Binary to Decimal Conversion,” Electronics Tutorials, 2017. [Online]. Available: http://www.electronics-tutorials.ws.
[14] “Template Matching Documentation.” [Online]. Available: http://docs.opencv.org.
[15] OpenCV, “Colorspaces Documentation.” [Online]. Available: http://docs.opencv.org.
[16] A. Singh and S. Desai, “Optical character recognition using template matching and back propagation algorithm,” 2016 International Conference on Inventive Computation Technologies (ICICT), pp. 1–6, 2016. doi:10.1109/inventive.2016.7830161.
[17] J. Yoo, S. S. Hwang, S. D. Kim, M. S. Ki, and J. Cha, “Scale-invariant template matching using histogram of dominant gradients,” Pattern Recognit., vol. 47, no. 9, pp. 3006–3018, 2014. doi:10.1016/j.patcog.2014.02.016.