IEEE SIGNAL PROCESSING LETTERS, VOL. 21, NO. 12, DECEMBER 2014


Face Recognition under Varying Illumination with Logarithmic Fractal Analysis

Mohammad Reza Faraji and Xiaojun Qi

Abstract—Face recognition under illumination variations is a challenging research area. This paper presents a new method based on the log function and fractal analysis (FA) to produce a logarithmic fractal dimension (LFD) image which is illumination invariant. The proposed FA feature-based method is an effective edge-enhancement technique that extracts and enhances facial features such as eyes, eyebrows, nose, and mouth. Our extensive experiments show that the proposed method achieves the best recognition accuracy, using one image per subject for training, when compared with six recently proposed state-of-the-art methods.

Index Terms—Face recognition, fractal analysis, illumination variation, logarithmic fractal dimension.

I. INTRODUCTION

FACIAL appearance varies due to illumination, pose, expression, age, and occlusion [1], [2]. Among them, illumination variations such as shadows, underexposure, and overexposure are crucial problems to be addressed in a practical recognition system [3]. This has led researchers to introduce various methods to deal with illumination changes over the past decades. These methods can generally be categorized into gray-level transformation methods, gradient or edge extraction methods, and face reflection field estimation methods [3]. Gray-level transformation methods perform a pixel-wise intensity mapping with a linear or non-linear transformation function in order to redistribute the intensities in a face image and correct the uneven illumination to some extent [3]. Histogram Equalization (HE) [4], Logarithmic Transform (LT) [5], and Gamma Intensity Correction (GIC) [6] are regarded as typical approaches in this category. Gradient or edge extraction methods extract the gray-level gradients or edges from a face image and use them as an illumination-insensitive representation [3]. Representative methods include Local Binary Patterns (LBP) and its modified version Local Ternary Patterns (LTP) [7], [8], Local Directional Patterns (LDP) [9], Enhanced LDP (EnLDP) [10], Local Directional Number Patterns (LDN) [11], and the Discriminant Face Descriptor (DFD) [12]. LBP and LTP threshold the P neighborhood pixels in a circle of radius R around each pixel based on the value of the central pixel, where P is usually set to 8 and R is set to 1

Manuscript received June 13, 2014; accepted July 20, 2014. Date of publication July 25, 2014; date of current version July 30, 2014. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Alexandre X. Falcao. The authors are with the Department of Computer Science, Utah State University, Logan, UT 84322-4205 USA (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/LSP.2014.2343213

or 2. LDP, EnLDP, and LDN produce eight directional edge images using Kirsch compass masks and encode the directional information to obtain noise- and illumination-invariant representations. DFD is a three-step feature extraction method that maximizes the appearance difference between different persons and minimizes the difference within the same person. Reflectance field estimation methods estimate the face reflectance field from a 2D face image to produce illumination-invariant representations [3]. Gradientface [13] and Weberface [14] are examples of these methods. Gradientface and Weberface compute, respectively, the ratio of the y-gradient to the x-gradient and the ratio of the local intensity variation to the background of a given image to produce illumination-insensitive representations. On the other hand, fractal analysis (FA), as a type of texture analysis, has recently been used in medical imaging and image processing [15], [16], [17], [18]. For texture analysis of fractal features, image intensities are transformed to the fractal dimension (FD) domain [16]. The FD transform is considered an edge enhancement and preprocessing algorithm that does not increase noise [15], [16]. Specifically, Al-Kadi et al. [16] enhance edges in respective images using FA to differentiate between aggressive and nonaggressive malignant lung tumors. Kim et al. [17] apply FA to detect and predict glaucomatous progression. Zlatintsi and Maragos [18] use multiscale FA to quantify the multiscale complexity and fragmentation of different states of the music waveform. In this paper, we propose a novel FA feature-based preprocessing method to generate an illumination-invariant representation for a given face image. To the best of our knowledge, this is the first attempt to apply FA to the face recognition task to achieve an illumination-insensitive representation. To this end, we first perform a log-based transformation to partially reduce the illumination effect and make the image brighter.
This log function expands the values of dark pixels and compresses those of brighter pixels, so pixel values are spread more uniformly. We then transfer the scaled image to a logarithmic FD (LFD) image using the Differential Box-Counting (DBC) algorithm [16], [19], [20]. Finally, we evaluate the performance of our method using the one nearest neighbor (1NN) classifier. This paper makes four contributions: 1) using a necessary and efficient log function to expand dark pixels and compress bright pixels for partial illumination reduction; 2) transforming face images to the FD domain to produce an illumination-invariant representation; 3) enhancing facial features such as eyes, eyebrows, nose, and mouth while keeping noise at a low level; and 4) achieving a high face recognition accuracy with a simple classifier compared with several recently proposed state-of-the-art methods. The rest of this paper is organized as follows: Section II presents our proposed FA feature-based method to produce LFD images. Section III shows experimental results and evaluates the performance of the proposed method. Section IV draws the conclusion and presents directions for future work.

1070-9908 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

II. METHODOLOGY

A. FA Feature-Based Method

The FD transform is an effective edge-enhancement technique that keeps the noise level low [15], [16]. Since edge magnitudes are largely insensitive to illumination variations [11], we apply FA to produce the corresponding FD images of given face images to achieve an illumination-insensitive representation. Fractals are defined as geometrical sets whose Hausdorff-Besicovitch dimension strictly exceeds their topological dimension. They describe non-Euclidean structures that show self-similarity at different scales [16]. Most biological and natural structures tend to have a fractal dimension [16]. We use the DBC algorithm, a popular method that performs fast FD calculations on large images, to quickly transfer face images to FD images [16], [19], [20], [21]. In order to transfer a face image I of size M x N to the FD domain, we first compute the 3D matrix B(i, j, s) that represents the number of boxes necessary to overlay the image at each pixel (i, j) as follows:

B(i, j, s) = Σ_{m=i-a}^{i+b} Σ_{n=j-c}^{j+d} K(m, n),   (1)

where s, 2 ≤ s ≤ s_max, is the scaling factor, with a maximum value of s_max, that represents how much a specific structure of pixels is self-similar to its surrounding. K is a varying-size nonlinear kernel of size s x s, and a, b, c, and d are four nonnegative integers computed to center the kernel on each pixel (i, j); that is, a + b + 1 = s and c + d + 1 = s. The kernel K, functioning as a moving window, is calculated as the following:

K(m, n) = ceil(I_max / s) - ceil(I_min / s) + 1,   (2)

where I_max and I_min are the highest and lowest intensity values of the neighboring pixels in the s x s processing block. Finally, we generate the fractal slope of the linear regression line of log(B) on log(s) to represent the FD value at (i, j). To this end, we first apply the log function to all the elements of B and to the respective scaling factors s to compress the dynamic range of images [16]. Next, we convert B to a two-dimensional matrix B' of size MN x (s_max - 1); that is, each row of B' is a vector of size s_max - 1 whose elements are related to the pixel at location (i, j). The FD value at (i, j) across scale indices k from 1 to s_max - 1 is computed as the fractal slope of the least-squares linear regression line by:

FD(i, j) = S_xy(i, j) / S_xx,   (3)

where S_xy and S_xx are the sums of squares as follows:

S_xy(i, j) = Σ_{k=1}^{s_max - 1} (x_k - x_bar)(y_k(i, j) - y_bar(i, j)),   (4)

S_xx = Σ_{k=1}^{s_max - 1} (x_k - x_bar)^2,   (5)

where x_k = log(s_k), y_k(i, j) is the k-th element of the row of B' corresponding to pixel (i, j), and x_bar and y_bar are their respective means.

Fig. 1. Illustration of the LFD process: 1) the face image is scaled by the log function; 2) the 3D matrix B is computed using the DBC algorithm; 3) B is converted to B' and the LFD image is obtained using Eq. (3).

Fig. 2. Results of the proposed FA feature-based method: (a) original face images; (b) scaled images after the log operation; and (c) LFD images.

The transformation process to convert face images to FD images is illustrated in steps 2 and 3 of Fig. 1.

B. Implementation

This subsection illustrates how to implement the proposed method. First, we perform a log-based transformation to partially reduce the illumination effect and make the image brighter. This log function expands the values of dark pixels and compresses the values of bright pixels. Next, we transfer the scaled image to the FD domain using the FA feature-based method introduced in the previous subsection. The final image is called the LFD image. The entire process to compute the LFD image is illustrated in Fig. 1, and an algorithmic view of the proposed method is summarized in Algorithm 1. Fig. 2 shows two face images along with their corresponding log and LFD images. It clearly demonstrates that the log function reduces the illumination effect to some extent. Furthermore, LFD images enhance the important features of faces such as eyes, eyebrows, nose, mouth, and the shape of the face in general. Comparing LFD images with their original images verifies that the proposed method produces illumination-insensitive features. Fig. 3 presents the LFD images of six Yale B face images for one subject together with the illumination-invariant images produced by Gradientface, Weberface, LBP, LDP, EnLDP, and LDN, respectively. It clearly shows that LFD images contain better or


Fig. 3. Illustration of original face images and their preprocessed images. (a) Sample illumination face images from Yale B and illumination invariant images produced by (b) Gradientface, (c) Weberface, (d) LBP, (e) LDP, (f) EnLDP, (g) LDN, and (h) LFD.

comparable illumination-insensitive features than the other preprocessed images.
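To make Eqs. (1)-(2) concrete, the per-pixel differential box count at a single scale s can be sketched in Python/NumPy. The authors' implementation is in MATLAB; the function and variable names below are ours, and the kernel follows the standard DBC definition as we read it from the surrounding text:

```python
import numpy as np

def lfd_box_count(img, s):
    """Per-pixel differential box count at scale s: a sketch of Eqs. (1)-(2).

    img : 2D array of gray levels.
    s   : scaling factor (box size), s >= 2.
    Returns a 2D array holding B(i, j, s) for every pixel (i, j).
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    a = (s - 1) // 2          # rows/cols before the center (a and c in Eq. (1))
    b = s - 1 - a             # rows/cols after the center (b and d in Eq. (1))
    pad = np.pad(img, ((a, b), (a, b)), mode="edge")
    # Eq. (2): kernel value at each pixel from its own s-by-s block.
    K = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            block = pad[i:i + s, j:j + s]
            K[i, j] = np.ceil(block.max() / s) - np.ceil(block.min() / s) + 1
    # Eq. (1): sum the kernel over the s-by-s window centered at (i, j).
    Kpad = np.pad(K, ((a, b), (a, b)), mode="edge")
    B = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            B[i, j] = Kpad[i:i + s, j:j + s].sum()
    return B
```

On a perfectly flat image the kernel is 1 everywhere, so B(i, j, s) = s^2 at every pixel; textured regions yield larger counts, which is what makes the slope across scales an edge-sensitive quantity.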


Fig. 4. Illustration of sample images and their LFD images: (a) 21 samples from CMU-PIE; (b) corresponding LFD images.

Algorithm 1: The algorithmic view of the proposed method.

Input: Original face image I of size M x N.
Output: The logarithmic fractal dimension (LFD) image.

1) Perform the log transformation on the original image I.
2) For s = 2, ..., s_max:
   a) Update the kernel K using Eq. (2).
   b) Compute B(i, j, s) using Eq. (1).
3) Convert the three-dimensional matrix B of size M x N x (s_max - 1) to the two-dimensional matrix B' of size MN x (s_max - 1).
4) Perform the log operation on both the scaling-factor vector and the matrix B'.
5) Initialize the logarithmic fractal dimension image, with the size of the original image I, to all 0's.
6) For each pixel (i, j) of the image:
   a) Compute S_xy and S_xx using Eq. (4) and Eq. (5).
   b) Set the value of LFD(i, j) using Eq. (3).

Fig. 5. Comparison of recognition accuracy for PIE face images.

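Algorithm 1 can be sketched end to end in Python/NumPy. This is a minimal illustration under our own naming, not the authors' MATLAB code; `np.log1p` stands in for the paper's log scaling, and the box-count helper follows the standard DBC kernel:

```python
import numpy as np

def _box_count(img, s):
    """B(i, j, s) of Eqs. (1)-(2): sum of the DBC kernel over an s-by-s window."""
    h, w = img.shape
    a = (s - 1) // 2
    b = s - 1 - a
    pad = np.pad(img, ((a, b), (a, b)), mode="edge")
    K = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            blk = pad[i:i + s, j:j + s]
            K[i, j] = np.ceil(blk.max() / s) - np.ceil(blk.min() / s) + 1
    Kpad = np.pad(K, ((a, b), (a, b)), mode="edge")
    return np.array([[Kpad[i:i + s, j:j + s].sum() for j in range(w)]
                     for i in range(h)])

def lfd_image(img, s_max=10):
    """Sketch of Algorithm 1: log scaling, box counts at scales 2..s_max,
    and the per-pixel least-squares slope of Eqs. (3)-(5)."""
    scaled = np.log1p(np.asarray(img, dtype=float))   # step 1: expand dark pixels
    scales = np.arange(2, s_max + 1)
    # Steps 2-3: stack B(i, j, s) for every scale (the rows of B').
    B = np.stack([_box_count(scaled, int(s)) for s in scales], axis=2)
    # Step 4: log of the scaling factors and of the box counts.
    x = np.log(scales)
    y = np.log(B)
    xc = x - x.mean()
    # Steps 5-6, Eqs. (3)-(5): slope of the regression of y on x at each pixel.
    # Centering y is unnecessary because the x-deviations sum to zero.
    s_xy = (y * xc).sum(axis=2)
    s_xx = (xc ** 2).sum()
    return s_xy / s_xx
```

As a sanity check on the sketch: for a flat image every box count is s^2, so log(B) = 2 log(s) and the regression slope is exactly 2 at every pixel.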
III. EXPERIMENTAL RESULTS

A. Experimental Settings

We evaluate the proposed FA feature-based method, i.e., LFD, by conducting experiments on the publicly available CMU-PIE and Yale B face databases with large illumination variations [22], [23]. All images in both databases are manually cropped and resized to the same size. s_max is the only parameter of the LFD method, and the scaling factor s lies in the range between 2 and s_max. In these experiments, we set s_max to be

10 for both databases. However, we investigate the influence of different s_max values in subsection III-D to show the insensitivity to large s_max values. The LFD method is compared with several recently proposed state-of-the-art methods, namely Gradientface, Weberface, LBP, LDP, EnLDP, and LDN. We implement each method in MATLAB and set its applicable parameters as recommended by its authors. For LBP, we use the uniform operator with 8 members in a circle of radius 2 [7]. We use the 1NN classifier, which is also used in the Weberface method [14]. It assigns a probe image to its nearest reference image in the database. Therefore, the results only show the influence of the preprocessing methods in handling illumination.

B. Results on PIE Face Database

The CMU-PIE database contains 41,368 grayscale images of 68 individuals under various poses, illuminations, and expressions. The illumination subset (C27), containing 21 frontal images per subject, is used in our experiment. Fig. 4(a) shows all 21 images for a subject from this database, and Fig. 4(b) shows their corresponding LFD images. For each individual, we use one image as the reference and the other 20 images as probes. Fig. 5 shows the face recognition accuracy of the different methods under each reference set using the 1NN classifier. Table I summarizes the average recognition accuracy and its standard deviation (SD) for eight methods (one of which is a variant of our system without applying the log operation) for all reference


TABLE I: AVERAGE RECOGNITION ACCURACY (%) AND CORRESPONDING STANDARD DEVIATION (IN PARENTHESES) FOR CMU-PIE FACE IMAGES

TABLE II: AVERAGE RECOGNITION ACCURACY (%) AND CORRESPONDING STANDARD DEVIATION (IN PARENTHESES) FOR YALE B FACE IMAGES

images. LFD achieves the highest average recognition rate of 97.86% with the smallest SD of 0.02, while the second highest rate, obtained by the Gradientface method, is 96.63%. Compared with the six state-of-the-art methods, LFD improves the accuracy of Gradientface, Weberface, LBP, LDP, EnLDP, and LDN by 1.27%, 3.50%, 2.61%, 17.31%, 4.32%, and 8.20%, respectively. This clearly shows the effectiveness of the LFD method. We also show the influence of the log operation on top of each of the seven compared methods in the last row. The result verifies that the log-based operation is necessary in our LFD method, while none of the other compared methods gains a significant accuracy improvement from the log function.

C. Results on Yale B Face Database

The Yale B database contains grayscale face images of 10 individuals under nine poses and 64 illumination conditions. The first pose, containing frontal face images, is used in our experiment. These images are categorized into six subsets based on the angle between the light source direction and the central camera axis: S0 (0 degrees, 60 images), S1 (1 to 12 degrees, 80 images), S2 (13 to 25 degrees, 100 images), S3 (26 to 50 degrees, 120 images), S4 (51 to 77 degrees, 100 images), and S5 (above 78 degrees, 180 images). In total, there are 640 images. S0 contains 6 images with different elevations of the light source for each subject. The six images corresponding to one of the subjects are shown in the first column of Fig. 3. The degrees of their elevations are -35, -20, 0, 20, 45, and 90, respectively. Positive and negative elevations imply the light source is above or below the horizon, respectively. We conduct six experiments for each compared method. In each experiment, we use one of the six face images per subject in S0 as the reference, and the remaining five images (50 images in total) and all the images in each of the other five subsets as probes. We then compute the average face recognition accuracy of each subset across all six experiments.
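The evaluation protocol above reduces to nearest-neighbor matching between flattened preprocessed (e.g., LFD) images. A minimal sketch of such a 1NN scorer, with our own names and Euclidean distance assumed:

```python
import numpy as np

def one_nn_accuracy(refs, ref_labels, probes, probe_labels):
    """Minimal 1NN classifier over flattened preprocessed images.

    refs   : (n_ref, d) reference vectors, one (or more) per subject.
    probes : (n_probe, d) probe vectors.
    Returns the fraction of probes whose nearest reference shares their label.
    """
    refs = np.asarray(refs, dtype=float)
    probes = np.asarray(probes, dtype=float)
    # Pairwise squared L2 distances between every probe and every reference.
    d2 = ((probes[:, None, :] - refs[None, :, :]) ** 2).sum(axis=2)
    # Each probe is assigned the label of its nearest reference.
    pred = np.asarray(ref_labels)[d2.argmin(axis=1)]
    return float((pred == np.asarray(probe_labels)).mean())
```

Because the classifier has no trainable parameters, any accuracy difference between methods reflects the preprocessing alone, which is the point of the protocol.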
Table II summarizes the average accuracy on each subset for the six state-of-the-art compared methods and our proposed method. S0' for each experiment contains the images in S0 excluding the reference image. We also include the average accuracy over subsets S0', S1, S2, S3, S4, and S5 (i.e., 630 probes in total) in the eighth column of Table II. The LFD method outperforms the six state-of-the-art methods with the best face recognition accuracy of 95.13% and a comparable SD of 0.08. The second best method, the Weberface method, achieves a face recognition rate of 92.51%. It should be noted that the second

Fig. 6. Comparison of the recognition accuracy of the proposed method with different s_max values ranging from 2 to 20 for PIE and Yale B images.

best method for the PIE database ranks fourth for the Yale B database. The proposed method improves the face recognition accuracy of Gradientface, Weberface, LBP, LDP, EnLDP, and LDN by 6.73%, 2.83%, 6.08%, 10.64%, 10.68%, and 15.45%, respectively. Similar to the PIE database, the result for the variant without the log operation shows that the log function has to be applied before the FD transformation.

D. Parameter s_max

The proposed FA feature-based method has only one parameter, s_max. It is the maximum value of the scaling factor, which represents how much a specific structure of pixels is self-similar to its surrounding. Different values of s_max can lead to different accuracy. Fig. 6 plots the recognition accuracy for the PIE and Yale B databases using s_max values ranging from 2 to 20. It clearly shows that both databases have similar trends with respect to s_max. Specifically, the accuracy for both databases increases gradually until s_max reaches about 9, and then does not change significantly for larger s_max values. This observation indicates that different face databases would exhibit similar behavior regarding s_max, since the PIE and Yale B databases contain different illumination conditions. Therefore, for the image size used in our experiments, we recommend an s_max value between 9 and 12 for the face recognition task to achieve a decent compromise between computational time and face recognition accuracy. We set s_max to 10 for both face databases in our experiments.

IV. CONCLUSIONS

We propose an FA feature-based method to produce illumination-invariant features from face images with illumination variations. Our experiments on two face databases (PIE and Yale B) illustrate the effectiveness of the proposed method and demonstrate that it achieves the best face recognition accuracy when compared with six recently proposed state-of-the-art methods. Our contributions are: 1) Applying an effective and necessary log transformation to produce partially illumination-reduced face images.
2) Applying the FA feature-based method to produce illumination-invariant face representations (i.e., LFD images) while enhancing facial features such as eyes, eyebrows, nose, and mouth and keeping noise at a low level.


REFERENCES

[1] A. F. Abate, M. Nappi, D. Riccio, and G. Sabatino, "2D and 3D face recognition: A survey," Pattern Recognit. Lett., vol. 28, pp. 1885-1906, 2007.
[2] M. R. Faraji and X. Qi, "An effective neutrosophic set-based preprocessing method for face recognition," in Proc. Int. Conf. Multimedia Expo, 2013.
[3] H. Han, S. Shan, X. Chen, and W. Gao, "A comparative study on illumination preprocessing in face recognition," Pattern Recognit., vol. 46, no. 6, pp. 1691-1699, 2013.
[4] S. M. Pizer, E. P. Amburn, J. D. Austin, R. Cromartie, A. Geselowitz, T. Greer, B. ter Haar Romeny, J. B. Zimmerman, and K. Zuiderveld, "Adaptive histogram equalization and its variations," Comput. Vis. Graph. Image Process., vol. 39, no. 3, pp. 355-368, 1987.
[5] Y. Adini, Y. Moses, and S. Ullman, "Face recognition: The problem of compensating for changes in illumination direction," IEEE Trans. Patt. Anal. Mach. Intell., vol. 19, no. 7, pp. 721-732, 1997.
[6] S. Shan, W. Gao, B. Cao, and D. Zhao, "Illumination normalization for robust face recognition against varying lighting conditions," in Proc. IEEE Int. Workshop on Analysis and Modeling of Faces and Gestures, 2003, pp. 157-164.
[7] T. Ahonen, A. Hadid, and M. Pietikainen, "Face description with local binary patterns: Application to face recognition," IEEE Trans. Patt. Anal. Mach. Intell., vol. 28, no. 12, pp. 2037-2041, 2006.
[8] X. Tan and B. Triggs, "Enhanced local texture feature sets for face recognition under difficult lighting conditions," IEEE Trans. Image Process., vol. 19, no. 6, pp. 1635-1650, 2010.
[9] T. Jabid, M. H. Kabir, and O. Chae, "Local directional pattern (LDP) for face recognition," in Proc. Dig. Tech. Papers Int. Conf. Consumer Electronics, 2010, pp. 329-330.
[10] F. Zhong and J. Zhang, "Face recognition with enhanced local directional patterns," Neurocomputing, vol. 119, pp. 375-384, 2013.
[11] A. R. Rivera, R. Castillo, and O. Chae, "Local directional number pattern for face analysis: Face and expression recognition," IEEE Trans. Image Process., vol. 22, no. 5, pp. 1740-1752, 2013.
[12] Z. Lei, M. Pietikainen, and S. Li, "Learning discriminant face descriptor," IEEE Trans. Patt. Anal. Mach. Intell., vol. 36, no. 2, pp. 289-302, 2014.
[13] T. Zhang, Y. Y. Tang, B. Fang, Z. Shang, and X. Liu, "Face recognition under varying illumination using gradientfaces," IEEE Trans. Image Process., vol. 18, no. 11, pp. 2599-2606, 2009.
[14] B. Wang, W. Li, W. Yang, and Q. Liao, "Illumination normalization based on Weber's law with application to face recognition," IEEE Signal Process. Lett., vol. 18, no. 8, pp. 462-465, 2011.
[15] C.-C. Chen, J. S. DaPonte, and M. D. Fox, "Fractal feature analysis and classification in medical imaging," IEEE Trans. Med. Imag., vol. 8, no. 2, pp. 133-142, 1989.
[16] O. S. Al-Kadi and D. Watson, "Texture analysis of aggressive and nonaggressive lung tumor CE CT images," IEEE Trans. Biomed. Eng., vol. 55, no. 7, pp. 1822-1830, 2008.
[17] P. Y. Kim, K. M. Iftekharuddin, P. G. Davey, M. Toth, A. Garas, G. Holló, and E. A. Essock, "Novel fractal feature-based multiclass glaucoma detection and progression prediction," IEEE J. Biomed. Health Inform., vol. 17, no. 2, pp. 269-276, 2013.
[18] A. Zlatintsi and P. Maragos, "Multiscale fractal analysis of musical instrument signals with application to recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 4, pp. 737-748, 2013.
[19] N. Sarkar and B. B. Chaudhuri, "An efficient differential box-counting approach to compute fractal dimension of image," IEEE Trans. Syst., Man Cybern., vol. 24, no. 1, pp. 115-120, 1994.
[20] S. Buczkowski, S. Kyriacos, F. Nekka, and L. Cartilier, "The modified box-counting method: Analysis of some characteristic parameters," Pattern Recognit., vol. 31, no. 4, pp. 411-418, 1998.
[21] C. J. Traina, A. Traina, L. Wu, and C. Faloutsos, "Fast feature selection using fractal dimension," in Proc. 15th Braz. Symp. Databases, 2000, pp. 158-171.
[22] T. Sim, S. Baker, and M. Bsat, "The CMU pose, illumination, and expression (PIE) database," in Proc. IEEE Int. Conf. Automatic Face and Gesture Recogn., 2002, pp. 46-51.
[23] A. S. Georghiades, P. N. Belhumeur, and D. Kriegman, "From few to many: Illumination cone models for face recognition under variable lighting and pose," IEEE Trans. Patt. Anal. Mach. Intell., vol. 23, no. 6, pp. 643-660, 2001.