SECURITY AND COMMUNICATION NETWORKS Security Comm. Networks 2014; 7:2153–2159 Published online 10 December 2013 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/sec.929

RESEARCH ARTICLE

Distinguishing computer graphics from photographic images using a multiresolution approach based on local binary patterns

Zhaohong Li1*, Zhenzhen Zhang1 and Yunqing Shi2

1 School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, 100044, China
2 Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102, U.S.A.

ABSTRACT

With the ongoing development of rendering technology, computer graphics (CG) are sometimes so photorealistic that distinguishing them from photographic (PG) images by eye has become difficult. Many methods have therefore been developed for automatic CG and PG classification. In this paper, we present a simple yet efficient multiresolution approach to distinguish CG from PG based on uniform gray-scale invariant local binary patterns (LBPs) with the help of support vector machines (SVMs). We select YCbCr as the color model. The original Joint Photographic Experts Group (JPEG) coefficients of the Y, Cb, and Cr components and their prediction errors are fed to two LBP operators. From each 2D array and each LBP operator, we obtain 59 uniform LBP features, so that 12 groups of 59 features are obtained from each image in total. After multiresolution analysis, we select six groups of 59 features for CG and PG classification. The proposed features have been tested on thousands of CG and PG images. The classification accuracy reaches 95.1% with SVMs and outperforms state-of-the-art methods. Copyright © 2013 John Wiley & Sons, Ltd.

KEYWORDS

image forensics; computer graphics; local binary patterns; multiresolution analysis; image authentication

*Correspondence

Zhaohong Li, School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, 100044, China. E-mail: [email protected]

1. INTRODUCTION

In recent years, computer graphics (CG) have entertained people with incredibly photorealistic visual scenes produced by advanced rendering software such as 3D Studio Max, AccuRender, Photoshop, and SketchUp. However, precisely because of this photorealistic rendering, CG can also be passed off as photographs in journalism, scientific research, justice, and other areas for malicious social, economic, or political purposes. We can foresee that rendering software will become even more advanced and powerful and will be able to produce highly realistic images that deceive human eyes. Therefore, distinguishing CG from photographic (PG) images automatically has become an important topic in digital image forensics.

Because PG images are generated by digital cameras, the distinct physical generation pipeline of a camera is expected to introduce unique intrinsic characteristics into PG that are absent in CG. On the basis of this assumption, several distinguishing methods have been reported [1–4]. Dehnie et al. [1] used pattern noise caused by imperfections of camera sensors to distinguish CG from PG. Ng et al. [2] proposed a geometry-based image model to reveal certain image processing differences, such as gamma correction in PG. Dirik et al. [3] developed four features that capture traces of the color filter array and demosaicing in the camera image processing pipeline and another feature that captures chromatic aberration for the discrimination of CG and PG. In their later work [4], Dirik and Memon introduced two features. The first revisited the third demosaicing feature proposed in [3]. The second measured the changes in sensor noise power across the image. Both features can achieve high classification accuracy on high-quality PG images.

Apart from the aforementioned methods, which focus on one or two stages of the camera image processing pipeline, the works reported in [5–13] are based on differences in image statistical features caused by the different overall image formation procedures of CG and PG. Ng and Chang [5] studied three types of natural image statistics derived from the power spectrum, wavelet transform, and local patches of images to distinguish


CG from PG. Wu et al. [6] employed several visual features derived from color, edge, saturation, and texture, with the Gabor filter, as discriminative features. Chen et al. [7] formed the distinguishing features from statistical moments of the characteristic functions of wavelet subbands and their prediction errors. Sutthiwan et al. [8] employed second-order statistics to capture the significant statistical difference between CG and PG images. Chen et al. [9] built an alpha-stable distribution model to characterize the wavelet decomposition coefficients of natural images and extracted the fractional lower order moments in the wavelet domain. Li et al. [10] extracted the variance and kurtosis of second-order difference signals and the first four order statistics of prediction-error signals as distinguishing features in the hue–saturation–value (HSV) color space. Pan and Huang [11] extracted a set of features derived from a hidden Markov tree model to classify natural images and CG. Zhang and Wang [12] presented an approach combining imaging features and visual features from different image components. Wu et al. [13] took several of the highest histogram bins of the difference images as features for classification, and these simple histogram features worked well.

Image statistical features [5–13] have proved useful in CG and PG classification. Moreover, as noted in [14], “Texture is an innate property of virtually all surfaces.” This suggests that features developed for texture classification could also play a role in CG and PG classification. In this paper, we take a close look at the uniform gray-scale invariant local binary pattern (LBP) [15], which has been developed as an efficient local texture descriptor. In this popular technique (as of February 2012, [15] had been cited almost 1900 times according to Google), a texture is described by computing the LBPs over the entire image and summarizing them in a histogram.
The number of bins is reduced from 256 (eight neighbors) to 59 by separating 'uniform' and 'non-uniform' patterns and merging all 'non-uniform' patterns into one bin. The LBP features reflect both global and local information related to the texture of an image, which makes them very useful in classifying CG and PG. In our previous work [16], we considered eight neighbors that form a circle of radius one to calculate the LBP of the central pixel, from which 59 LBP features were obtained. Further study of LBP features has shown that combining the information provided by multiple operators of varying (P, R) is helpful in texture segmentation [15]. This motivates us to use multiresolution analysis in classifying CG and PG in this paper and to test the features on a much larger database. For each pixel, we select two LBP operators: one considers eight neighbors that form a circle of radius one, and the other considers eight neighbors that form a circle of radius two. Hence, for each image and each LBP operator, 59 LBP features are extracted from each of the Y, Cb, and Cr components and from each of their prediction-error 2D arrays. Support vector machines are built for the classification of thousands of CG and PG images. The classification accuracy reported in this paper is higher than the results reported in the literature.

The rest of the paper is organized as follows. The LBP features and the feature extraction are introduced in Section 2. Experimental results and discussions are presented in Section 3, and Section 4 concludes the paper.

2. PROPOSED METHOD

The local binary pattern is a simple yet efficient texture descriptor that has proved powerful in various texture analysis tasks. This motivates us to employ LBP to find the differences between CG and PG based on image structure information related to texture.

2.1. Review of local binary pattern

The local binary pattern is an efficient texture descriptor that labels each image pixel with a local pattern by comparing its gray value with those of its neighbors. When we compute the LBP for one image pixel and consider its P neighbors, which form a circle of radius R (R > 0), we call that pixel the central pixel, and the LBP of the central pixel is defined as follows [15]:

$$\mathrm{LBP}_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^{p} \tag{1}$$

where

$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

For the prediction-error 2D arrays, each element x is predicted from its neighbors as

$$\hat{x} = \begin{cases} \max(a, b), & c \le \min(a, b) \\ \min(a, b), & c \ge \max(a, b) \\ a + b - c, & \text{otherwise} \end{cases} \tag{4}$$

where a and b are, respectively, the immediate horizontal and vertical neighbors of the pixel x, c is the diagonal neighbor of x as shown in Figure 4, and $\hat{x}$ is the prediction value of x.
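As a rough illustration of Equation (4), the sketch below predicts each element of a 2D array from three neighbors and returns the prediction-error array. Several details are assumptions rather than statements from the text: that a is the left neighbor, b the upper neighbor, and c the upper-left neighbor (Figure 4 fixes the actual layout), that the prediction error is x minus its prediction value, and that border elements lacking any of the three neighbors are skipped. The function names `med_predict` and `prediction_error` are hypothetical.

```python
def med_predict(a, b, c):
    """Prediction value of Equation (4) from neighbors a, b and diagonal c."""
    if c <= min(a, b):
        return max(a, b)
    if c >= max(a, b):
        return min(a, b)
    return a + b - c


def prediction_error(array):
    """Return x - x_hat for every element that has all three neighbors.

    Assumed layout (not taken from the paper): a = left, b = above,
    c = above-left. The first row and first column are dropped, so the
    result has (rows - 1) x (cols - 1) elements.
    """
    rows, cols = len(array), len(array[0])
    errors = []
    for i in range(1, rows):
        row = []
        for j in range(1, cols):
            a = array[i][j - 1]
            b = array[i - 1][j]
            c = array[i - 1][j - 1]
            row.append(array[i][j] - med_predict(a, b, c))
        errors.append(row)
    return errors
```

On a flat region the prediction is exact and the error is zero, so the error array mostly retains local deviations rather than overall image content.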

Figure 2. (a) Constellation of neighborhood and (b) examples of 'uniform' and 'non-uniform' local binary patterns [15].

Figure 3. Rotation invariant patterns [15].
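A minimal sketch of Equation (1) for a circular neighborhood such as the one in Figure 2(a). It assumes that neighbor p lies at angle 2πp/P, starting to the right of the central pixel and moving counterclockwise, that neighbors falling between pixels are bilinearly interpolated as in [15], and that a small tolerance absorbs floating-point rounding in the comparison. The function names are hypothetical, and the image is a plain list of rows.

```python
import math


def bilinear(img, y, x):
    """Bilinearly interpolate img (a list of rows) at real-valued (y, x)."""
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, len(img) - 1), min(x0 + 1, len(img[0]) - 1)
    dy, dx = y - y0, x - x0
    top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
    bottom = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def lbp_code(img, row, col, P=8, R=1):
    """Equation (1): sum of s(g_p - g_c) * 2**p over P circular neighbors."""
    g_c = img[row][col]
    code = 0
    for p in range(P):
        angle = 2.0 * math.pi * p / P
        # Offsets are rounded so that axis-aligned neighbors hit exact pixels.
        dy = round(-R * math.sin(angle), 6)
        dx = round(R * math.cos(angle), 6)
        g_p = bilinear(img, row + dy, col + dx)
        # s(x) = 1 for x >= 0; the tolerance guards against rounding noise.
        if g_p - g_c >= -1e-9:
            code += 1 << p
    return code


def lbp_image(img, P=8, R=1):
    """LBP codes of all pixels whose whole neighborhood lies inside img."""
    m = int(math.ceil(R))
    return [[lbp_code(img, r, c, P, R) for c in range(m, len(img[0]) - m)]
            for r in range(m, len(img) - m)]
```

Calling `lbp_image(img, 8, 1)` and `lbp_image(img, 8, 2)` on the same array gives the codes for the two operators used in this paper; a perfectly flat patch yields code 255 because every difference is zero.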

Figure 4. Pixel x and its neighbors for prediction.

To conclude, for each color component and each LBP operator, we extract 59 uniform LBP features from the original JPEG coefficient array and from its prediction-error 2D array, respectively. The feature extraction framework of one color component for one LBP operator is shown in Figure 5. In total, we obtain 12 groups of LBP features. When P = 8 and R = 1, we denote the LBP features extracted from the Y, Cb, and Cr components and their prediction-error arrays as $\mathrm{LBP}^{Y}_{(8,1)}$, $\mathrm{LBP}^{Cb}_{(8,1)}$, $\mathrm{LBP}^{Cr}_{(8,1)}$, $\mathrm{LBP}^{EY}_{(8,1)}$, $\mathrm{LBP}^{ECb}_{(8,1)}$, and $\mathrm{LBP}^{ECr}_{(8,1)}$, respectively. When P = 8 and R = 2, we denote the LBP features extracted from the Y, Cb, and Cr components and their prediction-error arrays as $\mathrm{LBP}^{Y}_{(8,2)}$, $\mathrm{LBP}^{Cb}_{(8,2)}$, $\mathrm{LBP}^{Cr}_{(8,2)}$, $\mathrm{LBP}^{EY}_{(8,2)}$, $\mathrm{LBP}^{ECb}_{(8,2)}$, and $\mathrm{LBP}^{ECr}_{(8,2)}$, respectively.

3. EXPERIMENTS AND DISCUSSIONS

In the experiments, all the CG and PG in our database are color images in JPEG format with moderate to good visual quality. The database used in our previous work [16] includes 2455 CG and 2455 PG; in this paper, the database is considerably enlarged. The CG database contains 7500 images, including the 2455 CG used in the previous work [16], collected from [19] and [20]. More than 50 rendering software packages, for example, 3D Studio Max, After Effects, and AutoCAD, were used to generate those photorealistic CG images. The PG database contains 7500 digital camera images, also including the 2455 PG used in the previous work [16]. Part of the PG database is from [19]; the rest were collected by our group. The image contents in both the CG and PG databases span a variety of outdoor and indoor scenes, including flowers, trees, animals, characters, and architecture.

3.1. Experimental setting

In our experiments, we use a support vector machine (SVM) with a polynomial kernel [21] as the classifier. To train the SVM classifier, 5/6 of the images are randomly selected as the training set (6250 CG and 6250 PG). The remaining 1/6 form the testing set. The experiments are repeated 20 times to ensure reliable classification results.

3.2. Experimental results

We test each group of 59 features on our image database, and the classification accuracy of each group is presented in Table I, where true positive (TP) represents the correct detection rate of CG, true negative (TN) represents the correct detection rate of PG images, and the accuracy is the arithmetic average of TP and TN. From Table I, we can first observe that the LBP features extracted from the Y component perform much better than the features extracted from the Cb and Cr components. Second, the LBP features extracted from the prediction-error arrays of the Y component also perform better than those extracted from the prediction-error arrays of the Cb and Cr components. Furthermore, the LBP features extracted from the prediction-error arrays have higher accuracy than those extracted from the corresponding original components. This is because the prediction operation decreases the influence of image content on the image statistical features. For multiresolution analysis, we then combine the LBP features from the most significant group to the least significant one.
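The protocol of Section 3.1 and the accuracy definition of Section 3.2 can be sketched as follows. The helper names `split_indices` and `balanced_accuracy`, the fixed random seed, and the label convention (1 for CG, 0 for PG) are assumptions made for illustration. The classifier itself is left out, so any SVM implementation with a polynomial kernel could be plugged in between the two steps.

```python
import random


def split_indices(n_per_class, train_fraction=5 / 6, seed=0):
    """Randomly split the indices 0..n_per_class-1 into train and test lists."""
    rng = random.Random(seed)
    indices = list(range(n_per_class))
    rng.shuffle(indices)
    n_train = round(n_per_class * train_fraction)
    return indices[:n_train], indices[n_train:]


def balanced_accuracy(y_true, y_pred):
    """Return (TP, TN, accuracy) with CG = 1 as positive and PG = 0."""
    cg = [p for t, p in zip(y_true, y_pred) if t == 1]
    pg = [p for t, p in zip(y_true, y_pred) if t == 0]
    tp = sum(1 for p in cg if p == 1) / len(cg)   # CG detected as CG
    tn = sum(1 for p in pg if p == 0) / len(pg)   # PG detected as PG
    return tp, tn, (tp + tn) / 2


# Applied to each class of 7500 images separately, as in the paper's split.
train, test = split_indices(7500)
print(len(train), len(test))   # 6250 1250
```

Repeating the split with 20 different seeds and averaging the three returned values would mirror the repetition described above.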

Figure 5. Feature extraction framework of one color component for one local binary pattern operator.
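As a rough sketch of the last step of the framework in Figure 5, the block below turns an array of 8-bit LBP codes into a 59-bin feature vector. The uniformity criterion (at most two 0/1 transitions when the bit pattern is read circularly) is the standard definition from [15]; the bin ordering, the normalization of the histogram, and the function names are assumptions.

```python
def transitions(code, P=8):
    """Number of 0/1 changes when the P-bit pattern is read circularly."""
    bits = [(code >> p) & 1 for p in range(P)]
    return sum(bits[p] != bits[(p + 1) % P] for p in range(P))


def uniform_bin_table(P=8):
    """Map each of the 2**P codes to a bin; non-uniform codes share the last bin."""
    uniform = [c for c in range(2 ** P) if transitions(c, P) <= 2]
    table = {c: i for i, c in enumerate(uniform)}
    last = len(uniform)                      # single bin for non-uniform codes
    return [table.get(c, last) for c in range(2 ** P)], last + 1


def uniform_lbp_histogram(codes, P=8):
    """Normalized histogram of LBP codes (an iterable of ints) over uniform bins."""
    table, n_bins = uniform_bin_table(P)
    hist = [0] * n_bins
    count = 0
    for code in codes:
        hist[table[code]] += 1
        count += 1
    return [h / count for h in hist] if count else hist
```

For P = 8 there are 58 uniform patterns, so the table has 58 + 1 = 59 bins. Concatenating six such histograms would give the 354-dimensional feature vector used in the experiments.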


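Table I. Classification accuracy of single groups of local binary pattern features.

| Feature | Feature size | TP | TN | Accuracy |
|---|---|---|---|---|
| $\mathrm{LBP}^{Y}_{(8,1)}$ | 59 | 0.9050 | 0.9172 | 0.9111 |
| $\mathrm{LBP}^{Cb}_{(8,1)}$ | 59 | 0.8398 | 0.8310 | 0.8354 |
| $\mathrm{LBP}^{Cr}_{(8,1)}$ | 59 | 0.8380 | 0.8190 | 0.8285 |
| $\mathrm{LBP}^{EY}_{(8,1)}$ | 59 | 0.9090 | 0.9250 | 0.9150 |
| $\mathrm{LBP}^{ECb}_{(8,1)}$ | 59 | 0.8548 | 0.8773 | 0.8661 |
| $\mathrm{LBP}^{ECr}_{(8,1)}$ | 59 | 0.8464 | 0.8705 | 0.8585 |
| $\mathrm{LBP}^{Y}_{(8,2)}$ | 59 | 0.8826 | 0.9151 | 0.8989 |
| $\mathrm{LBP}^{Cb}_{(8,2)}$ | 59 | 0.8198 | 0.8255 | 0.8226 |
| $\mathrm{LBP}^{Cr}_{(8,2)}$ | 59 | 0.8053 | 0.8346 | 0.8199 |
| $\mathrm{LBP}^{EY}_{(8,2)}$ | 59 | 0.9137 | 0.9188 | 0.9163 |
| $\mathrm{LBP}^{ECb}_{(8,2)}$ | 59 | 0.8656 | 0.8920 | 0.8788 |
| $\mathrm{LBP}^{ECr}_{(8,2)}$ | 59 | 0.8704 | 0.8825 | 0.8765 |

LBP, local binary pattern; TP, true positive; TN, true negative.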
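One way to read "from the most significant group to the least significant one" is to sort the groups by their single-group accuracy in Table I. That interpretation, the helper name `rank_groups`, and the choice to drop the Cb groups before ranking (they are discarded in the text that follows) are assumptions made for illustration; the numbers are copied from Table I.

```python
# (array, R) -> (TP, TN, accuracy) as listed in Table I; P = 8 throughout.
# "EY" denotes the prediction-error array of Y, and so on.
TABLE_I = {
    ("Y", 1): (0.9050, 0.9172, 0.9111),
    ("Cb", 1): (0.8398, 0.8310, 0.8354),
    ("Cr", 1): (0.8380, 0.8190, 0.8285),
    ("EY", 1): (0.9090, 0.9250, 0.9150),
    ("ECb", 1): (0.8548, 0.8773, 0.8661),
    ("ECr", 1): (0.8464, 0.8705, 0.8585),
    ("Y", 2): (0.8826, 0.9151, 0.8989),
    ("Cb", 2): (0.8198, 0.8255, 0.8226),
    ("Cr", 2): (0.8053, 0.8346, 0.8199),
    ("EY", 2): (0.9137, 0.9188, 0.9163),
    ("ECb", 2): (0.8656, 0.8920, 0.8788),
    ("ECr", 2): (0.8704, 0.8825, 0.8765),
}


def rank_groups(table, exclude=("Cb", "ECb")):
    """Sort feature groups by accuracy, highest first, skipping excluded arrays."""
    kept = {key: row for key, row in table.items() if key[0] not in exclude}
    return sorted(kept, key=lambda key: kept[key][2], reverse=True)


ranking = rank_groups(TABLE_I)
print(ranking)
print(59 * len(TABLE_I))   # total number of features over all 12 groups
```

With these values, the two EY groups come first, followed by the two Y groups, then ECr and finally Cr, which is the order in which groups are added in Table II.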

An interesting fact has been reported in [8] and in our previous work [16]: the features derived from the Cb and Cr components of the test images and their prediction-error arrays are highly correlated, whereas the features from the other combinations, Y with Cb and Y with Cr, are less correlated. This indicates that using all of the features constructed from the three color components will not improve feature effectiveness significantly but will increase computational complexity considerably. Therefore, when combining the different groups of LBP features, we discard the LBP features extracted from Cb and its prediction-error arrays. We then test the combined features on our image database; the classification accuracy using multiresolution is shown in Table II. The experimental results show that the combined features perform much better than single-group features. When the two groups of LBP features extracted from the prediction-error arrays of the Y component are combined, the classification accuracy increases by about 2% compared with the most significant single group of LBP features. As the feature size increases further, the accuracy increases step by step, but once the feature size becomes large enough (472-D in Table II), the accuracy no longer improves. This is because using SVM for distinguishing CG from PG requires the feature size to be sufficiently small relative to the size of the image database. On the whole, it can be concluded that the LBP features extracted from the Y component are the most crucial for the classifier and that the prediction-error operation is indeed helpful for classifying CG and PG.

To illustrate the performance of the proposed features, we compare them with the features proposed in [8,13] and in our previous work [16]. The Markov features of [8] were proposed in 2009, the histogram-bin features of [13] were proposed in 2011, and four groups of LBP features, $\mathrm{LBP}^{Y}_{(8,1)}$, $\mathrm{LBP}^{Cr}_{(8,1)}$, $\mathrm{LBP}^{EY}_{(8,1)}$, and $\mathrm{LBP}^{ECr}_{(8,1)}$, were used together for SVM in our previous work [16]; all of them are regarded as among the most effective methods for distinguishing CG and PG in the past few years. The code for the methods of [8,13] was obtained from the authors. For a fair comparison, we train and test the methods of [8,13,16] on our image database using the same SVM kernel function and the same parameters as in our proposed method. The comparison between our method and the previous works [8,13,16] is presented in Table III. Table III shows that the proposed method outperforms these state-of-the-art works. The accuracy of our method with 236-D features is about 1%, 4.5%, and 0.5% higher than that of [8], [13], and [16], respectively. When the size of the LBP features increases to 354-D, the accuracy of our method is about 1.5% higher than that of [8] and about 5% higher than that of [13]. Furthermore, the accuracy of the proposed 354-D LBP features is 1% higher than that of our previous work [16], which indicates that multiresolution analysis is helpful in classifying CG and PG.

In addition, we compare the computational complexity of the proposed method with that of the other methods [8,13,16] by measuring the feature extraction time under identical conditions: the same computer (ASUS U31S Series with Intel Core i3, 2 GB memory) and the same software (Matlab 7.1). One hundred PG images are randomly selected from the 7500 PG images for measuring the feature extraction time, and the arithmetic average time is taken as the measure of computational complexity. From Table IV, we can see that the Markov features [8] have the lowest computational complexity and that the LBP features require more extraction time than the others. However, the maximum gap between the extraction time of the proposed method and that of the Markov features is only about 0.3 s per image. The influence of the computational complexity therefore depends on the size of the image database to be examined. For example, for fewer than 1000 images, the difference in feature extraction time is only about 5 min, and this extra time is worthwhile for the higher classification accuracy. When the image database to be tested is huge, however, the computational complexity of the classification method needs to be taken into account.
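The timing argument above is simple arithmetic on the per-image extraction times of Table IV and can be checked with a short sketch. The dictionary keys and the function name `extra_minutes` are hypothetical labels; the times are the values listed in Table IV.

```python
# Average feature extraction time per image in seconds (values from Table IV).
EXTRACTION_TIME_S = {
    "Markov [8]": 0.3017,
    "Histogram bins [13]": 0.4808,
    "LBP [16]": 0.5674,
    "Proposed 236-D": 0.5804,
    "Proposed 354-D": 0.6134,
}


def extra_minutes(n_images, method="Proposed 354-D", baseline="Markov [8]"):
    """Extra extraction time, in minutes, of `method` over `baseline`."""
    gap = EXTRACTION_TIME_S[method] - EXTRACTION_TIME_S[baseline]
    return n_images * gap / 60.0


print(f"{extra_minutes(1000):.1f} min")
```

With the Table IV values, the largest gap works out to roughly 0.31 s per image, or a little over 5 min per 1000 images, in line with the estimate in the text.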

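Table II. Classification accuracy using multiresolution.

| Feature | Feature size | TP | TN | Accuracy |
|---|---|---|---|---|
| $\mathrm{LBP}^{EY}_{(8,1)}$ + $\mathrm{LBP}^{EY}_{(8,2)}$ | 118 | 0.9329 | 0.9386 | 0.9357 |
| $\mathrm{LBP}^{EY}_{(8,1)}$ + $\mathrm{LBP}^{EY}_{(8,2)}$ + $\mathrm{LBP}^{Y}_{(8,1)}$ + $\mathrm{LBP}^{Y}_{(8,2)}$ | 236 | 0.9418 | 0.9488 | 0.9453 |
| $\mathrm{LBP}^{EY}_{(8,1)}$ + $\mathrm{LBP}^{EY}_{(8,2)}$ + $\mathrm{LBP}^{Y}_{(8,1)}$ + $\mathrm{LBP}^{Y}_{(8,2)}$ + $\mathrm{LBP}^{ECr}_{(8,1)}$ + $\mathrm{LBP}^{ECr}_{(8,2)}$ | 354 | 0.9446 | 0.9573 | 0.9510 |
| $\mathrm{LBP}^{EY}_{(8,1)}$ + $\mathrm{LBP}^{EY}_{(8,2)}$ + $\mathrm{LBP}^{Y}_{(8,1)}$ + $\mathrm{LBP}^{Y}_{(8,2)}$ + $\mathrm{LBP}^{ECr}_{(8,1)}$ + $\mathrm{LBP}^{ECr}_{(8,2)}$ + $\mathrm{LBP}^{Cr}_{(8,1)}$ + $\mathrm{LBP}^{Cr}_{(8,2)}$ | 472 | 0.9423 | 0.9574 | 0.9499 |

LBP, local binary pattern; TP, true positive; TN, true negative.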


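Table III. Classifier test accuracy.

| Method | Feature size | TP | TN | Accuracy |
|---|---|---|---|---|
| Markov features of [8] | 324 | 0.9280 | 0.9440 | 0.9360 |
| Histogram bins of [13] | 112 | 0.8644 | 0.9371 | 0.9008 |
| LBP features of [16] | 236 | 0.9355 | 0.9466 | 0.9411 |
| $\mathrm{LBP}^{EY}_{(8,1)}$ + $\mathrm{LBP}^{EY}_{(8,2)}$ + $\mathrm{LBP}^{Y}_{(8,1)}$ + $\mathrm{LBP}^{Y}_{(8,2)}$ | 236 | 0.9418 | 0.9488 | 0.9453 |
| $\mathrm{LBP}^{EY}_{(8,1)}$ + $\mathrm{LBP}^{EY}_{(8,2)}$ + $\mathrm{LBP}^{Y}_{(8,1)}$ + $\mathrm{LBP}^{Y}_{(8,2)}$ + $\mathrm{LBP}^{ECr}_{(8,1)}$ + $\mathrm{LBP}^{ECr}_{(8,2)}$ | 354 | 0.9446 | 0.9573 | 0.9510 |

LBP, local binary pattern; TP, true positive; TN, true negative.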

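Table IV. Computational complexity.

| Method | Feature size | Feature extraction time (s) |
|---|---|---|
| Markov features of [8] | 324 | 0.3017 |
| Histogram bins of [13] | 112 | 0.4808 |
| LBP features of [16] | 236 | 0.5674 |
| $\mathrm{LBP}^{EY}_{(8,1)}$ + $\mathrm{LBP}^{EY}_{(8,2)}$ + $\mathrm{LBP}^{Y}_{(8,1)}$ + $\mathrm{LBP}^{Y}_{(8,2)}$ | 236 | 0.5804 |
| $\mathrm{LBP}^{EY}_{(8,1)}$ + $\mathrm{LBP}^{EY}_{(8,2)}$ + $\mathrm{LBP}^{Y}_{(8,1)}$ + $\mathrm{LBP}^{Y}_{(8,2)}$ + $\mathrm{LBP}^{ECr}_{(8,1)}$ + $\mathrm{LBP}^{ECr}_{(8,2)}$ | 354 | 0.6134 |

LBP, local binary pattern.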

4. CONCLUSION

In this paper, a novel method for distinguishing CG from PG images has been presented. We extract LBP features from the YCbCr color model by using two LBP operators. Through multiresolution analysis, we combine the features from the Y and Cr components and their prediction-error 2D arrays into the distinguishing features for classifying CG and PG. Compared with state-of-the-art works, the proposed method achieves higher classification accuracy. In future work, we will investigate the robustness of the proposed method, for example, how the classification accuracy is affected when CG and PG are postprocessed by JPEG compression, blurring, or added noise.

ACKNOWLEDGEMENTS

We would like to thank Dr. Xiaolong Li, Dr. Bin Yang, and Dr. Patchara Sutthiwan for kindly providing us with their code. We also thank Guanshuo Xu for his kind suggestions. The first author was supported by the Basic Research Foundation of Beijing Jiaotong University (no. 2011JBM004) and the Fundamental Research Funds for the Central Universities (2013YJS015).

REFERENCES

1. Dehnie S, Sencar T, Memon N. Digital image forensics for identifying computer generated and digital camera images. In Proceedings of IEEE ICIP, 2006; 2313–2316.
2. Ng TT, Chang SF, Hsu J, Xie L, Tsui MP. Physics-motivated features for distinguishing photographic images and computer graphics. In Proceedings of ACM Multimedia, 2005; 239–248.
3. Dirik AE, Bayram S, Sencar HT, Memon N. New features to identify computer generated images. In Proceedings of IEEE ICIP. IV, 2007; 433–436.
4. Dirik AE, Memon N. Image tamper detection based on demosaicing artifacts. In Proceedings of IEEE ICIP, 2009; 1497–1500.
5. Ng TT, Chang SF. Classifying photographic and photorealistic computer graphic images using natural image statistics. In ADVENT Technical Report, Columbia University, #220-2006-6 (2004).
6. Wu J, Kamath MV, Poehlman S. Detecting differences between photographs and computer generated images. In Proceedings of the 24th IASTED International Conference on Signal Processing, Pattern Recognition, and Applications, SPPRA'06, 2006; 268–273.
7. Chen W, Shi YQ, Xuan GR. Identifying computer graphics using HSV color model and statistical moments of characteristic functions. In Proceedings of ICME, 2007; 1123–1126.
8. Sutthiwan P, Cai X, Shi YQ, Zhang H. Computer graphics classification based on Markov process model and boosting feature selection technique. In Proceedings of IEEE ICIP, 2009; 2913–2916.
9. Chen DM, Li JH, Wang SL, Li SH. Identifying computer generated and digital camera images using fractional lower order moments. In IEEE Conference on Industrial Electronics and Applications (ICIEA), 2009; 230–235.
10. Li WX, Zhang T, Zheng EG, Ping XJ. Identifying photorealistic computer graphics using second-order difference statistics. In Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), 2010; 2316–2319.
11. Pan F, Huang JW. Discriminating computer graphics images and natural images using hidden Markov tree model. In Proceedings of IWDW, 2010; 23–28.
12. Zhang R, Wang RD. Distinguishing photorealistic computer graphics from natural images by imaging features and visual features. In International Conference on Electronics, Communications and Control (ICECC), 2011; 226–229.
13. Wu RY, Li XL, Yang B. Identifying computer generated graphics via histogram features. In Proceedings of ICIP, 2011; 1933–1936.
14. Haralick RM, Shanmugam K. Textural features for image classification. IEEE Transactions on Systems, Man and Cybernetics 1973; 3(6):610–621.
15. Ojala T, Pietikainen M, Maenpaa T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 2002; 24:971–987.
16. Li ZH, Ye JY, Shi YQ. Distinguishing computer graphics from photographic images using local binary patterns. In The 11th IWDW, International Workshop on Digital-forensics and Watermarking, 2012.
17. Chen W. Detection of digital image and video forgeries. Ph.D. Dissertation, Dept. of ECE, NJIT, 2008.
18. Weinberger MJ, Seroussi G, Sapiro G. LOCO-I: a low complexity, context-based, lossless image compression algorithm. In Proceedings of Data Compression Conference, DCC '96, 1996; 140–149.
19. http://www.creative-3d.net and http://www.3dlinks.com
20. Friedman J, Hastie T. Additive logistic regression: a statistical view of boosting. The Annals of Statistics 2000; 28(2):337–407.
21. Chang CC, Lin CJ. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2011; 1–39. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm