IEEE SIGNAL PROCESSING LETTERS, VOL. 13, NO. 12, DECEMBER 2006


Binary Image Authentication With Tampering Localization by Embedding Cryptographic Signature and Block Identifier

Huijuan Yang and Alex C. Kot, Fellow, IEEE

Abstract—This letter proposes a novel two-layer blind binary image authentication scheme, in which the first layer is targeted at the overall authentication and the second layer is targeted at identifying the tampering locations. The “flippability” of a pixel is determined by the “connectivity-preserving” transition criterion. The image is partitioned into multiple macro-blocks that are subsequently classified into eight categories. The block identifier is defined adaptively for each class and embedded in those “qualified” and “self-detecting” macro-blocks in order to identify the tampered locations. Experimental results validate the arguments made.

Index Terms—Authentication, binary images, data hiding, tampering localization.

Fig. 1. Patterns that (a) satisfy VH transition, (b) are excluded by IR transition, and (c) are excluded by C transition, excluding patterns that differ by rotation, complement, and mirroring. Pixels in gray represent the “do not care” pixels.

I. INTRODUCTION

With more important documents being digitized and editing made easy by the available software, ensuring the authenticity and integrity of digital documents, as well as detecting tampering and forgery, has become a serious concern. Data hiding for binary image authentication has been a promising approach to alleviate these concerns. Recently, many researchers have proposed techniques for document watermarking and data hiding [1]–[3]. Wu et al. suggested employing a visual distortion table to assess the “flippability” of pixels in 3 × 3 blocks [1]. A shuffling technique is applied to equalize the uneven embedding capacity of the image. The connectivity-preserving pattern-based approach [2] handles the uneven “embeddability” of the image by embedding the watermark adaptively in the “embeddable” blocks. Authentication and tampering detection for grayscale and color images are proposed in [4] and [5]. Few papers in the literature address the problem of tampering localization for binary images. The method that attempts to identify the tampered locations for binary images [6], [7] performs watermarking on each sub-image independently. The uneven embedding capacity of binary images makes locating the tampering a difficult task. In this letter, we propose a two-layer authentication technique for binary images. The overall authentication is achieved in the first layer by hiding the cryptographic signature (CS) of the image. The localization of the tampering is achieved in the second layer by embedding the block identifier (BI) in the

Manuscript received February 14, 2006; revised May 14, 2006. Preliminary results of this paper were presented in [8]. The associate editor coordinating the review of the manuscript and approving it for publication was Dr. Min Wu. The authors are with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798 (e-mail: ehjyang@ntu.edu.sg; [email protected]). Digital Object Identifier 10.1109/LSP.2006.879829

Fig. 2. Macro-blocks and finer blocks.

“qualified” or “self-detecting” macro-blocks (MBs). Specifically, we group multiple overlapping 3 × 3 blocks to form an MB and classify the MBs into “qualified” macro-blocks (QMBs) and “unqualified” macro-blocks (UMBs) based on the number of “flippable” pixels. The QMBs are chained, and the BI, which is used to identify tampering that has occurred both to a QMB and to its neighboring UMBs, is embedded.

II. “FLIPPABILITY” DECISION AND BLOCK SIGNATURE GENERATION

The “flippability” of a pixel is determined by three transition criteria in a 3 × 3 block [2], which are calculated from the center pixel to its eight neighbors: the horizontal and vertical (VH), interior right angle (IR), and sharp corner (C) transitions are calculated before and after flipping the center pixel. If the transition numbers do not change, flipping the pixel does not destroy the connectivity between pixels, and no isolated corner pixel is created. Patterns that satisfy or are excluded by the criteria are shown in Fig. 1.

Each macro-block is divided into nine regions, and each region consists of multiple finer blocks (FBs), as shown in Fig. 2. The block signature introduced in [3] is employed to generate the block signature (BS) for each MB. Each FB is specified by its size and its center location, where the parameters involved are nonnegative integers.

1070-9908/$20.00 © 2006 IEEE

TABLE I CLASSIFICATION OF THE MACRO-BLOCKS

Fig. 3. Block identifier for different classes.

The feature code of each FB, defined in (1), is obtained from the element-wise multiplication of a weight and the FB, where the weight depends on the locations of the pixels in the FB. In generating the feature code for each FB, the “flippable” locations are set to fixed values, e.g., 0s. The mean feature code of each region, defined in (2), is taken over the FBs in that region, counted in both directions of the region. The BS for each region of the MB is obtained by mapping the mean feature code to a binary bit via a lookup table (LUT), that is,

$$\mathrm{BS} = \mathrm{LUT}(\text{mean feature code of the region}) \tag{3}$$
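The block-signature steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature code is assumed to be the sum of the element-wise product of the weights and the FB pixels, the mean is assumed to be rounded before the lookup, and the weight values, the keyed pseudo-random LUT, and all function names are hypothetical.

```python
import random

# Hypothetical 3 x 3 weights; the text only states that the weight
# depends on the pixel location inside the FB.
WEIGHTS = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

def feature_code(fb, flippable, weights=WEIGHTS):
    """Assumed form of the feature code: sum of weight * pixel over the FB,
    with "flippable" locations forced to 0 so that embedding cannot alter it."""
    total = 0
    for r in range(3):
        for c in range(3):
            pixel = 0 if flippable[r][c] else fb[r][c]
            total += weights[r][c] * pixel
    return total

def make_lut(max_code, seed):
    """Keyed pseudo-random lookup table: one bit per possible code value."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(max_code + 1)]

def region_bit(fbs, masks, lut, weights=WEIGHTS):
    """BS bit of one region: LUT applied to the rounded mean feature code."""
    codes = [feature_code(fb, m, weights) for fb, m in zip(fbs, masks)]
    mean_code = round(sum(codes) / len(codes))
    return lut[mean_code]

# The largest possible code is the sum of all weights (all pixels set).
lut = make_lut(sum(sum(row) for row in WEIGHTS), seed=2006)

fb = [[0, 1, 1],
      [0, 1, 1],
      [0, 0, 1]]
mask = [[False, False, False],
        [False, True, False],    # pretend the center pixel is "flippable"
        [False, False, False]]

flipped = [row[:] for row in fb]
flipped[1][1] ^= 1               # embedding may flip a "flippable" pixel

bit_before = region_bit([fb], [mask], lut)
bit_after = region_bit([flipped], [mask], lut)
```

Because the flippable location is zeroed before the weighted sum, `bit_before` and `bit_after` are equal, which is the property that lets the receiver recompute the same BS from the watermarked image.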

In the proposed scheme, the size of the FB is chosen to be 3 × 3, i.e., the overlapping 3 × 3 block is employed.

III. AUTHENTICATION, TAMPERING DETECTION, AND LOCALIZATION

A. Macro-Block Classification

The MBs are classified into eight categories based on the number of “flippable” pixels in each MB, as shown in Table I, where the conditions are combined by the logical “and” operation. The number of “flippable” pixels is chosen as the classification criterion because, once a location is tampered, the “flippability” condition will most probably change, and so will this number. Hence, the class to which the MB belongs will change as well.

B. Distance Computation for Consecutive QMBs

Selected classes of MBs are named the “qualified” macro-blocks (QMBs), which participate in the block chaining process. In computing the distance between two consecutive QMBs, we define the horizontal distance and

Fig. 4. Block diagram of data embedding process.

vertical distance as the distances from the previous QMB to the current QMB in the horizontal and vertical directions, respectively. Two pairs of indexes represent the positions of the current QMB and its previous QMB in the horizontal and vertical directions. The sign of the horizontal distance is defined from the index difference between the current and previous QMBs in the horizontal direction. The horizontal and vertical distances are then calculated case by case from the index differences, using the number of MBs in the horizontal direction and the absolute value of an index difference. The first QMB is handled as a special case. In processing the image in raster scan order, the vertical distance is generally small, whereas the horizontal distance is relatively large. Depending on the sign, the horizontal distance is the effective index difference between the current and previous QMBs in the horizontal direction.

C. Block Identifier Formation

The BI consists of the BS, the horizontal and vertical distances, the sign, and the features of the “unqualified” macro-blocks (UMBs). Given an image, it is first divided into multiple MBs, and the number of “flippable” pixels in each MB is computed. Thereafter, the MBs are classified based on this number, and the BS for each MB is generated. The distances between two consecutive QMBs are computed and represented by fixed-length binary sequences, while for the “self-detecting” MBs, the BS is embedded to detect changes to the block itself. To compute the UMB features, each class of the UMBs is assigned a value, where


Fig. 5. Block diagram of data extraction, authentication, and tampering localization process.

, 1, 2, 3. The total number of MBs in each class of the UMBs is calculated, and these totals form a set. The minimum and maximum of the set and their respective class values are mapped to binary bits to generate the binary features (4 bits are used currently). An LUT can be set up for the mapping for security reasons. In order to prevent the adversary from swapping any two MBs of the UMBs, especially for certain MB classes, the minimum, maximum, median, and mean of the block indexes of the UMBs are also mapped to binary bits, which are subsequently combined with the binary features to generate the UMB features. The BI defined in Fig. 3 is given by

BI = BS together with class-dependent terms, specified separately for five groups of MB classes    (4)

where the terms are joined by the “concatenation” operation. It is easy to see that the BI varies with the changes of the MB types.

D. Data Embedding Process

The data embedding process shown in Fig. 4 is described as follows. Divide the image into multiple MBs and compute the number of “flippable” pixels for each MB. Classify the MBs into eight categories and form the BI. Embed the BI in the “flippable” pixels of each MB by enforcing the odd-even feature of the 3 × 3 block. Embed the CS in those MBs containing more “flippable” pixels than required to embed the BIs. The CS is generated as follows. All the “flippable” locations are set to “0s” to generate an intermediate image. This image is fed into a hash function to generate the hash value; the hash function employed generates a 128-bit hash value. Finally, the CS is generated by encrypting the hash value with the private key of the authorized user.

E. Data Extraction, Authentication, and Tampering Location Detection

The data extraction, authentication, and tampering localization process is shown in Fig. 5. Since the “flippability” of the pixels in each MB does not change in the data hiding process, the same process can be carried out to form the BI for the watermarked image. Identify the current MB as “tampered” if the newly computed BS is different from the extracted BS. Identify the UMBs between two consecutive QMBs as “tampered” if the newly calculated distances, sign, or UMB features are different from those extracted. The tampered area lies between the previous and the current QMBs. Extract the CS from those blocks containing more “flippable” pixels than required to embed the BIs. Decrypt the CS with the public key of the authorized user to obtain the embedded hash value. Perform the same process as in the embedding process to generate the hash value of the watermarked image. Comparison of the two hash values gives the authentication results.

IV. EXPERIMENTAL RESULTS AND DISCUSSIONS

A. Experimental Results

Extensive experiments are carried out to test the efficiency of the tampering detection and localization mechanism. The results are shown in Fig. 6. Four parameter values are fixed in the experiment. Generally, the threshold for a class should be chosen such that the full BI, or the majority of it, for the corresponding class can be embedded. However, if the threshold is chosen to be lower than that required to embed the complete BI for the corresponding class, the actual length of the BI that has been embedded is equal to the number of “flippable” pixels in the MB. Because this number can be determined for each MB in both the embedding and extraction processes, the localization of the tampering is determined by comparing the extracted BI with the recomputed BI over that length. The MB size should be chosen such that a good compromise can be made between the capacity required for embedding the CS and the localization accuracy. The overall authentication results obtained are analogous to those described in [2]. Miss detection of the tampering will most likely occur for those MBs that do not have enough “flippable” pixels to embed a complete BI, or for which the chosen threshold for a class is lower than that required to embed the complete BI. On the other hand, the false alarm will most likely occur in


Fig. 6. Data hiding, tampering detection, and localization results. (a) Original image (496 × 496). (b) Watermarked image with 3193 bits embedded (MB size 33 × 33). (c) Tampered image. Several regions in the image are tampered by erasing, cropping, cutting, and pasting. (d) Tampering localization results. The blocks painted in black are tampered.


those MBs whose neighboring MBs are tampered. The probability of miss detection and the probability of false alarm are used to evaluate the performance of the proposed scheme. In terms of percentage, they are given by the number of miss-detected tamperings and the number of falsely identified tamperings, respectively, each divided by the total number of tamperings. Forty text images are used in the experiments; the tamperings include erasing, cropping, cutting and pasting, and tampering in the uniform regions. Among a total of 731 tamperings, only 12 cannot be correctly localized, and only 11 are falsely identified, which results in a miss-detection probability of about 1.64% and a false-alarm probability of about 1.50%. Our proposed method is compared with that proposed by Kim et al. [6], [7]. The experimental results show that the localization accuracy achieved by our proposed method is about 16 times higher than that of Kim et al.’s method, because our method employs a smaller MB, e.g., 33 × 33, compared with the sub-image of size 128 × 128 employed in Kim et al.’s method. In addition, the visual quality of the watermarked image is not easy to control in Kim et al.’s method because one pixel is forced to flip in each small block of size 8 × 8 in the chosen regions, especially for smooth text images.

B. Discussions and Security Considerations

The proposed scheme is based on the observation that any tampering of a QMB will change its class type or BS. In addition, any tampering between two consecutive QMBs may cause a new QMB to be generated, and hence the distances, the sign, and the UMB features will change. It is worthwhile to note that the BS of the MB is generated by mapping the mean feature code in each region to a binary bit. There are multiple options for assigning the weights in each 3 × 3 FB; which option is chosen can be kept secret and known only to the receiver.
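As a quick arithmetic check of the counts reported in Section IV-A (731 tamperings, 12 not correctly localized, 11 falsely identified), the sketch below computes the two rates. It assumes each rate is the corresponding count divided by the total number of tamperings, expressed in percent; the function and variable names are illustrative only.

```python
def rate_percent(count, total):
    """Share of `count` among `total` tamperings, in percent."""
    return 100.0 * count / total

total_tamperings = 731
miss_detected = 12        # tamperings that could not be correctly localized
falsely_identified = 11   # tamperings that were falsely identified

p_miss = rate_percent(miss_detected, total_tamperings)
p_false = rate_percent(falsely_identified, total_tamperings)

print(f"miss detection: {p_miss:.2f}%  false alarm: {p_false:.2f}%")
# prints "miss detection: 1.64%  false alarm: 1.50%"
```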
Similarly, there are many options for setting up the LUT; hence, there is enough uncertainty that the probability of the adversary figuring out the LUT to compromise the BS is low. Even in the worst case, in which the adversary tries to guess the binary bits in each MB with probability 0.5 each, the total number of attempts needed depends on the number of MBs affected by the tampering. In general, the probability that the tampering will go undetected is considered small. Discussions on the security of the LUT embedding can be found in [9]. To tackle the sensitivity of the proposed scheme to random noise, error correction coding (ECC), e.g., a BCH code, can be used to encode the watermark bits, which consist of the BI and CS for each MB. Of course, the total capacity will drop when ECC is used to gain robustness.

V. CONCLUSIONS

In this letter, we propose a novel blind two-layer data hiding scheme for binary image authentication and tampering localization. The proposed technique of dividing the image into MBs and embedding the BI in each MB is effective in detecting tampering of the watermarked image, in both “qualified” and “unqualified” MBs. The BS, the distance between two “qualified” MBs, and the features of the “unqualified” MBs can track the changes effectively. The proposed two-layer authentication (the first layer is for the overall image authentication, and the second layer is for tampering detection and localization) is effective in detecting any changes, and at the same time, the tampered locations can be identified.

REFERENCES

[1] M. Wu and B. Liu, “Data hiding in binary images for authentication and annotation,” IEEE Trans. Multimedia, vol. 6, no. 4, pp. 528–538, Aug. 2004.
[2] H. Yang and A. C. Kot, “Data hiding for text document image authentication by connectivity-preserving,” in Proc. IEEE ICASSP, Mar. 2005, vol. 2, pp. 505–508.
[3] H. Yang, A. C. Kot, and J. Liu, “Semi-fragile watermarking for text document images authentication,” in Proc. IEEE ISCAS, May 2005, pp. 4002–4005.
[4] P. W. Wong and N. Memon, “Secret and public key image watermarking schemes for image authentication and ownership verification,” IEEE Trans. Image Process., vol. 10, no. 10, pp. 1593–1601, Oct. 2001.
[5] M. U. Celik, G. Sharma, E. Saber, and A. M. Tekalp, “Hierarchical watermarking for secure image authentication with localization,” IEEE Trans. Image Process., vol. 11, no. 6, pp. 585–595, Jun. 2002.
[6] H. Y. Kim and R. L. de Queiroz, “Alteration-locating authentication watermarking for binary images,” in Proc. Int. Workshop Digital Watermarking, 2004, pp. 125–136.
[7] ——, “A public-key authentication watermarking for binary images,” in Proc. IEEE ICIP, Oct. 2004, vol. 5, pp. 3459–3462.
[8] H. Yang and A. C. Kot, “Two-layer binary image authentication with tampering localization,” in Proc. IEEE ICASSP, May 2006, vol. 2, pp. 309–312.
[9] M. Wu, “Joint security and robustness enhancement for quantization based data embedding,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 8, pp. 831–841, Aug. 2003.