INVERSE HALFTONING WITH GROUPING SINGULAR ... - IEEE Xplore

INVERSE HALFTONING WITH GROUPING SINGULAR VALUE DECOMPOSITION

Jun Yang1,2, Jun Guo1, Hongyang Chao3,∗

1 School of Information Science and Technology, Sun Yat-sen University, Guangzhou, P.R. China
2 SYSU-CMU Shunde International Joint Research Institute, P.R. China
3 School of Software, Sun Yat-sen University, Guangzhou, P.R. China

ABSTRACT

The objective of inverse halftoning is to reconstruct a high-quality gray-scale image from a bi-level halftone image. However, reconstructing continuous-tone images from their halftoned versions is highly underdetermined, which makes the task very difficult. In this paper, we present the Grouping Singular Value Decomposition (G-SVD), a novel approach that first groups similar image patches as input and then characterizes lower-dimensional regions in input space where the data density is peaked. By adding a constraint formulated via G-SVD to inverse halftoning, noise is separated from meaningful content and the similarity of nonlocal image patches is promoted. Our experiments show that the proposed approach improves the visual quality of reconstructed results and outperforms the state of the art in terms of both objective and subjective measurements.

Index Terms— inverse halftoning, grouping singular value decomposition (G-SVD), nonlocal similarity

1. INTRODUCTION

Halftoning represents an image with only one color through the use of ink dots, and is thus widely used in today's publishing applications, such as newspapers, books and magazines [1, 2]. Halftone images are typically difficult to manipulate. Many halftone image processing operations, such as scaling, compression and enhancement, can cause severe image degradation [3]. To enable these operations, gray images need to be reconstructed from halftones through inverse halftoning. When scanning printed (halftone) images, inverse halftoning is also needed to reduce Moiré effects. Therefore, as shown in Fig. 1, halftoned images are often inverse halftoned. Considering that inverse halftoning is an operation mapping $\{0, 1\}^{H \times W}$ to $\mathbb{R}^{H \times W}$ (where $H$ and $W$ are the image height and width), it belongs to the class of ill-posed inverse problems and is rather challenging.
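To make the ill-posedness concrete, the following minimal sketch implements Floyd-Steinberg error diffusion (the halftoning model used later in the experiments) and shows two different gray images that collapse to the same halftone. The function name `floyd_steinberg`, the [0, 1] intensity range, the 0.5 threshold and the plain raster scan are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def floyd_steinberg(gray):
    """Halftone a gray image with values in [0, 1] (assumed range) by error diffusion."""
    work = np.asarray(gray, dtype=float).copy()
    rows, cols = work.shape
    out = np.zeros((rows, cols), dtype=np.uint8)
    for r in range(rows):
        for c in range(cols):
            # Quantize the current pixel to 0 or 1 (assumed threshold 0.5).
            out[r, c] = 1 if work[r, c] >= 0.5 else 0
            err = work[r, c] - out[r, c]
            # Diffuse the quantization error to unprocessed neighbours
            # with the classic weights 7/16, 3/16, 5/16 and 1/16.
            if c + 1 < cols:
                work[r, c + 1] += err * 7 / 16
            if r + 1 < rows:
                if c > 0:
                    work[r + 1, c - 1] += err * 3 / 16
                work[r + 1, c] += err * 5 / 16
                if c + 1 < cols:
                    work[r + 1, c + 1] += err * 1 / 16
    return out

# Two different continuous-tone images that produce the same halftone,
# so the halftone alone cannot tell them apart.
a = np.full((4, 4), 0.00)
b = np.full((4, 4), 0.01)
same = np.array_equal(floyd_steinberg(a), floyd_steinberg(b))
print(same)  # True
```

Because many gray images share one halftone, any inverse halftoning method has to rely on additional assumptions about natural images, which is the role of the smoothness and nonlocal similarity priors discussed in this paper.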
This work was partially supported by NSF of China under Grant 61173081 and Guangdong Natural Science Foundation, China, under Grant S2011020001215.
∗ Corresponding author: Hongyang Chao ([email protected])

978-1-4799-8339-1/15/$31.00 ©2015 IEEE

Fig. 1. An illustration of inverse halftoning: original image (left), halftoned (center) and inverse halftoned version (right).

In the last two decades, many inverse halftoning algorithms have been developed to exploit the local smoothness of images and have achieved good results, including tree-structured vector quantization [4], convex set projection [5], iterative statistical smoothing [6], nonlinear permutation filtering [7], anisotropic filtering [8], shearlet representation [9], Bayesian approaches [10, 11], look-up-table (LUT) based approaches [12] and wavelet-based approaches [13, 14, 15]. However, while local smoothness was preserved, these algorithms usually produced blurry results, and thus had shortcomings in preserving edges and textures. To address this problem, the recently proposed BM3D-based approach [16] added a regularization term of nonlocal similarity and achieved state-of-the-art performance. However, one main problem still restricts its performance. The BM3D-based approach relies on a sophisticated deterministic annealing optimization [17], which requires denoising the reconstructed image in each iteration. Hence, its output usually contains visible artifacts.

To resolve this issue, while minimizing residual errors between an input halftoned image and the re-halftoned version of its reconstructed image so as to maintain local smoothness, we plug in a novel constraint expressed via the proposed Grouping Singular Value Decomposition (G-SVD) approach to promote self-similarity of nonlocal image patches. By grouping similar nonlocal image patches and then constraining the minimal nonzero singular value of each group to be large, our G-SVD constraint can capture a low-dimensional manifold from each group. As a result, salient patterns within a group, e.g., the outline of a human face, can be discovered, while at the same time artifacts are well separated, which in turn enhances the similarity of nonlocal patches within this group.
To optimize the whole framework, we also leverage G-SVD to build an iterative projection method that enforces the lower bound on the singular values of the groups. Extensive experiments show the superior performance of our framework compared with the state of the art. In the following, we introduce the whole framework in Section 2 and then present experiments in Section 3. Finally, conclusions are drawn in Section 4.

ICIP 2015

2. THE INVERSE HALFTONING FRAMEWORK

Our inverse halftoning framework tries to preserve local smoothness under the constraint of nonlocal similarity. In this section we first introduce how to exploit unconstrained local smoothness, and then discuss the proposed G-SVD constraint. Finally, an iterative projection method is built upon G-SVD to optimize the whole framework.

2.1. Preserving Local Smoothness

A halftoned image $Y$ is related to its continuous-tone version $X$ by $Y = H(X)$, where $H$ represents a generic halftoning model, such as dot-diffusion dithering or error diffusion. To preserve local smoothness, we look for an approximation $\hat{X}$ which can optimally represent $X$:

$$\min_{\hat{X}} \|Y - H(\hat{X})\|_2^2 \qquad (1)$$

2.2. Constraining Nonlocal Similarity

Apparently, the best outcome of Eq. (1) is an approximation $\hat{X}$ that is exactly the same as the real continuous-tone image $X$. However, this is usually impossible as the optimization is highly nonconvex and ill-posed. What is worse, experiments have shown that Eq. (1) often tends to preserve local-region-wise smoothness and gets stuck in blurry solutions [16]. Hence, to restrict the solution set so as to escape from blurry results and better recover discontinuities, we add a G-SVD constraint to promote nonlocal patch similarity:

$$\min_{\hat{X}} \|Y - H(\hat{X})\|_2^2 \quad \text{s.t.} \quad T(\hat{X}) > \lambda \qquad (2)$$

where $T$ is the calculation part of G-SVD, and $\lambda$ is a threshold. More specifically, $T$ consists of a grouping phase and a decomposition phase. Given a reconstructed image $\hat{X}$, $T$ first performs grouping on similar image patches, then computes the singular values of each group, and finally outputs the minimal nonzero singular value. Usually, components corresponding to the leading singular values represent common patterns, while the others may be corrupted by noise. By constraining $T(\hat{X})$ to be a large value, the importance of salient patterns relative to noise can be increased, resulting in artifact reduction and patch similarity enhancement. In the following we discuss these two phases in detail.

2.2.1. Grouping Phase

Intuitively, visually similar patches usually concentrate near a manifold, while visually different ones are distributed diversely.

Fig. 2. The influence of noise on the singular values of groups. (a) Top: the original image (Ori); bottom: the image corrupted by Gaussian noise with mean 0 and standard deviation 15 (Corr). (b) Relative difference (SV@Corr − SV@Ori)/SV@Ori plotted against the index of descending singular value, where SV@Ori and SV@Corr are the singular values of groups in the two images respectively. The groups $G_1$, $G_2$, $G_3$ correspond to the reference patches $I_1$, $I_2$, $I_3$ respectively.

The grouping phase aims at separating spatially distant manifolds, which eases finding a good linear representation of each manifold in the following decomposition phase. Similar ideas can be found in denoising-related works like [18]. Given an $H \times W$ image, every block of $h \times w$ adjacent pixels is extracted, producing $(H - h + 1) \times (W - w + 1)$ normal patches of $hw$ dimensions. Among them, $N$ reference patches covering the whole image are selected uniformly. For each reference patch $I_i$, the $C$ most similar normal patches centered within the $p \times q$ surrounding pixels are found and ordered according to the Euclidean distance. These $C$ patches are then arranged as an $hw \times C$ matrix, with each column storing one of them in order. In summary, the grouping phase (GROUP) takes an image and the number of groups (i.e., $N$) as input, and outputs $N$ matrices of dimensions $hw \times C$.
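The grouping of one reference patch can be sketched as follows. This is an illustrative reading of the description above rather than the authors' implementation: the function name `group_patches`, the use of top-left corners to index patches, and the way the p × q search window is clipped at the image border are all assumptions.

```python
import numpy as np

def group_patches(image, top, left, h=8, w=8, C=50, p=10, q=10):
    """Return an (h*w) x C matrix holding the C patches most similar to a reference patch.

    (top, left) is the top-left corner of the reference patch. Candidate
    patches are those whose top-left corner lies in a p x q window around it
    (an assumed interpretation of the search region). Near the image border
    the window is clipped, so fewer than C columns may be returned.
    """
    image = np.asarray(image, dtype=float)
    H, W = image.shape
    ref = image[top:top + h, left:left + w].reshape(-1)
    r_lo, r_hi = max(0, top - p // 2), min(H - h, top - p // 2 + p - 1)
    c_lo, c_hi = max(0, left - q // 2), min(W - w, left - q // 2 + q - 1)
    candidates = []
    for r in range(r_lo, r_hi + 1):
        for c in range(c_lo, c_hi + 1):
            patch = image[r:r + h, c:c + w].reshape(-1)
            candidates.append((np.linalg.norm(patch - ref), r, c, patch))
    # Order by Euclidean distance to the reference; ties are broken by position.
    candidates.sort(key=lambda t: t[:3])
    return np.stack([patch for _, _, _, patch in candidates[:C]], axis=1)

rng = np.random.default_rng(0)
image = rng.random((32, 32))
G = group_patches(image, top=12, left=12)
print(G.shape)  # (64, 50)
```

The first column is the reference patch itself (distance zero), and the remaining columns are ordered by increasing distance, matching the column ordering described in the text.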

2.2.2. Decomposition Phase

Let $\{G_1, G_2, \ldots, G_N\}$ be an outcome of GROUP. We apply singular value decomposition (SVD) to each matrix and record its minimal nonzero singular value: $d(G_i) = \min\{s_1, s_2, \ldots\}$, where $s_j$ is a nonzero singular value of $G_i$. The output of the decomposition phase is defined as the minimum of $\{d(G_1), d(G_2), \ldots, d(G_N)\}$. This phase aims to characterize the low-dimensional manifold in each group, so that salient patterns concentrated in the vicinity of a manifold can be separated from noise, which is frequently orthogonal to the manifold. Fig. 2 offers an illustration. After corrupting images with zero-mean Gaussian noise, we can see that the leading singular values usually change little, while the rest suffer from large variations. Hence, by requiring the minimal nonzero singular value to be large, noise can be effectively suppressed.
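A small numerical sketch of this phase is given below. The helper names `min_nonzero_singular_value` and `T`, the tolerance used to decide which singular values count as nonzero, and the synthetic rank-one group are assumptions for illustration; only the noise level (standard deviation 15) and the group size (64 × 50) are taken from the text.

```python
import numpy as np

def min_nonzero_singular_value(G, tol=1e-10):
    """d(G): the smallest singular value of G that exceeds tol (assumed tolerance)."""
    s = np.linalg.svd(G, compute_uv=False)
    nonzero = s[s > tol]
    return float(nonzero.min()) if nonzero.size else 0.0

def T(groups):
    """Output of the decomposition phase: the minimum of d(G_i) over all groups."""
    return min(min_nonzero_singular_value(G) for G in groups)

# Synthetic group of 50 identical 8 x 8 patches (rank one), then the same
# group corrupted by zero-mean Gaussian noise with standard deviation 15.
rng = np.random.default_rng(0)
clean = np.outer(255.0 * rng.random(64), np.ones(50))
noisy = clean + rng.normal(0.0, 15.0, clean.shape)

sv_clean = np.linalg.svd(clean, compute_uv=False)
sv_noisy = np.linalg.svd(noisy, compute_uv=False)

# The leading singular value barely moves, while the trailing ones,
# which were zero for the clean group, are now created purely by noise.
lead_change = abs(sv_noisy[0] - sv_clean[0]) / sv_clean[0]
print(lead_change < 0.05, sv_noisy[1] > 50.0)
```

In this toy setting the noise shows up almost entirely in the trailing singular values, which is the behavior that Fig. 2 reports for real image groups and that the constraint on $T$ is designed to exploit.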

Fig. 3. The pipeline of a projection. Block labels in the figure: Input, Grouping, Singular Value Decomposition, Thresholding, Singular Value Composition, Averaging, Output. $\hat{I}_i$ and $\hat{G}_i$ are the updated versions of $I_i$ and $G_i$ respectively.

2.3. Iterative Projection

Without constraints, the residual errors in Eq. (2) can be optimized by iteratively perturbing $\hat{X}$ around the decision boundary if $H(\hat{X})$ is inconsistent with $Y$ [16]. We call such an iteration a coding step. To deal with the added constraint, we insert an approximation update step after each coding step, to ensure that the G-SVD constraint is satisfied after projection. Given an approximation $\hat{X}$, it is projected via the following procedure:

1. Apply grouping on the $N$ reference patches of $\hat{X}$ to generate a set of matrices $\{G_1, G_2, \ldots, G_N\}$.
2. For each matrix $G_i$, calculate a singular value decomposition $G_i = U_i \Sigma_i V_i^T$.
3. Threshold the singular values by keeping those with magnitudes larger than $\lambda$:

$$[\mathrm{THRESH}(\Sigma_i, \lambda)]_j = \begin{cases} \sigma_j, & \text{if } \sigma_j > \lambda \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

   where $\sigma_j$ is the $j$-th largest singular value in $\Sigma_i$.
4. Update $G_i$ as $U_i \, \mathrm{THRESH}(\Sigma_i, \lambda) \, V_i^T$.
5. Update $\hat{X}$ by putting back the updated patches. Overlapping pixels are averaged (we denote this operation as BACK).

Obviously the above procedure (also shown in Fig. 3) is an extension of G-SVD. Such projections continue iteratively until the stopping criterion is met. The whole optimization process is summarized in Alg. 1.

Algorithm 1 Inverse Halftoning with G-SVD
Input: Y, λ, N, nIter
Output: X̂
  iIter = 1
  repeat
    /* Calculate an approximation */
    X̂ = arg min_X̄ ‖Y − H(X̄)‖₂²
    /* G-SVD-based projection */
    i = 1
    {G1, G2, ..., GN} = GROUP(X̂, N)
    repeat
      Ui, Σi, Vi = SVD(Gi)
      Σi = THRESH(Σi, λ)
      Gi = Ui ∗ Σi ∗ Vi^T
      i = i + 1
    until i > N
    X̂ = BACK({G1, G2, ..., GN})
    iIter = iIter + 1
  until iIter > nIter

3. EXPERIMENTS

This section starts by evaluating the influence of the threshold value (i.e., λ in the G-SVD constraint), followed by an investigation of the iteration number. Finally, we compare our algorithm with state-of-the-art inverse halftoning algorithms.

3.1. Data and Implementation

[Data] Following the standard protocol [15, 16], eight images, namely Lena, Barbara, Peppers, Boat, House, Baboon, Hill and Man, are used as test images, with their sizes being scaled to H × W = 512 × 512. Four randomly selected Kodak images 1, namely Kodim05, Kodim07, Kodim15 and Kodim19, are converted to gray scale and used, with their sizes being cropped centrally to H × W = 512 × 512. The corresponding halftoned images are obtained by the standard Floyd-Steinberg error diffusion [19].

[Implementation] For our framework, the patch size is set to (H/64) × (W/64), i.e., 8 × 8. C = 50 similar patches centered within the p × q = 10 × 10 surrounding pixels are found for each of the N = 127 × 127 reference patches, with 4 pixels overlapped (p, q and N can be larger at the cost of running time, but we did not observe significant improvement). The initial input of the optimization is simply set to a smoothed version of the input halftoned image.

3.2. Threshold

We discuss the performance of various thresholds on Lena (illustrated in Fig. 5(a)). The PSNRs after 80 iterations are not reported as the improvements were marginal. We observe that increasing the threshold tended to yield better performance and faster convergence, though too large a threshold led to much information loss and resulted in worse PSNR. This result confirms the effectiveness of our G-SVD constraint: by setting a suitable lower bound for the minimal nonzero singular value of patch groups, nonlocal similarity could be better maintained, thus producing superior results. The remaining experiments use the same threshold λ = 55.8 for all test images. Interestingly, performance is hurt (not shown due to space limitations) when the hard threshold is replaced by keeping only a specified number of singular values. The reason remains to be studied.

1 http://r0k.us/graphics/kodak/
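The hard threshold of Eq. (3), whose value λ is studied in this subsection, and the BACK averaging of Section 2.3 can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the names `hard_threshold_group` and `back`, the representation of patches by their top-left corners, and the choice to leave uncovered pixels unchanged are all hypothetical.

```python
import numpy as np

def hard_threshold_group(G, lam):
    """Steps 2-4 of the projection: zero every singular value of G not larger than lam."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    s = np.where(s > lam, s, 0.0)
    # Scale the columns of U by the thresholded singular values, then recompose.
    return (U * s) @ Vt

def back(image, patches, corners, h, w):
    """Step 5 (BACK): paste patches at their top-left corners and average overlaps.

    Pixels not covered by any patch keep their value from `image`
    (an assumption; the text does not discuss uncovered pixels).
    """
    acc = np.zeros(image.shape)
    cnt = np.zeros(image.shape)
    for patch, (r, c) in zip(patches, corners):
        acc[r:r + h, c:c + w] += np.reshape(patch, (h, w))
        cnt[r:r + h, c:c + w] += 1
    return np.where(cnt > 0, acc / np.maximum(cnt, 1), image)

# A group whose singular values are 5, 3 and 1; with lam = 2 only 5 and 3 survive.
G = np.diag([5.0, 3.0, 1.0])
print(hard_threshold_group(G, 2.0))

# Two overlapping 2 x 2 patches on a 2 x 3 canvas; the shared column is averaged.
canvas = back(np.zeros((2, 3)), [np.ones(4), 3 * np.ones(4)], [(0, 0), (0, 1)], 2, 2)
print(canvas)  # [[1. 2. 3.] [1. 2. 3.]]
```

Keeping a fixed number of leading singular values, the alternative mentioned above, would replace the `np.where` line with a rank cutoff; the text reports that this variant performed worse.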

Fig. 4. From left to right in the first row: 200 × 200 portion of Lena, halftoned, wavelet-based [15], TV-based, BM3D-based [16] and our results. The second and bottom rows are for Barbara and Peppers respectively.

Table 1. The PSNRs (dB) / SSIMs of all frameworks

Images    WInHD            TV-Based         BM3D-based       Proposed
Lena      31.94 / 0.8640   30.67 / 0.8410   32.90 / 0.8718   33.27 / 0.8844
Barbara   25.72 / 0.7619   24.45 / 0.7047   28.11 / 0.8441   30.27 / 0.8942
Peppers   31.73 / 0.8314   31.03 / 0.8241   32.64 / 0.8433   33.15 / 0.8537
Boat      29.19 / 0.7777   27.95 / 0.7437   29.99 / 0.8088   30.19 / 0.8218
House     35.54 / 0.9063   34.47 / 0.8997   37.29 / 0.9148   37.34 / 0.9155
Baboon    23.04 / 0.6871   21.60 / 0.5663   23.36 / 0.7138   23.38 / 0.7250
Hill      29.38 / 0.7523   28.63 / 0.7274   30.01 / 0.7860   30.20 / 0.7961
Man       29.45 / 0.7999   28.64 / 0.7710   30.20 / 0.8249   30.30 / 0.8350
Kodim05   25.12 / 0.7805   24.20 / 0.7240   25.99 / 0.8110   26.13 / 0.8278
Kodim07   30.49 / 0.8928   29.55 / 0.8691   31.90 / 0.9094   32.09 / 0.9249
Kodim15   28.77 / 0.7262   29.07 / 0.7352   30.17 / 0.7688   30.53 / 0.8088
Kodim19   27.19 / 0.8167   25.58 / 0.7734   28.84 / 0.8441   29.54 / 0.8603
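For reference, the PSNR figures in Table 1 can be reproduced from a ground-truth image and its reconstruction with the standard definition sketched below. The paper does not state its exact formula, so the peak value of 255 (8-bit gray images) and the function name `psnr` are assumptions; SSIM is not covered by this sketch.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB, assuming a peak intensity of 255."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

# A constant error of 25.5 gray levels (10% of the peak) gives 20 dB.
reference = np.zeros((4, 4))
estimate = np.full((4, 4), 25.5)
value = psnr(reference, estimate)
```

On this scale, the 2.16 dB gap between the proposed method and the BM3D-based approach on Barbara corresponds to roughly a 39% reduction in mean squared error.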

Fig. 5. (a) The PSNRs of multiple thresholds on Lena (curves for λ = 0, 24.8, 37.2, 43.4, 49.6, 52.7, 55.8, 58.9, 62.0, 68.2, 74.4 and 86.8). (b) The PSNRs on all test images, with respect to iteration number. Both panels plot PSNR (dB) against iteration number.

3.3. Iteration Number

Here we demonstrate the effect of the iteration number. Fig. 5(b) shows the results of our inverse halftoning framework on all the test images for varying numbers of iterations. We can see that the PSNRs increased steadily at first and then gradually saturated. This behavior supports the soundness of the proposed G-SVD-based iterative projection method: as the number of iterations grows, our method gradually improves the quality of inverse halftoning instead of harming the performance.

3.4. Comparisons with State of the Art

Now we compare our inverse halftoning framework with three benchmarks: wavelet-based (WInHD) [15], total-variation-based (TV) [8] and BM3D-based inverse halftoning [16]. As shown in Table 1, our framework achieves the best performance for all test images, in terms of both PSNR and SSIM. More precisely, we obtain a PSNR of more than 30 dB on Barbara (30.27 dB versus 28.11 dB for the BM3D-based approach), substantially outperforming the state of the art. Besides, for the test images Lena and Peppers, our framework achieves 33.27 dB and 33.15 dB respectively. To the best of our knowledge, this is the first time in the open literature that inverse halftoning of Lena or Peppers achieves a PSNR larger than 33 dB. Fig. 4 offers some subjective quality comparisons. Due to space limitations, we only show results for the first three test images. We can see that the texture of the scarf in Barbara was reconstructed with very high fidelity.

4. CONCLUSIONS

In this paper, we studied how to build an effective framework for inverse halftoning. To avoid blurry solutions, we proposed a G-SVD constraint to exploit the similarity of nonlocal image patches. Our G-SVD constraint may also benefit related image restoration tasks. Currently our method takes more computation time than existing methods. A simple remedy is to reduce the number of iterations or reference patches, at the cost of a slight performance drop. We will study this problem in future work.

5. REFERENCES

[1] Helmut Kipphan, Handbook of print media: technologies and production methods, Springer, 2001.
[2] Sung-Jin Kang, Hyun-Chul Do, Byung-Gwon Cho, Sung-Il Chien, and Heung-Sik Tae, “Improvement of low gray-level linearity using perceived luminance of human visual system in pdp-tv,” Consumer Electronics, IEEE Transactions on, vol. 51, no. 1, pp. 204–209, 2005.
[3] Thomas D Kite, Niranjan Damera-Venkata, Brian L Evans, and Alan C Bovik, “A fast, high-quality inverse halftoning algorithm for error diffused halftones,” Image Processing, IEEE Transactions on, vol. 9, no. 9, pp. 1583–1592, 2000.
[4] Ming Yuan Ting and Eve A Riskin, “Error-diffused image compression using a binary-to-gray-scale decoder and predictive pruned tree-structured vector quantization,” Image Processing, IEEE Transactions on, vol. 3, no. 6, pp. 854–858, 1994.
[5] Søren Hein and Avideh Zakhor, “Halftone to continuous-tone conversion of error-diffusion coded images,” in Sigma Delta Modulators, pp. 133–154. Springer, 1993.
[6] Ping Wah Wong, “Inverse halftoning and kernel estimation for error diffusion,” Image Processing, IEEE Transactions on, vol. 4, no. 4, pp. 486–498, 1995.
[7] Yeong-Taeg Kim, Gonzalo R Arce, and Nikolai Grabowski, “Inverse halftoning using binary permutation filters,” Image Processing, IEEE Transactions on, vol. 4, no. 9, pp. 1296–1311, 1995.
[8] Thomas D Kite, Brian L Evans, and Alan C Bovik, “Modeling and quality assessment of halftoning by error diffusion,” Image Processing, IEEE Transactions on, vol. 9, no. 5, pp. 909–922, 2000.
[9] Glenn R Easley, Vishal M Patel, and Dennis M Healy Jr, “Inverse halftoning using a shearlet representation,” in SPIE Optical Engineering + Applications. International Society for Optics and Photonics, 2009, pp. 74460C–74460C.
[10] Robert L Stevenson, “Inverse halftoning via map estimation,” Image Processing, IEEE Transactions on, vol. 6, no. 4, pp. 574–583, 1997.
[11] Y-F Liu, J-M Guo, and Jiann-Der Lee, “Inverse halftoning based on the bayesian theorem,” Image Processing, IEEE Transactions on, vol. 20, no. 4, pp. 1077–1084, 2011.
[12] J Guo, Y Liu, J Chang, and J Lee, “Efficient halftoning based on multiple look-up tables,” 2013.
[13] Jiebo Luo, Ricardo de Queiroz, and Zhigang Fan, “A robust technique for image descreening based on the wavelet transform,” Signal Processing, IEEE Transactions on, vol. 46, no. 4, pp. 1179–1184, 1998.
[14] Zixiang Xiong, Michael T Orchard, and Kannan Ramchandran, “Inverse halftoning using wavelets,” Image Processing, IEEE Transactions on, vol. 8, no. 10, pp. 1479–1483, 1999.
[15] R. Neelamani, R. Nowak, and R. Baraniuk, “Winhd: Wavelet-based inverse halftoning via deconvolution,” Rejecta Mathematica, vol. 1, no. 6, pp. 84–103, 2009.
[16] Xin Li, “Inverse halftoning with nonlocal regularization,” in Image Processing (ICIP), 2011 18th IEEE International Conference on. IEEE, 2011, pp. 1717–1720.
[17] Kenneth Rose, “Deterministic annealing for clustering, compression, classification, regression, and related optimization problems,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2210–2239, 1998.
[18] Ajit Rajwade, Anand Rangarajan, and Arunava Banerjee, “Image denoising using the higher order singular value decomposition,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 35, no. 4, pp. 849–862, 2013.
[19] Robert W Floyd, “An adaptive algorithm for spatial gray-scale,” in Proc. Soc. Inf. Disp., 1976, vol. 17, pp. 75–77.