IMAGE DETAIL ENHANCEMENT USING A DICTIONARY TECHNIQUE

Anustup Choudhury∗

Peter van Beek and Andrew Segall

Department of Computer Science University of Southern California Los Angeles, California [email protected]

Sharp Laboratories of America 5750 NW Pacific Rim Blvd Camas, Washington, USA {pvanbeek, asegall}@sharplabs.com

ABSTRACT

We present a novel approach to detail enhancement using a dictionary-based technique. For each low-resolution input image patch, we seek a sparse representation from an over-complete dictionary and use that to estimate the high-resolution patch. We modify an existing dictionary-based super-resolution method in several ways to achieve enhancement of fine detail without introducing new artifacts. These modifications include adaptive enhancement of reconstructed detail patches based on edge analysis to avoid halo artifacts, and an adaptive regularization term to enable noise suppression while enhancing detail. We compare with state-of-the-art methods and show better results in terms of enhancement with suppression of noise.

Index Terms— Image enhancement, sparse representation

1. INTRODUCTION

Most existing detail enhancement methods are based on the use of low-pass or smoothing operators, such as Gaussian filters or edge-preserving non-linear filters. Very often a hierarchical framework is used, such as a Laplacian pyramid, to decompose an image into a smooth low-frequency component and several high-frequency components. Each level is then enhanced separately and re-combined to form an enhanced image. Fattal et al. [1] recursively applied the bilateral filter [2] to generate multiple components of the image at different levels of detail. Farbman et al. [3] generated a multiscale decomposition using edge-preserving smoothing based on a Weighted Least Squares (WLS) optimization framework. Subr et al. [4] analyzed the local image extrema at multiple spatial scales to build a hierarchy of oscillations. Fattal [5] performed multi-resolution analysis using wavelets constructed according to the edge content of the image. Recently, Paris et al. [6] used a Laplacian pyramid framework with local non-linear operators to achieve edge-aware detail enhancement and tone mapping. Zhang and Allebach [7] addressed both detail enhancement and noise removal with an adaptive bilateral filter, in which the range parameter is locally modified.

In this paper, we propose to use a dictionary-based approach to perform detail enhancement. Dictionary-based methods and sparse modeling have been proposed recently to perform denoising and super-resolution [8, 9, 10]. The dictionary-based super-resolution method is motivated by results in sparse signal processing, which state that a low-dimensional projection of a signal may be used to accurately recover the corresponding high-resolution signal. Yang et al. [9] show how a joint compact dictionary can be trained to learn the correspondence between high-resolution training image patches and the features extracted from corresponding low-resolution patches.

∗The author performed the work while interning at Sharp Labs.

978-1-4673-2533-2/12/$26.00 ©2012 IEEE (ICIP 2012)


Fig. 1. Detail enhancement using dictionary-based method: (a) input image “Flower”; (b) result of proposed detail enhancement method.

The proposed dictionary-based method is a novel application of the work by Yang et al. [9] to detail enhancement. In this paper, we study the potential of dictionary-based methods to reconstruct and enhance texture detail beyond the capabilities of traditional filtering-based approaches. We have modified the original algorithm in several ways relevant to our goal of visually significant enhancement


of fine texture detail: 1) Instead of using high-resolution patches, we learn the dictionary on the residuals between high-resolution patches and low-resolution patches; 2) we propose to enhance the reconstructed detail patches adaptively, using edge analysis to suppress halo effects; 3) we propose to locally vary the regularization term in the optimization function adaptively, to remove noise and compression artifacts in some areas while enhancing detail in other areas; 4) we have investigated alternative feature extraction operators to improve the recovery of texture. We compare our results with several state-of-the-art approaches and demonstrate better enhancement of details in images. Our results show no significant halo artifacts and better noise suppression during enhancement. An example result of the proposed method can be seen in Fig. 1.

In Section 2, we describe the dictionary-based method and our enhancement algorithm. In Section 3, we show visual results using our method and compare with existing approaches. Finally, we conclude in Section 4.

2. DICTIONARY-BASED IMAGE ENHANCEMENT

2.1. Background

Yang et al. [9] and Zeyde et al. [10] developed methods for single-image super-resolution (SR) based on sparse modeling. These methods utilize an overcomplete dictionary D_h ∈ R^{n×K} containing K “atoms” of size n. It is assumed that any patch x ∈ R^n in a high-resolution image can be represented as a sparse linear combination of the atoms of D_h as follows:

    x ≈ D_h α,  with ||α||_0 ≪ K,  α ∈ R^K.    (1)

A patch y in the observed low-resolution image can be represented using a corresponding low-resolution dictionary D_l with the same sparse coefficient vector α. This is ensured by co-training the dictionary D_h with high-resolution patches and the dictionary D_l with corresponding low-resolution patches. The observed low-resolution image is related to the original high-resolution image through a combination of known blur and down-sampling operators. Super-resolution reconstruction proceeds in a patch-by-patch manner and, for each observed patch y, starts by determining the sparse solution vector:

    α* = min_α ||F D_l α − F y||_2^2 + λ ||α||_1.    (2)

A feature extraction operator F is used [9, 10] instead of the raw pixel data, to emphasize the high-frequency components of y. The sparsity (number of non-zero coefficients) of the solution α* is controlled by λ; a lower λ implies more non-zero coefficients (atoms). A high-resolution patch is reconstructed by x = D_h α*. Overlapping patches are averaged, and, finally, a global reconstruction constraint may be enforced [9].

Dictionary training starts by sampling patch pairs from corresponding high- and low-resolution images (preserving the correspondences between spatial locations). High-resolution patches X^h = {x_1, x_2, ..., x_m} are concatenated with low-resolution patch features Y^l = {y_1, y_2, ..., y_m}, and a concatenated dictionary is defined by:

    X_c = [w_h X^h ; w_l Y^l],  D_c = [D_h ; D_l],    (3)

with weights w_h, w_l. Optimized dictionaries are computed by:

    min_{D_h, D_l, Z} ||X_c − D_c Z||_2^2 + λ ||Z||_1  s.t.  ||D_c^i||_2^2 ≤ 1,  i = 1, ..., K.    (4)

The process is performed in an iterative manner, alternating between optimizing Z and D_c using the technique in [11].

2.2. Proposed Detail Enhancement Method

The proposed image detail enhancement method is based largely on the method by Yang et al. [9], with important modifications. An overview of the dictionary training procedure is provided in Fig. 2, while the enhancement procedure itself is illustrated in Fig. 3. Our emphasis in this work is on fine edge and texture detail enhancement without upscaling the image. Instead of the downsampling and blur operators used in prior work [9, 10], we utilized a bilateral filter [2] as a degradation operator to obtain “low-resolution” images from the high-resolution images in the training set. Note that we have inherited the terms “low-resolution” and “high-resolution” from [9]; both sets of images have the same resolution. Since the bilateral filter tends to remove textures while preserving strong edges, the resulting trained dictionaries emphasize enhancement of texture and fine detail, rather than basic edge sharpening (which is more straightforward to achieve with conventional enhancement methods).
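The sparse coding step in Eq. 2 is a standard ℓ1-regularized least-squares (Lasso) problem. The paper does not specify the solver used; the following minimal numpy sketch uses ISTA (iterative soft-thresholding), where `D` stands for the feature-domain dictionary F·D_l and `y` for the feature-domain patch F·y:

```python
import numpy as np

def ista_sparse_code(D, y, lam, n_iter=500):
    """Minimize ||D a - y||_2^2 + lam * ||a||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(D, 2) ** 2            # squared spectral norm bounds the curvature
    t = 1.0 / (2.0 * L)                      # step size for the gradient 2 D^T (D a - y)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - t * (2.0 * D.T @ (D @ a - y))                  # gradient step on the quadratic
        a = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)  # soft-threshold: the l1 prox
    return a
```

As described above, a larger λ drives more coefficients of α* to zero, while a smaller λ admits more atoms into the representation.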


Fig. 2. Dictionary training procedure.

Furthermore, instead of using the high-resolution patches by themselves during training, we use the differences (residuals) between each corresponding high-resolution patch and low-resolution patch. Residual patches for training are denoted by E = {x_1 − y_1, x_2 − y_2, ..., x_m − y_m}. We use the residual patches in order to emphasize the relationship between low-resolution image patches and the edges and texture content within the corresponding high-resolution image patches. Joint dictionary training is performed with Eq. 4, where X_c = [√(N/(N+M)) E ; √(M/(N+M)) Y^l]^T. Here, N and M are the dimensions of the high- and low-resolution image patches in vector form. We found that using residual patches improved the results.

The enhancement process (Fig. 3) starts by applying sparse coding with respect to the low-resolution dictionary D_l on a patch of the input image, using Eq. 2. In order to enhance practical image data suffering from compression artifacts and other noise, one needs to enhance texture detail while suppressing the noise. We found that removal of blocking artifacts and noise can be achieved by locally reducing the number of atoms allowed in the sparse coding stage. On the other hand, representation of fine texture detail requires locally increasing the number of atoms. Hence, to locally adapt the number of non-zero coefficients of the solution vector α*, we adopted a simple technique to adapt λ. We used the standard deviation (σ) of a patch to indicate the local texture content, and empirically adapted λ as follows:

    λ = 0.5 if σ < 10;  λ = 0.1 if 10 ≤ σ ≤ 15;  λ = 0.01 otherwise.
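The rule above translates directly into a small helper (a sketch; `patch` is any array of pixel intensities on a 0–255 scale):

```python
import numpy as np

def adapt_lambda(patch):
    """Choose the regularization weight from local texture (patch std. deviation)."""
    sigma = float(np.std(patch))
    if sigma < 10:        # smooth region: strong sparsity suppresses noise and blocking
        return 0.5
    elif sigma <= 15:     # moderate texture
        return 0.1
    else:                 # strong texture: allow many atoms to represent fine detail
        return 0.01
```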

We have used the following set of 1-D filters (representing the feature extraction operator F) to emphasize high-frequency detail:

    f_1 = [−1, 0, 1],  f_2 = f_1^T,  f_3 = [−1, −2, 1],  f_4 = f_3^T.    (5)
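Applying the four filters of Eq. 5 amounts to correlating the image rows and columns with f_1 and f_3. A minimal numpy sketch (boundary handling here is illustrative, not specified by the paper):

```python
import numpy as np

def extract_features(img):
    """Stack the responses of the four 1-D filters of Eq. 5 (the operator F)."""
    img = np.asarray(img, dtype=float)
    f1 = np.array([-1.0, 0.0, 1.0])   # first-order difference
    f3 = np.array([-1.0, -2.0, 1.0])  # second-order filter used in the paper

    def corr(image, f, axis):
        # correlation = convolution with the reversed filter, along rows or columns
        return np.apply_along_axis(np.convolve, axis, image, f[::-1], 'same')

    return np.stack([corr(img, f1, 1),   # f1: horizontal
                     corr(img, f1, 0),   # f2 = f1^T: vertical
                     corr(img, f3, 1),   # f3: horizontal
                     corr(img, f3, 0)])  # f4 = f3^T: vertical
```

On a horizontal intensity ramp, only the horizontal first-order filter responds in the interior; the vertical one is zero, as expected for gradient-like features.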

Filters f_3 and f_4 are slightly different from those used in [9] and resulted in better enhancement. To further emphasize texture detail, we experimented with several more advanced texture feature extraction methods, including Gabor filters and steerable complex pyramid filters; however, these did not result in any improvement.

Since residuals were used to train D_h, the reconstructed detail patch x = D_h α* must be added to the input patch, as shown in Fig. 3. Prior to this addition step, we multiply the reconstructed detail patch by a scalar factor κ, which we found beneficial for visually enhancing texture detail. However, strong enhancement resulted in undesirable halo effects (undershoot and overshoot) along edges. In order to enhance texture while controlling halo effects, we adapted κ based on edge analysis. In our experiments, we used a Canny edge detector to check whether a patch contains significant edge content. For patches containing strong edges, we used a low value κ_1; otherwise, we used a high value κ_2. We averaged overlapping patches in the output image to obtain the final enhanced image. Unlike Yang et al. [9], we did not enforce a global reconstruction constraint.
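The per-patch detail reconstruction, κ adaptation, and overlap averaging described above can be sketched as follows. This is a simplified illustration: `detail_fn` is a hypothetical stand-in for the sparse-coding step x = D_h α*, and `edge_map` would come from a Canny detector in the actual method; the edge-fraction threshold is an assumed parameter.

```python
import numpy as np

def enhance_image(img, edge_map, detail_fn, patch=5, step=1,
                  k1=1.2, k2=5.0, edge_thresh=0.05):
    """Add kappa-scaled reconstructed detail per patch; average overlapping patches."""
    H, W = img.shape
    out = np.zeros((H, W))
    weight = np.zeros((H, W))
    for i in range(0, H - patch + 1, step):
        for j in range(0, W - patch + 1, step):
            y = img[i:i+patch, j:j+patch]
            # patches with significant edge content get the conservative gain k1
            strong_edge = edge_map[i:i+patch, j:j+patch].mean() > edge_thresh
            kappa = k1 if strong_edge else k2
            out[i:i+patch, j:j+patch] += y + kappa * detail_fn(y)
            weight[i:i+patch, j:j+patch] += 1.0
    return out / np.maximum(weight, 1.0)
```

With step = 1 this reproduces the 4-pixel overlap used in the experiments; increasing the step reduces the overlap and the computational cost.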

3. EXPERIMENTS AND RESULTS

For training dictionaries, we utilized a dataset of 65 images obtained from [12], from which 100000 patches of size 5 × 5 were extracted. These images are the high-resolution images. Low-resolution images were generated by degrading these images with a bilateral filter with σ_range = 20 and σ_domain = 1. Dictionaries were optimized using Eq. 4 with λ = 0.1, for 50 iterations. The resulting dictionaries contained K = 512 atoms. During enhancement, we considered patches of size 5 × 5 with an overlap of 4 pixels horizontally/vertically. We used enhancement values κ_1 = 1.2 and κ_2 = 5 for all results in this paper. On a PC with an Intel Core 2 Quad CPU and 3.25 GB RAM, the training process takes around 3.5 hours, and the enhancement process on a 512 × 512 image takes approximately 15–30 minutes, depending on the value of λ. Processing time can be reduced significantly (by a factor of 5) by reducing the overlap between patches (to 3 pixels), at a slight decrease in performance.

Fig. 3. Proposed image detail enhancement method.

Fig. 4. Results of our dictionary-based detail enhancement method: (a) input image; (b) detail enhancement result; (c) input image; (d) detail enhancement result.

Visual results of the proposed enhancement method are shown in Figs. 1, 4 and 6. In Fig. 5, we show the results of (a) Photoshop’s unsharp mask and (b) a recent technique by Subr et al. [4] on the “Flower” image. These results should be compared to the result of our proposed method shown in Fig. 1. The result in Fig. 5(a) suffers from very distinct halo artifacts that are not visible in our result. The result in Fig. 5(b) suffers from subtle halo artifacts, visible as white regions along the boundaries of the flower and the leaves.

Fig. 5. Results of existing enhancement methods, showing halo effects (cf. Fig. 1): (a) enhancement with Photoshop’s Unsharp Mask; (b) enhancement with [4].

In many practical applications, it is important to be able to control or suppress noise while enhancing detail. In Fig. 6, we compare our detail enhancement method to existing methods [3, 6] on an image containing noise. The results of existing work in (b) and (c) show significant noise amplification. The dictionary-based method handles noise better, while achieving significant detail enhancement, as shown in (d).

Fig. 6. Comparison of detail enhancement in presence of noise: (a) input image; (b) enhancement with [3]; (c) enhancement with [6]; (d) proposed method.

4. CONCLUSION

In this paper, we have proposed a novel method to perform image detail enhancement, based on the use of coupled dictionaries that are jointly trained from low-resolution and high-resolution images. We extended the work by Yang et al. [9] and applied it to enhancement of texture and fine detail without introducing artifacts. Our algorithm is edge-adaptive, resulting in enhancement without significant halo effects. It also uses an adaptive regularization term, providing improved noise control. We compared our results to several existing methods, showing the promise of this approach. Future work may include application to video and reduction of the computational cost.

5. REFERENCES

[1] R. Fattal, M. Agrawala, and S. Rusinkiewicz, “Multiscale shape and detail enhancement from multi-light image collections,” in ACM SIGGRAPH 2007.
[2] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in ICCV ’98, 1998, p. 839.
[3] Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski, “Edge-preserving decompositions for multi-scale tone and detail manipulation,” in ACM SIGGRAPH 2008.
[4] K. Subr, C. Soler, and F. Durand, “Edge-preserving multiscale image decomposition based on local extrema,” in ACM SIGGRAPH Asia 2009.
[5] R. Fattal, “Edge-avoiding wavelets and their applications,” in ACM SIGGRAPH 2009.
[6] S. Paris, S. Hasinoff, and J. Kautz, “Local Laplacian filters: Edge-aware image processing with a Laplacian pyramid,” in ACM SIGGRAPH 2011.
[7] B. Zhang and J. P. Allebach, “Adaptive bilateral filter for sharpness enhancement and noise removal,” IEEE TIP, vol. 17, no. 5, pp. 664–678, May 2008.
[8] M. Elad, M. A. T. Figueiredo, and Y. Ma, “On the role of sparse and redundant representation in image processing,” Proceedings of the IEEE, vol. 98, no. 6, pp. 972–982, June 2010.
[9] J. Yang, J. Wright, T. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE TIP, vol. 19, pp. 2861–2873, November 2010.
[10] R. Zeyde, M. Elad, and M. Protter, “On single image scale-up using sparse representations,” in Curves and Surfaces, 2010.
[11] H. Lee, A. Battle, R. Raina, and A. Y. Ng, “Efficient sparse coding algorithms,” in NIPS, 2007.
[12] A. Olmos and F. Kingdom, “A biologically inspired algorithm for the recovery of shading and reflectance images,” Perception, vol. 33, pp. 1463–1473.