2015 IEEE 12th International Conference on Electronic Measurement & Instruments

ICEMI'2015

Stereo matching using pixel classification and dual-weighted guided filter

Mo Xi, Qiao Liyan, Luo Tiannan, Liu Wang
Department of Automatic Test and Control, Harbin Institute of Technology
92 Xidazhi Street, Nangang District, Harbin, China
Email: moses_mo@126.com, qiaoliyan@163. [email protected]

Abstract

Finding the subjectively correct binocular disparity of a stereopair has long been a challenging issue for researchers. This article presents a coarse-to-fine algorithm that seeks a more reasonable solution to this issue. We focus on disparity computation rather than on disparity refinement and post-processing procedures. A dual-weighted guided filter, i.e. a simplified guided image filter with dual weights, is proposed to boost computational efficiency and reduce matching ambiguity. A normalized cross-correlation based method, perceived as a classification step, searches for reliable matching candidates. The three channels of the colour space are biologically weighted for subjectively rational correspondence. As the comparative experimental results show, the proposed method performs excellently without post-processing. All the algorithms are tested on the Middlebury stereo benchmark.

Keywords

binocular stereo matching, pixel similarity, regional similarity, dual-weighted guided filter

Fig. 1. Performance of simplified guided filter kernel (Left, channel R) and guidance patch (Right). [Panel label: RGB]

I. INTRODUCTION

A. Motivation

In the field of computer vision, stereo matching plays a significant role in finding correct binocular disparities and has been a crucial research subject for decades. Stereo matching techniques can be applied to obstacle detection, target recognition, path finding, etc. However, determining a subjectively reasonable correspondence is challenging: occlusion, depth discontinuities, projective distortion, and excessively patterned, homogeneous or large textureless areas all influence the matching result to varying degrees.

Pixel classification [1] is a procedure of mutual consistency checking and correlation-confidence measurement. It serves as the criterion for judging whether a pixel is a so-called stable pixel. Few works discuss it further, and in most cases pixel classification is treated as a pre-processing-like step. Stefano et al. [2] processed the input image by subtracting the mean intensity of a window centred at each pixel in order to detect textureless regions; implicitly, this not only modified the cost function but also behaved as a mean filter for similarity measurement. We make this more explicit by introducing a pixel similarity classification process (PSCP), which functions analogously to pixel classification. In disparity computation, the pixels of a stereopair are affected by image sampling [3-4]. The PSCP of the proposed method separates uncorrelated bad pixels from fine candidates, providing reliable candidates for correspondence and boosting search efficiency as well.

978-1-4799-7071-1/15/$31.00 ©2015 IEEE


Stereo matching using the guided filter (GF) produces results similar to those of the bilateral image filter, and the GF is one of the fastest edge-preserving filters. An illustration of the proposed filter kernel is shown in Fig. 1. Although the GF method performs well, calculating the GF kernel is computationally demanding [5]. To solve this, we believe that the mean value and variance of a specific window are sufficient for disparity computation, rather than considering all the related windows that contain pixels i and j. Biological weights [6] can be integrated into the PSCP of the proposed method as an application of physiologically based image processing. Experimental results show that the proposed method achieves higher computational efficiency and improvements in edge areas compared with some other methods.

B. Related Works

1) Cost function: Three commonly used primary methods [7-8] for correspondence are the sum of absolute differences (SAD), the sum of squared differences (SSD), and normalized cross-correlation (NCC). SAD and SSD are used to construct the cost volume in both local and global methods. Effective types of cost function (CF) have been designed at the sub-pixel and integral-pixel levels [9-12], e.g. BT, LoG, ZNCC, HMI, etc. One minor flaw of most of these elaborate CFs is their complicated structure, which degrades computational efficiency. The Census [7] filter for CF has gained great popularity; however, in contrast to its reasonable matching results, the computation required for the Hamming distance between coded strings can be tremendous.

2) Normalized cross-correlation: NCC calculates the cross-correlation coefficient (CCC) between a cost slice and the template, thus providing a measure of matching likelihood. A minimal algorithm [13] models this likelihood by a joint probability density of the obtained CCC volumes, which can also be regarded, in its logarithmic form, as a sum of conditional probability density functions of the CCC. NCC-like stereo energy models [14] are also applied in many physiological studies of the binocular disparity of the human eye. A ZNCC [12] CF utilizes mean values (or weighted intensities) to generate results that are reliable under radiometric change.

3) Adaptive support-weight methods: Yoon and Kweon [15] put forward a widely accepted adaptive support-weight (ASW) approach. In terms of results, local methods have been able to match global ones ever since [6], and a large body of work is related to the ASW approach. Improvements of ASW approaches [16] replace the bilateral filter kernel by geodesic-distance weights, by colour weights derived from image colour segmentation, by an O(1) optimized filter kernel, etc. Combined with novel strategies of cost aggregation and correspondence searching, e.g. PatchMatch Stereo and Surface Stereo [17], they achieve better disparity continuity than other methods. The adaptive guided filter [18] uses an alterable radius whose truncated value depends on the image borders, yields fewer outliers, and takes first place among all local methods on the Middlebury benchmark. The physiologically weighted method [6] presents a biological form of ASW, opening another branch of stereo matching. Adaptive normalized cross-correlation [19] with log-chromaticity normalization is robust to illumination variation caused by radiometric change, which indicates that ASW methods can easily cooperate with other methods for matching robustness.

II. ALGORITHM

An outline of the proposed method is shown in Fig. 2.

[Figure 2: block diagram. Recoverable labels: Correspondence image, PSCP, Disparity Candidates, Dual-Weighted Guided image filter, Disparity Map.]

Fig. 2. Outline of proposed approach. The first step is pre-processing for the subsequent steps, including an application of a mean filter, the Rank transformation, and a BT method with a biologically weighted RGB to gray-scale transformation. Secondly, PSCP is applied as a rough matching. The last step is the fine matching procedure, which uses a dual-weighted guided filter and checks each possible disparity.

A. Pre-Processing Procedures

A mean filter is used to compute and store the mean values of all square windows centred at pixels along the epipolar line within a valid range. The mean intensity value of a window W centred at (x, y) in a channel Ch is computed as

$$\mu_k^{Ch}(x,y) = \frac{1}{\omega} \sum_{i \in W(x,y)} I_k^{Ch}(i), \quad \forall Ch \in \{R,G,B\},\ \forall k \in \{1,2\}, \tag{1}$$

where k=1 denotes the reference image and k=2 its correspondence, ω denotes the total number of pixels in window W with radius R, and i is a pixel index. Note that the definitions of all variables are consistent throughout the paper. For each pixel in RGB colour space, its gray-scale version is computed by

$$I' = 0.299 \cdot I_R + 0.587 \cdot I_G + 0.114 \cdot I_B. \tag{2}$$

Here, biological weights replace the commonly used weight of 1/3. The weights in (2) are physiologically designed according to the human eye's ability to distinguish different wavelengths of visible light [6]. The Rank transformation [20], with a short radius r of a square window W_r centred at pixel i, then utilizes I' to generate non-parametric maps:

$$\mathrm{Rank}_k(i \mid i \in I'_k) = \sum_{j \in W_r(i)} R(j), \quad R(j) = \begin{cases} 1, & I'_k(j) < I'_k(i) \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

A radius r shorter than the window radius R of the following fine-matching step is recommended: it saves time in direct calculation, keeps the details of the matching results, and maintains the high efficiency of PSCP.

B. Pixel Similarity Classification Process

PSCP computing is a process of pixel similarity measurement. One straightforward method is to apply a pixel dissimilarity measure such as BT, or another similar method, and set a truncation value to quantify the dissimilarity; however, suitable scales for that truncation value, even empirical ones, are difficult to find. Hence, we quantify it by introducing a two-level model:

$$\mathrm{PSCP}(i,d) = \mathrm{PS}(i,d)^{\alpha} \cdot \mathrm{RS}(i,d)^{\beta}. \tag{4}$$

Here, PSCP(i,d) denotes the composite similarity value of pixel i in the reference image and pixel i+d in the matching image at disparity d. PS(i,d) denotes the similarity of the pixel to its correspondence, and RS(i,d) the similarity of the small regions of radius r around pixel i and pixel i+d; all three quantities range within [0, 1]. The parameters α and β are scalars that decide which level of the model plays the predominant role. To illustrate the impact of the two scalars, take α for instance: if α=0, the influence of PS(i,d) on PSCP is eliminated; if 0 [...] 1 and β [...] 0.5 is a harsh threshold for most PSCP function values, which may be slightly affected by α and β; a TruncS between 0.2 and 0.5 could provide sufficient candidates in many situations.
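As a rough illustration of Sections II.A and II.B, the following Python sketch (using NumPy) implements the window mean of Eq. (1), the gray-scale conversion of Eq. (2), a rank transform, and the two-level model of Eq. (4). It is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the edge-replicating border handling, the rank rule (counting window pixels darker than the centre, as in the standard rank transform [20]), and the rule that a candidate is kept when PSCP is at least TruncS are all assumptions. PS and RS are taken as precomputed arrays in [0, 1].

```python
import numpy as np


def to_gray(rgb):
    """Eq. (2): biologically weighted RGB to gray-scale conversion."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights


def box_mean(channel, radius):
    """Eq. (1): mean over the (2*radius+1)^2 window centred at each pixel.

    Border handling by edge replication is an assumption.
    """
    size = 2 * radius + 1
    h, w = channel.shape
    padded = np.pad(channel.astype(np.float64), radius, mode="edge")
    # Summed-area table with a leading row and column of zeros.
    sat = np.pad(padded.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))
    total = (sat[size:size + h, size:size + w] - sat[:h, size:size + w]
             - sat[size:size + h, :w] + sat[:h, :w])
    return total / size**2


def rank_transform(gray, radius):
    """Count the window pixels that are darker than the centre pixel.

    This is the standard rank transform; the exact rule is an assumption.
    """
    h, w = gray.shape
    padded = np.pad(gray, radius, mode="edge")
    rank = np.zeros((h, w), dtype=np.int32)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            rank += padded[dy:dy + h, dx:dx + w] < gray
    return rank


def pscp(ps, rs, alpha=1.0, beta=0.2):
    """Eq. (4): two-level similarity model, PS**alpha * RS**beta."""
    return ps**alpha * rs**beta


def candidates(pscp_values, trunc_s=0.4):
    """Keep disparities whose PSCP reaches TruncS (assumed direction)."""
    return pscp_values >= trunc_s
```

With α=0 the first factor becomes 1 for every input, so PS no longer influences the result, which matches the behaviour described in the text.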

B. Matching Result

Fig. 4 shows experimental results of PSCP with DGF on Middlebury stereopairs, along with those of several state-of-the-art, elaborately processed algorithms. We computed the weights explicitly and applied the parameter set {r, R, α, β, σ_PS, TruncS} = {5, 11, 1, 0.2, 5, 0.4}. None of the test code is optimized. The comparison covers the final results of the ASW, ESAW and SNCC methods [22]; for more algorithms, readers may refer to the evaluation on the Middlebury benchmark. The proposed method, without post-processing, performs better on some details. The highlight of the proposed method lies in the high computational efficiency of PSCP with DGF. As shown in Table 1, the proposed method sees a runtime bonus of up to 30%. In these tests, PSCP is turned off by setting r=0, TruncS=0, α=0, β=0, and turned on otherwise. However, PSCP is not efficient when working with some simple methods such as SAD and SSD. From Table 1, it can also be inferred that DGF takes nearly twice the runtime of the primitive guided filter method without left/right consistency checking, largely because DGF requires two [...]
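The text does not state how the runtime bonus is computed. The Table 1 entries that can be read unambiguously (for example a GF runtime of 70 s without PSCP and 86 s with PSCP on Tsukuba, reported as -23%) are consistent with a relative runtime saving rounded to the nearest percent. The sketch below encodes that reading; the formula and the function name are an inference from those values, not a definition given by the authors.

```python
def runtime_bonus(t_without, t_with):
    """Relative runtime saving in percent, rounded to the nearest integer.

    Assumed definition: 100 * (t_without - t_with) / t_without.
    A negative value means that enabling PSCP made the run slower.
    """
    return round(100.0 * (t_without - t_with) / t_without)


# Tsukuba with GF: 70 s without PSCP, 86 s with PSCP.
tsukuba_gf_bonus = runtime_bonus(70, 86)
```

Under this reading, a negative bonus on a small image such as Tsukuba would mean that the classification step cost more time than it saved there.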

[Figure 3: image patches and similarity plots. Recoverable panel and axis labels: Reference, Matching pixels, PS, RS, PSCP.]

Fig. 3. Evaluation of PSCP's performance. (a) Correspondence with regional similarity's peak-peak. From left to right: Reference image, Matching image, 31×31 patches of the Reference and the Matching. (b) Pixel similarity estimation along the epipolar line in the matching image of the reference pixel. (c) 31×31 patch similarity estimation along the same epipolar line as (b).

Fig. 4. Matching results on Middlebury stereopairs. (a) Ground truth of left-eye image, from left to right: Tsukuba, Venus, Teddy, Cones. (b) ESAW. (c) SNCC. (d) ASW. (e) Proposed method.

Table 1. Algorithm runtime comparison of GF and DGF with/without PSCP (runtime with r=R=16 pixels)

            GF/s                                  DGF/s
Stereopair  Without PSCP  With PSCP  Bonus (%)    Without PSCP  With PSCP  Bonus (%)
Tsukuba     70            86         -23          135           119        12
Teddy       581           410        29           1154          892        23
Cones       675           492        27           1564          1113       29

Venus: values extracted as 24, 132, 209, 186, 356, II; their column assignment is not recoverable.

IV. CONCLUSION

A pixel similarity classification process with a dual-weighted guided filter has been proposed in this paper. Pixel similarity and regional similarity cooperate to filter out mismatches, which makes the PSCP more reliable for subsequent procedures. The dual-weighted guided filter derives a high-quality disparity map analogous to that of the guided filter, an excellent performer among local matching approaches, and the calculation of the dual-weight kernel is convenient and fast enough. Further work is encouraged on real-time DGF algorithms as well as on a regional similarity function that is more robust to severe radiometric change.

REFERENCES

[1] YANG Q, WANG L, YANG R, et al. Stereo Matching with Color-Weighted Correlation, Hierarchical Belief Propagation, and Occlusion Handling[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(3): 492-504.
[2] STEFANO D L, MARCHIONNI M, MATTOCCIA S. A Fast Area-Based Stereo Matching Algorithm[J]. Image and Vision Computing, 2004, 22: 983-1005.
[3] BIRCHFIELD S, TOMASI C. A Pixel Dissimilarity Measure that is Insensitive to Image Sampling[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(4): 401-406.
[4] XU L, JIA J, KANG S B. Improving Sub-Pixel Correspondence Through Upsampling[J]. Computer Vision and Image Understanding, 2012, 116: 250-261.
[5] HOSNI A, BLEYER M, RHEMANN C, et al. Real-Time Local Stereo Matching Using Guided Image Filtering[C]. in Proc. ICME'11, Barcelona, Spain, 2011: 1-6.
[6] NALPANTIDIS L, GASTERATOS A. Biologically and Psychophysically Inspired Adaptive Support Weights Algorithm for Stereo Correspondence[J]. Robotics and Autonomous Systems, 2010, 58: 457-464.
[7] BROWN M Z, BURSCHKA D, HAGER G D. Advances in Computational Stereo[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(8): 993-1008.
[8] SCHARSTEIN D, SZELISKI R. A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms[J]. International Journal of Computer Vision, 2002, 47(1/2/3): 7-42.
[9] HIRSCHMÜLLER H, SCHARSTEIN D. Evaluation of Stereo Matching Costs on Images with Radiometric Differences[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(9): 1582-1599.
[10] HEO Y S, LEE K M, LEE S U. Simultaneous Depth Reconstruction and Restoration of Noisy Stereo Images Using Non-Local Pixel Distribution[C]. in Proc. CVPR'07, 2007, Minneapolis, USA.
[11] ANTUNES M, BARRETO J P. SymStereo: Stereo Matching Using Induced Symmetry[J]. Int J Comput Vis, 2014, 109: 187-208.
[12] HIRSCHMÜLLER H, SCHARSTEIN D. Evaluation of Cost Functions for Stereo Matching[C]. in Proc. CVPR'07, 2007, Minneapolis, USA.
[13] MITSUDO H. A Minimal Algorithm for Computing the Likelihood of Binocular Correspondence[J]. Japanese Psychological Research, 2012, 54(1): 4-15.
[14] GOUTCHER R, HIBBARD P B. Mechanisms for Similarity Matching in Disparity Measurement[J]. Frontiers in Psychology, 2014, 4: 1-11.
[15] YOON K J, KWEON I S. Locally Adaptive Support-Weight Approach for Visual Correspondence Search[C]. in Proc. CVPR'05, 2005, San Diego, USA.
[16] HOSNI A, BLEYER M, GELAUTZ M. Secrets of Adaptive Support Weight Techniques for Local Stereo Matching[J]. Computer Vision and Image Understanding, 2013, 117: 620-632.
[17] BLEYER M, RHEMANN C, ROTHER C. PatchMatch Stereo: Stereo Matching with Slanted Support Windows[C]. in Proc. BMVC'11, 2011, Dundee, UK, pp. 1-11.
[18] YANG Q, JI P, LI D, et al. Fast Stereo Matching Using Adaptive Guided Filtering[J]. Image and Vision Computing, 2014, 32: 202-211.
[19] HEO Y S, LEE K M, LEE S U. Robust Stereo Matching Using Adaptive Normalized Cross-Correlation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(4): 807-822.
[20] ZABIH R, WOODFILL J. Non-Parametric Local Transforms for Computing Visual Correspondence[C]. in Proc. ECCV'94, 1994, Stockholm, Sweden.
[21] HE K, SUN J, TANG X. Guided Image Filtering[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(6): 1397-1409.
[22] (2012) The Middlebury stereo vision website. [Online]. Available: http://vision.middlebury.edu/stereo

AUTHOR BIOGRAPHY

Mo Xi was born in Lanzhou, China, in 1989. He received B.S. and M.S. degrees in Automation and Physics from Beijing University of Aeronautics and Astronautics, China, in 2007 and 2013, respectively. He is now a PhD candidate in the Department of Automatic Test and Control of Harbin Institute of Technology, China. His research interests include computer vision, image processing and machine learning.

Qiao Liyan received the B.S. and M.S. degrees in Electronic and Communication System from the Harbin Institute of Technology (HIT), Harbin, China, in 1996 and 1998, respectively. He obtained his PhD degree in Electronic Measurement from HIT in 2005. He is a member of the Chinese Institute of Electronics and a member of the Chinese Society for Measurement. His current research interests include machine vision, digital image processing, data acquisition technology, and mass-storage data record technology.
