A Novel Rood-Diamond Search Algorithm for Fast Block Motion Estimation

Chun-Ho Cheung and Lai-Man Po
CityU Image Processing Laboratory, Department of Electronic Engineering, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China.
Email: [email protected] and [email protected]

ABSTRACT: Search patterns and the center-biased characteristics of the motion vector distribution (MVD) have a large impact on both the searching speed and the quality of block motion estimation. In this paper, we propose a novel algorithm that uses a rood-shaped search pattern as the initial step and large/small diamond search patterns as the subsequent steps for fast block motion estimation (BME). The rood-shaped pattern is designed to fit the rood-center-biased MVD characteristics of real-world sequences by evaluating the 9 candidates of relatively high probability arranged in a rood shape at the center of the search grid. The proposed rood-diamond search algorithm (RDS) employs a halfway-stop technique and can find small motion vectors with fewer search points than the diamond search algorithm (DS) while maintaining similar or even better quality. The speed improvement of RDS over DS can be up to 40%. Simulations show that RDS is much more robust and provides faster searching and smaller distortion than other fast algorithms.

1. INTRODUCTION

Block-matching motion estimation is a vital process in many motion-compensated video coding standards, in which temporal redundancy between successive frames is efficiently removed. It divides frames into equally sized rectangular blocks and, within a search window (w), finds the displacement of the best-matched block in the previous frame as the motion vector of the block in the current frame. However, motion estimation can be very computationally intensive and can consume up to 80% of the computational power of the encoder if all possible candidate blocks are evaluated exhaustively. In the last two decades, many fast block-matching algorithms (BMA) have been proposed to alleviate the heavy computation of the brute-force full search algorithm (FS), such as the three-step search (3SS) [1], the new three-step search (N3SS) [2], the four-step search (4SS) [3], the block-based gradient descent search (BBGDS) [4] and the diamond search (DS) [5, 6]. In 3SS, N3SS, 4SS and BBGDS, rectangular search patterns of different sizes are employed.

Owing to the center-biased characteristics of the global minimum motion vector distribution, more than 80% of the blocks can be regarded as stationary or quasi-stationary, and most of the motion vectors are enclosed in the central 5×5 area for w = ±7 (as depicted in Fig.1). Based on this phenomenon in real-world sequences, N3SS adds 8 extra neighboring candidates to the first step of 3SS and employs a halfway-stop technique to achieve significant speedup on sequences with (quasi-) stationary blocks, while 4SS and BBGDS simply use smaller square patterns to fit the center-biased motion vector distribution. Among these algorithms, DS employs a diamond-shaped pattern and requires fewer search points than N3SS and 4SS with similar distortion performance. Basically, DS performs block matching just like 4SS: it rotates the square-shaped search pattern by 45° to form a diamond-shaped one and keeps its size unchanged throughout the search until the minimum block distortion measure (BDM) point falls at the center of the diamond. The faster searching speed of DS can be attributed to (1) the diamond-shaped pattern, which approximates an ideal circle-shaped coverage so that all possible directions of the motion vector under investigation are considered, and (2) fewer checking points in the final converging step (only 4 instead of the 8 needed by square-shaped pattern BMA such as N3SS and 4SS).

Without exception, all conventional fast BMA are based on the assumption that the error surface of the BDM is convex and uni-modal [7]: the BDM of the matching blocks decreases monotonically towards the global minimum distortion. To reduce the risk of being trapped by local minima, DS allows an unrestricted number of steps, instead of step-size convergence, while advancing to the next optimal point of the search pattern.

In this paper, we propose a novel fast BMA called the rood-diamond search algorithm (RDS), which introduces a rood-shaped search pattern, instead of the diamond-shaped one, as the initial step of the DS algorithm. Experimental results show that it uses fewer search points than other fast BMA while maintaining similar or even smaller distortion error. The rest of the paper is organized as follows. Section 2 analyses the motion vector distribution characteristics. The RDS algorithm is described in Section 3. Section 4 reports the experimental results and conclusions are given in Section 5.

0-7803-7402-9/02/$17.00 ©2002 IEEE

2. ROOD-CENTER-BIASED MOTION VECTOR DISTRIBUTIONS

For an in-depth analysis of the motion vector probability (MVP) distributions, 9 well-known sequences with different motion content, listed in Table I, are exhaustively simulated using FS with spiral block-matching order and the mean absolute distortion (MAD) as the BDM. Videoconferencing sequences, such as “Miss America”, “Salesman” and “Claire”, consist of gentle, smooth and low motion content with a (quasi-) stationary background. Fast motion with camera zooming can be found in the sequence “Tennis”, while camera panning with translation and vigorous motion content can be found in the sequences “Garden” and “Football”, respectively. As the latter 3 sequences involve higher motion content, both SIF and CCIR601 formats are included. Due to space limitations, only the average MVP distribution of the six SIF/CIF sequences with w = ±7 is presented; it is cumulated at the corresponding positions of one quarter of the search window and tabulated in Tables II, III and IV.

Format                                                     Sequence
CIF (352×288, 80 frames)                                   Miss America, Sales, Claire
SIF (352×240, 80 frames) / CCIR601 (720×486, 40 frames)    Tennis, Garden, Football

Table I. Image sequences used for analysis.

A square-shaped search grid, with its origin at the central zero motion position/vector (ZMP/ZMV), is usually employed as a “square-shaped region” with radius¹ p (p ∈ w) that covers the search range from negative to positive search points in both the horizontal and vertical directions.

¹ In the discrete rectangular search grid, “radius” usually refers to a displacement of the same number of pels (checking points) in different directions away from the central zero motion vector position.

Thus, the square occurrence (Probability(S) or P(S)) refers to the sum of the MVP of the search points located on the square-shaped region p pels away from the ZMP. In addition, the cumulated square occurrence (P(□)) refers to the total MVP of the central (2p+1)×(2p+1) area, i.e. the square-center-biased (SCB) occurrence. Similarly, the diamond occurrence (P(D)) is the sum of the MVP located on the diamond-shaped region whose orthogonal vertices are p pels from the ZMP. The cumulated diamond occurrence (P(◇)) then refers to the total MVP of the central diamond-shaped (2p+1)×(2p+1) area, i.e. the diamond-center-biased (DCB) occurrence. On the other hand, the rood occurrence (P(R)) refers to the sum of the 4 MVP located on a rood-shaped region p pels orthogonally away from the ZMP. A rood-shaped region with a radius of 2 pels (|p| = 2) is shown in Fig.2(a). The cumulated rood occurrence (P(+)) then refers to the total MVP along the central rood-shaped pattern with p-pel wings from the ZMP, i.e. the rood-center-biased (RCB) occurrence.
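As an illustration of the definitions above, the following Python sketch sums the occurrences over the central part of the averaged distribution. It is only a sketch under stated assumptions: the grid holds the |p| ≤ 2 entries of Table II, taken here with rows as vertical and columns as horizontal distance, and the function names are illustrative rather than taken from the paper.

```python
# Quarter-window MVP grid (%) for |p| <= 2, transcribed from Table II.
# Assumed layout: Q[v][h], v = vertical and h = horizontal distance in pels.
Q = [
    [45.44, 7.94, 1.30],
    [15.71, 2.76, 1.41],
    [4.37, 2.04, 0.84],
]

def on_square(v, h, p):
    """Cell lies on the square-shaped region of radius p."""
    return max(v, h) == p

def on_diamond(v, h, p):
    """Cell lies on the diamond-shaped region with vertices p pels away."""
    return v + h == p

def on_rood(v, h, p):
    """Cell lies on a horizontal or vertical wing, p pels from the center."""
    return min(v, h) == 0 and max(v, h) == p

def occurrence(grid, p, on_region):
    """P(S), P(D) or P(R): sum of the MVP on the region of radius p."""
    n = len(grid)
    return sum(grid[v][h] for v in range(n) for h in range(n)
               if on_region(v, h, p))

def cumulated(grid, p, on_region):
    """Cumulated occurrence: total MVP of all radii from 0 up to p."""
    return sum(occurrence(grid, r, on_region) for r in range(p + 1))

if __name__ == "__main__":
    for p in range(3):
        print(p,
              round(occurrence(Q, p, on_square), 2),
              round(occurrence(Q, p, on_diamond), 2),
              round(occurrence(Q, p, on_rood), 2),
              round(cumulated(Q, p, on_rood), 2))
```

With these values the sketch gives P(S) = 26.41 and P(R) = 23.65 at a radius of 1 pel, in line with Table III.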

Total probabilities (%) at the corresponding checking point within the search window

                Horizontal |p|
Vertical |p|      0       1       2       3       4       5       6       7
0             45.44    7.94    1.30    0.73    0.46    0.18    0.70    0.26
1             15.71    2.76    1.41    0.43    0.20    0.10    0.07    0.12
2              4.37    2.04    0.84    0.22    0.14    0.05    0.10    0.07
3              3.75    1.37    0.92    0.34    0.12    0.19    0.10    0.19
4              0.86    0.90    0.30    0.16    0.13    0.06    0.17    0.08
5              0.68    0.75    0.11    0.05    0.04    0.03    0.03    0.03
6              0.40    0.30    0.11    0.07    0.06    0.05    0.09    0.06
7              0.71    0.18    0.21    0.17    0.12    0.17    0.08    0.23

Table II. Average distribution measured at absolute distance |p| from the center of the search grid using 6 CIF/SIF image sequences for search window ±7.

In Table III, about 81.80% of the motion vectors are located in the central 5×5 area, i.e. p = ±2 pels. Moreover, 77.52% and 74.76% of the motion vectors fall within the DCB and RCB distributions, respectively. Thus, motion in most real-world sequences is gentle, smooth and slowly varying, and most blocks can be regarded as stationary or quasi-stationary. Fig.1 shows the three biased characteristics of the distribution with a radius of 2 pels using w = ±7.

Radius |p|    P(S)     P(□)      P(D)     P(◇)     P(R)     P(+)
0            45.44    45.44     45.44    45.44    45.44    45.44
1            26.41    71.85     23.65    69.09    23.65    69.09
2             9.95    81.80      8.43    77.52     5.67    74.76
3             7.76    89.56      7.93    85.45     4.48    79.24
4             3.27    92.83      3.95    89.40     1.31    80.55
5             2.26    95.09      3.10    92.50     0.86    81.41
6             2.23    97.32      2.74    95.24     1.10    82.51
7             2.68   100.00      1.79    97.03     0.97    83.48

Table III. Probabilities (%) and cumulated probabilities (%) of different patterns at different radius (pel).

As indicated in Tables II and III, four dominant features of the MVP distribution can be concluded: (1) the distribution of global optima is SCB within p = ±2 pels and is concentrated especially at the ZMV (0, 0); (2) the MVP usually decreases away from the ZMP; (3) the probabilities of optimal motion vectors along the vertical and horizontal directions are often much higher than at other locations with the same radius, which is regarded as the RCB MVP distribution. For example, about 15.71% and 7.94% of the motion vectors are found in the vertical and horizontal directions, respectively, at a radius of 1 pel from the ZMP. These are much higher than the diagonal positions, which together contribute about 2.76% at the same radius; and (4) following from (2), the RCB/DCB/SCB MVP distributions drop gradually away from the ZMP.

Radius |p|    P(◇|□)    P(+|□)    P(+|◇)
0             100.00    100.00    100.00
1              96.16     96.16    100.00
2              94.76     91.39     96.44
3              95.40     88.47     92.74
4              96.30     86.77     90.11
5              97.28     85.62     88.01
6              97.86     84.78     86.64
7              97.03     83.48     86.04

Table IV. Conditional probabilities (%) between different patterns at different radius (pel).

[Figure: search window from -7 to +7 showing the square-center-biased, diamond-center-biased and rood-center-biased portions at a radius of 2 pels.]

Fig.1 Over 96% of the motion vector distribution possesses RCB characteristics in the central 5×5 DCB area.

In this section, three conditional probabilities between the different biased behaviors are used to find the most dominant characteristics:

$$P(\Diamond \mid \Box) = \mathrm{Probability}(\Diamond) / \mathrm{Probability}(\Box), \tag{1}$$

$$P(+ \mid \Box) = \mathrm{Probability}(+) / \mathrm{Probability}(\Box), \tag{2}$$

$$P(+ \mid \Diamond) = \mathrm{Probability}(+) / \mathrm{Probability}(\Diamond). \tag{3}$$
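The three ratios in Eqns.(1)-(3) can be evaluated directly from the cumulated occurrences. The sketch below is a minimal illustration, assuming the cumulated values of Table III as input; the variable and function names are ours, not the paper's. The results agree with Table IV to within rounding of the last digit.

```python
# Cumulated occurrences (%) for radius |p| = 0..7, transcribed from Table III.
P_SQUARE = [45.44, 71.85, 81.80, 89.56, 92.83, 95.09, 97.32, 100.00]
P_DIAMOND = [45.44, 69.09, 77.52, 85.45, 89.40, 92.50, 95.24, 97.03]
P_ROOD = [45.44, 69.09, 74.76, 79.24, 80.55, 81.41, 82.51, 83.48]

def conditional(inner, outer):
    """Ratio (in %) of the inner-pattern occurrence to the outer-pattern one."""
    return [100.0 * a / b for a, b in zip(inner, outer)]

diamond_given_square = conditional(P_DIAMOND, P_SQUARE)  # Eqn.(1)
rood_given_square = conditional(P_ROOD, P_SQUARE)        # Eqn.(2)
rood_given_diamond = conditional(P_ROOD, P_DIAMOND)      # Eqn.(3)

if __name__ == "__main__":
    for p in range(8):
        print(f"{p}  {diamond_given_square[p]:6.2f}  "
              f"{rood_given_square[p]:6.2f}  {rood_given_diamond[p]:6.2f}")
```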

Based on P(◇|□), as shown in Table IV, the MVP distribution possesses 94.76% DCB characteristics in the 5×5 SCB area and stays steadily above 90% away from the ZMP. Thus, DS provides promising quality and searching speed, as described in [5, 6]. Besides, if we further investigate the relationships between the three conditional probabilities away from the ZMP, we obtain the following inequalities around the critical radius of |p| = 2 pels:

$$P(+ \mid \Diamond) \ge P(\Diamond \mid \Box) > P(+ \mid \Box), \quad \text{for } 0 \le |p| \le 2, \tag{4}$$

$$P(\Diamond \mid \Box) > P(+ \mid \Diamond) > P(+ \mid \Box), \quad \text{for } 2 < |p| < |w|. \tag{5}$$

From Eqn.(4), within the central 5×5 area, the RCB share of the DCB distribution is even higher than the DCB share of the SCB distribution. This implies that there is still room for further optimization with a different search pattern, such as a rood-shaped search pattern, over the diamond-shaped one. A central rood-shaped pattern with radius |p| < 2 pels could find small motion vectors more efficiently than the diamond-shaped one. In contrast, the central diamond-shaped pattern with radius |p| > 2 pels works more efficiently on larger motion vectors, as indicated by (5). On the other hand, the RCB characteristics in both the DCB and the SCB distributions decrease smoothly away from the ZMP. As P(+|◇) is always about 5% higher than P(+|□) for the central 5×5 area, real-world sequences possess stronger RCB motion vector distribution characteristics in the central DCB area than in the SCB one. Thus, a rood-shaped search pattern with |p| = 2 pels, as shown in Fig.2(a), is proposed to further optimize the diamond-shaped pattern of DS, rather than other BMA with rectangular search patterns. For example, with |p| = 2 pels, the RCB behavior exceeds 96% in the DCB distribution but is only about 91% in the SCB distribution. In Fig.1, the RCB distribution is observed to be the most dominant center-biased behavior in terms of probability per search point in the different center-biased regions.


3. ROOD-DIAMOND SEARCH ALGORITHM

3.1 Rood-diamond searching pattern

The DS algorithm uses a large diamond-shaped pattern (LDSP) and a small diamond-shaped pattern (SDSP), as depicted in Fig.2(b). As the motion vector distribution possesses over 96% RCB characteristics in the central 5×5 DCB area, a rood-shaped pattern (RSP), as shown in Fig.2(a), is proposed as the initial step of the DS algorithm; the resulting algorithm is termed the rood-diamond search algorithm (RDS).


Fig.2 Search patterns used in the rood-diamond search algorithm: (a) rood-shaped pattern (RSP); (b) large diamond-shaped pattern (LDSP) and small diamond-shaped pattern (SDSP).

3.2 The RDS algorithm

RDS differs from DS by (1) performing a rood-center-biased RSP search in the first step, and (2) employing a halfway-stop technique for stationary or quasi-stationary candidate blocks. The RDS algorithm is summarized below.

Step(i) Starting: The minimum BDM point is found among the 9 search points of the RSP located at the center of the search window. If the minimum BDM point occurs at the center of the RSP, the search stops. (This is called the first-step-stop, as shown in Fig.3(a).) Otherwise, go to Step(ii).

Step(ii) Half-diamond Searching: The two additional search points of the central LDSP that are closest to the current minimum are checked, i.e. two of the four candidate points located at (±1, ±1). If the minimum BDM point found in the previous step is located on the middle wing of the RSP, i.e. at (±1, 0) or (0, ±1), and the minimum BDM point found in this step still coincides with it, the search stops. (This is called the second-step-stop, e.g. Fig.3(b).) Otherwise, go to Step(iii).

Step(iii) Searching: A new LDSP is formed with the minimum BDM point found in the previous step as its center. If the new minimum BDM point is still at the center of the newly formed LDSP, go to Step(iv) (Ending); otherwise, this step is repeated.

Step(iv) Ending: A new SDSP is formed with the minimum BDM point of the previous step as its center. The new minimum BDM point is identified among the 4 new candidate points², and it is the final solution for the motion vector.

² Some of the new points have already been checked in previous steps; this fact applies to all subsequent steps.
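The four steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the distortion is supplied by a hypothetical callback cost(dx, dy) (e.g. a SAD or MAD between the current block and the displaced reference block), ties are resolved in favor of the current center, and candidates outside ±w are skipped; these three choices are assumptions made here.

```python
# Search-pattern offsets (dx, dy); the center is listed first so ties favor it.
RSP = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1),
       (2, 0), (-2, 0), (0, 2), (0, -2)]
LDSP = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2),
        (1, 1), (1, -1), (-1, 1), (-1, -1)]
SDSP = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def rds(cost, w=7):
    """Return (motion_vector, points_checked) for a distortion callback."""
    cache = {}  # BDM of every point checked so far; no point is evaluated twice

    def bdm(pt):
        if pt not in cache:
            cache[pt] = cost(*pt)
        return cache[pt]

    def best(center, pattern):
        cx, cy = center
        pts = [(cx + dx, cy + dy) for dx, dy in pattern]
        pts = [p for p in pts if max(abs(p[0]), abs(p[1])) <= w]
        return min(pts, key=bdm)

    # Step (i): rood-shaped pattern at the center; first-step-stop.
    cur = best((0, 0), RSP)
    if cur == (0, 0):
        return cur, len(cache)

    # Step (ii): the two points of (+-1, +-1) closest to the current minimum.
    sx = (cur[0] > 0) - (cur[0] < 0)
    sy = (cur[1] > 0) - (cur[1] < 0)
    half = [(sx, 1), (sx, -1)] if sy == 0 else [(1, sy), (-1, sy)]
    new = min([cur] + half, key=bdm)
    if new == cur and abs(cur[0]) + abs(cur[1]) == 1:
        return cur, len(cache)  # second-step-stop on the middle wing
    cur = new

    # Step (iii): move the LDSP until its minimum stays at the center.
    while True:
        nxt = best(cur, LDSP)
        if nxt == cur:
            break
        cur = nxt

    # Step (iv): final refinement with the SDSP.
    return best(cur, SDSP), len(cache)

if __name__ == "__main__":
    # Synthetic uni-modal error surface with its minimum at (-5, +2).
    print(rds(lambda dx, dy: (dx + 5) ** 2 + (dy - 2) ** 2))
```

On a synthetic uni-modal surface the sketch stops after 9 points for a minimum at (0, 0) and after 11 points for a minimum at (-1, 0), matching the first-step-stop and second-step-stop counts of the algorithm description.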


Fig.3 Examples of RDS: each candidate point is marked with the corresponding step number, of which only one is found to be the minimum BDM point (unfilled). (a) First-step-stop with MV(0,0). (b) Second-step-stop with MV(-1,0). (c) Unrestricted search path of RDS for MV(-5,+2).

3.3 Analysis of RDS algorithm

Fig.3 shows three typical examples. Fig.3(a) and 3(b) show two halfway-stop examples, in which the RDS algorithm checks only 9 (first-step-stop) and 11 (second-step-stop) search points, respectively. These lead to theoretical speedups of 25 and 20.5 times, respectively, over the 225 checking points used by the FS algorithm. Exploiting the rood-center-biased motion vector distribution characteristics lets RDS improve on DS and reduce computation significantly, especially for low bit-rate video applications, on image sequences with (1) gentle or no motion, such as background, which is handled by the first-step-stop, and (2) small motion, for which a more accurate estimate is obtained by the second-step-stop. On the other hand, RDS behaves like the DS algorithm only when the minimum BDM point falls neither on the center nor on the middle wing of the RSP. After it proceeds to the third or fourth step, RDS advances between successive LDSP by 3 or 5 new points, as in the DS algorithm. Compared to the DS algorithm, two special cases at the beginning of RDS use a different number of new points when forming the LDSP: (1) only 2 new points, those closest to the minimum BDM point found in the first step, are used to form half of the central LDSP, as for the motion vector found by the second-step-stop in Fig.3(b); (2) RDS uses 4 new points to form the LDSP in the third step if the minimum BDM point is not at the RSP center in the first step and then moves to one of the two new points on the diamond face of the half LDSP in the second step, as shown in Fig.3(c).
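The theoretical speedups quoted above follow from the size of the search window. As a small worked example (the function names are illustrative), FS checks (2w+1)² candidates, so for w = ±7:

```python
def fs_points(w):
    """Number of candidates checked by full search in a window of +-w pels."""
    return (2 * w + 1) ** 2

def theoretical_speedup(w, points_checked):
    """Ratio of FS checking points to the points checked by a fast BMA."""
    return fs_points(w) / points_checked

if __name__ == "__main__":
    print(fs_points(7))                # 225
    print(theoretical_speedup(7, 9))   # first-step-stop: 25.0
    print(round(theoretical_speedup(7, 11), 1))  # second-step-stop: 20.5
```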


4. EXPERIMENTAL RESULTS

The proposed RDS algorithm is simulated using the luminance component of the popular video sequences listed in Table I, which consist of different degrees and types of motion content. In all simulations, the sum of absolute differences (SAD) is used as the BDM, with a block size of 16 and a search window of w = ±7 (SIF/CIF) or ±15 (CCIR601). For CCIR601, half-pel accurate motion estimation is also used. The proposed RDS algorithm is compared against five conventional BMA (FS, 3SS, 4SS, N3SS and DS) in four aspects: (1) the average number of search points (Ns) per block and its speedup ratio (SpUp) with respect to FS; (2) the average MAD per pixel; (3) the average distance from the true motion vector per block; and (4) the probability of finding the true motion vector per block. The “true” motion vectors are taken to be those found by FS. As the frame size of the CCIR601 sequences is four times that of SIF/CIF, motion displacements become larger, which results in lower performance in the last two aspects. Unless stated otherwise, the following comparison uses the SIF/CIF format; similar performance of RDS is observed for CCIR601.

Table V shows that the proposed RDS algorithm always uses the smallest number of search points, with a smaller or only slightly higher MAD than the other fast BMA. For “Miss America”, RDS saves about 4.73 search points per block compared to DS and about 6.65 search points per block compared to 4SS. For sequences with w = ±7, the average Ns per block follows the order RDS < DS < 4SS < N3SS < 3SS < FS. Even with this speed improvement, the experimental results show that the RDS algorithm still achieves a smaller distortion error than 3SS, 4SS and DS in terms of MAD per pixel, while it remains comparable to N3SS. Similarly, RDS usually performs better than 3SS, 4SS and DS in terms of the average distance from, and the probability of finding, the true motion vector, and is comparable to N3SS.

For the CCIR601 sequence “Tennis”, the RDS algorithm saves about 1.73 and 11.68 search points per block compared to DS and 4SS, respectively. In addition, it gives a smaller distortion error, a smaller distance and a higher probability than 3SS, 4SS and N3SS, and provides results similar to those of DS. This implies that the rood-center-biased characteristics of RDS, together with its unrestricted-steps feature, outperform the square-center-biased N3SS, especially when a larger w is used on sequences with larger dimensions, in which candidate blocks may well contain different motion information from different objects inside the search window. Thus, RDS is more robust than other BMA.

Using CIF sequence “Miss America”
BMA     Points     SpUp      MAD      Distance   Prob.
FS      204.283     1.000    1.929    0.000      100.000
3SS      23.451     8.711    2.032    2.527       54.510
4SS      18.334    11.142    2.030    2.506       53.500
N3SS     19.950    10.240    1.952    1.679       68.810
DS       16.408    12.450    2.020    2.429       55.650
RDS      11.681    17.488    1.961    1.832       64.920

Using CCIR601 sequence “Tennis”
BMA     Points     SpUp      MAD      Distance   Prob.
FS      916.067     1.000    5.941    0.000      100.000
3SS      32.003    28.624    6.790    3.467       63.880
4SS      27.150    33.741    6.473    2.326       71.280
N3SS     22.997    39.834    6.598    2.737       64.730
DS       17.201    53.257    6.365    1.933       75.570
RDS      15.470    59.216    6.398    1.960       74.920

Table V. Performance comparison of RDS.

The speed and MAD improvements, in percent, of the proposed RDS over DS are tabulated in Table VI. For the highly rood-center-biased videoconferencing sequences such as “Miss America” and “Sales”, RDS achieves a 37-40% speed improvement over DS. For sequences of similar dimensions with a relatively higher degree of motion, “Garden” and “Football”, it gives about a 12-25% speed improvement. For the CCIR601 sequence “Tennis”, a speed improvement of about 11% is achieved. Besides, the MAD also improves if the sequence contains small motion, for example a gain in quality of about 3% for “Miss America”; otherwise, vigorous motion content as in “Football”, or larger sequences with a large search window, can introduce a slight degradation in quality of about 0.5%.

      MissA (CIF)   Sales (CIF)   Garden (SIF)   Football (SIF)   Tennis (CCIR601)
A        40.466        37.102        11.557          24.672            11.189
B        -2.921        -1.135        -0.491           0.394             0.518

Table VI. A: “Average speed improvement rate (%)” and B: “average MAD changed (%)” of RDS over DS.
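The paper does not state how the two rows of Table VI are computed. The sketch below assumes that the speed improvement rate is the saving in search points relative to RDS, (Ns_DS - Ns_RDS) / Ns_RDS, and that the MAD change is relative to DS, (MAD_RDS - MAD_DS) / MAD_DS. Under these assumed formulas, the Table V figures for “Miss America” and CCIR601 “Tennis” reproduce the corresponding Table VI entries to within rounding; the function names are illustrative.

```python
def speed_improvement(ns_ds, ns_rds):
    """Assumed definition: search points saved by RDS, relative to RDS (%)."""
    return 100.0 * (ns_ds - ns_rds) / ns_rds

def mad_change(mad_ds, mad_rds):
    """Assumed definition: MAD change of RDS relative to DS (%); negative is better."""
    return 100.0 * (mad_rds - mad_ds) / mad_ds

if __name__ == "__main__":
    # (Ns_DS, Ns_RDS, MAD_DS, MAD_RDS) transcribed from Table V.
    table_v = {
        "Miss America (CIF)": (16.408, 11.681, 2.020, 1.961),
        "Tennis (CCIR601)": (17.201, 15.470, 6.365, 6.398),
    }
    for name, (ns_ds, ns_rds, mad_ds, mad_rds) in table_v.items():
        print(name,
              round(speed_improvement(ns_ds, ns_rds), 3),
              round(mad_change(mad_ds, mad_rds), 3))
```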

5. CONCLUSION

In this paper, the rood-center-biased characteristics of the motion vector distribution are exploited and compared against the diamond-center-biased and square-center-biased ones. Based on the rood-center-biased behavior found in most real-world sequences, a novel rood-diamond search algorithm is proposed. The proposed algorithm uses a rood-shaped search pattern, with 9 search points of relatively high probability located in the central 5×5 area, as the initial step, and diamond-shaped patterns as the subsequent steps. It also employs a halfway-stop technique to find small motion vectors with fewer search points. Experimental results show that, compared to the diamond search algorithm, the proposed algorithm not only maintains similar or even smaller distortion error but also improves the searching speed by up to 40%. The rood-diamond search algorithm outperforms other fast block-matching algorithms and is more robust, and hence it is suitable for a wide range of video applications such as low-bit-rate videoconferencing.

ACKNOWLEDGMENT

The work described in this paper was substantially supported by a grant from the City University of Hong Kong, Hong Kong, China [Project No. 7001129].

REFERENCES

[1] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro, “Motion compensated interframe coding for video conferencing”, in Proc. Nat. Telecommun. Conf., New Orleans, LA, Nov 1981, pp. G5.3.1-G5.3.5.

[2] R. Li, B. Zeng, and M. L. Liou, “A new three-step search algorithm for block motion estimation”, IEEE Trans. Circuits Syst. Video Technol., vol. 4, no. 4, pp. 438-443, Aug 1994.

[3] L. M. Po and W. C. Ma, “A novel four-step search algorithm for fast block motion estimation”, IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 3, pp. 313-317, Jun 1996.

[4] L. K. Liu and E. Feig, “A block-based gradient descent search algorithm for block motion estimation in video coding”, IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 4, pp. 419-423, Aug 1996.

[5] J. Y. Tham, S. Ranganath, M. Ranganath and A. A. Kassim, “A novel unrestricted center-biased diamond search algorithm for block motion estimation”, IEEE Trans. Circuits Syst. Video Technol., vol. 8, no. 4, pp. 369-377, Aug 1998.

[6] S. Zhu and K. K. Ma, “A new diamond search algorithm for fast block-matching motion estimation”, IEEE Trans. Image Processing, vol. 9, no. 2, pp. 287-290, Feb 2000.

[7] J. R. Jain and A. K. Jain, “Displacement measurement and its application in interframe image coding”, IEEE Trans. Commun., vol. COM-29, pp. 1799-1808, Dec 1981.