A Fuzzy Video Rate Controller for Variable Bit Rate Applications Using ANFIS

M. Shafei, M. Rezaei, S. Tavakoli, F. Mohanna

Abstract: In this paper we propose a new fuzzy video rate control algorithm (RCA) for variable bit rate (VBR) video applications. An adaptive neuro-fuzzy inference system (ANFIS) is used in the design of the proposed RCA. The proposed RCA provides high-quality compressed video with low computational complexity. By controlling the quantization parameter (QP) on a picture basis, it produces VBR video bit streams. The proposed RCA has been implemented in the JM H.264/AVC video codec, and the experimental results show that it produces a high average quality for the encoded video while strictly following the buffering constraint.

Index Terms: ANFIS, bit rate, coding, fuzzy, rate control, variable, video.

I. INTRODUCTION

Variable bit rate (VBR) video applications have constraints that differ significantly from those of constant bit rate (CBR) applications. For example, real-time video conversation requires a constant short-term average bit rate to ensure low delay, whereas in streaming applications a constant long-term average bit rate is sufficient and a large short-term variation in bit rate is acceptable. For most video content, VBR video can provide better visual quality and compression performance than CBR video [1].

A great deal of attention has been paid to video rate control over the past two decades. Typically, control algorithms operate in two steps. In the first step, a bit budget is allocated to a video segment, such as a group of pictures (GOP), a frame, or a macroblock (MB), according to practical constraints and video properties. In the second step, a quantization parameter (QP) is computed from the allocated bit budget and the coding complexity of the video. A rate-distortion (R-D) model, derived analytically or empirically, is used to compute the QP. The R-D model parameters are updated according to the encoding results.

M. Shafei is an MSc student in the Faculty of Electrical & Computer Engineering, University of Sistan and Baluchestan, Iran ([email protected]). M. Rezaei, S. Tavakoli, and F. Mohanna are assistant professors with the Faculty of Electrical & Computer Engineering, University of Sistan and Baluchestan, Iran ([email protected], [email protected], [email protected]).

The RCA of the Joint Model (JM) [2] reference software of H.264/AVC creates streams that satisfy the available bandwidth provided by a channel and that comply with the standard hypothetical reference decoder (HRD). It applies tight control at three levels: the GOP level, the picture level, and an optional basic unit level. A basic unit is a group of successive MBs in the same frame. At the picture level, a second-order R-D model, parameterized by the mean absolute difference (MAD) of the residual frame after motion compensation, is used to compute the QP for a reference picture. The MAD of the current frame or MB is available only after the rate-distortion optimization (RDO) process, while the RDO itself is performed based on the QP. This chicken-and-egg dilemma is solved by predicting the MAD from previous frames.

In a new approach, Rezaei et al. proposed a semi-fuzzy RCA with a buffer constraint [3]. They do not use any R-D model directly, although they use the theoretical and practical results of the usual rate control approach. The algorithm proposed in [3] has low complexity. Since it does not use any R-D model, there is no need to update R-D model parameters based on a complexity measure such as MAD. The semi-fuzzy RCA calculates a QP for each picture based only on the results of previously encoded pictures, so the chicken-and-egg dilemma of H.264/AVC no longer arises. The RCA proposed in [3] combines a fuzzy controller with several classical controllers. The fuzzy controller was designed and tuned empirically. The fuzzy controller and a quality controller calculate the QP of P-pictures, while the QP of I-pictures is computed by another controller.

The goal of this research is to design a fuzzy video RCA using an ANFIS system, which is a more straightforward approach than tuning the fuzzy system empirically. As the first step, in this paper we use an ANFIS system to design a fuzzy controller corresponding to the fuzzy part of the RCA proposed in [3]. The ANFIS system was implemented and trained in MATLAB. The resulting tuned fuzzy controller was then implemented in the H.264/AVC JM reference software. Simulation results show high performance for the resulting fuzzy video rate controller.

This paper is organized as follows. Section II presents a detailed description of the proposed video RCA utilizing ANFIS. Simulation results are provided in Section III. The paper is concluded in Section IV.


II. RATE CONTROL ALGORITHM

In the proposed RCA, the QP for P-pictures is defined by a fuzzy controller and a quality controller. Fig. 1 depicts the block diagram of the proposed rate control system for P-pictures. The fuzzy controller, the quality controller, and the virtual buffer are the basic parts of the control system. The fuzzy controller controls the bit rate of the encoded bit stream by controlling the variation of the QP, and it has been optimized to prevent unnecessary fluctuation of the QP. In computing the QP, it is assumed that consecutive video pictures have a similar degree of complexity (except at scene cuts). The complexity of the previously encoded picture is therefore used as an estimate of the complexity of the next picture, and the QP of the next picture is computed as the QP of the previously encoded picture plus a small variation defined by the fuzzy and quality controllers. The fuzzy controller uses two feedback signals, one from the buffer fullness and one from the bit rate. The quality controller uses a feedback signal from the quality (PSNR) of the encoded video to minimize fluctuation in quality. Furthermore, a low-pass filter (LPF) smooths the feedback signal from the bit rate to the fuzzy controller, which smooths the variations in the output of the fuzzy controller. The QP for the current P-picture, $Q_P$, is the sum of the QP used for encoding the previous picture, the output of the fuzzy controller, $Q_F$, and the output of the quality controller, $Q_Q$:

$$Q_P(i) = Q_P(i-1) + Q_F(i) + Q_Q(i). \qquad (1)$$
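As a minimal sketch of the update in (1), assuming the two correction terms are already available as real numbers, the computation could look as follows. The function name `next_p_qp`, the rounding step, and the clamp to the 0 to 51 QP range of 8-bit H.264/AVC are illustrative assumptions and are not taken from the paper.

```python
def next_p_qp(prev_qp: float, q_f: float, q_q: float,
              qp_min: int = 0, qp_max: int = 51) -> int:
    """Eq. (1): previous QP plus the fuzzy and quality corrections.

    Rounding to an integer and clamping to [qp_min, qp_max] are
    assumptions added for illustration; the paper does not state them.
    """
    qp = prev_qp + q_f + q_q
    return max(qp_min, min(qp_max, round(qp)))


if __name__ == "__main__":
    # Previous picture used QP 28; the controllers ask for a small increase.
    print(next_p_qp(28, 1.2, 0.4))
```

Because both corrections are small, the result stays close to the previous QP, which is the linearization argument made in the text.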

From the system point of view, the main part of the QP computed for a P-picture is a delayed version of the QP used for the previous picture, and the control (variation) of the QP is provided by the fuzzy controller and the quality controller. From the R-D point of view, the R-D behavior of the previously encoded pictures is used as a reference for the next picture, and a small deviation from the reference point is computed. The main advantage of this approach is that, in a small range around the reference point, all nonlinear functions in the system can be treated as linear without losing computational precision. More details about the parts of the RCA are presented below.

A. Virtual Buffer
The virtual buffer used by the controller simulates the buffering process of the decoder on the receiving side of a CBR channel. Although it uses a simple model, it is nearly identical to the hypothetical reference decoder models used in different video coding standards. The occupancy of the virtual buffer is updated after encoding each video picture as

$$O_B(i+1) = O_B(i) - B(i) + (R_T / F) \qquad (2)$$

where $O_B(i)$ denotes the occupancy of the virtual buffer before encoding the $i$th picture, $B(i)$ is the number of bits consumed by the $i$th encoded picture (P or I), $R_T$ is the target average bit rate for the bit stream or the channel bandwidth, and $F$ is the frame rate. Note that the virtual buffer models the decoder buffer at the receiver side. Therefore, the occupancy of this buffer corresponds to the free space of a buffer at the encoder or transmitter side.
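The buffer bookkeeping can be sketched as below. This follows the decoder-side reading given in the text, in which the channel delivers $R_T/F$ bits per picture interval and decoding a picture removes $B(i)$ bits. The class name `VirtualBuffer`, its field names, and the half-full starting occupancy in the usage lines are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VirtualBuffer:
    size_bits: float    # S_B, buffer size in bits
    target_rate: float  # R_T, target bit rate in bits per second
    frame_rate: float   # F, pictures per second
    occupancy: float    # O_B, current occupancy in bits

    def update(self, picture_bits: float) -> float:
        """Apply Eq. (2) after one picture is encoded; return the new occupancy."""
        self.occupancy += self.target_rate / self.frame_rate - picture_bits
        return self.occupancy

    def normalized(self) -> float:
        """Occupancy divided by buffer size, as used for the first fuzzy input."""
        return self.occupancy / self.size_bits


if __name__ == "__main__":
    # 250 kb buffer, 300 kbps target, 30 fps; starting half full is an assumption.
    buf = VirtualBuffer(250_000, 300_000, 30, 125_000)
    buf.update(12_000)  # a picture slightly larger than the per-picture budget
    print(buf.occupancy, buf.normalized())
```

A picture larger than the per-picture budget $R_T/F$ lowers the occupancy, which matches the remark that this occupancy mirrors the free space of an encoder-side buffer.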

[Figure: block diagram with the blocks Video encoder, Virtual buffer, Delay, LPF, Quality Controller, and Fuzzy Controller]

Fig. 1. Block diagram of the RCA for P-pictures.

B. Fuzzy Controller
The fuzzy controller has two input signals: the normalized buffer occupancy and the normalized actual bit rate of P-pictures. The buffer occupancy is normalized by the buffer size, and the actual bit rate of P-pictures is normalized by the target bit rate for P-pictures. In VBR video, the bit budget consumed by P-pictures can be very different from that consumed by I-pictures, so, depending on the frequency of I-pictures in the bit stream, the target bit rate of P-pictures can be very different from the overall target bit rate. A precise value for the target bit rate of P-pictures is therefore estimated and used for normalization. The fuzzy inputs are defined as

1  2 

OB SB

BPF RT

 X IP  1  1   II  

(3)

(4)

where $B_P$ denotes the bit budget consumed by the previously encoded P-picture, $I_I$ is the interval of periodic I-pictures in the bit stream in number of pictures, and $X_{IP}$ is the coding complexity of I-pictures relative to P-pictures. To suppress fluctuations of the QP that result from short-term variations in the complexity of video pictures, the LPF smooths the variation of $B_P$ before it is input to the fuzzy controller. Nine and seven membership functions (MSFs) were employed for the two inputs $\varepsilon_1$ and $\varepsilon_2$, respectively. The linguistic fuzzy rules and MSFs were designed based on experience from previous work. Detailed information on the MSFs and on the desired central values of the fuzzy system output for each fuzzy rule, as required for implementation, is presented in [3]. The output of the fuzzy system is passed through a gain control block that adaptively tunes the gain of the feedback loop according to the buffer size (or delay) and the video content properties as

 1 ,  2 

Q F  0.5  RT / S B  f

(5)

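where $f(\varepsilon_1, \varepsilon_2)$ is the output of a simple, well-known fuzzy system with two inputs that uses a product inference engine, a singleton fuzzifier, and a center-average defuzzifier, as in [4]:

$$f(\varepsilon_1, \varepsilon_2) = \frac{\sum_{i_1=1}^{N_1} \sum_{i_2=1}^{N_2} \bar{y}^{i_1 i_2} \, \mu_{A_1^{i_1}}(\varepsilon_1) \, \mu_{A_2^{i_2}}(\varepsilon_2)}{\sum_{i_1=1}^{N_1} \sum_{i_2=1}^{N_2} \mu_{A_1^{i_1}}(\varepsilon_1) \, \mu_{A_2^{i_2}}(\varepsilon_2)} \qquad (6)$$

where $A_i^{1}, A_i^{2}, \ldots, A_i^{N_i}$ ($i = 1, 2$) are fuzzy sets, with $\mu_{A_1^{i_1}}(\varepsilon_1)$ ($1 \le i_1 \le N_1$) and $\mu_{A_2^{i_2}}(\varepsilon_2)$ ($1 \le i_2 \le N_2$) their membership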

functions defined for inputs $\varepsilon_1$ and $\varepsilon_2$, respectively. The center of the output fuzzy set, denoted by $\bar{y}^{i_1 i_2}$, is chosen as the desired output value. More information about the derivation of the fuzzy system is presented in [4] and [5].

C. Quality Controller
The quality controller computes an additive term for the final QP based on the quality of the previously encoded picture and a local average of the quality of the encoded pictures. The idea is that, while the fuzzy controller enforces the buffer constraint, the quality controller minimizes the variation in the quality of the encoded video by using the available buffer space. To compute the quality QP, $Q_Q$, the average values of QP and PSNR are taken as a reference point. The PSNR of the following pictures is then driven toward the reference point with a gain proportional to the current deviation from it. The quality QP used in (1) is computed by

$$Q_Q = \alpha \, \bar{Q} \, (PSNR - \overline{PSNR}), \qquad -1 \le Q_Q \le 1 \qquad (7)$$

where $\bar{Q}$ is the average QP over the encoded frames in the current scene, and $PSNR$ and $\overline{PSNR}$ are the PSNR of the previously encoded frame and the average PSNR over the encoded frames in the current scene, respectively. The PSNR values are computed from the luminance component. The constant coefficient $\alpha$ defines the gain of the quality feedback loop.

D. QP Calculation for I-Pictures
The quality of I-pictures has a great impact on the quality of the following pictures in VBR video. Therefore, the QP of I-pictures plays an important role from the R-D point of view and should be computed carefully. In [3], the QP of an I-picture is computed based on several parameters. Since the goal of this paper is to redesign only the fuzzy controller used for computing the QP of P-pictures, we use a simplified version of the algorithm proposed in [3] for calculating the QP of I-pictures. The QP of a periodic I-picture is computed by applying an LPF to the QPs of previously encoded pictures, as in [3]. Using a similar QP for encoding the I-picture and the neighboring P-pictures gives the I-picture a higher quality than the P-pictures. This difference is acceptable and is useful for overall quality. The LPF prevents larger differences between the quality of the I-picture and that of the P-pictures.

The QP of non-periodic I-pictures inserted at scene cuts is computed in another way. I-pictures at scene cuts may or may not be correlated with the previously encoded pictures in terms of complexity and/or content. Therefore, an estimate that ignores the previously encoded frames, or one that relies only on them, may waste bit budget or lose quality. From this point of view, estimating a suitable QP for an I-picture at a scene cut is quite challenging. As a simple solution, the reference QP for the first I-picture at a scene cut is calculated as

$$Q_R = (\bar{Q} + Q_m) / 2 \qquad (8)$$

where $\bar{Q}$ is a local average QP, computed as for periodic I-pictures, and $Q_m$ is a constant QP in the middle range, e.g., 26 to 34 for H.264/AVC, taken as a global average over various video content. The local average $\bar{Q}$ keeps the quality of the I-picture close to that of the previously encoded pictures when the two consecutive scenes are correlated in terms of content. $Q_m$ guarantees the allocation of a bit budget in the middle range if there is no correlation between consecutive scenes in terms of complexity.
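The quality term and the scene-cut reference QP can be illustrated with the short sketch below. It is one reading of (7) and (8), not the authors' implementation: the proportional form with a symmetric limit of 1, the function names, the default mid-range QP of 30 (chosen inside the 26 to 34 range mentioned in the text), and the gain value used in the usage lines are all assumptions, since the paper does not report the value of the gain coefficient.

```python
def quality_qp(avg_qp: float, last_psnr: float, avg_psnr: float,
               alpha: float, limit: float = 1.0) -> float:
    """One reading of Eq. (7): a correction proportional to the PSNR
    deviation from the scene average, limited to [-limit, limit]."""
    q_q = alpha * avg_qp * (last_psnr - avg_psnr)
    return max(-limit, min(limit, q_q))


def scene_cut_reference_qp(avg_qp: float, q_mid: float = 30.0) -> float:
    """Eq. (8): midpoint of the local average QP and a mid-range constant."""
    return (avg_qp + q_mid) / 2.0


if __name__ == "__main__":
    alpha = 0.05  # hypothetical gain, not a value from the paper
    # Last picture was 0.5 dB above the scene average, so QP is nudged upward.
    print(quality_qp(26.0, 40.0, 39.5, alpha))
    # Reference QP for the first I-picture after a scene cut.
    print(scene_cut_reference_qp(26.0))
```

A picture that came out better than the scene average yields a positive correction, which raises the next QP and pulls the quality back toward the reference point.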

III. SIMULATION RESULTS
To redesign the semi-fuzzy RCA proposed in [3], we first implemented a simplified version of it in the JM H.264/AVC reference software version 61, as described in Section II. To verify the performance of the implemented RCA, a number of well-known video sequences in QCIF picture format, including Foreman, Carphone, Hall, and Football, were concatenated to make long (30 s) sequences suitable for the test. To evaluate the described RCA in terms of quality and delay, we compared its encoding results with those of constant QP (CQP) encoding and with those of the H.264/AVC JM RCA. The long video sequences were encoded by the three algorithms at an average bit rate close to 300 kbps, a frame rate of 30 fps, and an IDR frequency of 300 for I-pictures. A buffer size of 250 kb was allocated to the RCAs in this simulation. Level 3 of the baseline profile, with R-D optimization (RDO), was used for the implementation of the three RCAs. The number of reference frames was set to 1, the number of bytes per slice was set to 1000, and the other encoding parameters were left at their defaults. To achieve similar average bit rates in the three cases, each sequence was first encoded with a constant QP to obtain an average bit rate close to 300 kbps in the CQP case, and the target bit rate of the fuzzy RCA and the JM RCA was then set to the average bit rate produced by the CQP encoding.

Table I shows the results of the simulation, including the averaged results. Compared with the CQP case and the JM RCA, the fuzzy algorithm provided a higher average PSNR and a smaller average QP. Although the semi-fuzzy RCA proposed in [3] was implemented on another H.264/AVC codec, the simulation results in Table I agree with the results reported in [3]. Compared with the semi-fuzzy RCA, the computation of the QP of I-pictures has been simplified in our implementation. The effect of this simplification is not reflected in the results because of the large IDR frequency used in the simulation.

TABLE I
COMPARISON OF THE RESULTS OF THE FUZZY RATE CONTROLLER WITH THE CONSTANT QP CASE AND THE JM RCA AT A TARGET BIT RATE OF ABOUT 300 kbps, 30 fps, AND QCIF PICTURE FORMAT

Sequence   RCA     QP      Bit rate (kbps)   PSNR (dB)
Foreman    CQP     24      284.66            39.57
           Fuzzy   23.93   294.69            39.81
           JM      24.39   279.74            39.59
Carphone   CQP     24      309.32            40.52
           Fuzzy   23.51   309.32            40.95
           JM      23.93   299.75            40.7
Hall       CQP     20      342.49            43.04
           Fuzzy   20.06   342.49            43.20
           JM      20.57   340.21            42.86
Football   CQP     32      312.18            33.16
           Fuzzy   31.69   312.18            33.49
           JM      31.78   300.85            33.40
Average    CQP     25      312.16            39.07
           Fuzzy   24.8    314.67            39.36
           JM      25.12   304.96            39.14

To redesign the fuzzy controller with ANFIS, the ANFIS system was implemented in MATLAB. Fine-tuning the ANFIS system requires a large amount of training data, which can be provided by a high-performance RCA with a buffering constraint. We used our implementation of the fuzzy RCA described in Section II as a known, available RCA to produce the training data. Many different video contents were encoded by the fuzzy RCA to provide training data for fine-tuning the ANFIS system. The final distributions of the MSFs are shown in Fig. 2. These MSFs are very close to the MSFs in [3]. The resulting tuned fuzzy controller was then implemented in the H.264/AVC JM reference software.

A simulation was run to compare the fuzzy controller tuned by ANFIS with the empirically tuned fuzzy controller used in the semi-fuzzy RCA. The results of the comparison are given in Table II. The two controllers provided very similar results. This similarity indicates that the ANFIS system was implemented and trained accurately.

TABLE II
COMPARISON OF THE RESULTS OF THE FUZZY RATE CONTROLLER AND THE ANFIS

Sequence   RCA     PSNR     QP
Football   Fuzzy   33.24    32.151
           ANFIS   33.23    32.154
Hall       Fuzzy   42.89    20.56
           ANFIS   42.89    20.56
News       Fuzzy   39.48    25.8
           ANFIS   39.50    25.77
Foreman    Fuzzy   35.50    29.781
           ANFIS   35.51    29.780
FNews      Fuzzy   37.49    25.156
           ANFIS   37.49    25.156
Average    Fuzzy   37.72    26.69
           ANFIS   37.724   26.68

Fig. 2. Membership functions of the linguistic variables.

All MSFs obtained from ANFIS are summarized in two matrices, one for each of the two inputs, as shown below:

$$MSF_1 = \begin{bmatrix}
0 & 0.01 & 0.08 & 0.12 \\
0.08 & 0.12 & 0.2 & 0.2 \\
0.16 & 0.16 & 0.26 & 0.3 \\
0.26 & 0.3 & 0.4201 & 0.4247 \\
0.3799 & 0.3801 & 0.5244 & 0.5613 \\
0.5188 & 0.5635 & 0.7207 & 0.7207 \\
0.6788 & 0.6807 & 0.82 & 0.85 \\
0.826 & 0.856 & 0.95 & 0.956 \\
0.92 & 0.926 & 0.99 & 1.01
\end{bmatrix}$$

$$MSF_2 = \begin{bmatrix}
0 & 0.01 & 0.35 & 0.45 \\
0.35 & 0.45 & 0.5501 & 0.6501 \\
0.55 & 0.65 & 0.7501 & 0.8519 \\
0.7492 & 0.8498 & 1.15 & 1.251 \\
1.151 & 1.25 & 1.4 & 1.5 \\
1.399 & 1.5 & 1.65 & 1.75 \\
1.65 & 1.75 & 1.99 & 2
\end{bmatrix}$$
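To show how these breakpoints could feed the controller of Section II, the sketch below evaluates the fuzzy inputs of (3) and (4), a product-inference, center-average system of the form in (6), and the gain of (5). Several points are assumptions rather than facts from the paper: each matrix row is read as the four breakpoints (a, b, c, d) of a trapezoidal membership function, the rule centers in `CENTERS` are placeholders (the real values are given in [3] and are not reproduced here), the output falls back to 0 when no rule fires, and all function names are hypothetical.

```python
# Rows read as trapezoid breakpoints (a, b, c, d); this reading is an assumption.
MSF1 = [
    (0.0, 0.01, 0.08, 0.12), (0.08, 0.12, 0.2, 0.2), (0.16, 0.16, 0.26, 0.3),
    (0.26, 0.3, 0.4201, 0.4247), (0.3799, 0.3801, 0.5244, 0.5613),
    (0.5188, 0.5635, 0.7207, 0.7207), (0.6788, 0.6807, 0.82, 0.85),
    (0.826, 0.856, 0.95, 0.956), (0.92, 0.926, 0.99, 1.01),
]
MSF2 = [
    (0.0, 0.01, 0.35, 0.45), (0.35, 0.45, 0.5501, 0.6501),
    (0.55, 0.65, 0.7501, 0.8519), (0.7492, 0.8498, 1.15, 1.251),
    (1.151, 1.25, 1.4, 1.5), (1.399, 1.5, 1.65, 1.75), (1.65, 1.75, 1.99, 2.0),
]
# Placeholder 9 x 7 rule centers, NOT the values from [3].
CENTERS = [[(i2 - 3) - 0.5 * (i1 - 4) for i2 in range(7)] for i1 in range(9)]


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside [a, d], 1 on [b, c], linear between."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzy_inputs(occupancy, buffer_size, p_bits, frame_rate, target_rate,
                 x_ip, i_interval):
    """Eqs. (3) and (4): normalized buffer occupancy and P-picture bit rate."""
    e1 = occupancy / buffer_size
    e2 = (p_bits * frame_rate / target_rate) * (1.0 + (x_ip - 1.0) / i_interval)
    return e1, e2


def fuzzy_output(e1, e2, centers=CENTERS):
    """Eq. (6): product inference with center-average defuzzification."""
    mu1 = [trapezoid(e1, *row) for row in MSF1]
    mu2 = [trapezoid(e2, *row) for row in MSF2]
    num = den = 0.0
    for i1, m1 in enumerate(mu1):
        for i2, m2 in enumerate(mu2):
            weight = m1 * m2
            num += centers[i1][i2] * weight
            den += weight
    # Returning 0 when no rule fires is an assumption for out-of-range inputs.
    return num / den if den > 0.0 else 0.0


def fuzzy_qp(e1, e2, target_rate, buffer_size, centers=CENTERS):
    """Eq. (5): fuzzy output scaled by the buffer-dependent gain."""
    return 0.5 * (target_rate / buffer_size) * fuzzy_output(e1, e2, centers)


if __name__ == "__main__":
    e1, e2 = fuzzy_inputs(125_000, 250_000, 10_000, 30, 300_000, 5.0, 300)
    print(e1, e2, fuzzy_qp(e1, e2, 300_000, 250_000))
```

With real rule centers from [3] substituted for `CENTERS`, the same structure would apply unchanged, since (6) only needs the membership degrees and one center per rule.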

IV. CONCLUSION
A new fuzzy video rate control algorithm (RCA) for variable bit rate applications was proposed. It can be easily tuned for a wide range of applications with various target delays. The proposed video RCA has been optimized to exploit the advantages of VBR video to improve compression performance and maintain constant quality.

References
[1] T. V. Lakshman, A. Ortega, and A. R. Reibman, “VBR video: Trade-offs and potentials,” Proc. IEEE, vol. 86, no. 5, pp. 952-973, May 1998.
[2] G. Sullivan, T. Wiegand, and K. P. Lim, “Joint model reference encoding methods and decoding concealment methods,” Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, Document JVT-I049, Sep. 2003.
[3] M. Rezaei, M. M. Hannuksela, and M. Gabbouj, “Semi-fuzzy rate controller for variable bit rate video.”
[4] L. X. Wang, Adaptive Fuzzy Systems and Control: Design and Stability Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1994.
[5] L. X. Wang, “Stable adaptive fuzzy control of nonlinear systems,” IEEE Trans. Fuzzy Syst., vol. 1, no. 2, pp. 146–155, May 1993.
