Optimal Bit Allocation Strategy for Joint Source

2 downloads 0 Views 443KB Size Report
nding a third pivot point in subband i, we nd two pivot points in another ..... i.e. the 2 operational source-channel pairs will yield a cluster of points in the next.
Optimal Bit Allocation Strategy for Joint Source/Channel Coding of Scalable Video by Gene Cheung

Research Project Submitted in partial ful llment of the requirements for the degree of Master of Science, Plan II in Electrical Engineering and Computer Sciences in the GRADUATE DIVISION of the UNIVERSITY of CALIFORNIA at BERKELEY June 1998

2

Committee in charge: Professor Avideh Zakhor, Research Advisor Professor Steven Mccanne

The report of Gene Cheung is approved:

Chair

Date

Date

University of California at Berkeley 1998

Optimal Bit Allocation Strategy for Joint Source/Channel Coding of Scalable Video

Copyright 1998 by Gene Cheung

Optimal Bit Allocation Strategy for Joint Source/Channel Coding of Scalable Video by Gene Cheung

Abstract

We propose an optimal bit allocation strategy for a joint source/channel video codec over noisy channel when the average channel state is assumed to be known. Our approach is to partition source and channel coding bits in such a way that the expected distortion is minimized. The particular source coding algorithm we use is rate scalable and is based on 3D subband coding with multi-rate quantization. We show that using this strategy, transmission of video over very noisy channels still renders acceptable visual quality and outperforms schemes that use equal error protection only. The exibility of the algorithm also permits the bit allocation to be selected optimally when the channel state is in the form of a probability distribution instead of a deterministic state.

iii

Contents List of Figures 1 Introduction 2 Scalable Video Coder 3 Joint Source/Channel Coding

3.1 Source / Channel Bit Allocation . . . . . . . . . 3.2 Lagrange Multipliers . . . . . . . . . . . . . . . 3.3 Linear Approximation of Lagrangian Functions 3.3.1 Algorithm . . . . . . . . . . . . . . . . . 3.3.2 Diculties of Algorithm . . . . . . . . . 3.4 Generalized Gersho-Shoham Algorithm . . . . . 3.4.1 Gersho-Shoham Algorithm . . . . . . . . 3.4.2 Generalized Gersho-Shoham Algorithm . 3.4.3 Initialization . . . . . . . . . . . . . . . . 3.4.4 Pivot Point Selection Method . . . . . . 3.5 Hybrid Algorithm . . . . . . . . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

iv 1 3 4

5 7 8 8 10 11 11 14 23 23 25

4 Implementation

26

5 Conclusions

33

Bibliography

46

4.1 Rate Distortion (RD) Functions . . . . . . . . . . . . . . . . . . . . . 4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3 Channel Mismatch . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 .2 .3

Lagrange Multiplier Error Bound . . . . . . . . . . . . . . . . . . . . Proof of Linear Approximation Algorithm . . . . . . . . . . . . . . . Proof of Generalized Gersho-Shoham Algorithm . . . . . . . . . . . .

26 28 31

33 34 35

iv

List of Figures 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.1 4.2 4.3 .1 .2 .3 .4

Example of linear Lagrange Multipliers search . . . . . . . . . Example of 1-D Lagrange multiplier problem . . . . . . . . . . Example of 1-D Singular Value Multipliers . . . . . . . . . . . Operational Source Rate vs. Multiplier value . . . . . . . . . . Geometrical interpretation of Singular Values in 2-D . . . . . . Line and Region of Eligibility for GGS Algorithm . . . . . . . Change of Basis for GGS Algorithm . . . . . . . . . . . . . . . Example of Pivot change under Triangular and Quadrangular Pivot Point Selection Algorithm . . . . . . . . . . . . . . . . . Original distortion functions . . . . . . . . . . . . . . . . . . . Procession of Generalized Gersho-Shoham Algorithm . . . . . Experimental Results . . . . . . . . . . . . . . . . . . . . . . . Example of Change of Basis . . . . . . . . . . . . . . . . . . . Proof of PPSA: case 1 . . . . . . . . . . . . . . . . . . . . . . Proof of PPSA: case 2 . . . . . . . . . . . . . . . . . . . . . . Proof of Theorem 2.10 . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

10 11 12 14 16 17 18 21 24 28 30 31 40 42 43 44

v

Acknowledgments I am grateful to many friends and colleagues who have made this thesis possible. First, I thank Professor Avideh Zakhor and Professor Steve Mccanne for their guidance and support as members of the committee. I thank Daniel Tan for reading the rst draft of this thesis. I am indebted to my parents, who have provided constant support from the beginning. Lastly, I would like to thank Kumiko Wada, with whom I have endured many trying times, and enjoyed many good ones too. The Lord is my light and my salvation{ whom shall I fear? Psalms 27:1.

1

Chapter 1 Introduction The advent of wireless personal communication services in recent years has created a number of challenging research problems in the areas of communications, signal processing and networking. A major challenge presented by the wireless channel is its inherent unreliability. This contrasts with wired networks, which have very low error rates, e.g. of the order of 10?9. In a large class of wireless video applications, users move at relatively slow speeds, rather than at tens of miles per hour. Consequently, the resulting channels su er from slow fading and shadowing e ects. By estimating the condition of this slowly changing channel, one can adapt the source coding, modulation, channel coding, power control, or any other aspect of the transmission scheme to the channel condition. In particular, one can change the source coding and the channel coding algorithms according to the channel condition in such a way as to minimize the distortion of the received signals. Because the importance of di erent bits within a bitstream often varies, one can protect di erent source bits using unequal error protection (UEP) schemes such as Rate Compatible Punctured Convolutional (RCPC) codes [5] to further enhance performance. Indeed, several researchers have applied this idea to speech [1, 7] and image [6, 2, 8] transmission over wireless links. With the exception of [1] which deals with speech, the remaining papers mentioned above explicitly require the source coder to adapt to the channel condition. As an example, in [8] a whole new codebook might have to be designed and used in order to optimally accommodate each new channel

2 condition. With highly scalable video compression schemes such as [3], it is possible to generate one compressed bitstream such that di erent subsets of the stream correspond to the compressed version of the same video sequence at di erent rates. Thus, if one uses such a source coder in the wireless scenario, there is no need to change the source coding algorithm, or any of its parameters, as the channel conditions change. This is particularly attractive in heterogeneous multicast networks where the wireless link is only a small part of a much larger network, and the source rate cannot be easily adapted to the individual receiver at the wireless node. In this paper, we develop a technique for optimum partitioning of source and channel coding bits for the scalable video compression algorithm described in [3] and an unequal error protection channel coding scheme. By \optimum", we mean a partitioning which results in minimum expected value of distortion, which we choose to be Mean Squared Error (MSE). We will consider the case where the header les of the encoded bit stream are protected adequately so there is no loss of synchronization, and the average channel state information (CSI) is known. Under these conditions, the joint source/channel codec will adapt to the estimated channel and optimally transmit video for the current channel state. In Chapter 2, we brie y describe our source coding algorithm. Chapter 3 formulates our problem and describes our proposed algorithm. Chapter 4 discusses the speci c implementation issues and includes results. Chapter 5 provides a conclusion.

3

Chapter 2 Scalable Video Coder The source coder we use in this paper is the scalable coder described in [3]. This coder has been shown to generate rates anywhere from tens of kilo bits to few mega bits per second with arbitrarily ne granularity. In addition, its compression eciency has been shown to be comparable to standards such as MPEG-1 [3]. The fundamental idea is to apply three dimensional subband coding to the video sequence to obtain a set of spatio-temporal subbands. Subsequently, each subband coecient is successively re ned via layered quantization. Finally, conditional arithmetic coding is applied to code di erent quantization layers. In doing so, the spatial, temporal, and inter-subband correlations, as well as correlation between quantization layers are accounted for to maximize compression gain. The problem of optimal source bit allocation among di erent quantization layers of di erent subbands in the absence of channel errors has been discussed in [3].

4

Chapter 3 Joint Source/Channel Coding A compressed signal sent through an unreliable channel is corrupted by two types of errors: quantization errors due to lossy source coding and channel errors due to channel noise. Theoretically, if there is no delay constraint and the channel state is known perfectly, the design of source coder and channel coder for point-to-point transmission can be done separately while remaining optimal; by using large channel coding blocks to average out unlikely error events, the rate controlled source bits can be received with negligible errors. The only received errors at the receiver will then only be the quantization errors. For multicast transmission of delay sensitive signals such as video, however, the precise theoretical channel capacity is not known, and the separate source/channel coding theorem does not hold; as such, one must jointly design the source and channel coders to achieve optimal performance. In this chapter, we will consider a subclass of the general joint source/channel coding problem; instead of altering the actual implementations of the source and channel coder to adapt to the transmitting environment, we use a exible source codec, i.e. the rate scalable 3D subband coder in [3], and a unequal error correction channel codec i.e. RCPC, as building blocks. The optimal scheme is then to extract di erent subsets from the source bitstream and to protect them unequally for optimal transmission.

5

3.1 Source / Channel Bit Allocation The main problem we solve is: for a given source coder and a given channel coder, what is the optimal distribution of source and channel bits in order to transmit video for a given total bit budget? We assume total bit budget of C and a memoryless channel with known average channel state; we will later consider both a binary symmetric channel and an additive white Gaussian channel. We need to nd the best source coding rate Rs and channel coding rate Rc such that C = Rs + Rc and the MSE is minimized. This is equivalent to nding the optimal source to channel bit ratio, Rs =Rc, with Rs + Rc = C , such that the distortion is minimized. To nd these minima for various CSI's, our approach is to rst empirically construct distortion curves D( C ?RsRs ) as a function of Rs for the particular video sequence under consideration and then to locate the minima of the resulting curves. target ) on the above curves, we must deIn computing each point (Rstarget ; D( C ?RsRtarget s termine how to distribute Rstarget source bits and C ? Rstarget channel bits among the subbands and quantization layers, such that the distortion, expressed below, is minimized. K K Rstarget ) = X 2 ] = X d (n ; m ) ^ DCSI1 ( C ? E [ j X ( n ; m ) ? X j k k k k k k k Rtarget s

k=1

k=1

(3.1)

See Table 1 for notations. To perform the optimization above, we need to construct the subband distortion functions, dk (nk ; mk )'s. Since each subband has integer number of quantization layers and there are integer number of channel protection levels, distortion functions are discrete. For a given distribution of channel bits in a subband, the total number of channel bits for that subband is:

mk =

n X k

i=1

mi;k

(3.2)

Di erent distributions of channel bits in a given subband may yield the same total number of channel bits in that subband. As a result, for given (nk ; mk ), d(nk ; mk )

6

K CSIi D( ) Xk nk mk mi;k ^ Xk (nk ; mk )

number of subbands Channel State Information for channel i signal MSE when Rs=Rc is kth subband component of original signal source bits used for kth subband channel bits used for kth subband channel bits used to protect bit i of subband k quantized and channel corrupted kth subband component of signal given nk , mk dk (nk ; mk ) distortion function of kth subband given nk , mk Table 3.1: table of notations

can take on more than one value. Since our goal to minimize distortion, we will take the smallest value of d(nk ; mk ) as the function's value at (nk ; mk ). To reduce the number of points needed to construct each distortion function, and thus to reduce the computational strains, we make the following assumptions: 1. Same level of protection will be applied to the all the bits within a quantization layer of any given subband. 2. Higher quantization layers, which constitute re nement layers, cannot have higher level of protection than lower layers. The rst assumption assumes that all source bits within a given quantization layer have the same importance. The second assumption does not a ect optimality because higher layers cannot be decoded if lower layers are not received correctly. Thus it is intuitively clear that lower layers need to be protected at least as much as the subsequent higher layers. The optimization problem is now to nd optimal allocation of source and channel bits across subbands. For given Rstarget and dk 's: min

(nk ; mk )

nPK

k=1 dk (nk ; mk )

o

K X s.t. PKk=1 nk  Rstarget mk  C ? Rstarget k=1

In the next section, we will focus on solving (3.3).

(3.3)

7

3.2 Lagrange Multipliers Instead of solving the primal problem, stated in (3.3), we will use Lagrange Multipliers and solve the dual problem instead. The dual problem will be: min

(X K

k=1

dk (nk ; mk ) + nk + mk

)

(3.4)

If there exists multipliers  and  such that the source and channel bit budgets in (3.3) are satis ed with equality, then the optimal solution to the dual problem is also the optimal solution to the primal problem [4]. The appeal of solving the dual problem instead is that the problem is now unconstrained. Moreover, for a given  and , we can nd the set fnok g and set fmok g that minimizes (3.4) by solving K separate equations individually in the form: min

(nk ; mk ) fdk (nk ; mk ) + nk + mk g

for k = 1; : : : ; K

(3.5)

The resulting source and channel rate will be called the operational source and channel rate expressed below: K K X X Rco = mok (3.6) Rso = nok k=1

k=1

The problem is to nd  and  such that the resulting (Rso; Rco) from (3.4) will be the same as our target source and channel rate (Rstarget ; C ? Rstarget ). In cases where such multipliers do not exist, then we have to settle for an approximate solution. Although solution is not optimal, the error is bounded by the theorem in Appendix .1, similar to the one stated in [4], and in practice such errors are often negligible. The distortion functions dk (nk ; mk )'s are empirically computed discrete nonlinear functions; most general bit allocation problems do involve such genre of functions. One approach is to t analytic continuous functions to these discrete functions, and to perform optimization on the continuous counterpart. In this paper, we provide a method for selecting the Lagrange multipliers without tting analytic continous functions to the distortion functions. To search for the best possible multipliers, we develop two methods in the next two sections. The rst method, described in section 3.3, converges very fast when

8 it is far from the solution, but performs poorly when near the solution. The second method described in section 3.4, converges slow when it is far from the solution, but converges to the ideal or approximate solution eciently when near it. We will then discuss a hybrid of the two solutions in the section 3.5, which provides an ecient algorithm for nding an approximate solution in nite time.

3.3 Linear Approximation of Lagrangian Functions We will begin with the description of the algorithm in section 3.3.1. Then we will discuss the diculties with the algorithm in section 3.3.2. See Appendix .2 for proof of the algorithm.

3.3.1 Algorithm The rst method uses two properties of the rate functions to search for the multipliers: i) rate function Rso (Rco ) is more sensitive to changes in its primary multiplier  (), ii) rate functions are convex, non-increasing functions, and are inversely proportional to their multipliers. The rst property is based on empirical observations; the second property is a theoretical result of Lagrange Multipliers. With these properties, we can approximate operational source rate as function of source rate weight parameter  only, (likewise for channel rate Rco and channel rate weight parameter ):

Rso = As 1 + Bs Rco = Ac 1 + Bc (3.7) where As and Bs are chosen constants to t empirical data points, ( 1 ; Rso)'s, to a linear function; similarly for Ac and Bc. The algorithm iteratively selects two points o ), and constructs a linear function in the form of (3.7). Given , ( 10 ; Rs;o 0) and ( 1t ; Rs;t the linear function, the algorithm computes the multiplier, , that will yield the target rate, Rstarget . Similar procedure is performed to nd  targeting C ? Rstarget .

9 Using the computed multipliers, it nds the corresponding operational rate by solving (3.4). This gives another data point, and the algorithm repeats. The details of the algorithm are:

Step 0 Initialize iteration index: t := 0. Start with initial guess for multipliers, 0 , 0 . Step 1 Find corresponding operational Rs;o 0, Rc;o 0. If the condition: Rs;0 0  Rstarget does not hold, then:

0 := 1 0 s

Rc;0 0  C ? Rstarget 0 := 1 0 c

(3.8) (3.9)

where s ; c < 1. Repeat step 1. This is point a in Figure 3.1. Otherwise, update multipliers: 1 := s0 1 := c0 (3.10)

Step 2 Obtain Rs;0 1 and Rc;0 1. If the condition: Rs;0 1  Rstarget does not hold, then:

1 := s1

Rc;0 1  C ? Rstarget 1 := c1

(3.11) (3.12)

Repeat Step 2. This is point b in Figure 3.1. Otherwise, let t:=1.

Step 3 Notice now: o ] Rstarget 2 [Rs;o 0 ; Rs;t

o ] C ? Rstarget 2 [Rc;o 0 ; Rc;t

(3.13)

o ) and (0 ; Ro ), nd As and Bs that From the two sets of source data points, (t ; Rs;t s;0 connect these points in the form of (3.7). This is the dotted line that connects point a and b in Figure 3.1. Similarly, using the two corresponding sets of channel data points, nd Ac and Bc .

Step 4 For given As and Bs, nd t+1 that yields Rstarget . This is 12 in Figure 3.1 in the o o rst iteration. Similarly, nd t+1 that yields C ? Rstarget . Get Rs;t +1 and Rc;t+1 by using t+1 and t+1 as multipliers to solve (3.4). This is point c in Figure 3.1.

10 o

Rs o

b

c

Rs,1 o Rs,2 target Rs

optimal point

o

Rs,0

a

1/λ0

1/λ3

1/λ2

1/λ1

1/λ

Figure 3.1: Example of linear Lagrange Multipliers search

Step 5 Check exit conditions: target target o if jRs;t +1 ? Rs j  s Rs ; target target o and jRc;t +1 ? (C ? Rs )j  c (C ? Rs ); exit else

where s; c < 1.

t := t + 1

goto step 3

3.3.2 Diculties of Algorithm In practice, two problems prevent the algorithm from reaching the optimal solution. The rst one is that the assumption of the operational source (channel) rate, Rso (Rco) is function of its primary multiplier only,  (), is only a local approximation. Rso can be approximated as function of  only when changes in  is small. Throughout the procession of the algorithm, however, it is inevitable that  will change. One remedy is to restart the algorithm every N iterations with N and N as initial starting multipliers. This way, (0; Rs;o 0) is updated with current values of . Similar procedure is performed to update (0; Rc;o 0). The other more serious drawback is that the source and channel rate are in practice discrete rather than continuous functions with respect to their multipliers. As the algorithm approaches the optimal solution (; ), the discontinuity of the discrete source and channel functions may create diculties for it. We address this problem with an improved algorithm presented below.

11 d1(n1) + λ1 n1

d2(n) +λ1 n2

d2min

d1min

λ1

n1

λ1

n2

n1*

a)

n2*

b)

Figure 3.2: Example of 1-D Lagrange multiplier problem

3.4 Generalized Gersho-Shoham Algorithm Our proposed algorithm is a generalization of Gersho-Shoham Integer Programming algorithm [4] which yields optimal or near-optimal solutions to the source bit allocation problem mentioned in Section 2. We will begin with a brief review of Gersho-Shoham algorithm in section 3.4.1. In section 3.4.2, we will de ne our generalized algorithm. See Appendix .3 for proof of the algorithm.

3.4.1 Gersho-Shoham Algorithm Gersho-Shoham Algorithm is an integer programming algorithm for the source bit allocation problem. The algorithm solves the dual of the optimization problem by nding the best possible Lagrange Multiplier; the algorithm always terminates with an optimal solution or an error-bounded approximate solution. Unlike previous bit allocation algorithms, this algorithm addresses a more general class of problems because it does not t the subband distortion curves to continous analytic functions, nor does it make any assumptions about the nature of the distortion curves such as convexity. It is important to understand the geometrical interpretation of Lagrange multiplier problems. (3.14) can be viewed as minimizing the sum of distortions of all subbands plus a penalty function Q() = PKk=1 nk .

12 d(n) + λ1 n

d(n)

d(n) + λ2n

dmin λ1

dmin

n1

n2

λ1

n

n1

a)

n

n2

b)

λ2

n n2

n3

c)

Figure 3.3: Example of 1-D Singular Value Multipliers min

(X K k=1

dk (nk ) + nk

)

(3.14)

The penalty function translates into adding a penalty line of slope  to each of the K distortion functions. Figure 3.2 shows that for a given multiplier 1 , we found minima in subband 1 and 2 to be n1 and n2 . If these two are the only subbands, then the operational source rate for this multiplier 1 is Rso = n1  + n2 . If Rso is smaller than our budget rate Rs, then we decrease the multiplier value to decrease the e ect of the penalty function. Geometrically, that would mean decreasing the slope of the penalty lines. Gersho-Shoahm Algorithm addresses the problem of exactly how the multiplier, or the slope of the penalty lines, should be adjusted. The crux of the algorithm is the notion of singular values { a special set of multiplier values such that the optimal set of solutions is non-unique. Geometrically, such multipliers create a slope on the subband distortion curves such that a pair of adjacent convex-hull data points on a particular distortion curve are simultaneously minimum. Figure 3.3a shows the original distortion function with respect to source bits. To make n1 and n2 simultaneously minimum, we rst nd the slope 1 = ? d(nn11)??nd(2n2 ) . Figure 3.3b shows the e ect of adding a line of slope 1 to d(n) { such that there are now two minima in the distortion function. This implies that a singular value, such as 1, has more than one operational source rate. An important property of singular values is that neighboring singular values

13 always share one solution. Figure 3.3c shows an adjacent singular value to one in 3.3b, and they share one solution, namely n2 . Another property is that there can be no additional solutions in between the neighboring singular values. We see that as we decrease 1 to 2, the only possible minimum is n2, which is the common solution for the two adjacent singular values. Another interpretation of this property is that the set of non-singular multipliers in between two neighboring singular multipliers does not yield any more solutions that is not covered by the two singular multipliers. Therefore, the set of singular values leads to the entire set of solutions to the dual problem in (3.4) for all possible values of multipliers. We can make the following important conclusion: Instead of sweeping the multiplier value  from zero to in nity continuously in search of an operational rate that is close to our target rate, it is sucient to look at the singular values only, since they alone lead to all possible solutions anyway. Figure 3.4 shows the operational rate as a function of multiplier . Notice the singular values, , 2, 1 etc, each have multiplier solutions, denoted by circles. Notice also that non-singular valued multipliers do not lead to solutions that is not already covered by the singular values. Since operational source rate is monotonically non-increasing with respect to the multiplier, at any given point on the operational rate plot, we only need to search neighboring singular values iteratively in the direction towards our target rate instead of traversing all of them. Geometrically, to go to a neighboring multiplier, we are decreasing (increasing) the slope of the penalty lines gradually in all subbands until non-unique solutions appear in a subband. With the new non-unique solutions in hand, we compute the new operational source rates and check against the target. In Figure 3.4, we start at multiplier 1, then move to neighboring 2 , then to neighboring ; each time we drive our operational rate closer to our target rate Rs. Upon reaching , we notice that the two associated operational rates encloses our target rate. These are the closest solutions we can nd by solving the dual; we will settle with the solution set corresponding to Rs;1 to be our approximate solution [4].

14 o

Rs

Rs2 Rs* Rs1

λ∗

λ2

λ1

λ

Figure 3.4: Operational Source Rate vs. Multiplier value

3.4.2 Generalized Gersho-Shoham Algorithm Our algorithm extends Gersho-Shoham Algorithm to another dimension. Similar to the 1-D case, the multiplier problem in the form of (3.4) can be viewed as minimizing the sum of distortions of all subbands plus a penalty function Q(; ) = PK n + m . With the added dimension, the penalty function now translates into k k=1 k adding elevated penalty planes in n and m axes to all subbands with slopes  and  respectively. Our goal is to iteratively make adjustments to the slopes of the penalty planes such that our operational rate pair (Rso; Rco) converges to our target rate pair (Rstarget ; C ? Rstarget ). Similarly, we would like to make these adjustments using singular values. The notion of singular values for three dimensions, however, is slightly more complicated and needs to be explained further. Singular Values A non-singular value set of multipliers, (; ), similar to the 1-D case, implies that there are corresponding sets, fnok g and fmok g, that uniquely minimizes (3.4). If the set of multipliers is singular, then there is at least one subband, called the pivot subband, where there is more than one point in the subband that minimizes it. For example, there are two points, xo;p 1 and xo;p 2, that are co-minimum of subband p:

9p 2 f1; :": : ; K g min

s.t.

#

xo;p 1

= (no;p 1; mo;p 1 )

= arg (np; mp) fdp(np; mp) + np + mpg

xo;p 2

= (no;p 2; mo;p 2 )

min (np; mp) fdp(np; mp) + np + mpg

= arg

"

#

15

xo;p 1 6= xo;p 2

(3.15)

Points that satisfy (3.15), xo;p 1 and xo;p 2 in the example, are each called pivot point. If the multiplier pair yields only one pivot subband, and it has only two pivot points, we denote the case as two-point pivoting. Note that we have two degrees of freedom in choosing the multiplier pair, namely  and . To satisfy (3.15) for two pivot points in a subband, we essentially have one equation, and only one degree of freedom is needed to satisfy it. To exploit the additional degree of freedom in selecting our multiplier pair, we have two alternative. First, we can nd another point within the subband such that together with the original two pivot points, we have three pivot points that are simultaneously minimum for a multiplier pair. We call the case triangular. This implies that we can always move from a two-point pivoting case to a triangular case by using up the remaining degree of freedom in multiplier selection. Second, we can use up the remaining degree of freedom by nding two pivot points in a di erent subband. We again have two equations, each in the form of (3.15). We denote the case as quadrangular. We can move from a two-point pivoting case to a quadrangular case by using up the remaining degree of freedom. Geometrically, a two-point pivoting case means adjusting slopes of the penalty planes in the n and m axes of all subbands such that there are two minimum points in a subband. Figure 3.5a shows that originally there is one unique minimum in subband i. In Figure 3.5b, we create slopes in the two axes such that there are two minima. As indicated by the arrow, we can continue to decrease the tilt of the plane surface by changing  and  while keeping these two points minimum. In Figure 3.5c, we reach a triangular case; there are three pivot points that are simultaneously minimum for subband i. In Figure 3.5d, we reach the other alternative; instead of nding a third pivot point in subband i, we nd two pivot points in another subband j. Note also that if the case is two-point pivoting, then the resulting operational

16 d i(n,m)

d i(n,m)

d i(n,m) n

n

n

d1

dimin

dimin

dimin

µ

a)

b)

m

λ

µ

λ

c)

m

d i(n,m)

m

d j(n,m) n

n dimin

djmin

λ

µ

m

µ

λ

m

d)

Figure 3.5: Geometrical interpretation of Singular Values in 2-D source and channel rate are in 2 pairs:

Rso;1

=

no;p 1 +

Rso;2 = no;p 2 +

K X k=1;k6=p K X k=1;k6=p

nok

Rco;1

= mo;p 1 +

nok

Rco;2 = mo;p 2 +

K X k=1;k6=p K X k=1;k6=p

mok mok

(3.16)

where p denotes the index of the pivot subband. Similarly, if the case is triangular or quadrangular, there will be three or four corresponding pairs of (Rso; Rco) respectively. Similar to the 1-D case, singular values lead to all possible solutions to the dual. Therefore, instead of sweeping multiplier values set (; ) continuously for all possible values, it is sucient just to look at the singular values. Instead of considering all singular values, however, we will only need to step through a sequence of singular values such that we iteratively approach the best possible solution. Lines and Regions of Eligibility We will now introduce the notion of line of eligibility and region of eligibility for the Rs -Rc plane and the nk -mk plane of any subband k. Suppose we are in a two-point pivoting case with singular value pair (o; o), resulting in two pivot points, xo;p 1 and xo;p 2 in pivot subband p, and unique

17 np

Rs

nk

X

xο,1 p

ο,1

np

ο,1

xο,2

ο,2

nk

np

ο,1

ο,2

mp

m

a)

mp

X1

Rc

xkο

ο

p

target

X2

ο,2

Rs

mk

ο

mk

b)

ο,1

ο,2

Rc

Rc

c)

Figure 3.6: Line and Region of Eligibility for GGS Algorithm optimal points fxok g for non-pivot subbands. For the Rs-Rc plane, line of eligibility is the line passing through operational rate points (Rso;1; Rco;1) and (Rso;2; Rco;2). The line divides the plane into two regions. The region that contains the target rate pair (Rstarget ; C ? Rstarget ) is the region of eligibility. This is illustrated in Figure 3.6c, where X 1 and X 2 are operational rate pairs and X target is the target rate pair. We can draw corresponding lines of eligibility on nk -mk planes of each individual subband k as well. We do so in such a way that for each subband, the line will go through the currently optimal point(s) with the same slope as the line on Rs-Rc plane. We can also identify the regions of eligibility in nk -mk planes as the same corresponding side of the line as the one in Rs-Rc plane. Figure 3.6b illustrates this for non-pivot subband k. Notice it goes through the optimal point xok , with the arrows indicating the corresponding region of eligibility. Figure 3.6a illustrates this for pivot subband p. By de nition of operational rate pairs for two-point pivoting in (3.16), a line with the same slope as line in Rs-Rc plane and going through one optimal point xo;p 1 will also go through the other optimal point xo;p 2 . The arrows indicates the side of the line that is the region of eligibility of subband p. Description of Algorithm Having de ned the above, we can now sketch the outline of our algorithm in words as follows: 1. Start with initial multiplier o and o that yield a two-point pivoting subband p. This uses up one degree of freedom. 2. Use the remaining degree of freedom by searching through region of eligibility of

Rc

18 d (n,m)

dp(n,m)

ip

n di

z

d min p

le

l l

m

a)

z

o

zp

zp proj

b)

Figure 3.7: Change of Basis for GGS Algorithm all subbands to nd either (a) the next pivot point in the 2-point pivoting subband (triangular), or, (b) the two next pivot points in a new subband (quadrangular). 3. Use pivot selection scheme of section 3.4.4 to choose two pivots points out of the three pivot points in triangular case, or the four pivot points in quadrangular case. 4. Repeat step 2 and 3 until we get suciently close to (Rstarget ; C ? Rstarget ) in the Rs -Rc plane.

The essence of the algorithm is to choose the new pivot point(s) in step 2 in such a way that the resulting operational rates from the new pivot point(s) together with those of the old two points enclose the target rate (Rstarget ; C ? Rstarget ). For this to happen, the operational rate pair for the new pivot point(s) in the Rs-Rc plane must be in the region of eligibility. This means that for each subband, to search for the next potential pivot point, we only need to search among the points that are in the region of eligibility. Finding Candidate Pivot Points Geometrically, the next pivot point is the rst point that, as the tilt of the planes is being decreased (increased) gradually, becomes co-minimum of its subband together with the original minimum point(s): xo;p 1 and xo;p 2 , in the case of pivot subband (Triangular), or, xok , in the case of nonpivot subband (Quadrangular). Keep in mind that we are doing so while keeping

19 the pivot points co-minimum in the pivot subband, thus we are decreasing the tilt in one dimension only. This point is illustrated in Figure 3.7a. We rst de ne a tilted distortion function for each subband k as:

dk (nk ; mk ) = onk + omk

(3.17)

where (o; o) are the pivoting singular value pair. Note that by de nition of pivoting, the two pivot points evaluate to the same value in the pivot subband p. In Figure 3.7a, le denotes the line of eligibility of this subband. Suppose we perform a change of basis to l-z axes, as shown in Figure 3.7a, such that l is parallel to line le and z is perpendicular to it. Consider the set of all the planes passing through the l axis, each having a di erent tilt angle with respect to the l-z plane. Since le is parallel to l in all these planes, the two pivot points will evaluate to the same value no matter how large the tilt of the plane is. Therefore, if we search for the next pivot point by changing the tilt of the plane passing through l axis, we will be doing so while keeping the original pivot points co-minimum. To nd out which point of which subband will be the next pivot point mathematically, we do the following. For each subband k, we elect a candidate pivot point xk as follows. For each point in the region of eligibility, we nd out how much the plane passing through l axis must tilt for that point to become a minimum. The point that requires the minimum tilt in order to become minimum is the rst point that will become co-minimum of the subband. In the new coordinate system, nding the minimum tilt point is equivalent to nding the minimum slope point in the 1D case. As a example, Figure 3.7b shows the 2D view of the tilted distortion function of pivot subband p in the new coordinate system; l axis is pointing out of the page, and z axis is pointing along the page. We rst observe that the two pivot points are on top of each other: they evaluate to the same value, dmin p , and they have the same z coordinate, zpo . Note also the region of eligibility in this coordinate system is the set of points whose z coordinate value, zp, is larger then the minimum points' zpo . To nd the point with the minimum slope, we evaluate the slopes of all points zp > zpo ,

20

xok = (nok ; mok ) (Rso ; Rco) o; 1 xp = (no;p 1; mo;p 1) (Rso;1 ; Rco;1) xk = (nk ; mk ) Ck dk (nk ; mk )

optimal point for subband k, given multiplier set (; ) operational source-channel pair, given multiplier set (; ) pivot point 1 in pivot subband p operational source-channel pair 1 using pivot point 1 as subband minimum, and multiplier set (; ) candidate pivot point of subband k region of eligibility of subband k = dk (nk ; mk ) + nk + mk ; tilted distortion function of kth subband

Table 3.2: table of notations for GGS Algorithm where slope is:

dp(lp; zp) ? dmin (3.18) slope = jz ? zo j p p p Note that in the original coordinate system, jzp ? zpo j is the projected distance of the point to the line of eligibility. The point with the minimum slope is the candidate pivot point of the subband. For non-pivotal subbands, the same procedure applies. To choose a pivot point among the K candidates, we pick the point that requires the smallest tilt. This corresponds to the rst point that would become co-minimum if we actually change the tilt of the plane passing through l axis gradually as mentioned before. Selecting Pivot Points If the new pivot point is found in the pivot subband, then we have a triangular case. We now have three operational rate pair, points X 1, X 2 and X  , as seen in Figure 3.8a. If the new pivot point is found in a subband other than the pivot subband, then the original minimum point of that subband becomes a pivot point as well, and we have a quadrangular case. We have four operational rate pairs, points X 1, X 2, X ;1 and X ;2,as seen in Figure 3.8b. In the triangular case, a pivot point selection method picks two of three pivot points in the pivot subband as new pivot points. In the quadrangular case, it picks one of two pivot subbands, each of which contains two pivot points, as the new pivot subband. The planes tilt again in the direction of the target pair using the new pivots, and the process continues. The algorithm stops when an enclosed area, whose corners are denoted by the operational points of the pivots, includes our target rate; this indicates that we have reached the

21 X∗,1

Rs

Rs X

X*



target

Xtarget

∗,2

Rc

∗,2

Rc 1

X

1

X

X

2

2

X

X

ο,2

ο,2

Rs

Rs



Rc

ο,2

Rc

a) Triangular Pivot Change

Rc

ο,2

Rc

Rc

∗,2

Rc

b) Quadrangular Pivot Change

Figure 3.8: Example of Pivot change under Triangular and Quadrangular closest feasible points to the target. Details of Algorithm The details of the algorithm are (see Table 2 for notations): 1. Multiplier Initialization: Initialize multipliers o and o that yields a two-point pivoting case, say in subband p (soon to be discussed). Find the sets fnok g and fmok g that minimizes (3.4), and the corresponding operational source and channel pairs, denote X 1 = (Rso;1 ; Rco;1 ), X 2 = (Rso;2 ; Rco;2 ). 2. Pivot Adjustment: For each subband k, introduce a tilted distortion follows: dk (nk ; mk ) = dk (nk ; mk ) + o nk + omk

function

as

(3.19)

In the pivot subband, the tilted distortion function evaluated at the two pivot points is the same. 3. De nition of Eligibility Region: Let the slope of eligibility line, shown in Figure 3.6b, be: o;2 o;1 (3.20) M = Rso;2 ? Rso;1

Rc ? R c This is the slope of the line through points X 1 and X 2 on the Rso -Rco plane. For each subband k , construct a line of eligibility, nk = Mmk + b, that goes through the

22 optimal point(s) xok = (nok ; mok ). Identify a region of eligibility, Ck , in each subband k that corresponds to the region of eligibility in the Rso-Rco plane. 4. Identi cation of Candidate Pivot Point: For each subband k, nd a point xk such that: " min k (nk ; mk ) ? dk (nok ; mok ) # d    (3.21) xk = (nk ; mk ) = arg (nk ; mk ) 2 Ck jproj (n ; m ) on line of elig.j k k Among these K points from K subbands, nd one that yields the minimum value in  k ;mk )?dk (nok ;mok ) . (3.21), as in the one that yields smallest value in expression: jproj d(kn(kn;m k ) on line of elig.j Call it x . 5. Pivot Selection: If x is in the pivot subband, then it is a triangular case. De ne a point X  as :

X  = (R ; R ) = (n + s

c

p

K X

k=1;k6=p

nok ;

m + p

K X

k=1;k6=p

mok )

(3.22)

where p denotes the index of the pivot subband. We choose two of three points among the set fX 1 ; X 2 ; X  g in the Rso -Rco plane, and specify the new region of eligibility, based on the pivot point selection method (to be discussed). This selection method guarantees that the new cluster of points makes progress towards the target. If x is not in the original pivot subband, then it is a quadrangular case. De ne points X ;1 and X ;2 as:

X ;1 X ;2

0 1 K K X X mok A nok ; mo;p 1 + mj + = (Rs;1 ; Rc;1 ) = @no;p 1 + nj + k=1;k6=p;j k=1;k6=p;j 0 1 K K X X A mok(3.23) nok ; mo;p 2 + mj + = (Rs;2 ; Rc;2 ) = @no;p 2 + nj + k=1;k6=p;j

k=1;k6=p;j

where p is the index of the original pivot subband, and j is the index of the new pivot subband. We invoke the pivot point selection method, which returns two points in the set fX 1 ; X 2 ; X ;1 ; X ;2 g and speci es the region of eligibility. The algorithm stops if pivot point selection algorithm signals termination. 6. Multiplier Re-initialization: For the newly selected pivot points in the pivot subband, nd any multiplier pairs (o ; o ) such that the resulting tilted distortion function for the pivot subband evaluated at the pivot points will have the same value. Goto step 2.

23

3.4.3 Initialization For step 1 of the algorithm, the goal is to initialize multipliers o and o such that it results in a two-point pivoting case. We will assume we already have optimal sets fnok g and fmok g (optimal solution to (3.4)) for a given non-singular value multiplier pair (l; l ). A simple method is the following: rst de ne a tilted function for each subband as: dk (nk ; mk ) = dk (nk ; mk ) + l mk (3.24) For each subband, nd xk such that: " min  o ; mo ) #  d ( n ; m ) ? d ( n k k k k k k (3.25) xk = (nk ; mk ) = arg nk > nok o nk ? nk Geometrically, this is rst point in subband k that will become co-minimum with the present minimum pk if the slope  of the penalty plane on the n axis is gradually changed. Among these K points, nd one that minimizes (3.25), as in the one that   o o minimizes the expression: dk (nk ;mnkk)??ndkok(nk ;mk ) . Call it x . This is the rst point in all subbands that will become co-minimum if the slope  is gradually changed. The   o o subband of x is the pivot subband. Let ? = dk (nk ;mnkk)??dnkok(nk ;mk ) evaluated at p . The pivoting multiplier values are: (o ; o) := ( ; l ).

3.4.4 Pivot Point Selection Method In either the triangular or quadrangular case, we must select two points in the Rso-Rco plane out of three or four possible points for step 5 of the algorithm. The selection algorithm must select pivots in such a way that it ensures the algorithm will converge to the best possible solution. To do so, the algorithm must make progress, i.e. the 2 operational source-channel pairs will yield a cluster of points in the next step that is closer to the target rate pair than the previous one. We will begin with a few necessary de nitions. De ne pivot line segment, labeled a in Figure 3.9, as the line segment between two current pivot rate-pairs. De ne distance line segment, labeled b in Figure 3.9, as the line segment that minimizes the distance between the target and the pivot line segment. The selection procedure is as follows:

24 Rs X X*

∗,1

Rs

target

X

a) Triangular

target

b

X1 a

X

∗,2

X

1

a’

X

X2

a’ a

b

2

X

Rc

b) Quadrangular

Rc

Figure 3.9: Pivot Point Selection Algorithm

 Triangular: To avoid stagnation, the new pivot, X  must be one of the pivots

selected. To choose between the two original pivots, X 1 and X 2 , we select one that yields a new pivot line segment that crosses the current distance line segment. If such point does not exist, then we select the pivot that yields a pivot line segment that touches the current distance line segment. In Figure 3.9a, by selecting pivot points X  and X 2, the new pivot line segment, a0, crosses the current distance line segment, b.

 Quadrangular: Again, to avoid stagnation, one of the two pivots, X ;1 and

X ;2, must be selected. Similarly, we select two pivots that yield a new pivot line segment that crosses the current distance line segment. If such pair of pivots does not exist, then we select the two pivots that yield a pivot line segment that touches the current distance line segment. In Figure 3.9b, there are no two pivots that yield a pivot line segment crossing the distance line segment b. So we select pivots X ;2 and X 2 to yield pivot line segment a0 , which touches b.

 Ending Condition: When the triangle or the parallelogram that is created

by the three or four pivot points encloses the target rate, we terminate the algorithm.

See Appendix .3 for convergence of this algorithm.

25

3.5 Hybrid Algorithm Our generalized Gersho-Shoham Algorithm has one major disadvantage that discourages its exclusive use: because it painstakingly searches through every singular multiplier pair on the search trajectory toward to the optimal solution, the algorithm is slow if the initial operational source-channel pair is very far from the optimal. To remedy this problem, we propose a hybrid algorithm that rst uses the linear approximation algorithm in section 3.3 to estimate the optimal solution, then uses the Generalized Gersho-Shoham Algorithm to re ne the estimate into the optimal or nearoptimal solution. The linear approximation algorithm has ecient convergence until it encounters the discreteness of the rate functions near the optimal. The generalized Gersho-Shoham algorithm performs poorly when far from the solution, but nds the optimal solution eciently when near it. Therefore the hybrid algorithm combines the advantages of both algorithms while avoiding their respective pitfalls.

26

Chapter 4 Implementation 4.1 Rate Distortion (RD) Functions In the previous sections, we assume that we know the empirical distortion function for various subbands for the given video sequence under consideration:

dk (nk ; mk ) = E [jX^k (nk ; mk ) ? X j2] (4.1) The actual rate distortion functions depend on the particular implementation of the source codec, channel codec, and the channel conditions. We will show how we arrive at our particular RD functions for our implementation. We can rst expand the expected distortion of subband k as the sum of conditional distortions for a collection of error events. According to the Total Probability Theorem, the events in the collection are disjoint, and they collectively span the sample space. nk X 2 ^ E [jXk (nk ; mk ) ? X j ] = E [jX^k (nk ; mk ) ? X j2 j e01;k ; : : : ; e0i?1;k ei;k ]P (e01;k ; : : : ; e0i?1;k ei;k ) i=1

+E [jX^k (nk ; mk ) ? X j2 j e01;k ; : : : ; e0nk ;k ]P (e01;k ; : : : ; e0nk ;k )

(4.2)

In the above expression, ei;k denotes the event that bit i of subband k is received incorrectly, and e0i;k denotes the corresponding complement event. We will assume the

27 usage of conditional arithmetic coding in the coding of subband coecients. Since conditional arithmetic coding is a variable length code, it is a good approximation to assume that if bit i is corrupted, all bits in the remaining codeword are rendered useless due to loss of synchronization. So we can assume that the resulting error when bit i and some other bits after bit i are ipped is approximately equivalent to the resulting error when only bit i is ipped. We will now de ne two functions to ease our notations. Let fk (i) be the resulting distortion function of subband k if only bit i is ipped. This function is approximately equal to the rate-distortion function using i ? 1 source bits under noiseless conditions. To obtain this function, we experimentally inject an error at bit i of the corresponding subband, and average out the error over 200 frames to get an approximate value. Let g(m) be the resulting error probability of a source bit if, on average, m channel bits are used to protected it. This function will obviously depend heavily in the particular implementation of the channel codec and the channel condition. In our experiment, we use RCPC for our unequal error protection codec. Since RCPC is a convolutional code, individual bit error can be bounded using [5]: 1 X 1 cd(m)Pd (4.3) g(m)  P d=dfree where P is the puncturing period, Pd is the probability that the wrong path at distance d is selected, and cd is the distance spectra. Depending on how many bits we used to channel code the source bits, cd will be di erent. For a binary symmetric channel (BSC), Pd is simply Pd = Ped, Pe being the cross over probability. For an additive white Gaussian channel, Pd is: s 1 s (4.4) Pd = 2 erfc dE N0 where NEs0 is the signal-to-noise ratio of the channel. We note that although the error probability in (4.3) is only an upper bound, we found that experimentally the bound is quite tight, especially in poor channel conditions. With the above two functions de nitions, we can write:

E [jX^k (nk ; mk ) ? X j2je01;k :::e0i?1;k ei;k ] = fk (i)

(4.5)

28 20*exp(−0.25*n−0.3*m)

15*exp(−0.2*n−0.2*m)

12 12

10 10

8

d1

d2

8 6

6 4

4

2 2

0 0

0 0

2

2

10 4

4

8

10

6

6

6

8

4

8

8

2 10

0

m

4 2

10

a) subband 1 distortion function n

6

0

b) subband 2 distortion function n

m

Figure 4.1: Original distortion functions iY ?1 P (e01;k :::e0i?1;k ei;k ) = g(mi;k ) [1 ? g(mj;k)] j =1

(4.6)

where g(mj;k) is the resulting error probability if mj;k channel bits are used to protect that bit. Substituting into the previous equation:

dk (nk ; mk ) = E [jX^k (nk ; mk ) ? X j2] =

n X k

i=1

iY ?1 g(mi;k )fk (i) [1 ? g(mj;k)]

+fk (nk + 1)

nk Y

j =1

j =1

[1 ? g(mj;k )]

(4.7)

4.2 Results To demonstrate the e ectiveness of the Generalized Gersho-Shoham algorithm, we construct Figure 4.1. The two plots represent two subband distortion functions as functions of source bits n and channel bits m. Our goal is to to minimize the sum of the two distortions in the form of (3.3), such that the sum of source bits and sum of channel bits do not exceed (Rs; Rc) = (6; 7). Table 3 shows the operational sourcechannel rate pairs we obtain in each iteration. For each iteration, Figure 4.2 maps

29 iteration (Rso;1; Rco;1) (Rso;2; Rco;2) (Rs;1; Rc;1) (Rs;2; Rc;2) 1 (0,6) (1,5) (0,7) 2 (0,7) (1,5) (2,4) 3 (0,7) (2,4) (1,6) 4 (1,6) (2,4) (2,5) 5 (2,5) (2,4) (3,4) 6 (2,5) (3,4) (2,6) (3,5) 7 (3,4) (3,5) (4,4) (4,5) 8 (3,5) (4,5) (0,8) 9 (4,5) (0,8) (1,8) 10 (4,5) (1,8) (0,10) 11 (4,5) (0,10) (5,5) 12 (0,10) (5,5) (0,11) (5,6) 13 (0,11) (5,6) (0,12) 14 (5,6) (0,12) (6,6) 15 (0,12) (6,6) (0,13) (6,7) Table 4.1: Operational Source-Channel Rate Pairs for GGS Algorithm the corresponding three or four rate pairs onto the Rs-Rc plane. If the algorithm yields a triangular case for one iteration, Figure 4.2 draws a triangle whose corners are denoted by the three rate pairs. If it yields a quadrangular case, Figure 4.2 draws a quadrangle whose corners are denoted by the four rate pairs. The target rate pair is denoted by * symbol in Figure 4.2. We see in Figure 4.2a that iteration 1 yields a triangular case. When we move to iteration 2 (another triangular case), we found rate pair (2; 4), which is in the direction of the target rate. Iteration 6 is a quadrangular case, denoted by the parallelogram 6. We see that the next cluster of rate pairs are closer to the target pair that the previous. In Figure 4.2b, we see that the algorithm terminates after 15 iterations. In this case, we found the optimal solution. Now to test the overall algorithm numerically, we combined the 3D scalable video codec and Rate-Compatible Punctured Convolutional Codes [5] to build our proposed joint source/channel codec. For source coding, we used 3 levels of spatial and 2 levels of temporal subband decomposition. We used 200 frames of the digitized video \Raiders of the Lost Ark" to compute the distortion functions, and applied our bit allocation strategy to search for the optimal source to channel coding ratio Rso=Rco

30 Generalized GS Algorithm: iteration 1−15

Generalized GS Algorithm: iteration 1−6 6

6

5

5

4

4 14

Rs

Rs

7 3

3

11 12

5

5

8

2

2

4

4

1

2

1

3

4

4.5

5

a) iteration 1-6

5.5 Rc

6

9

2

10

13

3

1

1 0

15

6

6

6.5

7

0

4

5

6

7

b) iteration 1-15

8

9

10

11

12

13

Rc

Figure 4.2: Procession of Generalized Gersho-Shoham Algorithm for various CSI ranging from 0.001 to 0.05. The total bit budget is 250kbits=s. We see in Figure 4.3a that there exists a unique distortion minimum for various Pe. We observe that for this particular source and channel codec, as the error probability Pe increases, the total number of quantization layers selected decreases; this is due to the decrease of optimal source to channel ratio as the channel condition worsens. From Pe = 0 to Pe = 0:001, the layers that are dropped are mostly high frequency layers; this is due to the low error sensitivities of high frequency components. In poor channel conditions, Pe  0:01, the number of layers of low frequency subbands is reduced, resulting in a more uniform distribution of quantization layers among the subbands. This is because higher quantization layers are useless unless all the preceding lower layers are received error-free. Therefore a subband with too many quantization layers will render the higher layers frequently futile in poor channel condition. To show that our optimization strategy is essential in poor channel condition, Pe = 0:05, we compare its performance with other codecs in Figure 12b. Curve a in Figure 12b, shows the PSNR of the scalable codec under ideal noiseless conditions for 100 frames. The average PSNR in this case is 31.8 dB. Curve b in Figure 12b shows the PSNR of our proposed optimized codec operating at the optimal Rs=Rc = 0:6,

31 Distortion vs. Source to Channel Ratio (250kbits/s)

PSNR vs. frame number (250kbits/s)

300

35 a

Pe=0.05 Pe=0.01 250

30

Pe=0.005

b

Pe=0.001 25

c

MSE

PSNR

200 20

150 15

d

100 10

50 0

1

2

3

4

5

a) MSE vs. Rs=Rc for dif CSI

6

7

Rs/Rc

5 0

10

20

30

40

50 60 frame number

70

80

90

100

b) PSNR vs. Frame Number

Figure 4.3: Experimental Results with unequal error protection as described in earlier sections. The average PSNR in this case is about 4 dB lower than the ideal noiseless case. Curve c in Figure 12b shows the performance of a codec operating at the optimal Rs=Rc = 0:6, but using equal error protection. This codec distributes Rs source bits using traditional bit allocation theory that assumes a noiseless channel, then channel codes these source bits with Rc channel bits equally. As seen, the PSNR is about the same as case b for most frames, except for occasional drops of 25 dB. These drops are a direct consequence of the fact that important source bits not adequately protected. Finally curve d in Figure 12b shows the performance of the same equal error protection codec as in c but operating at non-optimal Rs=Rc = 2. As seen, the average PSNR of this codec is about 8 dB. The main conclusion to be drawn from Figure 12b is that optimal source/channel bit distribution does make a signi cant di erence in poor channel condition scenarios.

4.3 Channel Mismatch Although we assume the knowledge of the average channel, there are times when the estimate of the channel is incorrect. Using Figure 12a, we can easily determine the performance degradations in such cases. For example, to nd the approximate performance of the joint source/channel codec assuming Pe = 0:01 but operating at

32

Pe = 0:05, we locate the point Rso=Rco for Pe = 0:01 on the curve D0:05 (Rs=Rc). If the signal is being multicasted to di erent users, each with di erent channel conditions, then considering only the average channel may not be optimal. In this case, CSI may be expressed in the form of a probability distribution function; the proportion of users in channel state Pe0 will be incorporated as the likelihood of the channel: P (CSI = Pe0). Our proposed approach can now be used to nd the optimum operating point: 8 39 2 < = min X Rso =Rco = arg :Rs=Rc 4 DPe 0(Rs =Rc)P (CSI = Pe0)5; (4.8) Pe 0? where ? is the set of all possible CSI's. In this case, our joint source/channel coding approach has better adaptation potential to the channel than previous schemes [2].


Chapter 5 Conclusions

In this paper, we have presented a methodology for optimally allocating source and channel bits for video transmission over noisy channels. In particular, we have presented an optimal bit allocation strategy that is efficient and yields near-optimal solutions. Our development of the theory shows that our solution is very close to optimal, and our results prove that in poor channel conditions, an optimal bit allocation scheme is essential to maintaining good visual quality. Although we have discussed our algorithm in the context of video, we believe our strategy can conceivably be applied to other delay-sensitive traffic transmitted through unreliable channels, such as images and speech.

.1 Lagrange Multiplier Error Bound

Theorem 0: Let the resulting distortion from the approximate solution be

$$D(\{n_k^o\}, \{m_k^o\}) = \sum_{k=1}^{K} d_k(n_k^o, m_k^o) \qquad (.1)$$

Let the resulting distortion from the ideal solution be $D(\{n_k\}, \{m_k\})$. Additionally, let $D(\{n_k^1\}, \{m_k^1\})$ and $D(\{n_k^2\}, \{m_k^2\})$ be the distortions of two other solutions resulting from two different sets of multipliers, such that:

$$R_s^1 \le R_s^o, \quad R_s \le R_s^2; \qquad R_c^1 \le R_c^o, \quad R_c \le R_c^2 \qquad (.2)$$

The error of our approximate solution can then be bounded as:

$$|D(\{n_k\}, \{m_k\}) - D(\{n_k^o\}, \{m_k^o\})| \le |D(\{n_k^1\}, \{m_k^1\}) - D(\{n_k^2\}, \{m_k^2\})| \qquad (.3)$$

See [4] for a similar proof.
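As a concrete illustration, the bound in (.3) can be evaluated directly once the bracketing solutions are available. This is a minimal hypothetical sketch; the per-subband distortion lists are assumed to be precomputed by the allocation algorithm:

    # A minimal sketch of evaluating the error bound (.3). Inputs are the
    # per-subband distortions d_k(n_k, m_k) of two bracketing solutions
    # obtained from two different multiplier sets.

    def total_distortion(per_subband):
        # Implements (.1): total distortion is the sum over the K subbands.
        return sum(per_subband)

    def suboptimality_bound(bracket1, bracket2):
        # Implements (.3): the gap between the bracketing solutions bounds
        # |D(ideal) - D(approximate)|.
        return abs(total_distortion(bracket1) - total_distortion(bracket2))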

.2 Proof of Linear Approximation Algorithm

We will show that if the rate function is a convex, decreasing, continuous function of one variable (the multiplier), then our algorithm converges to the correct multiplier value.

Definition: Assuming $R_s(\lambda)$ is a source rate function of the Lagrange multiplier $\lambda$, we define $r_s(\beta) = R_s(1/\beta)$.

Lemma 1: If $R_s(\lambda)$ is a convex, decreasing function of $\lambda$, then $r_s(\beta)$ is a concave, increasing function of $\beta$.

Proof 1: We first show that $r_s(\beta)$ is increasing. Since $R_s(\lambda)$ is decreasing,

$$R_s(\lambda) \ge R_s(\lambda + \Delta), \;\; \Delta \ge 0 \;\Rightarrow\; R_s\!\left(\tfrac{1}{\beta}\right) \le R_s\!\left(\tfrac{1}{\beta + \Delta}\right) \;\Rightarrow\; r_s(\beta) \le r_s(\beta + \Delta), \;\; \Delta \ge 0 \qquad (.4)$$

Therefore $r_s(\beta)$ is increasing. We now prove concavity. Let $u = 1/\beta$. Then

$$\frac{dr_s}{d\beta} = \frac{dr_s}{du}\frac{du}{d\beta} = \frac{dR_s}{du}\frac{du}{d\beta} \qquad (.5)$$

$$\frac{d^2 r_s}{d\beta^2} = \frac{d}{d\beta}\left(\frac{dr_s}{d\beta}\right) = \frac{d}{d\beta}\left(\frac{dR_s}{du}\frac{du}{d\beta}\right) = \frac{du}{d\beta}\frac{d^2 R_s}{du^2} + \frac{d^2 u}{d\beta^2}\frac{dR_s}{du} = \frac{-1}{\beta^2}\{\ge 0\} + \frac{2}{\beta^3}\{< 0\} \qquad (.6)$$

where the last step holds because $R_s(\lambda)$ is convex and decreasing. Since the Lagrange multiplier $\lambda$ is non-negative, the second derivative of $r_s(\beta)$ is non-positive, and $r_s(\beta)$ is concave. In other words, $R_s(1/\beta)$ is concave with respect to $\beta$. □
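Before proceeding, a quick numerical sanity check of Lemma 1 (a hypothetical sketch; the sample rate function below is an arbitrary convex, decreasing curve, not the codec's operational rate function):

    # Numerically check Lemma 1 on a sample convex, decreasing rate function
    # Rs(lam) = 1/(1+lam): rs(beta) = Rs(1/beta) should be increasing and concave.

    def Rs(lam):
        return 1.0 / (1.0 + lam)       # convex and decreasing for lam >= 0

    def rs(beta):
        return Rs(1.0 / beta)

    betas = [0.1 * i for i in range(1, 100)]
    vals = [rs(b) for b in betas]
    increasing = all(v2 >= v1 for v1, v2 in zip(vals, vals[1:]))
    # Discrete concavity: second differences should be non-positive.
    concave = all(vals[i+1] - vals[i] >= vals[i+2] - vals[i+1]
                  for i in range(len(vals) - 2))
    print(increasing, concave)   # expect: True True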

We now show, using the Global Convergence Theorem [9], that our algorithm converges to the solution. We first define a descent function:

$$Z(\beta_t) = |R_{s,t}^o - R_s^{target}| \qquad (.7)$$

We now show that $Z(\beta_t)$ is strictly decreasing for every $\beta_t$ with $R_{s,t}^o \ne R_s^{target}$:

$$Z(\beta_{t+1}) - Z(\beta_t) = |R_{s,t+1}^o - R_s^{target}| - |R_{s,t}^o - R_s^{target}| \;\stackrel{(a)}{=}\; R_{s,t+1}^o - R_{s,t}^o \;\stackrel{(b)}{=}\; \{< 0\} \qquad (.8)$$

(a) holds because $R_s^o(1/\beta)$ is concave (by Lemma 1), and therefore a linear approximation of the function using two points will always yield an answer that is larger than the optimal solution (see Figure 3.1). (b) holds because, by construction, $R_{s,t+1}^o \in (R_{s,0}^o, R_{s,t}^o)$ if $R_{s,t}^o \ne R_s^{target}$. Therefore $Z(\beta_t)$ is a strictly decreasing function. Since i) $\beta$ lies in a compact set, ii) $Z(\beta_t)$ is a proper descent function, and iii) the mapping defined by the algorithm is closed, by the Global Convergence Theorem [9] our algorithm converges to the solution set.

.3 Proof of Generalized Gersho-Shoham Algorithm

We now prove the correctness of the various stages of the algorithm with a series of lemmas. We first prove that the initialization step is correct; that is, we will show that the initialization procedure does yield a pivoting case.

Lemma 2.0: Let $(\lambda_l, \mu_l)$ be a non-singular multiplier pair with corresponding optimal sets $\{n_k^o\}$, $\{m_k^o\}$. Let $\lambda^*$ be the minimum of all $K$ values produced by (3.25). Claim: $\lambda_l > \lambda^*$.

Proof 2.0: Let $p$ denote the pivot subband from which $\lambda^*$ was derived. Define a tilted distortion function for subband $p$ as in (3.24). By definition of optimality of the pair $(n_p^o, m_p^o)$, and non-singularity of $(\lambda_l, \mu_l)$:

$$d_p(n_p^*, m_p^*) + \lambda_l n_p^* > d_p(n_p^o, m_p^o) + \lambda_l n_p^o \;\Rightarrow\; \lambda_l > -\frac{d_p(n_p^*, m_p^*) - d_p(n_p^o, m_p^o)}{n_p^* - n_p^o} = \lambda^* \qquad (.9)$$

where the last step is true because $n_p^* > n_p^o$. □
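For concreteness, the singular-value computation that Lemma 2.0 refers to can be sketched as follows. This is a hypothetical Python sketch: the per-subband point lists (n, m, d) and the current optima are assumed data structures, and we take (3.25) to denote the minimum slope toward the current optimum over the region of eligibility, with $\lambda^*$ its negation, consistent with (.9):

    # A minimal sketch of the singular-value computation behind Lemma 2.0.

    def steepest_slope(points, n_o, d_o):
        """Minimum slope (d - d_o)/(n - n_o) over points with n > n_o
        (the region of eligibility) in one subband."""
        slopes = [(d - d_o) / (n - n_o) for (n, m, d) in points if n > n_o]
        return min(slopes) if slopes else None

    def next_singular_value(subbands):
        """lambda* = -(minimum of the K per-subband values)."""
        vals = [steepest_slope(s["points"], s["n_o"], s["d_o"]) for s in subbands]
        vals = [v for v in vals if v is not None]
        return -min(vals) if vals else None

    subbands = [
        {"points": [(1, 1, 9.0), (2, 1, 5.0), (3, 2, 3.5)], "n_o": 1, "d_o": 9.0},
        {"points": [(1, 1, 7.0), (2, 2, 6.0)], "n_o": 1, "d_o": 7.0},
    ]
    print(next_singular_value(subbands))   # 4.0, achieved in the first subband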

Lemma 2.1: Given a non-singular multiplier pair $(\lambda_l, \mu_l)$ and the corresponding optimal sets $\{n_k^o\}$, $\{m_k^o\}$, suppose $x^* = x_p^*$ of (3.25) is a strict minimum in subband $p$. Then the new multiplier pair $(\lambda^*, \mu_l)$, where $\lambda^*$ is the value of (3.25) evaluated at $x^*$, results in two optimal solutions in subband $p$.

Proof 2.1: Let

$$\lambda^* = -\frac{d_p(n_p^*, m_p^*) - d_p(n_p^o, m_p^o)}{n_p^* - n_p^o}$$

We divide the points within subband $p$ into four classes:
1. $A = \{x_p^o, x_p^*\}$, the co-minimum points.
2. $B = \{x_p \mid n_p > n_p^o,\; x_p \ne x_p^*\}$, the non-minimum points in the region of eligibility.
3. $C = \{x_p \mid n_p < n_p^o\}$, points not in the region of eligibility.
4. $D = \{x_p \mid n_p = n_p^o,\; x_p \ne x_p^o\}$, points on the line of eligibility.

Class A: Using the definition of $\lambda^*$:

$$d_p(n_p^*, m_p^*) + \lambda^*(n_p^* - n_p^o) = d_p(n_p^*, m_p^*) - \frac{d_p(n_p^*, m_p^*) - d_p(n_p^o, m_p^o)}{n_p^* - n_p^o}(n_p^* - n_p^o) = d_p(n_p^o, m_p^o) \qquad (.10)$$

Therefore:

$$d_p(n_p^*, m_p^*) + \lambda^* n_p^* = d_p(n_p^o, m_p^o) + \lambda^* n_p^o \qquad (.11)$$

Class B: By the assumption of a strict minimum at $x_p^* = (n_p^*, m_p^*)$, all points $x_p \in B$ satisfy:

$$\frac{d_p(n_p, m_p) - d_p(n_p^o, m_p^o)}{n_p - n_p^o} > \frac{d_p(n_p^*, m_p^*) - d_p(n_p^o, m_p^o)}{n_p^* - n_p^o} \qquad (.12)$$

Again, using the definition of $\lambda^*$:

$$d_p(n_p, m_p) + \lambda^*(n_p - n_p^o) = d_p(n_p, m_p) - \frac{d_p(n_p^*, m_p^*) - d_p(n_p^o, m_p^o)}{n_p^* - n_p^o}(n_p - n_p^o) > d_p(n_p, m_p) - \frac{d_p(n_p, m_p) - d_p(n_p^o, m_p^o)}{n_p - n_p^o}(n_p - n_p^o) = d_p(n_p^o, m_p^o) \qquad (.13)$$

Therefore we can conclude that for any point $x_p \in B$:

$$d_p(n_p, m_p) + \lambda^* n_p > d_p(n_p^*, m_p^*) + \lambda^* n_p^* = d_p(n_p^o, m_p^o) + \lambda^* n_p^o \qquad (.14)$$

Class C: Since $(n_p^o, m_p^o)$ is originally the unique minimum pair for the multiplier pair $(\lambda_l, \mu_l)$:

$$d_p(n_p, m_p) + \lambda_l n_p + \mu_l m_p > d_p(n_p^o, m_p^o) + \lambda_l n_p^o + \mu_l m_p^o$$
$$d_p(n_p, m_p) + \lambda_l n_p > d_p(n_p^o, m_p^o) + \lambda_l n_p^o \qquad (.15)$$

By Lemma 2.0, $\lambda_l > \lambda^*$. For $n_p < n_p^o$, $(\lambda_l - \lambda^*)n_p < (\lambda_l - \lambda^*)n_p^o$. Therefore:

$$d_p(n_p, m_p) + (\lambda_l - \lambda^*)n_p^o + \lambda^* n_p > d_p(n_p^o, m_p^o) + (\lambda_l - \lambda^*)n_p + \lambda^* n_p^o$$
$$d_p(n_p, m_p) + \lambda^* n_p > d_p(n_p^o, m_p^o) + \lambda^* n_p^o \qquad (.16)$$

Class D: Again, by the definition of the minimum point $(n_p^o, m_p^o)$ for the multiplier pair $(\lambda_l, \mu_l)$:

$$d_p(n_p, m_p) + \lambda_l n_p > d_p(n_p^o, m_p^o) + \lambda_l n_p^o \qquad (.17)$$

Since $n_p = n_p^o$:

$$d_p(n_p, m_p) + \lambda_l n_p + (\lambda^* - \lambda_l)n_p > d_p(n_p^o, m_p^o) + \lambda_l n_p^o + (\lambda^* - \lambda_l)n_p^o$$
$$d_p(n_p, m_p) + \lambda^* n_p > d_p(n_p^o, m_p^o) + \lambda^* n_p^o \qquad (.18)$$

Combining the previous results, we conclude that for any point in subband $p$:

$$d_p(n_p, m_p) + \lambda^* n_p + \mu_l m_p > d_p(n_p^o, m_p^o) + \lambda^* n_p^o + \mu_l m_p^o = d_p(n_p^*, m_p^*) + \lambda^* n_p^* + \mu_l m_p^* \qquad (.19)$$

Therefore, with $(\lambda^*, \mu_l)$, we have two unique optimal pairs, $(n_p^*, m_p^*)$ and $(n_p^o, m_p^o)$, as the minimizing solutions for subband $p$. □
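For concreteness, the per-subband tilted minimization underlying Lemmas 2.0 and 2.1 can be sketched as follows (a hypothetical Python sketch; the point-list format (n, m, d) is an assumed data structure): for a multiplier pair $(\lambda, \mu)$, each subband independently minimizes $d(n, m) + \lambda n + \mu m$, and at a singular value $\lambda = \lambda^*$ two points of the pivot subband tie for the minimum.

    # A minimal sketch of the per-subband tilted minimization.

    def tilted_minima(points, lam, mu, eps=1e-9):
        """Return all operational points attaining the minimum tilted cost."""
        cost = lambda p: p[2] + lam * p[0] + mu * p[1]   # d + lam*n + mu*m
        best = min(cost(p) for p in points)
        return [p for p in points if cost(p) <= best + eps]

    # At a non-singular pair the list has one entry; at (lambda*, mu_l) the
    # pivot subband returns two entries, e.g. (n_o, m_o, d_o) and (n*, m*, d*).
    pts = [(1, 1, 9.0), (2, 1, 5.0), (3, 2, 3.5)]
    print(tilted_minima(pts, lam=4.0, mu=0.0))   # two points tie at lambda* = 4.0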

Lemma 2.2: Given a non-singular multiplier pair $(\lambda_l, \mu_l)$ and the corresponding optimal sets $\{n_k^o\}$, $\{m_k^o\}$, suppose $x^* = x_p^*$ of (3.25) is a strict minimum among the set $\{x_1^*, \ldots, x_K^*\}$. Then the new multiplier pair $(\lambda^*, \mu_l)$, where $\lambda^*$ is the value of (3.25) evaluated at $x^*$, results in the same unique optimal solution in each subband $j \ne p$ as the previous pair $(\lambda_l, \mu_l)$.

Proof 2.2: Let

$$\lambda^* = -\frac{d_p(n_p^*, m_p^*) - d_p(n_p^o, m_p^o)}{n_p^* - n_p^o}$$

Similarly, we divide the points within subband $j$ into four classes:
1. $A = \{x_j^o, x_j^*\}$, the minimum point and the candidate pivot point.
2. $B = \{x_j \mid n_j > n_j^o,\; x_j \ne x_j^*\}$, points in the region of eligibility except the candidate pivot point.
3. $C = \{x_j \mid n_j < n_j^o\}$, points not in the region of eligibility.
4. $D = \{x_j \mid n_j = n_j^o,\; x_j \ne x_j^o\}$, points on the line of eligibility.

Class A: By the assumption of a strict minimum at $x_p^*$:

$$\frac{d_j(n_j^*, m_j^*) - d_j(n_j^o, m_j^o)}{n_j^* - n_j^o} > \frac{d_p(n_p^*, m_p^*) - d_p(n_p^o, m_p^o)}{n_p^* - n_p^o} \qquad (.20)$$

$$d_j(n_j^*, m_j^*) + \lambda^*(n_j^* - n_j^o) = d_j(n_j^*, m_j^*) - \frac{d_p(n_p^*, m_p^*) - d_p(n_p^o, m_p^o)}{n_p^* - n_p^o}(n_j^* - n_j^o) > d_j(n_j^*, m_j^*) - \frac{d_j(n_j^*, m_j^*) - d_j(n_j^o, m_j^o)}{n_j^* - n_j^o}(n_j^* - n_j^o) = d_j(n_j^o, m_j^o) \qquad (.21)$$

Therefore:

$$d_j(n_j^*, m_j^*) + \lambda^* n_j^* > d_j(n_j^o, m_j^o) + \lambda^* n_j^o \qquad (.22)$$

Class B: By the definition of the minimum $x_j^* = (n_j^*, m_j^*)$, all points $x_j \in B$ satisfy:

$$\frac{d_j(n_j, m_j) - d_j(n_j^o, m_j^o)}{n_j - n_j^o} \ge \frac{d_j(n_j^*, m_j^*) - d_j(n_j^o, m_j^o)}{n_j^* - n_j^o} \qquad (.23)$$

$$d_j(n_j, m_j) + \lambda^*(n_j - n_j^o) = d_j(n_j, m_j) - \frac{d_p(n_p^*, m_p^*) - d_p(n_p^o, m_p^o)}{n_p^* - n_p^o}(n_j - n_j^o) > d_j(n_j, m_j) - \frac{d_j(n_j^*, m_j^*) - d_j(n_j^o, m_j^o)}{n_j^* - n_j^o}(n_j - n_j^o) \ge d_j(n_j, m_j) - \frac{d_j(n_j, m_j) - d_j(n_j^o, m_j^o)}{n_j - n_j^o}(n_j - n_j^o) = d_j(n_j^o, m_j^o) \qquad (.24)$$

Therefore we can conclude that for any point $x_j \in B$:

$$d_j(n_j, m_j) + \lambda^* n_j > d_j(n_j^o, m_j^o) + \lambda^* n_j^o \qquad (.25)$$

Classes C, D: Following the same arguments as in Proof 2.1, we get:

$$d_j(n_j, m_j) + \lambda^* n_j > d_j(n_j^o, m_j^o) + \lambda^* n_j^o \qquad (.26)$$

Combining the previous results, we can conclude that for any point:

$$d_j(n_j, m_j) + \lambda^* n_j + \mu_l m_j > d_j(n_j^o, m_j^o) + \lambda^* n_j^o + \mu_l m_j^o \qquad (.27)$$

Therefore, with $(\lambda^*, \mu_l)$, we have one unique optimal pair, $(n_j^o, m_j^o)$, as the minimizing solution for subband $j$. □

Theorem 2.3: Given a non-singular multiplier pair $(\lambda_l, \mu_l)$ and the corresponding optimal sets $\{n_k^o\}$, $\{m_k^o\}$, suppose $x^* = x_p^*$ of (3.25) is a strict minimum in subband $p$, and suppose also that $x^*$ is a strict minimum among the set $\{x_1^*, \ldots, x_K^*\}$. Then the new multiplier pair $(\lambda^*, \mu_l)$, where $\lambda^*$ is the value of (3.25) evaluated at $x^*$, results in a pivoting case.

Proof 2.3: By Lemma 2.1, subband $p$ has two optimal solutions, $x_p^o$ and $x_p^*$. By Lemma 2.2, each subband $j \ne p$ retains the same unique optimal solution, $x_j^o$. Therefore, we have a pivoting case. □

We have proved that the initialization is correct, given that the strict minimum requirements are satisfied within the pivot subband and among the $K$ subbands. In practice, the strict minimum requirements are not a problem, since empirical data will very rarely yield equality in a numerical comparison. We now prove the steps within the algorithm, beginning with the correctness of the triangular case; that is, we will prove that the selected pivot point, together with the original two pivot points, does indeed yield a triangular case.

Lemma 2.4: Given a singular, pivoting multiplier pair $(\lambda_p, \mu_p)$, the corresponding optimal sets $\{n_k^o\}$, $\{m_k^o\}$, and pivot points $x_p^{o,1}$, $x_p^{o,2}$ of pivoting subband $p$, suppose $x^* = x_p^*$, and $x_p^*$ is a strict minimum of (3.21) in subband $p$. Then there exists a multiplier pair $(\lambda^*, \mu^*)$ such that $\{x_p^{o,1}, x_p^{o,2}, x_p^*\}$ are simultaneously minima for subband $p$.

Proof of Lemma 2.4: We first introduce the change of basis mentioned in Section 3.4.2, such that the line of eligibility is parallel to one of the new axes, $l$. The change of basis is expressed as a matrix multiplication:

$$\begin{bmatrix} z_p \\ l_p \end{bmatrix} = A \begin{bmatrix} n_p \\ m_p \end{bmatrix} = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \begin{bmatrix} n_p \\ m_p \end{bmatrix} \qquad (.28)$$

$$\begin{bmatrix} n_p \\ m_p \end{bmatrix} = B \begin{bmatrix} z_p \\ l_p \end{bmatrix} = \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} \begin{bmatrix} z_p \\ l_p \end{bmatrix} \qquad (.29)$$

where $B = A^{-1}$ is well defined (see Figure .1).

[Figure .1 appears here: the operational points $x_p^{o,1}$, $x_p^{o,2}$ shown in the original $(n_p, m_p)$ coordinates and in the rotated $(z_p, l_p)$ coordinates, where both pivots share the same $z$ coordinate.]

Figure .1: Example of Change of Basis

The definition of the tilted function in the new coordinate system is:

$$e_p(z_p, l_p) = d_p(b_{1,1} z_p + b_{1,2} l_p,\; b_{2,1} z_p + b_{2,2} l_p)$$
$$\tilde{e}_p(z_p, l_p) = e_p(z_p, l_p) + \lambda_p(b_{1,1} z_p + b_{1,2} l_p) + \mu_p(b_{2,1} z_p + b_{2,2} l_p) = e_p(z_p, l_p) + (\lambda_p b_{1,1} + \mu_p b_{2,1}) z_p + (\lambda_p b_{1,2} + \mu_p b_{2,2}) l_p = e_p(z_p, l_p) + \alpha z_p + \gamma l_p \qquad (.30)$$

where $\alpha = \lambda_p b_{1,1} + \mu_p b_{2,1}$ and $\gamma = \lambda_p b_{1,2} + \mu_p b_{2,2}$. As discussed in Section 3.4.2, after the basis transformation, (3.21) becomes:

$$x_p^* = (z_p^*, l_p^*) = \arg\min_{z_p > z_p^o}\left[\frac{e_p(z_p, l_p) - e_p(z_p^o, l_p^o)}{z_p - z_p^o}\right] \qquad (.31)$$

Note that $\tilde{e}_p(z_p, l_p)$ evaluated at $x_p^{o,1}$ and at $x_p^{o,2}$ gives the same value, and that the $z$ coordinates of both points are the same. Therefore, we can treat the two points as one in the evaluation of (.31). If we let $\alpha^*$ be the value of (.31), then by following the proof of Lemma 2.1, $(\alpha^*, \gamma)$ will result in the set $\{x_p^{o,1}, x_p^{o,2}, x_p^*\}$ being the optimal solution to (3.4) in the $(z_p, l_p)$ coordinate system. Translating back to the $(n_p, m_p)$ coordinates, we have the singular multiplier $(\lambda^*, \mu^*) = (\alpha^* a_{1,1} + \gamma a_{2,1},\; \alpha^* a_{1,2} + \gamma a_{2,2})$ that yields the same set as the optimal solutions. □
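The change of basis itself is a small linear-algebra step. Below is a minimal numpy sketch; the construction of $A$ is one valid choice (an orthonormal rotation), not necessarily the one used in the original implementation:

    # Rotate coordinates so the line through the two pivot points becomes
    # parallel to the new l axis, then map multipliers between bases.

    import numpy as np

    def basis_for_pivots(x1, x2):
        """Return A (and B = A^-1) mapping (n, m) -> (z, l) with the pivot
        line parallel to the l axis: both pivots share the same z coordinate."""
        d = np.asarray(x2, float) - np.asarray(x1, float)  # pivot-line direction
        d = d / np.linalg.norm(d)
        # First row of A is orthogonal to the pivot direction, so z is constant
        # along the pivot line; second row is the pivot direction itself.
        A = np.array([[-d[1], d[0]], [d[0], d[1]]])
        return A, np.linalg.inv(A)

    x1, x2 = (4.0, 2.0), (6.0, 5.0)
    A, B = basis_for_pivots(x1, x2)
    z1, z2 = A @ np.array(x1), A @ np.array(x2)
    print(np.isclose(z1[0], z2[0]))   # True: pivots share the same z coordinate

    # Multipliers transform contragrediently: if the tilt is lam*n + mu*m, then
    # in the new coordinates it is alpha*z + gamma*l with [alpha, gamma] = [lam, mu] @ B.
    lam, mu = 0.7, 0.3
    alpha, gamma = np.array([lam, mu]) @ B
    print(np.allclose(alpha * z1[0] + gamma * z1[1],
                      lam * x1[0] + mu * x1[1]))            # True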

Lemma 2.5: Given a singular, pivoting multiplier pair $(\lambda_p, \mu_p)$, the corresponding optimal sets $\{n_k^o\}$, $\{m_k^o\}$, and pivot points $x_i^{o,1}$, $x_i^{o,2}$ of pivoting subband $i$, suppose $x^* = x_i^*$, and $x_i^*$ is a strict minimum of (3.21) among the set $\{x_1^*, \ldots, x_K^*\}$. Then the multiplier pair $(\lambda^*, \mu^*) = (\alpha^* a_{1,1} + \gamma a_{2,1},\; \alpha^* a_{1,2} + \gamma a_{2,2})$ defined previously will still yield $\{x_j^o\}$ as the unique optimal solution to (3.4) for every subband $j \ne i$.

Proof 2.5: Change the basis of subband $j$ via the linear transformation of the previous proof. Follow the same procedure as in the proof of Lemma 2.2 to show that $x_j^o$ is still the unique solution in subband $j$. Change the basis back to $(n_j, m_j)$ to complete the proof. □

Theorem 2.6: Given a singular, pivoting multiplier pair $(\lambda_p, \mu_p)$, the corresponding optimal sets $\{n_k^o\}$, $\{m_k^o\}$, and pivot points $x_p^{o,1}$, $x_p^{o,2}$ of pivoting subband $p$, suppose $x^* = x_p^*$, and $x_p^*$ is a strict minimum of (3.21) in subband $p$. Suppose also that $x_p^*$ is a strict minimum of (3.21) among the set $\{x_1^*, \ldots, x_K^*\}$. Then there exists $(\lambda^*, \mu^*)$ such that we have a triangular case.

Proof 2.6: Follows from Lemmas 2.4 and 2.5. □

We now state the corresponding theorem for the quadrangular case.

Theorem 2.7: Given a singular, pivoting multiplier pair $(\lambda_p, \mu_p)$, the corresponding optimal sets $\{n_k^o\}$, $\{m_k^o\}$, and pivot points $x_p^{o,1}$, $x_p^{o,2}$ of pivoting subband $p$, suppose $x^* = x_j^*$, $j \ne p$, and $x_j^*$ is a strict minimum of (3.21) in subband $j$. Suppose also that $x_j^*$ is a strict minimum of (3.21) among the set $\{x_1^*, \ldots, x_K^*\}$. Then there exists $(\lambda^*, \mu^*)$ such that we have a quadrangular case.

Proof 2.7: Follow arguments similar to Proof 2.6. □

By proving the correctness of the pivot point evaluation for the triangular and quadrangular cases, we have shown that each iteration of the algorithm finds a set of non-unique solutions that are valid solutions to our Lagrange multiplier problem, stated in (3.4), for different values of the multiplier pair. We now prove that the pivot point selection algorithm converges to the ending conditions stated in Section 3.4.4. We accomplish this by showing that a special metric, which tracks the progress of the algorithm, decreases or remains the same at each iteration; this indicates that the algorithm yields a cluster of points that is at least as close to the target as the previous one. We will start by showing that the selection method stated in Section 3.4.4 is feasible; i.e., until we reach the ending condition, there always exists a set of pivots that yields a pivot line segment that crosses or touches the distance line segment at every iteration. We divide the proof into two cases: 1) when the distance line segment is a perpendicular drop from the target to the interior of the pivot line segment; 2) when the distance line segment is a line connecting the target and one of the current pivots.

Lemma 2.8: Given that the relative locations of the pivots and the target are as in case 1, either we have reached the ending condition, or there exists a set of pivots that creates a pivot line segment that crosses the current distance line segment.

[Figure .2 appears here: pivot configurations in the $(R_c, R_s)$ plane for case 1, with panels i) Triangular, a)-d), and ii) Quadrangular, a)-d). Each panel shows the pivots $X^1$, $X^2$, the candidate pivots $X^*$ (or $X^{*,1}$, $X^{*,2}$), the target $X^{target}$, the regions $A$, $B$, $C$, the pivot line segments $a$, $a'$, and the distance line segment $b$.]

Figure .2: Proof of PPSA: case 1

Proof 2.8: Suppose the rate pairs associated with the two pivot points, $X^1$ and $X^2$, lie on the $R_c$ axis, as shown in Figure .2ia. We can assume this without loss of generality, because a simple change of basis and a linear translation can move any set of points into this configuration. We first divide the search space into three disjoint subspaces, $A$, $B$ and $C$, as seen in Figure .2ib; these subspaces result from the lines that connect the pivot rate pairs to the target, and from the distance line segment. We can discover one or two new pivot rate pairs, resulting in the triangular or the quadrangular case respectively. Suppose first it is the former. If the next pivot point is found in region $A$, as in Figure .2ic, then it is clear that we have reached the ending condition. If the next pivot point is in region $B$, as in Figure .2id, then by selecting pivots $X^*$ and $X^2$, we have a new pivot line segment, $a'$, that crosses the current distance line segment $b$. By symmetry, if the pivot is found in region $C$, then by selecting $X^*$ and $X^1$ we have a new pivot line segment that crosses the current distance line segment as well. Suppose instead that two pivot rate pairs are found, resulting in the quadrangular case. If either the left or the right pivot point, $X^{*,1}$ or $X^{*,2}$, is found in region $A$ (Figure .2iia), then we have reached the ending condition. If $X^{*,1}$ is found in region $B$ with $X^{*,2}$ in region $C$, such that the line connecting them is above the target (Figure .2iib), we have again reached the ending condition.

[Figure .3 appears here: pivot configurations in the $(R_c, R_s)$ plane for case 2, with panels i) Triangular, a)-d), and ii) Quadrangular, a)-d), showing the pivots $X^1$, $X^2$, the candidate pivots $X^*$ ($X^{*,1}$, $X^{*,2}$), the target $X^{target}$, the regions $A$, $B$, $C$, the pivot line segments $a'$, and the distance line segments $b$.]

Figure .3: Proof of PPSA: case 2

If the line connecting them is below the target (Figure .2iic), then that line is the new pivot line segment and it crosses the distance line segment. If $X^{*,1}$ and $X^{*,2}$ are both in region $B$, then by selecting $X^{*,2}$ and $X^2$ as pivots, we have a new pivot line segment that crosses the distance line segment. Finally, by symmetry, if $X^{*,1}$ and $X^{*,2}$ are both in region $C$, then by selecting $X^{*,1}$ and $X^1$ as pivots, we have a new pivot line segment that crosses the distance line segment. □

Lemma 2.9: Given that the relative locations of the pivots and the target are as in case 2, either we have reached the ending condition, or there exists at least one set of pivots that yields a pivot line segment that crosses the current distance line segment, or touches the current distance line segment at an end point.

Proof 2.9: We follow a procedure similar to Proof 2.8 and divide the space into three subspaces, $A$, $B$ and $C$. First suppose the triangular case, as seen in Figure .3i. If the pivot lands in region $A$, we have reached the ending condition, as in case 1. If the pivot lands in region $B$, then by selecting $X^*$ and $X^2$, the new pivot line segment $a'$ touches the distance line segment $b$ at $X^2$. If the pivot lands in region $C$, then by selecting $X^*$ and $X^1$, the new pivot line segment $a'$ crosses $b$.

[Figure .4 appears here: two panels in the $(R_c, R_s)$ plane. Panel a) shows pivots $X^1$, $X^2$, the intersection point $I$ of the new pivot line segment with the current distance line segment $b$, and the lengths $d_i$, $e_{i+1}$, $d_{i+1}$ of the successive distance line segments $b$, $b'$. Panel b) shows the pivot moving from $P^*$ to $Q^*$ to $R^*$ as the search space rotates about the target.]

Figure .4: Proof of Theorem 2.10

Now suppose the quadrangular case, as seen in Figure .3ii. If either $X^{*,1}$ or $X^{*,2}$ lands in region $A$, then we have reached the ending condition, as in case 1. If $X^{*,1}$ is found in region $B$ with $X^{*,2}$ in region $C$, such that the line connecting them is above the target (Figure .3iia), we have again reached the ending condition. If the line is below the target (Figure .3iic), then that line is the new pivot line segment and it crosses the distance line segment. If $X^{*,1}$ and $X^{*,2}$ are both in region $B$ (Figure .3iib), then by selecting $X^{*,2}$ and $X^2$ as pivots, we have a new pivot line segment that touches the distance line segment. Finally, if $X^{*,1}$ and $X^{*,2}$ are both in region $C$ (Figure .3iid), then by selecting $X^{*,1}$ and $X^1$ as pivots, we have a new pivot line segment that crosses the distance line segment. □

Theorem 2.10: Suppose there exists a cluster of points that satisfies the ending condition. Then the selection method stated in Section 3.4.4 terminates in that cluster.

Proof 2.10: We first define a special metric, $d_i$, which is the length of the distance line segment at iteration $i$; it measures how close a cluster of points is to the target. By definition of the distance line segment, if the pivot line segment of the next cluster is closer to the target, then the next cluster has a smaller metric than the previous one. If the next pivot line segment crosses the current distance line segment, then the length of the new distance line segment must be smaller than the previous one. This is best illustrated geometrically. In Figure .4a, we see that pivots $X^*$ and $X^2$ yield a pivot line segment that crosses the current distance line segment. We denote the point of intersection by $I$. Clearly the distance between the target $X^{target}$ and $I$, $e_{i+1}$, is strictly smaller than the length $d_i$ of the distance line segment. Further, the next distance line segment, $b'$, must have length $d_{i+1}$ strictly smaller than

$e_{i+1}$. We can conclude the following: if the next pivot line segment crosses the current distance line segment, then $d_{i+1} < e_{i+1} < d_i$. By Lemma 2.8, we know such a pivot line segment

always exists if we are in case 1. Therefore, the metric must strictly decrease at the next iteration if we are in case 1. Notice that in case 2, the only time there is no pivot line segment that crosses the distance line segment is when the new pivot(s) lie in region $B$, as shown in Figure .3. In such cases, the distance metric might remain the same between iterations. In Figure .4b, by selecting $X^*$ and $X^2$, the metric remains the same. However, as the algorithm progresses, this situation cannot persist forever: since the search space is continually being rotated, it will eventually reach a pivot point such that the metric decreases. In Figure .4b, the new pivot goes from $P^*$ to $Q^*$ to $R^*$; when we reach $R^*$, the distance line segment is a perpendicular drop (case 1) and the metric decreases. Therefore, the metric must eventually decrease if we are in case 2. Since the metric continues to decrease as the algorithm progresses, the cluster moves closer to the target. The cluster that can make no more progress is the one satisfying the ending condition. Therefore, the algorithm converges to the cluster with the ending condition. □
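For concreteness, the metric $d_i$ is simply the distance from the target rate pair to the current pivot line segment; if the perpendicular drop lands in the interior of the segment we are in case 1, otherwise the distance line segment runs to one of the pivots (case 2). A minimal hypothetical Python sketch:

    # The convergence metric of Proof 2.10: distance from the target rate pair
    # to the pivot line segment in the (Rc, Rs) plane.

    import math

    def distance_metric(target, x1, x2):
        """Distance from target to the segment x1-x2."""
        (tx, ty), (ax, ay), (bx, by) = target, x1, x2
        vx, vy = bx - ax, by - ay
        t = ((tx - ax) * vx + (ty - ay) * vy) / (vx * vx + vy * vy)
        t = max(0.0, min(1.0, t))           # t in (0,1) is case 1, else case 2
        px, py = ax + t * vx, ay + t * vy   # foot of the distance line segment
        return math.hypot(tx - px, ty - py)

    # The metric is non-increasing across iterations (Lemmas 2.8-2.9), so the
    # pivot clusters move monotonically toward the target rate budget.
    print(distance_metric((2.0, 3.0), (0.0, 0.0), (4.0, 0.0)))  # 3.0 (case 1)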


Bibliography

[1] R. Cox, J. Hagenauer, N. Seshadri, and C. Sundberg, "Subband Speech Coding and Matched Convolutional Channel Coding for Mobile Radio Channels," IEEE Trans. Signal Processing, vol. 39, pp. 1717-1731, August 1991.

[2] M. Ruf and J. Modestino, "Rate-Distortion Performance for Joint Source and Channel Coding of Images," Proc. ICIP 1995, pp. 77-80, October 1995.

[3] D. Taubman and A. Zakhor, "Multirate 3-D Subband Coding of Video," IEEE Trans. Image Processing, vol. 3, pp. 572-588, September 1994.

[4] Y. Shoham and A. Gersho, "Efficient Bit Allocation for an Arbitrary Set of Quantizers," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 36, pp. 1445-1453, September 1988.

[5] J. Hagenauer, "Rate-Compatible Punctured Convolutional Codes (RCPC Codes) and their Applications," IEEE Trans. Communications, vol. 36, pp. 389-399, April 1988.

[6] N. Tanabe and N. Farvardin, "Subband Image Coding Using Entropy-Coded Quantization over Noisy Channels," IEEE Journal on Selected Areas in Communications, vol. 10, no. 5, pp. 926-942, June 1992.

[7] H. Shi, P. Ho, and V. Cuperman, "Combined Speech and Channel Coding for Mobile Radio Communications," IEEE Trans. Vehicular Technology, vol. 43, no. 4, pp. 1078-1087, November 1994.

[8] N. Farvardin, "A Study of Vector Quantization for Noisy Channels," IEEE Trans. Information Theory, vol. 36, no. 4, pp. 799-809, July 1990.

[9] D. Luenberger, Linear and Nonlinear Programming, 1989, pp. 187-188.