Approaching Capacity on Noncoherent Block Fading Channels with Successive Decoding

Teng Li, Xiaowei Jin, Oliver M. Collins and Thomas E. Fuja
Dept. of Electrical Engineering, University of Notre Dame
Notre Dame, IN 46556
Tel: (574) 631-9205   Fax: (574) 631-9924
Email: {tli, xjin, ocollins, tfuja}@nd.edu

Abstract— A receiver structure that uses decision feedback and successive decoding is proposed for a noncoherent channel with independent block fading. The transceiver employs a block interleaver to decompose the channel into a set of independent fading sub-channels. LDPC codes are used on each sub-channel, and the decoded data are fed back to help in channel estimation. Simulation results indicate that the performance of the proposed system is within 0.7 dB of the Shannon capacity of the binary-input block fading channel.

I. INTRODUCTION

This paper considers a single-input single-output communication channel in which the transmitted signal is subject to frequency-flat, time-selective fading with both amplitude fluctuation and phase rotation. It assumes a block fading model [1] in which the complex fading coefficients remain constant for a block of T channel uses and change independently from one block to the next. Block fading is an accurate model for a communication system employing frequency hopping or time-division multiple access. It is also a good model for a channel subject to correlated Rayleigh fading. We assume that both the transmitter and the receiver know the channel statistics but not the channel realization.

Recently, it has been shown that iterative receivers employing LDPC or turbo codes can provide good performance on such channels [2], [3]. This strategy employs iterative channel estimation and decoding; either pilot-symbol-aided channel estimation or differential detection may be used in such an iterative receiver. In [2], low-density parity-check (LDPC) codes are designed for an independent block fading channel with an iterative receiver that uses a pilot-symbol-aided channel estimation algorithm. In [3], turbo codes and an iterative differential detection technique are used. Iterative channel estimation and decoding is usually suboptimal; furthermore, it may suffer from high computational complexity and convergence problems.

This paper considers a successive decoding receiver [4] that makes use of LDPC codes. This type of receiver has also been used on ISI channels [5], [6] and on correlated fading channels [7], yielding near-capacity performance in both cases. The successive decoding receiver uses a block interleaver at the transmitter to decompose the block fading channel into a bank of memoryless sub-channels.

Each sub-channel is independently coded with an LDPC code of a different rate. The receiver decodes these LDPC codes successively, and the decoded symbols are fed back to a channel estimator. The rate of the LDPC code used on each sub-channel is determined by the capacity of that sub-channel. For a block fading channel with coherence length T, the optimal strategy needs T LDPC codes of different rates¹. The resulting performance is within 0.7 dB of the Shannon capacity of the binary-input block fading channel. It is possible to use fewer codes at the cost of some performance loss. The trade-off between complexity and performance is also discussed in this paper.

The rest of the paper is structured as follows. Section II introduces the channel model. Section III first gives a brief description of the successive decoding receiver structure, then proves the optimality of the scheme, and finally presents a method of calculating the channel capacity. Section IV gives the optimal and suboptimal methods of computing the symbol a posteriori probability. Section V presents the code design. Section VI gives the simulation results, followed by conclusions in Section VII.

II. SYSTEM AND CHANNEL MODEL

To transmit data over a block fading channel with coherence time T, the transmitter de-multiplexes the user data into T streams. Each stream is then individually encoded using a block code of length N and rate R_k, for k = 1, \cdots, T. The kth codeword is denoted x_{:,k} = [x_{1,k}, \cdots, x_{N,k}]^T for k = 1, \cdots, T. For simplicity, we assume binary phase shift keying (BPSK) modulation, so x_{i,k} \in S = \{+1, -1\}. These codewords are stored column-wise in the following block structure:

\begin{bmatrix}
x_{1,1} & x_{1,2} & \cdots & x_{1,k} & \cdots & x_{1,T} \\
x_{2,1} & x_{2,2} & \cdots & x_{2,k} & \cdots & x_{2,T} \\
\vdots  & \vdots  &        & \vdots  &        & \vdots  \\
x_{i,1} & x_{i,2} & \cdots & x_{i,k} & \cdots & x_{i,T} \\
\vdots  & \vdots  &        & \vdots  &        & \vdots  \\
x_{N,1} & x_{N,2} & \cdots & x_{N,k} & \cdots & x_{N,T}
\end{bmatrix}   (1)

¹It will be shown that the first sub-channel cannot support any positive rate, so the total number of LDPC codes is actually T − 1.


The transmitter sends the data in (1) row by row. We will also use x_{i,:} to denote the ith row in (1). The receiver produces samples of the matched filter output at the symbol rate, then stores these samples in the same block structure shown in (1). The equivalent discrete-time channel model is given by

y_{i,k} = c_i x_{i,k} + w_{i,k}, \quad i = 1, 2, \cdots, N, \quad k = 1, 2, \cdots, T,   (2)

where the fading coefficients c_i are i.i.d. complex Gaussian random variables with distribution CN(0, 1) and the additive noise is w_{i,k} ∼ CN(0, N_0).
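To make the block structure (1) and the channel model (2) concrete, the following sketch simulates one N × T block over the independent block fading channel. It is a minimal NumPy illustration written for this description, not code from the paper; the function name and default parameters are arbitrary choices.

import numpy as np

def simulate_block_fading(N=1000, T=10, N0=0.5, rng=np.random.default_rng(0)):
    """Simulate one N x T block of BPSK symbols over the channel in (2).

    Each row i experiences a single fading coefficient c_i ~ CN(0, 1);
    the coefficients are i.i.d. across rows (independent block fading).
    """
    # Codeword matrix (1): column k would hold the k-th codeword (random bits here).
    x = rng.choice([+1.0, -1.0], size=(N, T))
    # One complex Gaussian fading coefficient per row (block of T channel uses).
    c = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    # Circularly symmetric complex AWGN with variance N0.
    w = np.sqrt(N0 / 2) * (rng.standard_normal((N, T)) + 1j * rng.standard_normal((N, T)))
    y = c[:, None] * x + w          # y_{i,k} = c_i x_{i,k} + w_{i,k}
    return x, c, y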

III. THE SUCCESSIVE DECODING RECEIVER

The receiver employs a successive decoding strategy that was proposed for a generic channel with memory in [4]. More specifically, the receiver operates on the block structure in (1), starting from the leftmost column and proceeding to the right. The codeword x_{:,1} is decoded first, and x_{:,2} is decoded second with the help of the decoded symbols in x_{:,1}.

A. Estimation and decoding

The estimation and the decoding of a codeword are performed sequentially. Take the kth codeword as an example, where 1 ≤ k ≤ T. At this point, all previous k − 1 codewords have been decoded and the decoded symbols are fed back to the receiver. First, the receiver estimates the a posteriori probability (APP) of the ith bit as

APP(x_{i,k} = a) = P\big( x_{i,k} = a \mid y_{i,:}, x_{i,1}, \cdots, x_{i,k-1} \big)   (3)

for a \in \{+1, -1\} and i = 1, \cdots, N. In (3), the bits x_{i,1}, \cdots, x_{i,k-1} are treated as training symbols. After the receiver calculates the log-likelihood ratios (LLRs) \{\xi(1,k), \cdots, \xi(N,k)\}, where

\xi(i, k) = \log \frac{APP(x_{i,k} = +1)}{APP(x_{i,k} = -1)},   (4)

the decoder uses the LLRs to decode the kth LDPC code. We assume a perfect decision feedback receiver. Specifically, when decoding the kth codeword, the previous k − 1 codewords have been decoded perfectly.
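A minimal control-flow sketch of this receiver, under the perfect-feedback assumption above, is given below. It is hypothetical Python written for illustration: compute_app stands for any of the APP estimators of Section IV, and ldpc_decode for an off-the-shelf LDPC decoder; neither routine is specified by the paper.

import numpy as np

def successive_decode(y, compute_app, ldpc_decode, T):
    """Decode the T columns of the block in (1) from left to right.

    compute_app(y_row, feedback_row) should return P(x_{i,k} = +1 | y_{i,:}, feedback)
    as in (3); ldpc_decode(llrs) returns the decoded +/-1 codeword for sub-channel k.
    """
    N = y.shape[0]
    x_hat = np.zeros((N, T))
    for k in range(T):                      # k-th sub-channel (leftmost column first)
        llr = np.zeros(N)
        for i in range(N):                  # LLR (4) for every bit of codeword k
            p_plus = compute_app(y[i, :], x_hat[i, :k])
            llr[i] = np.log(p_plus) - np.log(1.0 - p_plus)
        x_hat[:, k] = ldpc_decode(llr)      # decoded symbols are fed back as training
    return x_hat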

B. Optimality

The receiver structure described above is optimal, i.e., it is information lossless if the decision feedbacks are correct. This was shown in [4] for any channel with memory. The rest of this section briefly shows the result for the block fading channel. The main idea is that the block transmission structure effectively decomposes the original block fading channel into a bank of T sub-channels. These sub-channels are memoryless, but they interact with each other via the decision feedback. Thus, the bits in a codeword are transmitted over a memoryless sub-channel, and separate estimation and decoding is optimal. To see this, we write the binary-input capacity of a block fading channel as

C = \lim_{N \to \infty} \frac{1}{TN} I\big( x_{:,1}, \cdots, x_{:,T} ; y_{:,1}, \cdots, y_{:,T} \big).   (5)

Now we define the kth sub-channel as follows: it has a scalar input x_{i,k}, a vector output y_{i,:} and a vector of decision feedbacks [x_{i,1}, \cdots, x_{i,k-1}]^T. The capacity of the kth sub-channel is given by

C_k = \lim_{N \to \infty} \frac{1}{N} I\big( x_{:,k} ; y_{:,1}, \cdots, y_{:,T} \,\big|\, [x_{1,1}, \cdots, x_{1,k-1}]^T, \cdots, [x_{N,1}, \cdots, x_{N,k-1}]^T \big).   (6)

From the chain rule of mutual information [8], we have

C = \frac{1}{T} \sum_{k=1}^{T} C_k,   (7)

which means the original channel can be decomposed into T sub-channels without loss of mutual information. Furthermore, due to the independent nature of the original channel, each sub-channel is memoryless, i.e.,

C_k = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} I\big( x_{i,k} ; y_{i,:} \mid x_{i,1}, \cdots, x_{i,k-1} \big)   (8)
    = I\big( x_{i,k} ; y_{i,:} \mid x_{i,1}, \cdots, x_{i,k-1} \big).   (9)

Finally, the APP value in (3) is a sufficient statistic of the sub-channel. Therefore, the estimation and decoding scheme in the previous section is optimal. Intuitively, the block fading channel with coherence time T and i.i.d. inputs can be viewed as a multiple access channel with T independent users and a vector channel output. Using this analogy, the successive interference cancellation scheme, which is optimal for a multiple access channel [8], becomes a successive decoding scheme, wherein the decision feedbacks serve as training symbols.

From the definition of mutual information and entropy, (9) can be evaluated as

C_k = H(x_{i,k}) - E\big[ -\log APP(x_{i,k}) \big],   (10)

where the entropy H(x_{i,k}) = 1 and the APP value APP(x_{i,k}) is evaluated at the true input x_{i,k} according to definition (3). The expectation in (10) can be calculated using Monte Carlo integration. Furthermore, if the APP values in (10) are computed using a reduced-complexity suboptimal method, the result of (10) is the achievable rate of the given receiver, which is also a lower bound on the channel capacity. Because of the increasing number of training symbols, the sequence of sub-channel capacities is monotonically increasing: C_1 < C_2 < \cdots < C_T.
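As a concrete illustration of (10), the Monte Carlo estimate of C_k for BPSK can be organized as in the following sketch. This is hypothetical Python, assuming genie-aided (perfect) feedback of the first k − 1 symbols and an APP routine compute_app of the form used in the earlier receiver sketch.

import numpy as np

def subchannel_capacity(k, compute_app, N=100000, T=10, N0=0.5,
                        rng=np.random.default_rng(1)):
    """Monte Carlo estimate of C_k = 1 - E[-log2 APP(x_{i,k})] in (10) for BPSK."""
    x = rng.choice([+1.0, -1.0], size=(N, T))
    c = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    w = np.sqrt(N0 / 2) * (rng.standard_normal((N, T)) + 1j * rng.standard_normal((N, T)))
    y = c[:, None] * x + w
    log_app = np.empty(N)
    for i in range(N):
        p_plus = compute_app(y[i, :], x[i, :k-1])       # perfect feedback of k-1 symbols
        p_true = p_plus if x[i, k-1] > 0 else 1.0 - p_plus
        log_app[i] = np.log2(p_true)                    # log APP at the true input
    return 1.0 + np.mean(log_app)                       # H(x_{i,k}) = 1 bit for BPSK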


IV. THE APP CALCULATION

This section presents the methods to compute the APP in (3). Since the estimation of the APP value is the same across the rows of (1), we will only consider the first channel use of all sub-channels and drop the row index i for the rest of the paper. In what follows, we will use the notation x_1^k = [x_1, x_2, \cdots, x_k]^T, and y_1^k is defined similarly. Therefore, from Bayes' rule, the APP in (3) can be written as

P\big( x_k = a \mid y_1^T, x_1^{k-1} \big) \propto P\big( y_1^T \mid x_1^{k-1}, x_k = a \big).   (11)

We will compute the likelihood function (11) instead.

A. Optimal method

The optimal method [7] first obtains an MMSE estimate of the channel state, and then enumerates all possible values of the unknown future symbols to get the probability (11). Mathematically,

P\big( y_1^T \mid x_1^k \big) = \sum_{x_{k+1}^T \in S^{T-k}} P(x_{k+1}^T) \, P\big( y_1^T \mid x_1^T \big)   (12)
= \sum_{x_{k+1}^T \in S^{T-k}} P(x_{k+1}^T) \, P\big( y_1^{k-1} \mid x_1^{k-1} \big) \, P\big( y_k^T \mid y_1^{k-1}, x_1^T \big)   (13)
\propto \sum_{x_{k+1}^T \in S^{T-k}} P\big( y_k^T \mid y_1^{k-1}, x_1^T \big)   (14)
= \sum_{x_{k+1}^T \in S^{T-k}} \frac{1}{|\pi \Sigma|} \exp\!\Big( -\big( y_k^T - x_k^T \hat{c} \big)^H \Sigma^{-1} \big( y_k^T - x_k^T \hat{c} \big) \Big),   (15)

where, from linear estimation theory [9], the conditional mean and variance are given by

\hat{c} = \frac{1}{k - 1 + N_0} \sum_{i=1}^{k-1} y_i / x_i ;   (16)

\Sigma = \frac{N_0}{k - 1 + N_0} x_k^T \big( x_k^T \big)^H + N_0 I_{T-k+1}.   (17)

The complexity of this approach is roughly on the order of T|S|^T, which grows exponentially as the coherence length T increases.
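The sketch below is a hypothetical NumPy rendering of (15)-(17) for BPSK. It enumerates all 2^{T-k} future symbol patterns and is meant only to make the exponential cost visible, not to reproduce the authors' implementation; it can serve as the compute_app routine in the earlier receiver and capacity sketches.

import itertools
import numpy as np

def app_optimal(y, feedback, N0=0.5):
    """APP (3) of the next symbol via (15)-(17): MMSE estimate of c from the k-1
    fed-back symbols, then exhaustive enumeration over all future BPSK symbols.
    Cost is O(2^(T-k)) per bit, i.e. exponential in the coherence length."""
    T, k = len(y), len(feedback) + 1                    # deciding symbol x_k
    c_hat = np.sum(y[:k-1] / feedback) / (k - 1 + N0) if k > 1 else 0.0     # (16)
    alpha = N0 / (k - 1 + N0)
    y_tail = y[k-1:]
    like = {+1.0: 0.0, -1.0: 0.0}
    for a in (+1.0, -1.0):
        for future in itertools.product((+1.0, -1.0), repeat=T - k):
            x_tail = np.array((a,) + future)
            Sigma = alpha * np.outer(x_tail, x_tail) + N0 * np.eye(T - k + 1)   # (17)
            d = y_tail - x_tail * c_hat
            quad = np.real(d.conj() @ np.linalg.solve(Sigma, d))
            like[a] += np.exp(-quad) / np.real(np.linalg.det(np.pi * Sigma))    # (15)
    return like[+1.0] / (like[+1.0] + like[-1.0])       # P(x_k = +1 | y, feedback)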



  H  1 exp − ykT − xTk cˆ Σ−1 ykT − xTk cˆ |πΣ|

Let Rk be the rate of the code for sub-channel k. In this paper, we set the code rate to be equal to the sub-channel capacity, i.e.,

(15)

Rk = Ck ,

for k = 1, · · · , T .

where from linear estimation theory [9], the conditional mean and variance are given by

(22)

The optimized irregular LDPC codes obtained by differential evolution and density evolution [10] are used as component codes. k−1 1 T codes of different rates are required for a channel with yi /xi ; (16) cˆ = k − 1 + N0 i=1 coherence time T . Note that if a sub-channel has zero capacity, we will use a zero rate code, i.e., a single codeword consisting N0 Σ= xT (xT )H + N0 IT −k+1 . (17) of all pilot symbols. Unlike other schemes, e.g., [2], in our k − 1 + N0 k k system, the pilot symbols are used only when the underlying The complexity of this approach is roughly on the order of sub-channel does not support any positive rate. No pilot T |S|T , which grows exponentially as the coherence length T symbols are required if all sub-channels have positive capacity, increases. for example, when the input amplitude conveys information as well, such as 16-QAM. B. Suboptimal methods However, in order to reduce the code design effort and Because most of the optimal method’s complexity comes receiver complexity, we may wish to use only K codes, where from the enumeration of all future symbols, an obvious K ≤ T , especially when T is large. Clearly, the disadvantage reduced-complexity version of (15) is to use at most Mf of using fewer codes is the loss of information rate since future symbols, where Mf < T , by computing the likelihood we are only allowed to use codes of rate smaller than the   function P y1L |xk1 , where L = min(T, k + Mf ). The com- sub-channel capacity. We can formulate the following design plexity reduces to T |S|Mf . In the extreme case of Mf = 0, problem: given a set of sub-channel capacities {C1 , · · · , CT }, the receiver does not use any future channel outputs and thus find a set of rates R = {R1 , · · · , RT } that satisfy |R| = K is most efficient. and Rk ≤ Ck fork = 1, · · · , T , while minimizing the The other suboptimal approach uses a numerical approxi- total rate loss  = Tk=1 (Ck − Rk ). The above rate design mation of the integral over the complex fading coefficient as problem is equivalent to finding the best approximation of a follows: monotonic increasing sequence C1 , · · · , CT using a K-level
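A sketch of the quantized-integration rule (18)-(21) follows (hypothetical Python). For simplicity it places the quantization points C_Q on a uniform grid per real/imaginary dimension and assumes uniform priors P(x_i) = 1/2; the paper itself uses a Max-Lloyd quantizer, discussed in Section VI.

import numpy as np

def app_quantized(y, feedback, N0=0.5, levels=16, span=3.0):
    """APP (3) of the next symbol via the quantized integral (20)-(21).

    The complex fading coefficient c is discretized on a levels x levels grid,
    so the cost is O(T * |C_Q| * |S|), linear in the coherence length T.
    """
    T, k = len(y), len(feedback) + 1
    # Quantization points for c ~ CN(0, 1): uniform grid over each dimension.
    pts = np.linspace(-span, span, levels)
    area = (pts[1] - pts[0]) ** 2                       # A(c_q), equal for this grid
    cq = (pts[:, None] + 1j * pts[None, :]).ravel()     # the set C_Q
    p_c = np.exp(-np.abs(cq) ** 2) / np.pi              # CN(0,1) density at c_q

    def p_y_given_x_c(yi, xi, c):                       # (21)
        return np.exp(-np.abs(yi - xi * c) ** 2 / N0) / (np.pi * N0)

    like = {}
    for a in (+1.0, -1.0):
        past = np.ones_like(cq, dtype=float)
        for yi, xi in zip(y[:k-1], feedback):           # known (fed-back) symbols
            past *= p_y_given_x_c(yi, xi, cq)
        past *= p_y_given_x_c(y[k-1], a, cq)            # the symbol being decided
        future = np.ones_like(cq, dtype=float)
        for yi in y[k:]:                                # average over unknown symbols
            future *= 0.5 * (p_y_given_x_c(yi, +1.0, cq) + p_y_given_x_c(yi, -1.0, cq))
        like[a] = np.sum(past * future * p_c * area)    # (20)
    return like[+1.0] / (like[+1.0] + like[-1.0])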

sequence R1 , · · · , RT that always lies below the sequence P (y1T |xk1 ) = P (y1T |xk1 , c)P (c) dc (18) C1 , · · · , CT . For the extreme case of K = 1 and K = T , the solutions are R1 = · · · = RT = C1 and Rk = Ck for

 k T  = P (yi |xi , c) P (yi |c)P (c) dc (19) k = 1, · · · , T , respectively. It is straightforward to show that the K-level sequence must be a stair-like sequence, whose i=1 i=k+1 corner touches the original sequence, as in Fig. 5. More

 k T  P (yi |xi , c) P (xi )P (yi |xi , c)P (c) dc specifically, the sequence of rate must take the following form: = Rij = Rij +1 = · · · = Rij+1 −1 = Cij for j = 1, · · · , K − 1, i=1 i=k+1 xi ∈S and Rij = Rij +1 = · · · = RT = Cij for j = K, where k  1 = i1 < i2 < · · · < iK ≤ T is some ordered sequence ≈ P (yi |xi , cq ) of sub-channel indices that need to be optimized. Note i1 is cq ∈CQ i=1 always equal to 1. In this paper, T  use a brute force approach  T we  that enumerates all possible P (xi )P (yi |xi , cq )P (cq )A(cq ) (20) K combination of indices set {i , · · · , i } to find the solution that minimizes . 1 K i=k+1 xi ∈S
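The brute-force staircase search described above can be written directly, as in this hypothetical sketch; the capacities C would come, for example, from the Monte Carlo estimate of (10).

import itertools

def design_rates(C, K):
    """Pick K-level stair-like rates R_k <= C_k minimizing the total loss
    sum_k (C_k - R_k), for a monotonically increasing capacity sequence C."""
    T = len(C)
    best_loss, best_rates = float("inf"), None
    # i_1 is always 1 (index 0 here); enumerate the remaining K-1 corner indices.
    for rest in itertools.combinations(range(1, T), K - 1):
        corners = (0,) + rest
        rates = []
        for j, start in enumerate(corners):
            end = corners[j + 1] if j + 1 < K else T
            rates.extend([C[start]] * (end - start))    # flat level at C_{i_j}
        loss = sum(c - r for c, r in zip(C, rates))     # R_k <= C_k holds since C increases
        if loss < best_loss:
            best_loss, best_rates = loss, rates
    return best_rates, best_loss

# For instance, a T = 10, K = 5 design as in Section VI-C could be obtained by
# rates, loss = design_rates(C_list, 5) for the estimated capacities C_list.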


Fig. 1. The sub-channel capacity for channels with different coherence lengths (known-state bound and T = 20, 10, 5), where E_s/N_0 = 3 dB.

Fig. 2. The ratio of the achievable rate to the true capacity, where the suboptimal MMSE channel prediction receivers use different numbers of future symbols to compute the APP value (E_s/N_0 = 5.2, 2.2 and −1.2 dB). The channel has coherence length T = 10.

VI. NUMERICAL RESULTS

A. Capacity and achievable rates

First, we compute the sub-channel capacities of block fading channels with various coherence lengths according to (10) and plot them in Fig. 1. The figure shows that the sub-channel capacity monotonically increases with the sub-channel index k. In fact, the first sub-channel always has zero capacity for any PSK input, because it has no decision feedback symbols and thus no knowledge of the channel phase. For channels with large T, the sub-channels towards the end of the block have capacity close to the known-state bound, since the decision feedback provides reliable channel estimation. This also indicates that one LDPC code of the same rate can be used for these sub-channels with little loss of information rate.

Second, we compute the achievable rates of receivers using suboptimal APP computation methods. Fig. 2 shows the ratio of the achievable rate to the true capacity when the receiver computes the APP value using the suboptimal MMSE channel estimation approach that enumerates over only M_f future input symbols. It is surprising to note that a computationally efficient receiver using no future symbols at all has a capacity loss of less than 2%, 3.5% and 5% at SNRs (defined as E_s/N_0) of 5.2 dB, 2.2 dB and −1.2 dB, respectively. Thus, this suboptimal receiver is very attractive for practical systems. Furthermore, the comparison of the three curves in Fig. 2 shows that future symbols are more valuable at low SNR. The intuition is that in a noisier channel the receiver needs more channel observations to make a reliable estimate.

Fig. 3 shows the achievable rate of the receiver using the integration APP method as a function of the number of quantization levels. We independently quantize the real and imaginary parts of c ∼ CN(0, 1) using a Max-Lloyd algorithm. The receiver for a block fading channel with T = 10 operating at SNRs of 5.2 dB, 2.2 dB and −1.2 dB would need approximately 9, 12 and 15 levels per dimension, respectively, to achieve channel capacity. It is expected that a channel with a denser input constellation would require more quantization levels.

Fig. 3. The achievable rate of the suboptimal integration receiver versus the number of quantization levels per dimension. The channel has coherence length T = 5.
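For reference, a scalar Max-Lloyd (Lloyd-Max) quantizer of the kind used above can be trained per dimension as in the following hypothetical sketch; it fits the codebook to Monte Carlo samples of the real part of c, which is N(0, 1/2).

import numpy as np

def lloyd_max(samples, levels, iters=100):
    """Scalar Lloyd-Max quantizer: alternate nearest-neighbor partitioning and
    centroid updates until the codebook converges (minimum MSE for the samples)."""
    codebook = np.quantile(samples, np.linspace(0.05, 0.95, levels))  # initial guess
    for _ in range(iters):
        idx = np.argmin(np.abs(samples[:, None] - codebook[None, :]), axis=1)
        new = np.array([samples[idx == j].mean() if np.any(idx == j) else codebook[j]
                        for j in range(levels)])
        if np.allclose(new, codebook):
            break
        codebook = new
    return np.sort(codebook)

# Per-dimension quantizer for c ~ CN(0, 1): real and imaginary parts are N(0, 1/2).
rng = np.random.default_rng(2)
real_part = rng.standard_normal(200000) / np.sqrt(2)
points = lloyd_max(real_part, levels=9)   # e.g. the ~9 levels cited for 5.2 dB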

B. Coding results

This section shows the coding performance of two channels, T = 5 and T = 10. Irregular LDPC codes of codeword length 100,000 bits, generated according to Section V, are used for each sub-channel. The code rates are designed for an SNR of 2.2 dB. For the T = 5 channel, the rates R_1, \cdots, R_5 are set to 0, 0.4948, 0.5643, 0.5917 and 0.6058, respectively. The overall code rate is 0.4513. For the T = 10 channel, the rates R_1, \cdots, R_{10} are set to 0, 0.5177, 0.5869, 0.6109, 0.6229, 0.6302, 0.6364, 0.6397, 0.6430 and 0.6453, respectively. The overall code rate is 0.5533.

The BER performance with both perfect decision feedback and actual decision feedback is shown in Fig. 4. In both cases, the SNR at which the BER drops below 10^{-5} is less than 0.7 dB away from the channel capacity. Furthermore, Fig. 4 shows that the scheme is robust against decision feedback errors when LDPC codes of large block length are used.

Fig. 4. Performance of irregular LDPC codes of length 100,000 on independent block fading channels with T = 5 and T = 10 (gaps to capacity of 0.68 dB and 0.64 dB, respectively).

C. Rate design

In this section, we design the code rates for a T = 10 channel when the number of codes is restricted to K, where K = 2, 3, 5, 10. The channel SNR is 2.2 dB. The code rates are designed according to Section V, and the results are shown in Fig. 5. The performance loss caused by using fewer codes is given in Table I. It shows that using five codes (one of them actually being training symbols), i.e., K = 5, incurs only a 0.1 dB performance loss. There will be an even more significant reduction in the number of codes for channels with larger T; as we can see from Fig. 1, most sub-channels have similar capacity in the case of T = 20.

Fig. 5. Code rates for various K.

TABLE I
PERFORMANCE LOSS FOR DIFFERENT K

K     overall rate    loss in mutual info.    loss in SNR
2     0.4659          15.8%                   1.6 dB
3     0.5312          4%                      0.4 dB
5     0.5483          1%                      0.1 dB
10    0.5533          0                       0 dB

VII. CONCLUSION

In this paper, we apply a successive decoding scheme to an independent block fading channel whose state is unknown at the receiver. This scheme decomposes the block fading channel into a bank of independent fading sub-channels with partially known state. LDPC codes are then applied to each sub-channel. The successive decoding strategy is proven to preserve mutual information. Our simulations show that the BER performance of the system is within 0.7 dB of the Shannon capacity of the binary-input block fading channel, a better capacity-approaching ability than that of the iterative receiver strategy. The main disadvantage, however, is the large delay, since the entire block of NT bits must be received before decoding. Another disadvantage is the complexity and storage required at the receiver for decoding T − 1 irregular LDPC codes.

REFERENCES

[1] R. McEliece and W. Stark, "Channels with block interference," IEEE Trans. Inform. Theory, vol. IT-30, pp. 44–53, Jan. 1984.
[2] K. Fu and A. Anastasopoulos, "Analysis and design of LDPC codes for time-selective complex-fading channels," IEEE Trans. Wireless Commun., vol. 4, no. 3, pp. 1175–1185, May 2005.
[3] R.-R. Chen, R. Koetter, U. Madhow, and D. Agrawal, "Joint noncoherent demodulation and decoding for the block fading channel: a practical framework for approaching Shannon capacity," IEEE Trans. Commun., vol. 51, no. 10, Oct. 2003.
[4] T. Li and O. M. Collins, "A successive decoding strategy for channels with memory," in Proc. IEEE Int. Symp. Information Theory, Adelaide, Australia, Sep. 2005.
[5] K. R. Narayanan and N. Nangare, "A BCJR-DFE based receiver for achieving near capacity performance on inter symbol interference channels," in Proc. 42nd Annual Allerton Conf. on Communication, Control and Computing, Monticello, IL, Oct. 2004.
[6] H. D. Pfister, J. B. Soriaga, and P. H. Siegel, "On the achievable information rates of finite state ISI channels," in Proc. GLOBECOM 2001, San Antonio, TX, Nov. 2001, pp. 2992–2996.
[7] T. Li, X. Jin, and O. M. Collins, "Approaching capacity on correlated fading channels with unknown state," in Proc. IEEE Int. Symp. Information Theory, Adelaide, Australia, Sep. 2005.
[8] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[9] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice Hall, 1993.
[10] T. J. Richardson, A. Shokrollahi, and R. L. Urbanke, "Design of capacity-approaching irregular low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 619–637, Feb. 2001.
