IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 1, JANUARY 2006


On the Distribution of SINR for the MMSE MIMO Receiver and Performance Analysis

Ping Li, Debashis Paul, Ravi Narasimhan, Member, IEEE, and John Cioffi, Fellow, IEEE

Abstract—This correspondence studies the statistical distribution of the signal-to-interference-plus-noise ratio (SINR) for the minimum mean-square error (MMSE) receiver in multiple-input multiple-output (MIMO) wireless communications. The channel model is assumed to be (transmit) correlated Rayleigh flat-fading with unequal powers. The SINR can be decomposed into two independent random variables: $\mathrm{SINR} = \mathrm{SINR}^{ZF} + T$, where $\mathrm{SINR}^{ZF}$ corresponds to the SINR for a zero-forcing (ZF) receiver and has an exact Gamma distribution. This correspondence focuses on characterizing the statistical properties of $T$ using the results from random matrix theory. The first three asymptotic moments of $T$ are derived for uncorrelated channels and channels with equicorrelations. For general correlated channels, some limiting upper bounds for the first three moments are also provided. For uncorrelated channels and correlated channels satisfying certain conditions, it is proved that $T$ converges to a Normal random variable. A Gamma distribution and a generalized Gamma distribution are proposed as approximations to the finite sample distribution of $T$. Simulations suggest that these approximate distributions can be used to estimate accurately the probability of errors even for very small dimensions (e.g., two transmit antennas).

Index Terms—Asymptotic distributions, channel correlation, error probability, Gamma approximation, minimum mean square error (MMSE) receiver, multiple-input multiple-output (MIMO) system, random matrix, signal-to-interference-plus-noise ratio (SINR).

Manuscript received January 20, 2005; revised September 9, 2005. P. Li is with the Department of Statistics, Stanford University, Stanford, CA 94305 USA. D. Paul is with the Department of Statistics, University of California, Davis, Davis, CA 95616 USA. R. Narasimhan is with the Department of Electrical Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064 USA. J. Cioffi is with the Department of Electrical Engineering, Stanford University, Stanford, CA 94305 USA (e-mail: cioffi@stanford.edu). Communicated by R. R. Müller, Associate Editor for Communications. Digital Object Identifier 10.1109/TIT.2005.860466

0018-9448/$20.00 © 2006 IEEE

I. INTRODUCTION

This study considers the following signal and channel model in a multiple-input multiple-output (MIMO) system:

$$y_r = \frac{1}{\sqrt{m}} H_W R_t^{1/2} P^{1/2} x_t + n_c = \frac{1}{\sqrt{m}} H x_t + n_c \tag{1}$$

where $x_t \in \mathbb{C}^{p}$ is the (normalized) transmitted signal vector and $y_r \in \mathbb{C}^{m}$ is the received signal vector. Here $p$ is the number of transmit antennas and $m$ is the number of receive antennas. $H_W \in \mathbb{C}^{m\times p}$ consists of independent and identically distributed (i.i.d.) standard complex Normal entries. $R_t \in \mathbb{C}^{p\times p}$ is the transmitter correlation matrix. $P = \mathrm{diag}[\tilde c_1, \tilde c_2, \ldots, \tilde c_p] \in \mathbb{R}^{p\times p}$, with $\tilde c_k = c_k \frac{m}{p}$, where $c_k$ is the signal-to-noise ratio (SNR) for the $k$th spatial stream. This definition of SNR is consistent with [1, Sec. 7.4]. $H = H_W R_t^{1/2} P^{1/2} \in \mathbb{C}^{m\times p}$ is treated as the channel matrix. $n_c \in \mathbb{C}^{m}$ is the complex noise vector and is assumed to have zero mean and identity covariance. Note that the power matrix $P$ has terms involving the variance of the noise. The correlation matrix $R_t$ and power matrix $P$ are assumed to be nonrandom. Also, we restrict our attention to $p \le m$.

We consider the popular linear minimum mean-square error (MMSE) receiver. Conditional on the channel matrix $H$, the signal-to-interference-plus-noise ratio (SINR) on the $k$th spatial stream can be expressed as (e.g., [1]–[6])

$$\mathrm{SINR}_k = \frac{1}{\mathrm{MMSE}_k} - 1 = \frac{1}{\left[\left(I_p + \frac{1}{m} H^\dagger H\right)^{-1}\right]_{kk}} - 1 \tag{2}$$

where $I_p$ is a $p\times p$ identity matrix, and $H^\dagger$ is the Hermitian transpose of $H$. Note that (2), in the same form as equation (7.49) of [1], is derived based on the second-order statistics of the input signals and is not restricted to binary signals.

For binary inputs, Verdú [4, eq. (6.47)] provides the exact formula for computing the bit-error rate (BER) (also see [7]). Conditional on $H$, this BER formula requires computing $2^{p-1}$ Q-functions. To compute the BER unconditionally, we need to sample $H$ enough times (e.g., $10^5$) to get a reliable estimate. When $p \ge 32$ (or $p \ge 64$), the computations become intractable [4], [8].

Recently, the study of the asymptotic properties of multiuser receivers (e.g., [2]–[4], [6], [8]–[11]) has received a lot of attention. Works that relate directly to the content of this correspondence include Tse and Hanly [11] and Verdú and Shamai [6], who independently derived the asymptotic first moment of SINR for uncorrelated channels. Tse and Zeitouni [3] proved the asymptotic Normality of SINR for the equal power case, and commented on the possibility of extending the result to the unequal powers scenario. Zhang et al. [12] proved the asymptotic Normality of the multiple-access interference (MAI), which is closely related to SINR. Guo et al. [8] proved the asymptotic Normality of the decision statistics for a variety of linear multiuser receivers; [8] considered a general power distribution and the corresponding unconditional asymptotic behavior. Based on the asymptotic Normality results, Poor and Verdú [2] (also in [4], [8]) proposed using the limiting BER (denoted by $\mathrm{BER}_\infty$) for binary modulations, which is a single Q-function

$$\mathrm{BER}_\infty = Q\left(\sqrt{E(\mathrm{SINR}_k)_\infty}\right) = \int_{\sqrt{E(\mathrm{SINR}_k)_\infty}}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt \tag{3}$$

where $E(\mathrm{SINR}_k)_\infty$ denotes the asymptotic first moment of $\mathrm{SINR}_k$.

Equation (3) is convenient and accurate for large dimensions. However, its accuracy for small dimensions is of some concern. For instance, [8] compared the asymptotic BER with simulation results, which showed that even with $p = 64$ there existed significant discrepancies. In general, (3) will underestimate the true BER. For example, in our simulations, when $m = 16$, $p = 8$, SNR $= 15$ dB, the asymptotic BER given by (3) is roughly $\frac{1}{10000}$ of the exact BER. In current practice, code-division multiple-access (CDMA) channels with $m, p$ between 32 and 64 are typical, and in multiple-antenna systems arrays of 4 antennas are typical, but arrays with 8 to 16 antennas would be feasible in the near future [9]. Therefore, it would be useful if

one could compute error probabilities both efficiently and accurately. Another motivation for accurate and simple BER expressions comes from system optimization designs [13], [14].

This study addresses three aspects of improving the accuracy in computing the probability of errors. First, the known formulas for the asymptotic moments can be improved at small dimensions. Second, while the asymptotic BER (3) converges very slowly [4, p. 305], the accuracy can be improved by considering other asymptotically equivalent distributions using higher moments. Third, the presence of channel correlations may (seriously) affect the moments and BER. In fact, when the dimension increases, the effect of correlation tends to invalidate the independence assumption [15, p. 222].

To the best of our knowledge, the asymptotic SINR results in the existing literature do not take into account correlations. Müller [16] has commented that the presence of correlations makes the analysis very difficult. The channel model in (1) takes into account the (transmit) channel correlations. This model does not consider the receiver correlation and assumes a Gaussian channel (Rayleigh flat-fading), while [3] and [8] did not assume Gaussianity. Removing the Normality assumption will still result in the same first moment of SINR but a different second moment (see [3]). Ignoring the channel correlation, however, may produce quite different results even for the first moment. For example, it will be shown later that, with equal power and constant correlation $\rho$ (equicorrelation model), the first moment is only about $(1-\rho)$ times the first moment without correlation.

Our work starts with a crucial observation: assuming a correlated channel model in (1), the SINR can be decomposed into two independent components

$$\mathrm{SINR}_k = \mathrm{SINR}_k^{ZF} + T_k = S_k + T_k \tag{4}$$

where $\mathrm{SINR}_k^{ZF}$, denoted by $S_k$, is the corresponding SINR for the zero-forcing (ZF) receiver, which, conditional on $H$, can be expressed as (e.g., [1], [3], [17])

$$\mathrm{SINR}_k^{ZF} = \frac{1}{\left[\left(\frac{1}{m} H^\dagger H\right)^{-1}\right]_{kk}}. \tag{5}$$

It will be shown that $S_k$ has a Gamma distribution. Because $S_k$ and $T_k$ are independent, we can focus only on $T_k$. As $S_k$ is often the dominating component, separating out $S_k$ is expected to improve the accuracy of the approximation. For uncorrelated channels and channels with equicorrelation, we derive the first three asymptotic moments for $T_k$ (thus, for $\mathrm{SINR}_k$ also), which match the simulations remarkably well for finite dimensions. For general correlated channels, some limiting upper bounds for the first three moments are proposed.

We prove the asymptotic Normality of $T_k$ for correlated channels under certain conditions. The proof of asymptotic Normality provides a rigorous basis for proposing approximate distributions. Since $T_k$ is proved to converge to a nonrandom distribution (i.e., a Normal with nonrandom mean and variance), the same asymptotic results hold for the "unconditional" $\mathrm{SINR}_k$, by dominated convergence. The subtle difference between conditional and unconditional $\mathrm{SINR}_k$ becomes important if one would like to extend to random power distributions or random correlations (see [8]).

As an alternative to (3), a well-known approximation to the true BER for binary modulations would be

$$\mathrm{BER}_a = \int_0^\infty \int_{\sqrt{x}}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt\, dF_{\mathrm{SINR}}(x) \tag{6}$$

where $F_{\mathrm{SINR}}$ is the cumulative distribution function (CDF) of $\mathrm{SINR}_k$. Equation (6) converges to (3) as long as the asymptotic Normality holds. We propose using a Gamma and a generalized Gamma distribution to approximate $F_{\mathrm{SINR}}$. In particular, if $T_k$ is approximated as a generalized Gamma random variable, then, combined with the exact distribution of $S_k$, we are able to produce very accurate BER curves even for $m = 4$.

We will use binary signals as a test case for evaluating our methods. We assume binary frequency-shift keying (BFSK) for obtaining the same limiting BER (3) as in [2], [4], [8]. It should be clear that our methodology applies to other constellations such as $M$-QAM.

The correspondence is organized as follows. Section II describes how to decompose $\mathrm{SINR}_k = \mathrm{SINR}_k^{ZF} + T_k$. Section III derives the exact distribution of $\mathrm{SINR}_k^{ZF}$ and its independence from $T_k$. Section IV contains the derivation of the first three asymptotic moments (or upper bounds on moments) of $T_k$. Asymptotic Normality of $T_k$ is proved in Section V. In Section VI, the Gamma and generalized Gamma distribution approximations are proposed. Section VII demonstrates how to apply our results on SINR in computing the probability of errors. Section VIII compares our theoretical results with Monte Carlo simulations.

II. DECOMPOSITION OF $\mathrm{SINR}_k$

Let $A = I_p + \frac{1}{m} H^\dagger H$. Apply the following shifting operation:

$$A = (a_1\ a_2\ \ldots\ a_k\ \ldots\ a_p) \xrightarrow{\text{shift}} \tilde A = \begin{pmatrix} A_{kk} & a_{k(-k)}^\dagger \\ a_{k(-k)} & A_{(-k,-k)} \end{pmatrix} \tag{7}$$

where $a_k \in \mathbb{C}^{p\times 1}$ stands for the $k$th column of $A$; $a_{k(-k)} \in \mathbb{C}^{(p-1)\times 1}$ is $a_k$ with the $k$th entry removed. $A_{(-k,-k)} \in \mathbb{C}^{(p-1)\times(p-1)}$ is $A$ with the $k$th column and $k$th row removed. Then

$$[A^{-1}]_{kk} = [\tilde A^{-1}]_{11} = \frac{1}{A_{kk} - a_{k(-k)}^\dagger \left(A_{(-k,-k)}\right)^{-1} a_{k(-k)}}. \tag{8}$$

From (8), it follows that (2) can be expressed as

$$\mathrm{SINR}_k = A_{kk} - a_{k(-k)}^\dagger \left(A_{(-k,-k)}\right)^{-1} a_{k(-k)} - 1 = \frac{1}{m} h_k^\dagger h_k - \frac{1}{m^2} h_k^\dagger H_{(-k)} \left(I_{p-1} + \frac{1}{m} H_{(-k)}^\dagger H_{(-k)}\right)^{-1} H_{(-k)}^\dagger h_k \tag{9}$$

where $h_k \in \mathbb{C}^{m\times 1}$ stands for the $k$th column of $H$; $H_{(-k)} \in \mathbb{C}^{m\times(p-1)}$ is $H$ with the $k$th column removed.

Consider the singular value decomposition (SVD): $H_{(-k)} = U D V^\dagger$, $U \in \mathbb{C}^{m\times(p-1)}$, $D \in \mathbb{R}^{(p-1)\times(p-1)}$, $V \in \mathbb{C}^{(p-1)\times(p-1)}$, $U^\dagger U = I_{p-1}$, $V^\dagger V = V V^\dagger = I_{p-1}$. Let $U_c$ be the orthogonal complement of $U$, implying that $U^\dagger U_c = 0_{(p-1)\times(m-p+1)}$, and that $\tilde U = [U\ U_c]$ satisfies $\tilde U^\dagger \tilde U = \tilde U \tilde U^\dagger = I_m$. Plug these into (9) to get

$$\mathrm{SINR}_k = \frac{1}{m} h_k^\dagger \tilde U \tilde U^\dagger h_k - \frac{1}{m^2} h_k^\dagger U D^2 \left(I_{p-1} + \frac{1}{m} D^2\right)^{-1} U^\dagger h_k. \tag{10}$$

Now, expand

$$\tilde U^\dagger h_k = \begin{pmatrix} U^\dagger h_k \\ U_c^\dagger h_k \end{pmatrix} = \begin{pmatrix} t_k \\ s_k \end{pmatrix}, \quad t_k \in \mathbb{C}^{(p-1)\times 1},\ s_k \in \mathbb{C}^{(m-p+1)\times 1}.$$

Then (10) becomes

$$\mathrm{SINR}_k = \frac{1}{m} s_k^\dagger s_k + \frac{1}{m} t_k^\dagger \left(I_{p-1} + \frac{1}{m} D^2\right)^{-1} t_k = \frac{1}{m} \sum_{i=1}^{m-p+1} \|s_{k,i}\|^2 + \frac{1}{m} \sum_{i=1}^{p-1} \frac{\|t_{k,i}\|^2}{1 + \frac{1}{m} d_i^2} = S_k + T_k \tag{11}$$

where $d_i$ is the $i$th diagonal entry of $D$.
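The decomposition in (11) uses only linear algebra, so it can be checked numerically for any single channel draw. The following is a minimal sketch, assuming NumPy; the function names `mmse_sinr` and `decompose_sinr` and the test dimensions are illustrative choices, not part of the original correspondence.

```python
import numpy as np

def mmse_sinr(H, k):
    """SINR of stream k from (2): 1 / [(I_p + H^H H / m)^{-1}]_kk - 1."""
    m, p = H.shape
    A = np.eye(p) + H.conj().T @ H / m
    return 1.0 / np.linalg.inv(A)[k, k].real - 1.0

def decompose_sinr(H, k):
    """Return (S_k, T_k) as in (11), using the SVD of H with column k removed."""
    m, p = H.shape
    h_k = H[:, k]
    H_rest = np.delete(H, k, axis=1)
    # With full_matrices=True the left factor is an m x m unitary matrix:
    # its first p-1 columns play the role of U, the remaining ones of U_c.
    U_full, d, _ = np.linalg.svd(H_rest, full_matrices=True)
    t = U_full[:, :p - 1].conj().T @ h_k
    s = U_full[:, p - 1:].conj().T @ h_k
    S = np.sum(np.abs(s) ** 2) / m
    T = np.sum(np.abs(t) ** 2 / (1.0 + d ** 2 / m)) / m
    return S, T

# Illustrative check on one random channel (dimensions chosen arbitrarily).
rng = np.random.default_rng(0)
m, p, k = 6, 4, 1
H = (rng.standard_normal((m, p)) + 1j * rng.standard_normal((m, p))) / np.sqrt(2)
S, T = decompose_sinr(H, k)
sinr = mmse_sinr(H, k)
zf = 1.0 / np.linalg.inv(H.conj().T @ H / m)[k, k].real  # ZF SINR from (5)
```

Here `sinr` should agree with `S + T` and `zf` with `S` up to floating-point error, for any full-rank `H` with $p \le m$.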


The same shifting and SVD operations on $\mathrm{SINR}_k^{ZF}$ in (5) yield

$$\mathrm{SINR}_k^{ZF} = \frac{1}{m} s_k^\dagger s_k = S_k. \tag{12}$$

This proves the decomposition $\mathrm{SINR}_k = \mathrm{SINR}_k^{ZF} + T_k$. So far, no statistical properties of $H$ are used; hence, the decomposition is not restricted to Gaussian channels.

In the case $p > m$, $\mathrm{SINR}_k$ in (2) is still valid. The same SVD yields

$$\mathrm{SINR}_k = \frac{1}{m} \sum_{i=1}^{m} \frac{\|t_{k,i}\|^2}{1 + \frac{1}{m} d_i^2} = T_k \tag{13}$$

because $H_{(-k)}$ has only $m$ nonzero singular values when $p > m$. The results from random matrix theory that we will use later also apply to $p > m$. However, in this study, we restrict our attention to $p \le m$.

III. DISTRIBUTIONS OF $S_k$ AND $t_k$

The distribution of $S_k$ and its relationship with $T_k$ can be obtained from the properties of the multivariate Normal distribution (see [18, Ch. 3]). Here, $R = P^{1/2} R_t P^{1/2}$, which is assumed to be positive definite, can be considered as the generalized covariance matrix. We denote a Normal distribution as $N(\text{mean}, \text{Var})$, and a complex Normal distribution as $CN(\text{mean}, \text{Var})$. Thus, each row of $H_W \sim CN(0, I_p)$, and each row of $H \sim CN(0, R)$. Observe that

$$\Sigma_k \triangleq \frac{1}{[R^{-1}]_{kk}} = \frac{\tilde c_k}{[R_t^{-1}]_{kk}} = r_{kk} - r_{k(-k)}^\dagger \left(R_{(-k,-k)}\right)^{-1} r_{k(-k)} \tag{14}$$

where $r_{kk}$ is the $(k,k)$th element of $R$, $r_{k(-k)}$ is the $k$th column of $R$ with the $k$th element removed, $R_{(-k,-k)}$ is $R$ with the $k$th row and $k$th column removed, and $[R^{-1}]_{kk}$ is the $(k,k)$th entry of $R^{-1}$. Conditional on $H_{(-k)}$, $h_k$, $t_k$, and $s_k$ are all Normally distributed:

$$h_k \mid H_{(-k)} \sim -H_{(-k)} \left(R_{(-k,-k)}\right)^{-1} r_{k(-k)} + CN(0, \Sigma_k I_m) \tag{15}$$

$$t_k = U^\dagger h_k \mid H_{(-k)} \sim -D V^\dagger \left(R_{(-k,-k)}\right)^{-1} r_{k(-k)} + CN(0, \Sigma_k I_{p-1}) \tag{16}$$

$$s_k = U_c^\dagger h_k \mid H_{(-k)} \sim CN(0, \Sigma_k I_{m-p+1}). \tag{17}$$

(Recall that $H_{(-k)} = U D V^\dagger$; $U^\dagger H_{(-k)} = D V^\dagger$; $U_c^\dagger H_{(-k)} = U_c^\dagger U D V^\dagger = 0$.)

We state our first lemma.

Lemma 1: $\mathrm{SINR}_k^{ZF} = S_k$ is a Gamma random variable, $S_k \sim G\left(m-p+1, \frac{1}{m}\Sigma_k\right)$. $S_k$ is independent of $T_k$.

Proof: Recall that

$$S_k = \frac{1}{m} \sum_{i=1}^{m-p+1} \|s_{k,i}\|^2.$$

The independence of $S_k$ and $T_k$ follows from (16) and (17). The conditional distribution of $s_k$ given $H_{(-k)}$ does not depend on $H_{(-k)}$. Therefore, (17) is also the unconditional distribution of $s_k$. The elements of $s_k$ are independent and identically distributed (i.i.d.) $CN(0, \Sigma_k)$, and therefore

$$S_k \sim G\left(m-p+1, \frac{1}{m}\Sigma_k\right).$$

Gore et al. [19] proved that $\mathrm{SINR}_k^{ZF}$ is a Gamma random variable for the equal power case. Both Müller et al. [17] and Tse and Zeitouni [3] used a Beta distribution to approximate $\mathrm{SINR}_k^{ZF}$.

The first three moments of $S_k$ are

$$E(S_k) = \frac{m-p+1}{m}\Sigma_k, \quad \mathrm{Var}(S_k) = \frac{m-p+1}{m^2}\Sigma_k^2, \quad \mathrm{Sk}(S_k) = E\left(S_k - E(S_k)\right)^3 = 2\,\frac{m-p+1}{m^3}\Sigma_k^3. \tag{18}$$

The following lemma concerns the distribution of $t_k$.

Lemma 2: Conditional on $H_{(-k)}$, $\|t_{k,i}\|^2$ is a noncentral Chi-squared random variable with a moment generating function (MGF)

$$E\left(\exp\left(\|t_{k,i}\|^2 y\right) \mid H_{(-k)}\right) = \frac{1}{1 - y\Sigma_k} \exp\left(\frac{y\,\|z_i\|^2}{1 - y\Sigma_k}\right) \tag{19}$$

where $z_i = -d_i \left(V^\dagger R_{(-k,-k)}^{-1} r_{k(-k)}\right)_i$. For uncorrelated channels, $\|t_{k,i}\|^2 \sim G(1, \tilde c_k)$, independent of $H_{(-k)}$.

Proof: From (16), $t_{k,i} \mid H_{(-k)} \sim CN(z_i, \Sigma_k)$. For uncorrelated channels, $z_i = 0$ and $\Sigma_k = \tilde c_k$; hence $\|t_{k,i}\|^2 \sim G(1, \tilde c_k)$. In general, conditional on $H_{(-k)}$, $\frac{2}{\Sigma_k}\|t_{k,i}\|^2$ is a standard noncentral Chi-squared random variable with two degrees of freedom and noncentrality parameter $\frac{2}{\Sigma_k}\|z_i\|^2$, which gives the MGF in (19).

For future use, we give expressions for three moments of $\|t_{k,i}\|^2$, conditional on $H_{(-k)}$:

$$E\left(\|t_{k,i}\|^2 \mid H_{(-k)}\right) = \Sigma_k + \|z_i\|^2, \quad \mathrm{Var}\left(\|t_{k,i}\|^2 \mid H_{(-k)}\right) = \Sigma_k^2 + 2\|z_i\|^2 \Sigma_k, \quad \mathrm{Sk}\left(\|t_{k,i}\|^2 \mid H_{(-k)}\right) = 2\Sigma_k^3 + 6\|z_i\|^2 \Sigma_k^2. \tag{20}$$

Fig. 1. The ratios $E\left(\frac{T_k}{S_k}\right)$, obtained from $10^5$ simulations, with respect to $\frac{p}{m}$, for two (equal power) SNR levels ($c = 0$ dB, 30 dB), and three levels of equicorrelations ($\rho = 0, 0.5, 0.8$).
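Lemma 1 and the moments in (18) lend themselves to a quick Monte Carlo check. The sketch below assumes NumPy and an uncorrelated channel ($R_t = I_p$), for which $\Sigma_k = \tilde c_k$; the helper name `zf_sinr_samples`, the power values, and the sample size are hypothetical choices made only for illustration.

```python
import numpy as np

def zf_sinr_samples(m, p, c_tilde, k, n, rng):
    """Draw n samples of S_k = SINR_k^ZF from (5) for H = H_W P^{1/2} (R_t = I_p)."""
    shape = (n, m, p)
    H_w = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    H = H_w * np.sqrt(c_tilde)                      # scales column j by sqrt(c_tilde[j])
    G = np.conj(np.swapaxes(H, 1, 2)) @ H / m       # batch of (1/m) H^H H
    return 1.0 / np.linalg.inv(G)[:, k, k].real

rng = np.random.default_rng(1)
m, p, k, n = 8, 4, 0, 20000
c_tilde = np.array([2.0, 1.0, 0.5, 3.0])            # illustrative unequal powers
S = zf_sinr_samples(m, p, c_tilde, k, n, rng)

sigma_k = c_tilde[k]                                # Sigma_k in (14) when R_t = I_p
mean_18 = (m - p + 1) / m * sigma_k                 # moments from (18)
var_18 = (m - p + 1) / m**2 * sigma_k**2
sk_18 = 2 * (m - p + 1) / m**3 * sigma_k**3
sample_sk = np.mean((S - S.mean()) ** 3)            # sample third central moment
```

With these settings the sample mean, variance, and third central moment of `S` should land close to `mean_18`, `var_18`, and `sk_18`; the powers of the other streams do not enter, as the lemma predicts.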

The significance of the decomposition of $\mathrm{SINR}_k$ can be seen from Fig. 1, which plots $E\left(\frac{T_k}{S_k}\right)$ as a function of $\frac{p}{m}$, $c$ (SNR), and correlation $\rho$, for $m = 16$. Here we consider the equal power case $c_k = c$, with $R_t$ being an equicorrelation matrix (i.e., $R_t$ consists of 1's on the main diagonal and $\rho$'s in all off-diagonal entries). This figure suggests that in the major range of SNR and $\frac{p}{m}$, $S_k$ might be the dominating component.

IV. ASYMPTOTIC MOMENTS

This section derives the asymptotic moments of $T_k$, written as

$$T_k = \frac{1}{m} \sum_{i=1}^{p-1} \frac{\|t_{k,i}\|^2}{1 + \frac{1}{m} d_i^2} = \frac{1}{m} \sum_{i=1}^{p-1} \lambda_i \|t_{k,i}\|^2 = \frac{1}{m} t_k^\dagger \Lambda t_k \tag{21}$$

where $\lambda_i = \frac{1}{1 + \frac{1}{m} d_i^2}$ and $\Lambda = \left(I_{p-1} + \frac{1}{m} D^2\right)^{-1} = \mathrm{diag}[\lambda_1, \ldots, \lambda_{p-1}]$.
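Because the entries of $t_k$ are conditionally independent given $H_{(-k)}$, (20) and (21) together give the conditional moments of $T_k$ as weighted sums over the $\lambda_i$. The sketch below assumes NumPy; `conditional_moments_T` is a hypothetical helper, and the Monte Carlo comparison covers only the uncorrelated case $z_i = 0$.

```python
import numpy as np

def conditional_moments_T(H_rest, sigma_k, z=None):
    """Mean, variance and third central moment of T_k given H_(-k), via (20)-(21)."""
    m, q = H_rest.shape                               # q = p - 1
    d = np.linalg.svd(H_rest, compute_uv=False)       # singular values d_i
    lam = 1.0 / (1.0 + d**2 / m)                      # diagonal of Lambda
    z2 = np.zeros(q) if z is None else np.abs(z) ** 2
    mean = np.sum(lam * (sigma_k + z2)) / m
    var = np.sum(lam**2 * (sigma_k**2 + 2 * z2 * sigma_k)) / m**2
    third = np.sum(lam**3 * (2 * sigma_k**3 + 6 * z2 * sigma_k**2)) / m**3
    return lam, mean, var, third

# Illustrative check: fix one H_(-k), then simulate t_k ~ CN(0, sigma_k I).
rng = np.random.default_rng(2)
m, p, sigma_k, n = 8, 5, 1.5, 200000
H_rest = (rng.standard_normal((m, p - 1)) + 1j * rng.standard_normal((m, p - 1))) / np.sqrt(2)
lam, mean_T, var_T, third_T = conditional_moments_T(H_rest, sigma_k)
t = np.sqrt(sigma_k / 2) * (rng.standard_normal((n, p - 1)) + 1j * rng.standard_normal((n, p - 1)))
T_mc = (np.abs(t) ** 2 @ lam) / m                     # samples of T_k from (21)
```

For $z_i = 0$ the three moments reduce to $\Sigma_k \mathrm{tr}(\Lambda)/m$, $\Sigma_k^2 \mathrm{tr}(\Lambda^2)/m^2$, and $2\Sigma_k^3 \mathrm{tr}(\Lambda^3)/m^3$, which is why the traces of powers of $\Lambda$ drive the rest of this section.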


We use some known results for the empirical eigenvalue distribution (ESD) of the product of two random matrices (e.g., Silverstein [20], Bai [21], Silverstein and Bai [22]) to study the asymptotic properties of $T_k$. We work under the regime $p \to \infty$, $m \to \infty$, $\frac{p-1}{m} \to \gamma \in (0, \infty)$. In the rest of the correspondence, whenever we mention "in the limit," we refer to this condition.

Suppose that $\{\nu_i\}_{i=1}^{p-1}$ are the eigenvalues of $R_{(-k,-k)}$. Suppose further that the ESD of $R = P^{1/2} R_t P^{1/2}$ converges as $p \to \infty$ to a nonrandom measure $F^R$. By Theorem 1.1 of [20], under some weak regularity conditions on $F^R$ (including that the support of $F^R$ is compact and does not contain 0), in the limit, the ESD of $\frac{1}{m} H_{(-k)}^\dagger H_{(-k)}$, denoted by $\hat J$, converges to a measure $J$, whose Stieltjes transform, denoted by $M_J$, satisfies

$$M_J(z) \triangleq \int \frac{J(dx)}{x - z} = \int \frac{dF^R(\nu)}{\nu\left(1 - \gamma - \gamma z M_J(z)\right) - z} \tag{22}$$

(see [8]). For finite $p$, the last integral in (22) can be approximated by

$$\frac{1}{p-1} \sum_{i=1}^{p-1} \frac{1}{\nu_i\left(1 - \gamma - \gamma z M_J(z)\right) - z}.$$

Note that for uncorrelated channels, $\nu_i = \tilde c_i$. In general, (22) requires that $z \in \mathbb{C}^+ = \{z \in \mathbb{C} : \mathrm{Im}(z) > 0\}$, and $M_J(z)$ is the unique solution in $\{M \in \mathbb{C} : -\frac{1-\gamma}{z} + \gamma M \in \mathbb{C}^+\}$. However, since $M_J(-1) \ge 0$, since $R$ is positive definite (implying that all $\nu_i$'s are positive), and since we only consider $\gamma \le 1$, it follows that $\frac{1}{\nu(1 - \gamma - \gamma z M_J(z)) - z}$ and its derivatives are bounded in a complex neighborhood of $z = -1$; hence $M_J(z)$ and its derivatives are well defined at $z = -1$ by the bounded convergence theorem. Therefore,

$$\frac{\mathrm{tr}(\Lambda)}{p-1} = \frac{1}{p-1} \sum_{i=1}^{p-1} \frac{1}{1 + \frac{1}{m} d_i^2} = \int \frac{\hat J(dx)}{1 + x} = \int \frac{\hat J(dx)}{x - (-1)} \xrightarrow{p} M_J(-1) \triangleq \mu_{\gamma,c} \tag{23}$$

$$\frac{\mathrm{tr}(\Lambda^2)}{p-1} = \frac{1}{p-1} \sum_{i=1}^{p-1} \frac{1}{\left(1 + \frac{1}{m} d_i^2\right)^2} = \int \frac{\hat J(dx)}{(x - (-1))^2} \xrightarrow{p} M_J'(-1) \triangleq \sigma^2_{\gamma,c} \tag{24}$$

$$\frac{\mathrm{tr}(\Lambda^3)}{p-1} = \frac{1}{p-1} \sum_{i=1}^{p-1} \frac{1}{\left(1 + \frac{1}{m} d_i^2\right)^3} = \int \frac{\hat J(dx)}{(x - (-1))^3} \xrightarrow{p} \frac{1}{2} M_J''(-1) \triangleq \eta^3_{\gamma,c} \tag{25}$$

where "$\xrightarrow{p}$" denotes "converge in probability," and $M_J'(z)$ and $M_J''(z)$ are the first and second derivatives of $M_J(z)$, respectively. In general, $M_J(z)$, $M_J'(z)$, and $M_J''(z)$ have to be solved numerically except in some simple cases (e.g., an uncorrelated channel with equal powers). Let $\psi_i(z) = \nu_i\left(1 - \gamma - \gamma z M_J(z)\right) - z$. $M_J'(z)$ and $M_J''(z)$ can be approximated by solving

$$M_J'(z)\left(1 - \frac{1}{p-1} \sum_{i=1}^{p-1} \frac{\gamma z \nu_i}{\psi_i^2(z)}\right) = \frac{1}{p-1} \sum_{i=1}^{p-1} \frac{\gamma \nu_i M_J(z) + 1}{\psi_i^2(z)}$$

$$M_J''(z)\left(1 - \frac{1}{p-1} \sum_{i=1}^{p-1} \frac{\gamma z \nu_i}{\psi_i^2(z)}\right) = \frac{1}{p-1} \sum_{i=1}^{p-1} \frac{2\gamma \nu_i M_J'(z)}{\psi_i^2(z)} + \frac{2}{p-1} \sum_{i=1}^{p-1} \frac{\left(\gamma \nu_i M_J(z) + \gamma \nu_i z M_J'(z) + 1\right)^2}{\psi_i^3(z)}.$$

$M_J(z)$ and its derivatives are not sufficient for computing the moments of $T_k$, which involve $\|t_{k,i}\|^2$. We will combine the conditional moments of $\|t_{k,i}\|^2$ in (20) with $M_J(z)$ to get the asymptotic moments of $T_k$. Three cases are considered in the following three subsections. First, we consider uncorrelated channels with unequal powers and obtain the asymptotic moments of $T_k$ explicitly, although the results cannot be expressed in closed form. Next, we show that, assuming equal powers, the asymptotic moments of $T_k$ for uncorrelated channels have closed-form expressions. Finally, for general correlated channels with unequal powers, we derive some limiting upper bounds for the moments of $T_k$ and give some sufficient conditions under which these upper bounds are the exact limits.

A. Asymptotic Moments of $T_k$ for Uncorrelated Channels

In this case, $R_t = I_p$, $R = P$, $\Sigma_k = \tilde c_k = c_k \frac{m}{p}$, $\|z_i\|^2 = d_i^2 \left\|\left(V^\dagger R_{(-k,-k)}^{-1} r_{k(-k)}\right)_i\right\|^2 = 0$ (because $r_{k(-k)} = 0$), and $\|t_{k,i}\|^2 \sim G(1, \tilde c_k)$, as shown in Lemma 2. The following three lemmas are proved in Appendix I.

Lemma 3:

$$E\left(\frac{p}{p-1} T_k\right) \to c_k \mu_{\gamma,c} \triangleq c_k M_J(-1). \tag{26}$$

Lemma 4:

$$\mathrm{Var}\left(\frac{p}{\sqrt{p-1}} T_k\right) \to c_k^2 \sigma^2_{\gamma,c} \triangleq c_k^2 M_J'(-1). \tag{27}$$

Lemma 5:

$$\mathrm{Sk}\left(\frac{p}{(p-1)^{1/3}} T_k\right) \to 2 c_k^3 \eta^3_{\gamma,c} \triangleq c_k^3 M_J''(-1). \tag{28}$$

Therefore, asymptotically, the first three moments of $T_k$ can be approximated as

$$E(T_k) \approx c_k \frac{p-1}{p} \mu_{\gamma,c} \tag{29}$$

$$\mathrm{Var}(T_k) \approx c_k^2 \frac{p-1}{p^2} \sigma^2_{\gamma,c} \tag{30}$$

$$\mathrm{Sk}(T_k) \approx 2 c_k^3 \frac{p-1}{p^3} \eta^3_{\gamma,c}. \tag{31}$$

Combining with the moments of $S_k$ in (18), and using the independence of $S_k$ and $T_k$, we have

$$E(\mathrm{SINR}_k) \approx c_k \frac{m-p+1}{p} + c_k \frac{p-1}{p} \mu_{\gamma,c} \tag{32}$$

$$\mathrm{Var}(\mathrm{SINR}_k) \approx c_k^2 \frac{m-p+1}{p^2} + c_k^2 \frac{p-1}{p^2} \sigma^2_{\gamma,c} \tag{33}$$

$$\mathrm{Sk}(\mathrm{SINR}_k) \approx 2 c_k^3 \frac{m-p+1}{p^3} + 2 c_k^3 \frac{p-1}{p^3} \eta^3_{\gamma,c}. \tag{34}$$

B. Closed-Form Asymptotic Moments of $T_k$ for Uncorrelated Channels With Equal Powers

In this case, $H = \sqrt{\tilde c}\, H_W$. The ESD of $\frac{1}{m} H_{(-k)}^\dagger H_{(-k)}$ converges to the well-known "Marčenko–Pastur Law" ([21, Theorem 2.5]), from which one can derive closed-form expressions for the moments by (tedious) integration. Alternatively, we directly solve for $M_J(-1)$ from the simplified version (i.e., taking $\nu_i = \tilde c$ for all $i$) of (22) to get

$$\mu_{\gamma,c} = M_J(-1) = \frac{\sqrt{\Delta} - \left(\tilde c(1-\gamma) + 1\right)}{2 \tilde c \gamma} \tag{35}$$


where $\Delta = \tilde c^2 (1-\gamma)^2 + 2\tilde c(1+\gamma) + 1$. Similarly,

$$\sigma^2_{\gamma,c} = M_J'(-1) = \mu_{\gamma,c} - \frac{1 + \tilde c(1+\gamma) - \sqrt{\Delta}}{2 \tilde c \gamma \sqrt{\Delta}} \tag{36}$$

$$\eta^3_{\gamma,c} = \frac{1}{2} M_J''(-1) = \sigma^2_{\gamma,c} - \frac{\tilde c}{\Delta^{3/2}}. \tag{37}$$

From $1 + \tilde c(1+\gamma) \ge \sqrt{\Delta}$, it follows that

$$\mu_{\gamma,c} \ge \sigma^2_{\gamma,c} \ge \eta^3_{\gamma,c}. \tag{38}$$

We always replace $\gamma$ by $\frac{p-1}{m}$ in our computations. In the literature (e.g., [3], [4], [8]), $\gamma = \frac{p}{m}$ is often used. It can be shown that the choice of $\gamma$ can have a significant impact for small dimensions.¹ Our approximate formula (32) for $E(\mathrm{SINR}_k)$ (denoted as $\tilde\mu$) is

$$\tilde\mu = c\,\frac{m-p+1}{p} + c\,\frac{p-1}{p}\,\mu_{\gamma,c} = \tilde c - \frac{1}{4}\,\frac{p-1}{m}\,\frac{1}{\gamma}\left(\sqrt{\tilde c(1+\sqrt{\gamma})^2 + 1} - \sqrt{\tilde c(1-\sqrt{\gamma})^2 + 1}\right)^2. \tag{39}$$

When $\gamma$ is replaced with $\frac{p-1}{m}$, the corresponding $\tilde\mu$ is denoted by $\tilde\mu_{p-1}$. If instead $\gamma$ is replaced with $\frac{p}{m}$, it is denoted by $\tilde\mu_p$. We can compare $\tilde\mu$ with the well-known asymptotic first moment (e.g., [3], [4, eq. (6.59)]), denoted by $\tilde\mu_R$:

$$\tilde\mu_R = \tilde c - \frac{1}{4}\left(\sqrt{\tilde c(1+\sqrt{\gamma})^2 + 1} - \sqrt{\tilde c(1-\sqrt{\gamma})^2 + 1}\right)^2. \tag{40}$$

Similarly, $\tilde\mu_{R,p-1}$ and $\tilde\mu_{R,p}$ indicate whether $\gamma = \frac{p-1}{m}$ or $\gamma = \frac{p}{m}$ is used for $\tilde\mu_R$. Obviously, $\tilde\mu_{R,p-1} = \tilde\mu_{p-1}$.

Fig. 2 plots the ratios comparing $\tilde\mu_{p-1}$ with $\tilde\mu_p$ and with $\tilde\mu_{R,p}$, as functions of $m$ and $\frac{p}{m}$. It is clear that the difference between $\tilde\mu_{p-1}$ and $\tilde\mu_{R,p}$ at small dimensions can be substantial. For example, when $m = 4$, $p = 4$, $\tilde\mu_{p-1}/\tilde\mu_{R,p}$ is almost 3. On the other hand, the difference between $\tilde\mu_{p-1}$ and $\tilde\mu_p$ is not significant. This experiment implies that our proposed approximate formulas are not only accurate (see simulation results in Section VIII) but also not as sensitive to the choice of $\gamma$.

Fig. 2. The ratios comparing $\tilde\mu_{p-1}$ with $\tilde\mu_p$ and with $\tilde\mu_{R,p}$, with respect to $m$ and $p/m$, for SNR $= 20$ dB.

The following lemma compares $\tilde\mu$ with $\tilde\mu_R$ algebraically.

Lemma 6: $\tilde\mu_p \ge \tilde\mu_{p-1} = \tilde\mu_{R,p-1} \ge \tilde\mu_{R,p}$.

Proof: See Appendix II.

Similar results can be obtained if we compare our approximate variance formula (33) with the asymptotic variance given in [3]. See the technical report [23] for details.

¹Note that in our notations, $\tilde c = c\,\frac{m}{p}$ is equivalent to "$A$" in [4, eq. (6.59)] and equivalent to "$P$" in [3].

C. Asymptotic Upper Bounds for Correlated Channels

Recall that

$$E\left(\|t_{k,i}\|^2 \mid H_{(-k)}\right) = \Sigma_k + \|z_i\|^2$$

where

$$\|z_i\|^2 = d_i^2 \left\|\left(V^\dagger (R_{(-k,-k)})^{-1} r_{k(-k)}\right)_i\right\|^2.$$

There is an important inequality

$$\sum_{i=1}^{p-1} \frac{\frac{1}{m}\|z_i\|^2}{1 + \frac{1}{m} d_i^2} = \sum_{i=1}^{p-1} \frac{\frac{1}{m} d_i^2}{1 + \frac{1}{m} d_i^2} \left\|\left(V^\dagger (R_{(-k,-k)})^{-1} r_{k(-k)}\right)_i\right\|^2 \le \left\|V^\dagger (R_{(-k,-k)})^{-1} r_{k(-k)}\right\|^2 = \left\|(R_{(-k,-k)})^{-1} r_{k(-k)}\right\|^2 \triangleq e_k. \tag{41}$$

Some limiting upper bounds for the first three moments of $T_k$ are provided in the next lemma.

Lemma 7:

$$E(T_k)^U = \frac{p-1}{m} \Sigma_k M_J(-1) + e_k \tag{42}$$

$$\mathrm{Var}(T_k)^U = \frac{p-1}{m^2} \Sigma_k^2 M_J'(-1) + (\text{terms involving } e_k) \tag{43}$$

$$\mathrm{Sk}(T_k)^U = \frac{p-1}{m^3} \Sigma_k^3 M_J''(-1) + (\text{terms involving } e_k) \tag{44}$$

where the terms involving $e_k$ in (43) and (44) also contain the same constants that appear in the proofs of Lemmas 4 and 5.

Proof: See Appendix III.

We are interested in the special case where $e_k \to 0$. If $e_k \to 0$ at any rate (respectively, faster than $O(p^{-1/2})$ or faster than $O(p^{-2/3})$), then, ignoring all the terms involving $e_k$, $E(T_k)^U$ (respectively, $\mathrm{Var}(T_k)^U$ or $\mathrm{Sk}(T_k)^U$) is still the true limit. Under these conditions, we propose the approximate moments for $\mathrm{SINR}_k$

$$E(\mathrm{SINR}_k) \approx \frac{m-p+1}{m}\Sigma_k + \frac{p-1}{m}\Sigma_k \mu_{\gamma,c} + e_k \tag{45}$$

$$\mathrm{Var}(\mathrm{SINR}_k) \approx \frac{m-p+1}{m^2}\Sigma_k^2 + \frac{p-1}{m^2}\Sigma_k^2 \sigma^2_{\gamma,c} + e_k^2 \tag{46}$$

$$\mathrm{Sk}(\mathrm{SINR}_k) \approx 2\,\frac{m-p+1}{m^3}\Sigma_k^3 + 2\,\frac{p-1}{m^3}\Sigma_k^3 \eta^3_{\gamma,c} + e_k^3. \tag{47}$$

Note that $e_k$, $e_k^2$, $e_k^3$ are retained in these expressions because simulation studies show that keeping these terms improves accuracy at very small dimensions. It turns out that in the equicorrelation situation, $e_k \to 0$ at a good rate under mild regularity conditions. The equicorrelation is the simplest correlation model (e.g., [15, eq. (4.41)]) and is often used to model closely spaced antennas or for the worst case analysis [24]. For more realistic correlation models, see Narasimhan [25].
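The quantities in this section can be evaluated with a few lines of code. The sketch below, assuming NumPy, solves the finite-$p$ form of (22) at $z = -1$ by bisection (a solver chosen here for robustness, not one prescribed by the correspondence), evaluates the closed forms (35)–(37), and assembles the first-moment approximation (45) for an equal-power equicorrelated channel. All function names are hypothetical.

```python
import numpy as np

def stieltjes_at_minus_one(nu, gamma):
    """Finite-p version of (22) at z = -1: returns M_J, M_J', M_J'' (gamma <= 1)."""
    nu = np.asarray(nu, dtype=float)
    g = lambda M: np.mean(1.0 / (nu * (1.0 - gamma + gamma * M) + 1.0))
    lo, hi = 0.0, 1.0                      # M - g(M) is increasing with a root in [0, 1]
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid - g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    M = 0.5 * (lo + hi)
    psi = nu * (1.0 - gamma + gamma * M) + 1.0          # psi_i(-1)
    denom = 1.0 + gamma * np.mean(nu / psi**2)          # 1 - gamma*z*mean(nu/psi^2) at z = -1
    M1 = np.mean((gamma * nu * M + 1.0) / psi**2) / denom
    M2 = (2.0 * gamma * M1 * np.mean(nu / psi**2)
          + 2.0 * np.mean((gamma * nu * (M - M1) + 1.0) ** 2 / psi**3)) / denom
    return M, M1, M2

def equal_power_constants(c_tilde, gamma):
    """mu, sigma^2, eta^3 from the closed forms (35)-(37)."""
    delta = c_tilde**2 * (1 - gamma) ** 2 + 2 * c_tilde * (1 + gamma) + 1
    root = np.sqrt(delta)
    mu = (root - (c_tilde * (1 - gamma) + 1)) / (2 * c_tilde * gamma)
    sigma2 = mu - (1 + c_tilde * (1 + gamma) - root) / (2 * c_tilde * gamma * root)
    eta3 = sigma2 - c_tilde / delta**1.5
    return mu, sigma2, eta3

def approx_mean_sinr(m, p, c, rho, k=0):
    """First-moment approximation (45) for equal powers c and equicorrelation rho."""
    c_tilde = c * m / p
    R = c_tilde * ((1 - rho) * np.eye(p) + rho * np.ones((p, p)))
    keep = np.arange(p) != k
    R_sub, r = R[np.ix_(keep, keep)], R[keep, k]
    w = np.linalg.solve(R_sub, r)                       # R_(-k,-k)^{-1} r_k(-k)
    sigma_k = R[k, k] - r @ w                           # Sigma_k from (14)
    e_k = np.sum(w**2)                                  # e_k from (41)
    M, _, _ = stieltjes_at_minus_one(np.linalg.eigvalsh(R_sub), (p - 1) / m)
    mean = (m - p + 1) / m * sigma_k + (p - 1) / m * sigma_k * M + e_k
    return mean, sigma_k, e_k

M, M1, M2 = stieltjes_at_minus_one(np.full(7, 2.0), 7 / 8)
mu, sigma2, eta3 = equal_power_constants(2.0, 7 / 8)
mean0, sig0, e0 = approx_mean_sinr(8, 8, 1.0, 0.0)      # uncorrelated reference
mean5, sig5, e5 = approx_mean_sinr(8, 8, 1.0, 0.5)      # equicorrelation rho = 0.5
```

With equal $\nu_i$ the numerical solution should reproduce (35)–(37), and with $\rho = 0$ the function returns $e_k = 0$ and the value of (32). For $\rho = 0.5$ the first moment drops noticeably while $e_k$ stays small.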


Lemma 8: If the correlation matrix $R_t$ consists of 1's on the main diagonal and $\rho$'s in all off-diagonal entries,² then $e_k \to 0$ in the limit, as long as $\sum_{i \ne k} \frac{c_k}{c_i} = O(p^f)$ for $f < 2$. Sufficient conditions for $e_k \to 0$ faster than $O(p^{-1/2})$ and $O(p^{-2/3})$ would be $f < \frac{3}{2}$ and $f < \frac{4}{3}$, respectively.

Proof: $R_t$ can be written as $R_t = (1-\rho) I_p + \rho\, 1_p 1_p^T$, where $1_p$ denotes a column vector of $p$ 1's. Recall that $R = P^{1/2} R_t P^{1/2}$. Thus

$$R_{(-k,-k)} = (1-\rho) P_{(-k,-k)} + \rho\, P_{(-k,-k)}^{1/2} 1_{p-1} 1_{p-1}^T P_{(-k,-k)}^{1/2}$$

with an inverse

$$\left(R_{(-k,-k)}\right)^{-1} = \frac{P_{(-k,-k)}^{-1}}{1-\rho} - \frac{\rho}{1-\rho}\, \frac{P_{(-k,-k)}^{-1/2} 1_{p-1} 1_{p-1}^T P_{(-k,-k)}^{-1/2}}{\rho(p-1) + 1 - \rho}. \tag{48}$$

Note that $r_{k(-k)} = \rho\, P_{kk}^{1/2} P_{(-k,-k)}^{1/2} 1_{p-1}$. Therefore,

$$\left(R_{(-k,-k)}\right)^{-1} r_{k(-k)} = \frac{\rho}{1-\rho} P_{kk}^{1/2} P_{(-k,-k)}^{-1/2} 1_{p-1} - \frac{\rho}{1-\rho}\, \frac{\rho(p-1)}{\rho(p-1) + 1 - \rho} P_{kk}^{1/2} P_{(-k,-k)}^{-1/2} 1_{p-1} = \frac{\rho}{\rho(p-1) + 1 - \rho} P_{kk}^{1/2} P_{(-k,-k)}^{-1/2} 1_{p-1} \tag{49}$$

and

$$e_k = \left\|\left(R_{(-k,-k)}\right)^{-1} r_{k(-k)}\right\|^2 = \left(\frac{\rho}{\rho(p-1) + 1 - \rho}\right)^2 P_{kk}\, 1_{p-1}^T P_{(-k,-k)}^{-1} 1_{p-1} = \left(\frac{\rho}{\rho(p-1) + 1 - \rho}\right)^2 \sum_{i \ne k} \frac{c_k}{c_i} \tag{50}$$

from which it follows that $e_k \to 0$ if $\sum_{i \ne k} \frac{c_k}{c_i} = O(p^f)$, for some $f < 2$. If we want $e_k \to 0$ faster than $O(p^{-1/2})$ and $O(p^{-2/3})$, we should have $f < \frac{3}{2}$ and $f < \frac{4}{3}$, respectively. This completes the proof.

²In order for $R_t$ to be positive definite, we have to restrict $-\frac{1}{p-1} < \rho < 1$.

In the equal power case, $\sum_{i \ne k} \frac{c_k}{c_i} = p - 1$ and $\Sigma_k \approx (1-\rho)\tilde c$, which implies that the first (second, third) moment of $\mathrm{SINR}_k$ for an equicorrelation channel is roughly only $1-\rho$ ($(1-\rho)^2$, $(1-\rho)^3$) of the moment without correlations.

To conclude this section, we mention that our asymptotic results also apply to the real channels (e.g., in [3], [8]) with a modification that involves multiplying the variance by a factor of 2 and the third moment by a factor of 4.

V. ASYMPTOTIC NORMALITY OF $T_k$

This section proves that, if appropriately normalized, $T_k$ converges in distribution to a Normal random variable, both in the case of uncorrelated channels and in the case of correlated channels with $e_k \to 0$ faster than $O(p^{-1/2})$. The latter includes the equicorrelation channel with regularity conditions as in Lemma 8. The difficulty in dealing with channels with arbitrary correlations is due to the term

$$\sum_{i=1}^{p-1} \frac{\frac{1}{m}\|z_i\|^2}{1 + \frac{1}{m} d_i^2} = \sum_{i=1}^{p-1} \frac{\frac{1}{m} d_i^2}{1 + \frac{1}{m} d_i^2} \left\|\left(V^\dagger (R_{(-k,-k)})^{-1} r_{k(-k)}\right)_i\right\|^2. \tag{51}$$

Although (51) is bounded by $e_k = \|(R_{(-k,-k)})^{-1} r_{k(-k)}\|^2$, it is challenging to show that (51) converges (not necessarily to zero). For the remainder of this section, we assume that $e_k \to 0$ faster than $O(p^{-1/2})$, which makes the contribution of the term (51) asymptotically negligible.

Lemma 9:

$$\frac{m T_k - \Sigma_k \mathrm{tr}(\Lambda)}{\sqrt{p-1}} \overset{D}{\Longrightarrow} N\left(0, \Sigma_k^2 \sigma^2_{\gamma,c}\right) \tag{52}$$

where "$\overset{D}{\Longrightarrow}$" means "converge in distribution."

Proof: See Appendix IV.

Corollary 1:

$$\frac{m T_k - (p-1)\Sigma_k \mu_{\gamma,c}}{\sqrt{p-1}} \overset{D}{\Longrightarrow} N\left(0, \Sigma_k^2 \sigma^2_{\gamma,c}\right). \tag{53}$$

Proof:

$$\frac{m T_k - (p-1)\Sigma_k \mu_{\gamma,c}}{\sqrt{p-1}} = \frac{m T_k - \Sigma_k \mathrm{tr}(\Lambda)}{\sqrt{p-1}} + \frac{\Sigma_k \mathrm{tr}(\Lambda) - (p-1)\Sigma_k \mu_{\gamma,c}}{\sqrt{p-1}}.$$

We have shown that

$$\frac{m T_k - \Sigma_k \mathrm{tr}(\Lambda)}{\sqrt{p-1}} \overset{D}{\Longrightarrow} N\left(0, \Sigma_k^2 \sigma^2_{\gamma,c}\right).$$

The fact that

$$\frac{\Sigma_k \mathrm{tr}(\Lambda) - (p-1)\Sigma_k \mu_{\gamma,c}}{\sqrt{p-1}} \xrightarrow{P} 0$$

follows from Theorem 1.1 in Bai and Silverstein [22], which implies that $(p-1)\left(\frac{\mathrm{tr}(\Lambda)}{p-1} - \mu_{\gamma,c}\right)$ is stochastically bounded. Equation (53) now follows from the Converging Together Lemma (or Slutsky's Theorem) [26].

Corollary 2:

$$\frac{\mathrm{SINR}_k - \frac{m-p+1}{m}\Sigma_k - \frac{p-1}{m}\Sigma_k \mu_{\gamma,c}}{\sqrt{\frac{m-p+1}{m^2}\Sigma_k^2 + \frac{p-1}{m^2}\Sigma_k^2 \sigma^2_{\gamma,c}}} \overset{D}{\Longrightarrow} N(0, 1). \tag{54}$$

Proof: This is a direct consequence of the independence of $S_k$ and $T_k$.

VI. DISTRIBUTION APPROXIMATIONS

Once the asymptotic Normality of $T_k$ (and $\mathrm{SINR}_k$) is proved, it is possible to rigorously define the "approximating" distributions. Since $S_k$ and $T_k$ are independent, we do not need to approximate the distribution of $S_k$. However, we keep the option of approximating the distribution of $S_k + T_k$ itself. Our approach is to match the first two asymptotic moments of $T_k$ with the corresponding moments of a target distribution (e.g., Gamma). The asymptotic Normality of $T_k$ (Lemma 9) ensures that the approximate distribution converges to the true limit.

The Normal approximation is not accurate for small dimensions, and to an extent this is because $T_k$ is positive while a Normal random variable has zero third central moment. We expect a Gamma distribution, which has nonnegative support and nonzero third central moment, to be a better approximation. It turns out that considering only the first two moments may not be sufficient for accurately computing error probabilities. Therefore, we consider a generalized Gamma distribution, which matches the first three moments and hence is likely to produce better results, as another approximation.
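To illustrate why the shape of the approximating distribution matters for error probabilities, the sketch below compares the single Q-function (3) with a Monte Carlo evaluation of (6) in which $F_{\mathrm{SINR}}$ is replaced by a Gamma law matched to a given mean and variance. It assumes NumPy; the function names and the numerical inputs are hypothetical, and the Monte Carlo integration is simply one convenient way to evaluate (6).

```python
import math
import numpy as np

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ber_limit(mean_sinr):
    """Single Q-function approximation (3)."""
    return q_function(math.sqrt(mean_sinr))

def ber_gamma(mean_sinr, var_sinr, n=200000, seed=0):
    """Monte Carlo estimate of (6) with F_SINR taken as a moment-matched Gamma law."""
    alpha = mean_sinr**2 / var_sinr        # shape parameter
    beta = var_sinr / mean_sinr            # scale parameter
    x = np.random.default_rng(seed).gamma(alpha, beta, size=n)
    q = 0.5 * np.vectorize(math.erfc)(np.sqrt(x / 2.0))   # Q(sqrt(x)) per sample
    return float(q.mean())

# Illustrative inputs only: mean SINR of 10 with variance 8.
ber3 = ber_limit(10.0)
ber6 = ber_gamma(10.0, 8.0)
```

Because $Q(\sqrt{x})$ is convex in $x$, averaging it over any nondegenerate distribution gives a value above $Q$ evaluated at the mean, which is consistent with the earlier remark that (3) tends to underestimate the true BER.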


In this section, the same conditions are assumed as in proving the asymptotic Normality of Tk . That is, ek ! 0 faster than O(p0 ). When this condition is not satisfied, we do not have a rigorous asymptotic result. In this section, the symbol  is also used to denote the approximate distributions. A. Normal Approximation Asymptotic Normality of Tk implies that

2 Tk  N p 0 1 6k  ;c ; p 021 6k2  ;c : m m

(55) Fig. 3. RST and RSSINR for selected SNR levels, over the whole range of , for equal power uncorrelated channels.

Also, the asymptotic Normality of SINRk means that

m 0 p + 16 + p 0 16  ; SINRk  N k m m k ;c m 0 p + 1 62 + p 0 1 62  2 : k m2 m2 k ;c

(56)

B. Gamma Approximation

$T_k$ can be approximated by a Gamma random variable $G(\alpha_T, \beta_T)$ whose parameters are determined by solving

$$E(T_k) = \alpha_T\beta_T = \frac{p-1}{m}\,\Sigma_k\,\mu_{\gamma,c}, \qquad \mathrm{Var}(T_k) = \alpha_T\beta_T^2 = \frac{p-1}{m^2}\,\Sigma_k^2\,\sigma^2_{\gamma,c}. \tag{57}$$

The Gamma approximation of $T_k$ is therefore

$$T_k \sim G\left((p-1)\,\frac{\mu^2_{\gamma,c}}{\sigma^2_{\gamma,c}},\ \frac{1}{m}\,\Sigma_k\,\frac{\sigma^2_{\gamma,c}}{\mu_{\gamma,c}}\right). \tag{58}$$

According to the Gamma approximation, the third central moment of $T_k$ should be

$$2\alpha_T\beta_T^3 = 2\,\frac{p-1}{m^3}\,\Sigma_k^3\,\frac{\sigma^4_{\gamma,c}}{\mu_{\gamma,c}}.$$

We can also approximate $\mathrm{SINR}_k$ by a Gamma distribution $G(\alpha, \beta)$, again by matching the first two moments

$$\mathrm{SINR}_k \sim G\left(\frac{\left(m-p+1+(p-1)\mu_{\gamma,c}\right)^2}{m-p+1+(p-1)\sigma^2_{\gamma,c}},\ \frac{1}{m}\,\Sigma_k\,\frac{m-p+1+(p-1)\sigma^2_{\gamma,c}}{m-p+1+(p-1)\mu_{\gamma,c}}\right). \tag{59}$$

The third central moment of $\mathrm{SINR}_k$ then would be

$$2\alpha\beta^3 = 2\,\frac{\Sigma_k^3}{m^3}\,\frac{\left(m-p+1+(p-1)\sigma^2_{\gamma,c}\right)^2}{m-p+1+(p-1)\mu_{\gamma,c}}. \tag{60}$$

Define $RS$ to be the ratio of the third central moment of the approximated Gamma distribution to the asymptotic third central moment. For $T_k$ this becomes

$$RS_T = \frac{\sigma^4_{\gamma,c}}{\mu_{\gamma,c}\,\eta_{\gamma,c}}. \tag{61}$$

For $\mathrm{SINR}_k$,

$$RS_{SINR} = \frac{\left(1-\frac{p-1}{m}+\frac{p-1}{m}\sigma^2_{\gamma,c}\right)^2}{\left(1-\frac{p-1}{m}+\frac{p-1}{m}\mu_{\gamma,c}\right)\left(1-\frac{p-1}{m}+\frac{p-1}{m}\eta_{\gamma,c}\right)}. \tag{62}$$

Fig. 3 plots some examples of $RS_T$ and $RS_{SINR}$, for the equal power uncorrelated cases.

$RS_T$ and $RS_{SINR}$ are indicators of how well the Gamma approximation captures the skewness of the distribution of $T_k$. Ideally, we would like $RS = 1$. It will be shown later that $RS$ is also critical for the generalized Gamma approximation. The following inequalities hold for uncorrelated channels with equal powers.

Lemma 10: Assuming equal powers and no correlations

$$RS_T \le 1 \quad \text{for any } p \le m \tag{63}$$
$$RS_{SINR} \le 1 \quad \text{for any } p \le m \tag{64}$$
$$RS_T \le RS_{SINR} \quad \text{in the limit.} \tag{65}$$

Proof: See Appendix V.

C. Generalized Gamma Approximation

A generalized Gamma distribution can be described by a stable law or an infinitely divisible distribution [27], [28], [26, Chs. 2.7, 2.8], which involves the sum of an i.i.d. sequence of random variables. In our case, conditionally, $T_k$ is a weighted sum of non-i.i.d. random variables, hence the distribution of $T_k$ is neither exactly a stable law nor infinitely divisible.

The Gamma approximation of $T_k$ can be generalized by introducing an additional parameter to the original Gamma distribution [27], [28], i.e., assuming $T_k \sim G(\alpha_T, \beta_T, \xi_T)$. The regular Gamma distribution is a special case with $\xi_T = 1$. Assuming a generalized Gamma distribution, the first three moments of $T_k$ would be

$$E(T_k) = \alpha_T\beta_T, \qquad \mathrm{Var}(T_k) = \alpha_T\beta_T^2, \qquad \mathrm{Sk}(T_k) = (\xi_T+1)\,\alpha_T\beta_T^3. \tag{66}$$

Equating these moments with the asymptotic moments of $T_k$ will lead to the same $\alpha_T$ and $\beta_T$ as for the Gamma approximation. The third parameter $\xi_T$ will be

$$\xi_T = \frac{2}{RS_T} - 1. \tag{67}$$

Similarly, we can also generalize the Gamma approximation of $\mathrm{SINR}_k$ by assuming $\mathrm{SINR}_k \sim G(\alpha, \beta, \xi)$. The third parameter will be

$$\xi = \frac{2}{RS_{SINR}} - 1. \tag{68}$$
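The moment matching described in this section can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes the Gamma parametrization with mean $\alpha\beta$, variance $\alpha\beta^2$, and third central moment $2\alpha\beta^3$, and the generalized Gamma third central moment $(\xi+1)\alpha\beta^3$. The function names are hypothetical, and the target moments in the usage lines are made-up numbers.

```python
def match_gamma(mean, var):
    """Return (alpha, beta) such that alpha*beta = mean and alpha*beta**2 = var."""
    if mean <= 0 or var <= 0:
        raise ValueError("mean and variance must be positive")
    beta = var / mean
    alpha = mean / beta  # equals mean**2 / var
    return alpha, beta


def skewness_ratio(mean, var, third):
    """RS: third central moment implied by the Gamma fit over the target one."""
    alpha, beta = match_gamma(mean, var)
    return 2.0 * alpha * beta**3 / third


def match_generalized_gamma(mean, var, third):
    """Return (alpha, beta, xi); xi = 2/RS - 1 matches the third central moment."""
    alpha, beta = match_gamma(mean, var)
    xi = 2.0 / skewness_ratio(mean, var, third) - 1.0
    return alpha, beta, xi


# Hypothetical target moments: mean 1.0, variance 0.5, third central moment 1.0.
alpha, beta, xi = match_generalized_gamma(1.0, 0.5, 1.0)
print(alpha, beta, xi)
# The fitted third central moment (xi + 1) * alpha * beta**3 reproduces the target.
print((xi + 1.0) * alpha * beta**3)
```

When the target third central moment already equals $2\alpha\beta^3$, the ratio is 1 and the fit reduces to the regular Gamma case $\xi = 1$.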


When $\xi > 1$, the generalized Gamma distribution with these parameters does not have an explicit density in general. However, it can be described in terms of the stable law and has a closed-form MGF [27], [28]

$$\mathrm{MGF}\left(s;\ G(\alpha,\beta,\xi)\right) = \exp\left(\frac{\alpha}{\xi-1}\left(1-\left(1-\beta\xi s\right)^{\frac{\xi-1}{\xi}}\right)\right). \tag{69}$$

When $\xi < 1$, the generalized Gamma distribution is a compound Poisson distribution with an MGF

$$\mathrm{MGF}\left(s;\ G(\alpha,\beta,\xi)\right) = \exp\left(\frac{\alpha}{1-\xi}\left(\left(\frac{1}{1-\beta\xi s}\right)^{\frac{1-\xi}{\xi}}-1\right)\right). \tag{70}$$

From Lemma 10, it follows that $\xi_T > 1$ and $\xi > 1$ for uncorrelated equal power channels.
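The closed-form MGF lends itself to a quick numerical sanity check. The sketch below assumes the MGF has the form $\exp\left(\frac{\alpha}{\xi-1}\left(1-(1-\beta\xi s)^{(\xi-1)/\xi}\right)\right)$, treats $\xi = 1$ as the regular Gamma limit $(1-\beta s)^{-\alpha}$, and recovers the first three cumulants by finite differences of the log-MGF. It also evaluates the single-integral form $\frac{1}{\pi}\int_0^{\pi/2}\mathrm{MGF}\left(-\frac{1}{2\sin^2\theta}\right)d\theta$ that Section VII uses for the BER. All function names, step sizes, and parameter values are illustrative assumptions.

```python
import math


def gen_gamma_mgf(s, alpha, beta, xi):
    """MGF of G(alpha, beta, xi) under the assumed closed form; needs s < 1/(beta*xi)."""
    base = 1.0 - beta * xi * s
    if base <= 0.0:
        raise ValueError("MGF is only defined for s < 1/(beta*xi)")
    if abs(xi - 1.0) < 1e-12:
        return base ** (-alpha)  # regular Gamma limit
    return math.exp(alpha / (xi - 1.0) * (1.0 - base ** ((xi - 1.0) / xi)))


def cumulants_fd(alpha, beta, xi, h=1e-2):
    """First three cumulants from central finite differences of log MGF at s = 0."""
    def K(s):
        return math.log(gen_gamma_mgf(s, alpha, beta, xi))
    k1 = (K(h) - K(-h)) / (2.0 * h)
    k2 = (K(h) - 2.0 * K(0.0) + K(-h)) / h**2
    k3 = (K(2 * h) - 2.0 * K(h) + 2.0 * K(-h) - K(-2 * h)) / (2.0 * h**3)
    return k1, k2, k3


def ber_from_mgf(alpha, beta, xi, n=2000):
    """(1/pi) * integral over (0, pi/2) of MGF(-1/(2 sin^2 theta)), midpoint rule."""
    step = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * step
        total += gen_gamma_mgf(-1.0 / (2.0 * math.sin(theta) ** 2), alpha, beta, xi)
    return total * step / math.pi


# Illustrative parameters: expect cumulants close to
# alpha*beta, alpha*beta**2 and (xi + 1)*alpha*beta**3.
print(cumulants_fd(2.0, 0.5, 3.0))
print(ber_from_mgf(2.0, 0.5, 3.0))
```

Under this form, the expressions for $\xi > 1$ and $\xi < 1$ are algebraically the same function of $s$, which is why a single branch covers both cases in the sketch.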

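VII. ANALYSIS OF THE PROBABILITY OF ERROR

Computation of the probability of errors using the distribution of $\mathrm{SINR}_k$ is a way of measuring how successfully the proposed distribution approximates the truth. For this purpose, we will compute the BER using (6), which is equivalent to the BFSK BER. It should be straightforward to apply our methods to other types of (nonbinary) constellations. To simplify the notation, the subscript $k$ in $\mathrm{BER}_k$ will be dropped for the rest of the correspondence. In this section, we will provide BER formulas under the Gamma and the generalized Gamma approximations, and compare these results with the exact formula for BER, denoted by $\mathrm{BER}_e$, given in [4, eq. (6.47)] or [7].

If the asymptotic Normality of $\mathrm{SINR}_k$ holds, as in Corollary 2, then

$$\mathrm{BER}_a = \int_0^\infty \int_{\sqrt{x}}^\infty \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt\,dF_{SINR}(x) \to \mathrm{BER}_\infty \triangleq Q\left(\sqrt{E(\mathrm{SINR}_k)_\infty}\right). \tag{71}$$

For uncorrelated channels with equal powers

$$E(\mathrm{SINR}_k)_\infty = \frac{\sqrt{c^2(1-\gamma)^2 + 2c(1+\gamma) + 1} + c(1-\gamma) - 1}{2} \tag{72}$$

by treating $\frac{p-1}{m} = \frac{p}{m} = \gamma$. However, for correlated channels, since it is not convenient to compute $E(\mathrm{SINR}_k)_\infty$, we will use our finite-dimensional moment formula to compute $E(\mathrm{SINR}_k)_\infty$, which is different for different $p$ and $m$. Next, we will derive a variety of BER formulas corresponding to various approximation schemes, based on BFSK, which can be easily generalized to other types of constellations.

A. BER by Gamma Approximation on $\mathrm{SINR}_k$

Denote the BER computed using the Gamma approximation by $\mathrm{BER}_g$. Integration of (6) by parts yields

$$\mathrm{BER} = \int_0^\infty \frac{1}{\sqrt{2\pi}}\,e^{-x/2}\,F_{SINR}(x)\,\frac{1}{2}\,x^{-1/2}\,dx. \tag{73}$$

Replacing $F_{SINR}(x)$ with $F_{\alpha,\beta}(x)$, the CDF of the Gamma distribution $G(\alpha, \beta)$ as defined in (59), and using the results from integral tables [29], we have

$$\begin{aligned}
\mathrm{BER}_g &= \int_0^\infty \frac{1}{\sqrt{2\pi}}\,e^{-x/2}\,F_{\alpha,\beta}(x)\,\frac{1}{2}\,x^{-1/2}\,dx \\
&= \frac{1}{2\Gamma(\alpha)\sqrt{2\pi}} \int_0^\infty x^{-1/2}\,e^{-x/2}\,\gamma\left(\alpha, \frac{x}{\beta}\right)dx \\
&= \frac{1}{2\Gamma(\alpha)\sqrt{2\pi}}\,\frac{\Gamma(1/2+\alpha)\,(1/\beta)^{\alpha}}{\alpha\,(1/2+1/\beta)^{1/2+\alpha}}\ {}_2F_1\left(1,\ 1/2+\alpha;\ \alpha+1;\ \frac{1/\beta}{1/2+1/\beta}\right)
\end{aligned} \tag{74}$$

where the gamma function $\Gamma(y) = \int_0^\infty t^{y-1}e^{-t}\,dt$, the incomplete gamma function $\gamma(\alpha, y) = \int_0^y t^{\alpha-1}e^{-t}\,dt$, and ${}_2F_1(\cdot)$ is the hypergeometric function

$${}_2F_1\left(1,\ 1/2+\alpha;\ \alpha+1;\ \frac{1/\beta}{1/2+1/\beta}\right) = \sum_{n=0}^{\infty} \frac{\Gamma(1/2+\alpha+n)}{\Gamma(1/2+\alpha)}\,\frac{\Gamma(\alpha+1)}{\Gamma(\alpha+1+n)}\left(\frac{1/\beta}{1/2+1/\beta}\right)^{n}. \tag{75}$$

B. BER by Generalized Gamma Approximation on $\mathrm{SINR}_k$

According to the results of [30], [31, Ch. 9.2.3], under the generalized Gamma approximation on $\mathrm{SINR}_k$, the BFSK BER (denoted as $\mathrm{BER}_{gg}$) can be expressed in terms of the MGF as

$$\mathrm{BER}_{gg} = \frac{1}{\pi}\int_0^{\pi/2} \mathrm{MGF}\left(-\frac{1}{2\sin^2\theta};\ G(\alpha,\beta,\xi)\right)d\theta = \frac{1}{\pi}\int_0^{\pi/2} \exp\left(\frac{\alpha}{\xi-1}\left(1-\left(1+\frac{\beta\xi}{2\sin^2\theta}\right)^{\frac{\xi-1}{\xi}}\right)\right)d\theta \tag{76}$$

which is for $\xi > 1$ and can be evaluated numerically. We can similarly write down $\mathrm{BER}_{gg}$ for $\xi < 1$.

C. BER by Generalized Gamma Approximation on $T_k$

Under this approximation, when $\xi_T > 1$, we can write down the MGF for $\mathrm{SINR}_k = S_k + T_k$ as

$$\mathrm{MGF}\left(s;\ G\left(m-p+1, \frac{\Sigma_k}{m}\right) + G(\alpha_T,\beta_T,\xi_T)\right) = \left(\frac{1}{1-\frac{\Sigma_k}{m}s}\right)^{m-p+1} \times \exp\left(\frac{\alpha_T}{\xi_T-1}\left(1-\left(1-\beta_T\xi_T s\right)^{\frac{\xi_T-1}{\xi_T}}\right)\right). \tag{77}$$

The corresponding BER (denoted as $\mathrm{BER}_{g+gg}$) would be

$$\mathrm{BER}_{g+gg} = \frac{1}{\pi}\int_0^{\pi/2} \left(\frac{1}{1+\frac{\Sigma_k}{2m\sin^2\theta}}\right)^{m-p+1} \times \exp\left(\frac{\alpha_T}{\xi_T-1}\left(1-\left(1+\frac{\beta_T\xi_T}{2\sin^2\theta}\right)^{\frac{\xi_T-1}{\xi_T}}\right)\right)d\theta \tag{78}$$

which has to be evaluated numerically.

VIII. SIMULATIONS

Our simulations consider $m = 4$ ($p = 2, 4$), and $m = 16$ ($p = 8, 16$), for the equal power case. Both uncorrelated and correlated (with equicorrelation $\rho = 0.5$) channels are tested. The range of SNRs is $c$ = 0 dB–30 dB. Without loss of generality, the first stream (i.e., $k = 1$) is always assumed. $H_W$ is sampled $10^6$ times for every combination of ($m$, $p$, $c$, and $\rho$) for computing the empirical moments, distributions, and BER, except when computing the exact $\mathrm{BER}_e$ for $m = 16$, where $H_W$ is only sampled $10^5$ times.

A. Moments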
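A minimal sketch of how empirical moments of this kind might be estimated by Monte Carlo is given below. It is not the authors' simulation code. It assumes the uncorrelated, equal-power case with $H = \sqrt{c/m}\,H_W$ and $c$ on a linear scale, and it uses the standard identity $\mathrm{SINR}_k = 1/\left[(I + H^H H)^{-1}\right]_{kk} - 1$ for the MMSE receiver with identity noise covariance. The function names, the seed, and the small sample size are hypothetical choices to keep the example fast.

```python
import math
import random


def mmse_sinr(H, k):
    """SINR of stream k via 1 / [(I + H^H H)^{-1}]_{kk} - 1 (H given as m rows of p entries)."""
    m, p = len(H), len(H[0])
    # A = I_p + H^H H, a p x p Hermitian positive definite matrix.
    A = [[(1.0 if i == j else 0.0)
          + sum(H[r][i].conjugate() * H[r][j] for r in range(m))
          for j in range(p)] for i in range(p)]
    # Solve A x = e_k by Gaussian elimination with partial pivoting; x[k] = [A^{-1}]_{kk}.
    b = [1.0 if i == k else 0.0 for i in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for j in range(col, p):
                A[r][j] -= f * A[col][j]
            b[r] -= f * b[col]
    x = [0j] * p
    for i in range(p - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, p))) / A[i][i]
    return 1.0 / x[k].real - 1.0


def simulate_sinr(m, p, c, n, k=0, seed=0):
    """Draw n samples of SINR_k with i.i.d. standard complex Normal H_W (assumed setup)."""
    rng = random.Random(seed)
    scale = math.sqrt(c / m)   # equal powers, no correlation
    s = math.sqrt(0.5)         # real and imaginary parts each have variance 1/2
    samples = []
    for _ in range(n):
        H = [[scale * complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
              for _ in range(p)] for _ in range(m)]
        samples.append(mmse_sinr(H, k))
    return samples


def central_moments(xs):
    """Empirical mean, variance, and third central moment."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    third = sum((x - mean) ** 3 for x in xs) / n
    return mean, var, third


samples = simulate_sinr(m=4, p=2, c=10.0, n=2000)
print(central_moments(samples))
```

The paper's experiments use far more samples per configuration; the sample size here is deliberately small, so the printed moments are only rough estimates.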

The theoretical moments are computed using (45), (46), and (47). Fig. 4 plots the first three moments of $T_k$ computed both theoretically and empirically from simulations. For $m = 16$, the theoretical moments match the simulations very well, especially the first moment. When $m = 4$, the curves for the first moment are still quite accurate except for the correlated cases at very small SNRs. Note that due to the log scale, the errors at small SNRs are greatly exaggerated in the figure. When $m = 4$, for the second and third moments, there seem to be some “significant” discrepancies between theoretical results and simulations at large SNRs. However, these errors contribute negligibly to the second and third moments of $\mathrm{SINR}_k$ (see Fig. 5). For example, when $m = 4$, $p = 4$, $\rho = 0$, the exact $\mathrm{Sk}(S_k) = 314.98^3$. Although the empirical $\mathrm{Sk}(T_k)$ ($= 16.40^3$) differs quite significantly from the theoretical $\mathrm{Sk}(T_k)$ ($= 5.33^3$), the theoretical $\mathrm{Sk}(\mathrm{SINR})$ ($= 314.98^3$) is almost identical to the empirical $\mathrm{Sk}(\mathrm{SINR})$ ($= 314.99^3$). The theoretical and empirical moments of $\mathrm{SINR}_k$ are compared in Fig. 5 for $m = 4$. As expected, the curves match almost perfectly except for the observable (due to the log scale) errors at very small SNRs.

Fig. 4. Theoretical and empirical moments of $T_k$. The vertical axes are in the $\log_{10}$ scale. Note that the variances and third moments are in terms of their square roots and cubic roots, respectively.

Fig. 5. Theoretical and empirical moments of SINR.

B. Distributions

Fig. 6 presents the quantile-quantile (qq) plots for distributions of $T_k$ based on Gamma and Normal approximations against the empirical distribution, at a selected SNR = 10 dB. The figure shows that the Normal approximation works poorly for small $m$ or $p$. The Gamma

approximation fits much better in all cases. Fig. 7 gives the same type of plots for $\mathrm{SINR}_k$. It shows the Gamma approximation works well, especially for $\frac{p}{m} = \frac{1}{2}$. When $\frac{p}{m} = 1$, the Gamma distribution approximates the portion between the 1%–99% quantiles pretty well. We have shown that $RS_T \le RS_{SINR}$ for uncorrelated channels with equal powers in Lemma 10, which implies that a Gamma could approximate the distribution of $\mathrm{SINR}_k$ better than that of $T_k$. Also, Fig. 3 shows $RS_{SINR}$ is much smaller at $\frac{p}{m} = 1$ than at $\frac{p}{m} = \frac{1}{2}$, which helps explain why the Gamma approximation works well at $\frac{p}{m} = \frac{1}{2}$.

Fig. 6. Quantile-quantile plots for $T_k$. The triangles on the curves indicate the 1% and 99% quantiles. The range of quantiles is 0.1%–99.9%.

Fig. 7. Quantile-quantile plots for SINR. The triangles indicate the 1% and 99% quantiles. The range of quantiles is 0.1%–99.9%.

C. Error Performance

Fig. 8 plots BERs versus SNRs for uncorrelated channels. The figure shows that the BER curves produced by the Gamma approximation are almost indistinguishable from the simulated curves for $\frac{p}{m} = \frac{1}{2}$. When $\frac{p}{m} = 1$, the Gamma approximation still works well for moderate SNRs (e.g., SNR < 15 dB). At larger SNRs, the Gamma approximation slightly overestimates BER. The figure also shows that the generalized Gamma distribution produces almost perfect fits to the simulated BER curves. All figures indicate that using the asymptotic BER ($\mathrm{BER}_\infty$) formula will seriously underestimate the error probabilities at large SNRs.³

Fig. 9 presents the BER results for the correlated channels with equicorrelation $\rho = 0.5$. We can see similar trends as for the uncorrelated cases, i.e., the Gamma approximation works well for $\frac{p}{m} = \frac{1}{2}$ and the generalized Gamma approximation performs remarkably well. Finally, to see the difference between correlated and uncorrelated cases more closely, Fig. 10 plots the interesting portions of the BER curves for both cases, which illustrates that the differences could be significant.

Fig. 8. BER curves for uncorrelated channels. $\gamma = \frac{1}{2}$ and $\gamma = 1$ are used for computing $\mathrm{BER}_\infty$ in both (a) and (b). $\mathrm{BER}_e$ stands for the exact BER. $\mathrm{BER}_g$ and $\mathrm{BER}_{gg}$ are computed by assuming SINR is a Gamma and a generalized Gamma, respectively. $\mathrm{BER}_{g+gg}$ uses the exact Gamma distribution for $S_k$ and approximates $T_k$ by a generalized Gamma. $\mathrm{BER}_\infty$ is the asymptotic BER.

Fig. 10. Comparison of the uncorrelated and correlated BER curves. $m = 16$, $p = 16$.

IX. CONCLUSION

This study characterized the distribution of SINR for the MMSE receiver in MIMO systems, for channels with nonrandom transmit correlations and unequal powers. The work started with a key observation that SINR can be decomposed into two independent components: $\mathrm{SINR} = \mathrm{SINR}^{ZF} + T$, where $\mathrm{SINR}^{ZF}$ has an exact Gamma distribution. For uncorrelated channels as well as the correlated channels under certain conditions, $T$ is proved to converge in distribution to a Normal and can be well approximated by a Gamma or a generalized Gamma. Our BER analysis suggested that these approximate distributions can be used to accurately estimate the error probabilities even for very small dimensions.

APPENDIX I
PROOF OF LEMMAS 3, 4, 5

Restate Lemma 3 as follows:

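$$E\left(\frac{p\,T_k}{p-1}\right) \to c_k\,\mu_{\gamma,c} \triangleq c_k M_J(-1). \tag{79}$$

Proof:

$$\begin{aligned}
E\left(\frac{p\,T_k}{p-1}\right) &= E\left(\frac{1}{p-1}\,E\left(pT_k \mid H_{(-k)}\right)\right) \\
&= E\left(\frac{p}{p-1}\,\frac{1}{m}\sum_{i=1}^{p-1} E\left(\|t_{k,i}\|^2\right)\tilde\lambda_i\right) \\
&= \Sigma_k\,\frac{p}{m}\,E\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right) \to \Sigma_k\,\frac{p}{m}\,E\left(M_J(-1)\right) = c_k M_J(-1)
\end{aligned}$$

by the bounded convergence theorem [26, Sec. 1.3b], because $\frac{\mathrm{tr}(\tilde\Lambda)}{p-1} \le 1$.

Fig. 9. BER curves for equicorrelation $\rho = 0.5$. Note that the $\mathrm{BER}_\infty$'s are different because we did not simulate the true limit of the mean.

³Note that, in order to produce comparable BER results with the classical references (e.g., [2], [4], [8]), we actually used real channels (only for the BER curves).

Restate Lemma 4 as follows:

$$\mathrm{Var}\left(\frac{p}{\sqrt{p-1}}\,T_k\right) \to c_k^2\,\sigma^2_{\gamma,c} \triangleq c_k^2 M_J'(-1). \tag{80}$$

Proof:

$$\begin{aligned}
\mathrm{Var}\left(\frac{p}{\sqrt{p-1}}\,T_k\right) &= \frac{p^2}{p-1}\,E\left(\mathrm{Var}\left(\frac{1}{m}\sum_{i=1}^{p-1}\|t_{k,i}\|^2\tilde\lambda_i \,\Big|\, H_{(-k)}\right)\right) + \frac{p^2}{p-1}\,\mathrm{Var}\left(E\left(\frac{1}{m}\sum_{i=1}^{p-1}\|t_{k,i}\|^2\tilde\lambda_i \,\Big|\, H_{(-k)}\right)\right) \\
&= \frac{p^2}{m^2}\,\Sigma_k^2\,E\left(\frac{\mathrm{tr}(\tilde\Lambda^2)}{p-1}\right) + \frac{p^2}{m^2}\,\Sigma_k^2\,(p-1)\,\mathrm{Var}\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right) \\
&\to c_k^2 M_J'(-1)
\end{aligned}$$

because $E\left(\frac{\mathrm{tr}(\tilde\Lambda^2)}{p-1}\right) \to M_J'(-1)$ by the bounded convergence theorem, and $(p-1)\,\mathrm{Var}\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right) \to 0$, which can be proved using the results from the concentration of spectral measures for random matrices [32]. The result we need is Corollary 1.8b in [32], which can be stated as follows:

$$P\left(\left|\frac{\mathrm{tr}(\tilde\Lambda)}{p-1} - E\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right)\right| > x\right) \le 2e^{-\kappa (p-1)^2 x^2} \tag{81}$$

for any $x > 0$. $\kappa$ depends on the spectral radius of $R_{(-k,-k)}$, the logarithmic Sobolev inequality constant, and the Lipschitz constant of $g(x) = f(x^2) = \frac{1}{1+x^2}$. We can easily check that the function $f(x) = \frac{1}{1+x}$ is convex Lipschitz and $g(x)$ has a finite Lipschitz norm. Therefore,

$$\mathrm{Var}\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right) = \int_0^\infty 2x\,P\left(\left|\frac{\mathrm{tr}(\tilde\Lambda)}{p-1} - E\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right)\right| > x\right)dx \le \int_0^\infty 4x\,e^{-\kappa(p-1)^2x^2}\,dx = \frac{2}{\kappa\,(p-1)^2} \tag{82}$$

which implies $(p-1)\,\mathrm{Var}\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right) \to 0$. This completes the proof.

Restate Lemma 5 as follows:

$$\frac{1}{p-1}\,\mathrm{Sk}(pT_k) = \frac{1}{p-1}\,E\left((pT_k - E(pT_k))^3\right) \to 2c_k^3\,\eta_{\gamma,c} \triangleq c_k^3 M_J''(-1). \tag{83}$$

Proof:

$$\begin{aligned}
\frac{1}{p-1}\,E\left((pT_k - E(pT_k))^3\right) &= \frac{1}{p-1}\,E\left(E\left(\left(pT_k - E\left(pT_k \mid H_{(-k)}\right)\right)^3 \,\Big|\, H_{(-k)}\right)\right) \\
&\quad + \frac{1}{p-1}\,E\left(\left(E\left(pT_k \mid H_{(-k)}\right) - E(pT_k)\right)^3\right) \\
&\quad + \frac{3}{p-1}\,\mathrm{Cov}\left(E\left(pT_k \mid H_{(-k)}\right),\ \mathrm{Var}\left(pT_k \mid H_{(-k)}\right)\right).
\end{aligned} \tag{84}$$

We can show that the first term in the right-hand side of (84) converges to $c_k^3 M_J''(-1)$ as follows:

$$\begin{aligned}
\frac{1}{p-1}\,E\left(E\left(\left(pT_k - E\left(pT_k \mid H_{(-k)}\right)\right)^3 \,\Big|\, H_{(-k)}\right)\right) &= \frac{p^3}{m^3}\,\frac{1}{p-1}\,E\left(E\left(\left(\sum_{i=1}^{p-1}\left(\|t_{k,i}\|^2 - \Sigma_k\right)\tilde\lambda_i\right)^3 \,\Big|\, H_{(-k)}\right)\right) \\
&= \frac{p^3}{m^3}\,\frac{1}{p-1}\,E\left(\sum_{i=1}^{p-1} 2\Sigma_k^3\,\tilde\lambda_i^3\right) \\
&= 2\,\frac{p^3}{m^3}\,\Sigma_k^3\,E\left(\frac{\mathrm{tr}(\tilde\Lambda^3)}{p-1}\right) \\
&\to 2\,\frac{p^3}{m^3}\,\Sigma_k^3\,\frac{1}{2}\,M_J''(-1) = c_k^3 M_J''(-1).
\end{aligned} \tag{85}$$

We can show the last two terms in the right-hand side of (84) tend to 0. Expand

$$\frac{1}{p-1}\,E\left(\left(E\left(pT_k \mid H_{(-k)}\right) - E(pT_k)\right)^3\right) = \frac{p^3}{m^3}\,\Sigma_k^3\,(p-1)^2\,E\left(\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1} - E\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right)\right)^3\right).$$

Apply the concentration theorem one more time

$$\left|E\left(\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1} - E\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right)\right)^3\right)\right| \le \int_0^\infty 3x^2\,P\left(\left|\frac{\mathrm{tr}(\tilde\Lambda)}{p-1} - E\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right)\right| > x\right)dx \le \int_0^\infty 3x^2\,2e^{-\kappa(p-1)^2x^2}\,dx = \frac{3\sqrt{\pi}}{2\,\kappa^{3/2}\,(p-1)^3} \tag{86}$$

which implies $\frac{1}{p-1}\,E\left(\left(E\left(pT_k \mid H_{(-k)}\right) - E(pT_k)\right)^3\right) \to 0$.

The last term in (84) also tends to zero, again using the concentration theorem and the following inequality:

$$\begin{aligned}
\left|\frac{3}{p-1}\,\mathrm{Cov}\left(E\left(pT_k \mid H_{(-k)}\right),\ \mathrm{Var}\left(pT_k \mid H_{(-k)}\right)\right)\right| &= 3\,\frac{p^3}{m^3}\,\Sigma_k^3\,(p-1)\left|\mathrm{Cov}\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1},\ \frac{\mathrm{tr}(\tilde\Lambda^2)}{p-1}\right)\right| \\
&\le 3\,\frac{p^3}{m^3}\,(p-1)\,\Sigma_k^3\,\sqrt{\mathrm{Var}\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right)\mathrm{Var}\left(\frac{\mathrm{tr}(\tilde\Lambda^2)}{p-1}\right)}.
\end{aligned} \tag{87}$$

We have shown

$$\mathrm{Var}\left(\frac{\mathrm{tr}(\tilde\Lambda)}{p-1}\right) \le \frac{2}{\kappa\,(p-1)^2}.$$

Similar arguments will show that

$$\mathrm{Var}\left(\frac{\mathrm{tr}(\tilde\Lambda^2)}{p-1}\right) \le \frac{2}{\kappa'\,(p-1)^2}$$

for a different constant $\kappa'$. Therefore, the last term in (84) also tends to zero. Combining the results for the three terms of (84) together completes the proof.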


APPENDIX II PROOF OF LEMMA 6

APPENDIX III PROOF OF LEMMA 7

Restate Lemma 6 as follows:

We will frequently use the following three inequalities:

~  ~ 01 = ~ 01  ~

p

p

R;p

R;p

:

4 (1) =

(88)

01 1 kz k2

p

Zk

~p Proof: Expand 

m

=1

i

~ = 12 c m 0pp + 1 + 21 cm 0 21 p 0p 1 (1 0  ) p2

p

p

(89)

k

where

=

p

4 (3) =

Zk

p 2

1 0 m + 2c mp 1 + mp + 1

=1 p01

~

m

p2

2

3

p

cm p2

+ p1 + p 0p 1  0  01 p

:

p

It suffices to show

2 cm + p + p(p 0 1)p 0 p p01  0 ()(cm + p)2 + (p 0 1)p2 p2 + 2(cm + p)(p 0 1)pp  p4 p201

()(cm + p)(p 0 1)p  (p 0 1)(cm + p)2 0 cp(p 0 1)(cm 0 p) ()(cm + p)p  (cm + p)2 0 cp(cm 0 p) ()(cm + p)2  (cm 0 p)2 p

~

p01

0 ~

R;p

= 12

c p

p

+  01 0  p

p

:

p

p

p

p

p

which is true. Hence, we complete the proof.

(94)

k

J

k

U

k

MJ

(01) + e

k

:

(95)

Var ppp0 1 T

1 01 62 + 2kz k2 6 1 2 2 m2 =1 1 + d 01 6 + kz k2 2 + p p0 1 Var m1 1 2 =1 1 + d 2

k

p

= p p0 1 E

i

k

i

m

k

i

p

i

k

m

i

:

(96)

i

The first term in the right-hand side of (96) can be written as

32 ) + p2 1 6 E(Z (2) ) 62 E tr(3 p01 m p01 2 2  mp 2 62 M 00 (01) + pm p 01 1 6 12 e k

k

k

k

J

(97)

k

in the limiting sense. Using the fact that

we can bound the second term in (96) by

p 01  p 0 c ()p2 201  p2 2 + c2 0 2cp ()c2 + 2c2 m 0 2c2 p 0 2cp  c2 0 2cp ()p  0cm + cp + p p

k

Var(x + y)  Var(x) + E(y2 ) + 2 Var(x) E(y2 )

Suffices to show p

k

P

k

 ~ 01 . Here we use“()” for

i

m

We can derive the limiting upper bound for the variance of Tk by following the proof of Lemma 4. We know that

p2 m2

p

which is true. Therefore, we prove  ~p “is equivalent to.” ~p01  ~R;p , note that To show 

i

k

! M (01), the limiting upper bound of E(T ) is k

Therefore,

~ 0 ~ 01 = 12

(93)

01 6 + kz k2 = m1 1 2 =1 1 + d = m1 6 tr(33) + Z (1)  1 6 tr(33) + e :

0 16 E(T ) = p m

p

p

:

k

3) Recall that tr(3 p01 then given by

(p 01 )2 = (p )2 + c2 + 2c2 m 0 2c2 p 0 2cp: p

k

p

Tk H (0k)

(90)

+ pc2 + 2c2 pm2 0 2 cp 0 2pc ; 2

(92)

i

m

=

(91)

k

i

m

i

k

i

p

1 d2

m

i

m

where p01

m

=1 1 + k2 =4 e i

1 + 1 d2 2 2 1 6 kz k 2 8 3  9e 1 2 1+ d

j

E

= 12 c m 0pp + 1 0 21 (1 0  01 )

01 1 kz k2 i

p

(Note that 1+x x  1; (1+2xx)  12 ; (1+6xx)  89 , for any x  0.) We begin with the first moment. Conditional on H (0k)

p

p01

=1

i

(p )2 = (cm + p 0 cp)2 + 4cp2 = (cm + p)2 + c2 p2 + 2cp2 0 2c2 mp: Expand  ~p01

=

i

 kR(0k;0k) )01rk(0k) p01 1 2 kz i k 2 4 (2) = m Z  1e i

m2 c2 2 p

i

p2 p m2

3) + p2 E Z (1) 2 ( 0 1)62 Var tr(3 p01 p01 2 3) E Z (1) 2 + 2 pm 6 Var tr(3 p01 2 2 2 2  mp 2 (p260 1) + p p0 1 e2 + 2 m(pp0 1) 6 k

k

k

k

k

k

where  is the same constant in Lemma 4.

k

2e 

k

(98)


Combining (97), (98), and ignoring the higher order term in (98), we can get a comprehensive upper bound

Var ppp0 1 Tk

Sk(Tk )U = pm031 6k3 MJ00 (01) + 8m=29 6k2 ek + m62 6k2 1 ek + ek3

2 2 = mp 2 6k2 MJ (01) + pm 6k p 01 1 12 ek

U

Combining all the bounds and limits, we can obtain a limiting upper bound for the third moment

0

2 2 + p p0 1 ek2 + 2 m(pp0 1) 6k 2 ek

+ 23 ek 6k m1

(99)

ek

+ 2 m1 6k 2

ek

+ 2 m1 6k 20 :

(104)

which, for convenience, is written as

Var(Tk )U = pm021 6k2 MJ (01) + m1 6k 12 + 2 2 0

ek

APPENDIX IV PROOF OF LEMMA 9

+ ek2 :

Restate Lemma 9 as follows:

(100) W

We can get the limiting upper bound for the third central moment by following the proof of Lemma 5. Recall p3 p

1 3 0 1 Sk(Tk )= p 0 1 E E pTk 0 E(pTk j H (0k) ) j H (0k) + p 01 1 E E(pTk j H (0k) ) 0 E(pTk ) 3 + p 03 1 Cov E pTk j H (0k) ; Var pTk j H (0k)

:

(102)

Expanding the second term of (101) and selectively ignoring some negative terms, we can get a bound

p

1

3

0 1 E E(pTk j H (0k) ) 0 E(pTk ) 3  mp 3 p 01 1 E((tr(33) 0 E(tr(33)))6k + mek )3 

p3 m3

6k3 ( 23 p 01 1

 3

p3

) + p 0 1 ek3 + 3

p3 m2

E

e

jyW

j H (0k)

= E exp

6k2

( 0 1)

 p

jy

=

1+

1

3 0 1 Cov E pTk j H (0k) ; Var pTk j H (0k)  p 03 1 Var E pTk j H (0k) Var Var

ek :

and

p0p 60k1tr(33)

mTk

d

H (0k)

 1. For any fixed y 2

, since p

pTk H (0k)

:

p01

pjy p01 jy p6p01 1 0 i=1 p01 pjy = exp kzi k2 jypp061  1 0 p01 i=1

( )=

We know

Var E

(106)

! 1, we

which allows us to conduct complex series expansions. We can write L2 p

j

p

= 01 :

ppjy0j 1 < 1; jpyjp60k 1i < 1

We can also bound the last term of (101) by

p

j

3) p01 E exp jykt k2 = exp 0jyp6p k0tr(3 i k;i 1 i=1 3) = exp 0jyp6p k0tr(3 1 p01 01 pjy 2 1 0 pp10 1 jy6k i exp kzi k2 jypp061  1 0 p01 i=1 3) = exp 0jyp6p k0tr(3 (107) 1 L 1 (p)L 2 (p): Note that i can assume

2

(105)

Conditional on H (0k)

Expand the first term of the right-hand side of (101) to get

33 ) + p3 1 62 Z (3) 6k3 E tr(3 k k p01 m2 p 0 1 3 3  mp 3 6k3 MJ00 (01) + mp 2 p 01 1 6k2 89 ek :

:

Proof: It suffices to show that the characteristic function of W ,

2 ), where E(ejyW ) ! exp(0 y2 6k2  ;c

E(ejyW ) = E(E(ejyW j H (0k) ));

(101)

p3 m3

2 = mTkp0p60k1tr(33) =D) N 0; 6k2  ;c

exp kzi k2

:

(108)

Then

j

pTk H (0k)

Var Var 4

2  p2 ek2 + 2 pm 6k 2 ek

k log(L2(p))k 

j



pTk H (0k)

= mp 4 6k2 Var 6k tr(332 ) + mZk(2) 2 4  mp 4 6k2 6k2 20 + m2 12 ek + 6k mek 20

:

(103)

p01 i=1 p01 i=1

kzi k2

pjy p01 1 0 jyp6p01

kzi k2 pjypj0i 1 2

 2jyj ppm0 1 Zk(1)  2jyj ppm0 1 ek ! 0

(109)

because of the assumption that $e_k \to 0$ faster than $O(p^{-1/2})$. Therefore $L_2(p) \overset{P}{\to} 1$, hence can be ignored.

p01

01 L1 (p) = 1 0 pp10 1 jy6k i i=1 p01 pp60k 1i = exp 0 log 1 0 jy i=1 p01 1 (jy6k i )n = exp n(p 0 1)n=2 i=1 n=1 1 p01 (jy6  )n k i = exp n ( p 0 1)n=2 n=1 i=1 p01 p01 jy 6k i 0y2 62k 2i = exp exp 1 = 2 (p 0 1) 2(p 0 1) i=1 i=1 1 p01 (jy6  )n k i 2 exp n(p 0 1)n=2 n=3 i=1 3) exp 0y2 62k tr(332 ) L (p): (110) = exp jyp6pk tr(3 3 2(p 0 1) 01 We will show that L3 (p) converges to 1 in probability. Note that in the

APPENDIX V PROOF OF LEMMA 10 

First we can show that RST =   2 To simplify the expressions, let  ;c 2  ;c 0 Q =  ;c 0 O 0 Q, where

O

n=3 i=1 1 p01

4  ;c  ;c 0  ;c

2 0 Q 0 4 =  ;c  ;c

;c 2 O0 Q =  ;c (112)

;c c ~ (1 +

) + 1 0  2 = 0  ;c

;c 2~c  (~ c(1 0 )2 + (1 + ) 0 (1 0 )) = (113) 2  2 2 O = (~c(1 0 ) + (1 + ) 0 (1 0 ))  ;c 2  (~ c(1 + ) + 1) 0  2 2~c  1 = 2~c 2 2 (2 0 (~c(1 0 ) + 1) 0 2~c ) (114)  0 c~(1 0 ) 0 1 c~ 0 ) 0 1 :  ;c Q = =  0 c~(12  3 2~c 3

E

ejyW j H (0k)

E(e

P ! 0;

i.e., L(p; c) ! 1: P

0y2 62k tr(332 ) P ! lim exp p!1 2(p 0 1) y2 2 2 = exp 0 2 6k  ;c

(111)

it suffices to show jE(exp(jyW

E exp

+

y2 2 2 6k  ;c

+ 2

jyW

2 2 0 y 6k2  ;c

) ! exp

2

y

2

62  2

k ;c

)) 0 1j ! 0.

01

y2

2 + 2 6k2  ;c H (0k) 0 1 2 2 E(exp(jyW )jH = E exp y2 6k2  ;c (0k) ) 0 1

= E E exp

2

jyW

= 2~c 12 3

3 0 2 (~ c(1 0 ) + 1)

0 3~c  + c~2 (1 0 ) + c~

(116)

Therefore,

which is the characteristic function of a Normal random variable with 2 . mean 0 and variance 6k2  ;c To show jyW

(115)

4  ;c  ;c 0  ;c

(jyj6k )n  n(p 0 1)n=2 n=3 i=1 1 p 0 1 (jyj6 )n k  n=2 3 ( p 0 1) n=3

After cancellations

c~ Q = 3: 

Plug (114) and (115) into (112),

n(p 0 1)

jy j 6 p 0 1 (p01) = 3 1 0 (pj0yj1)6

) + 1 0  = c~(1 + 2~ ; c 

Then

above derivations, we can expand the complex logarithm and switch the order of summations because of the condition pjpy0j 1 < 1

1 p01 (jy6  )n k i klog(L3 (p))k = n=2

4   ;c  ;c .  1, i.e.,  ;c =  ;c 0 O and  ;c =

2

2 exp 0 y 62 2 0 1 = 0: ! E exp y2 6k2  ;c 2 k ;c by the bounded convergence theorem. This completes the proof.

4 0  ;c  ;c 0  ;c 3 2 () 0  (~c(1 0 ) + 1) 0 3~c  + c~2 (1 0 ) + c~ ()(2 0 3~c )  (~c(1 0 ) + 1)(2 0 c~ ) ()2 (2 0 3~c )2  (2 0 4~c 2 )(2 0 c~ )2 ()2 (2~c )(22 0 4~c )  4~c (2 0 c~ )2 ()2~c2 2  0

which is true. It is easy to check to make sure 2 completes the proof for

RST

0

0 3~c  0. This

4

=   ;c 1 

;c ;c

with equality held when c = 0 or = 0. It is easy to show RSSINR  1 by noting that

2  ;c

 p ;c  ;c   ;c +2  ;c

which implies

RSSINR

1 0 m01 + p01 p ;c  ;c  1 0 p01 + p0p1  m 1 0 p01 + p01  m m ;c m m ;c p01 2 p01 p01 p (1 0 m ) + 2 m (1 0 m )  ;c  ;c + pm01 2  ;c  ;c = (1 0 pm01 )2 + (1 0 pm01 ) pm01 ( ;c +  ;c ) + pm01 2  ;c  ;c 1 with equality held only when RST = 1 holds. 2

The remaining task is to show that $RS_T \le RS_{SINR}$ in the limit. In the limit, we can replace $\frac{p-1}{m}$ by $\gamma$, thus

 RSSINR 2 2 2 4

;c +  ;c () (1 0 )(12 +0

(1) 0+ 2)( (1 ;c0 +) ;c ) + 2 ( ;c  ;c ) 4  ;c   ;c  ;c 4 ()(1 0 )  ;c ;c 0  ;c 2 2   ;c  ;c ( ;c +  ;c ) 0 2 ;c  ;c RST

which is equivalent to c~2 2  0, after pages of tedious algebra. This completes the proof for RST  RSSINR , in the limit. See the technical report [23] for details.

ACKNOWLEDGMENT

The authors wish to thank Persi Diaconis for reading the original draft of this manuscript and Amir Dembo for the helpful advice in the revision. Ping Li thanks Tze Leung Lai and Youngjae Kim for the enjoyable conversations. He also acknowledges Antonia Tulino, Sergio Verdú, and Dongning Guo for the e-mail communications during the preparation of the manuscript.

REFERENCES

[1] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-Time Wireless Communications, 1st ed. New York: Cambridge Univ. Press, 2003.
[2] H. V. Poor and S. Verdú, “Probability of error in MMSE multiuser detection,” IEEE Trans. Inf. Theory, vol. 43, no. 3, pp. 858–871, May 1997.
[3] D. N. C. Tse and O. Zeitouni, “Linear multiuser receivers in random environments,” IEEE Trans. Inf. Theory, vol. 46, no. 1, pp. 171–188, Jan. 2000.
[4] S. Verdú, Multiuser Detection. Cambridge, U.K.: Cambridge Univ. Press, 1998.
[5] U. Madhow and M. L. Honig, “MMSE interference suppression for direct-sequence spread-spectrum CDMA,” IEEE Trans. Commun., vol. 42, no. 12, pp. 3178–3188, Dec. 1994.
[6] S. Verdú and S. Shamai (Shitz), “Spectral efficiency of CDMA with random spreading,” IEEE Trans. Inf. Theory, vol. 45, no. 2, pp. 622–640, Mar. 1999.
[7] A. Hjørungnes and P. S. R. Diniz, “Minimum BER prefilter transform for communications systems with binary signaling and known for MIMO channel,” IEEE Signal Process. Lett., vol. 12, no. 3, pp. 234–237, Mar. 2005.
[8] D. Guo, S. Verdú, and L. K. Rasmussen, “Asymptotic normality of linear multiuser receiver outputs,” IEEE Trans. Inf. Theory, vol. 48, no. 12, pp. 3080–3095, Dec. 2002.
[9] A. M. Tulino and S. Verdú, “Random matrix theory and wireless communications,” Foundations and Trends in Communications and Information Theory, vol. 1, Jun. 2004.
[10] L. Li, A. M. Tulino, and S. Verdú, “Asymptotic eigenvalue moments for linear multiuser detection,” Commun. Inf. and Syst., vol. 1, no. 3, pp. 273–304, 2001.
[11] D. N. C. Tse and S. V. Hanly, “Linear multiuser receivers: Effective interference, effective bandwidth and user capacity,” IEEE Trans. Inf. Theory, vol. 45, no. 2, pp. 641–657, Mar. 1999.
[12] J. Zhang, E. K. P. Chong, and D. N. C. Tse, “Output MAI distribution of linear MMSE multiuser receivers in DS-CDMA systems,” IEEE Trans. Inf. Theory, vol. 47, no. 3, pp. 1128–1144, Mar. 2001.
[13] D. Gesbert, “Robust linear MIMO receivers: A minimum error-rate approach,” IEEE Trans. Signal Process., vol. 51, no. 11, pp. 2863–2871, Nov. 2003.
[14] A. Hjørungnes and D. Gesbert, “Exact SER-precoding of orthogonal space-time block coded correlated MIMO channels: An iterative approach,” in Proc. 6th Nordic Signal Processing Symp., NORSIG 2004, Espoo, Finland, Jun. 2004, pp. 336–339.
[15] E. Biglieri, “Transmission and reception with multiple antennas: Theoretical foundations,” Foundations and Trends in Communications and Information Theory, vol. 1, p. 222, 2004.
[16] R. R. Müller, “On the asymptotic eigenvalue distribution of concatenated vector-valued fading channels,” IEEE Trans. Inf. Theory, vol. 48, no. 7, pp. 2086–2091, Jul. 2002.
[17] R. R. Müller, P. Schramm, and J. B. Huber, “Spectral efficiency of CDMA systems with linear interference suppression,” in IEEE Workshop on Communication Engineering, Ulm, Germany, Jan. 1997, pp. 93–97.
[18] K. Mardia, J. T. Kent, and J. Bibby, Multivariate Analysis. San Diego, CA: Academic, 1979.
[19] D. Gore, R. W. Heath, and A. Paulraj, “On performance of the zero forcing receiver in presence of transmit correlation,” in Proc. IEEE Int. Symp. Information Theory, Lausanne, Switzerland, Jun./Jul. 2002, p. 159.
[20] J. W. Silverstein, “Strong convergence of the empirical distribution of eigenvalues of large dimensional random matrices,” J. Multivariate Anal., vol. 55, no. 2, pp. 331–339, 1995.
[21] Z. D. Bai, “Methodologies in spectral analysis of large dimensional random matrices, a review,” Statist. Sinica, vol. 9, no. 3, pp. 611–677, 1999.
[22] Z. D. Bai and W. Silverstein, “CLT for linear spectral statistics of large-dimensional sample covariance matrices,” Ann. Probab., vol. 32, no. 1A, pp. 553–605, 2004.
[23] P. Li, D. Paul, R. Narasimhan, and J. Cioffi, “On the Asymptotic and Approximate Distribution of SINR for the Linear MMSE MIMO Receiver,” Dept. Statist., Stanford Univ., Stanford, CA, Tech. Rep. 2005-15, Jun. 2005.
[24] H. Shin and J. H. Lee, “Capacity of multiple-antenna fading channels: Spatial fading correlation, double scattering and keyhole,” IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2636–2647, Oct. 2003.
[25] R. Narasimhan, “Spatial multiplexing with transmit antenna and constellation selection for correlated MIMO fading channels,” IEEE Trans. Signal Process., vol. 51, no. 11, pp. 2829–2838, Nov. 2003.
[26] R. Durrett, Probability: Theory and Examples, 2nd ed. Belmont, CA: Duxbury, 1995.
[27] P. Hougaard, “Survival models for heterogeneous populations,” Biometrika, vol. 73, no. 2, pp. 387–396, 1986.
[28] H. U. Gerber, “From the generalized gamma to the generalized negative binomial distribution,” Insurance: Mathematics and Economics, vol. 10, pp. 303–309, 1991.
[29] I. S. Gradshteyn and I. M. Ryzhik, Tables of Integrals, Series, and Products, 2nd ed. New York: Academic, 1980.
[30] M. S. Alouini and A. J. Goldsmith, “A unified approach for calculating error rates of linearly modulated signals over generalized fading channels,” IEEE Trans. Commun., vol. 47, no. 9, pp. 1324–1334, Sep. 1999.
[31] M. K. Simon and M.-S. Alouini, Digital Communication Over Fading Channels: A Unified Approach to Performance Analysis, 1st ed. New York: Wiley, 2000.
[32] A. Guionnet and O. Zeitouni, “Concentration of the spectral measure for large matrices,” Electron. Commun. Probab., pp. 119–136, 2000.