JOINT MAXIMUM LIKELIHOOD ESTIMATION OF TIME ... - CiteSeerX

8 downloads 0 Views 165KB Size Report
Carlos Fernández-Prades∗, Juan A. Fernández-Rubio∗, Gonzalo Seco∗∗. ∗Dept. of Signal Theory and Communications, Universitat Polit`ecnica de Catalunya ...
JOINT MAXIMUM LIKELIHOOD ESTIMATION OF TIME DELAYS AND DOPPLER SHIFTS Carlos Fern´andez-Prades∗ , Juan A. Fern´andez-Rubio∗ , Gonzalo Seco∗∗ ∗

Dept. of Signal Theory and Communications, Universitat Polit`ecnica de Catalunya. Jordi Girona 1-3, 08034 Barcelona, Spain. E-mail: {carlos, juan}@gps.tsc.upc.es ∗∗ TT&C and Radio Navigation Section (TOS-ETT), European Space Agency ESA/ESTEC. Postbus 299, 2200 AG Noordwijk, The Netherlands. E-mail: [email protected] ABSTRACT This paper deals with the problem of estimating the amplitudes, time delays and Doppler shifts of a set of known signals received by an array of antennas in temporally white but spatially colored noise. Applying the Maximum Likelihood approach, the estimation reduces to an optimization problem where the nonlinear parameters such frequency shifts and delays are jointly estimated by means of a minimization over a highly nonlinear cost function, which is derived in this paper. In addition, an iterative algorithm that performs these estimations is also presented.

can be modeled as x(t) =

M X

ai si (t − τi ) exp {j2πfi t} + w(t)

(1)

i=1

where ai is the complex amplitude of each signal, τi is the delay, fi is the Doppler shift, and w(t) is additive white Gaussian noise. Each antenna receives a different replica of this signal, with a different phase depending on the array geometry and the direction of arrival. This can be expressed by a vector signal model, where each row corresponds to one antenna1 : x = GT Ad + n

1. INTRODUCTION

(2)

where The estimation of amplitudes, time delays and Doppler shifts of several incoming signals is a critical aspect in wireless communications, where these estimators provide information about the parameters of a multipath frequency– selective channel. The equivalent problem of synchronization is especially important in time-division multiple access (TDMA), packed-based systems and multiuser detection in code-division multiple access (CDMA). In applications such as active radar and sonar, global navigation satellite systems (GNSS) and search and rescue (SAR) satellitebased systems, these parameters inform about the position and relative motion of the objects. This paper presents a statistical approach to the synchronization problem exploiting antenna arrays. 2. PROBLEM FORMULATION An N –element antenna array receives M scaled, time– delayed and Doppler–shifted known complex baseband signals, si (t), i = 1, · · · , M . The receiving baseband signal Work partially supported by Spanish Government under grants FPUAP2000-3893 and TIC2001-2356-C02-01

• x(t) ∈ CN ×1 is the observed signal vector, • G is related to geometry, • A = diag(a) ∈ CM ×M is a diagonal matrix with the elements of the vector a along its diagonal, • d = [s1 (t − τ1 ) exp{j2πf1 t} . . . sM (t − τM ) exp{j2πfM t}]T , d ∈ CM ×1 the delayed Doppler–shifted narrowband signals envelopes, and • n(t) ∈ CN ×1 represents additive noise and all other disturbing terms. In this model, the narrowband array assumption has been made. This assumption consists of taking the time required for the signal to propagate along the array as much smaller than its inverse bandwidth. So, a phase shift can be used to describe the propagation from one antenna to another. In the same way, we have assumed that the Doppler effect can be modeled by a frequency shift, which is commonly referred as the narrowband signal assumption. 1 The transpose, conjugate and conjugate transpose operations are designated by (·)T , (·)∗ and (·)H respectively.

The matrix G depends on the geometry of the array and on the position of the sources or the scatterers. G can be regarded as a spatial signature, since it is uniquely defined for a set of sources emitting from different directions. This quite restrictive structured array model can be substituted by the unstructured one simply exchanging GT A for a channel matrix H = GT A with no loss of accuracy in the estimation of the synchronization parameters (see discussion in section 3). Suppose that K snapshots of the impinging signal are taken with a sampling interval Ts satisfying the Nyquist criterion. Then the sampled data can be expressed as X = GT AD + N

(3)

using the following definitions: • X = [ x(t0 ) · · · 



  D=  

x(tK−1 ) ] ∈ CN ×K

s1 (t0 −τ1 )ej2πf1 t0

···

s1 (tK−1 −τ1 )ej2πf1 tK−1

.. .

.. .

sM (t0 −τM )ej2πfM t0 M ×K

···

sM (tK−1 −τM )ej2πfM tK−1

D∈C

• N = [ n(t0 ) · · ·

n(tK−1 ) ] ∈ C

where C has been defined as C=

1 (X − GT AD)(X − GT AD)H K

From (5) to (6) we have used the trace property xH y = tr(yxH ), which leads to xH Ax = tr(AxxH ) and definition (7) arises. Using the following cross-correlation estimation matrix definitions: ˆ xx = 1 XXH R K ˆ dx = R ˆH R xd

ˆ xd = R ˆ dd = R

1 H K XD 1 H K DD

(8)

(7) can be expressed as ˆ xx − R ˆ xd AH G∗ − GT AR ˆ dx + GT AR ˆ dd AH G∗ C=R (9)  The joint Maximum Likelihood (ML) estimate of f , τ   and A, or in other words, the value of these parameters for ,  which the value of the observed X is most probable, is ob tained by minimizing (6): ˆ A, ˆ ˆf , τˆ |M L = arg min Q,

N ×K

(7)

Q,A,f ,τ

  ln(det(Q)) + tr(Q−1 C) (10)

The term n(t) includes the contribution of several phenomena, such thermal noise, interferences or multipaths of each signal. We assume a complex, circularly symmetric Gaussian vector process with a zero–mean, temporally white and arbitrary unknown spatial correlation matrix Q:  E n[n]nH [m] = Qδn,m (4) Matrix Q is not parameterized by the direction of arrival of the signals. This characteristic helps to overcome difficulties due to errors in the array calibration or jamming [1]. In case the noise is temporally colored, prewhitening techniques can be applied [2]. 3. MAXIMUM LIKELIHOOD PARAMETER ESTIMATION The probability density function (PDF) of a complex multivariate Gaussian vector x, taking into account the previous considerations about the noise, can be expressed as:   exp −(x − GT Ad)H Q−1 (x − GT Ad) p(x) = (5) π N det(Q) Hence, the likelihood function of the data is proportional to (5). Applying the logarithm and neglecting irrelevant additive and multiplicative constants, we determine the negative log–likelihood function for K observations of x: −1

Λ1 (Q, A, f , τ ) = ln(det(Q)) + tr(Q

C)

(6)

The gradient of (6) with respect to Q is   T ∇Q ln det(Q) + tr(CQ−1 ) = Q−1 − Q−1 CQ−1 (11) Assuming K ≥ N + M , so that the matrix C is invertible with probability one, the ML estimate of the covariance matrix Q is ˆ M L = C(A, f , τ ) ˆ Q (12) A=AM L ,f =ˆ fM L ,τ =ˆ τ ML Replacing Q in (6) with (12) and neglecting the additive constant term we obtain  ˆ xx − R ˆ xd AH G∗ − GT AR ˆH + Λ2 (A, f , τ ) = ln det R xd  ˆ dd AH G∗ + GT A R (13) ˆ xd R ˆ −1 R ˆ H to the argument Adding and subtracting R xd dd of the determinant in (13) and using the hermitian property of the autocorrelation matrix, the criterion function can be expressed as  ˆ xx − R ˆ xd R ˆ −1 R ˆH + Λ2 (A, f , τ ) = ln det R xd dd    H  ˆ dd GT A − R ˆ xd R ˆ −1 ˆ xd R ˆ −1 R + GT A − R dd

dd

(14)

The determinant is a nondecreasing function, so for any positive definite matrix U and any non-negative definite matrix V satisfies  det(U + V) = det U I + U−1 V =  (15) = det(U) det I + U−1 V ≥ det(U) since the eigenvalues of I + U−1 V are ≥ 1. The equality only holds for V = 0. Defining ˆ xx − R ˆ xd R ˆ −1 R ˆH U=R xd dd

(16)

and    H ˆ xd R ˆ −1 R ˆ dd GT A − R ˆ xd R ˆ −1 V = GT A − R dd dd (17) we have that ln det(U + V) ≥ ln det(U), so equation (14) reaches its minimum when V = 0, i.e., ˆ xd R ˆ −1 GT A = R dd

(18)

and PXH defined equally, a straightforward manipulation of (21) yields to ˆfM L , τˆ M L

=

arg min ln det (I − PDH PXH ) =

=

arg min ΛM L (f , τ )

f ,τ

f ,τ

(22)

which does not depend explicitly on the geometry, although it is implicitly involved in the observation matrix X. This can be explained by a result given in [3], where the decoupling of synchronization and spatial parameters is showed. Hence, it can be concluded that there is no need of knowledge of the spatial signatures if we are only concerned about synchronization, since asymptotically the same accuracy in estimating f and τ is achieved with the structured or unstructured array model. The resulting cost function ΛM L is highly nonlinear, as can be seen in figure 1, so iterative algorithms based on the gradient like steepest descent or Newton–Raphson method need to be initialized in the convergence region by a suboptimal algorithm.

which can be regarded as an adaptive Wiener filter for estimating X from D, or the channel matrix H in case of the unstructured array model. If the geometry matrix G is known (common in GNSS applications), the ML estimation for the amplitudes is ˆ M L = G∗ GT A

−1

ˆ xd R ˆ −1 G∗ R dd

(19)

The ML estimate of the carriers phase are directly the ˆ M L due to the invariance principle of the arguments of A ML estimates. Substituting (19) in (14) and applying the determinant properties det(AB) = det(A) det(B) and det(A + BC) = det(A+CB) for matrices with appropriate dimensions:   ˆ xx − R ˆ xd R ˆ −1 R ˆH = Λ3 (f , τ ) = ln det R xd dd    ˆ xx I − R ˆ −1 R ˆ xd R ˆ −1 R ˆH = ln det R = xx xd dd     1 1 2 ˆ ˆ xx + ln det I − R ˆ− ˆ −1 ˆ H ˆ − 2 = ln det R xx Rxd Rdd Rxd Rxx (20) Therefore, the joint ML estimate for both time–delays and Doppler–shifts is:   1 1 2 ˆ ˆfM L , τˆ M L = arg min ln det I − R ˆ− ˆ −1 ˆ H ˆ − 2 xx Rxd Rdd Rxd Rxx f ,τ

(21) This expression can be rearranged applying definitions in (8). If PDH = DH (DDH )−1 D is the projection matrix over the subspace spanned by the columns of DH ,

Figure 1: Cost function ΛM L for the case M = 1. This criterion can be simplified if the noise term is considered spatially white (i.e., the covariance matrix is proportional to the identity matrix), because (22) reduces to a trace instead of a log-determinant operation. In that case, the cost function depends linearly on the signal projection matrix PDH , and this is exploited by iterative solutions like IQML [1]. However, this assumption is often not accomplished in practical applications. 4. ESTIMATION BY THE NEWTON’S METHOD In order to develop an iterative algorithm and thus to avoid the computational burden of a grid search, we propose a minimization of the cost function (22) by the Newton’s method, i.e., weighting the gradient by the inverse of the Hessian matrix:   −1   ˆ f (k+1) = ˆ f (k) − Hf Λ ˆ f (k) , τˆ(k) ∇f Λ ˆ f (k) , τˆ(k) (23)

where

5. SIMULATION RESULTS 

   ∂2Λ Hf Λ ˆ f (k) , τˆ(k) = =  T ∂f ∂f

∂2Λ ∂f1 ∂f1

.. .

∂2Λ ∂fM ∂f1

∂2Λ ∂f1 ∂fM

... .. . ...



.. .

  

∂2Λ ∂fM ∂fM

(24) is the Hessian matrix of function Λ in the iteration k respect to the vector f , and ∇f is the gradient. The same for vector τ:

The simulated scenario consists of four filtered BPSK signals (three desired and a coherent interference), impinging to a linear array of 5 antennas. The directions of arrival have been fixed to θ = [ 20◦ 60◦ 80◦ ]T , while the interference comes from θint = 40◦ with the same amplitude than the desired signals. The evolution of the error is shown in figure 2.

−1     τˆ(k+1) = τˆ(k) − Hτ Λ ˆ f (k) , τˆ(k) ∇τ Λ ˆ f (k) , τˆ(k) (25) 4.1. Gradient computation ∂P

∂P

The derivative ∂fDiH is a K × K matrix, so ∂fDH will be a tridimensional tensor formed by M matrices of K × K elements, each one of them containing the partial derivatives of the PDH elements respect to the fi variables. The H derivative ∂D ∂fi is easily obtained from definition of D. An equivalent procedure is applied to obtain derivatives respect the τi variables.   ∂Λ −1 ∂PDH = −tr [I − PDH PXH ] PXH (26) ∂fi ∂fi   H H   H ∂PDH ⊥ ∂D H † ⊥ ∂D H † D = P DH D + P DH ∂fi ∂fi ∂fi (27) 4.2. Hessian computation ∂2P

In this case, ∂f ∂fDTH is a set of M tridimensional tensors, each one formed by M matrices of K × K size. ∂2Λ = ∂fi ∂fk

n −1 tr [I − PDH PXH ] −1

PXH o ∂PDH H P + X ∂fi

[I − PDH PXH ] n −1 − tr [I − PDH PXH ]

∂ 2 PD H ∂P = − ∂fDiH ∂fk ∂fi

∂DH ∂fk

DH

†

∂PDH ∂fk

∂ 2 PDH ∂fk ∂fi

2

o PXH (28)

H

∂ D H + P⊥ DH ∂fk ∂fi D

†

+

Figure 2: Vector estimation error of f and τ for 3 desired signals and an interference

6. CONCLUSIONS We have presented a new cost function which minimization leads to the joint Maximum Likelihood estimation of frequency shifts and time delays of a set of incoming signals, using an antenna array and supposing noise with unknown spatial covariance. This criterion is independent of the array response matrix. An iterative algorithm for the computation of such estimators has also been derived. 7. REFERENCES [1] Gonzalo Seco, Antenna Arrays for Multipath and Interference Mitigation in GNSS Receivers, Ph.D. thesis, Dept. of Signal Theory and Communications, Universitat Polit´ecnica de Catalunya, Barcelona, Spain, July 2000. [2] Andreas Jakobsson, A. Lee Swindlehurst, and Petre Stoica, “Subspace–based estimation of time delays and doppler shifts,” IEEE Transactions on Signal Processing, vol. 46, no. 9, pp. 2472–2483, September 1998.

−1 ∂D ⊥ DDH (29) [3] Aleksandar Dogand˘zi´c and Arye Nehorai, “Space– ∂fi PDH +   H H † † time fading channel estimation and symbol detection H ∂D −P⊥ DH ∂D DH + (·) DH ∂fk ∂fi in unknown spatially correlated noise,” IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 457–474, H where the notation (·) means that the same expression apMarch 2002. pears again with complex conjugate and transpose. H

∂D +P⊥ DH ∂fk