ESTIMATION OF HYPERSPECTRAL COVARIANCE MATRICES
Avishai Ben-David¹ and Charles E. Davidson²
¹Edgewood Chemical Biological Center, USA. ²Science and Technology Corporation, USA.

Outline
• Why are covariance matrices important?
• What is the difficulty in estimation?
• Our approach
• Example for hyperspectral detection

Why are covariance matrices important?
• The covariance matrix C is the engine of most multivariate detection algorithms
• Examples:
  Matched Filter: score = αᵀ·C⁻¹·t
  Anomaly Detector: score = αᵀ·C⁻¹·α
  α = measurement vector, t = target vector
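As a concrete illustration (not part of the original slides), a minimal Python sketch of the two detector scores, assuming alpha and t are NumPy vectors and C_inv is a precomputed inverse covariance matrix:

```python
import numpy as np

def matched_filter_score(alpha, C_inv, t):
    """Matched filter: score = alpha^T C^-1 t."""
    return float(alpha @ C_inv @ t)

def anomaly_score(alpha, C_inv):
    """Anomaly detector: score = alpha^T C^-1 alpha."""
    return float(alpha @ C_inv @ alpha)
```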

How do we compute C?
• z is a measurement vector with p spectral bands (i.e., z is a p-by-1 vector), measured when the target was absent (the H0 hypothesis)
• We acquire n z-vectors, construct a p-by-n matrix Z, and center it (mean subtracted): Z ← Z − E(Z)
• C = cov(Z) = E(ZZᵀ) = UΛUᵀ (C follows Wishart statistics), where Λ is the estimated eigenvalue matrix and U is the estimated eigenvector matrix from an SVD decomposition.
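A minimal sketch of this computation in Python (an illustration, assuming the p-by-n data matrix Z holds the n measurement vectors as columns):

```python
import numpy as np

def sample_covariance_eig(Z):
    """Center the p-by-n data matrix Z and return the eigenvalues and
    eigenvectors of the sample covariance C, computed via SVD."""
    p, n = Z.shape
    Zc = Z - Z.mean(axis=1, keepdims=True)      # Z <- Z - E(Z)
    # SVD: Zc = U * S * Vt, so C = Zc Zc^T/(n-1) = U * diag(S^2/(n-1)) * U^T
    U, S, _ = np.linalg.svd(Zc, full_matrices=False)
    eigvals = S**2 / (n - 1)                    # sampled eigenvalues, descending
    return eigvals, U
```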

What is the difficulty in estimation?
• The problem is that there are not enough measurements of z-vectors (n is too small)
• Example: sampled eigenvalues Λ from a sampled C (average of 1000 Λ matrices)
• 5 spectral bands (p), i.e., C is a 5-by-5 matrix (very small) with true eigenvalues Λ = [1 2 3 4 5]
(a) n = 50 measurements, n/p = 10 (e.g., with p = 150, typical in hyperspectral, this requires n ≈ 1,500): Λ = [0.9 1.8 2.8 4.0 5.6]
(b) n = 10 measurements, n/p = 2 (e.g., the RMB rule in radar): Λ = [0.4 1.1 2.1 4.0 7.3] (Reed, Mallett & Brennan, 1974: the average SNR loss for the matched filter is ×2)
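The toy experiment above is easy to reproduce; a hedged sketch (exact averages will vary with the random seed):

```python
import numpy as np

rng = np.random.default_rng(0)
true_eigs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # population eigenvalues
p = len(true_eigs)

def mean_sampled_eigs(n, trials=1000):
    """Average the sorted sampled eigenvalues over many draws."""
    acc = np.zeros(p)
    for _ in range(trials):
        Z = np.sqrt(true_eigs)[:, None] * rng.standard_normal((p, n))
        acc += np.sort(np.linalg.eigvalsh(np.cov(Z)))
    return acc / trials

print(mean_sampled_eigs(50))   # n/p = 10: close to [1 2 3 4 5]
print(mean_sampled_eigs(10))   # n/p = 2: small eigenvalues shrink, large ones inflate
```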

Our solution (general overview)
• Objective: find a simple transformation from the sampled eigenvalues Λx to the population (truth) eigenvalues ΛΩ: Λ = f(Λx) ≈ ΛΩ
• The improved covariance matrix is computed as C = UΛUᵀ: we replace the sampled eigenvalues Λx with the improved estimate Λ and use the sampled eigenvectors U (for lack of knowledge of the population eigenvectors).

• Our solution involves two steps. The 1st step is interpreted as adding energy spectrally; the 2nd step balances the energy in two big blocks, the small- and large-eigenvalue regions. Thus, we "redistribute" energy among the eigenvalues.
• We use the theory of the statistical distribution of eigenvalues of Wishart matrices, bounds on the magnitudes of eigenvalues, and an energy-conservation constraint.

• We view the sampled eigenvalues "as if" they can be represented by a diagonal of p block matrices, each obeying the Marcenko-Pastur law. Each sampled eigenvalue is treated "as if" sampled from the mode (i.e., the highest-probability location).
• Sampled eigenvalues are "shifted" (1st step) toward the population eigenvalues.
• We impose energy conservation (2nd step) on the solution, because the sum of the eigenvalues (the trace) is unbiased, i.e., trace(Λx) = trace(ΛΩ). The trace is the signal "energy" (total variation).

Our solution (detailed view)
How simple is it? Multiplication of 3 matrices:

  Λ = f(Λx, n) = Λx · F · E

Λx is the sampled eigenvalues matrix, Λx = eig(C)

Shift the sampled eigenvalues based on the mode, with matrix F and multiplicity pᵢ:

  F = diag(1/Fmode);  Fmode(i) = (1 − pᵢ/n)² / (1 + pᵢ/n)

Balance the energy with matrix E:

  E = [ Elarge·I_t       0              ]
      [ 0                Esmall·I_(p−t) ]

  Elarge = [ Σ_{i=1..t} sx(i) ] / [ Σ_{i=1..t} sx(i)/Fmode(i) ]

  Esmall = [ Σ_{i=t+1..p} sx(i) ] / [ Σ_{i=t+1..p} sx(i)/Fmode(i) ]
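A sketch of the two-step correction in Python (an illustration of the formulas above; eigenvalues sorted descending, with the multiplicities p_i and the block-split index t taken as inputs):

```python
import numpy as np

def improve_eigenvalues(s_x, n, p_i, t):
    """Two-step correction Lambda = Lambda_x * F * E (all diagonal matrices,
    handled here as 1-D arrays). s_x: sampled eigenvalues, descending;
    n: number of samples; p_i: apparent multiplicities; t: large/small split."""
    s_x = np.asarray(s_x, float)
    p_i = np.asarray(p_i, float)
    # Step 1: shift each eigenvalue away from the Marcenko-Pastur mode (matrix F)
    F_mode = (1.0 - p_i / n) ** 2 / (1.0 + p_i / n)
    shifted = s_x / F_mode
    # Step 2: rescale so each block conserves its energy, i.e., its trace (matrix E)
    E_large = s_x[:t].sum() / shifted[:t].sum()
    E_small = s_x[t:].sum() / shifted[t:].sum()
    E = np.concatenate([np.full(t, E_large), np.full(s_x.size - t, E_small)])
    return shifted * E
```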

Regularization aspect of the solution (enhanced stability)
• The solution is a nonlinear transformation of the sampled eigenvalues Λx
• We can also write the solution in the framework of traditional regularization as Λ = Λx + Δ; Δ = Λx·(F·E − I)
• Our correction is potentially different for each eigenvalue (it is a single offset in traditional regularization).
• With our method the condition number of Λ improves (decreases), because the magnitudes of the small sampled eigenvalues tend to increase. Thus, cond(Λ) < cond(Λx)
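A quick numerical check of the condition-number claim, reusing the sketch above with the n/p = 2 eigenvalues from the earlier example; the multiplicities p_i and split index t here are invented for illustration only:

```python
import numpy as np

s_x = np.array([7.3, 4.0, 2.1, 1.1, 0.4])    # sampled eigenvalues, n/p = 2 example
lam = improve_eigenvalues(s_x, n=10, p_i=[1, 1, 2, 3, 3], t=2)  # hypothetical p_i, t
print(s_x.max() / s_x.min())                 # cond(Lambda_x) ~ 18.3
print(lam.max() / lam.min())                 # cond(Lambda)   ~ 15.1, i.e., smaller
```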

Eigenvalue estimation for a diagonal matrix: Marcenko-Pastur law
• C is a p-by-p diagonal matrix with C = σ²·I (multiplicity p: each of the p eigenvalues is σ²), so C ~ Wp(n⁻¹σ²I, n)
• The pdf of the sampled eigenvalues is known analytically.
• There is a relationship between the mode of the pdf and the true (population) eigenvalue. The mode is the maximum-likelihood position.

  sx(mode) = σ²·(1 − k)² / (1 + k) = σ²·Fmode,  k = p/n

• Based on the mode location, the sampled eigenvalue is shifted upward (step 1 of the process) toward the population value (the mean)
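The mode/population relationship can be verified numerically; a sketch (histogram-based, so the empirical mode is approximate):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, sigma2 = 40, 80, 1.0                          # k = p/n = 0.5
k = p / n
eigs = np.concatenate([
    np.linalg.eigvalsh(np.cov(np.sqrt(sigma2) * rng.standard_normal((p, n))))
    for _ in range(200)
])
hist, edges = np.histogram(eigs, bins=100)
empirical_mode = 0.5 * (edges[np.argmax(hist)] + edges[np.argmax(hist) + 1])
predicted_mode = sigma2 * (1 - k) ** 2 / (1 + k)    # sigma^2 * F_mode ~ 0.167
print(empirical_mode, predicted_mode)               # both near 0.167
```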

Apparent multiplicity pᵢ for nondiagonal matrices

  F = diag(1/Fmode);  Fmode(i) = (1 − pᵢ/n)² / (1 + pᵢ/n)

• We use theory for the bounds of the sampled eigenvalues, a(i) ≤ sx(i) ≤ b(i):
  a(i) = sx(i)·(1 − √k)²,  b(i) = sx(i)·(1 + √k)²,  k = p/n
• We count the number (pᵢ) of overlapped eigenvalues within [a(i), b(i)] for each sampled eigenvalue.
Example: the multiplicity of the 4th eigenvalue is 3 (two neighbors, the 2nd & 3rd, plus itself)
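A sketch of the multiplicity count (illustrative; assumes s_x is sorted and uses the bounds above):

```python
import numpy as np

def apparent_multiplicity(s_x, n):
    """For each sampled eigenvalue, count how many eigenvalues (itself included)
    fall inside its bound interval [a(i), b(i)]."""
    s_x = np.asarray(s_x, float)
    k = s_x.size / n
    a = s_x * (1.0 - np.sqrt(k)) ** 2
    b = s_x * (1.0 + np.sqrt(k)) ** 2
    return np.array([np.count_nonzero((s_x >= a[i]) & (s_x <= b[i]))
                     for i in range(s_x.size)])
```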

Examples
1. Simulations with many analytical functions & statistics for population eigenvalues (normal, uniform, Gamma)
2. Field data: hyperspectral sensors SEBASS & TELOPS

[Figure: SEBASS eigenvalue spectra, n/p = 2, p = 115, comparing data, solution, and truth]

Figures of merit: ratios of improvement of the solution over the data
• Re = residual
• RA = area
• Rcond = condition number
• Rd = distance in probability

All figures of merit are greater than 1; hence, our solution improves over the data.

Probability density functions for TELOPS measurements for selected eigenvalues (n/p = 2, p = 135)

[Figure: panels of eigenvalue pdfs comparing truth, data, and solution]

All figures of merit are greater than 1; hence, our solution improves over the data.
• Drastic improvement: panels 3, 4, 6, 7, 8, 9 (eigenvalues # 30, 40, 80, 100, 120, 135)
• No difference: panels 1, 5 (eigenvalues # 1, 50)
• Failure: panel 2 (eigenvalue # 10)

Application to Hyperspectral Detection
• Matched Filter: score = αᵀ·C⁻¹·t, with α = measurement vector, t = target vector
• Random target direction

[Figure: ROC curves comparing data, solution, truth (known population eigenvalues), and clairvoyant (known population eigenvalues and eigenvectors)]

• From the data: Pd < 50%
• With our solution: Pd > 60%
• Known eigenvalues (& sampled eigenvectors): Pd > 65%
• Known covariance (true eigenvalues & eigenvectors): Pd > 80%
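Putting the pieces together, a hypothetical end-to-end wiring of the sketches above (the split index t and the vectors alpha and t_vec are assumed inputs; none of this code is from the slides):

```python
import numpy as np

# Z: p-by-n background data; alpha: measurement vector; t_vec: target vector; t: split index
eigvals, U = sample_covariance_eig(Z)                 # sampled eigen-decomposition
p_i = apparent_multiplicity(eigvals, n=Z.shape[1])    # apparent multiplicities
lam = improve_eigenvalues(eigvals, n=Z.shape[1], p_i=p_i, t=t)
C_inv = U @ np.diag(1.0 / lam) @ U.T                  # improved C^-1 (sampled U kept)
score = matched_filter_score(alpha, C_inv, t_vec)     # matched-filter detection score
```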

Summary
• We presented a method to estimate the eigenvalues of a sampled covariance matrix (Wishart distributed) with few samples.
• The method is practical, quick, and simple to implement: a multiplication of three diagonal matrices.
• The method achieves two objectives: improved estimation of the eigenvalues & an improved condition number (i.e., regularization).
• With the method, we improve detection performance (ROC curve).