A covariance matrix test for high-dimensional data

Songklanakarin J. Sci. Technol. 38 (5), 521-535, Sep. - Oct. 2016 http://www.sjst.psu.ac.th

Original Article

A covariance matrix test for high-dimensional data

Saowapha Chaipitak and Samruam Chongcharoen*

School of Applied Statistics, National Institute of Development Administration (NIDA), Bang Kapi, Bangkok, 10240 Thailand.

Received: 25 October 2015; Accepted: 25 January 2016

Abstract

For multivariate normally distributed data with dimension larger than or equal to the number of observations (the sample size), called high-dimensional normal data, we propose a test of the null hypothesis that the covariance matrix of a normal population is proportional to a given matrix, under some conditions as the dimension goes to infinity. We show that this test statistic is consistent. The asymptotic null and non-null distributions of the test statistic are also given. The performance of the proposed test is evaluated via a simulation study and an application.

Keywords: asymptotic distribution, high-dimensional data, null distribution, non-null distribution, multivariate normal, hypothesis testing

1. Introduction

Let $X_1, \ldots, X_N$ be a set of independent observations from a multivariate normal distribution $N_p(\mu, \Sigma)$, where both the mean vector $\mu$ and the covariance matrix $\Sigma$, a positive definite matrix, are unknown. In this paper, we are interested in testing the hypothesis that the covariance matrix of a normal population is proportional to a given matrix, that is, $H_0\colon \Sigma = t\Sigma_0$ against $H_1\colon \Sigma \neq t\Sigma_0$, where both $0 < t < \infty$ and $\Sigma_0$ are known. The likelihood ratio test (LRT), which is based on the sample covariance matrix, is the traditional technique for this hypothesis and requires $n > p$. But in many applications in modern science and economics, e.g. the analysis of DNA microarrays, the dimension is in the thousands of gene expressions whereas the sample size is small, so that $n < p$; such data are called high-dimensional data. For such data, the LRT is not applicable because the sample covariance matrix $S$ is singular when $n < p$ (see, for example, Muirhead, 1982, Sections 8.3 and 8.4; Anderson, 1984, Sections 10.7 and 10.8).

* Corresponding author. Email address: [email protected]

Recently, several authors have proposed methods for testing related problems, among them John (1971), Nagao (1973), Ledoit and Wolf (2002), Srivastava (2005), and Fisher et al. (2010). These are as follows. John (1971) proposed a test statistic for testing that the covariance matrix of a normal population is proportional to an identity matrix, that is, $H_0\colon \Sigma = tI$, $0 < t < \infty$ a known value, which is the locally most powerful invariant test:

$$U = \frac{1}{p}\operatorname{tr}\left[\left(\frac{S}{(1/p)\operatorname{tr}(S)} - I_p\right)^2\right],$$

and Nagao (1973) proposed a test statistic for testing $H_0\colon \Sigma = I$ as

$$V = \frac{1}{p}\operatorname{tr}\left[(S - I)^2\right].$$

Both the $U$ and $V$ test statistics are consistent and were studied under the assumption that $n$ goes to infinity while $p$ remains fixed. Ledoit and Wolf (2002) demonstrated that the test based on the $U$ statistic is still consistent if $n$ goes to infinity with $p$, that is, as $(n,p)\to\infty$ with $p/n \to c$, $c \in (0, \infty)$. The null hypothesis $H_0$ is rejected if

$$U_j = \frac{npU}{2} \qquad (1.1)$$

exceeds the appropriate quantile of the $\chi^2$ distribution with $p(p+1)/2 - 1$ degrees of freedom. For testing $H_0\colon \Sigma = I$ when $n$ goes to infinity with $p$ such that $n < p$, Ledoit and Wolf (2002) showed that the statistic $V$ is not consistent against every alternative and that its $n$-limiting distribution differs from its $(n,p)$-limiting distribution under the null hypothesis. They therefore modified the statistic $V$ as

$$W = \frac{1}{p}\operatorname{tr}\left[(S - I)^2\right] - \frac{p}{n}\left[\frac{1}{p}\operatorname{tr}(S)\right]^2 + \frac{p}{n}.$$

They showed that the statistic $W$ is consistent as $(n,p)\to\infty$, including the case $n < p$. The test based on $W$ rejects the null hypothesis $H_0$ if $npW/2$ exceeds the appropriate quantile of the $\chi^2$ distribution with $p(p+1)/2$ degrees of freedom.

Srivastava (2005) proposed a test statistic when $(n,p)\to\infty$ with $n = O(p^{\delta})$, $0 < \delta \le 1$, for rejecting the null hypothesis of sphericity, $\Sigma = \sigma^2 I$ with $\sigma^2 > 0$ unknown, using a statistic that does not involve the unknown $\sigma^2$. Applying his test statistic to this testing problem, the null hypothesis is rejected if

$$T_{S1} = \frac{n}{2}\left(\frac{\hat h_2}{\hat h_1^2} - 1\right) \qquad (1.2)$$

exceeds the appropriate quantile of the standard normal distribution, where $\hat h_1 = (1/p)\operatorname{tr}(S)$ and

$$\hat h_2 = \frac{n^2}{(n-1)(n+2)}\,\frac{1}{p}\left[\operatorname{tr}(S^2) - \frac{1}{n}\left(\operatorname{tr} S\right)^2\right].$$

The statistics $\hat h_1$ and $\hat h_2$ are $(n,p)$-consistent estimators of $(1/p)\operatorname{tr}\Sigma$ and $(1/p)\operatorname{tr}\Sigma^2$, respectively. He also proposed a test that rejects the null hypothesis $H_0\colon \Sigma = I$ if

$$T_{S2} = \frac{n}{2}\left(\hat h_2 - 2\hat h_1 + 1\right) \qquad (1.3)$$

exceeds the appropriate quantile of the standard normal distribution.

Motivated by the result in Srivastava (2005), which requires $p/n \to c$, $c \in (0, \infty)$, Fisher et al. (2010) proposed a test for $H_0$ based on unbiased and consistent estimators of the second and fourth arithmetic means of the sample eigenvalues. With the constants

$$b = \frac{4}{n},\quad c^* = \frac{2n^2 + 3n - 6}{n(n^2+n+2)},\quad d = \frac{2(5n+6)}{n(n^2+n+2)},\quad e = \frac{5n+6}{n^2(n^2+n+2)},$$

$$\tau = \frac{n^5(n^2+n+2)}{(n+1)(n+2)(n+4)(n+6)(n-1)(n-2)(n-3)},$$

they proposed a test statistic that rejects the null hypothesis $H_0$ if

$$T_F = \frac{n\left(\hat h_4/\hat h_2^2 - 1\right)}{\sqrt{8(8 + 12c + c^2)}}$$

exceeds the appropriate quantile of the standard normal distribution, where

$$\hat h_4 = \frac{\tau}{p}\left[\operatorname{tr} S^4 - b\operatorname{tr} S^3 \operatorname{tr} S - c^*\left(\operatorname{tr} S^2\right)^2 + d\operatorname{tr} S^2\left(\operatorname{tr} S\right)^2 - e\left(\operatorname{tr} S\right)^4\right]$$

is an $(n,p)$-consistent estimator of $(1/p)\operatorname{tr}\Sigma^4$.

The remainder of this paper is organized as follows. Section 2 presents the proposed test statistic and its asymptotic distributions under both the null and alternative hypotheses as $(n,p)$ go to infinity, even when $n < p$. Section 3 studies the performance of the proposed test statistic by simulation. Section 4 applies the test statistic to real data. Section 5 contains the conclusions. The theoretical derivations are given in the Appendix.
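The statistics reviewed above are straightforward to compute from a sample covariance matrix. The following Python sketch (not the authors' code; the function names are ours) implements $U$, $W$, $\hat h_1$, $\hat h_2$, and the statistics $T_{S1}$ in (1.2) and $T_{S2}$ in (1.3) as reconstructed here:

```python
import numpy as np

def john_U(S):
    """John's (1971) statistic U = (1/p) tr[(S / ((1/p) tr S) - I)^2]."""
    p = S.shape[0]
    A = S / (np.trace(S) / p) - np.eye(p)
    return np.trace(A @ A) / p

def ledoit_wolf_W(S, n):
    """Ledoit-Wolf (2002) statistic W for H0: Sigma = I."""
    p = S.shape[0]
    A = S - np.eye(p)
    return (np.trace(A @ A) / p
            - (p / n) * (np.trace(S) / p) ** 2
            + p / n)

def srivastava_h(S, n):
    """Srivastava's (2005) (n,p)-consistent estimators h1, h2."""
    p = S.shape[0]
    h1 = np.trace(S) / p
    h2 = (n ** 2 / ((n - 1) * (n + 2)) / p
          * (np.trace(S @ S) - np.trace(S) ** 2 / n))
    return h1, h2

def T_S1(S, n):
    """Srivastava's statistic (1.2) for the sphericity hypothesis."""
    h1, h2 = srivastava_h(S, n)
    return n / 2 * (h2 / h1 ** 2 - 1)

def T_S2(S, n):
    """Srivastava's statistic (1.3) for H0: Sigma = I."""
    h1, h2 = srivastava_h(S, n)
    return n / 2 * (h2 - 2 * h1 + 1)
```

Here $S$ is the sample covariance matrix with divisor $n = N - 1$.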

2. Description of the Proposed Test

Suppose $X_1, \ldots, X_{n+1} \sim N_p(\mu, \Sigma)$ and we are interested in testing that the covariance matrix of a normal population is proportional to a given matrix, that is, $H_0\colon \Sigma = t\Sigma_0$ against $H_1\colon \Sigma \neq t\Sigma_0$, where $0 < t < \infty$ is a known value and $\Sigma_0$ is a given known positive definite matrix. We propose a test statistic by considering a measure of distance between the two matrices,

$$\psi = \frac{1}{p}\operatorname{tr}\left(\Sigma_0^{-1}\Sigma - tI\right)^2 = \frac{1}{p}\operatorname{tr}\left(\Sigma_0^{-1}\Sigma\right)^2 - \frac{2t}{p}\operatorname{tr}\left(\Sigma_0^{-1}\Sigma\right) + t^2, \qquad (2.1)$$

where tr denotes the trace of a matrix; $\psi \ge 0$, with $\psi = 0$ if and only if the null hypothesis holds. Thus, we may consider testing $H_0\colon \psi = 0$ against $H_1\colon \psi \neq 0$.

We shall make the following assumptions:

(A) $\lim_{p\to\infty} a_i = a_i^0$, $a_i^0 \in (0, \infty)$, $i = 1, \ldots, 8$;

(B) $\lim_{(n,p)\to\infty} p/n = c$, $c \in (0, \infty)$;

where $a_i = (1/p)\operatorname{tr}(\Sigma_0^{-1}\Sigma)^i = (1/p)\sum_{j=1}^p (\lambda_j/d_j)^i$. The $\lambda_j$'s are the eigenvalues of the covariance matrix $\Sigma$ and the $d_j$'s are the eigenvalues of the known positive definite matrix $\Sigma_0$.

We need estimators of $a_1$ and $a_2$ that are consistent for large $p$ and $n$, even if $n < p$. The following theorem provides these consistent estimators.

Theorem 2.1 The unbiased and consistent estimators of $a_1 = (1/p)\operatorname{tr}(\Sigma_0^{-1}\Sigma)$ and $a_2 = (1/p)\operatorname{tr}(\Sigma_0^{-1}\Sigma)^2$ are respectively given by

$$\hat a_1 = \frac{1}{p}\operatorname{tr}\left(\Sigma_0^{-1}S\right) \qquad (2.2)$$

and

$$\hat a_2 = \frac{n^2}{(n-1)(n+2)}\,\frac{1}{p}\left[\operatorname{tr}\left(\Sigma_0^{-1}S\right)^2 - \frac{1}{n}\left(\operatorname{tr}\left(\Sigma_0^{-1}S\right)\right)^2\right]. \qquad (2.3)$$

Thus we use the estimators in Theorem 2.1 to define the unbiased and consistent estimator of $\psi$ in (2.1) as

$$\hat\psi = \hat a_2 - 2t\hat a_1 + t^2. \qquad (2.4)$$

The following theorem gives the asymptotic distribution of the estimators $\hat a_1$ and $\hat a_2$ in (2.4).

Theorem 2.2 Under assumptions (A) and (B), as $(n,p)\to\infty$,

$$\begin{pmatrix}\hat a_1\\ \hat a_2\end{pmatrix} \xrightarrow{D} N\left(\begin{pmatrix}a_1\\ a_2\end{pmatrix}, \begin{pmatrix}2a_2/np & 4a_3/np\\ 4a_3/np & 4(2a_4 + ca_2^2)/np\end{pmatrix}\right),$$

where $x \xrightarrow{D} y$ denotes that $x$ converges in distribution to $y$.

The following theorem and corollary provide the asymptotic distribution of $\hat\psi$ under the alternative and null hypotheses, obtained by applying the delta method to a function of two random variables.

Theorem 2.3 Under assumptions (A) and (B), as $(n,p)\to\infty$,

$$\hat\psi - \psi \xrightarrow{D} N(0, \tau^2) \qquad (2.5)$$

with

$$\tau^2 = \frac{4}{np}\left(2t^2a_2 - 4ta_3 + 2a_4 + ca_2^2\right).$$

Corollary 2.1 Under the null hypothesis $H_0\colon \Sigma = t\Sigma_0$ we have $\psi = 0$, and under assumptions (A) and (B), as $(n,p)\to\infty$,

$$T = \frac{\sqrt{np}\,\hat\psi}{2\sqrt{c}\,t^2} = \frac{n\hat\psi}{2t^2} \xrightarrow{D} N(0, 1). \qquad (2.6)$$

Remark If $t = 1$ and $\Sigma_0 = I$, where $I$ is the identity matrix, the proposed statistic $T$ reduces to the test statistic $T_{S2}$ in (1.3) given by Srivastava (2005).

3. Simulation Study

To study the performance of the proposed test statistic $T$, we compute the attained significance level (ASL) of the test by simulation. Based on 10,000 replications of data simulated under the null hypothesis $H_0\colon \Sigma = t\Sigma_0$, the test statistic $T$ is computed, and the ASL is obtained as the proportion of replications in which the test rejects the null hypothesis at the nominal significance level 0.05. We simulate the ASL for four different null hypotheses:

1) $H_0^1\colon \Sigma = t\Sigma_0 = C_{01}$, where $C_{01} = (c_{i,j})_{p\times p} = (c_{|i-j|})_{p\times p}$ is a Toeplitz matrix with elements $c_0 = 1$, $c_1 = 0.5$, and the remaining elements equal to zero;

2) $H_0^2\colon \Sigma = t\Sigma_0 = C_{02} = 0.5\,I_p + 0.5\,\mathbf{1}_p\mathbf{1}_p'$, where $I_p$ denotes the $p \times p$ identity matrix and $\mathbf{1}_p$ denotes the $p \times 1$ vector having each element equal to 1;

3) $H_0^3\colon \Sigma = t\Sigma_0 = C_{03}$, where $C_{03} = (c_{i,j})_{p\times p} = (c_{j,i})_{p\times p}$ with $c_{i,j} = (-1)^{i+j}(i/2j)$ for $i < j = 1, \ldots, p$ and $c_{i,i} = 1.0$, $i = 1, \ldots, p$;

4) $H_0^4\colon \Sigma = t\Sigma_0 = C_{04}$, where $C_{04} = (c_{i,j})_{p\times p} = (c_{j,i})_{p\times p}$ with $c_{i,j} = (-1)^{i+j}\,0.9^{|i-j|^{1/5}}$, $i, j = 1, \ldots, p$.

For each null hypothesis, we simulate the empirical power of the proposed test $T$ under an alternative hypothesis:

1) $H_0^1\colon \Sigma = C_{01}$ against $H_1^1\colon \Sigma = C_1$, where $C_1 = (c_{i,j})_{p\times p} = (c_{|i-j|})_{p\times p}$ is a Toeplitz matrix with elements $c_0 = 1$, $c_1 = c_{-1} = 0.49$, and the remaining elements equal to zero;

2) $H_0^2\colon \Sigma = C_{02}$ against $H_1^2\colon \Sigma = C_2 = 0.9\,I_p + 0.1\,\mathbf{1}_p\mathbf{1}_p'$;

3) $H_0^3\colon \Sigma = C_{03}$ against $H_1^3\colon \Sigma = C_3$, where $C_3 = (c_{i,j})_{p\times p} = (c_{j,i})_{p\times p}$ with $c_{i,j} = (-1)^{i+j}(i/4j)$ for $i < j = 1, \ldots, p$ and $c_{i,i} = 1.0$, $i = 1, \ldots, p$;

4) $H_0^4\colon \Sigma = C_{04}$ against $H_1^4\colon \Sigma = C_4$, where $C_4 = (c_{i,j})_{p\times p} = (c_{j,i})_{p\times p}$ with $c_{i,j} = (-1)^{i+j}\,0.9^{|i-j|^{2/5}}$, $i, j = 1, \ldots, p$.

3.1 Simulation results

The ASL for the four null hypotheses is provided in Table 1.

Table 1. The ASL of test statistic T under four null hypotheses at nominal significance level α = 0.05.

  p    n = N-1   H01: Σ=C01   H02: Σ=C02   H03: Σ=C03   H04: Σ=C04
 10      9        0.059        0.058        0.059        0.059
 40      9        0.055        0.055        0.055        0.055
 40     39        0.056        0.056        0.056        0.057
 80      9        0.057        0.056        0.057        0.057
 80     39        0.052        0.052        0.052        0.052
 80     79        0.053        0.052        0.052        0.052
160      9        0.053        0.053        0.054        0.054
160     39        0.056        0.055        0.055        0.056
160     79        0.056        0.056        0.056        0.055
160    159        0.053        0.053        0.053        0.053
320      9        0.052        0.052        0.052        0.052
320     39        0.052        0.051        0.052        0.052
320     79        0.051        0.051        0.050        0.050
320    159        0.051        0.050        0.051        0.051
320    319        0.053        0.051        0.053        0.053
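The ASL computation described above can be sketched as follows (an illustrative Python Monte Carlo, not the authors' code; the one-sided rejection rule $T = n\hat\psi/(2t^2) > z_{0.95}$ follows the reconstructed statistic (2.6)):

```python
import numpy as np

def attained_significance_level(p, n, Sigma0, t, reps=2000, seed=1):
    """Monte Carlo ASL of the proposed test: reject H0 when
    T = n * psi_hat / (2 t^2) exceeds the upper 5% normal quantile."""
    rng = np.random.default_rng(seed)
    z = 1.6449                                # upper 5% N(0,1) quantile
    L = np.linalg.cholesky(t * Sigma0)        # to draw from N_p(0, t * Sigma0)
    Sigma0_inv = np.linalg.inv(Sigma0)
    rejections = 0
    for _ in range(reps):
        X = rng.standard_normal((n + 1, p)) @ L.T   # N = n + 1 observations
        S = np.cov(X, rowvar=False)                 # divisor n = N - 1
        Q = Sigma0_inv @ S
        a1 = np.trace(Q) / p
        a2 = (n ** 2 / ((n - 1) * (n + 2)) / p
              * (np.trace(Q @ Q) - np.trace(Q) ** 2 / n))
        psi = a2 - 2 * t * a1 + t ** 2
        if n * psi / (2 * t ** 2) > z:
            rejections += 1
    return rejections / reps
```

With `reps` on the order of 10,000, as in the paper, the returned proportion should be close to the nominal 0.05 under the null.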

As expected, the ASL of the test statistic T is reasonably close to the nominal significance level 0.05 and gets closer as $p$ and $n$ get large. The four sets of ASL values are almost the same, which indicates that the size behavior of our test statistic is not affected by varying the null covariance matrices.

The empirical powers are shown in Table 2. The four sets of empirical power of the test statistic T rapidly converge to one and stay high as $n$ and $p$ get large for $n < p$.

We also compute the ASL in a special case of the null covariance matrix, setting $t = 2$ and $\Sigma_0 = I$, that is, testing the null hypothesis $H_0\colon \Sigma = 2I$ (sphericity). We compare the performance of the proposed test statistic T with the test statistics of Ledoit and Wolf (2002), denoted $U_j$ in (1.1), and Srivastava (2005), denoted $T_{S1}$ in (1.2), under the alternative hypothesis $H_1\colon \Sigma = 2D$, where $D = \operatorname{diag}(d_1, \ldots, d_p)$ with $d_i \sim \mathrm{Unif}(0,1)$, $i = 1, 2, \ldots, p$. The ASL and the empirical powers are provided in Table 3. The ASL of the proposed test statistic T is similar to that in Table 1 and close to those of $T_{S1}$ and $U_j$. However, the test statistic T gives the best performance for all settings of $(n, p)$ and has substantially higher power than $U_j$ and $T_{S1}$ for almost every $n$ and $p$ considered. These results suggest that the proposed test may be more appropriate to use than the $U_j$ and $T_{S1}$ tests, especially when $n$ is small.

4. A Real Example

In this section, the microarray dataset collected by Notterman et al. (2001) is available at http://genomicspubs.princeton.edu/oncology/Data/CarcinomaNormaldatasetCancerResearch.xls (last accessed: 9 October 2011). It contains 18 colon adenocarcinomas and their paired normal colon tissues, obtained on oligonucleotide arrays; the expression levels of 6500 human genes are measured on each. For simplicity, we restrict attention to the 18 colon adenocarcinomas with only the first 256 measurements each. We examine whether the covariance matrix is spherical. The data give the observed test statistic values T = 8.500, $U_j$ = 284.567, and $T_{S1}$ = 270.582, each with p-value approximately 0; thus the hypothesis of sphericity is rejected at any reasonable significance level.

5. Conclusions

For testing the covariance matrix of high-dimensional data, our test statistic is proposed under the normality assumption, and it is approximated by a normal distribution. Numerical simulations indicate that our test statistic T in (2.6), constructed from the consistent estimators, accurately controls the size of the test, and its power improves as $(n,p)$ get large for $n < p$. Moreover, for testing sphericity of the covariance matrix, the test statistic gives higher power than the tests in Ledoit and Wolf (2002) and Srivastava (2005).

Acknowledgements

The authors would like to express their gratitude to the Commission on Higher Education (CHE) of Thailand for their financial support.


Table 2. The empirical power of T under four alternative hypotheses.

  p    n = N-1   H1^1: Σ=C1   H1^2: Σ=C2   H1^3: Σ=C3   H1^4: Σ=C4
 10      9        0.480        0.560        0.174        0.159
 40      9        0.996        0.617        0.265        0.286
 40     39        1.000        1.000        0.772        0.877
 80      9        1.000        0.624        0.300        0.330
 80     39        1.000        1.000        0.837        0.939
 80     79        1.000        1.000        0.998        1.000
160      9        1.000        0.625        0.319        0.346
160     39        1.000        1.000        0.866        0.966
160     79        1.000        1.000        0.999        1.000
160    159        1.000        1.000        1.000        1.000
320      9        1.000        0.629        0.342        0.361
320     39        1.000        1.000        0.891        0.977
320     79        1.000        1.000        1.000        1.000
320    159        1.000        1.000        1.000        1.000
320    319        1.000        1.000        1.000        1.000

Table 3. The ASL (under H0: Σ = 2I) and the empirical power (under H1: Σ = 2D) of T, Uj, and TS1 at nominal significance level α = 0.05.

                       ASL                     Empirical power
  p    n = N-1    T      Uj     TS1         T      Uj     TS1
 10      9       0.059  0.049  0.048      1.000  0.412  0.405
 40      9       0.055  0.054  0.051      1.000  0.368  0.360
 40     39       0.056  0.056  0.053      1.000  0.999  0.999
 80      9       0.057  0.057  0.053      1.000  0.356  0.348
 80     39       0.052  0.052  0.050      1.000  0.999  0.999
 80     79       0.052  0.051  0.050      1.000  1.000  1.000
160      9       0.053  0.056  0.054      1.000  0.354  0.346
160     39       0.055  0.056  0.055      1.000  0.999  0.999
160     79       0.055  0.057  0.055      1.000  1.000  1.000
160    159       0.053  0.052  0.052      1.000  1.000  1.000
320      9       0.052  0.055  0.052      1.000  0.352  0.343
320     39       0.052  0.054  0.052      1.000  0.999  0.999
320     79       0.050  0.050  0.050      1.000  1.000  1.000
320    159       0.051  0.050  0.050      1.000  1.000  1.000
320    319       0.053  0.053  0.053      1.000  1.000  1.000

References

Anderson, T.W. 1984. An Introduction to Multivariate Statistical Analysis, 2nd edition. Wiley, New York, U.S.A.

Billingsley, P. 1995. Probability and Measure, 3rd edition. Wiley, New York, U.S.A.

Fisher, T., Sun, X., and Gallagher, C.M. 2010. A new test for sphericity of the covariance matrix for high dimensional data. Journal of Multivariate Analysis. 101, 2554-2570.

John, S. 1971. Some optimal multivariate tests. Biometrika. 58, 123-127.

Ledoit, O., and Wolf, M. 2002. Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. Annals of Statistics. 30, 1081-1102.

Lehmann, E.L., and Romano, J.P. 2005. Testing Statistical Hypotheses, 3rd edition. Springer, New York, U.S.A.

Muirhead, R.J. 1982. Aspects of Multivariate Statistical Theory. Wiley, New York, U.S.A.

Nagao, H. 1973. On some test criteria for covariance matrix. Annals of Statistics. 1, 700-709.

Notterman, D.A., Alon, U., Sierk, A.J., and Levine, A.J. 2001. Transcriptional gene expression profiles of colorectal adenoma, adenocarcinoma, and normal tissue examined by oligonucleotide arrays. Cancer Research. 61, 3124-3130.

Rao, C.R. 1973. Linear Statistical Inference and Its Applications, 2nd edition. Wiley, New York, U.S.A.

Rencher, A.C. 2000. Linear Models in Statistics. Wiley, New York, U.S.A.

Srivastava, M.S. 2005. Some tests concerning the covariance matrix in high dimensional data. Journal of the Japan Statistical Society. 35, 251-272.

Appendix

Before proving Theorem 2.1, we need the following information and lemma. For a positive definite symmetric matrix $\Sigma$, the spectral decomposition gives $\Sigma = \Gamma\Lambda\Gamma'$, where $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$ with $\lambda_i$ the $i$th eigenvalue of $\Sigma$, and $\Gamma$ is an orthogonal matrix whose columns are the corresponding normalized eigenvectors $\gamma_1, \gamma_2, \ldots, \gamma_p$. Similarly, we can write $\Sigma_0 = RDR'$, where $D = \operatorname{diag}(d_1, d_2, \ldots, d_p)$ with $d_i$ the $i$th eigenvalue of $\Sigma_0$, and $R$ is an orthogonal matrix whose columns are the corresponding normalized eigenvectors $r_1, r_2, \ldots, r_p$ (Rencher, 2000).

Let $nS = YY' \sim W_p(\Sigma, n)$, where $Y = (y_1, y_2, \ldots, y_n)$ and the $y_j \sim N_p(0, \Sigma)$ are independent (Anderson, 1984, Section 3.3; Srivastava, 2005; Fisher et al., 2010). Let $U = (u_1, u_2, \ldots, u_n)$, where the $u_j$ are independently and identically distributed (iid) $N_p(0, I)$, so that we can write $Y = \Sigma^{1/2}U$ with $\Sigma^{1/2} = \Gamma\Lambda^{1/2}\Gamma'$. Define $W = \Gamma'U = (w_1, w_2, \ldots, w_p)'$, whose rows $w_i'$ are such that the $w_i$ are iid $N_n(0, I)$. Thus $v_{ii} = w_i'w_i$, $i = 1, \ldots, p$, are iid chi-squared random variables with $n$ degrees of freedom.

Lemma A.1. For $v_{ii} = w_i'w_i$ and $v_{ij} = w_i'w_j$ with $i \neq j$:

$E(v_{ii}^r) = n(n+2)\cdots(n+2r-2)$, $r = 1, 2, \ldots$,

$\operatorname{Var}(v_{ii}) = 2n$,

$\operatorname{Var}(v_{ii}^2) = 8n(n+2)(n+3)$,

$E(v_{ii} - n)^3 = 8n$,

$E(v_{ii} - n)^4 = 12n(n+4)$,

$E\left(v_{ii}^2 - n(n+2)\right)^4 = 3n(n+2)\left[272n^4 + O(n^3)\right]$,

$E(v_{ij}^2) = n$,

$E(v_{ij}^4) = 3n(n+2)$,

$E(v_{ii}v_{ij}^2) = n(n+2)$,

$E(v_{ii}^2v_{ij}^2) = n(n+2)(n+4)$,

$E(v_{ij}^2v_{ii}v_{jj}) = n(n+2)^2$.

Proof. The first six results can be found in Srivastava (2005) and the last five in Fisher et al. (2010).

As in similar proofs of Srivastava (2005), we can write $(1/p)\operatorname{tr}(\Sigma_0^{-1}S)$ and $(1/p)\operatorname{tr}(\Sigma_0^{-1}S)^2$ in terms of chi-squared random variables.
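The moment identities in Lemma A.1 can be spot-checked by simulation; a small illustrative sketch (not part of the paper):

```python
import numpy as np

# With w_i, w_j independent N(0, I_n) vectors, v_ii = w_i'w_i ~ chi^2_n
# and v_ij = w_i'w_j. The comments give the theoretical values.
rng = np.random.default_rng(0)
n, reps = 10, 200_000
wi = rng.standard_normal((reps, n))
wj = rng.standard_normal((reps, n))
vii = np.einsum('ij,ij->i', wi, wi)
vij = np.einsum('ij,ij->i', wi, wj)

print(vii.mean())               # theoretical value: E(v_ii) = n
print(vii.var())                # theoretical value: Var(v_ii) = 2n
print((vij ** 2).mean())        # theoretical value: E(v_ij^2) = n
print((vii * vij ** 2).mean())  # theoretical value: E(v_ii v_ij^2) = n(n+2)
```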

$$\hat a_1 = \frac{1}{p}\operatorname{tr}\left(\Sigma_0^{-1}S\right) = \frac{1}{p}\operatorname{tr}\left[(RDR')^{-1}\frac{1}{n}YY'\right] = \frac{1}{np}\sum_{i=1}^p \frac{\lambda_i}{d_i}v_{ii}. \qquad (A.1)$$

Similarly, we also have

$$\frac{1}{p}\operatorname{tr}\left(\Sigma_0^{-1}S\right)^2 = \frac{1}{n^2p}\sum_{i=1}^p \frac{\lambda_i^2}{d_i^2}v_{ii}^2 + \frac{1}{n^2p}\sum_{i\neq j}^p \frac{\lambda_i\lambda_j}{d_id_j}v_{ij}^2, \qquad (A.2)$$

where $v_{ij} = w_i'w_j$. We let

$$\hat a_2 = \frac{n^2}{(n-1)(n+2)}\,\frac{1}{p}\left[\operatorname{tr}\left(\Sigma_0^{-1}S\right)^2 - \frac{1}{n}\left(\operatorname{tr}\left(\Sigma_0^{-1}S\right)\right)^2\right]. \qquad (A.3)$$

Thus

$$\hat a_2 = \frac{n^2}{(n-1)(n+2)}\left[\frac{n-1}{n^3p}\sum_{i=1}^p\frac{\lambda_i^2}{d_i^2}v_{ii}^2 + \frac{1}{n^2p}\sum_{i\neq j}^p\frac{\lambda_i\lambda_j}{d_id_j}\left(v_{ij}^2 - \frac{1}{n}v_{ii}v_{jj}\right)\right] = \frac{n^2}{(n-1)(n+2)}\left[b_1 + b_2\right], \qquad (A.4)$$

where

$$b_1 = \frac{n-1}{n^3p}\sum_{i=1}^p\frac{\lambda_i^2}{d_i^2}v_{ii}^2, \qquad b_2 = \frac{1}{n^2p}\sum_{i\neq j}^p\frac{\lambda_i\lambda_j}{d_id_j}\left(v_{ij}^2 - \frac{1}{n}v_{ii}v_{jj}\right).$$

Proof of Theorem 2.1. Since

$$E(\hat a_1) = E\left[\frac{1}{np}\sum_{i=1}^p\frac{\lambda_i}{d_i}v_{ii}\right] = \frac{1}{np}\sum_{i=1}^p\frac{\lambda_i}{d_i}E(v_{ii}) = \frac{1}{p}\sum_{i=1}^p\frac{\lambda_i}{d_i} = \frac{1}{p}\operatorname{tr}\left(\Sigma_0^{-1}\Sigma\right) = a_1.$$

And from Lemma A.1 we easily find that $E\left(v_{ij}^2 - \frac{1}{n}v_{ii}v_{jj}\right) = 0$, so $E(b_2) = 0$. Thus

$$E(\hat a_2) = \frac{n^2}{(n-1)(n+2)}E(b_1) = \frac{n^2}{(n-1)(n+2)}\,\frac{n-1}{n^3p}\,n(n+2)\sum_{i=1}^p\frac{\lambda_i^2}{d_i^2} = \frac{1}{p}\operatorname{tr}\left(\Sigma_0^{-1}\Sigma\right)^2 = a_2.$$

This shows that both $\hat a_1$ and $\hat a_2$ are unbiased estimators of $a_1$ and $a_2$, respectively. To show that $\hat a_1$ and $\hat a_2$ are consistent, consider

$$\operatorname{Var}(\hat a_1) = \operatorname{Var}\left[\frac{1}{np}\sum_{i=1}^p\frac{\lambda_i}{d_i}v_{ii}\right] = \frac{1}{n^2p^2}\sum_{i=1}^p\frac{\lambda_i^2}{d_i^2}\operatorname{Var}(v_{ii}) = \frac{2}{np^2}\sum_{i=1}^p\frac{\lambda_i^2}{d_i^2} = \frac{2a_2}{np}. \qquad (A.5)$$
And since $\hat a_2 = \frac{n^2}{(n-1)(n+2)}\left[b_1 + b_2\right]$,

$$\operatorname{Var}(\hat a_2) = \left[\frac{n^2}{(n-1)(n+2)}\right]^2\left[\operatorname{Var}(b_1) + \operatorname{Var}(b_2) + 2\operatorname{COV}(b_1, b_2)\right].$$

$$\operatorname{Var}(b_1) = \frac{(n-1)^2}{n^6p^2}\operatorname{Var}\left[\sum_{i=1}^p\frac{\lambda_i^2}{d_i^2}v_{ii}^2\right] = \frac{(n-1)^2}{n^6p^2}\sum_{i=1}^p\frac{\lambda_i^4}{d_i^4}\operatorname{Var}(v_{ii}^2) = \frac{8(n-1)^2(n+2)(n+3)}{n^5p}a_4. \qquad (A.6)$$

$$\operatorname{Var}(b_2) = \frac{4}{n^4p^2}\sum_{i<j}^p\frac{\lambda_i^2\lambda_j^2}{d_i^2d_j^2}\,E\left(v_{ij}^2 - \frac{1}{n}v_{ii}v_{jj}\right)^2 = \frac{4(n-1)(n+2)}{n^4}\left(a_2^2 - \frac{1}{p}a_4\right), \qquad (A.7)$$

using, from Lemma A.1, $E\left(v_{ij}^2 - \frac{1}{n}v_{ii}v_{jj}\right)^2 = 3n(n+2) - \frac{2}{n}n(n+2)^2 + \frac{1}{n^2}\left(n(n+2)\right)^2 = 2(n-1)(n+2)$ and $\sum_{i<j}\lambda_i^2\lambda_j^2/(d_i^2d_j^2) = \frac{1}{2}\left[(pa_2)^2 - pa_4\right]$.

And since $E(b_2) = 0$,

$$\operatorname{COV}(b_1, b_2) = E(b_1b_2) = 0, \qquad (A.8)$$

because every cross term vanishes; for distinct $i$, $j$, $k$,

$$E\left[v_{ii}^2\left(v_{jk}^2 - \frac{1}{n}v_{jj}v_{kk}\right)\right] = n(n+2)\left[n - \frac{1}{n}\,n^2\right] = 0,$$

$$E\left[v_{ii}^2\left(v_{ik}^2 - \frac{1}{n}v_{ii}v_{kk}\right)\right] = n(n+2)(n+4) - \frac{1}{n}\,n(n+2)(n+4)\,n = 0. \qquad (A.9)$$

By (A.6)–(A.8), we then have

$$\operatorname{Var}(\hat a_2) = \frac{n^4}{(n-1)^2(n+2)^2}\left[\frac{8(n-1)^2(n+2)(n+3)}{n^5p}a_4 + \frac{4(n-1)(n+2)}{n^4}\left(a_2^2 - \frac{1}{p}a_4\right)\right] = \frac{4(2n^2+3n-6)}{n(n-1)(n+2)p}a_4 + \frac{4}{(n-1)(n+2)}a_2^2. \qquad (A.10)$$

Since $\hat a_1$ and $\hat a_2$ are unbiased estimators of $a_1$ and $a_2$, respectively, from (A.5), (A.10), and by applying Chebyshev's inequality, for any $\varepsilon > 0$ as $(n,p)\to\infty$,

$$P\left(|\hat a_1 - a_1| \ge \varepsilon\right) \le \frac{1}{\varepsilon^2}\operatorname{Var}(\hat a_1) = \frac{1}{\varepsilon^2}\,\frac{2a_2}{np} \to 0$$

and

$$P\left(|\hat a_2 - a_2| \ge \varepsilon\right) \le \frac{1}{\varepsilon^2}\operatorname{Var}(\hat a_2) = \frac{1}{\varepsilon^2}\left[\frac{4(2n^2+3n-6)}{n(n-1)(n+2)p}a_4 + \frac{4}{(n-1)(n+2)}a_2^2\right] \approx \frac{1}{\varepsilon^2}\left[\frac{8}{np}a_4 + \frac{4}{n^2}a_2^2\right] \to 0.$$

Hence $\hat a_1$ and $\hat a_2$ are unbiased and consistent estimators of $a_1$ and $a_2$, respectively. The proof is completed.

Proof of Theorem 2.2. From Theorem 2.1, we have

$$E(\hat a_1) = a_1, \qquad E(\hat a_2) = a_2. \qquad (A.11)$$

By Lemma A.1, with simple calculations along the lines of the proofs of Srivastava (2005), under assumptions (A) and (B) and as $(n,p)\to\infty$, we obtain

$$\operatorname{Var}(\hat a_1) = \frac{2a_2}{np}, \qquad (A.12)$$

$$\operatorname{Var}(b_1) = \frac{8(n-1)^2(n+2)(n+3)}{n^5p}a_4 \approx \frac{8a_4}{np}, \qquad (A.13)$$

$$\operatorname{Var}(b_2) = \frac{4(n-1)(n+2)}{n^4}\left(a_2^2 - \frac{a_4}{p}\right) \approx \frac{4c}{np}\left(a_2^2 - \frac{a_4}{p}\right), \qquad (A.14)$$

$$\operatorname{Var}(\hat a_2) = \frac{4(2n^2+3n-6)}{n(n-1)(n+2)p}a_4 + \frac{4}{(n-1)(n+2)}a_2^2 \approx \frac{4(2a_4 + ca_2^2)}{np}. \qquad (A.15)$$

Next,

$$\operatorname{COV}(\hat a_1, b_1) = E(\hat a_1b_1) - E(\hat a_1)E(b_1) = \frac{n-1}{n^4p^2}\sum_{i=1}^p\frac{\lambda_i^3}{d_i^3}\operatorname{Cov}(v_{ii}, v_{ii}^2) = \frac{4(n-1)(n+2)}{n^3p}a_3 \approx \frac{4a_3}{np}, \qquad (A.16)$$

using $\operatorname{Cov}(v_{ii}, v_{ii}^2) = E(v_{ii}^3) - E(v_{ii})E(v_{ii}^2) = n(n+2)(n+4) - n^2(n+2) = 4n(n+2)$.

From the fact that $E(b_2) = 0$ and an argument similar to that for $E(b_1b_2)$ in (A.8)–(A.9),

$$\operatorname{COV}(\hat a_1, b_2) = E(\hat a_1b_2) = \frac{1}{n^3p^2}\,E\left[\sum_{i=1}^p\frac{\lambda_i}{d_i}v_{ii}\sum_{i\neq j}^p\frac{\lambda_i\lambda_j}{d_id_j}\left(v_{ij}^2 - \frac{1}{n}v_{ii}v_{jj}\right)\right] = 0. \qquad (A.17)$$
Note that

$$E(\hat a_1\hat a_2) = \frac{n^2}{(n-1)(n+2)}\left[E(\hat a_1b_1) + E(\hat a_1b_2)\right] = \frac{n^2}{(n-1)(n+2)}E(\hat a_1b_1), \qquad (A.18)$$

by (A.17), since the expectation of the second term equals zero. Using (A.16) and $E(b_1) = \frac{(n-1)(n+2)}{n^2}a_2$, we obtain

$$E(\hat a_1\hat a_2) = \frac{n^2}{(n-1)(n+2)}\left[\frac{4(n-1)(n+2)}{n^3p}a_3 + a_1\,\frac{(n-1)(n+2)}{n^2}a_2\right] = \frac{4a_3}{np} + a_1a_2. \qquad (A.19)$$

By (A.11) and (A.19), as $(n,p)\to\infty$, we obtain

$$\operatorname{COV}(\hat a_1, \hat a_2) = E(\hat a_1\hat a_2) - E(\hat a_1)E(\hat a_2) = \frac{4a_3}{np}. \qquad (A.20)$$

To find the joint distribution of $\hat a_1$ and $\hat a_2$, we use the multivariate central limit theorem (Rao, 1973, p. 147) and the Lindeberg central limit theorem (Billingsley, 1995, p. 359). Since $\hat a_2 = \frac{n^2}{(n-1)(n+2)}\left[b_1 + b_2\right]$, we need the limiting distributions of $\hat a_1$, $b_1$, and $b_2$, each of which is asymptotically normal. We treat $\hat a_1$ and $b_1$ first, because both are functions of the $v_{ii}$; then $b_2$, because it is a function of the $v_{ij}$, $i \neq j$. Finally, the distribution of $\hat a_2$, a linear function of two asymptotically normal random variables, is obtained.

With $\lambda_i$ and $d_i$ as before, let

$$u_{1i} = \frac{\lambda_i(v_{ii} - n)}{d_i\sqrt{n}}, \qquad u_{2i} = \frac{\lambda_i^2\left(v_{ii}^2 - n(n+2)\right)}{d_i^2\sqrt{n(n+2)(n+3)}},$$

where

$$E(u_{1i}) = 0,\quad E(u_{2i}) = 0,\quad \operatorname{Var}(u_{1i}) = \frac{2\lambda_i^2}{d_i^2},\quad \operatorname{Var}(u_{2i}) = \frac{8\lambda_i^4}{d_i^4},\quad \operatorname{COV}(u_{1i}, u_{2i}) = \frac{4\lambda_i^3e_n}{d_i^3},$$

and $e_n = \sqrt{(n+2)/(n+3)} \to 1$ as $n\to\infty$. Since the $v_{ii}$ are independent, the $u_i = (u_{1i}, u_{2i})'$ are independently distributed random vectors, $i = 1, \ldots, p$, with $E(u_i) = 0$ and covariance matrices $\Sigma_{in}$ given by

$$\Sigma_{in} = \begin{pmatrix} 2\lambda_i^2/d_i^2 & 4\lambda_i^3e_n/d_i^3\\ 4\lambda_i^3e_n/d_i^3 & 8\lambda_i^4/d_i^4 \end{pmatrix}, \qquad i = 1, \ldots, p.$$

For any $n$, as $p\to\infty$,

$$\bar\Sigma_n = \frac{1}{p}\left(\Sigma_{1n} + \cdots + \Sigma_{pn}\right) = \begin{pmatrix} \dfrac{2}{p}\sum_i\dfrac{\lambda_i^2}{d_i^2} & \dfrac{4e_n}{p}\sum_i\dfrac{\lambda_i^3}{d_i^3}\\[6pt] \dfrac{4e_n}{p}\sum_i\dfrac{\lambda_i^3}{d_i^3} & \dfrac{8}{p}\sum_i\dfrac{\lambda_i^4}{d_i^4}\end{pmatrix} \to \Sigma_n^0 = \begin{pmatrix} 2a_2 & 4e_na_3\\ 4e_na_3 & 8a_4\end{pmatrix}.$$

If $F_i$ is the distribution function of $u_i$, then, from the $C_r$-inequality in Rao (1973, p. 149),

$$\frac{1}{p}\sum_{i=1}^p\int_{u'u \,>\, \varepsilon^2p}u'u\,dF_i \le \frac{1}{\varepsilon^2p^2}\sum_{i=1}^pE\left(u_i'u_i\right)^2 \le \frac{2}{\varepsilon^2p^2}\sum_{i=1}^pE\left(u_{1i}^4 + u_{2i}^4\right).$$

Since, as $p\to\infty$ and from Lemma A.1,

$$\frac{2}{\varepsilon^2p^2}\sum_{i=1}^pE\left(u_{1i}^4\right) = \frac{2}{\varepsilon^2p^2}\sum_{i=1}^p\frac{\lambda_i^4}{d_i^4n^2}\,12n(n+4) \to 0,$$

and by an analogous derivation $\frac{2}{\varepsilon^2p^2}\sum_{i=1}^pE\left(u_{2i}^4\right) \to 0$ as $p\to\infty$, the Lindeberg-type condition holds. Hence, by applying the multivariate central limit theorem, as $p\to\infty$ for any $n$,

$$\frac{1}{\sqrt p}\left(u_1 + u_2 + \cdots + u_p\right) = \begin{pmatrix}\dfrac{1}{\sqrt{np}}\sum_{i=1}^p\dfrac{\lambda_i(v_{ii} - n)}{d_i}\\[8pt] \dfrac{1}{\sqrt{np}}\sum_{i=1}^p\dfrac{\lambda_i^2\left(v_{ii}^2 - n(n+2)\right)}{d_i^2\sqrt{(n+2)(n+3)}}\end{pmatrix} \xrightarrow{D} N_2\left(0, \Sigma_n^0\right).$$

Note that as $n\to\infty$, $e_n \to 1$ and $\Sigma_n^0 \to \Sigma^0 = \begin{pmatrix}2a_2 & 4a_3\\ 4a_3 & 8a_4\end{pmatrix}$. Thus, it follows that as $(n,p)\to\infty$, and under assumption (A), which leads to $\Sigma^0 \to \Sigma^* = \begin{pmatrix}2a_2^0 & 4a_3^0\\ 4a_3^0 & 8a_4^0\end{pmatrix}$, the vector above converges in distribution to $N_2(0, \Sigma^*)$.

For the first element of the previous random vector, since

$$\frac{1}{\sqrt{np}}\sum_{i=1}^p\frac{\lambda_i(v_{ii} - n)}{d_i} = \frac{1}{\sqrt{np}}\left(np\,\hat a_1 - np\,a_1\right) = \sqrt{np}\left(\hat a_1 - a_1\right) \xrightarrow{D} N(0, 2a_2),$$

we then have

$$\hat a_1 \xrightarrow{D} N\left(a_1, \frac{2a_2}{np}\right). \qquad (A.21)$$

For the second element, since $\sum_{i=1}^p\lambda_i^2v_{ii}^2/d_i^2 = \frac{n^3p}{n-1}\,b_1$, we have

$$\frac{1}{\sqrt{np}}\left[\frac{n^3p\,b_1}{(n-1)\sqrt{(n+2)(n+3)}} - \frac{n(n+2)p\,a_2}{\sqrt{(n+2)(n+3)}}\right] \xrightarrow{D} N(0, 8a_4).$$

Since, as $n\to\infty$, the normalizing constants tend to those of $\sqrt{np}\left(b_1 - a_2\right)$, we have $\sqrt{np}\left(b_1 - a_2\right) \xrightarrow{D} N(0, 8a_4)$ also, and with a linear transformation we obtain the result that

$$b_1 \xrightarrow{D} N\left(a_2, \frac{8a_4}{np}\right). \qquad (A.22)$$

The next step is to find the distribution of $b_2$. Srivastava (2005) gave the important results, used for what follows, that $v_{ij}/\sqrt n \sim N(0,1)$ and $v_{ij}^2/n \sim \chi_1^2$ asymptotically as $n\to\infty$, and that these are asymptotically independently distributed for all distinct $i$ and $j$. With $b_2$ defined in (A.4), we now let

$$\xi_{ij} = \frac{2\lambda_i\lambda_j}{n^2p\,d_id_j}\left(v_{ij}^2 - \frac{1}{n}v_{ii}v_{jj}\right), \qquad i < j.$$

From Lemma A.1, $E(\xi_{ij}) = 0$, and

$$S_p^2 = \sum_{i<j}^p\operatorname{Var}(\xi_{ij}) = \frac{4}{n^4p^2}\sum_{i<j}^p\frac{\lambda_i^2\lambda_j^2}{d_i^2d_j^2}\operatorname{Var}\left(v_{ij}^2 - \frac{1}{n}v_{ii}v_{jj}\right) = \operatorname{Var}(b_2) \approx \frac{4c}{np}\left(a_2^2 - \frac{a_4}{p}\right)$$

as $(n,p)\to\infty$. Let $M_p = \sum_{i<j}^p\xi_{ij} = b_2$, and let $P_{ij}$ be the distribution function of $\xi_{ij}$. For $\varepsilon > 0$,

$$\frac{1}{S_p^2}\sum_{i<j}^p\int_{\xi^2 \,\ge\, \varepsilon^2S_p^2}\xi^2\,dP_{ij} \le \frac{1}{\varepsilon^2S_p^4}\sum_{i<j}^pE\left(\xi_{ij}^4\right) \to 0$$

as $p\to\infty$, since by Lemma A.1 each $E(\xi_{ij}^4) = O\left(1/(n^4p^4)\right)$ while $S_p^4 = O(1/n^4)$. Then it follows from the Lindeberg central limit theorem that

$$\frac{M_p}{S_p} = \frac{\sqrt{np}\;b_2}{2\sqrt{c\left(a_2^2 - a_4/p\right)}} \xrightarrow{D} N(0, 1),$$

and hence

$$b_2 \xrightarrow{D} N\left(0, \frac{4c}{np}\left(a_2^2 - \frac{a_4}{p}\right)\right). \qquad (A.23)$$



By (A.8), then b and b are asymptotically independent. Note that aˆ is a linear function of two random variables b 1 2 2

1

2 n and b that is, aˆ  [b  b ]  b1  b2 as n   . By (A.5), (A.15.), (A.22.), and (A.23.), then we have 2 ( n  1)( n  2) 1 2 2 aˆ

D

2





  N a2 , 4(2a4  ca22 ) / np .

(A.24)

From (A.20), COV ( aˆ1, aˆ2 )  4 a3 / np , (A.21), and (A.24), we have 4 a / np  a   2a2 / np   aˆ1  D 3 1   .    N ,   2  a   2 ˆ  a  4 a / np 4(2 a  ca ) / np  2  2   3  4 2

The proof is completed. Proof of Theorem 2.3. Note that our test statistic is ˆ  aˆ  2taˆ  t 2 and we have 2 1 ˆ aˆ 1

ˆ

 2t and

aˆ

1.

2

D By applying the delta method (Lehmann and Romano, 2005, p.436), thus, ˆ     N (0,  2 ) where



2

  2t

 2a2 / np

4a / np 3

  2t  4     (2t 2 a  4ta  2 a  ca 2 ) 2 3 4 2 2  4a / np 4(2 a  ca ) / np   1  np  3  4 2

1 

The proof is completed. 2 3 Proof of Corollary 2.1. Under H , a  t , a  t and a  t 4 Thus,  2  4ct 4 / np. It follows from Theorem 2.3. that 0 2 3 4 the null asymptotic distribution of T is N (0,1) . The proof is completed.