Bernoulli 7(3), 2001, 507-525

Minimax hypothesis testing about the density support

GHISLAINE GAYRAUD

Laboratoire de Statistique, CREST, Timbre J340, 3 avenue Pierre Larousse, 92241 Malakoff Cedex, France; and Laboratoire de Mathématiques Raphael Salem, UMR CNRS 6085, UFR des Sciences-Mathématiques, 76821 Mont St Aignan Cedex, France. E-mail: [email protected]

The paper is concerned with testing nonparametric hypotheses about the underlying support $G$ of independent and identically distributed observations. It is assumed that $G$ belongs to a class $\mathcal{G}$ of compact sets with smooth upper surface called boundary fragments. It is required to distinguish the simple null hypothesis specified by a known set $G_0$ in $\mathcal{G}$ against nonparametric alternatives that $G$ belongs to a class obtained by removing a certain neighbourhood of $G_0$ in $\mathcal{G}$. Using the asymptotic minimax approach, the problem is to determine the order of the smallest distance between the null hypothesis $H_0$ and the alternatives for which one is able to test the null hypothesis against the alternatives with a given summarized error.

Keywords: minimax rate of testing; nonparametric hypothesis testing; underlying support

1. Introduction

We observe $N$-dimensional independent and identically distributed random variables $X_1, \ldots, X_n$, uniformly distributed on an unknown set $G$. We assume that $\operatorname{leb}(G)$, the Lebesgue measure of $G$, is positive. We denote by $\Sigma_{N-1}(\gamma, L_1)$ the class of functions $g$ on $[0,1]^{N-1}$ having continuous partial derivatives up to order $k = \lfloor\gamma\rfloor$ ($k \in \mathbb{N}$ is the greatest integer strictly less than $\gamma$) and such that, for all $y, z \in [0,1]^{N-1}$,

$$|g(z) - p_y^{(g)}(z)| \le L_1 |z - y|^{\gamma},$$

where $p_y^{(g)}(z)$ denotes the Taylor polynomial of order $k$ for $g(\cdot)$ at a point $y$, and $|z|$ denotes the Euclidean norm of a vector $z$. We assume that the set $G \subset [0,1]^N$ is of the form

$$G = \{x = (x_1, \ldots, x_N) \in [0,1]^N : 0 \le x_N \le g(x_1, \ldots, x_{N-1})\},$$

where $g : [0,1]^{N-1} \to [0,1]$ is called the edge of $G$ (Korostelev and Tsybakov 1993a) and is a smooth function belonging to $\Sigma(\gamma, L_1, b_1)$, which is defined by

$$\Sigma(\gamma, L_1, b_1) = \{g \in \Sigma_{N-1}(\gamma, L_1) : b_1 < g(y) < 1 - b_1 \ \ \forall y \in [0,1]^{N-1}\},$$

where $\gamma \ge 1$ is real, $L_1$ is a positive constant and $b_1$ is a positive constant such that $0 < b_1 < 1/2$. We denote by $\mathcal{G}$ the class of sets whose edge belongs to $\Sigma(\gamma, L_1, b_1)$. Such sets are called boundary fragments (Korostelev and Tsybakov 1993b). The present paper studies the following hypothesis testing problem


concerning the support $G$: the null hypothesis is specified by a known fixed set $G_0$ in $\mathcal{G}$ and the alternatives are classes of sets, obtained by removing a certain $\psi_n$-neighbourhood of $G_0$ in $\mathcal{G}$, where $\psi_n$ is a sequence of positive numbers decreasing to zero with $n$.

In detail, we first let $d_\infty$ be the Hausdorff distance between two compact sets $G$ and $G'$, defined by

$$d_\infty(G, G') = \max\Big(\max_{x \in G} \rho(x, G'),\ \max_{x \in G'} \rho(x, G)\Big),$$

where $\rho(x, G)$ is the Euclidean distance between a point $x$ and a closed set $G$. We consider the problem of testing the simple hypothesis

$$H_0 : G = G_0$$

against the composite alternative

$$H_{A_{1,n}} : G \in A_{1,n}(\psi_n) = \{G \in \mathcal{G} : d_\infty(G, G_0) \ge \psi_n\}.$$

Since $\gamma \ge 1$, the Hausdorff distance $d_\infty$ between $G$ and $G_0$ is equivalent to the $L_\infty$-distance between the corresponding edge functions $g$ and $g_0$. Thus, $A_{1,n}(\psi_n)$ can be defined as the class of sets whose edge belongs to $\Sigma(\gamma, L_1, b_1)$ and is separated from $g_0$ in $L_\infty$-distance by $c\psi_n$, where $c$ is a positive constant.

Second, consider support functionals $S(G)$ defined by $S(G) = \int_G \varphi(x)\,dx$, where $\varphi$ is some known bounded positive function on $[0,1]^N$; also let $S_0 = S(G_0)$. The test we are interested in is

$$H_0 : G = G_0$$

against the composite alternative

$$H_{A_{2,n}} : G \in A_{2,n}(\psi_n) = \{G \in \mathcal{G} : |S(G) - S_0| \ge \psi_n\}.$$

Third, let $d_1(G, G')$ be the Lebesgue measure of the symmetric difference between two compact sets $G$ and $G'$. We wish to test

$$H_0 : G = G_0$$

against the composite alternative

$$H_{A_{3,n}} : G \in A_{3,n}(\psi_n) = \{G \in \mathcal{G} : d_1(G, G_0) \ge \psi_n\}.$$

Since the Lebesgue measure of the symmetric difference $d_1$ between $G$ and $G_0$ is equal to the $L_1$-distance between the corresponding edge functions $g$ and $g_0$, $A_{3,n}(\psi_n)$ can be defined as the class of sets in $\mathcal{G}$ whose edge is separated from $g_0$ in $L_1$-distance by $\psi_n$.

$A_{1,n}$, $A_{2,n}$ and $A_{3,n}$ are henceforth abbreviated as $A_n$ when it is convenient, and $d$ represents the distance which separates the alternatives from $G_0$. Note that $A_n$ is defined by three parameters: the class $\mathcal{G}$, $d$ and $\psi_n$. However, it can be shown (Ingster 1993a; 1993b; 1993c) that, given $\mathcal{G}$ and $d$, $\psi_n$ cannot be chosen arbitrarily. It turns out that if $\psi_n$ is too small, then it is not possible to test the hypothesis $H_0$ against $H_{A_n}$ with given summarized


errors of the first and the second kind. On the other hand, if $\psi_n$ is very large, such a test is possible; the problem is to find the smallest $\psi_n$ for which such a test is still possible and to indicate the corresponding test.

Let us give some precise definitions to solve these problems. Let $\Delta_n$ be a test statistic, that is, an arbitrary function with values 0, 1 which is measurable with respect to $X_1, \ldots, X_n$ and such that we accept $H_0$ if $\Delta_n = 0$ and reject $H_0$ if $\Delta_n = 1$. Set $R_0(\Delta_n) = P_{G_0}\{\Delta_n = 1\}$ for the error of the first kind and $R_1(\Delta_n, \psi_n) = \sup_{G \in A_n(\psi_n)} P_G\{\Delta_n = 0\}$ for the error of the second kind. The index $G$ means that the measure $P_G$ is generated by $X_1, \ldots, X_n$ uniformly distributed on $G$. The properties of the tests $\Delta_n$ are characterized by the sum of the two errors

$$R_0(\Delta_n) + R_1(\Delta_n, \psi_n).$$

Fix a number $\beta$ in $(0, 1)$. The sequence $\psi_n$ is called the minimax rate of testing (MRT) if the following two conditions hold:

• there exists $b > 0$ such that

$$\liminf_{n \to \infty} \inf_{\Delta_n} [R_0(\Delta_n) + R_1(\Delta_n, b\psi_n)] \ge \beta, \tag{1.1}$$

where the infimum is taken over all the tests $\Delta_n$;

• there exist a positive constant $b'$ and a test $\Delta_n$ such that

$$\limsup_{n \to \infty} [R_0(\Delta_n) + R_1(\Delta_n, b'\psi_n)] \le \beta. \tag{1.2}$$

Thus, the MRT $\psi_n$ is such that a meaningful test of $H_0$ is impossible if the distance between the null hypothesis and the alternative is smaller than $b\psi_n$, and is possible if the distance is greater than $b'\psi_n$. Clearly, $b' \ge b$.

The problem of nonparametric hypothesis testing is closely related to minimax nonparametric estimation problems with infinite-dimensional 'parameter' class. For a detailed review of this area, see Donoho et al. (1995) and the references therein. However, specific features of hypothesis testing problems have been discovered which do not occur in estimation problems. In particular, the minimax rates are different for the estimation problem and for the testing problem.

The study of nonparametric hypothesis testing was initiated by Ingster (1982), although some closely related ideas appeared in earlier papers by Burnashev (1979), Ibragimov and Has'minskii (1977) and Birgé (1983). Ingster (1982) considered hypothesis testing in the problem of signal detection where a certain function $f$, treated as a signal, is observed with noise: the null hypothesis in this case is $f \equiv 0$, that is, no signal is present. The problem of testing this null hypothesis against the alternative defined as the set of functions belonging to an ellipsoid in $L_2(0, 1)$ and separated from zero in the $L_2$-norm by a distance $K\psi_n$ ($K$ is a positive constant) was studied by Ingster (1982) and Ermakov (1990a). Another alternative defined as a Hölder class separated from zero in the $L_p$-norm, in the uniform norm and in a fixed point by a distance $K\psi_n$ was considered by Ingster (1986b). In addition, Ingster (1990) and Suslina (1993) investigated the case of an alternative defined as a set of functions belonging to an ellipsoid in $l_p$, $0 < p \le \infty$, lying outside the ball of radius $K\psi_n$ around zero. Lepski (1993) studied a slightly different


problem: that of finding the exact value $K\psi_n$, where $K$ is a positive constant depending on the smoothness parameter, for which relations (1.2) and (1.1) hold with alternative sets defined by the Hölder class separated from zero in the uniform norm and also in a fixed point; an extension of Lepski's (1993) result is given in Lepski and Tsybakov (1996). Lepski and Spokoiny (1999) studied the signal detection problem, considering a Besov ball as the alternative separated away from zero in the integral $L_p$-norm. Spokoiny (1996) extended the investigations to the problem of adaptive testing. This problem of adaptive testing is also studied in Spokoiny (1998).

Another widely studied example of hypothesis testing concerns the probability density; this can be formulated in the following way. Let $X_1, \ldots, X_n$ be independent and identically distributed random variables having an unknown probability density $f$. The null hypothesis is specified by a known density $f_0$ and is tested against several nonparametric alternatives such as ellipsoids in $L_2$ (Ingster 1984; Ermakov 1994), Hölder classes (Ingster 1986b), ellipsoids in $l_p$, $0 < p \le \infty$ (Ingster 1994), and Sobolev balls (Ingster 1986a). The reader is referred to Ingster (1993a; 1993b; 1993c) for a detailed review of nonparametric minimax hypothesis testing for both signal detection and density problems. Other problems of nonparametric hypothesis testing are studied using the minimax approach: see, for instance, Ermakov (1990b; 1990c), in which the objects of interest are respectively the spectral density and the distribution function, and recent papers (Ermakov 1996; Härdle et al. 1997; Spokoiny 1997; Feldmann et al. 1998; Baraud et al. 1999; Gayraud and Tsybakov 1999; Härdle and Kneip 1999; Pouet 2000) in which the nonparametric null hypothesis and alternatives are both composite.

Although our problem is a minimax hypothesis testing problem, it differs from those mentioned above in that it concerns a set (the underlying support) and not a function. As in signal detection (Ingster 1982), we show that the minimax rates are sometimes different for the estimation problem and for the hypothesis testing problem: the MRT is either the same as the minimax rate of estimation (MRE), or else the MRT improves the MRE obtained in the corresponding estimation problems.

Our study not only is of theoretical interest but also could be important in many applications. Indeed, the underlying support is an object of interest in several areas such as econometrics, cluster analysis and reliability theory. For example, the knowledge of the boundary of the density support allows the performance of an enterprise to be evaluated in terms of technical efficiency, and the underlying support can also be a useful tool in reliability theory for detecting abnormal system behaviour.

The paper is organized as follows. In Section 2, we present the test statistics for which relation (1.2) holds. The main results of this paper are stated in Section 3 and their proofs are given in Section 5. We prove that the MRT is either $(n/\log n)^{-\gamma/(\gamma+N-1)}$ for the alternative $A_{1,n}(\psi_n)$, or $n^{-[\gamma+(N-1)/2]/(\gamma+N-1)}$ when the alternative is defined by either $A_{2,n}(\psi_n)$ or $A_{3,n}(\psi_n)$. A comparison between the MRT and the MRE obtained in the corresponding estimation problems shows that they are equal when both alternative and error of estimation are defined by the $d_\infty$-distance and also by the absolute difference $|S(G) - S_0|$. When both alternative and estimation error are defined with the $d_1$ metric, it is interesting to note that the MRT and the MRE are different and, in particular, that the MRT improves the MRE. In this last case, the MRT corresponds to the MRE obtained in

the estimation problem of functionals of support such as $T(G) = \int_{[0,1]^{N-1}} |g(y) - g_0(y)|\,dy$ (Gayraud 1997). Section 4 is devoted to additional remarks and simulations.
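The two rates quoted in the introduction can be evaluated numerically to get a feel for their size. The following is a minimal sketch; the function names `rate_hausdorff` and `rate_functional` are illustrative choices and do not come from the paper.

```python
import math


def rate_hausdorff(n, gamma, N):
    """Rate (n / log n)^(-gamma / (gamma + N - 1)) stated for A_{1,n}."""
    return (n / math.log(n)) ** (-gamma / (gamma + N - 1))


def rate_functional(n, gamma, N):
    """Rate n^(-[gamma + (N - 1)/2] / (gamma + N - 1)) stated for A_{2,n}, A_{3,n}."""
    return n ** (-(gamma + (N - 1) / 2) / (gamma + N - 1))


if __name__ == "__main__":
    # Illustrative values only: smoothness gamma = 1, dimension N = 2.
    for n in (100, 1000, 10000):
        print(n, rate_hausdorff(n, 1, 2), rate_functional(n, 1, 2))
```

For fixed $\gamma$ and $N \ge 2$ the second rate is the smaller of the two, so the alternatives $A_{2,n}$ and $A_{3,n}$ can be detected at a smaller separation than $A_{1,n}$.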

2. Definition of the test statistics

Henceforth, let $\delta_n$ be a positive sequence and set $M = \delta_n^{-(N-1)}$. Without loss of generality, suppose that $M$ is an integer. Introduce a partition of $[0,1]^{N-1}$ into cubes $Q_q$, $q = 1, \ldots, M$, with edges of length $\delta_n$. For each $q$ in $\{1, \ldots, M\}$, denote by $u_q = (u_{q,1}, \ldots, u_{q,N-1}) \in [0,1]^{N-1}$ the centre of the cube $Q_q$.
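The partition above can be sketched in a few lines. This is an illustration only: the helper names `cube_centres` and `cube_index` are hypothetical, and the code assumes that $1/\delta_n$ is an integer, in line with the assumption that $M$ is an integer.

```python
import itertools


def cube_centres(delta, dim):
    """Centres u_q of the cubes of edge length delta partitioning [0, 1]^dim.

    Here dim plays the role of N - 1, so the number of cubes is delta^(-dim).
    """
    m = round(1 / delta)  # cubes per axis; assumes 1/delta is an integer
    axis = [(j + 0.5) / m for j in range(m)]
    return list(itertools.product(axis, repeat=dim))


def cube_index(y, delta):
    """Per-axis index of the cube containing the point y in [0, 1]^dim."""
    m = round(1 / delta)
    # The right end point 1.0 is assigned to the last cube.
    return tuple(min(int(c * m), m - 1) for c in y)


if __name__ == "__main__":
    print(len(cube_centres(0.25, 2)))      # number of cubes M for N - 1 = 2
    print(cube_index((0.3, 0.9), 0.25))
```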

2.1. Testing for supports in the Hausdorff distance

In this section, set $\delta_n = (n/\log n)^{-1/(\gamma+N-1)}$. The test statistic is based on $G_n$ and $g_n$, the estimates of $G$ and its edge $g$ proposed in Korostelev and Tsybakov (1993b); there, edge estimation is carried out separately on each cube $Q_q$, $q \in \{1, \ldots, M\}$, as a polynomial function, and thus the entire process of estimating $g$ is based on slicing and piecewise polynomial approximation; then, the set estimator $G_n$ is defined as the set

$$G_n = \{x = (y, x_N) \in [0,1]^N : 0 \le x_N \le g_n(y)\}. \tag{2.1}$$

This leads to the test statistic

$$\bar\Delta_{1,n} = \begin{cases} 0 & \text{if } d_\infty(G_n, G_0) < C_1\delta_n^{\gamma}, \\ 1 & \text{if } d_\infty(G_n, G_0) \ge C_1\delta_n^{\gamma}, \end{cases}$$

where $C_1 > (N-1)/(\gamma+N-1)$ is a constant.

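As an illustration of the decision rule of Section 2.1, the sketch below computes the Hausdorff distance between two finite point clouds and applies the threshold $C_1\delta_n^{\gamma}$. It is a simplification under stated assumptions: the sets are replaced by finite discretizations supplied by the caller, the names `hausdorff` and `delta_1` are hypothetical, and the estimator $G_n$ of Korostelev and Tsybakov (1993b) is not implemented.

```python
import math


def hausdorff(A, B):
    """Hausdorff distance between two non-empty finite point sets."""
    def one_sided(P, Q):
        # max over p in P of the Euclidean distance from p to the set Q
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(one_sided(A, B), one_sided(B, A))


def delta_1(Gn_points, G0_points, C1, n, gamma, N):
    """Return 1 (reject H0) if d(G_n, G_0) >= C1 * delta_n^gamma, else 0."""
    delta_n = (n / math.log(n)) ** (-1.0 / (gamma + N - 1))
    threshold = C1 * delta_n ** gamma
    return 1 if hausdorff(Gn_points, G0_points) >= threshold else 0


if __name__ == "__main__":
    A = [(0.0, 0.0), (1.0, 0.0)]
    B = [(0.0, 0.0)]
    print(hausdorff(A, B), delta_1(A, B, C1=1.0, n=100, gamma=1, N=2))
```

The quadratic cost of the nested loops is acceptable for a sketch; a finer discretization would call for a spatial index.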
2.2. Tests for functionals of density supports

In this section, set $\delta_n = n^{-1/(\gamma+N-1)}$. This problem is related to the minimax estimation of the functionals $S(G)$ (Gayraud 1997). We first define an estimator $S_n$ of $S(G)$: divide the whole sample $X_1, \ldots, X_n$ into three subsamples $J_1 = \{X_i, i \in I_1\}$, $J_2 = \{X_i, i \in I_2\}$, $J_3 = \{X_i, i \in I_3\}$ such that $I_1 \cup I_2 \cup I_3 = \{1, \ldots, n\}$ and $\operatorname{card} J_1 = \operatorname{card} J_2 = \operatorname{card} J_3 = n/3$. Without loss of generality, suppose that $n/3$ is an integer. We transform the estimator $G_n$ defined in (2.1) as follows: instead of using the original sample, we consider another sample $X'_1, \ldots, X'_n$, obtained by a transformation $\mathcal{T}$ of $X_1, \ldots, X_n$; this allows us to construct an estimator $G_n$ included in the true support $G$ almost surely (the proof and the transformation $\mathcal{T}$ are given in Gayraud 1997). Then, let $J'_1$ and $J'_3$ be the samples obtained by transformation $\mathcal{T}$ of $J_1$ and $J_3$, and denote by $G_{n,J'_r}$, $g_{n,J'_r}$ and $\bar G_{n,J'_r}$ the estimator of $G$, the estimator of $g$, and the complement to $G_{n,J'_r}$ in $[0,1]^N$, respectively; all of these are based on $J'_r$, for all $r \in \{1, 3\}$. The estimator $S_n$ of $S(G)$ is defined by equation (2.2), where $\ell_{n,J'_3}$, which is based on $J'_3$, is the estimator of $\operatorname{leb}(G)$ defined in Korostelev and Tsybakov (1993b, Lemma 4). We can now define the test statistic

$$\bar\Delta_{2,n} = \begin{cases} 0 & \text{if } (S_n - S_0)^2 < C_2\delta_n^{\gamma}/n, \\ 1 & \text{if } (S_n - S_0)^2 \ge C_2\delta_n^{\gamma}/n, \end{cases}$$

where $C_2$ is a positive constant.
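The sample splitting and the decision rule of Section 2.2 can be sketched as follows. The sketch assumes that an estimate $S_n$ is supplied by the caller, since the estimator itself is not reproduced here; the random shuffle used to form the subsamples and the names `split_three` and `delta_2` are assumptions made for illustration.

```python
import random


def split_three(sample, seed=0):
    """Split a sample into three disjoint subsamples of equal size.

    Assumes len(sample) is a multiple of 3; the shuffle is an illustrative
    choice, any partition into equal parts would serve.
    """
    idx = list(range(len(sample)))
    random.Random(seed).shuffle(idx)
    m = len(sample) // 3
    return [[sample[i] for i in idx[r * m:(r + 1) * m]] for r in range(3)]


def delta_2(S_n, S_0, C2, n, gamma, N):
    """Return 1 (reject H0) if (S_n - S_0)^2 >= C2 * delta_n^gamma / n, else 0."""
    delta_n = n ** (-1.0 / (gamma + N - 1))
    threshold = C2 * delta_n ** gamma / n
    return 1 if (S_n - S_0) ** 2 >= threshold else 0


if __name__ == "__main__":
    J1, J2, J3 = split_three(list(range(9)))
    print(J1, J2, J3)
    print(delta_2(S_n=0.9, S_0=0.5, C2=1.0, n=100, gamma=1, N=2))
```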

2.3. Tests for supports in the $d_1$-distance

In this section, set $\delta_n = n^{-1/(\gamma+N-1)}$. Although this problem is related to density support estimation, the construction of the test is based on an estimator $T_n$ of $T(G) = \int_{[0,1]^{N-1}} |g(y) - g_0(y)|\,dy$. We first divide the whole sample $X_1, \ldots, X_n$ into two subsamples $J_1$, $J_2$ and then divide each subsample $J_q$ into three sub-subsamples $J_{q,1}$, $J_{q,2}$, $J_{q,3}$, $q = 1, 2$. Without loss of generality, we suppose that $n/6$ is an integer and that each sub-subsample $J_{q,1}$, $J_{q,2}$, $J_{q,3}$, $q = 1, 2$, contains $n/6$ observations. Moreover, define $J'_{q,r}$, $q = 1, 2$ and $r \in \{1, 3\}$, the sample obtained by the transformation $\mathcal{T}$ of $J_{q,r}$ as in Section 2.2. The functional $T(G)$ can be written as the sum of two terms:

$$T(G) = \int (g(y) - g_0(y))\, I\{g(y) \ge g_0(y)\}\,dy + \int (g_0(y) - g(y))\, I\{g(y) < g_0(y)\}\,dy = t_1 + t_2.$$

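Then define the statistic $T_n$ by (2.3), where

$$\hat t_2 = \int_{[0,1]^N} I\{x \in \bar G_{n,J'_{2,1}} \cap G_0\}\,dx - \frac{1}{\operatorname{card} J_{2,2}} \sum_{i : X_i \in J_{2,2}} I\{X_i \in \bar G_{n,J'_{2,1}} \cap G_0\}\, \ell_{n,J'_{2,3}},$$

where $G_{n,J'_{q,1}}$ is the set estimator defined as in Section 2.2, which is based on $J'_{q,1}$ for $q = 1, 2$, $\bar G_{n,J'_{q,1}}$ is its complement in $[0,1]^N$, and $\ell_{n,J'_{q,3}}$ is the $\operatorname{leb}(G)$-estimator defined in Korostelev and Tsybakov (1993b, Lemma 4) and based on $J'_{q,3}$ for $q = 1, 2$. Then, our test statistic is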

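$$\bar\Delta_{3,n} = \begin{cases} 0 & \text{if } |T_n| < C_3(\delta_n^{\gamma}/n)^{1/2}, \\ 1 & \text{if } |T_n| \ge C_3(\delta_n^{\gamma}/n)^{1/2}, \end{cases}$$

where $C_3$ is a positive constant.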
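The idea behind an estimator of $t_2$ of the kind used in Section 2.3 can be checked on a toy case. For a set $A$ and a sample uniform on $G$, the quantity $\operatorname{leb}(A) - \operatorname{leb}(G)\cdot\hat P(X \in A)$ estimates $\operatorname{leb}(A \setminus G)$. The sketch below assumes $N = 2$, constant edges, the true value of $\operatorname{leb}(G)$ in place of its estimate, and a fixed inner set in place of the set estimator; all names and numerical values are illustrative.

```python
import random


def t2_hat(sample, g_hat, g0, leb_G):
    """Plug-in estimate of t_2 for constant edges in dimension N = 2.

    A is the part of G_0 lying above the inner set with edge g_hat,
    that is A = {g_hat < x2 <= g0}, so leb(A) = max(g0 - g_hat, 0).
    """
    leb_A = max(g0 - g_hat, 0.0)
    frac_in_A = sum(1 for (_, x2) in sample if g_hat < x2 <= g0) / len(sample)
    return leb_A - frac_in_A * leb_G


def demo(seed=0, m=20000, g=0.4, g_hat=0.35, g0=0.6):
    """Sample uniformly on G = [0, 1] x [0, g] and estimate t_2 = g0 - g."""
    rng = random.Random(seed)
    sample = [(rng.random(), g * rng.random()) for _ in range(m)]
    return t2_hat(sample, g_hat, g0, leb_G=g)


if __name__ == "__main__":
    print(demo())  # should be close to g0 - g = 0.2
```

The correction term removes the part of $A$ that lies inside $G$, which is why an inner set (one contained in $G$) is convenient here.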


3. Main results

Theorem 3.1. Let alternatives be defined by the set $A_{1,n}(\psi_n)$, where $\psi_n = (n/\log n)^{-\gamma/(\gamma+N-1)}$. There exist positive constants $b_2$ and $b_3$ for which the following relations hold:

$$\liminf_{n \to \infty} \inf_{\Delta_n} [R_0(\Delta_n) + R_1(\Delta_n, b_2\psi_n)] = 1, \tag{3.1}$$

$$\limsup_{n \to \infty} [R_0(\bar\Delta_{1,n}) + R_1(\bar\Delta_{1,n}, b_3\psi_n)] = 0, \tag{3.2}$$

where $\inf_{\Delta_n}$ denotes the infimum over all test statistics.

Remark 3.1. Note that $\psi_n$ corresponds to the MRE obtained in Korostelev and Tsybakov (1993b) for density support estimation with the Hausdorff distance for the class of boundary fragments.

Remark 3.2. In Theorem 3.1, the right-hand side of each relation does not depend on $\beta$ and therefore (3.1) and (3.2) are satisfied for any $\beta \in [0, 1]$. This is connected with the fact that the limiting distribution arising here is singular.

Theorem 3.2. Let alternatives be defined by the set $A_{2,n}(\psi_n)$, where $\psi_n = (\delta_n^{\gamma}/n)^{1/2}$ and $\delta_n = n^{-1/(\gamma+N-1)}$.

(i) Assume that $\varphi$ is an integrable function on $[0,1]^N$ such that $|\varphi|$ is greater than a positive constant on some closed $N$-interval contained in $[0,1]^{N-1} \times [b_1, 1 - b_1]$. Then, there exist positive constants $b_4$ and $\beta^* < 1$ such that, for all $\beta < \beta^*$, the following inequality holds:

$$\liminf_{n \to \infty} \inf_{\Delta_n} [R_0(\Delta_n) + R_1(\Delta_n, b_4\psi_n)] \ge \beta, \tag{3.3}$$

where $\inf_{\Delta_n}$ denotes the infimum over all possible test statistics.

(ii) Assume that $\varphi$ is continuous and bounded on $[0,1]^N$. Then, there exists some positive constant $b_5$ such that

$$\limsup_{n \to \infty} [R_0(\bar\Delta_{2,n}) + R_1(\bar\Delta_{2,n}, b_5\psi_n)] \le \beta. \tag{3.4}$$

Remark 3.3. Note that $\psi_n$ is equal to the MRE obtained in Gayraud (1997) for the estimation of functionals of the density support for the class of boundary fragments.

Theorem 3.3. Let alternatives be defined by the set $A_{3,n}(\psi_n)$, where $\psi_n = (\delta_n^{\gamma}/n)^{1/2}$ and $\delta_n = n^{-1/(\gamma+N-1)}$.

(i) There exist positive constants $b_6$ and $\beta^* < 1$ such that, for all $\beta < \beta^*$, we have

$$\liminf_{n \to \infty} \inf_{\Delta_n} [R_0(\Delta_n) + R_1(\Delta_n, b_6\psi_n)] \ge \beta, \tag{3.5}$$

where $\inf_{\Delta_n}$ denotes the infimum over all possible test statistics.

(ii) There exists some positive constant $b_7$ such that

$$\limsup_{n \to \infty} [R_0(\bar\Delta_{3,n}) + R_1(\bar\Delta_{3,n}, b_7\psi_n)] \le \beta. \tag{3.6}$$

Remark 3.4. In this case, the MRT $\psi_n$ improves the MRE obtained for the estimation of the density support when the error is defined with the $d_1$-metric (Gayraud 1997).
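The rate in Theorems 3.2 and 3.3 is written as $(\delta_n^{\gamma}/n)^{1/2}$, while Section 1 quotes it as a power of $n$. The short computation below, which uses only $\delta_n = n^{-1/(\gamma+N-1)}$, checks that the two forms agree and compares $\psi_n$ with $\delta_n^{\gamma}$.

```latex
\begin{align*}
\frac{\delta_n^{\gamma}}{n}
  &= n^{-\gamma/(\gamma+N-1)}\, n^{-1}
   = n^{-(2\gamma+N-1)/(\gamma+N-1)}, \\
\psi_n = \Big(\frac{\delta_n^{\gamma}}{n}\Big)^{1/2}
  &= n^{-[\gamma+(N-1)/2]/(\gamma+N-1)}, \\
\frac{\psi_n}{\delta_n^{\gamma}}
  &= n^{-(N-1)/[2(\gamma+N-1)]} \longrightarrow 0
  \quad (n \to \infty,\ N \ge 2).
\end{align*}
```

So $\psi_n$ is smaller than $\delta_n^{\gamma}$ by a polynomial factor whenever $N \ge 2$.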

4. Additional remarks and simulations

4.1. Remarks

4.1.1. The case of an unknown probability density

The results of this paper can be generalized to a more general class of densities than the uniform density. Let $\mathcal{F}_G$ be the class of densities whose underlying support is $G$ such that

$$\mathcal{F}_G = \{f \in \mathcal{F}(a_0, G) : f \text{ has continuous partial derivatives up to order } l - 1 \text{ in } \operatorname{Int}(G) \text{ and } |f(x) - p_v^{(f)}(x)| \le Q_L |x - v|^{l},\ \forall x \in G,\ \forall v \in \operatorname{Int}(G)\},$$

where $p_v^{(f)}(x)$ is the Taylor polynomial of $f$ of order $l - 1$ at the point $v \in \operatorname{Int}(G)$, $Q_L$ is a positive constant, $\operatorname{Int}(G)$ denotes the interior of $G$, $l$ is a positive integer and the class $\mathcal{F}(a_0, G) = \{f \text{ defined on } [0,1]^N : f(x) \ge a_0 > 0,\ \forall x \in G, \text{ and } f(x) = 0,\ \forall x \notin G\}$, where $a_0 > 0$ is a given constant. In this case, one defines a density estimator as a kernel estimator with kernel $K$ (for its construction, see Gayraud 1997, Section 2.3) in place of the estimator of $\operatorname{leb}(G)$ used in the definition of both $S_n$ (2.2) and $T_n$ (2.3). Some assumptions on $K$ and $l$ (for details, see Gayraud 1997, Section 3.1) allow one to consider the probability density as a nuisance parameter. Then one obtains Theorems 3.2 and 3.3, since Theorem 3.1 remains valid.

4.1.2. Lower bounds

In nonparametric estimation problems such as regression or density estimation, the minimax lower bounds prove that the rates of convergence obtained for some estimators cannot be improved by any other estimators. In hypothesis testing problems, the lower bound relations (1.1) prove that relation (1.2) cannot hold with $\psi'_n = o(\psi_n)$ in place of $b'\psi_n$, that is, one cannot successfully distinguish the null hypothesis from alternatives that are much closer than $\psi_n$ to $H_0$ in $d$-distance. The difficulty in proving lower bounds in hypothesis testing problems lies in the construction of the parametric family, which must be included in the whole class $\mathcal{G}$, as in nonparametric estimation problems, but which must also be separated from the null hypothesis by $\psi_n$ in $d$-distance. This is achieved by randomizing the alternative classes of sets.

4.1.3. The exact separation constant

A possible extension of our results would be to provide $b$, the exact separation constant (ESC), for which (1.1) holds for all $b'' < b$ and (1.2) holds for all $b'' > b$. The ESC is known in several problems, in particular for functional classes and distances $d$ defined in a coordinate form: ellipsoids in $l_p$ in the density model in Ingster (1994) and in the signal detection problem in Suslina (1993). For the classes defined in functional form such as Hölder or Sobolev classes, with $d$ defined by the $L_p$-norm, much less is known about the exact asymptotics: to our knowledge, the ESC is known only in the signal detection problem in Lepski (1993) for the Hölder class with smoothness parameter less than 1 and the $L_\infty$-norm as the distance $d$, and in Lepski and Tsybakov (1996) and in Pouet (1999) for the Hölder and Sobolev classes and for analytical alternatives, respectively; in both papers $d$ is defined by the supremum norm and by the distance in a fixed point. In our framework, a study of the ESC would be a non-trivial matter requiring further investigation and a paper in its own right.

4.2. Simulations

In this subsection we illustrate our theoretical results by comparing the errors of the second kind obtained under some alternative classes of sets which are separated from $H_0$ by $a$, with the distance $d_\infty$ and the distance based on the functional $S(\cdot)$. For this comparison, we consider the particular case of: $N = 2$; the null hypothesis $G_0 = \{x = (x_1, x_2) \in [0,1]^2 : 0 \le x_2 \le g_0(x_1)\}$; $S(G) = \int_G dx = \operatorname{leb}(G)$; distances $d_\infty(G, G_0)$, $|S(G) - S_0|$ used to separate $G_0$ and sets $G$ belonging to the alternative; and two forms of alternative class, defined by

$$W_1(a) = \{G : G = \{(x_1, x_2) : 0 \le x_2 \le g_0(x_1) + a\}\},$$

$$W_2(a) = \{G : G = \{(x_1, x_2) : 0 \le x_2 \le g_0(x_1) + K(a, d)\, a \sin(x_1/a)\}\},$$

where $a$ and $K(a, d)$ are positive constants. The constant $K(a, d)$ is chosen such that $d(G, G_0) \ge a$, for all $G \in W_2(a)$. Since the alternative hypotheses are composite, we take $a$ as varying inside a set $\mathcal{A}$ defined by $\mathcal{A} = \{0.03, 0.05, 0.07, 0.09, 0.1, 0.12\}$. Then we calculate $R_1^G(\bar\Delta_{1,n}, a) = P_G(\bar\Delta_{1,n} = 0)$ and $R_1^G(\bar\Delta_{2,n}, a) = P_G(\bar\Delta_{2,n} = 0)$, where $G$ belongs to either $W_1(a)$ or $W_2(a)$ and $a$ is in $\mathcal{A}$ (Table 1).

The first step of these calculations is to compute the test statistics $\bar\Delta_{1,n}$ and $\bar\Delta_{2,n}$; this is done following the theoretical procedure given in Korostelev and Tsybakov (1993b) and in Gayraud (1997), respectively. The second step consists in using the Monte Carlo method with 10 000 replications to approximate each $R_1^G(\bar\Delta_{1,n}, a)$ and $R_1^G(\bar\Delta_{2,n}, a)$ for $G$ in $W_1(a) \cup W_2(a)$ and $a$ in $\mathcal{A}$. Since our theoretical results are asymptotic, our calculations are done with different values of $n$, that is, $n \in \{100, 250, 500, 750, 1000\}$. Furthermore, the value $a = 0$ leads us to evaluate the error of the first kind since $R_1^G(\bar\Delta_{q,n}, 0) = 1 - R_0(\bar\Delta_{q,n})$, $q \in \{1, 2\}$. One must note that for each distance $d$, the theoretical error of the second kind defined in Section 1 is the maximal error over the class of sets $G$ in $W_1(a) \cup W_2(a)$ and the set of $a$. The presence of two different forms of alternative classes is meant to show that the simulation results do not depend on the choice of one particular form.
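The Monte Carlo step described above can be sketched on a toy version of the class $W_1(a)$. The sketch does not reproduce the tests $\bar\Delta_{1,n}$ or $\bar\Delta_{2,n}$: it assumes a constant edge $g_0 = 0.5$ (an illustrative value, not taken from the paper) and a deliberately simple test that rejects $H_0$ as soon as one observation lies above $g_0$. For this toy test the error of the second kind has the closed form $(g_0/(g_0+a))^n$, which gives a check on the simulation loop.

```python
import random


def type2_error_mc(n, a, g0=0.5, reps=2000, seed=0):
    """Monte Carlo estimate of P_G(test accepts H0) for G in the toy W_1(a).

    Under G the second coordinate is uniform on [0, g0 + a]; the toy test
    accepts H0 when every observation satisfies x2 <= g0.
    """
    rng = random.Random(seed)
    accepted = 0
    for _ in range(reps):
        max_x2 = max((g0 + a) * rng.random() for _ in range(n))
        if max_x2 <= g0:
            accepted += 1
    return accepted / reps


def type2_error_exact(n, a, g0=0.5):
    """Closed-form error of the second kind of the toy test."""
    return (g0 / (g0 + a)) ** n


if __name__ == "__main__":
    for a in (0.03, 0.05, 0.07):
        print(a, type2_error_mc(20, a), type2_error_exact(20, a))
```

Replacing the toy test by an implementation of the actual statistics, and the single set by a loop over $W_1(a) \cup W_2(a)$ and $a \in \mathcal{A}$, would give the structure of the experiment reported in Table 1.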

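Table 1. Values of $R_1^G(\bar\Delta_{q,n}, a)$, $q \in \{1, 2\}$, for $G \in W_1(a) \cup W_2(a)$, $a \in \mathcal{A}$ and $n \in \{100, 250, 500, 750, 1000\}$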

0.8085 0.8593 0.0867 0.86 0.8688 0.8877 0.083 0.9879 0.0031 0.9543 0.9699 0.8824 0.003 0.0042 0.067 0.5673 0.8648 0.5986 0.0082 0.2654 0.0415 0.8598 0.9228 0.9323 0.8064 0.181 0.9441 0.5849 0.4318 = 500 0.013 0.0771 0.0874 0.9581 0.8798 0.3652 0.0112 0.0062 0.0026 0.6457 0.5214 0.2465 0.2294 0.9692 0.0684 0.9417 0.3675 0.1286 0.9641 00.6488 n0.9952 0.9849 0.0001 0.0002 n =250 750 1000 100 0.8987 Wj(a)


First, note that in all cases the error of the second kind decreases as $n$ increases. Second, if we fix a real number $y$ in $(0, 1)$ as an upper bound for the error of the second kind, the calculation of the error of the second kind in both tests gives the smallest value of $a$ for which one can distinguish the alternatives from $H_0$; for example, for $n = 500$ and $y = 0.1$, the corresponding $a$ for the first test is 0.05 and that for the second test is 0.03. Furthermore, without fixing an upper bound $y$ and for large values of $n$, say $n \ge 500$, one must note that the errors of the second kind are always smaller when the distance $d$ is based on the functional $S(G)$ than for the $d_\infty$-distance: this agrees with the theoretical results.

5. Proofs

5.1. Proof of Theorem 3.1

Let $\delta_n = (n/\log n)^{-1/(\gamma+N-1)}$.

5.1.1. Proof of (3.1)

Introduce a partition of $[0,1]^{N-1}$ into $M$ cubes $Q_q$, $q = 1, \ldots, M = (b_2^{1/\gamma}\delta_n)^{-(N-1)}$, with edges of length $b_2^{1/\gamma}\delta_n$. Assume without loss of generality that $M$ is an integer. Let $\eta$ be a function such that $\eta$ is $C^\infty$, $\eta(t) \le 1$ if $t \in [-\tfrac12, \tfrac12]^{N-1}$, $\eta(t) = 0$ if $t \notin [-\tfrac12, \tfrac12]^{N-1}$, $\sup_t |\eta^{(k+1)}(t)| \le L_1$ where $k = \lfloor\gamma\rfloor$, and write $\eta^* = \sup_{t \in [-\frac12, \frac12]^{N-1}} \eta(t) < \infty$ and $\bar\eta = \int_{[-\frac12, \frac12]^{N-1}} \eta(t)\,dt$. For $q = 1, \ldots, M$, define the sets

$$G_0 = \{x = (x_N, y) \in [0,1]^N : 0 \le x_N \le g_0(y)\},$$

$$G_q = \{x = (x_N, y) \in [0,1]^N : 0 \le x_N \le g_q(y)\},$$

$$G^*_q = \{x = (x_N, y) \in [0,1]^N : g_q(y) \le x_N \le g_0(y)\},$$

where $g_q(y) = g_0(y) - b_2\delta_n^{\gamma}\,\eta((y - u_q)/(b_2^{1/\gamma}\delta_n))$. Denote by $\mathcal{G}_M$ the class of sets $G_q$ for all $q = 1, \ldots, M$: it is clear that $\mathcal{G}_M$ is included in $A_{1,n}(b_2\delta_n^{\gamma})$. Set $\xi_q^{(n)} = (dP_q/dP_0)(X_1, \ldots, X_n)$. Then, for any decision rule $\Delta_n$,

$$P_{G_0}[\Delta_n = 1] + \sup_{G \in A_{1,n}(b_2\delta_n^{\gamma})} P_G[\Delta_n = 0] \ge \frac{1}{M}\sum_{q=1}^{M} \big(P_0[\Delta_n = 1] + P_q[\Delta_n = 0]\big) \ge (1 - \varepsilon)\, P_0\Big[\frac{1}{M}\sum_{q=1}^{M} \xi_q^{(n)} \ge (1 - \varepsilon)\Big],$$

where $P_G$, $P_0$ and $P_q$ respectively denote the probability distribution of the data when they are uniformly distributed on $G$, $G_0$ and $G_q$. The last inequality holds for any positive real $\varepsilon < 1$. Under $H_0$, and since $\operatorname{leb} G_q = \operatorname{leb} G_{q'}$ for all $q, q' = 1, \ldots, M$, we have


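$$\xi_q^{(n)} = \prod_{i=1}^{n} \frac{\operatorname{leb} G_0\; I\{X_i \in G_q\}}{\operatorname{leb} G_q\; I\{X_i \in G_0\}} = \Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{n} \prod_{i=1}^{n} I\{X_i \in G_q\},$$

where $\operatorname{leb} G^* = \operatorname{leb} G_q$ for all $q = 1, \ldots, M$. Set $Z_M = (1/M)\sum_{q=1}^{M} \xi_q^{(n)}$. If, for all $\varepsilon < 1$, $P_0[|Z_M - 1| \ge \varepsilon] \to 0$ as $n \to \infty$, then $P_0[Z_M \ge (1 - \varepsilon)] \to 1$ as $n \to \infty$. Note that $E_0[Z_M] = 1$ and that

$$E_0[Z_M^2] = \frac{1}{M^2}\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{2n} \Big( E_0\Big[\sum_{q=1}^{M} I\{X_i \in G_q,\ \forall i = 1, \ldots, n\}\Big] + E_0\Big[\sum_{q \ne q'} I\{X_i \in G_q \cap G_{q'},\ \forall i = 1, \ldots, n\}\Big] \Big).$$

Set

$$T_1 = \frac{1}{M^2}\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{2n} E_0\Big[\sum_{q=1}^{M} I\{X_i \in G_q,\ \forall i = 1, \ldots, n\}\Big]$$

and note that

$$T_1 = \frac{1}{M}\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{n}.$$

Also set

$$T_2 = \frac{1}{M^2}\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{2n} E_0\Big[\sum_{q \ne q'} I\{X_i \in G_q \cap G_{q'},\ \forall i = 1, \ldots, n\}\Big]$$

and note that

$$T_2 = \frac{1}{M^2}\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{2n} \sum_{q \ne q'} \Big(\frac{\operatorname{leb} G_0 - \operatorname{leb} G^*_q - \operatorname{leb} G^*_{q'}}{\operatorname{leb} G_0}\Big)^{n}, \tag{5.1}$$

where (5.1) is due to the independence of the variables $X_i$. Note that $\operatorname{leb} G^*_p = \operatorname{leb} G^*_q$ for all $p, q \in \{1, \ldots, M\}$, so that all the terms of the sum in (5.1) are equal. Then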


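$$E_0[Z_M^2] = \frac{1}{M}\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{n} + \frac{1}{M^2}\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{2n} \sum_{q \ne q'} \Big(\frac{\operatorname{leb} G_0 - \operatorname{leb} G^*_q - \operatorname{leb} G^*_{q'}}{\operatorname{leb} G_0}\Big)^{n}. \tag{5.2}$$

Write $\bar g_0 = \operatorname{leb} G_0$; for $b_2$ small enough, note that

$$\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{n} = \big(1 - b_2^{1+(N-1)/\gamma}\,\delta_n^{\gamma+N-1}(\bar\eta/\bar g_0)\big)^{-n} = n^{\,b_2^{1+(N-1)/\gamma}(\bar\eta/\bar g_0)}\,(1 + o(1)).$$

If $b_2$ is small enough and satisfies $b_2^{1+(N-1)/\gamma}(\bar\eta/\bar g_0) < (N-1)/(\gamma+N-1)$, we obtain

$$\frac{1}{M}\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{n} \to 0 \quad \text{as } n \to \infty. \tag{5.3}$$

In the same way, for $b_2$ small enough, we obtain the following approximation for the second part of (5.2):

$$\frac{1}{M^2}\Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G^*}\Big)^{2n} \sum_{q \ne q'} \Big(\frac{\operatorname{leb} G_0 - \operatorname{leb} G^*_q - \operatorname{leb} G^*_{q'}}{\operatorname{leb} G_0}\Big)^{n} = \frac{M^2 - M}{M^2}\,\big(1 - b_2^{1+(N-1)/\gamma}\,\delta_n^{\gamma+N-1}(\bar\eta/\bar g_0)\big)^{-2n}\,\big(1 - 2b_2^{1+(N-1)/\gamma}\,\delta_n^{\gamma+N-1}(\bar\eta/\bar g_0)\big)^{n} \to 1 \quad \text{as } n \to \infty. \tag{5.4}$$

From (5.2), (5.3), (5.4) and by Chebyshev's inequality, (3.1) holds.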
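The behaviour of the two terms of the second moment in the proof of (3.1) can be checked numerically. The sketch below evaluates the first term (which should vanish) and the second term (which should tend to one) for given constants. The values of `b2`, `eta_bar` and `g0_bar` in the example are arbitrary illustrative choices that satisfy the smallness condition, `M` is not rounded to an integer, and the function name `second_moment_terms` is hypothetical.

```python
import math


def second_moment_terms(n, gamma, N, b2, eta_bar, g0_bar):
    """Return (T1, T2), the two parts of E_0[Z_M^2].

    r is the relative volume removed by one bump, that is
    r = b2^(1 + (N-1)/gamma) * delta_n^(gamma + N - 1) * eta_bar / g0_bar.
    """
    delta = (n / math.log(n)) ** (-1.0 / (gamma + N - 1))
    M = (b2 ** (1.0 / gamma) * delta) ** (-(N - 1))
    c = b2 ** (1.0 + (N - 1) / gamma) * eta_bar / g0_bar
    r = c * delta ** (gamma + N - 1)          # equals c * log(n) / n
    log_ratio = -n * math.log1p(-r)           # log of (leb G_0 / leb G^*)^n
    t1 = math.exp(log_ratio) / M
    t2 = (1 - 1 / M) * math.exp(2 * log_ratio + n * math.log1p(-2 * r))
    return t1, t2


if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8):
        print(n, second_moment_terms(n, gamma=1, N=2, b2=0.5, eta_bar=0.2, g0_bar=0.5))
```

With these constants the exponent $b_2^{1+(N-1)/\gamma}\bar\eta/\bar g_0$ equals 0.1, which is below $(N-1)/(\gamma+N-1) = 0.5$, so the sum of the two terms approaches one and the variance of $Z_M$ vanishes.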

5.1.2. Proof of (3.2)

First, consider the error of the second kind. For all $G$ in $A_{1,n}(b_3\delta_n^{\gamma})$,

$$P_G[\bar\Delta_{1,n} = 0] \le P_G[d_\infty(G, G_0) - d_\infty(G_n, G) < C_1\delta_n^{\gamma}] \le P_G[d_\infty(G_n, G) > (b_3 - C_1)\delta_n^{\gamma}]. \tag{5.5}$$

As soon as there exists a constant $b_3$ such that $b_3 - C_1$ is large enough, and using relation (2.15) in Korostelev and Tsybakov (1993b), we obtain (5.6), where $p$ is an arbitrary fixed positive number. It follows that $R_1(\bar\Delta_{1,n}, b_3\delta_n^{\gamma})$ is asymptotically equal to zero.

Under $P_{G_0}$, providing an upper bound for $R_0(\bar\Delta_{1,n})$ reduces to providing a bound for $P_G[d_\infty(G_n, G) \ge C_1\delta_n^{\gamma}]$, for all $G \in \mathcal{G}$. Relation (3.2) follows since the last probability is similar to the right-hand side of relation (5.5).


5.2. Proof of Theorem 3.2

Let $\psi_n = n^{-[\gamma+(N-1)/2]/(\gamma+N-1)}$ and $\delta_n = n^{-1/(\gamma+N-1)}$.

5.2.1. Proof of (3.3)

Consider the partition defined in the proof of (3.1) with $b_4$ in place of $b_2$ and $\delta_n = n^{-1/(\gamma+N-1)}$, and consider also the function $\eta$ defined in the proof of Theorem 3.1. Let $\omega = (\omega_1, \ldots, \omega_M)$ be a binary vector such that $\omega_1, \ldots, \omega_M$ are independent and identically distributed Bernoulli random variables and set $\Omega = \{\omega : \sum_{q=1}^{M} \omega_q = M^{1/2}\}$; assume without loss of generality that $M^{1/2}$ is an integer. Write $|\Omega| = \operatorname{card} \Omega$ and define the sets

$$G_0 = \{x = (x_N, y) \in [0,1]^N : 0 \le x_N \le g_0(y)\},$$

$$G_\omega = \{x = (x_N, y) \in [0,1]^N : 0 \le x_N \le g_\omega(y)\},$$

where

$$g_\omega(y) = g_0(y) - b_4\delta_n^{\gamma} \sum_{q=1}^{M} \omega_q\, \eta\Big(\frac{y - u_q}{b_4^{1/\gamma}\delta_n}\Big).$$

Set $\mathcal{G}_M = \{G_\omega : \omega \in \Omega\}$, the parametric class of $G_\omega$. It is clear that $\mathcal{G}_M$ is included in $A_{2,n}(b_4\psi_n)$, with $\psi_n = n^{-[\gamma+(N-1)/2]/(\gamma+N-1)}$. Set $\bar P = (1/|\Omega|)\sum_{\omega \in \Omega} P_{G_\omega}$. Thus, for any decision rule $\Delta_n$, the sum

$$P_{G_0}[\Delta_n = 1] + \sup_{G \in A_{2,n}(b_4\psi_n)} P_G[\Delta_n = 0]$$

satisfies the lower bound (5.7), where $P_\omega$ and $P_0$ denote the probability distribution of the data when their underlying support is $G_\omega$, $\omega \in \Omega$, and $G_0$ respectively, and $\varepsilon$ is an arbitrary positive real. For simplicity's sake, denote $\bar g_0 = \operatorname{leb}(G_0)$. We first fix $\omega \in \Omega$, and by Chebyshev's inequality we obtain

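$$P_\omega\Big[\frac{1}{|\Omega|}\sum_{\omega' \in \Omega} \frac{dP_{\omega'}}{dP_0} \le \frac{1}{\varepsilon}\Big] \ge 1 - \frac{\varepsilon}{|\Omega|}\sum_{\omega' \in \Omega} E_\omega\Big[\frac{dP_{\omega'}}{dP_0}\Big] = 1 - \frac{\varepsilon}{|\Omega|}\sum_{\omega' \ne \omega,\ \omega' \in \Omega} E_\omega\Big[\frac{dP_{\omega'}}{dP_0}\Big] - \frac{\varepsilon}{|\Omega|} E_\omega\Big[\frac{dP_\omega}{dP_0}\Big]. \tag{5.8}$$

Since $b_4$ is chosen small enough, the expectation in the final term on the right in (5.8) becomes

$$E_\omega\Big[\frac{dP_\omega}{dP_0}\Big] = \Big(\frac{\operatorname{leb} G_0}{\operatorname{leb} G_\omega}\Big)^{n} = \exp\big(M^{1/2} b_4(\bar\eta/\bar g_0)\big)(1 + o(1)). \tag{5.9}$$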

Furthermore, $|\Omega|$ is equal to the number of ways of choosing $M^{1/2}$ elements from a set of $M$ elements, that is, $|\Omega| = C_M^{M^{1/2}}$. Note that there exist two positive constants $c$ and $c'$ such that

$$c\, n^{n+1/2} \exp(-n) \le n! \le c'\, n^{n+1/2} \exp(-n), \quad \forall n. \tag{5.10}$$

Then, as $n$ goes to infinity,

$$C_M^{M^{1/2}} = \exp\big[M \log M - \tfrac12 M^{1/2} \log M - (M - M^{1/2}) \log(M - M^{1/2})\big](1 + o(1)) = \exp\big[\tfrac12 M^{1/2} \log M\big](1 + o(1)).$$
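The leading-order size of $|\Omega| = C_M^{M^{1/2}}$ can be checked numerically on the logarithmic scale. The sketch below compares $\log C_M^{M^{1/2}}$, computed with log-gamma functions, with $\tfrac12 M^{1/2} \log M$. It only compares logarithms, so it illustrates the order of magnitude rather than the $(1 + o(1))$ factor itself; the helper names are hypothetical.

```python
import math


def log_binom(M, k):
    """Natural logarithm of the binomial coefficient C(M, k)."""
    return math.lgamma(M + 1) - math.lgamma(k + 1) - math.lgamma(M - k + 1)


def log_ratio(M):
    """Ratio of log C(M, sqrt(M)) to (1/2) sqrt(M) log M, for a perfect square M."""
    k = math.isqrt(M)
    return log_binom(M, k) / (0.5 * k * math.log(M))


if __name__ == "__main__":
    for M in (10**4, 10**6, 10**8):
        print(M, log_ratio(M))
```

The ratio decreases slowly towards one, roughly like $1 + 2/\log M$, which matches the logarithmic order of the approximation above.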

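This entails

$$\frac{1}{|\Omega|} E_\omega\Big[\frac{dP_\omega}{dP_0}\Big] \to 0 \quad \text{as } n \to \infty. \tag{5.11}$$

Now consider the case $\omega' \ne \omega$, $\omega, \omega' \in \Omega$. Since $\operatorname{leb} G_\omega = \operatorname{leb} G_{\omega'}$ for all $\omega, \omega' \in \Omega$,

$$E_\omega\Big[\frac{dP_{\omega'}}{dP_0}\Big] = \Big(\frac{\operatorname{leb}(G_{\omega'} \cap G_\omega)\, \operatorname{leb} G_0}{(\operatorname{leb} G_\omega)^2}\Big)^{n}. \tag{5.12}$$

Define $l = \sum_{q=1}^{M} \omega_q\omega'_q$. Since $\omega$ and $\omega'$ belong to $\Omega$, $l$ cannot exceed $M^{1/2} - 1$. Note that $\operatorname{leb}(G_{\omega'} \cap G_\omega) = \bar g_0 - 2M^{1/2}(b_4\bar\eta/n) + l(b_4\bar\eta/n)$. Then, as $n$ goes to infinity, (5.12) can be written as

$$\Big(\frac{\operatorname{leb}(G_{\omega'} \cap G_\omega)\, \operatorname{leb} G_0}{(\operatorname{leb} G_\omega)^2}\Big)^{n} = \Big(\frac{1 - 2M^{1/2}(\tilde b_4/n) + l(\tilde b_4/n)}{(1 - M^{1/2}\tilde b_4/n)^2}\Big)^{n} = \big(1 + (l/n)\tilde b_4\big)^{n}(1 + o(1)) = \exp(l\,\tilde b_4)(1 + o(1)), \tag{5.13}$$

where $\tilde b_4 = (b_4\bar\eta)/\bar g_0$ is a positive constant. From (5.12) and (5.13), we obtain

$$\frac{1}{|\Omega|}\sum_{\omega' \ne \omega,\ \omega' \in \Omega} E_\omega\Big[\frac{dP_{\omega'}}{dP_0}\Big] = \frac{1}{|\Omega|}\sum_{l=0}^{M^{1/2}-1} \operatorname{card} \Omega_l\, \exp(l\,\tilde b_4)(1 + o(1)),$$

where $\Omega_l = \{\omega' \in \Omega : \sum_{q=1}^{M} \omega_q\omega'_q = l\}$. Note that $\operatorname{card} \Omega_l = C_{M^{1/2}}^{l}\, C_{M - M^{1/2}}^{M^{1/2} - l}$. For $n$ large enough and from (5.10), we have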


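$$C_{M^{1/2}}^{l}\, C_{M - M^{1/2}}^{M^{1/2} - l} = \exp\big[\tfrac12 M^{1/2} \log M - l \log l - 2(M^{1/2} - l)\log(M^{1/2} - l) + (M - M^{1/2})\log(M - M^{1/2}) - (M - 2M^{1/2} + l)\log(M - 2M^{1/2} + l)\big](1 + o(1)) \le \exp\big[\tfrac12 M^{1/2} \log M - l \log l\big](1 + o(1)). \tag{5.14}$$

Thus, for $n$ large enough,

$$\frac{1}{|\Omega|}\sum_{\omega' \ne \omega,\ \omega' \in \Omega} E_\omega\Big[\frac{dP_{\omega'}}{dP_0}\Big] \le \sum_{l=1}^{M^{1/2}-1} \exp(-l \log l)(1 + o(1)). \tag{5.15}$$

Set $\beta^* = \varepsilon^*\big(1 - \sum_{l=1}^{\infty} \exp(-l \log l)\,\varepsilon^*\big)$. From (5.8), (5.11), (5.15) and for all $\beta < \beta^*$, relation (3.3) holds.

5.2.2. Proof of (3.4)

Choose $\beta_1 > 0$ and $\beta_2 > 0$ such that $\beta = \beta_1 + \beta_2$. First consider the risk of the second kind. Choose $b_5 > C_2$; $\forall G \in A_{2,n}(b_5\psi_n)$, we have

$$P_G[\bar\Delta_{2,n} = 0] \le P_G\big[(S_n - S(G))^2 > (b_5 - C_2)^2\, n^{-(2\gamma+N-1)/(\gamma+N-1)}\big]. \tag{5.16}$$

By Chebyshev's inequality, we obtain

$$P_G[\bar\Delta_{2,n} = 0] \le \frac{E_G[(S_n - S(G))^2]}{(b_5 - C_2)^2\, n^{-(2\gamma+N-1)/(\gamma+N-1)}}. \tag{5.17}$$

For $n$ large enough, adapting the results on support estimation in Gayraud (1997), there exists a constant $b_5 > C_2$ such that $R_1(\bar\Delta_{2,n}, b_5 n^{-(2\gamma+N-1)/[2(\gamma+N-1)]})$ is bounded from above by $\beta_1$. Under $P_{G_0}$, the proof of $R_0(\bar\Delta_{2,n}) \le \beta_2$ is reduced to proving that $P_G[(S_n - S(G))^2 \ge C_2\, n^{-(2\gamma+N-1)/(\gamma+N-1)}] \le \beta_2$, $\forall G \in \mathcal{G}$. Noting that the last inequality is similar to inequality (5.16), relation (3.4) is then satisfied.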

5.3. Proof of Theorem 3.3

Let $\delta_n = n^{-1/(\gamma+N-1)}$ and $\psi_n = n^{-[\gamma+(N-1)/2]/(\gamma+N-1)}$.
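The exponent of $\psi_n$ written here is the same as the exponent $(2\gamma+N-1)/[2(\gamma+N-1)]$ appearing in the proof of Theorem 3.2, and it always exceeds $1/2$. The Python sketch below verifies both facts with exact rational arithmetic. The function names and the sample value $\gamma = 2$, $N = 2$ are illustrative assumptions.

```python
from fractions import Fraction

def delta_exponent(gamma: Fraction, N: int) -> Fraction:
    """Exponent a such that delta_n = n ** (-a)."""
    return 1 / (gamma + N - 1)

def psi_exponent(gamma: Fraction, N: int) -> Fraction:
    """Exponent r such that psi_n = n ** (-r), as written in Section 5.3."""
    return (gamma + Fraction(N - 1, 2)) / (gamma + N - 1)

def psi_exponent_alt(gamma: Fraction, N: int) -> Fraction:
    """The same exponent in the form used in the proof of Theorem 3.2."""
    return (2 * gamma + N - 1) / (2 * (gamma + N - 1))

gamma, N = Fraction(2), 2
r = psi_exponent(gamma, N)  # equals 5/6 for gamma = 2, N = 2
```

The gap between the exponent and $1/2$ is $\gamma/[2(\gamma+N-1)]$, that is, $\gamma/2$ times the exponent of $\delta_n$.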

5.3.1. Proof of (3.5)

As in the proof of Theorem 3.2, consider the set $\Omega = \{\omega : \sum_{q=1}^{M}\omega_q = M^{1/2}\}$, where $M$ is the number of cubes of the partition of $[0,1]^{N-1}$, and

$$G_0 = \{x = (x_N, y) \in [0,1]^N : 0 \le x_N \le g_0(y)\},$$

$$G_\omega = \{x = (x_N, y) \in [0,1]^N : 0 \le x_N \le g_\omega(y)\},$$

where

Note that the parametric family of sets $\{G_\omega : \omega \in \Omega\}$ is included in $\Lambda_{3,n}(b_6\psi_n)$. Thus, (3.5) follows from the proof of (3.3).

5.3.2. Proof of (3.6)

Choose $\beta_1 > 0$ and $\beta_2 > 0$ such that $\beta = \beta_1 + \beta_2$. Consider first the error of the second kind. Choose $b_7 > C_3$; then for all $G \in \Lambda_{3,n}(b_7\psi_n)$,

$$P_G[\tilde\Delta_{3,n} = 0] \le P_G\bigl[|T(G) - T_n| > (b_7 - C_3)\psi_n\bigr] \le E_G\bigl[(T(G) - T_n)^2\,\psi_n^{-2}(b_7 - C_3)^{-2}\bigr].$$

Since $b_7 > C_3$ and from Theorem 2 in Gayraud (1997), the relation $E_G[(T(G) - T_n)^2\,\psi_n^{-2}(b_7 - C_3)^{-2}] \le \beta_1$, as $n \to \infty$, is satisfied.

Under $P_{G_0}$, $G$ is $G_0$ and then $T(G_0) = \int_{[0,1]^{N-1}} |g_0(y) - g_0(y)|\,dy$ is equal to zero. Proving that $R_0(\tilde\Delta_{3,n}) \le \beta_2$, as $n \to \infty$, then reduces to proving that $P_G[|T_n - T(G)| \ge C_3\,n^{-[\gamma+(N-1)/2]/(\gamma+N-1)}] \le \beta_2$ as $n \to \infty$, for $G \in \mathcal{G}$. Then (3.6) holds.

Acknowledgments

I thank Alexander B. Tsybakov for useful discussion and helpful suggestions.

References

Baraud, Y., Huet, S. and Laurent, B. (1999) Adaptive tests of linear hypotheses by model selection. Preprint, École Normale Supérieure, Paris.
Birgé, L. (1983) Approximation dans les espaces métriques et théorie de l'estimation. Z. Wahrscheinlichkeitstheorie Verw. Geb., 65, 181-237.
Burnashev, M.V. (1979) On the minimax detection of an inaccurately known signal in a white noise background. Theory Probab. Appl., 24, 107-119.
Donoho, D.L., Johnstone, I.M., Kerkyacharian, G. and Picard, D. (1995) Wavelet shrinkage: Asymptopia? J. Roy. Statist. Soc. Ser. B, 57, 301-369.
Ermakov, M.S. (1990a) Minimax detection of a signal in a white Gaussian noise. Theory Probab. Appl., 35, 667-679.
Ermakov, M.S. (1990b) A minimax test for hypotheses on a spectral density. Zap. Nauchn. Sem. Leningr. Otdel. Mat. Inst. Steklov., 184, 115-125.


Ermakov, M.S. (1990c) Asymptotic minimaxity of usual goodness of fit tests. In B. Grigelionis, Yu.V. Prohorov, V.V. Sazonov and V. Statulevicius (eds), Probability Theory and Mathematical Statistics: Proceedings of the Fifth Vilnius Conference, Vol. 1, pp. 323-331. Utrecht: VSP.
Ermakov, M.S. (1994) Minimax nonparametric testing of hypotheses on the distribution density. Theory Probab. Appl., 39, 396-416.
Ermakov, M.S. (1996) Asymptotically minimax criteria for testing composite nonparametric hypotheses. Problems Inform. Transmission, 32, 184-196.
Feldmann, D., Härdle, W., Hafner, C., Hoffmann, M., Lepski, O. and Tsybakov, A.B. (1998) Testing linearity in a stochastic volatility model. Preprint SFB 373, Humboldt-Universität zu Berlin.
Gayraud, G. (1997) Estimation of functionals of density support. Math. Methods Statist., 6, 26-46.
Gayraud, G. and Tsybakov, A.B. (1999) Testing hypotheses about contours in images. Research Report no. 512, Laboratoire de Probabilités et Modèles Aléatoires, Universités de Paris 6 et Paris 7.
Härdle, W. and Kneip, A. (1999) Testing a regression model when we have smooth alternatives in mind. Scand. J. Statist., 26, 221-238.
Härdle, W., Sperlich, S. and Spokoiny, V.G. (1997) Semiparametric single index versus fixed link function modelling. Ann. Statist., 25, 212-243.
Ibragimov, I.A. and Has'minskii, R.Z. (1977) One problem of statistical estimation in Gaussian white noise. Soviet Math. Dokl., 236, 1351-1354.
Ingster, Yu.I. (1982) Minimax nonparametric detection of a signal in Gaussian white noise. Problems Inform. Transmission, 18, 130-140.
Ingster, Yu.I. (1984) Asymptotical minimax testing hypotheses on the distribution density of an independent sample. Zap. Nauchn. Sem. Leningr. Otdel. Mat. Inst. Steklov., 136, 74-96.
Ingster, Yu.I. (1986a) Minimax testing of nonparametric hypotheses on a distribution density in $L_p$-metrics. Theory Probab. Appl., 31, 333-337.
Ingster, Yu.I. (1986b) Asymptotically minimax testing of nonparametric hypotheses. In Yu.V. Prohorov, V.A. Statulevicius, V.V. Sazonov and B. Grigelionis (eds), Probability Theory and Mathematical Statistics, Proceedings of the Fourth Vilnius Conference, Vol. 1, pp. 553-574. Utrecht: VSP.
Ingster, Yu.I. (1990) Minimax detection of signals in $L_p$-metrics. Zap. Nauchn. Sem. Leningr. Otdel. Mat. Inst. Steklov., 184, 152-168.
Ingster, Yu.I. (1993a) Asymptotically minimax hypothesis testing for nonparametric alternatives, I. Math. Methods Statist., 2, 85-114.
Ingster, Yu.I. (1993b) Asymptotically minimax hypothesis testing for nonparametric alternatives, II. Math. Methods Statist., 3, 171-189.
Ingster, Yu.I. (1993c) Asymptotically minimax hypothesis testing for nonparametric alternatives, III. Math. Methods Statist., 4, 249-268.
Ingster, Yu.I. (1994) Minimax testing of hypotheses on the distribution density for ellipsoids in $l_p$. Theory Probab. Appl., 39, 417-436.
Korostelev, A.P. and Tsybakov, A.B. (1993a) Minimax Theory of Image Reconstruction, Lecture Notes in Statist. 82. New York: Springer-Verlag.
Korostelev, A.P. and Tsybakov, A.B. (1993b) Estimation of the density support and its functionals. Problems Inform. Transmission, 29, 3-18.
Lepski, O.V. (1993) On asymptotically exact testing of nonparametric hypotheses. Discussion paper, Université de Louvain.
Lepski, O.V. and Spokoiny, V.G. (1999) Minimax nonparametric hypothesis testing: the case of an inhomogeneous alternative. Bernoulli, 5, 333-358.
Lepski, O.V. and Tsybakov, A.B. (1996) Asymptotically exact nonparametric hypothesis testing in sup-norm and at a fixed point. Preprint SFB 373, Humboldt-Universität zu Berlin.


Pouet, C. (1999) On testing nonparametric hypotheses for analytic regression functions in Gaussian noise. Math. Methods Statist., 8, 536-549.
Pouet, C. (2000) Tests minimax non-paramétriques: hypothèse nulle composite et constantes exactes. Doctoral thesis, Université Paris VI.
Spokoiny, V.G. (1996) Adaptive hypothesis testing using wavelets. Ann. Statist., 24, 2477-2498.
Spokoiny, V.G. (1997) Testing a linear hypothesis using Haar transform. Research Report no. 314, WIAS, Berlin.
Spokoiny, V.G. (1998) Adaptive and spatial testing of a nonparametric hypothesis. Math. Methods Statist., 7, 245-273.
Suslina, I.A. (1993) Minimax detection of a signal for $l_p$-ball. Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov., 207, 127-137.

Received February 1997 and revised November 2000