
Convergence of Robust Models
Dr. Matthias Kohl
Mathematics VII: Stochastics

Stochastiktage (GOCPS) 2008


Outline

1. Asymptotic Theory of Robustness
   - Asymptotically Linear Estimators
   - Infinitesimal Neighborhoods
   - Optimally Robust Estimators

2. Convergence of Robust Models
   - Setup and Questions
   - Convergence of Robust Models


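Ideal Model

Parametric family of probability measures

$$\mathcal{P} = \{P_\theta \mid \theta \in \Theta\}, \qquad \Theta \subset \mathbb{R}^k \text{ (open)},$$

defined on some measurable space $(\Omega, \mathcal{A})$ and smoothly parameterized; i.e., $L_2$ differentiable at $\theta \in \Theta$ with $L_2$ derivative $\Lambda_\theta \in L_2^k(P_\theta)$, $E_\theta \Lambda_\theta = 0$, and Fisher information of full rank

$$I_\theta = E_\theta \Lambda_\theta \Lambda_\theta^\tau \succ 0.$$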


Influence Curves (ICs)

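Definition. The set $\Psi_2(\theta)$ of all square integrable ICs at $P_\theta$ consists of all $\psi_\theta \in L_2^k(P_\theta)$ which are

- centered: $E_\theta \psi_\theta = 0$, and
- Fisher consistent: $E_\theta \psi_\theta \Lambda_\theta^\tau = I_k$ (where $I_k$ is the $k$-dimensional identity matrix).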
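The two defining conditions can be checked numerically in a small discrete model. The following is a minimal sketch, assuming a Binom(m, p) model parameterized by p (so k = 1) and the classical IC $I_\theta^{-1}\Lambda_\theta$; the function names and the values m = 25, p = 0.25 are illustrative choices.

```python
from math import comb

def binom_pmf(m, p):
    """Probabilities of Binom(m, p) on the support 0, ..., m."""
    return [comb(m, x) * p**x * (1 - p)**(m - x) for x in range(m + 1)]

def score(m, p):
    """L2 derivative (score) of Binom(m, p) with respect to p."""
    return [(x - m * p) / (p * (1 - p)) for x in range(m + 1)]

def expectation(values, probs):
    """Expectation of a function given by its values on the support."""
    return sum(v * q for v, q in zip(values, probs))

m, p = 25, 0.25                                     # illustrative choice
probs = binom_pmf(m, p)
lam = score(m, p)
fisher = expectation([l * l for l in lam], probs)   # Fisher information
psi_hat = [l / fisher for l in lam]                 # classical IC
centering = expectation(psi_hat, probs)             # should vanish
consistency = expectation([s * l for s, l in zip(psi_hat, lam)], probs)  # should equal 1
```

Any other candidate IC on the same support can be passed through `expectation` in the same way to test whether it belongs to $\Psi_2(\theta)$.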


AL Estimators

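Definition. An asymptotic estimator $S_n \colon (\Omega^n, \mathcal{A}^n) \to (\mathbb{R}^k, \mathbb{B}^k)$ is called asymptotically linear at $P_\theta$ if there is an IC $\psi_\theta \in \Psi_2(\theta)$ with

$$S_n = \theta + \frac{1}{n} \sum_{i=1}^n \psi_\theta(y_i) + o_{P_\theta^n}(n^{-1/2}).$$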
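As a simple illustration (not taken from the slides), consider the normal location model $P_\theta = \mathcal{N}(\theta, 1)$, assumed here only for concreteness. Its $L_2$ derivative is $\Lambda_\theta(y) = y - \theta$ with $I_\theta = 1$, and the sample mean is asymptotically linear with a remainder that is identically zero:

```latex
S_n = \bar y_n = \theta + \frac{1}{n} \sum_{i=1}^n (y_i - \theta),
\qquad \psi_\theta(y) = y - \theta,
\qquad E_\theta \psi_\theta = 0,
\qquad E_\theta \psi_\theta \Lambda_\theta = 1 .
```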


Infinitesimal Neighborhoods

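Convex contamination neighborhood (gross error model):

$$U_c(\theta, r_n) = \bigl\{ (1 - r_n)_+ P_\theta + (1 \wedge r_n)\, Q \bigm| Q \in \mathcal{M}_1(\mathcal{A}) \bigr\}$$

where $\mathcal{M}_1(\mathcal{A})$ is the set of all probability measures on $\mathcal{A}$. The radius $r_n := r / \sqrt{n}$ shrinks with sample size $n \in \mathbb{N}$, where $r \in [0, \infty)$.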
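To make the shrinking radius concrete, the following sketch draws a sample from one member of the neighborhood. It assumes a standard normal ideal distribution and a point mass at 10 as the contaminating measure Q; both are hypothetical choices, as is the function name `sample_contaminated`.

```python
import math
import random

def sample_contaminated(n, r, ideal, contaminant, rng):
    """Draw n observations from (1 - r_n) P + r_n Q with r_n = min(1, r / sqrt(n))."""
    r_n = min(1.0, r / math.sqrt(n))
    draws = []
    n_outliers = 0
    for _ in range(n):
        if rng.random() < r_n:
            draws.append(contaminant(rng))   # observation from Q
            n_outliers += 1
        else:
            draws.append(ideal(rng))         # observation from the ideal model
    return draws, r_n, n_outliers

rng = random.Random(1)
ys, r_n, n_out = sample_contaminated(
    n=400,
    r=0.5,
    ideal=lambda g: g.gauss(0.0, 1.0),
    contaminant=lambda g: 10.0,   # hypothetical Q: point mass at 10
    rng=rng,
)
```

With n = 400 and r = 0.5 the contamination probability is r_n = 0.025, so the expected number of outliers grows only like $r \sqrt{n}$.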


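Unique asymptotic MSE Solution

Theorem 5.5.7 (b), Rieder (1994):

$$\tilde\eta_\theta = (A_\theta \Lambda_\theta - a_\theta)\, w, \qquad w = \min\Bigl\{ 1, \frac{b_\theta}{|A_\theta \Lambda_\theta - a_\theta|} \Bigr\}$$

with Lagrange multipliers $A_\theta$, $a_\theta$ and $b_\theta$ determined by

$$0 = E_\theta (\Lambda_\theta - z_\theta)\, w, \qquad a_\theta = A_\theta z_\theta \tag{1}$$

$$I_k = A_\theta\, E_\theta (\Lambda_\theta - z_\theta)(\Lambda_\theta - z_\theta)^\tau w \tag{2}$$

$$r^2 b_\theta = E_\theta \bigl( |A_\theta \Lambda_\theta - a_\theta| - b_\theta \bigr)_+ \tag{3}$$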
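Equations (1) to (3) can be solved in closed form up to a one-dimensional root search in simple models. The sketch below assumes the normal location model N(θ, 1) with $\Lambda_\theta(x) = x - \theta$ and r > 0 (an illustrative choice, not an example from the slides). By symmetry $z_\theta = a_\theta = 0$; writing $c = b_\theta / A_\theta$, equation (3) reduces to $r^2 c = E(|X| - c)_+$ and equation (2) to $A_\theta = 1 / (2\Phi(c) - 1)$ for $X \sim \mathcal{N}(0, 1)$. All function names are hypothetical.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def upper_tail(x):
    """1 - Phi(x), computed without cancellation."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def excess(c):
    """E(|X| - c)_+ for X ~ N(0, 1)."""
    return 2.0 * (phi(c) - c * upper_tail(c))

def solve_multipliers(r, tol=1e-12):
    """Return (A, a, b, c) for N(theta, 1) and radius r > 0."""
    lo, hi = 0.0, 1.0
    while r * r * hi < excess(hi):      # enlarge the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:                # bisection on r^2 c - E(|X| - c)_+
        mid = 0.5 * (lo + hi)
        if r * r * mid < excess(mid):
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    A = 1.0 / (2.0 * Phi(c) - 1.0)      # equation (2) in this model
    return A, 0.0, A * c, c

A, a, b, c = solve_multipliers(0.5)
```

The resulting IC is $\tilde\eta(x) = A \max\{-c, \min\{c, x - \theta\}\}$, a clipped version of the score.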


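Extension of classical Cramér-Rao bound

Proposition 2.1.1, Kohl (2005):

$$\text{(as.)maxMSE}(\eta_\theta, r) \ge \text{(as.)maxMSE}(\tilde\eta_\theta, r) = \operatorname{tr} A_\theta$$

Classical Cramér-Rao bound:

$$\operatorname{Cov}(\eta_\theta) \succeq \operatorname{Cov}(\hat\psi_\theta) = I_\theta^{-1} \qquad \text{where } \hat\psi_\theta = I_\theta^{-1} \Lambda_\theta$$

Hence

$$\operatorname{MSE}(\eta_\theta) = \operatorname{tr} \operatorname{Cov}(\eta_\theta) \ge \operatorname{tr} \operatorname{Cov}(\hat\psi_\theta) = \operatorname{tr} I_\theta^{-1} = \operatorname{MSE}(\hat\psi_\theta)$$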
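The identity (as.)maxMSE$(\tilde\eta_\theta, r) = \operatorname{tr} A_\theta$ can be traced back to equations (2) and (3). The following is a sketch, assuming $r > 0$ and that the maximal asymptotic MSE over the neighborhood has the form $E_\theta |\eta_\theta|^2 + r^2 [\sup |\eta_\theta|]^2$, with $\sup |\tilde\eta_\theta| = b_\theta$. Write $Y = A_\theta \Lambda_\theta - a_\theta = A_\theta (\Lambda_\theta - z_\theta)$:

```latex
\begin{aligned}
\operatorname{tr} A_\theta
  &= \operatorname{tr}\bigl( A_\theta \, E_\theta (\Lambda_\theta - z_\theta)(\Lambda_\theta - z_\theta)^\tau w \, A_\theta^\tau \bigr)
   = E_\theta |Y|^2 w
   && \text{by (2), multiplied by } A_\theta^\tau \\
  &= E_\theta |Y|^2 w^2 + E_\theta |Y|^2 w (1 - w) \\
  &= E_\theta |\tilde\eta_\theta|^2 + b_\theta \, E_\theta \bigl( |Y| - b_\theta \bigr)_+
   && \text{since } w = b_\theta / |Y| \text{ on } \{ |Y| > b_\theta \} \\
  &= E_\theta |\tilde\eta_\theta|^2 + r^2 b_\theta^2
   && \text{by (3)}
\end{aligned}
```

For $r = 0$ no clipping occurs, $w \equiv 1$, and the bound reduces to $\operatorname{tr} I_\theta^{-1}$, the classical case.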


Estimator Construction

One-step construction of the optimally robust estimator $\tilde S_n$:

$$\tilde S_n = \hat\theta + \frac{1}{n} \sum_{i=1}^n \tilde\eta_{\hat\theta}(y_i)$$

where $\hat\theta$ is a uniformly consistent starting estimator. Examples for initial estimators: Kolmogorov or Cramér von Mises minimum distance estimators; cf. Rieder (1994).
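A minimal sketch of the one-step construction, assuming a one-dimensional location model in which the optimal IC is a clipped residual, $\tilde\eta_\theta(y) = A \max\{-c, \min\{c, y - \theta\}\}$. The sample median is swapped in as starting estimator purely for brevity (the slides name minimum distance estimators instead), and the constants A = 1.6 and c = 0.9 are hypothetical placeholders for the Lagrange multipliers.

```python
import statistics

def one_step_location(ys, A, c):
    """Return theta_hat + mean(eta(y_i)) with eta(y) = A * clip(y - theta_hat, -c, c)."""
    theta_hat = statistics.median(ys)   # swapped-in starting estimator
    corrections = [A * max(-c, min(c, y - theta_hat)) for y in ys]
    return theta_hat + sum(corrections) / len(ys)

# The gross outlier 50.0 enters only through its clipped residual.
estimate = one_step_location([1.0, 2.0, 3.0, 4.0, 50.0], A=1.6, c=0.9)
```

Because every residual is clipped at c before averaging, a single observation can move the estimate by at most A·c/n.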


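Setup

Let

$$\mathcal{P}_\nu = \{ P_{\nu,\theta} \mid \theta \in \Theta_\nu \} \subset \mathcal{M}_1(\mathcal{A}_\nu), \qquad \Theta_\nu \subset \mathbb{R}^k \text{ (open)} \qquad (\nu \in \mathbb{N}_0)$$

be a sequence of $L_2$-differentiable parametric families with $L_2$ derivatives $\Lambda_{\nu,\theta}$ and Fisher information of full rank $I_{\nu,\theta}$, and consider

$$U_{\nu,c}(\theta, r_n) = \bigl\{ (1 - r_n)_+ P_{\nu,\theta} + (1 \wedge r_n)\, Q_\nu \bigm| Q_\nu \in \mathcal{M}_1(\mathcal{A}_\nu) \bigr\}$$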


Questions

Q1: $\mathcal{P}_\nu \approx \mathcal{P}_0$ for $\nu$ sufficiently large?

or even

Q2: $U_{\nu,c}(\theta_\nu, r_n) \approx U_{0,c}(\theta_0, r_n)$ for $\nu$ sufficiently large?

Q1: Convergence of Experiments; cf. Le Cam (1986)


Q2: Assumptions

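Assume

$$\mathcal{L}_{P_{\nu,\theta_\nu}} \bigl( \gamma_\nu^{-1} G_\nu \Lambda_{\nu,\theta_\nu} \bigr) \xrightarrow{\;w\;} \mathcal{L}_{P_{0,\theta_0}} \bigl( \Lambda_{0,\theta_0} \bigr) \qquad \text{as } \nu \to \infty \tag{4}$$

$$\gamma_\nu^{-2} \operatorname{tr} I_{\nu,\theta_\nu} \longrightarrow \operatorname{tr} I_{0,\theta_0} \qquad \text{as } \nu \to \infty \tag{5}$$

for $(\gamma_\nu)_{\nu \in \mathbb{N}} \subset (0, \infty)$ and orthogonal $(G_\nu)_{\nu \in \mathbb{N}} \subset \mathbb{R}^{k \times k}$.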


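Q2: Convergence of Robust Models

Theorem 2.4.1, Kohl (2005). Assume (4) and (5) and denote the Lagrange multipliers contained in the MSE optimal solution $\tilde\eta_{\nu,\theta_\nu}$ by $A_{\nu,\theta_\nu}$, $a_{\nu,\theta_\nu}$ and $b_{\nu,\theta_\nu}$. Then,

$$\lim_{\nu \to \infty} \gamma_\nu^2 \operatorname{tr} A_{\nu,\theta_\nu} = \operatorname{tr} A_{0,\theta_0}, \qquad \lim_{\nu \to \infty} \gamma_\nu b_{\nu,\theta_\nu} = b_{0,\theta_0}.$$

In case $A_{0,\theta_0}$ and $a_{0,\theta_0}$ are unique, then also

$$\lim_{\nu \to \infty} \gamma_\nu^2 G_\nu^\tau A_{\nu,\theta_\nu} G_\nu = A_{0,\theta_0} \qquad \text{and} \qquad \lim_{\nu \to \infty} \gamma_\nu G_\nu^\tau a_{\nu,\theta_\nu} = a_{0,\theta_0}.$$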


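Approximations

For $\nu$ sufficiently large, we have

$$\begin{aligned}
\gamma_\nu^2 \operatorname{tr} A_{\nu,\theta_\nu} \approx \operatorname{tr} A_{0,\theta_0} &\iff \operatorname{tr} A_{\nu,\theta_\nu} \approx \gamma_\nu^{-2} \operatorname{tr} A_{0,\theta_0} \\
\gamma_\nu b_{\nu,\theta_\nu} \approx b_{0,\theta_0} &\iff b_{\nu,\theta_\nu} \approx \gamma_\nu^{-1} b_{0,\theta_0} \\
\gamma_\nu^2 G_\nu^\tau A_{\nu,\theta_\nu} G_\nu \approx A_{0,\theta_0} &\iff A_{\nu,\theta_\nu} \approx \gamma_\nu^{-2} G_\nu A_{0,\theta_0} G_\nu^\tau \\
\gamma_\nu G_\nu^\tau a_{\nu,\theta_\nu} \approx a_{0,\theta_0} &\iff a_{\nu,\theta_\nu} \approx \gamma_\nu^{-1} G_\nu a_{0,\theta_0}
\end{aligned}$$

Note: To obtain an exact and not only an approximate IC we re-center and re-standardize the approximation.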
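In the one-dimensional case (k = 1) the orthogonal factor is +1 or -1 and the approximations above become plain rescalings. A minimal sketch under that assumption follows; the function name and the numerical inputs are hypothetical, and the re-centering and re-standardizing step from the note is not included.

```python
def approximate_multipliers(A0, a0, b0, gamma, G=1.0):
    """Approximate (A_nu, a_nu, b_nu) from the limit-model multipliers for k = 1."""
    if G not in (1.0, -1.0):
        raise ValueError("for k = 1 the orthogonal factor G is +1 or -1")
    A_nu = G * A0 * G / gamma**2   # A_nu ~ gamma^-2 * G * A_0 * G^T
    a_nu = G * a0 / gamma          # a_nu ~ gamma^-1 * G * a_0
    b_nu = b0 / gamma              # b_nu ~ gamma^-1 * b_0
    return A_nu, a_nu, b_nu

# Illustrative limit-model values and scaling factor.
A_nu, a_nu, b_nu = approximate_multipliers(A0=1.6, a0=0.0, b0=1.4, gamma=2.0)
```

The design point is that the limit model's multipliers are computed once and reused for every ν by rescaling, instead of solving equations (1) to (3) again in each model.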


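MSE-Inefficiency

Let $\chi$ be the re-centered and re-standardized approximating IC. The MSE-Inefficiency of $\chi$ at radius $r \in [0, \infty)$ is defined as

$$\operatorname{relMSE}_{\theta_0}(\chi, r) = \frac{E_{\theta_0} |\chi|^2 + r^2 \bigl[ \sup_{P_{\theta_0}} |\chi| \bigr]^2}{\operatorname{tr} A_{0,\theta_0}}$$

respectively as

$$\operatorname{relMSE}_{\theta_\nu}(\chi, r) = \frac{E_{\theta_\nu} |\chi|^2 + r^2 \bigl[ \sup_{P_{\theta_\nu}} |\chi| \bigr]^2}{\operatorname{tr} A_{\nu,\theta_\nu}}$$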
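For a one-dimensional IC on a finite support the numerator and the ratio are straightforward to evaluate. This is a minimal sketch, assuming the IC is given by its values on the support points together with their probabilities, and that the trace of the optimal A is supplied by the caller; all names and the toy numbers are hypothetical.

```python
def max_mse(chi, probs, r):
    """E|chi|^2 + r^2 * (sup |chi|)^2 for a one-dimensional IC on a finite support."""
    second_moment = sum(q * v * v for v, q in zip(chi, probs))
    # The supremum is taken only over points that carry positive probability.
    sup_abs = max(abs(v) for v, q in zip(chi, probs) if q > 0)
    return second_moment + (r * sup_abs) ** 2

def rel_mse(chi, probs, r, tr_A):
    """MSE-inefficiency of chi relative to the optimal value tr_A."""
    return max_mse(chi, probs, r) / tr_A

# Toy IC taking the values -1 and +1 with probability 1/2 each.
numerator = max_mse([-1.0, 1.0], [0.5, 0.5], r=0.5)
ratio = rel_mse([-1.0, 1.0], [0.5, 0.5], r=0.5, tr_A=1.0)
```

A value of `rel_mse` close to 1 indicates that the approximating IC loses little compared with the optimally robust one.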


Example 1: Normal Approximation of Binomial

Normal approximation (“$mp(1-p) \ge 9$”):

$$\mathrm{Binom}(m, p) \approx \mathcal{N}\bigl( mp, \, mp(1-p) \bigr)$$

We have $\nu = m$ and

$$\gamma_m = \sqrt{\frac{m}{p(1-p)}} \qquad \text{and} \qquad G_m = 1;$$

cf. Lemma 3.2.2 in Kohl (2005).
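The scaling factor can be sanity-checked numerically. The sketch below assumes that Binom(m, p) is parameterized by p, so that its score is $(x - mp) / (p(1-p))$, and that the limit score is standard normal (Fisher information 1); under these assumptions the score divided by $\gamma_m$ should have mean 0 and variance 1. The function name is hypothetical.

```python
from math import comb, sqrt

def standardized_score_moments(m, p):
    """Mean and variance of the Binom(m, p) score divided by gamma_m."""
    gamma_m = sqrt(m / (p * (1 - p)))
    mean = 0.0
    second = 0.0
    for x in range(m + 1):
        prob = comb(m, x) * p**x * (1 - p)**(m - x)
        scaled = (x - m * p) / (p * (1 - p)) / gamma_m
        mean += prob * scaled
        second += prob * scaled * scaled
    return mean, second - mean * mean

mean, var = standardized_score_moments(25, 0.25)
```

Matching the first two moments is only part of assumption (4); the weak convergence itself rests on the central limit theorem as m grows.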


Example 2: Poisson Approximation of Binomial

Poisson approximation: Let $\lambda = mp$ (“$p$ small, $m$ large”). Then,

$$\mathrm{Binom}(m, p) \approx \mathrm{Pois}(\lambda)$$

We have $\nu = m$ and

$$\gamma_m = \frac{m}{1-p} \qquad \text{and} \qquad G_m = 1;$$

cf. Lemma 4.2.4 in Kohl (2005).
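A short check of this scaling, assuming Binom(m, p) is parameterized by p and Pois(λ) by λ = mp, so that the scores are $(x - mp) / (p(1-p))$ and $(x - \lambda) / \lambda$. Under these assumptions the Binomial score divided by $\gamma_m$ coincides pointwise with the Poisson score, and the scaled Fisher information equals $(1-p)/\lambda$, which approaches $1/\lambda$ as p becomes small. Function names are hypothetical.

```python
def scaled_binomial_score(x, m, p):
    """Binom(m, p) score with respect to p, divided by gamma_m = m / (1 - p)."""
    gamma_m = m / (1 - p)
    return (x - m * p) / (p * (1 - p)) / gamma_m

def poisson_score(x, lam):
    """Pois(lam) score with respect to lam."""
    return (x - lam) / lam

def scaled_fisher_information(m, p):
    """gamma_m^-2 times the Binom(m, p) Fisher information m / (p (1 - p))."""
    gamma_m = m / (1 - p)
    return (m / (p * (1 - p))) / gamma_m**2

m, p = 25, 0.05
gaps = [abs(scaled_binomial_score(x, m, p) - poisson_score(x, m * p)) for x in range(m + 1)]
info_scaled = scaled_fisher_information(m, p)
info_poisson = 1.0 / (m * p)
```

The two score functions agree exactly; the approximation error enters through the different laws of x under the two models and through the factor 1 - p in the Fisher information.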


Examples 1+2: Approximation of ICs for r = 0.25

[Figure: four panels, each plotting the IC against x (0 to 25) for the Binom, Norm and Pois models.]

Binom(25, 0.05): m*p(1−p) = 1.188, relMSE(P, B) = 1, relMSE(N, B) = 1.001
Binom(25, 0.25): m*p(1−p) = 4.688, relMSE(P, B) = 1.012, relMSE(N, B) = 1.002
Binom(25, 0.5): m*p(1−p) = 6.25, relMSE(P, B) = 1.046, relMSE(N, B) = 1
Binom(25, 0.99): m*p(1−p) = 0.248, relMSE(P, B) = 3.301, relMSE(N, B) = 1


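Example 3: Exponential Scale and Gumbel Location

It holds

$$\mathcal{L}_{\mathrm{Gum}(0,1)} \bigl( -\Lambda^{\mathrm{loc}}_{(0,1)} \bigr) = \mathcal{L}_{\mathrm{Exp}(1)} \bigl( \Lambda^{\mathrm{sc}}_1 \bigr), \qquad I^{\mathrm{loc}}_{\mathrm{Gum}(0,1)} = I^{\mathrm{sc}}_{\mathrm{Exp}(1)}$$

i.e., $\gamma_\nu \equiv 1$ and $G_\nu \equiv -1$; cf. Section 5.2.1 in Kohl (2005). Hence,

$$\operatorname{relMSE}(\chi, r) \equiv 1$$

Note: Further examples are given in Kohl (2005).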
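The equality of the two laws can be verified directly. The following worked derivation assumes the standard Gumbel (maximum) density $f(x) = \exp(-x - e^{-x})$ for Gum(0, 1) and the scale family $\sigma^{-1} e^{-y/\sigma}$ for the exponential model; the slides do not state the parametrization, so this is one consistent reading.

```latex
\begin{aligned}
\Lambda^{\mathrm{loc}}_{(0,1)}(x) &= -\frac{d}{dx} \log f(x) = 1 - e^{-x}
  && \text{(Gumbel location score at location 0)} \\
\Lambda^{\mathrm{sc}}_{1}(y) &= \frac{\partial}{\partial \sigma} \Bigl( -\log \sigma - \frac{y}{\sigma} \Bigr) \Big|_{\sigma = 1} = y - 1
  && \text{(exponential scale score at } \sigma = 1) \\
X \sim \mathrm{Gum}(0,1) &\;\Longrightarrow\; P\bigl( e^{-X} > t \bigr) = P(X < -\log t) = e^{-t}
  && \text{i.e. } Y = e^{-X} \sim \mathrm{Exp}(1) \\
-\Lambda^{\mathrm{loc}}_{(0,1)}(X) &= e^{-X} - 1 = Y - 1 = \Lambda^{\mathrm{sc}}_{1}(Y)
\end{aligned}
```

Since the two scores have the same law up to the sign flip, their second moments agree as well, which gives the equality of the Fisher informations.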


Summary

Convergence of Experiments | Convergence of Robust Models
$\mathcal{P}_\nu \approx \mathcal{P}_0$ | $U_{\nu,c}(\theta_\nu, r_n) \approx U_{0,c}(\theta_0, r_n)$
arbitrary loss functions | $L_2$-risk
arbitrary estimators | optimally robust estimators
computation of distance? | MSE-inefficiency computable


Outlook

- Extension to convex risks; cf. Ruckdeschel and Rieder (2004).
- Find/consider further (multivariate) examples.
- Application: If the optimal IC is hard to compute in one model, try to find an approximating model where the optimal IC is easier to compute and use it as an approximation.


Bibliography

M. Kohl. Numerical Contributions to the Asymptotic Theory of Robustness. Dissertation, University of Bayreuth, 2005.

L. Le Cam. Asymptotic Methods in Statistical Decision Theory. Springer, 1986.

H. Rieder. Robust Asymptotic Statistics. Springer, 1994.

P. Ruckdeschel and H. Rieder. Optimal influence curves for general loss functions. Stat. Decis., 22: 201–223, 2004.