Asymptotic Theory of Robustness Convergence of Robust Models
Convergence of Robust Models Dr. Matthias Kohl Mathematics VII: Stochastics
Stochastiktage (GOCPS) 2008

Outline

1. Asymptotic Theory of Robustness
   - Asymptotically Linear Estimators
   - Infinitesimal Neighborhoods
   - Optimally Robust Estimators
2. Convergence of Robust Models
   - Setup and Questions
   - Convergence of Robust Models

Ideal Model

Parametric family of probability measures
$$\mathcal{P} = \{P_\theta \mid \theta \in \Theta\}, \qquad \Theta \subset \mathbb{R}^k \ \text{(open)},$$
defined on some measurable space $(\Omega, \mathcal{A})$ and smoothly parameterized; i.e., $L_2$ differentiable at $\theta \in \Theta$ with $L_2$ derivative $\Lambda_\theta \in L_2^k(P_\theta)$, $E_\theta \Lambda_\theta = 0$, and Fisher information of full rank
$$I_\theta = E_\theta \Lambda_\theta \Lambda_\theta^\tau \succ 0.$$

Influence Curves (ICs)

Definition. The set $\Psi_2(\theta)$ of all square integrable ICs at $P_\theta$ consists of all $\psi_\theta \in L_2^k(P_\theta)$ which are
- centered: $E_\theta \psi_\theta = 0$, and
- Fisher consistent: $E_\theta \psi_\theta \Lambda_\theta^\tau = I_k$,
where $I_k$ is the $k$-dimensional identity matrix.
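
As a small numerical illustration of the two defining conditions, the following sketch checks them for the classical IC $\hat\psi = I^{-1}\Lambda$ in the binomial model Binom(m, p) with parameter p. The choice of model, the helper names and the values m = 25, p = 0.25 are assumptions made for this illustration only.

```python
from math import comb

def binom_pmf(m, p):
    """Probabilities P(X = x), x = 0..m, under Binom(m, p)."""
    return [comb(m, x) * p**x * (1 - p)**(m - x) for x in range(m + 1)]

def score(m, p):
    """L2 derivative (score) of Binom(m, p) with respect to p, on x = 0..m."""
    return [(x - m * p) / (p * (1 - p)) for x in range(m + 1)]

def expect(values, pmf):
    """Expectation of a function given by its values on x = 0..m."""
    return sum(v * w for v, w in zip(values, pmf))

m, p = 25, 0.25
pmf = binom_pmf(m, p)
lam = score(m, p)
fisher = expect([l * l for l in lam], pmf)   # I = E Lambda^2 = m / (p (1 - p))
psi = [l / fisher for l in lam]              # classical IC: I^{-1} Lambda

centered = expect(psi, pmf)                                         # should be 0
fisher_consistent = expect([a * b for a, b in zip(psi, lam)], pmf)  # should be 1
print(centered, fisher_consistent)
```

Both printed values should agree with 0 and 1 up to floating-point error.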
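
AL Estimators

Definition. An asymptotic estimator $S_n \colon (\Omega^n, \mathcal{A}^n) \to (\mathbb{R}^k, \mathbb{B}^k)$ is called asymptotically linear at $P_\theta$ if there is an IC $\psi_\theta \in \Psi_2(\theta)$ with
$$S_n = \theta + \frac{1}{n} \sum_{i=1}^{n} \psi_\theta(y_i) + o_{P_\theta^n}(n^{-1/2}).$$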
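
Infinitesimal Neighborhoods

Convex contamination neighborhood (gross error model):
$$U_c(\theta, r_n) = \bigl\{ (1 - r_n)_+ P_\theta + (1 \wedge r_n)\, Q \bigm| Q \in \mathcal{M}_1(\mathcal{A}) \bigr\},$$
where $\mathcal{M}_1(\mathcal{A})$ is the set of all probability measures on $\mathcal{A}$. The radius $r_n := r / \sqrt{n}$ shrinks with the sample size $n \in \mathbb{N}$, where $r \in [0, \infty)$.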
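
To make the gross error model concrete, here is a minimal sketch that draws a sample from one element of $U_c(\theta, r_n)$. The ideal distribution N(0, 1), the contaminating Dirac measure at 10 and the function name `sample_contaminated` are assumptions chosen for illustration.

```python
import random

def sample_contaminated(n, r, ideal, contaminant, rng):
    """Draw n observations from (1 - r_n) P + r_n Q with r_n = min(1, r / sqrt(n))."""
    r_n = min(1.0, r / n**0.5)
    sample = [contaminant(rng) if rng.random() < r_n else ideal(rng) for _ in range(n)]
    return sample, r_n

rng = random.Random(1)
ideal = lambda g: g.gauss(0.0, 1.0)   # assumed ideal model P_theta = N(0, 1)
contaminant = lambda g: 10.0          # assumed contamination Q = Dirac measure at 10
sample, r_n = sample_contaminated(400, 0.5, ideal, contaminant, rng)
print(r_n, sum(1 for y in sample if y == 10.0))
```

With n = 400 and r = 0.5 the contamination probability is r_n = 0.025, so on average 10 of the 400 observations come from Q.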
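
Unique asymptotic MSE Solution

Theorem 5.5.7 (b), Rieder (1994):
$$\tilde\eta_\theta = (A_\theta \Lambda_\theta - a_\theta)\, w, \qquad w = \min\Bigl\{ 1, \frac{b_\theta}{|A_\theta \Lambda_\theta - a_\theta|} \Bigr\},$$
with Lagrange multipliers $A_\theta$, $a_\theta$ and $b_\theta$ determined by
$$0 = E_\theta (\Lambda_\theta - z_\theta)\, w, \qquad a_\theta = A_\theta z_\theta \tag{1}$$
$$I_k = A_\theta\, E_\theta (\Lambda_\theta - z_\theta)(\Lambda_\theta - z_\theta)^\tau w \tag{2}$$
$$r^2 b_\theta = E_\theta \bigl( |A_\theta \Lambda_\theta - a_\theta| - b_\theta \bigr)_+ \tag{3}$$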
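
As a worked instance of equations (1) to (3), the sketch below solves them in the one-dimensional normal location model N(θ, 1), which is an assumption chosen here because everything is available in closed form. There $\Lambda_\theta(x) = x - \theta$, symmetry gives $z_\theta = a_\theta = 0$, and with $c = b_\theta / A_\theta$ equation (3) becomes $r^2 c = 2(\varphi(c) - c(1 - \Phi(c)))$ while (2) becomes $1 = A_\theta (2\Phi(c) - 1)$. The function names are hypothetical.

```python
from math import erf, exp, pi, sqrt

def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def clipping_equation(c, r):
    """r^2 c - E(|x| - c)_+ for x ~ N(0, 1); its root is c = b / A in equation (3)."""
    return r * r * c - 2.0 * (phi(c) - c * (1.0 - Phi(c)))

def multipliers_normal_location(r, lo=1e-8, hi=10.0, tol=1e-12):
    """Solve (1)-(3) for N(theta, 1) and r > 0 by bisection; returns (A, a, b)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if clipping_equation(mid, r) < 0.0:   # the equation is increasing in c
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    A = 1.0 / (2.0 * Phi(c) - 1.0)            # from (2): 1 = A E[x^2 min(1, c/|x|)]
    return A, 0.0, A * c

A, a, b = multipliers_normal_location(0.5)
print(A, a, b, b / A)
```

For r = 0.5 this gives a clipping height c = b/A of roughly 0.86 and A of roughly 1.64.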
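
Extension of classical Cramér-Rao bound

Proposition 2.1.1, Kohl (2005):
$$\text{(as.)}\operatorname{maxMSE}(\eta_\theta, r) \ge \text{(as.)}\operatorname{maxMSE}(\tilde\eta_\theta, r) = \operatorname{tr} A_\theta$$
Classical Cramér-Rao bound:
$$\operatorname{Cov}(\eta_\theta) \succeq \operatorname{Cov}(\hat\psi_\theta) = I_\theta^{-1}, \qquad \text{where } \hat\psi_\theta = I_\theta^{-1} \Lambda_\theta.$$
Hence
$$\operatorname{MSE}(\eta_\theta) = \operatorname{tr} \operatorname{Cov}(\eta_\theta) \ge \operatorname{tr} \operatorname{Cov}(\hat\psi_\theta) = \operatorname{tr} I_\theta^{-1} = \operatorname{MSE}(\hat\psi_\theta).$$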
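
The bound can be explored numerically. The sketch below, again under the assumption of the normal location model N(θ, 1), evaluates the asymptotic maximum MSE $E|\eta|^2 + r^2 \sup|\eta|^2$ of Huber-type ICs $\eta_c(x) = A_c \max(-c, \min(c, x))$ over a grid of clipping heights c (the grid and names are hypothetical) and compares the smallest value with the classical bound $\operatorname{tr} I^{-1} = 1$.

```python
from math import erf, exp, pi, sqrt

def phi(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def Phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def max_mse(c, r):
    """E|eta_c|^2 + r^2 sup|eta_c|^2 for eta_c(x) = A_c * clip(x, -c, c) under N(0, 1)."""
    A_c = 1.0 / (2.0 * Phi(c) - 1.0)   # standardization making eta_c Fisher consistent
    # E min(x^2, c^2) for x ~ N(0, 1)
    second_moment = (2.0 * Phi(c) - 1.0) - 2.0 * c * phi(c) + 2.0 * c * c * (1.0 - Phi(c))
    return A_c**2 * (second_moment + r * r * c * c)

r = 0.5
grid = [0.05 * j for j in range(1, 61)]   # clipping heights 0.05, 0.10, ..., 3.00
risks = [max_mse(c, r) for c in grid]
best_c = grid[risks.index(min(risks))]
print(best_c, min(risks))                 # compare with the classical bound 1
```

Every value on the grid exceeds 1, and the smallest one is attained close to the clipping height that solves equation (3) for this radius.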

Estimator Construction

One-step construction of the optimally robust estimator $\tilde S_n$:
$$\tilde S_n = \hat\theta + \frac{1}{n} \sum_{i=1}^{n} \tilde\eta_{\hat\theta}(y_i),$$
where $\hat\theta$ is a uniformly consistent starting estimator. Examples for initial estimators: Kolmogorov or Cramér-von Mises minimum distance estimators; cf. Rieder (1994).
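
A minimal sketch of the one-step construction, assuming the normal location model N(θ, 1) so that $\tilde\eta_\theta(y) = A(y - \theta) \min\{1, b / |A(y - \theta)|\}$ with a = 0. For simplicity the sample median stands in for the minimum distance starting estimators named above (a swapped-in choice, not the one from the slides), and the multipliers A = 1.636, b = 1.410 are approximate numerical solutions of (1) to (3) for r = 0.5 in this assumed model.

```python
import random
from statistics import median

def one_step(sample, A, b, start=median):
    """One-step estimate for the N(theta, 1) location model, where a = 0."""
    theta0 = start(sample)                        # starting estimate
    def eta(y):
        # (A Lambda) * min(1, b / |A Lambda|) with Lambda = y - theta0
        return max(-b, min(b, A * (y - theta0)))
    return theta0 + sum(eta(y) for y in sample) / len(sample)

# Assumed multipliers for N(theta, 1) at r = 0.5 (approximate values).
A, b = 1.636, 1.410

rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(190)] + [20.0] * 10   # 5% gross errors
print(one_step(sample, A, b), sum(sample) / len(sample))
```

The first printed value is the one-step estimate, the second the sample mean, which the ten outliers pull away from the true location 0.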
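
Setup

Let
$$\mathcal{P}_\nu = \{P_{\nu,\theta} \mid \theta \in \Theta_\nu\} \subset \mathcal{M}_1(\mathcal{A}_\nu), \qquad \Theta_\nu \subset \mathbb{R}^k \ \text{(open)}, \qquad (\nu \in \mathbb{N}_0)$$
be a sequence of $L_2$-differentiable parametric families with $L_2$ derivatives $\Lambda_{\nu,\theta}$ and Fisher information of full rank $I_{\nu,\theta}$, and consider
$$U_{\nu,c}(\theta, r_n) = \bigl\{ (1 - r_n)_+ P_{\nu,\theta} + (1 \wedge r_n)\, Q_\nu \bigm| Q_\nu \in \mathcal{M}_1(\mathcal{A}_\nu) \bigr\}.$$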

Questions

Q1: $\mathcal{P}_\nu \approx \mathcal{P}_0$ for $\nu$ sufficiently large?

or even

Q2: $U_{\nu,c}(\theta_\nu, r_n) \approx U_{0,c}(\theta_0, r_n)$ for $\nu$ sufficiently large?

Q1: Convergence of Experiments; cf. Le Cam (1986).
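
Q2: Assumptions

Assume
$$\mathcal{L}_{P_{\nu,\theta_\nu}}\bigl( \gamma_\nu^{-1} G_\nu \Lambda_{\nu,\theta_\nu} \bigr) \xrightarrow{\ w\ } \mathcal{L}_{P_{0,\theta_0}}\bigl( \Lambda_{0,\theta_0} \bigr) \quad \text{as } \nu \to \infty \tag{4}$$
$$\gamma_\nu^{-2} \operatorname{tr} I_{\nu,\theta_\nu} \longrightarrow \operatorname{tr} I_{0,\theta_0} \quad \text{as } \nu \to \infty \tag{5}$$
for $(\gamma_\nu)_{\nu \in \mathbb{N}} \subset (0, \infty)$ and orthogonal $(G_\nu)_{\nu \in \mathbb{N}} \subset \mathbb{R}^{k \times k}$.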
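
Q2: Convergence of Robust Models

Theorem 2.4.1, Kohl (2005). Assume (4) and (5) and denote the Lagrange multipliers contained in the MSE optimal solution $\tilde\eta_{\nu,\theta_\nu}$ by $A_{\nu,\theta_\nu}$, $a_{\nu,\theta_\nu}$ and $b_{\nu,\theta_\nu}$. Then,
$$\lim_{\nu \to \infty} \gamma_\nu^2 \operatorname{tr} A_{\nu,\theta_\nu} = \operatorname{tr} A_{0,\theta_0}, \qquad \lim_{\nu \to \infty} \gamma_\nu b_{\nu,\theta_\nu} = b_{0,\theta_0}.$$
In case $A_{0,\theta_0}$ and $a_{0,\theta_0}$ are unique, then also
$$\lim_{\nu \to \infty} \gamma_\nu^2\, G_\nu^\tau A_{\nu,\theta_\nu} G_\nu = A_{0,\theta_0} \qquad \text{and} \qquad \lim_{\nu \to \infty} \gamma_\nu\, G_\nu^\tau a_{\nu,\theta_\nu} = a_{0,\theta_0}.$$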
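
Approximations

For $\nu$ sufficiently large, we have
$$\gamma_\nu^2 \operatorname{tr} A_{\nu,\theta_\nu} \approx \operatorname{tr} A_{0,\theta_0} \iff \operatorname{tr} A_{\nu,\theta_\nu} \approx \gamma_\nu^{-2} \operatorname{tr} A_{0,\theta_0}$$
$$\gamma_\nu b_{\nu,\theta_\nu} \approx b_{0,\theta_0} \iff b_{\nu,\theta_\nu} \approx \gamma_\nu^{-1} b_{0,\theta_0}$$
$$\gamma_\nu^2\, G_\nu^\tau A_{\nu,\theta_\nu} G_\nu \approx A_{0,\theta_0} \iff A_{\nu,\theta_\nu} \approx \gamma_\nu^{-2}\, G_\nu A_{0,\theta_0} G_\nu^\tau$$
$$\gamma_\nu\, G_\nu^\tau a_{\nu,\theta_\nu} \approx a_{0,\theta_0} \iff a_{\nu,\theta_\nu} \approx \gamma_\nu^{-1}\, G_\nu a_{0,\theta_0}$$
Note: To obtain an exact and not only an approximate IC, we re-center and re-standardize the approximation.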
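
Each equivalence uses only that $\gamma_\nu > 0$ and that $G_\nu$ is orthogonal, i.e. $G_\nu G_\nu^\tau = G_\nu^\tau G_\nu = I_k$. Spelled out for the matrix and vector multipliers, as a short worked step:

```latex
\begin{align*}
\gamma_\nu^2\, G_\nu^\tau A_{\nu,\theta_\nu} G_\nu \approx A_{0,\theta_0}
  &\iff G_\nu \bigl( G_\nu^\tau A_{\nu,\theta_\nu} G_\nu \bigr) G_\nu^\tau
        \approx \gamma_\nu^{-2}\, G_\nu A_{0,\theta_0} G_\nu^\tau \\
  &\iff A_{\nu,\theta_\nu} \approx \gamma_\nu^{-2}\, G_\nu A_{0,\theta_0} G_\nu^\tau, \\[4pt]
\gamma_\nu\, G_\nu^\tau a_{\nu,\theta_\nu} \approx a_{0,\theta_0}
  &\iff G_\nu G_\nu^\tau a_{\nu,\theta_\nu} \approx \gamma_\nu^{-1}\, G_\nu a_{0,\theta_0}
   \iff a_{\nu,\theta_\nu} \approx \gamma_\nu^{-1}\, G_\nu a_{0,\theta_0}.
\end{align*}
```

The trace statement is consistent with the matrix statement because $\operatorname{tr}(G_\nu^\tau A_{\nu,\theta_\nu} G_\nu) = \operatorname{tr}(A_{\nu,\theta_\nu} G_\nu G_\nu^\tau) = \operatorname{tr} A_{\nu,\theta_\nu}$.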
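
MSE-Inefficiency

Let $\chi$ be the re-centered and re-standardized approximating IC. The MSE-Inefficiency of $\chi$ at radius $r \in [0, \infty)$ is defined as
$$\operatorname{relMSE}_{\theta_0}(\chi, r) = \frac{E_{\theta_0} |\chi|^2 + r^2 \bigl[ \sup_{P_{\theta_0}} |\chi| \bigr]^2}{\operatorname{tr} A_{0,\theta_0}}$$
respectively as
$$\operatorname{relMSE}_{\theta_\nu}(\chi, r) = \frac{E_{\theta_\nu} |\chi|^2 + r^2 \bigl[ \sup_{P_{\theta_\nu}} |\chi| \bigr]^2}{\operatorname{tr} A_{\nu,\theta_\nu}}.$$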
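
The following sketch shows how such an MSE-inefficiency could be computed for the binomial model Binom(m, p) (parameter p) approximated by the normal location model, using the scaling $\gamma_m = \sqrt{m / (p(1-p))}$, $G_m = 1$. Several parts are assumptions made for this illustration: the nested bisection solver for (1) to (3) with k = 1 (which presumes the clipping height exceeds half the spacing of the score values), the affine way of re-centering and re-standardizing (which need not coincide with the construction in Kohl (2005)), and all function names.

```python
from math import comb, erf, exp, pi, sqrt

def bisect(f, lo, hi, iters=100):
    """Root of a decreasing function f with f(lo) >= 0 >= f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clip(u, c):
    return max(-c, min(c, u))

def mean(values, pmf):
    return sum(v * w for v, w in zip(values, pmf))

def optimal_multipliers(lam, pmf, r):
    """Solve (1)-(3) for k = 1 on a finite sample space, r > 0; returns (A, a, b)."""
    def z_of(c):                               # equation (1) for fixed c = b / A
        return bisect(lambda z: mean([clip(l - z, c) for l in lam], pmf), min(lam), max(lam))
    def eq3(c):                                # equation (3) divided by A
        z = z_of(c)
        return mean([max(abs(l - z) - c, 0.0) for l in lam], pmf) - r * r * c
    c = bisect(eq3, 1e-9, max(lam) - min(lam))
    z = z_of(c)
    A = 1.0 / mean([(l - z) * clip(l - z, c) for l in lam], pmf)   # equation (2)
    return A, A * z, A * c

def max_mse(ic, pmf, r):
    """E|ic|^2 + r^2 (sup|ic|)^2 on a finite sample space with full support."""
    return mean([v * v for v in ic], pmf) + r * r * max(abs(v) for v in ic) ** 2

def normal_multipliers(r):
    """Multipliers (A0, a0, b0) of the limit model N(theta, 1), r > 0."""
    phi = lambda c: exp(-0.5 * c * c) / sqrt(2.0 * pi)
    Phi = lambda c: 0.5 * (1.0 + erf(c / sqrt(2.0)))
    c = bisect(lambda c: 2.0 * (phi(c) - c * (1.0 - Phi(c))) - r * r * c, 1e-9, 10.0)
    A0 = 1.0 / (2.0 * Phi(c) - 1.0)
    return A0, 0.0, A0 * c

def rel_mse_normal_for_binomial(m, p, r):
    pmf = [comb(m, x) * p**x * (1 - p)**(m - x) for x in range(m + 1)]
    lam = [(x - m * p) / (p * (1 - p)) for x in range(m + 1)]   # binomial score in p
    gamma = sqrt(m / (p * (1 - p)))                             # normal scaling, G_m = 1
    A0, a0, b0 = normal_multipliers(r)
    A, a, b = A0 / gamma**2, a0 / gamma, b0 / gamma             # approximate multipliers
    eta = [clip(A * l - a, b) for l in lam]                     # approximate IC
    eta_mean = mean(eta, pmf)
    centered = [e - eta_mean for e in eta]                      # re-center: E chi = 0
    scale = mean([c * l for c, l in zip(centered, lam)], pmf)
    chi = [c / scale for c in centered]                         # re-standardize: E chi Lambda = 1
    A_exact, _, _ = optimal_multipliers(lam, pmf, r)
    return max_mse(chi, pmf, r) / A_exact

print(rel_mse_normal_for_binomial(25, 0.25, 0.25))
```

Because χ is itself an IC, the ratio cannot fall below 1 (up to numerical error); values close to 1 indicate that little is lost by using the approximating model.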

Example 1: Normal Approximation of Binomial

Normal approximation ("$mp(1-p) \ge 9$"):
$$\operatorname{Binom}(m, p) \approx \mathcal{N}\bigl( mp,\ mp(1-p) \bigr)$$
We have $\nu = m$ and
$$\gamma_m = \sqrt{\frac{m}{p(1-p)}} \qquad \text{and} \qquad G_m = 1;$$
cf. Lemma 3.2.2 in Kohl (2005).
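
The scaling can be checked numerically. With the binomial score $\Lambda_{m,p}(x) = (x - mp) / (p(1-p))$ and Fisher information $m / (p(1-p))$, the scaled information $\gamma_m^{-2} I_{m,p}$ equals 1, the Fisher information of N(θ, 1). The sketch below measures how close the law of $\gamma_m^{-1} \Lambda_{m,p}$ is to N(0, 1) via the Kolmogorov distance; the use of this particular distance and the function names are assumptions for illustration.

```python
from math import erf, exp, lgamma, log, sqrt

def binom_pmf(x, m, p):
    """P(X = x) under Binom(m, p), 0 < p < 1, computed on the log scale."""
    log_pmf = (lgamma(m + 1) - lgamma(x + 1) - lgamma(m - x + 1)
               + x * log(p) + (m - x) * log(1 - p))
    return exp(log_pmf)

def kolmogorov_distance_to_normal(m, p):
    """Sup-distance between the law of gamma_m^{-1} Lambda_{m,p} and N(0, 1)."""
    gamma = sqrt(m / (p * (1 - p)))
    Phi = lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0)))
    cdf, dist = 0.0, 0.0
    for x in range(m + 1):
        t = (x - m * p) / (p * (1 - p)) / gamma   # scaled score at x
        dist = max(dist, abs(cdf - Phi(t)))       # just before the jump at t
        cdf += binom_pmf(x, m, p)
        dist = max(dist, abs(cdf - Phi(t)))       # at the jump
    return dist

for m in (25, 100, 400):
    print(m, kolmogorov_distance_to_normal(m, 0.25))
```

The printed distances shrink as m grows, in line with the weak convergence required in assumption (4).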
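
Example 2: Poisson Approximation of Binomial

Poisson approximation: Let $\lambda = mp$ ("$p$ small, $m$ large"). Then,
$$\operatorname{Binom}(m, p) \approx \operatorname{Pois}(\lambda)$$
We have $\nu = m$ and
$$\gamma_m = \frac{m}{1-p} \qquad \text{and} \qquad G_m = 1;$$
cf. Lemma 4.2.4 in Kohl (2005).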
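
A quick check of this scaling, assuming the binomial model is parameterized by p and the Poisson model by λ: the scaled binomial score $\gamma_m^{-1} \Lambda_{m,p}(x) = (x - mp)/(mp)$ coincides with the Poisson score $x/\lambda - 1$ at $\lambda = mp$, and the scaled Fisher information is $(1-p)/\lambda$, which tends to the Poisson information $1/\lambda$ as $p \to 0$. The function names and the values m = 25, p = 0.05 are hypothetical.

```python
def binom_score_scaled(x, m, p):
    """gamma_m^{-1} Lambda_{m,p}(x) with gamma_m = m / (1 - p)."""
    gamma = m / (1 - p)
    return (x - m * p) / (p * (1 - p)) / gamma

def poisson_score(x, lam):
    """L2 derivative of Pois(lambda) with respect to lambda."""
    return x / lam - 1.0

m, p = 25, 0.05
lam = m * p
gamma = m / (1 - p)
max_gap = max(abs(binom_score_scaled(x, m, p) - poisson_score(x, lam))
              for x in range(m + 1))
# gamma^{-2} I_{m,p} relative to the Poisson information 1 / lambda; equals 1 - p
info_ratio = (m / (p * (1 - p))) / gamma**2 * lam
print(max_gap, info_ratio)
```

The score gap is zero up to rounding, so the only discrepancy between the two scaled models lies in the distributions themselves and in the factor 1 − p of the information.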

Examples 1+2: Approximation of ICs for r = 0.25

[Figure: four panels, one per binomial model, each plotting the IC against x (0 to 25) for the binomial model (Binom) and its normal (Norm) and Poisson (Pois) approximations.]

- Binom(25, 0.05): m*p(1−p) = 1.188, relMSE(P, B) = 1, relMSE(N, B) = 1.001
- Binom(25, 0.25): m*p(1−p) = 4.688, relMSE(P, B) = 1.012, relMSE(N, B) = 1.002
- Binom(25, 0.5): m*p(1−p) = 6.25, relMSE(P, B) = 1.046, relMSE(N, B) = 1
- Binom(25, 0.99): m*p(1−p) = 0.248, relMSE(P, B) = 3.301, relMSE(N, B) = 1
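
Example 3: Exponential Scale and Gumbel Location

It holds
$$\mathcal{L}_{\mathrm{Gum}(0,1)}\bigl( -\Lambda^{\mathrm{loc}}_{(0,1)} \bigr) = \mathcal{L}_{\mathrm{Exp}(1)}\bigl( \Lambda^{\mathrm{sc}}_{1} \bigr), \qquad I^{\mathrm{loc}}_{\mathrm{Gum}(0,1)} = I^{\mathrm{sc}}_{\mathrm{Exp}(1)},$$
i.e., $\gamma_\nu \equiv 1$ and $G_\nu \equiv -1$; cf. Section 5.2.1 in Kohl (2005). Hence,
$$\operatorname{relMSE}(\chi, r) \equiv 1.$$
Note: Further examples are given in Kohl (2005).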
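
The identity of the two laws can be verified through quantiles. The sketch assumes the maximum-type Gumbel distribution with distribution function $\exp(-e^{-x})$, for which the location score at (0, 1) is $1 - e^{-x}$, and the exponential scale model with score $y - 1$ at scale 1; both conventions and the function names are assumptions of this illustration. Since $x \mapsto e^{-x}$ is decreasing, the u-quantile of Gum(0, 1) corresponds to the (1 − u)-quantile of Exp(1).

```python
from math import exp, log

def gumbel_location_score(x):
    """L2 derivative of the Gumbel location model at (0, 1): 1 - exp(-x)."""
    return 1.0 - exp(-x)

def exp_scale_score(y):
    """L2 derivative of the exponential scale model at scale 1: y - 1."""
    return y - 1.0

def gumbel_quantile(u):
    """Quantile function of Gum(0, 1) with cdf exp(-exp(-x))."""
    return -log(-log(u))

def exp_quantile(u):
    """Quantile function of Exp(1)."""
    return -log(1.0 - u)

levels = [j / 100 for j in range(1, 100)]
gap = max(abs(-gumbel_location_score(gumbel_quantile(u))
              - exp_scale_score(exp_quantile(1.0 - u))) for u in levels)
print(gap)
```

The gap vanishes up to rounding at every quantile level, which is the statement $\gamma_\nu \equiv 1$, $G_\nu \equiv -1$ in quantile form.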

Summary

Convergence of Experiments | Convergence of Robust Models
$\mathcal{P}_\nu \approx \mathcal{P}_0$ | $U_{\nu,c}(\theta_\nu, r_n) \approx U_{0,c}(\theta_0, r_n)$
arbitrary loss functions | $L_2$-risk
arbitrary estimators | optimally robust estimators
computation of distance? | MSE-inefficiency computable

Outlook

- Extension to convex risks; cf. Ruckdeschel and Rieder (2004).
- Find/consider further (multivariate) examples.
- Application: If the optimal IC is hard to compute in one model, try to find an approximating model where the optimal IC is easier to compute and use it as an approximation.

Bibliography

M. Kohl. Numerical Contributions to the Asymptotic Theory of Robustness. Dissertation, University of Bayreuth, 2005.
L. Le Cam. Asymptotic Methods in Statistical Decision Theory. Springer, 1986.
H. Rieder. Robust Asymptotic Statistics. Springer, 1994.
P. Ruckdeschel and H. Rieder. Optimal influence curves for general loss functions. Stat. Decis., 22: 201–223, 2004.