Communications in Statistics—Theory and Methods, 38: 2210–2213, 2009 Copyright © Taylor & Francis Group, LLC ISSN: 0361-0926 print/1532-415X online DOI: 10.1080/03610920802521190

Comparison of Two Estimation Methods of Missing Values Using Pitman-Closeness Criterion

ABDELGHANI HAMAZ AND MOHAMED IBAZIZEN
Department of Mathematics, University Mouloud Mammeri, Tizi-Ouzou, Algeria

In this article, a proof is given that the linear interpolator is Pitman-closer than the linear predictor with respect to a missing value of a stationary first-order autoregressive process.

Keywords Autoregressive model; Missing value; Pitman-closeness.

Mathematics Subject Classification 62M10; 62F35.

1. Introduction

An important problem in time series analysis is the accurate estimation of missing values, that is, values which for some reason cannot be observed. Several methods for replacing missing values have been developed in the literature. Pourahmadi (1989) proposed a complete solution for estimating the missing values of a stationary time series, obtained by decomposing the problem into a prediction problem plus a regression problem. He obtained the best linear interpolator (LI) of the missing value with respect to the mean square error. In the particular case of the AR(1) model

$$X_t = \phi X_{t-1} + \varepsilon_t, \qquad t = 1, \ldots, n, \quad |\phi| < 1, \tag{1.1}$$

where $X_t$ is the observation at time $t$ and the $\varepsilon_t$'s are a sequence of white noise, it consists of replacing a missing value $X_k$ by the well-known formula $\frac{\phi}{1+\phi^2}\,(X_{k-1} + X_{k+1})$ (see Brockwell and Davis, 1991).

Roy and Chakraborty (2006) considered the model (1.1) and compared two methods: the average replacement method, which consists of replacing a missing value $X_k$ by the arithmetic mean of the $n-1$ other observed values, $\sum_{t=1,\, t \neq k}^{n} X_t/(n-1)$, and the forecast replacement method (FR), which consists of substituting for $X_k$ its predicted value based on $X_1, \ldots, X_{k-1}$, i.e., $\phi X_{k-1}$.

Received April 8, 2008; Accepted October 1, 2008
Address correspondence to Mohamed Ibazizen, Department of Mathematics, University Mouloud Mammeri, Tizi-Ouzou 15000, Algeria; E-mail: [email protected]


In this article, we compare the LI method and the FR method with respect to another quality measure which is known as Pitman-closeness. In the particular case of an autoregressive process of order one, we prove the superiority of the interpolation replacement method over the forecast replacement method.
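The three replacement rules mentioned in the Introduction can be illustrated with a minimal Python sketch. It assumes the AR(1) coefficient is known, the noise is standard normal, and the series is stored in a 0-based list; the function names (`simulate_ar1`, `average_replacement`, `forecast_replacement`, `interpolation_replacement`) are hypothetical and do not come from the article.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate X_t = phi * X_{t-1} + eps_t with standard normal noise (assumption)."""
    rng = random.Random(seed)
    # Draw the first value from the stationary law N(0, 1 / (1 - phi^2)).
    x = [rng.gauss(0.0, 1.0) / (1.0 - phi * phi) ** 0.5]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def average_replacement(x, k):
    """Arithmetic mean of the n - 1 values other than x[k]."""
    others = x[:k] + x[k + 1:]
    return sum(others) / len(others)


def forecast_replacement(x, k, phi):
    """One-step forecast phi * X_{k-1}."""
    return phi * x[k - 1]


def interpolation_replacement(x, k, phi):
    """Linear interpolator phi / (1 + phi^2) * (X_{k-1} + X_{k+1})."""
    return phi / (1.0 + phi * phi) * (x[k - 1] + x[k + 1])


if __name__ == "__main__":
    phi, n, k = 0.6, 200, 100  # illustrative values only
    series = simulate_ar1(phi, n)
    print("true value    :", series[k])
    print("average       :", average_replacement(series, k))
    print("forecast      :", forecast_replacement(series, k, phi))
    print("interpolation :", interpolation_replacement(series, k, phi))
```

A single simulated path says nothing about which rule is better on average; it only shows how each replacement value is computed from the neighbouring observations.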

2. Pitman-Closeness

Definition 2.1. Let $\hat\theta_1$ and $\hat\theta_2$ be two estimators of an unknown parameter $\theta \in \Theta$, where $\Theta$ denotes the parameter space. Then $\hat\theta_1$ is Pitman-closer (with respect to $\theta$) than $\hat\theta_2$ if and only if

$$P\big(|\hat\theta_1 - \theta| < |\hat\theta_2 - \theta|\big) \ge \frac{1}{2},$$

with strict inequality for at least one $\theta \in \Theta$.

In the following, as in Wenzel (2002), the idea of Pitman-closeness will be applied to estimators of a missing value. Note that the parameter to be estimated in estimation theory is an unknown but fixed value, whereas in estimating a missing value, the quantity to be estimated is random. In the latter case, we are interested in finding out whether a certain estimator of the missing data is Pitman-closer than another.

Definition 2.2. Suppose the observation $X_k$ is missing. Let $F_{1k}$ and $F_{2k}$ be two estimators of the variable $X_k$. Then $F_{1k}$ is Pitman-closer than $F_{2k}$ (with respect to $X_k$) if and only if

$$P\big(|X_k - F_{1k}| < |X_k - F_{2k}|\big) > \frac{1}{2}.$$
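The probability in Definition 2.2 can be approximated empirically when paired draws of the target and of both estimators are available. The following is a minimal sketch under that assumption; the helper name `pitman_closeness` is hypothetical and not part of the article.

```python
def pitman_closeness(truth, est1, est2):
    """Fraction of draws in which est1 is strictly closer to truth than est2.

    This is an empirical stand-in for P(|X - F1| < |X - F2|).
    """
    if not (len(truth) == len(est1) == len(est2)) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    wins = sum(abs(t - a) < abs(t - b) for t, a, b in zip(truth, est1, est2))
    return wins / len(truth)


if __name__ == "__main__":
    # Toy numbers chosen only to show the call; they are not data from the article.
    truth = [0.0, 1.0, 2.0, 3.0]
    est1 = [0.1, 1.5, 2.0, 2.0]
    est2 = [0.5, 1.2, 2.3, 3.1]
    print(pitman_closeness(truth, est1, est2))
```

A value above 1/2 over many draws would suggest that the first estimator is Pitman-closer in the sense of Definition 2.2; ties are counted against the first estimator because the inequality is strict.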

3. Two Replacement Methods of a Single Missing Value

Let us briefly introduce two approaches to estimating a single missing value. The first is called the “replacement interpolation method”. Suppose that $X_k$ is the missing value. Let $H_k^n$ be the space of linear interpolators of $X_k$ based on the observations $\{X_t;\ t \le n,\ t \neq k\}$, i.e.,

$$H_k^n = \overline{\mathrm{sp}}\{X_t;\ t \le n,\ t \neq k\},$$

where $\overline{\mathrm{sp}}\{A\}$ denotes the closed linear span of the elements of the set $A$. Let $\tilde X_k$ be the orthogonal projection of $X_k$ on $H_k^n$. For $j \ge 0$, the $(j+1)$-step-ahead predictor of $X_{k+j}$ based on the infinite past $\{X_{k-1}, X_{k-2}, \ldots\}$ is denoted by $\hat X_{k+j}$, and is the orthogonal projection of $X_{k+j}$ onto $H_{k-1} = \overline{\mathrm{sp}}\{X_t;\ t \le k-1\}$. The following result, due to Pourahmadi (1989), gives the formula of the best linear interpolator of $X_k$.

Theorem 3.1. Let $\{X_t\}$ be a nondeterministic stationary process with AR parameters $\{a_j\}$ and let $\tilde X_k$ be the best linear interpolator of $X_k$ based on $\{X_t;\ t \le n,\ t \neq k\}$. Then

$$\tilde X_k = \hat X_k + \sum_{j=k+1}^{n} c_{j-k,n}\,\big(X_j - \hat X_j\big), \tag{3.1}$$

where

$$c_{j,n} = \Big(1 + \sum_{i=1}^{n-k} a_i^2\Big)^{-1} \Big(a_j - \sum_{i=1}^{n-k-j} a_i a_{i+j}\Big), \qquad j = 1, 2, \ldots, n-k,$$

and

$$\mathrm{var}\big(X_k - \tilde X_k\big) = \sigma^2 \Big(1 + \sum_{i=k+1}^{n} a_{i-k}^2\Big)^{-1}.$$

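In the particular case of the AR(1) model as in (1.1), we have

$$a_1 = \phi, \qquad a_j = 0 \quad \forall j > 1. \tag{3.2}$$

Thus,

$$c_{1,n} = \Big(1 + \sum_{i=1}^{n-k} a_i^2\Big)^{-1} \Big(a_1 - \sum_{i=1}^{n-k-1} a_i a_{i+1}\Big) = \frac{\phi}{1+\phi^2}, \qquad c_{j,n} = 0 \quad \forall j > 1. \tag{3.3}$$

Moreover,

$$\hat X_{k+i} = \phi^{\,i+1} X_{k-1}, \qquad i \ge 0. \tag{3.4}$$

From (3.1)–(3.4), we deduce

$$\tilde X_k = \phi X_{k-1} + \frac{\phi}{1+\phi^2}\,\big(X_{k+1} - \phi^2 X_{k-1}\big) = \frac{\phi}{1+\phi^2}\,\big(X_{k-1} + X_{k+1}\big). \tag{3.5}$$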
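The weight $\phi/(1+\phi^2)$ in (3.5) can be cross-checked numerically by projecting $X_k$ directly on its two neighbours. The sketch below is not the route taken in the article: it assumes unit innovation variance, uses the standard AR(1) autocovariance $\gamma(h) = \phi^{|h|}/(1-\phi^2)$, and restricts the projection to $X_{k-1}$ and $X_{k+1}$ only. The function names are hypothetical.

```python
def ar1_autocovariance(phi, lag):
    """Autocovariance of a stationary AR(1) process with unit innovation variance."""
    return phi ** abs(lag) / (1.0 - phi * phi)


def two_neighbour_weights(phi):
    """Solve the normal equations for projecting X_k on (X_{k-1}, X_{k+1}).

    The system is
        g0 * a + g2 * b = g1
        g2 * a + g0 * b = g1
    and is solved here with Cramer's rule.
    """
    g0, g1, g2 = (ar1_autocovariance(phi, h) for h in (0, 1, 2))
    det = g0 * g0 - g2 * g2
    a = (g1 * g0 - g2 * g1) / det
    b = (g0 * g1 - g2 * g1) / det
    return a, b


if __name__ == "__main__":
    for phi in (0.3, -0.7, 0.9):  # illustrative values only
        a, b = two_neighbour_weights(phi)
        print(phi, a, b, phi / (1.0 + phi * phi))
```

Both weights agree with $\phi/(1+\phi^2)$ up to rounding, which is consistent with (3.5) for the values tried.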

The second approach is the classical one, called the “forecast replacement method”. It is based on the replacement of $X_k$ by $\hat X_k$, known to be the best approximation of $X_k$ by a linear combination of the past observations, in the sense that the mean squared error $E(X_k - \hat X_k)^2$ is minimum.

4. Result

Consider the autoregressive process defined in (1.1), where the $\varepsilon_t$'s are iid random variables normally distributed with mean 0 and variance 1. Now we can give the following result.

Theorem 4.1. Let $\hat X_k$ and $\tilde X_k$ be the two estimators of the missing value $X_k$ defined in (3.4) and (3.5), respectively. Then $\tilde X_k$ is Pitman-closer than $\hat X_k$ (with respect to $X_k$), i.e.,

$$P\Big(\Big|X_k - \frac{\phi}{1+\phi^2}\,(X_{k-1} + X_{k+1})\Big| < |X_k - \phi X_{k-1}|\Big) > \frac{1}{2}. \tag{4.1}$$

Proof. Let $p$ denote the probability

$$p = P\Big(\Big|X_k - \frac{\phi}{1+\phi^2}\,(X_{k-1} + X_{k+1})\Big| < |X_k - \phi X_{k-1}|\Big).$$


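We have

$$\begin{aligned}
p &= P\Big(\frac{1}{1+\phi^2}\,\big|X_k - \phi X_{k-1} + \phi^2 X_k - \phi X_{k+1}\big| < |X_k - \phi X_{k-1}|\Big) \\
  &= P\Big(\frac{1}{1+\phi^2}\,|\varepsilon_k - \phi\,\varepsilon_{k+1}| < |\varepsilon_k|\Big) \\
  &= P\Big(\Big|1 - \phi\,\frac{\varepsilon_{k+1}}{\varepsilon_k}\Big| < 1 + \phi^2\Big) \\
  &= P\Big(-2 - \phi^2 < -\phi\,\frac{\varepsilon_{k+1}}{\varepsilon_k} < \phi^2\Big).
\end{aligned} \tag{4.2}$$

Case 1. $\phi > 0$. The last equality in (4.2) can be written as

$$p = P\Big(\frac{-2 - \phi^2}{\phi} < -\frac{\varepsilon_{k+1}}{\varepsilon_k} < \phi\Big) > \frac{1}{2}.$$

Case 2. $\phi < 0$. It is sufficient to substitute $-\phi$ for $\phi$ to find the same result as previously.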
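The final inequality can be checked numerically. The sketch below rests on a standard fact that may differ from the authors' own argument: the ratio of two independent standard normal variables follows a standard Cauchy distribution, with distribution function $1/2 + \arctan(x)/\pi$. Under that assumption, for $\phi \neq 0$ the probability equals $[\arctan((2+\phi^2)/|\phi|) + \arctan(|\phi|)]/\pi$, which exceeds $1/2$ because the product of the two arguments is $2 + \phi^2 > 1$. The function names are hypothetical.

```python
import math
import random


def closeness_probability(phi):
    """Closed form of p under the Cauchy-ratio assumption; requires phi != 0."""
    if phi == 0:
        raise ValueError("phi must be nonzero")
    a = abs(phi)  # the case phi < 0 reduces to phi > 0 by symmetry
    return (math.atan((2.0 + a * a) / a) + math.atan(a)) / math.pi


def monte_carlo_probability(phi, draws=100_000, seed=1):
    """Estimate p by simulating the two innovations that drive both errors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        e_k = rng.gauss(0.0, 1.0)
        e_next = rng.gauss(0.0, 1.0)
        # Interpolation error: (e_k - phi * e_next) / (1 + phi^2); forecast error: e_k.
        if abs(e_k - phi * e_next) / (1.0 + phi * phi) < abs(e_k):
            wins += 1
    return wins / draws


if __name__ == "__main__":
    for phi in (0.2, 0.5, -0.8):  # illustrative values only
        print(phi, closeness_probability(phi), monte_carlo_probability(phi))
```

The simulation uses only $\varepsilon_k$ and $\varepsilon_{k+1}$ because, by the second line of (4.2), both estimation errors depend on the process through these two innovations alone.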

Acknowledgment We wish to thank the reviewer for helpful comments.

References

Brockwell, P. J., Davis, R. A. (1991). Time Series: Theory and Methods. 2nd ed. New York: Springer-Verlag.
Pourahmadi, M. (1989). Estimation and interpolation of missing values of a stationary time series. J. Time Ser. 10:149–169.
Roy, S. S., Chakraborty, S. (2006). Prediction problems related to the first order autoregressive process in the presence of outliers. Applic. Math. 33:265–274.
Wenzel, T. (2002). Pitman-closeness as a measure to evaluate the quality of forecasts. Commun. Statist. Theor. Meth. 31:535–550.
