Cybernetics and Systems: An International Journal, 37: 293–309. Copyright © 2006 Taylor & Francis Group, LLC. ISSN: 0196-9722 print/1087-6553 online. DOI: 10.1080/01969720600626295

APPROXIMATING THE NONHOMOGENEOUS LOGNORMAL DIFFUSION PROCESS VIA POLYNOMIAL EXOGENOUS FACTORS

R. GUTIÉRREZ, N. RICO, P. ROMÁN, D. ROMERO, J. J. SERRANO, F. TORRES
Department of Statistics and Operations Research, University of Granada, Spain

In this article we propose a methodology for building a lognormal diffusion process with polynomial exogenous factors in order to fit data that present an exponential trend and show deviations with respect to an exponential curve in the observed time interval. We show that such a process approximates a nonhomogeneous lognormal diffusion and that it is especially useful when external information (exogenous factors) about the process is not available even though the existence of these influences is clear. An application to the global man-made emissions of methane is provided.

This work was supported in part by the Ministerio de Ciencia y Tecnología, Spain, under Grant BFM2002-03633. Address correspondence to F. Torres, Department of Statistics and Operations Research, Avda. Fuente Nueva, s/n, Granada, 18071 Spain. E-mail: [email protected]

INTRODUCTION

The study of variables that model dynamical systems has undergone great development over the last decades, and a variety of statistical and probabilistic techniques have been worked out for this purpose. Among these, stochastic processes, and in particular diffusion processes, have been systematically employed. There are many applications concerning dynamical systems. From a historical point of view, biology and economics are, possibly, the fields where stochastic processes have been most frequently used. In these contexts we can cite, in economics, the theory associated with the Black–Scholes models (Black and Scholes, 1973) and further extensions (Hunt and Kennedy, 2000). In biology, stochastic diffusion models have been the focus of many researchers (see, for instance, Capocelli and Ricciardi, 1974, 1975; Di Crescenzo et al., 2004; Ricciardi, 1977; Ricciardi and Lansky, 2002). Advances in these studies have also taken into account other application fields related to new technologies, such as communication via satellite and mobile phones (Corazza and Vatalaro, 1994; Tjeng and Chai, 1999). For more details, see works by Gutiérrez et al. (2003, 2005) and references therein.

In practical situations, the dynamical variables under study are observed in discrete time, so that a discrete sample is available. From this sample, the main aim of applying stochastic processes is to obtain a model that fits the observed trend and, therefore, explains the studied phenomenon. In many cases, however, the problem of forecasting is even more important because of the need to predict the future behavior of the system. This is of fundamental importance for providing mechanisms to control the system's behavior externally.

Most of the previously mentioned applications have been treated in the homogeneous case, i.e., when the infinitesimal moments are not time-dependent. Nevertheless, there are usually two problems when this kind of process is used.
The first is related to the lack of fit to the observed data because of the presence, in some time intervals, of deviations with respect to the trend of some known homogeneous diffusion process. The second problem appears when one wishes to check the forecasting properties of the process when external influences act on the system. For these reasons, the use of diffusion processes that include exogenous factors in their trend is common in many application fields. Exogenous factors are time-dependent functions that allow one to address the aforementioned questions. In addition, the factors must be totally or partially known; that is, their functional form or some aspects of their time evolution must be available. At this point, a difficulty arises: how to determine the exogenous factors. In many situations, the endogenous variable itself indicates what the exogenous factors are. For example, Gutiérrez et al. (1999, 2003) built a nonhomogeneous lognormal diffusion process to fit the gross national product in Spain by considering consumer spending and gross domestic fixed capital formation as exogenous variables. In addition, Gutiérrez et al. (2005) considered a Gompertz diffusion process with exogenous factors to study housing prices in Spain by taking a retail price index, gross national product per inhabitant, and long-term interest rate as external influences on the system. Nevertheless, there are situations in which the external variables that influence the system are not available.

One process of this kind is the lognormal diffusion process with exogenous factors. This process is defined as a diffusion process $\{X(t);\ t_0 \le t \le T\}$ taking values on $\mathbb{R}^{+}$ with infinitesimal moments

$$A_1(x,t) = h(t)\,x, \qquad A_2(x,t) = \sigma^2 x^2, \tag{1}$$

where $\sigma > 0$, and $h$ is a continuous function on $[t_0, T]$ containing the external information to the process. Frequently, there is more than one source of external information, so it is usual to take $h$ as a linear combination of continuous functions $F_i$, $i = 1, \dots, q$ (from now on, factors). Some topics, such as inference and first-passage times through varying boundaries, have already been studied. With respect to inference, some of the aspects treated are the Maximum Likelihood (ML) estimation of the weights of the factors and of $\sigma^2$ (Gutiérrez et al., 1997, 1999; Torres, 1993) as well as the estimation of certain interesting parametric functions that include, as particular cases, the mean, mode, and quantile functions (and their conditional versions, frequently used in forecasting). For these parametric functions, their ML and Uniformly Minimum Variance Unbiased (UMVU) estimators have been obtained (Gutiérrez et al., 2001a,b, 2003). It is important to note that, as far as the inferential process is concerned, it is not necessary to know the functional form of the exogenous factors but only the value of their integrals between two instants in the interval $[t_0, T]$ (analogous to some interpolation problems), whereas the probability density function of the first-passage time of the process through a boundary includes the factors explicitly (Gutiérrez et al., 1999). In the same way, the use of this process for forecasting purposes requires knowledge of the value of the integrals of the exogenous factors up to the forecast time, or the expression of the factors in order to calculate these integrals.

However, as previously mentioned, in practice the functional form of the exogenous factors is often not available, or indeed the external influences are unknown. This article shows a possible approximation in such situations by using polynomial exogenous factors; that is, $h = \sum_{j=0}^{k} \beta_j^{(k)} P_j^{(k)}$, where $P_j^{(k)}$ is a polynomial of degree at most $k$ ($P_0^{(k)} = 1$) and $\beta_j^{(k)}$ are real fixed parameters, $j = 0, 1, \dots, k$. First, we discuss an example showing the actual situation. Later, we present the proposed model and its inferential study, and show recursive expressions in terms of the degree of the polynomials considered. Finally, we present the practical application of the theoretical results to the previous example; the choice of the exogenous polynomial factors, the fit of the theoretical model to the data, and the application in forecasting will be the focus of interest of this example.

AN EXAMPLE

Stern and Kaufmann (1998) published a study on global man-made emissions of methane from 1860 to 1994. In this study, the authors provided estimates over this period for total anthropogenic emissions as well as for seven component categories. The global emissions are the sum of the individual components, whereas the partial emissions were estimated from other variables such as population or coal production. The final goal of this process was to establish a first approximation to actual emissions and, as the authors noted, estimates for the 1980s of total anthropogenic methane emissions, and of those related to fossil fuels, were consistent with estimates from the Intergovernmental Panel on Climate Change. Figure 1 illustrates the resulting time series (data are available at http://cdiac.esd.ornl.gov/trends/trends.htm), which shows an exponential trend.
For this reason, we can try to fit a diffusion model to the observed data, and the lognormal diffusion process is, a priori, a valid candidate. However, after using the homogeneous lognormal diffusion process to fit the observed data (Figure 2), we can note that the estimated trend shows deviations from the observed values. Hence, it is reasonable to assume that there are some external influences on the process not accounted for by the homogeneous model. These influences must be time-dependent variables that affect the trend of the process. But which influences are these, and what can we do if these external variables are not known? An approach to this problem is shown later, using polynomials to approximate the unknown exogenous factors. In the next sections, we first describe the model and provide a brief summary of the inferential procedure on the process that will be used later.

Figure 1. Global historical CH4 emissions in teragrams (1 Tg = $10^{12}$ g).

Figure 2. Estimated trend from a homogeneous lognormal diffusion process.
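The homogeneous fit referred to above can be sketched in a few lines. This is a minimal illustration, assuming the closed-form ML estimators of the homogeneous lognormal diffusion (the case $k = 0$ of the model described later); the function names and the short noise-free series are hypothetical and are not the methane data.

```python
import math

def fit_homogeneous(times, values):
    """ML estimates (a0, sigma^2) of the homogeneous lognormal diffusion (k = 0)."""
    steps = list(zip(times[:-1], times[1:], values[:-1], values[1:]))
    # a0 is the total log-growth divided by the total elapsed time.
    a0 = math.log(values[-1] / values[0]) / (times[-1] - times[0])
    # sigma^2 averages the squared, time-scaled residuals of the log-increments.
    sigma2 = sum(
        (math.log(x1 / x0) - a0 * (t1 - t0)) ** 2 / (t1 - t0)
        for t0, t1, x0, x1 in steps
    ) / len(steps)
    return a0, sigma2

def homogeneous_trend(t, t_first, x_first, a0, sigma2):
    """Estimated mean function: x1 * exp((a0 + sigma^2 / 2) * (t - t1))."""
    return x_first * math.exp((a0 + 0.5 * sigma2) * (t - t_first))

# Hypothetical, noise-free series (NOT the methane data): exact exponential growth.
years = list(range(1860, 1871))
series = [80.0 * math.exp(0.01 * (y - 1860)) for y in years]
a0_hat, sigma2_hat = fit_homogeneous(years, series)
```

On an exactly exponential series the fitted trend reproduces the data; deviations such as those in Figure 2 appear when the growth rate of the observed series changes over time.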


THE MODEL

Let $\{X^{(k)}(t);\ t_0 \le t \le T\}$ be the lognormal diffusion process with polynomial exogenous factors whose infinitesimal moments are given by

$$A_1^{(k)}(x,t) = \left[\sum_{j=0}^{k} \beta_j^{(k)} P_j^{(k)}(t)\right] x, \qquad A_2^{(k)}(x,t) = \sigma_k^2 x^2, \tag{2}$$

where $P_j^{(k)}$ is a polynomial of degree at most $k$, $\beta_j^{(k)}$ are real fixed parameters ($j = 0, 1, \dots, k$), $P_0^{(k)} = 1$, and $\sigma_k > 0$. Furthermore, we suppose that $\{P_j^{(k)};\ j = 0, 1, \dots, k\}$ is a basis of the vector space of polynomials of degree at most $k$.

Some interesting characteristics of the process, with forecasting aims, are the mean, mode, and quantile functions (and their conditional versions). The point estimation of the mean and mode functions allows us to make point forecasts of the expected value and the most probable value of the endogenous variable at future times, respectively. On the other hand, confidence bands for the mean and mode functions lead to interval forecasts of the expected value and the most probable value of the endogenous variable, respectively. Finally, the point estimation of the quantile functions for suitable $\alpha$ values provides, for each $t$ (nonconditional versions) and for each $t$ given $s$ (conditional versions), the estimation of an interval to which the endogenous variable belongs with a preassigned probability. This allows us to obtain interval forecasts of the value of the endogenous variable at future times.

The expressions for the aforementioned characteristics are

• mean function
$$m^{(k)}(t) = E[X^{(k)}(t)] = E[X^{(k)}(t_0)] \exp\left(u^{(k)}(t)' a_k + \tfrac{1}{2}(t - t_0)\sigma_k^2\right); \tag{3}$$

• conditional mean function. Given $s$ and $x_s$,
$$m^{(k)}(t|s) = E[X^{(k)}(t) \mid X^{(k)}(s) = x_s] = x_s \exp\left(u^{(k)}(t,s)' a_k + \tfrac{1}{2}(t - s)\sigma_k^2\right); \tag{4}$$

• mode function
$$Mo^{(k)}(t) = \mathrm{Mode}[X^{(k)}(t)] = E[X^{(k)}(t_0)] \exp\left(u^{(k)}(t)' a_k - (t - t_0)\sigma_k^2\right); \tag{5}$$

• conditional mode function. Given $s$ and $x_s$,
$$Mo^{(k)}(t|s) = \mathrm{Mode}[X^{(k)}(t) \mid X^{(k)}(s) = x_s] = x_s \exp\left(u^{(k)}(t,s)' a_k - (t - s)\sigma_k^2\right); \tag{6}$$

• quantile function
$$C_\alpha^{(k)}(t) = \alpha\text{-th percentile}[X^{(k)}(t)] = E[X^{(k)}(t_0)] \exp\left(u^{(k)}(t)' a_k + z_\alpha \sqrt{t - t_0}\,\sigma_k\right); \tag{7}$$

• conditional quantile function. Given $s$ and $x_s$,
$$C_\alpha^{(k)}(t|s) = \alpha\text{-th percentile}[X^{(k)}(t) \mid X^{(k)}(s) = x_s] = x_s \exp\left(u^{(k)}(t,s)' a_k + z_\alpha \sqrt{t - s}\,\sigma_k\right), \tag{8}$$

where $t > s$ in the conditional functions,

$$a_k = \left(a_0^{(k)}, \beta_1^{(k)}, \dots, \beta_k^{(k)}\right)' \quad \text{with} \quad a_0^{(k)} = \beta_0^{(k)} - \tfrac{1}{2}\sigma_k^2,$$

$$u^{(k)}(t,s) = \left(t - s,\ \int_s^t P_1^{(k)}(\tau)\,d\tau,\ \dots,\ \int_s^t P_k^{(k)}(\tau)\,d\tau\right)', \tag{9}$$

$$u^{(k)}(t) = u^{(k)}(t, t_0), \tag{10}$$

and $z_\alpha$ is the $\alpha$-th quantile of a standard normal distribution. These characteristics can be expressed in terms of parametric functions of the form

$$\theta_k\left(C, A^{(k)}(t,s), B(t,s), l\right) = C \exp\left(A^{(k)}(t,s)' a_k + B(t,s)\,\sigma_k^{l}\right), \tag{11}$$

with $C > 0$, $l \in \mathbb{N}$, $A^{(k)}(t,s) \in \mathbb{R}^{k+1}$, and $B(t,s) \in \mathbb{R}$. Concretely, if $P[X^{(k)}(t_0) = x_0] = 1$ and if, for the conditional versions, we assume that the values $x_s$ taken by $X^{(k)}(s)$, $s < t$, are known (as is usual in forecasting), then

$$m^{(k)}(t) = \theta_k\left(x_0, u^{(k)}(t), \tfrac{1}{2}(t - t_0), 2\right), \quad t \ge t_0; \tag{12}$$

$$m^{(k)}(t|s) = \theta_k\left(x_s, u^{(k)}(t,s), \tfrac{1}{2}(t - s), 2\right), \quad t > s \ge t_0; \tag{13}$$

$$Mo^{(k)}(t) = \theta_k\left(x_0, u^{(k)}(t), -(t - t_0), 2\right), \quad t \ge t_0; \tag{14}$$

$$Mo^{(k)}(t|s) = \theta_k\left(x_s, u^{(k)}(t,s), -(t - s), 2\right), \quad t > s \ge t_0; \tag{15}$$

$$C_\alpha^{(k)}(t) = \theta_k\left(x_0, u^{(k)}(t), z_\alpha\sqrt{t - t_0}, 1\right), \quad t \ge t_0; \tag{16}$$

and

$$C_\alpha^{(k)}(t|s) = \theta_k\left(x_s, u^{(k)}(t,s), z_\alpha\sqrt{t - s}, 1\right), \quad t > s \ge t_0. \tag{17}$$
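The way the conditional mean, mode, and quantile functions all reduce to the single parametric function of Eq. (11) can be sketched as follows. This is an illustration only: the factors $P_j(t) = (t - t_0)^j$, the helper names, and the parameter values are assumptions made for the example and are not taken from the article.

```python
import math
from statistics import NormalDist

def u_vector(t, s, k, t0):
    """u^(k)(t, s) of Eq. (9) for the illustrative factors P_j(t) = (t - t0)**j."""
    return [t - s] + [
        ((t - t0) ** (j + 1) - (s - t0) ** (j + 1)) / (j + 1) for j in range(1, k + 1)
    ]

def theta(C, A, B, l, a, sigma):
    """Parametric function of Eq. (11): C * exp(A'a + B * sigma**l)."""
    return C * math.exp(sum(Ai * ai for Ai, ai in zip(A, a)) + B * sigma ** l)

def cond_mean(t, s, xs, a, sigma, t0):
    """Eq. (13): B = (t - s) / 2 and l = 2."""
    return theta(xs, u_vector(t, s, len(a) - 1, t0), 0.5 * (t - s), 2, a, sigma)

def cond_mode(t, s, xs, a, sigma, t0):
    """Eq. (15): B = -(t - s) and l = 2."""
    return theta(xs, u_vector(t, s, len(a) - 1, t0), -(t - s), 2, a, sigma)

def cond_quantile(alpha, t, s, xs, a, sigma, t0):
    """Eq. (17): B = z_alpha * sqrt(t - s) and l = 1."""
    z = NormalDist().inv_cdf(alpha)
    return theta(xs, u_vector(t, s, len(a) - 1, t0), z * math.sqrt(t - s), 1, a, sigma)

# Hypothetical parameter values, chosen only to exercise the formulas.
a, sigma, t0 = [0.01, 1e-4], 0.02, 0.0
args = (11.0, 10.0, 367.2, a, sigma, t0)
mean, mode = cond_mean(*args), cond_mode(*args)
lower, median, upper = (cond_quantile(p, *args) for p in (0.025, 0.5, 0.975))
```

The nonconditional versions of Eqs. (12), (14), and (16) follow from the same functions by setting $s = t_0$ and $x_s = x_0$.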

The point estimation of the $\theta_k$ functions (maximum-likelihood and minimum variance unbiased estimation) was developed in Gutiérrez et al. (2001a) for $l = 2$ and for a generic $l$ (with the particular case of $l = 1$) in Gutiérrez et al. (2003). On the other hand, the mean and mode functions (and their conditional versions) can be written in the form $\exp(\mu_k(t,s) + \lambda\,\sigma_k^2(t,s))$ with

Function           $\mu_k(t,s)$                       $\lambda$   $\sigma_k^2(t,s)$
Mean               $\ln x_0 + u^{(k)}(t)' a_k$        $1/2$       $(t - t_0)\sigma_k^2$
Conditional mean   $\ln x_s + u^{(k)}(t,s)' a_k$      $1/2$       $(t - s)\sigma_k^2$
Mode               $\ln x_0 + u^{(k)}(t)' a_k$        $-1$        $(t - t_0)\sigma_k^2$
Conditional mode   $\ln x_s + u^{(k)}(t,s)' a_k$      $-1$        $(t - s)\sigma_k^2$

The problem of building exact confidence bands for them was solved in Gutiérrez et al. (2003), whereas approximate and generalized bands have been treated by Rico (2005).


Inference on the Model

Now we provide a brief summary of the inferential procedure. Let $x_1, \dots, x_n$ be the observed values obtained by discrete sampling of the process at times $t_1, \dots, t_n$ ($n > k + 2$), and suppose $P[X^{(k)}(t_1) = x_1] = 1$. After transforming these values by means of $v_1 = x_1$ and $v_i = (t_i - t_{i-1})^{-1/2} \ln(x_i / x_{i-1})$, $i = 2, \dots, n$, the ML estimators of the parameters $a_k$ and $\sigma_k^2$ (see Torres, 1993; Gutiérrez et al., 1997, for the multivariate version of the process) are $\hat{a}_k = V_k v$ and $\hat{\sigma}_k^2 = \frac{1}{n-1} v' H_k v$. In these expressions, we have set $v = (v_2, \dots, v_n)'$, $V_k = (U_k U_k')^{-1} U_k$, and $H_k = I_{n-1} - U_k' V_k$, where $U_k$ is the matrix, whose rank is assumed to be $k + 1$, given by $U_k = (u_2^{(k)}, \dots, u_n^{(k)})$ with $u_i^{(k)} = (t_i - t_{i-1})^{-1/2}\, u^{(k)}(t_i, t_{i-1})$.

From these estimators, one can obtain the corresponding ML and UMVU estimators of the $\theta_k$ functions (Gutiérrez et al., 2003). The ML estimator is straightforward, whereas the UMVU estimators for $l = 2$ and $l = 1$ are

$$C\, e^{A^{(k)}(t,s)' \hat{a}_k}\ {}_0F_1\!\left(\frac{n - k - 2}{2};\ \frac{(n - 1)\left[B(t,s) - \tfrac{1}{2} A^{U_k}_{t,s}\right]}{2}\, \hat{\sigma}_k^2\right) \tag{18}$$

and

$$C\, e^{A^{(k)}(t,s)' \hat{a}_k}\ \Gamma\!\left(\frac{n - k - 2}{2}\right) \sum_{\nu = 0}^{\infty} \frac{\left(\hat{\sigma}_k \sqrt{(n - 1) A^{U_k}_{t,s}}\right)^{\nu}}{\nu!\ \Gamma\!\left(\frac{n - k - 2 + \nu}{2}\right) 2^{\nu}}\ H_\nu\!\left(\frac{B(t,s)}{\sqrt{2 A^{U_k}_{t,s}}}\right), \tag{19}$$

respectively, with $A^{U_k}_{t,s} = A^{(k)}(t,s)' (U_k U_k')^{-1} A^{(k)}(t,s)$, ${}_0F_1$ the confluent hypergeometric function ${}_0F_1(a; z) = \sum_{j=0}^{\infty} z^j / ((a)_j\, j!)$, and $H_\nu$ the Hermite polynomial defined by $H_\nu(x) = \nu! \sum_{u=0}^{[\nu/2]} \frac{(-1)^u}{u!\,(\nu - 2u)!} (2x)^{\nu - 2u}$, where $[x]$ denotes the integer part of $x$.

In addition, exact confidence bands for the $\exp(\mu_k(t,s) + \lambda\sigma_k^2(t,s))$ functions, for each $t$ and $s$, can be calculated. For this purpose, the procedure described in Gutiérrez et al. (2003) can be followed. With this procedure, we have for each $t$ and $s$ a confidence interval for $\mu_k(t,s) + \lambda\sigma_k^2(t,s)$. By repeating this process for each $t$ in the time interval, for fixed $s$ and with the same confidence level, and taking exponentials, we obtain the desired confidence band for $\exp(\mu_k(t,s) + \lambda\sigma_k^2(t,s))$.
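The ML estimators $\hat{a}_k = V_k v$ and $\hat{\sigma}_k^2 = v' H_k v / (n - 1)$ described above amount to a least-squares fit of the transformed values $v_i$ on the columns $u_i^{(k)}$. The following is a minimal sketch under stated assumptions: the factors are taken to be $P_j(t) = (t - t_1)^j$ (an illustrative choice), the helper names are hypothetical, and the normal equations are solved with a small hand-written elimination routine to keep the example free of dependencies.

```python
import math

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][j] * x[j] for j in range(r + 1, n))) / aug[r][r]
    return x

def u_vector(t, s, k, t0):
    """u^(k)(t, s) for the illustrative factors P_j(t) = (t - t0)**j, j = 1..k."""
    return [t - s] + [
        ((t - t0) ** (j + 1) - (s - t0) ** (j + 1)) / (j + 1) for j in range(1, k + 1)
    ]

def ml_estimates(times, values, k):
    """Return (a_hat_k, sigma2_hat_k) from one discretely sampled path."""
    t0 = times[0]
    cols, v = [], []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        cols.append([c / math.sqrt(dt) for c in u_vector(times[i], times[i - 1], k, t0)])
        v.append(math.log(values[i] / values[i - 1]) / math.sqrt(dt))
    # Normal equations (U_k U_k') a = U_k v, i.e. a_hat = V_k v.
    gram = [[sum(col[r] * col[c] for col in cols) for c in range(k + 1)] for r in range(k + 1)]
    rhs = [sum(col[r] * vi for col, vi in zip(cols, v)) for r in range(k + 1)]
    a_hat = solve(gram, rhs)
    # v' H_k v is the squared norm of the residual v - U_k' a_hat.
    resid = [vi - sum(c * a for c, a in zip(col, a_hat)) for col, vi in zip(cols, v)]
    sigma2_hat = sum(r * r for r in resid) / len(v)
    return a_hat, sigma2_hat

# Hypothetical noise-free path, so the estimators should recover the true values.
true_a = [0.02, 0.001]
ts = [float(i) for i in range(21)]
xs = [100.0 * math.exp(true_a[0] * t + true_a[1] * t ** 2 / 2) for t in ts]
a_hat, sigma2_hat = ml_estimates(ts, xs, 1)
```

Measuring time from the first observation (for example $t - 1860$ rather than the calendar year) keeps the entries of $U_k U_k'$ at a manageable scale, which matters for the conditioning issue mentioned later in the text.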


Now, we focus our attention on the particular case introduced in this article, that is, the polynomial exogenous factors. One question that can be formulated in this model is what the optimal degree of the polynomial is or, equivalently, how many exogenous factors have to be taken. The answer is not immediate because it depends on the data. But one can consider a strategy similar to that used in polynomial regression. For example, a forward procedure can be followed by successively introducing polynomials into the model. In such a situation, it is especially interesting to obtain recursive expressions so that, after including a $(k+1)$-degree polynomial $P_{k+1}^{(k+1)}$, the inferential process can make use of the calculations derived for the $k$-degree model.

After introducing $P_{k+1}^{(k+1)}$, the information about the exogenous factors is contained in the matrix $U_{k+1} = (U_k' \mid d_{k+1})'$, where the $(i-1)$-th component of the vector $d_{k+1}$ is given by $(t_i - t_{i-1})^{-1/2} \int_{t_{i-1}}^{t_i} P_{k+1}^{(k+1)}(\tau)\,d\tau$, $i = 2, \dots, n$. Taking into account the expression of the inverse of a partitioned matrix, and denoting $e_{k+1} = d_{k+1}' H_k d_{k+1}$, one obtains the following recursive expressions for the ML estimators of the process:

$$\hat{a}_{k+1} = \frac{1}{e_{k+1}} \left[ \begin{pmatrix} e_{k+1} I_{k+1} + V_k d_{k+1} d_{k+1}' U_k' \\ -d_{k+1}' U_k' \end{pmatrix} \hat{a}_k + \begin{pmatrix} -V_k d_{k+1} \\ 1 \end{pmatrix} d_{k+1}' v \right],$$

$$\hat{\sigma}_{k+1}^2 = \hat{\sigma}_k^2 - \frac{v' H_k d_{k+1} d_{k+1}' H_k v}{(n - 1)\, e_{k+1}}. \tag{20}$$

From the corresponding expressions, the ML estimators of the $\theta_k$ functions are calculated:

$$\hat{\theta}_{k+1}\left(C, A^{(k+1)}(t,s), B(t,s), l\right) = \hat{\theta}_k\left(C, A^{(k)}(t,s), B(t,s), l\right) \exp\left( \frac{A^{(k+1)}(t,s)'}{e_{k+1}} \left[ \begin{pmatrix} V_k d_{k+1} d_{k+1}' U_k' \\ -d_{k+1}' U_k' \end{pmatrix} \hat{a}_k + \begin{pmatrix} -V_k d_{k+1} \\ 1 \end{pmatrix} d_{k+1}' v \right] + B(t,s) \left[ \left( \hat{\sigma}_k^2 - \frac{v' H_k d_{k+1} d_{k+1}' H_k v}{(n - 1)\, e_{k+1}} \right)^{l/2} - \hat{\sigma}_k^{l} \right] \right). \tag{21}$$
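The recursion of Eq. (20) can be checked numerically in its simplest instance, the step from the homogeneous model ($k = 0$) to the model with one polynomial factor ($k = 1$), where $U_0$ has a single row and every quantity reduces to a dot product. This is a sketch only: the factor $P_1(t) = t - t_1$, the function names, and the synthetic series are assumptions made for the illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def recursive_vs_direct(times, values):
    """Update the k = 0 fit to k = 1 via Eq. (20) and compare with a direct fit."""
    t0 = times[0]
    pairs = list(zip(times[:-1], times[1:]))
    v = [math.log(x1 / x0) / math.sqrt(t - s)
         for (s, t), x0, x1 in zip(pairs, values[:-1], values[1:])]
    u = [math.sqrt(t - s) for s, t in pairs]          # the single row of U_0
    # d_1: scaled integrals of the illustrative factor P_1(t) = t - t0.
    d = [((t - t0) ** 2 - (s - t0) ** 2) / (2 * math.sqrt(t - s)) for s, t in pairs]
    m = len(v)                                        # m = n - 1

    # Quantities of the k = 0 model.
    a0 = dot(u, v) / dot(u, u)                        # a_hat_0 = V_0 v
    Hv = [vi - ui * a0 for vi, ui in zip(v, u)]       # H_0 v
    Vd = dot(u, d) / dot(u, u)                        # V_0 d_1 (a scalar here)
    Hd = [di - ui * Vd for di, ui in zip(d, u)]       # H_0 d_1
    sigma2_0 = dot(Hv, Hv) / m
    e = dot(d, Hd)                                    # e_1 = d_1' H_0 d_1

    # Eq. (20): first k + 1 components, new last component, and variance.
    top = ((e + Vd * dot(d, u)) * a0 - Vd * dot(d, v)) / e
    bottom = (-dot(d, u) * a0 + dot(d, v)) / e
    sigma2_1 = sigma2_0 - dot(d, Hv) ** 2 / (m * e)

    # Direct k = 1 fit: 2 x 2 normal equations solved by Cramer's rule.
    uu, ud, dd = dot(u, u), dot(u, d), dot(d, d)
    det = uu * dd - ud ** 2
    a_dir = (dot(u, v) * dd - ud * dot(d, v)) / det
    b_dir = (uu * dot(d, v) - ud * dot(u, v)) / det
    resid = [vi - ui * a_dir - di * b_dir for vi, ui, di in zip(v, u, d)]
    return (top, bottom, sigma2_1), (a_dir, b_dir, dot(resid, resid) / m)

# Hypothetical series with a deterministic wiggle standing in for noise.
ts = [float(i) for i in range(15)]
xs = [100.0 * math.exp(0.03 * t + 0.002 * t ** 2 + 0.01 * math.sin(3 * t)) for t in ts]
recursive, direct = recursive_vs_direct(ts, xs)
```

Both routes give the same estimates up to rounding; the recursive route reuses $\hat{a}_0$, $H_0 v$, and $\hat{\sigma}_0^2$ instead of refitting from scratch.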

Note that recursive expressions for the mean, mode, and quantile functions are derived from Eq. (21) by replacing $C$, $A^{(k+1)}(t,s)$, $B(t,s)$, and $l$ with their corresponding arguments.

Nevertheless, recursive expressions for the UMVU estimators of the $\theta_k$ functions are not available because the hypergeometric functions and Hermite polynomials that appear do not make it feasible. However, there are recursive expressions for some arguments, concretely, $A^{(k)}(t,s)' \hat{a}_k$ and $A^{U_k}_{t,s}$:

$$A^{(k+1)}(t,s)' \hat{a}_{k+1} = A^{(k)}(t,s)' \hat{a}_k + \frac{1}{e_{k+1}} \left[ u^{(k)}(t,s)' V_k d_{k+1} - \int_s^t P_{k+1}^{(k+1)}(\tau)\,d\tau \right] d_{k+1}' \left( U_k' \hat{a}_k - v \right), \tag{23}$$

$$A^{U_{k+1}}_{t,s} = A^{U_k}_{t,s} + \frac{1}{e_{k+1}} \left[ u^{(k)}(t,s)' V_k d_{k+1} - \int_s^t P_{k+1}^{(k+1)}(\tau)\,d\tau \right]^2. \tag{24}$$

Finally, we have to note that obtaining exact confidence bands for the mean and mode functions (and their conditional versions) involves the calculation of quantiles via numerical integration that depends on the number of exogenous factors (in this case, polynomials). This makes it impossible to provide recursive formulae for the exact bands but, as in the previous case, it is possible to obtain recursive expressions for some arguments. Concretely, one has

$$B^{(k+1)}(t,s) = B^{(k)}(t,s) + \frac{1}{e_{k+1}} \left[ u^{(k)}(t,s)' V_k d_{k+1} - \int_s^t P_{k+1}^{(k+1)}(\tau)\,d\tau \right] d_{k+1}' \left( U_k' \hat{a}_k - v \right), \tag{25}$$

$$S_{k+1}^2(t,s) = \frac{n - k - 2}{n - k - 3}\, S_k^2(t,s) - \frac{t - s}{n - k - 3}\, \frac{v' H_k d_{k+1} d_{k+1}' H_k v}{e_{k+1}}, \tag{26}$$

$$C^{(k+1)}(t,s) = C^{(k)}(t,s) + \frac{1}{e_{k+1}(t - s)} \left[ u^{(k)}(t,s)' V_k d_{k+1} - \int_s^t P_{k+1}^{(k+1)}(\tau)\,d\tau \right]^2, \tag{27}$$

where (see Gutiérrez et al., 2003, for more details)

$$B^{(k)}(t,s) = \ln(x_s) + u^{(k)}(t,s)' \hat{a}_k, \tag{28}$$

$$S_k^2(t,s) = \frac{(n - 1)\hat{\sigma}_k^2}{n - k - 2}\,(t - s), \tag{29}$$

$$C^{(k)}(t,s) = \frac{A^{U_k}_{t,s}}{t - s}. \tag{30}$$


In this development, it is necessary to note that the matrix $U_k$ contains powers of the $t_i$, so that the procedure explained is more reliable for low degrees because of the conditioning of the matrix $U_k U_k'$. In this sense, the previously discussed recursive expressions can improve the accuracy of the calculations. Nevertheless, in some instances, orthogonal polynomials or smoother functions (like splines) will be preferable.

APPLICATION TO EMISSIONS OF METHANE DATA

Returning to the example of the second section, the procedure proposed in this article can be used with remarkable results under the assumption that no additional external information about the process is available. The main problem in this situation is to establish the degree of the polynomial. The selection criterion must not consider only the goodness of fit to the data, because this property may not hold when one uses the model to make forecasts outside the range of the observations. Therefore, we need to find a balanced solution that is suitable for both questions: the fit of the model to the observed data and its possibilities for forecasting purposes. In this sense, we have fitted the model by using the data from 1860 to 1993, and then we have made a forecast for 1994. This value is actually known and will serve to check the fitted model.

First, it is necessary to note some aspects of the practical methodology employed in this application. These aspects concern the procedure followed to obtain the polynomial exogenous factors. As we have previously noted, no additional information exists; only the values $x_1, \dots, x_n$ of the endogenous variable at times $t_1, \dots, t_n$ are available. If we suppose $P[X(t_1) = x_1] = 1$, for the nonhomogeneous lognormal diffusion process with infinitesimal moments given by Eq. (1), it is known that

$$\ln\left(\frac{E[X(t)]}{x_1}\right) = \int_{t_1}^{t} h(\tau)\,d\tau = H(t). \tag{31}$$

Thus, we can consider the values

$$f_i = \ln\left(\frac{x_i}{x_1}\right), \quad i = 1, \dots, n, \tag{32}$$

as an approximation to $H(t_i)$. Hence, with these values, we fit a $(k+1)$-degree polynomial $M_k = \sum_{i=1}^{k+1} c_i Q_i$. The $Q_i$ functions are $i$-degree polynomials, $i = 1, \dots, k+1$, that are required to be linearly independent. Hence, we can approximate the lognormal diffusion process $\{X(t);\ t_0 \le t \le T\}$ by the lognormal diffusion process with polynomial exogenous factors $\{X^{(k)}(t);\ t_0 \le t \le T\}$ whose infinitesimal moments are

$$A_1^{(k)}(x,t) = \left[\sum_{j=0}^{k} \beta_j^{(k)} P_j^{(k)}(t)\right] x, \tag{33}$$

$$A_2^{(k)}(x,t) = \sigma_k^2 x^2. \tag{34}$$

In this case, $P_j^{(k)} = c_{j+1} Q_{j+1}'$, $j = 1, \dots, k$, and $P_0^{(k)} = 1$. We note that we do not take $\sum_{i=1}^{k} P_i^{(k)}$ as a single factor because it is possible that, after successive study, some of the $P_i^{(k)}$ functions turn out not to be relevant.

Now we consider $f_i = \ln(x_i / x_1)$ ($i = 1, \dots, 134$) as the values that we wish to fit. At this point, it is important to note that, as $f_1 = H(t_1) = 0$, we have taken $Q_j(t) = (t - 1860)^j$, $j = 1, 2, \dots$, as the generators of the polynomials. We can also propose, in advance, a possible degree by fitting a regression line to the $f_i$ data and counting the number of times it crosses the observed trend. Along these lines, we can think of 4 or 5 as the possible degree; that is, the maximum degree of the exogenous factors will be 3 or 4 (the later discussion of the choice of the degree shows that this intuition is close to reality).

With these ideas in mind, we have carried out an iterative procedure by fitting polynomials of degrees 2 to 6, and so we have obtained, from their derivatives, the exogenous factors related to the models $X^{(k)}$, $k = 1, \dots, 5$. Then, for each model, from the maximum-likelihood estimators of the parameters, we have obtained

• the ML and UMVU estimation of the mean and mode functions together with their conditional versions,
• the ML and UMVU estimation of the quantile and conditional quantile functions for the particular cases $\alpha = 0.025$ and $\alpha = 0.975$, and
• exact confidence bands for the mean and mode functions.

From these, we have built Tables 1 and 2, which contain the values of the point and interval forecasts for 1994, respectively.

Table 1. Point forecast for 1994 from the models $X^{(k)}(t)$, $k = 1, \dots, 5$

                    X^(1)(t)   X^(2)(t)   X^(3)(t)   X^(4)(t)   X^(5)(t)
ML mean             375.033    374.852    373.142    369.413    368.912
ML cond. mean       373.114    372.935    371.296    367.795    367.300
ML mode             369.262    369.088    367.591    364.545    364.061
ML cond. mode       373.071    372.892    371.255    367.759    367.264
UMVUE mean          373.084    372.905    371.266    367.768    367.271
UMVUE cond. mean    373.114    372.935    371.295    367.793    367.297
UMVUE mode          367.256    367.039    365.572    362.733    362.214
UMVUE cond. mode    373.070    372.890    371.252    367.755    367.259

Table 2. Interval forecast for 1994 from the models $X^{(k)}(t)$, $k = 1, \dots, 5$ (lower and upper limits)

                        X^(1)(t)             X^(2)(t)             X^(3)(t)             X^(4)(t)             X^(5)(t)
ML quantiles            (305.689, 455.375)   (305.560, 455.131)   (305.222, 451.640)   (305.875, 442.220)   (305.520, 441.544)
ML cond. quantiles      (366.732, 379.578)   (366.558, 379.394)   (365.051, 377.619)   (361.973, 373.686)   (361.492, 373.177)
UMVUE quantiles         (303.474, 453.803)   (303.099, 453.893)   (302.584, 450.799)   (303.243, 441.882)   (302.657, 441.518)
UMVUE cond. quantiles   (366.671, 379.639)   (366.472, 379.481)   (364.942, 377.729)   (361.848, 373.810)   (361.342, 373.323)
Mean band               (306.052, 459.737)   (305.683, 459.887)   (305.074, 456.641)   (305.410, 446.692)   (304.811, 446.772)
Cond. mean band         (371.863, 374.176)   (371.037, 374.429)   (368.777, 373.281)   (364.800, 370.158)   (363.625, 370.204)
Mode band               (301.140, 452.368)   (300.741, 452.462)   (300.265, 449.452)   (301.112, 440.788)   (300.495, 440.453)
Cond. mode band         (371.932, 374.210)   (371.164, 374.625)   (368.962, 373.558)   (365.032, 370.502)   (363.918, 370.635)

The final question is to decide which is the most appropriate model. Because the value of the global emissions of methane for 1994 is 371 Tg, we can conclude that the conditional estimations provide the best approximation and, in particular, those related to the model of degree 3, $X^{(3)}$. As an illustration, Figure 3 shows, for 1994, the ML estimation of the conditional mean, the ML estimation of the conditional quantiles (for $\alpha = 0.025$ and $\alpha = 0.975$), and the confidence band, at level 0.95, for the conditional mean. Similar conclusions can be drawn from the other estimates.

Figure 3. Estimated values of methane emissions for 1994.

Furthermore, to confirm the previously discussed conclusion, we have generated, for each estimated model, 100 simulated paths following the recursive scheme obtained from the numerical solution of stochastic differential equations (Kloeden et al., 1994; Rao et al., 1974), and we have calculated the mean value for 1994. We have also simulated 100 values of the distribution of $X^{(k)}(1994)$ conditioned on $X^{(k)}(1993) = 367.2$, $k = 1, \dots, 5$, from which the mean value and the mean square error have been obtained. In both cases, we have calculated the 5% trimmed mean to reduce the effect of outliers in the simulation process. The results are summarized in Table 3.

Table 3. Summary of the simulations performed

            Simulated paths               Simulated values of X^(k)(1994) | X^(k)(1993) = 367.2
Model       Mean      Trimmed mean (5%)   Mean      Trimmed mean (5%)   Mean square error
X^(1)(t)    374.252   373.010             373.175   373.121             16.780
X^(2)(t)    380.240   379.413             372.974   373.002             13.754
X^(3)(t)    373.758   372.449             371.221   371.237             9.958
X^(4)(t)    366.368   365.702             367.644   367.601             21.138
X^(5)(t)    363.933   362.987             367.229   367.255             21.074

CONCLUSIONS

When the functional form of the exogenous factors is not available, or no information on them is available, our proposed model can be used for obtaining short-term forecasts. The choice of the model is based on the homogeneity of the point and interval forecasts obtained from approximate models built by increasing the degree of the polynomials that approximate the exogenous factors.


REFERENCES

Black, F. and M. Scholes. 1973. The pricing of options and corporate liabilities. Journal of Political Economy, 81:637–654.
Capocelli, R. M. and L. M. Ricciardi. 1974. A diffusion model for population growth in random environment. Theoretical Population Biology, 5:28–41.
Capocelli, R. M. and L. M. Ricciardi. 1975. A note on growth processes in random environment. Biol. Cybernetics, 18:105–109.
Corazza, G. E. and F. Vatalaro. 1994. A statistical model for land mobile satellite channels and its application to nongeostationary orbit systems. IEEE Transactions on Vehicular Technology, 43:738–742.
Di Crescenzo, A., B. Martinucci, E. Pirozzi, and L. M. Ricciardi. 2004. On the interaction between two Stein's neuronal units. In Cybernetics and systems 2004, Vol. 1, edited by R. Trappl. Vienna, Austria: University of Vienna and Austrian Society for Cybernetics Studies.
Gutiérrez, R., A. González, and F. Torres. 1997. Estimation in multivariate lognormal diffusion process with exogenous factors. Applied Statistics, 46:140–146.
Gutiérrez, R., R. Gutiérrez-Sánchez, A. Nafidi, P. Román, and F. Torres. 2005. Inference in Gompertz-type nonhomogeneous stochastic systems by means of discrete sampling. Cybernetics and Systems, 36:203–216.
Gutiérrez, R., P. Román, and F. Torres. 1999. Inference and first-passage-time for the lognormal diffusion process with exogenous factors: Application to modelling in economics. Applied Stochastic Models in Business and Industry, 15:325–332.
Gutiérrez, R., P. Román, and F. Torres. 2001a. Inference on some parametric functions in the univariate lognormal diffusion process with exogenous factors. Test, 10:357–373.
Gutiérrez, R., P. Román, D. Romero, and F. Torres. 2001b. UMVU estimation of the trend and covariance function for the univariate lognormal diffusion process with exogenous factors. In Proceedings of the Tenth International Symposium on A.S.M.D.A., edited by J. Jansen and C. H. Skiadas. Compiègne, France: Université de Technologie de Compiègne, pp. 516–521.
Gutiérrez, R., P. Román, D. Romero, and F. Torres. 2003. Application of the univariate lognormal diffusion process with exogenous factors in forecasting. Cybernetics and Systems, 34:709–724.
Hunt, P. J. and J. G. Kennedy. 2000. Financial derivatives in theory and practice. John Wiley and Sons.
Kloeden, P. E., E. Platen, and H. Schurz. 1994. Numerical solution of SDE through computer experiments. Berlin: Springer Verlag.
Rao, N. J., J. D. Borwankar, and D. Ramkrishna. 1974. Numerical solution of Ito integral equations. SIAM J. Control, 12:124–139.
Ricciardi, L. M. 1977. Diffusion processes and related topics in biology. Berlin, Heidelberg: Springer Verlag.
Ricciardi, L. M. and P. Lansky. 2002. Diffusion models of neuron activity. In The handbook of brain theory and neural networks, edited by M. A. Arbib. Cambridge, MA: The MIT Press, pp. 343–348.
Rico, N. 2005. Aportaciones al estudio de difusión lognormal: Bandas de confianza aproximadas y generalizadas. Estudio del caso polinómico. Ph.D. Thesis. Granada, Spain: Universidad de Granada.
Stern, D. I. and R. K. Kaufmann. 1998. Annual Estimates of Global Anthropogenic Methane Emissions: 1860–1994. Trends Online: A Compendium of Data on Global Change. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, U.S. Department of Energy, Oak Ridge, TN, USA.
Tjeng, T. T. and C. C. Chai. 1999. Fade statistics in Nakagami-Lognormal Channels. IEEE Transactions on Communications, 47:1769–1772.
Torres, F. 1993. Aportaciones al estudio de difusiones estocásticas no homogéneas. Ph.D. Thesis. Granada, Spain: Universidad de Granada.