CPB Discussion Paper

No 112 September 23, 2008

Investigating uncertainty in macroeconomic forecasts by stochastic simulation

Debby Lanser and Henk Kranendonk

The responsibility for the contents of this CPB Discussion Paper remains with the author(s)

CPB Netherlands Bureau for Economic Policy Analysis
Van Stolkweg 14
P.O. Box 80510
2508 GM The Hague, the Netherlands

Telephone +31 70 338 33 80
Telefax +31 70 338 33 50
Internet www.cpb.nl

ISBN 978-90-5833-375-9

Abstract in English Uncertainty is an inherent attribute of any forecast. In this paper, we investigate four sources of uncertainty with CPB’s macroeconomic model SAFFIER: provisional data, exogenous variables, model parameters and residuals of behavioural equations. We apply a Monte Carlo simulation technique to calculate standard errors for the short-term and medium-term horizon for GDP and eight other macroeconomic variables. The results demonstrate that the main contribution to the total variance of a medium-term forecast emanates from the uncertainty in the exogenous variables. For the short-term forecast, both exogenous variables and provisional data are most relevant.

Key words: Monte Carlo simulation, macroeconomic forecasting, model uncertainty.

JEL codes: C15, C53, E20, E27

Abstract in Dutch Voorspellen gaat gepaard met onzekerheid. In dit Discussion Paper onderzoeken we met het macro-economische model SAFFIER vier bronnen van onzekerheid: voorlopige cijfers over het verleden, exogenen, modelparameters en de residuen van gedragsvergelijkingen. Met behulp van Monte Carlo simulaties berekenen we standaardfouten voor het BBP en acht andere economische grootheden. Dit onderzoek wijst uit dat de onzekerheid ten aanzien van de exogenen de grootste bijdrage levert aan de onzekerheid op middellange termijn. Voor de korte termijn zijn vooral de exogenen en de voorlopige cijfers van belang.

Steekwoorden: Monte Carlo simulatie, economische prognose, modelonzekerheid.


Contents

1 Introduction
2 The forecasting process
2.1 The macroeconomic model SAFFIER
2.2 The four sources of forecast uncertainty
3 The Monte Carlo simulation technique
4 Modelling the sources of uncertainty
4.1 Uncertainty in provisional data
4.2 Uncertainty in the exogenous variables
4.3 Uncertainty in model equation parameters
4.4 Uncertainty in the error terms in behavioural equations
5 Results
5.1 The Monte Carlo experiments: Technical details
5.2 Estimated standard errors
5.3 Comparison with uncertainty standard errors published in Don (1994)
5.4 Model uncertainty and realised forecast errors
5.5 Number of replications in the Monte Carlo experiments
5.6 Robustness test for path independence
5.7 Decomposition of the total error variance
6 Conclusions and Recommendations
References
A Decomposition of the total error variance per source of uncertainty
B How to generate a sample from the multinormal distribution?
C Estimation of systems of equations


1 Introduction

Uncertainty is an inherent attribute of any forecast. An essential auxiliary task of a forecasting institute is therefore to give its users insight into this uncertainty. Several kinds of information can serve this purpose. For instance, a forecaster can carry out ex-post evaluations comparing forecasts and realizations. Alternatively, a forecaster can present different scenarios describing future outcomes. Furthermore, a forecaster can provide interval forecasts delineating a range of outcomes which captures the future in a prescribed number of cases.

The aforementioned approaches quantify the forecast uncertainty. However, they are unable to identify the particular components of a forecasting model that are responsible for a certain share of the uncertainty. Important new insights into a model and its uncertainty can be gained by decomposing the forecast error into components that can be associated with different sources of uncertainty. Notably, the results of such an analysis can be used as a basis for prioritising model improvements, as they provide weights to the different sources.

In this discussion paper, we use a standard Monte Carlo simulation technique for quantifying model uncertainty which identifies the contribution of different sources of uncertainty to the aggregated forecast error. We concentrate on macroeconomic forecasts produced at the Netherlands Bureau for Economic Policy Analysis (CPB) with the macroeconomic model SAFFIER. Since late 2004, this model has served in most of CPB’s short-term and medium-term analyses. Specifically, we focus on the decomposition of the total error variance of nine important macroeconomic variables as predicted by SAFFIER. We distinguish four sources of uncertainty in the macroeconomic forecasts.
The first source concerns initial model data uncertainty, i.e., uncertainty in provisional data obtained from preliminary publications in the Dutch National Accounts (NA) provided by Statistics Netherlands (CBS).¹ In anticipation of the final data,² these provisional data are applied as initial (lagged) data in CPB’s economic forecasts. The second source involves the uncertainty associated with the forecasts of the exogenous data series. Their future realisations are uncertain at the time of the model simulation. The third source pertains to the model parameters in the behavioural equations; this uncertainty emanates directly from the econometric estimation of the behavioural model equations and from expert adjustments. Finally, the uncertainty associated with the error terms completes the set of sources. The error terms in the behavioural equations correct for model misspecifications or random events.

The aforementioned four sources capture most but not all of the sources of uncertainty associated with model-based macroeconomic forecasts. Clements and Hendry (1998) and Ericsson (2001) categorize five sources of uncertainty in model-based forecast errors.

¹ The National Accounts represent the official statistical review of the Dutch economy.

² We acknowledge that the word final is a somewhat unfortunate choice, as we exclude any incidental major revision from our analysis.

From their categorization, we omit model misspecification, which refers to the uncertainty associated with model selection, viz. the particular choice of the endogenous and exogenous variables and their functional form in the model equations. A second omitted source concerns expert opinion. Model equations are sometimes adjusted to fit non-model information or anticipated future events. This source of uncertainty will be analysed in a subsequent paper.

CPB has been one of the frontrunners in evaluating the quality of macroeconomic forecasts by simulation. As early as 1991, results of a Monte Carlo analysis on data uncertainties were published, see Gallo and Don (1991). Accompanying studies on parameter uncertainties, exogenous variable uncertainties and error term uncertainties followed. In 1994, a review article (Don (1994)) reported on the contributions of the various sources of uncertainty to the forecast’s error variance. The Monte Carlo simulations were performed with a simplified macroeconomic model, ZOEM. At that time, the main conclusion was that uncertainties in exogenous variables and the error terms in the behavioural equations were the two dominant sources of forecast uncertainty for almost all endogenous variables. Preliminary data uncertainty only played a prominent role in the one-year-ahead business investment and government surplus forecasts.

Several other institutes have published results of Monte Carlo simulations assessing the impact of uncertainty on their macroeconomic forecasts, although less elaborate than Don (1994). The Bank of Canada, Amano et al. (2002), the Bank of England, Garratt et al. (2003), and the Federal Planning Bureau, Van der Mensbrugghe et al. (1990), analysed parameter uncertainty, the latter applying a simpler version of Monte Carlo simulation. Fair (1993) considered parameter uncertainty and error term uncertainty in his US model. He used stochastic simulation to estimate event probabilities, e.g. the probability of a recession. Meyermans and Van Brusselen (2006) evaluated the uncertainty of the exogenous variables and the error terms surrounding the 2006-2012 NIME forecasts. The most extensive analysis was found in Kolsrud (1993a) and Kolsrud (1993b), who analysed three sources of uncertainty in the KVARTS91 model of Statistics Norway and the Rimini 2.0 model of Norges Bank. An influential paper on the application of Monte Carlo simulation for sensitivity analysis and model evaluation of macroeconomic forecasts is Canova (1995).

Fair (2003) introduced the bootstrap method in stochastic simulation of uncertainty in macroeconomic models. This non-parametric stochastic simulation method proved particularly useful for analysing uncertainty in macroeconomic models with rational expectations. A parametric method would require assumptions on these expectations as well, while a non-parametric approach already incorporates these assumptions by definition. Similar approaches were adopted in Onatski and Williams (2003), Borbély and Meier (2003), Toedter (1992) and Kolsrud (2004). Onatski and Williams addressed parameter uncertainty, Borbély and Meier focused on parameter uncertainty and model selection, and Toedter and Kolsrud estimated the impact of parameter uncertainty and error term uncertainty.


Since 1994, the conditions surrounding Monte Carlo analyses have changed significantly. Since the first Monte Carlo simulations at CPB, computer power has increased enormously. Furthermore, more historical data on the various sources of uncertainty have become available, and we can rely on more advanced econometric analysis techniques and software. These developments enable us to apply the Monte Carlo simulations to the full operational quarterly macroeconomic model SAFFIER, whereas Don (1994) applied a simplified version of the annual CPB model used in those days. This allows us to apply ‘real’ inputs for the quantification of the uncertainty around the parameters and error terms of the model.

SAFFIER does not incorporate rational expectations. Hence, bootstrapping has no particular advantage over Monte Carlo simulation except for a preference for non-parametric simulation methods. In case of bootstrapping, however, unravelling the impact of the different sources of uncertainty is more difficult, because the effects of these sources are not observed separately in historic model realisations: historic outcomes contain the effects of all sources of uncertainty at once. We use the Monte Carlo simulation technique because our macroeconomic model is non-linear and does not admit an explicit solution. Furthermore, a parametric distribution of all sources of uncertainty can be found. Although expert opinion still plays a valuable role in the estimation process, the additional data sources enable the econometric estimation of all covariance matrices associated with the various sources of uncertainty. In some cases, the estimated distribution of a particular source of uncertainty needs to be adjusted because it generates economically unrealistic inputs for the Monte Carlo experiments. An example is the values of parameters, which must lie in a theoretically acceptable range.

Our results demonstrate that the main contribution to the total error variance of a four-year-ahead forecast is induced by uncertainty in the exogenous variables. The total error variance of a short-term forecast is mainly influenced by uncertainty in both the exogenous variables and the provisional data. Of nine important macroeconomic variables, the standard error of investment volume is most sensitive to the four sources of uncertainty. As time progresses, exports and contractual wages display large standard errors as well. For the latter variable, all sources of uncertainty seem to contribute evenly to its total error variance. This quantification of uncertainty is rather comparable with the forecast errors of short-term and medium-term CPB forecasts. Minor differences can arise from the fact that the uncertainty related to government policy was not included in this study. Compared with Don (1994), we find much lower standard errors for all variables, mainly because the volatility of the international exogenous variables was much higher in the reference period Don used.

The paper is organized as follows. Section 2 describes the forecasting process, the macroeconomic model SAFFIER, and the four sources of uncertainty. In Section 3, we study the Monte Carlo simulation technique including the choice of distributions, variance reduction techniques and accuracy. In Section 4, we present the estimation techniques and the implementation which lead to the specific distributions of the various sources of uncertainty. In Section 5, we cover the results of the Monte Carlo experiments, providing the proportional contribution of the different sources of uncertainty to the error variance of various endogenous variables, e.g. GDP and consumption volume. Finally, Section 6 presents concluding remarks and discussion.


2 The forecasting process

For their current decision making, many parties are interested in the uncertain future development of macroeconomic variables, e.g. GDP and consumption. Many economic institutes provide such macroeconomic forecasts, mostly based on advanced macroeconomic models describing the macroeconomic future through deterministic equations. These equations, however, contain several components which are uncertain and therefore bring about uncertainty in the macroeconomic model outcomes. In this section, we identify these sources of uncertainty and establish how they enter the forecasting process. In the subsequent sections, we identify their contributions to the overall forecast error.

2.1 The macroeconomic model SAFFIER

Most short-term and medium-term macroeconomic analyses at CPB are performed with the macroeconomic model SAFFIER. SAFFIER has been operational since late 2004 and encapsulates the former CPB models SAFE (a quarterly model) and JADE (a yearly model). SAFFIER stands for Short- and medium-term Analysis and Forecasting using Formal Implementation of Economic Reasoning. For an extensive description of SAFFIER, we refer to Kranendonk and Verbruggen (2007).

In numbers, SAFFIER consists of about 2600 equations, of which 50 are so-called behavioural equations. These behavioural equations contain about 300 parameters. The remaining equations are rules of thumb or identities. SAFFIER holds about 3000 variables, categorized into 2600 endogenous variables, 250 exogenous variables and 200 autonomous terms.

Figure 2.1 outlines the various components of the forecasting process. Its main component is the macroeconomic model describing the relations between the endogenous, exogenous and autonomous variables. The latter two are input variables: the exogenous variables are forecasts of quantities determined outside the model, and the autonomous terms define constant adjustments to the behavioural equations. Besides these input data, the model requires lagged endogenous data to initialise the forecasting process. These data consist of realised historical values of the various macroeconomic variables. Furthermore, each behavioural model equation contains several parameters. Incorporating the above components closes the model, so that a first macroeconomic forecast can be produced.

This first forecast is assessed by several experts within CPB. These experts can propose model adjustments bringing in non-model information. The experts often rely on their own models, which are likely to be better equipped to predict specific macroeconomic variables such as social-security or pension-related variables. The non-model information is fed back into the model via the disturbance terms and sometimes via parameter adjustments. Several forecast rounds follow, resulting in the final forecast publication.

Figure 2.1 Schematic representation of the elements in CPB’s forecasting process. [Diagram boxes: Initial Data (Lagged Data); Exogenous data; Parameters; Disturbances in behavioural equations; Model; Economic Forecast; Non-model information; Expert Opinion; Publication.]

The schematic representation in Figure 2.1 can be formalized as follows. Let $y_t$ denote a vector containing $n$ endogenous variables to be forecasted at time $t$, $y_t \in \mathbb{R}^n$. Let $x^e_t$ denote a vector consisting of $n_e$ exogenous variables, $x^e_t \in \mathbb{R}^{n_e}$. Note that $x^e_t$ may contain lagged exogenous variables up to $k_e$ periods in the past. Let $u^a_t$ denote a vector of $n$ autonomous terms and $u^e_t$ a vector of disturbances or error terms, $u^a_t, u^e_t \in \mathbb{R}^n$. The set of parameters of the behavioural equations is denoted by $\hat{\beta} \in \mathbb{R}^{n_p}$, with $n_p$ the number of parameters. The forecast process can be condensed into

$$y_t = f\left(x^e_t, \ldots, x^e_{t-k_e},\; y_t, \ldots, y_{t-k},\; u^a_t,\; u^e_t,\; \hat{\beta}\right). \tag{2.1}$$

The forecast process captured by the vector-valued forecast function $f$ contains $k$ lags in $y$ and can be non-linear in the endogenous variables $y_{t-i}$, $i = 0, \ldots, k$.³ This system is a simultaneous equations model. The forecast model in (2.1) is mostly treated as a deterministic relation describing the forecast $y_t$. Some of the terms, however, are contaminated with disturbances, either through the estimation of their values from past realisations or through uncertainty about their future values. These sources of uncertainty induce variations in the forecast outcomes. We investigate the sensitivity of our forecast to the several sources of uncertainty by means of the descriptive sample statistics of these variations, viz. their mean and their variance.

³ Note that for the first $k$ simulation years the initial variables, $x^i_t$, and the lagged endogenous variables, $y_{t-i}$, overlap.
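Because $y_t$ appears on both sides of (2.1), each period has to be solved for rather than simply evaluated. The paper does not state which solver SAFFIER uses. The following is a minimal sketch that assumes a plain fixed-point iteration on a hypothetical two-equation toy model; the names `toy_model`, `solve_period`, `simulate` and all coefficient values are illustrative assumptions and are not taken from SAFFIER.

```python
import numpy as np

def toy_model(y, y_lag, x, u, beta):
    """Hypothetical two-variable model, y = (consumption, income).

    consumption_t = a + b * income_t + d * consumption_{t-1} + u_t
    income_t      = consumption_t + x_t   (identity; x is exogenous demand)
    """
    a, b, d = beta
    consumption = a + b * y[1] + d * y_lag[0] + u
    income = consumption + x
    return np.array([consumption, income])

def solve_period(y_lag, x, u, beta, tol=1e-10, max_iter=1000):
    """Solve y = f(x, y, y_lag, u, beta) for one period by fixed-point iteration."""
    y = np.array(y_lag, dtype=float)      # start from last period's values
    for _ in range(max_iter):
        y_new = toy_model(y, y_lag, x, u, beta)
        if np.max(np.abs(y_new - y)) < tol:
            return y_new
        y = y_new
    raise RuntimeError("fixed-point iteration did not converge")

def simulate(y0, x_path, u_path, beta):
    """Dynamic simulation: each solved period becomes the lag of the next."""
    path, y_lag = [], np.asarray(y0, dtype=float)
    for x, u in zip(x_path, u_path):
        y_lag = solve_period(y_lag, x, u, beta)
        path.append(y_lag)
    return np.array(path)

beta = (10.0, 0.5, 0.2)                   # illustrative parameter values
path = simulate(y0=[50.0, 70.0], x_path=[20.0] * 4, u_path=[0.0] * 4, beta=beta)
```

In this toy setting, a disturbance to `beta`, `x_path`, `u_path` or `y0` corresponds to one of the four sources of uncertainty, and every disturbed input requires a new call to `simulate`.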

2.2 The four sources of forecast uncertainty

We distinguish four different sources of uncertainty, viz. uncertainty in provisional data supplied by Statistics Netherlands (CBS), uncertainty in exogenous data series, uncertainty in the parameters of the behavioural equations and uncertainty in the error terms.

The first source concerns initial model data uncertainty, divided into two types: uncertainty in provisional data (data available before adjustment) and uncertainty in final data (‘unreliable’ data). The former stems from data obtained from the Dutch National Accounts (NA) provided by Statistics Netherlands (CBS).⁴ The CBS publishes four preliminary estimates before producing its final data values on a calendar year, i.e., the flash quarterly forecast (45 days after the end of the calendar year), the regular quarterly forecast (90 days after the end of the calendar year), the provisional quarterly forecast (6 months after the end of the calendar year), and the revised provisional quarterly forecast (18 months after the end of the calendar year). The final figure⁵ is released 30 months after the end of the calendar year. SAFFIER already assimilates the provisional and revised provisional data into its forecasts, as it requires initial (lagged) data. The deviation between these data values and their final ones introduces a disturbance term into our model. More formally, the lagged variables in the vector $x^e_t$ become random variables in at least some of their components.

The second source stems from uncertainty in the exogenous data series, i.e., time series determined outside the model. These exogenous variables can be divided into two groups: policy and non-policy variables. The first group concerns assumptions on policy, e.g. government expenditures or tax rates.
The second group consists of variables related to the international environment, including for instance world trade volume and import prices, and domestic variables like share prices.

The uncertainty associated with the first group is difficult to quantify. First, policy alters under changing socio-economic and political environments. These changes are difficult to predict in themselves, let alone to translate into a changing policy measure. Second, policy rules are regularly (slightly) adjusted, redefined or even completely removed. The uncertainty in policy exogenous variables can be captured in terms of a feedback model. For a discussion about uncertainty under policy feedback in CPB models, we refer to Van Vlimmeren et al. (1993). Because of these difficulties, we exclude the exogenous policy variables from our sensitivity analysis and restrict ourselves to foreign and domestic exogenous variables.⁶ These exogenous variables are forecasted outside the model by various additional models, data sources and/or expert information. As is inherent in forecasting, they contain a random component which is reflected in the endogenous outcomes of SAFFIER. This randomness enters through the vector $x^e_t$ in equation (2.1).

⁴ The National Accounts represent the official statistical review of the Dutch economy.

⁵ We acknowledge that the word final is a somewhat unfortunate choice, as we exclude any incidental major revision from our analysis.

⁶ Because this study is conducted conditional on the policy variables, total uncertainty could be underestimated and be lower than ex-post forecasting accuracy measures indicate.

The third source of uncertainty concerns the parameters of our macroeconomic model. We distinguish two types of parameter uncertainty, i.e. estimated and fixed parameter uncertainty. Parameter values of a behavioural equation are determined in an iterative estimation process. First, several model descriptions are estimated using historical macroeconomic data, resulting in a ‘best’ description (estimated parameters). ‘Best’ is judged on both econometric grounds, e.g. small mean squared error and no bias, and economic grounds, e.g. correct sign and significance of the relation between related economic variables. Second, during the iterative process certain parameters can be fixed to match econometric results with expert opinion (fixed parameters). The estimated parameters are uncertain by construction. The fixed parameters, on the other hand, generate no uncertainty through the estimation process. These parameters, however, contain a random component as they rely on uncertain a priori information. We primarily focus on uncertainty in the first type of parameters. The uncertainty associated with the fixed parameters is more difficult to quantify and is described using expert opinion. Both sets of parameters are investigated separately. A model as complex as ours hampers a correct specification of the covariances between the estimated and fixed parameters. Parameter uncertainty introduces randomness in the parameter vector $\hat{\beta}$ and constant vector $c$ in equation (2.1).

The fourth source of uncertainty stems from the residual terms in the behavioural equations.
A non-zero residual term $u^a_t$ adjusts the behavioural equations for misspecification and random events. These residual terms are obtained in an iterative process using expert opinion. Although expert opinion is uncertain by definition, we will not model it. We restrict ourselves to the uncertainty in the residual terms which surfaces after forecast publication: how should we adjust the residual terms in our model to reproduce historical macroeconomic data? In this sense, the residual terms can be seen as an error term. In equation (2.1), this uncertainty corresponds to the random vector $u^e_t$.


3 The Monte Carlo simulation technique

In this section, we discuss the Monte Carlo simulation method used for investigating the sensitivity of our macroeconomic forecast to the various sources of uncertainty. On account of the simultaneity, size, non-linearity and dynamic behaviour of our model, this stochastic simulation method is an appropriate tool for the analysis. Monte Carlo simulation is a parametric simulation method which requires the specification of the probability density of the various sources of uncertainty.

Alternative methods for forecast sensitivity analysis involve bootstrapping or model simplification. Bootstrapping is a non-parametric simulation method. For a general description of the method, we refer to e.g. Davidson and MacKinnon (2004). The method has been advocated for analysing the sources of uncertainty in macroeconomic forecast models including rational expectations, see e.g. Fair (2003); Kolsrud (2004). In that case, a parametric method would require assumptions on these expectations as well. Bootstrapping incorporates these expectations by default. Furthermore, the method has proven useful when the determination of an appropriate density function is difficult due to either data shortage or lack of clarity about its functional form. Model simplification is appropriate when a closed form of the solution under uncertainty is required. In that case, dynamic forecast models are often linearised or significantly reduced.

Monte Carlo simulation proceeds in several steps. First, we establish the probability distribution of the disturbing source of uncertainty, i.e. the random component in our macroeconomic model. For instance, we investigate the sensitivity of a GDP forecast to uncertainty surrounding the parameter set in the consumption equation. Second, we generate a random sample from this distribution. In our GDP example, a sample of $N$ parameter sets is drawn from the joint distribution of the parameters in the consumption equation.
For each replication, we then simulate our forecast model, resulting in $N$ different forecasts. These forecasts are derived conditional on a deterministic representation of the other sources of uncertainty. Finally, the generated forecasts are summarised in several descriptive sample statistics, e.g. the sample mean and standard error. In particular, we are interested in the sample variance of the forecasted endogenous variables.

The sample variance of an endogenous variable measures the dispersion of a simulated sample. Let $y_t$ denote the forecast of an endogenous variable $y$ at time $t$ and let $y_t^{(n)}$ denote the $n$-th replication at time $t$ of the Monte Carlo simulation. A consistent estimator of the variance of $y_t$, $\hat{\sigma}_t^2$, is the mean of the squared deviations of the solutions $y_t^{(n)}$ from the sample mean $\hat{y}_t$,

$$\hat{\sigma}_t^2 = \frac{1}{N-1} \sum_{n=1}^{N} \left( y_t^{(n)} - \hat{y}_t \right)^2. \tag{3.1}$$

Here $N$ is the number of replications in the Monte Carlo sample and the sample mean $\hat{y}_t$ is defined as $\hat{y}_t = \frac{1}{N} \sum_{n=1}^{N} y_t^{(n)}$. The square root of the sample variance is known as the standard error, which can be interpreted as a measure of the uncertainty in our model solution; the larger the standard error, the larger the uncertainty in our forecast.

Note that a Monte Carlo simulation concerns an ex ante simulation. Descriptive statistics are based on future outcomes. This approach is in contrast with the regularly published CPB ex-post forecast evaluation studies by Kranendonk and Verbruggen (2005, 2006); ?. In these studies, relative and absolute forecast errors are presented, as well as the Theil coefficient comparing historical CPB forecasts with outcomes.

We apply a crude Monte Carlo approach, i.e., we generate our input sample by direct or naive sampling. Our accuracy requirements are not very strict, since the probability distributions of our inputs already incorporate some inaccuracy. An extensive literature exists, though, on more efficient sampling methods, the so-called variance-reduction methods. We refer to, for instance, Rubinstein (1981) or Fishman (1996). These cost-reducing methods are designed to obtain a smaller standard error using the same number of observations. We mention antithetic variates, importance sampling, stratified sampling and correlated sampling.

How many observations should one collect to ensure a particular statistical accuracy of the sample variance? Denote the variance of an endogenous variable $y_t$ as $\mathrm{Var}(y_t)$. The sample variance is an unbiased estimator for $\mathrm{Var}(y_t)$ with standard error $s/\sqrt{N}$, where $s$ is the standard deviation of $y_t$. In other words, the error in our variance estimator decreases at the rate $N^{-1/2}$: quadrupling the sample size $N$ halves the error. In Section 5, we present results of a convergence test to demonstrate this rate.

Most disturbances (sources of uncertainty) in our analysis are modelled by the multivariate normal distribution. For completeness, we repeat this distribution here. Let $Z = (Z_1, \ldots, Z_r)^T$ denote a random vector with a multivariate normal distribution. Let $\mu$ and $\Omega$ denote its mean vector and covariance matrix. Its probability density function then reads

$$f(z) = (2\pi)^{-r/2}\, |\Omega|^{-1/2} \exp\left( -\tfrac{1}{2} (z-\mu)^T \Omega^{-1} (z-\mu) \right). \tag{3.2}$$

The applied programming language does not facilitate direct sampling from this distribution. However, a sample can easily be derived using the Cholesky decomposition of the covariance matrix $\Omega$. For details, we refer to Appendix B. Note that the multivariate normal distribution is fully specified by its first and second order moments. This property is exploited extensively when determining the probability distributions of the sources of uncertainty, see Section 4. The first moment of the multivariate normal distribution is set equal to the undisturbed (deterministic) value. The uncertainty is modelled by the second-order moment, viz. the covariance matrix.
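The procedure of this section can be sketched in a few lines. The example below is a minimal illustration and not the SAFFIER implementation: it assumes NumPy as the tool, a hypothetical non-linear toy forecast function `toy_forecast`, and an invented mean vector and covariance matrix for two parameters. It draws parameter sets via the Cholesky factor of $\Omega$, runs one deterministic forecast per draw, and evaluates the sample mean and the standard error following (3.1).

```python
import numpy as np

def sample_multinormal(mu, omega, n_draws, rng):
    """Draw from N(mu, omega) using z = mu + L e, with omega = L L^T and e ~ N(0, I)."""
    chol = np.linalg.cholesky(omega)              # lower-triangular factor L
    e = rng.standard_normal((n_draws, len(mu)))   # independent standard normals
    return mu + e @ chol.T

def toy_forecast(beta, x=100.0):
    """Hypothetical non-linear 'model': stands in for one deterministic model run."""
    elasticity, scale = beta
    return scale * x ** elasticity

def monte_carlo(mu, omega, n_draws, seed=0):
    """Crude Monte Carlo: sample inputs, run the model per draw, summarise."""
    rng = np.random.default_rng(seed)
    draws = sample_multinormal(mu, omega, n_draws, rng)
    forecasts = np.array([toy_forecast(b) for b in draws])
    mean = forecasts.mean()
    variance = forecasts.var(ddof=1)              # 1/(N-1) normalisation as in (3.1)
    return draws, mean, np.sqrt(variance)

mu = np.array([0.8, 1.5])                         # illustrative deterministic parameter values
omega = np.array([[0.0025, 0.0010],
                  [0.0010, 0.0100]])              # illustrative covariance matrix
draws, mean, std_error = monte_carlo(mu, omega, n_draws=20_000)
```

The mean of the draws is set to the undisturbed parameter values, so all uncertainty enters through `omega`, in line with the convention described above.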


4 Modelling the sources of uncertainty

Each Monte Carlo simulation requires a probability density function which describes the uncertainty associated with the particular source of uncertainty. In this section, we model the four sources of uncertainty. First, we formulate these models in general terms, each model containing a disturbance term specified by a probability density function. Subsequently, their distributions are obtained for SAFFIER-specific components.

Model complexity and data restrictions imply that each source of uncertainty is investigated independently of the others. Unfortunately, this independence assumption can be violated, as can be deduced, for instance, from the close relation between the parameter estimates and the error term in a particular model equation. The uncertainty in an error term is based on observed (historical) error terms obtained conditional on deterministic parameter values. Controlling for uncertainty in these parameters will probably explain part of the variation in the error term.⁷ Separate estimation of the various sources most likely results in overestimation of their uncertainty. Our results on the shares of the individual sources of uncertainty in the total model uncertainty should therefore be considered indicative.

Applied data series are discussed, and we consider intermediate results leading to a correct model specification of the particular source of uncertainty.⁸ This section is organised per source of uncertainty.

Although improved over the years, data shortage still interferes with an accurate and fully integrated estimation of the sources of uncertainty. Where possible, model complexity is reduced and expert opinion is called upon, even further simplifying the model of a particular source of uncertainty. These restrictions and simplifications are discussed in the implementation parts of the next subsections.

In the implementation stage of the estimated probability distribution, we have introduced lower and upper bounds. Consider for instance a variable x which is known to be positive. Random sampling from the normal distribution of this variable can result in a few negative sample outcomes. These values should be removed from our sample before simulation with the macroeconomic model. The normal distribution has unbounded support, so extreme values can occur with a probability depending on the standard deviation of the distribution. In principle, we could have avoided small sample reductions had we estimated a probability distribution with zero probability outside the range of admissible values. However, as long as the standard deviation of the normal distribution remains within limits, our approach suffices. Whenever bounds are implemented, we ensure that their ranges are as wide as possible while respecting the symmetric character of the underlying multivariate distribution. Furthermore, the restricted sample is evaluated by computing its sample mean and standard deviation. These sample statistics should not deviate significantly from their original values.

7 The equations are not re-estimated every year when new data from National Accounts become available. This can lead to more variation in the error terms than in the situation where the equations would be re-estimated every year.

8 Details on the distribution of the sources of uncertainty are available on request.
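As an illustration, the bounding step described above can be sketched as follows. This is a minimal sketch, not the implementation used with SAFFIER: the function name `draw_restricted`, the example mean and standard deviation, and the choice to mirror a lower bound of zero around the mean are assumptions made for the example.

```python
import random
import statistics

def draw_restricted(mean, sd, n, lower=0.0, seed=1):
    """Draw n normal variates and drop those outside symmetric bounds.

    The upper bound mirrors the lower bound around the mean, so the
    restricted sample stays symmetric (an assumption of this sketch).
    """
    rng = random.Random(seed)
    upper = mean + (mean - lower)
    draws = [rng.gauss(mean, sd) for _ in range(n)]
    return [x for x in draws if lower <= x <= upper]

# Hypothetical positive variable: mean 2.0, standard deviation 0.8.
sample = draw_restricted(mean=2.0, sd=0.8, n=10_000)

# Compare the restricted sample statistics with the original values.
kept_share = len(sample) / 10_000
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)
print(f"kept {kept_share:.3f}, mean {sample_mean:.3f}, sd {sample_sd:.3f}")
```

With bounds at 2.5 standard deviations, only a small share of the draws is removed and the sample mean is nearly unchanged, while the sample standard deviation shrinks slightly. This is the kind of check the restricted sample is subjected to.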

4.1

Uncertainty in provisional data

Before stating a definite figure, Statistics Netherlands (CBS) publishes several preliminary estimates of historical macroeconomic variables. Since they randomly deviate from their final value, these provisional estimates introduce uncertainty in model outcomes when used as initial data. CBS publishes final data 30 months (2.5 years) after the close of the calendar year. A forecast made in year t thus incorporates preliminary realisation data for years (t − 1) and (t − 2). These data points coincide with the so-called provisional (after 6 months) and revised provisional (after 18 months) data. Let $w_1(t)$ and $w_2(t)$ denote vectors containing n provisional and n revised provisional variables for year t, respectively, and let $w_3(t)$ denote the n-vector containing the ‘final’ data of these macroeconomic variables. We model

$$w_1(t) = A_1 w_3(t) + b_1 + u_1(t), \qquad (4.1)$$

$$w_2(t) = A_2 w_3(t) + b_2 + u_2(t), \qquad (4.2)$$

where $b_1$ and $b_2$ are parameter vectors containing n elements each, and $A_1$ and $A_2$ are two diagonal matrices with elements $\alpha_{1i}$ and $\alpha_{2i}$ for $i = 1, \ldots, n$. $u_1(t)$ and $u_2(t)$ are residual vectors that are contemporaneously cross correlated per variable, so

$$E(u_1(t)) = E(u_2(t)) = 0, \quad \mathrm{Var}(u_{1i}(t)) = E(u_{1i}^2(t)) = \sigma_{1i}^2, \quad \mathrm{Var}(u_{2i}(t)) = E(u_{2i}^2(t)) = \sigma_{2i}^2, \quad E(u_{1i}(t)\, u_{2i}(t)) = \sigma_{12i}^2, \qquad (4.3)$$

and $E(u_{1i}(t)\, u_{2j}(s)) = 0$ for $s \neq t$.
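To make the structure of (4.1)–(4.3) concrete, the following sketch draws one provisional and one revised provisional figure for a single variable with contemporaneously correlated disturbances. It assumes normally distributed disturbances; the function name `draw_provisional` and all numerical values are hypothetical, and the argument `cov12` plays the role of the covariance in (4.3).

```python
import math
import random

def draw_provisional(w3, sigma1, sigma2, cov12, rng,
                     alpha1=1.0, alpha2=1.0, b1=0.0, b2=0.0):
    """Return one (provisional, revised provisional) pair for a final value w3."""
    corr = cov12 / (sigma1 * sigma2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    u1 = sigma1 * z1
    # u2 shares the shock z1 so that Cov(u1, u2) = cov12.
    u2 = sigma2 * (corr * z1 + math.sqrt(1.0 - corr ** 2) * z2)
    return alpha1 * w3 + b1 + u1, alpha2 * w3 + b2 + u2

rng = random.Random(42)
# Hypothetical values: final growth 2.0%, sd 0.5 and 0.3, covariance 0.09.
pairs = [draw_provisional(2.0, 0.5, 0.3, 0.09, rng) for _ in range(20_000)]

m1 = sum(p[0] for p in pairs) / len(pairs)
m2 = sum(p[1] for p in pairs) / len(pairs)
sample_cov = sum((p[0] - m1) * (p[1] - m2) for p in pairs) / (len(pairs) - 1)
print(round(m1, 2), round(m2, 2), round(sample_cov, 3))
```

Because successive calls use independent shocks, disturbances for different years are uncorrelated, in line with the last condition following (4.3).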

This model is established under specific assumptions which are supported by tests on our data. Below, we reflect on these assumptions.

First, the variables contained in $w_1(t)$, $w_2(t)$ and $w_3(t)$ concern growth rates. As shown in Van Vlimmeren et al. (1991), these rates are less sensitive to heteroskedasticity than their level counterparts. We test for heteroskedasticity in $u_1(t)$ and $u_2(t)$ using White’s heteroskedasticity test on all $w_{1i}(t) = \alpha_{1i} w_{3i}(t) + b_{1i} + u_{1i}(t)$ and $w_{2i}(t) = \alpha_{2i} w_{3i}(t) + b_{2i} + u_{2i}(t)$, separately.

Second, we assume that the residuals $u_1(t)$ and $u_2(t)$ are not serially correlated. As a first reference, we investigate the correlogram of the sample, i.e., we graph both the empirical autocorrelation and partial autocorrelation functions of the sample and investigate the values of their coefficients.

Third, most variables are investigated independently of each other. This can be a severe model restriction. However, our data series on the CBS provisional and revised provisional data are too short to allow estimation of a fully correlated system. Although the CBS provisional data can be recovered, it is not always possible to reformulate these data into past SAFFIER data definitions. Therefore, we only group variables when their macroeconomic interpretation strongly suggests a connection. For instance, GDP volume consists for about 35% of private consumption, so investigating their data separately would seem highly implausible. We test for cross correlations on a one-to-one basis, investigating the correlation between $v_{1i}(t) = w_{1i}(t) - w_{3i}(t)$ and $v_{1j}(t) = w_{1j}(t) - w_{3j}(t)$, and the correlation between $v_{2i}(t) = w_{2i}(t) - w_{3i}(t)$ and $v_{2j}(t) = w_{2j}(t) - w_{3j}(t)$, for all i and j.

Fourth, we assume that $w_1(t)$ and $w_2(t-1)$ display no cross correlation, whereas $w_1(t)$ and $w_2(t)$ do. In that sense, our model differs from Van Vlimmeren et al. (1991), whose model assumes that $w_1(t)$ and $w_2(t-1)$ display cross correlation and that $w_1(t)$ and $w_2(t)$ do not. Van Vlimmeren et al. (1991) explain cross correlation between $w_1(t)$ and $w_2(t-1)$ by the date of publication of these data in the National Accounts in year (t + 1): these data result from the same available information. Pursuing this argument, $w_1(t)$ and $w_2(t)$ should reveal a weaker correlation. However, a different argument advocates a stronger correlation between $w_1(t)$ and $w_2(t)$. Although published in subsequent years, these provisional and revised provisional data can suffer from equidirectional forecast bias. Both data points forecast the final outcome for year t. Over- or underestimation in these provisional data can be persistent, as Statistics Netherlands might be cautious about making harsh adjustments. Patterson and Heravi (2004) and Lynch and Richardson (2004) discuss both approaches. We analyse the cross correlations by examining the cross correlation between $v_1(t)$ and $v_2(t-1)$ and between $v_1(t)$ and $v_2(t)$. For some variables, neither of the two cross correlations is significant. In most cases, the cross correlations between $v_1(t)$ and $v_2(t)$ are. In the model, cross correlations are captured in terms of the covariance matrix $\Sigma$.
The implementation of the uncertainty modelled in Van Vlimmeren et al. (1991) is straightforward. A sample is drawn from the multivariate normal distribution correlating the disturbances on the revised provisional and provisional data in years (t − 2) and (t − 1) preceding simulation year t. In our case, the correlated disturbances on the provisional and revised provisional data succeed each other and thus influence subsequent forecasts. We therefore independently and identically draw a disturbance on the revised provisional data and provisional data for years (t − 2) and (t − 1) from the estimated multivariate distribution.

Finally, we mention that the above tests assume that $A_1$ and $A_2$ are ‘close’ to the identity matrix and $b_1$ and $b_2$ are ‘close’ to the zero vector. In words, we assume that the provisional and revised provisional data are symmetrically distributed around their final realisation with no systematic under- or overestimation. We use a Wald test on $\alpha_{1i} = 1$, $\alpha_{2i} = 1$, $b_{1i} = 0$ and $b_{2i} = 0$ to verify this assumption. If these restrictions are not rejected at the 5% significance level, our model description is implemented with $A_1 = A_2 = I_n$ and $b_1 = b_2 = 0$. Note that an F-test is not an appropriate alternative, as our model allows for cross correlation in the residuals $u_{1i}(t)$ and $u_{2i}(t)$.

Summarising, uncertainty due to provisional data is modelled in a seemingly unrelated regression approach. Most variables are investigated independently of the others, assuming that any correlation surfaces through cross correlation in the residuals between provisional and revised provisional data for year t. Uncertainty associated with the ‘final’ publication of the Statistics Netherlands data, i.e. errors due to unreliable data, is not considered. Quantifying this type of error is troublesome and beyond the scope of a sensitivity analysis of the SAFFIER model.
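The Wald test on $\alpha = 1$ and $b = 0$ can be illustrated for a single equation. The sketch below is a simplification and an assumption on our part: it uses equation-by-equation OLS and therefore ignores the cross correlation between the provisional and revised provisional residuals that the seemingly unrelated regression setting allows for. The function name `wald_unbiased` and the data are hypothetical; the statistic is compared with the 5% critical value of the chi-squared distribution with two degrees of freedom.

```python
def wald_unbiased(provisional, final):
    """Wald statistic for H0: alpha = 1 and b = 0 in provisional = alpha*final + b + u."""
    n = len(final)
    mx = sum(final) / n
    my = sum(provisional) / n
    sxx = sum((x - mx) ** 2 for x in final)
    sxy = sum((x - mx) * (y - my) for x, y in zip(final, provisional))
    alpha = sxy / sxx
    b = my - alpha * mx
    resid = [y - b - alpha * x for x, y in zip(final, provisional)]
    s2 = sum(e * e for e in resid) / (n - 2)
    # Deviations from the hypothesised values (b, alpha) = (0, 1).
    d_b, d_a = b, alpha - 1.0
    sum_x = sum(final)
    sum_x2 = sum(x * x for x in final)
    # W = d' [s2 (X'X)^-1]^-1 d = d' (X'X) d / s2 with X = [1, final].
    return (n * d_b ** 2 + 2 * sum_x * d_b * d_a + sum_x2 * d_a ** 2) / s2

CHI2_2DF_5PCT = 5.991  # 5% critical value, chi-squared with 2 degrees of freedom

# Hypothetical final growth rates and revision noise.
final = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.1]
unbiased = [x + e for x, e in zip(final, noise)]
biased = [0.5 * x + 1.0 + e for x, e in zip(final, noise)]

w_unbiased = wald_unbiased(unbiased, final)
w_biased = wald_unbiased(biased, final)
print(w_unbiased < CHI2_2DF_5PCT, w_biased > CHI2_2DF_5PCT)
```

In the first series the restrictions are not rejected, so the simplified model with identity matrices and zero constants would be used; in the second series they are rejected.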

Implementation

We model the uncertainty in the provisional data for the following endogenous variables: employment, exports of manufactured goods (V, P), consumption (V), investment of the private sector (V), contractual wages, imports of goods (V), and gross national product in market prices (V, P). A V and/or P denotes that the endogenous variable concerns a volume or price variable. We apply a sample containing provisional, revised provisional and final data between 1993 and 2003. This data set is collected from Central Economic Plans (CEPs) between 1992 and 2005. The presented variables on year (t − 2) and (t − 3) in a Central Economic Plan of year t coincide with the provisional and revised provisional data in the National Accounts. We incorporate published CEP data instead of data from the National Accounts, because SAFFIER data definitions can differ from operational definitions at the CBS.

Our data series is short, and it is further reduced by excluding the data on years 1996, 1997, 2002 and 2003. Every five to ten years, CBS revises its macroeconomic historical data series. These revisions induce a definition change between presented provisional, revised provisional and final data for a given year t: some of these data are published before and others after the revision. The data exclusion follows from the major revisions in 1995 and 2001.9

We test the endogenous variables for serial correlation and for contemporaneous cross correlation in the differences between the provisional data and the ‘final’ outcome, $v_1(t)$, and in the differences between the revised provisional data and the ‘final’ outcome, $v_2(t)$. Moreover, we focus on the cross correlation between the endogenous variables. None of the variables show significant serial correlation. Contemporaneous cross correlation is significant at the 5% level only between private consumption (V) and gross national product (V), for both differences $v_1$ and $v_2$.
Per variable, we also investigate the cross correlation between v1i (t) and v2i (t), and v1i (t) and v2i (t − 1). For most variables, the data demonstrates significance on a 5%-level for the first cross correlation and not for the second. The remaining variables display no significant cross correlation on a 5%-level for both differences. As a consequence, we let model (4.1)–(4.2) describe the uncertainty associated with the provisional data assuming independence between all variables except for private consumption and gross national product in market prices volumes.
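The two cross correlations examined per variable can be computed as in the sketch below. The data are hypothetical, and the significance band of two standard errors, $\pm 2/\sqrt{T}$, is an assumption used here as a rough stand-in for a 5% test; the test actually applied to the CEP data is not specified in this section.

```python
import math

def correlation(a, b):
    """Sample correlation coefficient of two equally long series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def revision_cross_correlations(v1, v2):
    """Correlate v1(t) with v2(t) and with v2(t-1)."""
    same_year = correlation(v1, v2)
    lagged = correlation(v1[1:], v2[:-1])
    return same_year, lagged

# Hypothetical revision errors (percentage points) for one variable.
v1 = [0.3, -0.2, 0.5, -0.4, 0.1, -0.3, 0.2]
v2 = [0.5 * x for x in v1]  # revised provisional errors are half as large here

same_year, lagged = revision_cross_correlations(v1, v2)
bound = 2.0 / math.sqrt(len(v1))  # rough two-standard-error band (assumption)
print(abs(same_year) > bound, abs(lagged) > bound)
```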

9 The revisions of 1995 are published in 1999 incorporating new data definitions for provisional data in 1997, revised provisional data in 1996 and final data in 1995. Similarly, the revisions of 2001 are published in 2005 and employ new data definitions for provisional data in 2003, revised provisional data in 2002 and final data in 2001.

We estimate our model including both the matrices $A_1$ and $A_2$, and the constant vectors $b_1$ and $b_2$. For each variable, we subsequently apply a Wald test on the restrictions $b_{1i} = 0$, $b_{2i} = 0$. The test results indicate that we can further simplify our model, setting the constant vectors equal to zero. Re-estimating our model, we find $\hat{A}_1$ and $\hat{A}_2$.

Note that some of the disturbed endogenous variables form an identity in our model. They are composed of several other endogenous variables and can only be adjusted by a shift in one or more of these variables. Implementation of the disturbed series thus requires consistent shifts in these variables. These shifts can be induced by the so-called observations procedure. This procedure projects certain endogenous variables onto recent observations, respecting the model formulation and adjusting some prescribed endogenous variables by means of the residuals in the behavioural equations. For a reference on this procedure, see Sandee et al. (1984).

4.2

Uncertainty in the exogenous variables

The uncertainty in the exogenous variables is modelled by using data on observed short-term and medium-term forecast errors. The short-term forecast concerns a one-year-ahead forecast. The medium-term forecast has a four-year-ahead forecast horizon.10 For both forecasts, we have data available on the yearly growth rates of various exogenous variables. For the medium-term forecast, these yearly growth rates comprise mean growth rates over the four-year forecast period.

First, we restrict ourselves to the one-year-ahead forecast error model. Let $g_1(t+1)$ denote the vector with one-year-ahead forecasted growth rates $g_{1i}(t+1)$ of exogenous variable i in year (t + 1), and let the vector $g(t+1)$ with elements $g_i(t+1)$ represent their realisations. Let $u_1(t+1) \in \mathbb{R}^{n_e}$ denote a vector with elements $u_{1i}(t+1)$ describing the one-year-ahead forecast errors of the $n_e$ exogenous variables i in year (t + 1), made in year t: $u_1(t+1) = g_1(t+1) - g(t+1)$. Our data series on the forecast errors have time range $t = 1, \ldots, T$. We assume that the forecast errors $u_1(t)$ are innovations $\varepsilon_1(t)$ which are normally distributed with mean $\mu_1$ and covariance matrix $\Sigma_1$, where $\sigma_{1ij}$ is the ij-th element of $\Sigma_1$. The one-year-ahead forecast errors of the various exogenous variables can be contemporaneously cross correlated. The innovations are identically and independently distributed over time. Our model reads

$$u_1(t) = \varepsilon_1(t) \quad \text{with} \quad \varepsilon_1(t) \sim N(\mu_1, \Sigma_1), \quad \forall t = 1, \ldots, T. \qquad (4.4)$$

10 In some medium-term forecasts, the forecast horizon consisted of a five-year instead of the regular four-year period.

In a similar notation, we define $u_2(t+2)$, $u_3(t+3)$ and $u_4(t+4)$ as the two-, three- and four-year-ahead forecast errors in the yearly growth rates of the exogenous variables $i = 1, \ldots, n_e$. The errors are obtained as $u_{ki}(t+k) = g_{ki}(t+k) - g_i(t+k)$, where $g_{ki}(t+k)$ denotes the forecast, made in year t, of the growth rate of exogenous variable i in year t + k. Since the growth rates $g_k(t+k)$ are forecasted in the same year t and thus evolve from the same information set, it seems plausible that the k-year-ahead forecast errors $u_k(t+k)$ with $k = 1, \ldots, 4$ are correlated. We assume that this correlation can be captured by the following autoregressive process

$$u_k(t+k) = M_\rho\, u_{k-1}(t+k-1) + \varepsilon_k(t+k) \quad \text{with} \quad \varepsilon_k(t+k) \sim N(\mu_1, \Sigma_1), \qquad (4.5)$$

where $M_\rho$ is a diagonal matrix with $(M_\rho)_{jj} = \rho_j$. For k = 0, we have $u_0(t) = 0$. Pooling the data over the $T^*$ medium-term forecasts, we can estimate the correlation coefficients $\rho_j$, the mean vector $\mu_1$ and the covariance matrix $\Sigma_1$. Unfortunately, our medium-term forecast data do not provide information on the individual two-, three- and four-year-ahead (yearly) growth rates $g_k(t+k)$; instead they provide mean yearly growth rates $g_{MT}(t_i)$, where $g_{MT}(t_i)$ is the mean yearly growth rate over the period $(t_i + 1)$ until $(t_i + 4)$ for $i = 1, \ldots, T^*$. Therefore an alternative approach is necessary to derive the autocorrelation coefficients $\rho_j$. We model the forecast error in the mean growth rates by a multivariate process,

$$u_{MT}(t_i) = \varepsilon_{MT}(t_i) \quad \text{with} \quad \varepsilon_{MT}(t_i) \sim N(\mu_{MT}, \Sigma_{MT}), \quad \forall i = 1, \ldots, T^*. \qquad (4.6)$$

Estimating (4.6), we find $\hat{\mu}_{MT}$ and $\hat{\Sigma}_{MT}$. Again, an F-test is applied to test for $\mu_{MT} = 0$. We recall the relation between the k-year-ahead forecasts of the growth rates and the mean growth rates given by the medium-term forecasts,

$$\left(1 + g_{1k}(t_i+1)\right)\left(1 + g_{2k}(t_i+2)\right)\left(1 + g_{3k}(t_i+3)\right)\left(1 + g_{4k}(t_i+4)\right) = \left(1 + g_{MTk}(t_i)\right)^4, \quad \text{where } k = 1, \ldots, n_e. \qquad (4.7)$$

Linearizing equation (4.7) around the realisation of the mean growth rate leads to

$$u_{1k}(t_i+1) + u_{2k}(t_i+2) + u_{3k}(t_i+3) + u_{4k}(t_i+4) = 4\, u_{MTk}(t_i). \qquad (4.8)$$

So combining equation (4.8) and the autoregressive process (4.5), we approximate the autocorrelation coefficients by equating

$$\left(1 + \rho + \rho^2 + \rho^3\right) \hat{\sigma}_{1kk} = \hat{\sigma}_{MTkk}. \qquad (4.9)$$

In words, we set the autocorrelation coefficients such that the resulting standard deviations for a forecast over a four-year period comply with the standard deviation for observed medium-term forecast errors derived in (4.6).
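The calibration idea, choosing the autocorrelation coefficient so that the implied four-year standard deviation matches the observed medium-term one, can be sketched for a single exogenous variable. Note the assumptions: the sketch does not use the approximation (4.9) but the exact standard deviation of the mean of $u_1, \ldots, u_4$ implied by (4.5) and (4.8) with independent innovations and $u_0 = 0$; it ignores cross correlation between variables; and the function names and numbers are hypothetical.

```python
import math

def implied_mt_sd(rho, sigma1, horizon=4):
    """Std. dev. of the mean of u_1..u_horizon under u_k = rho*u_{k-1} + eps_k, u_0 = 0."""
    # eps_j enters u_j, ..., u_horizon with weights 1, rho, rho^2, ...
    weights = [sum(rho ** m for m in range(horizon - j + 1))
               for j in range(1, horizon + 1)]
    return sigma1 * math.sqrt(sum(w * w for w in weights)) / horizon

def calibrate_rho(sigma1, sigma_mt, tol=1e-10):
    """Bisection for rho in [0, 1] such that implied_mt_sd matches sigma_mt."""
    lo, hi = 0.0, 1.0
    if not implied_mt_sd(lo, sigma1) <= sigma_mt <= implied_mt_sd(hi, sigma1):
        raise ValueError("target not attainable with rho in [0, 1]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if implied_mt_sd(mid, sigma1) < sigma_mt:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical inputs: one-year sd 2.0, medium-term (mean growth) sd 1.5.
rho_hat = calibrate_rho(sigma1=2.0, sigma_mt=1.5)
print(round(rho_hat, 3))
```

Bisection is sufficient here because the implied standard deviation increases monotonically in the coefficient on the interval from zero to one.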

Implementation

We consider the following 9 exogenous variables: the long-term interest rate, the share price, the growth rate of world trade volume, the price of competitive exports, and the prices of final imports for re-export, imports of consumption goods excluding energy, imports of raw materials and semi manufactures excluding energy, imports of energy and imports of investment goods. For each of the exogenous variables, historical data on realisations and forecasts are available. For the interest rate, we rely on historical data on yearly forecasts and realisations since 1989. For the share prices, we have data available since 1980. The data series of the other exogenous variables on yearly forecasts and realisations date back to 1971. Part of these data is analysed in Kranendonk and Verbruggen (2005). We define the forecast error as the difference between the next-year forecast from the Macro Economic Outlook (MEV), published in September each year, and its (final) realisation value published by Statistics Netherlands. This difference is chosen to facilitate a comparison between our results and the results in Don (1994), who also used this forecast error.

Besides data on the one-year-ahead forecast error, our analysis requires data on CPB’s medium-term forecasts, which are published in Kranendonk and Verbruggen (2006c). This data set consists of 10 medium-term forecasts published between 1976 and 2001. Since 1993, i.e. for three medium-term forecasts, the CPB has presented a cautious and an optimistic scenario for the Dutch economy. In our analysis, we include the average of the two scenarios. Before 1993, the medium-term forecast consisted of a central projection. The data on share prices and the long-term interest rate are limited. Therefore, we assume that the forecast errors of these exogenous variables do not display an autoregressive pattern. This assumption can be partly justified by the ‘chaotic’ behaviour of the share price itself. Based on the model (4.5), we determine the one-year-ahead forecast error mean $\mu_1$, the covariance matrix $\Sigma_1$ and the autocorrelation coefficients $\rho_j$. These estimates are given in ?.
An F-test on the mean $\mu_1$ of the forecast errors reveals structural under- or overestimation for some exogenous variables. The growth rate of world trade volume, for instance, is underestimated by 0.9 percentage points per year, see also Kranendonk and Verbruggen (2005). The exogenous input can be adjusted so as to cover this over- or underestimation. However, we duplicate the current exogenous data series and only perturb this series by application of the estimated covariance matrix. In this way, we more closely resemble CPB forecasts and variation runs.

We simultaneously estimate the growth rate of world trade volume, the price of competitive exports, and the price of total imports of goods. The disturbance on the latter variable should be distributed over the various import prices which compose this identity.11 The distribution factors are determined by weighting the standard deviations of these various prices with their correlation coefficient with the total import price and their share in the total import price identity. Their autocorrelation coefficients are chosen to mimic either the coefficient of the price of competitive exports (for the prices of imports for re-export, imports of consumption goods excluding energy, imports of raw materials and semi manufactures excluding energy, and imports of investment goods) or the coefficient of the price of total imports (for the price of imports of energy). In a similar fashion, the autocorrelation coefficients of the prices of imports of consumption goods excluding energy and imports of investment goods are assumed similar, as both are final goods. The imports for re-export and imports of raw materials and semi manufactures excluding energy are equally shocked, as both are intermediate products.

11 SAFFIER distinguishes import prices for consumption goods, investment goods, intermediate goods, re-exports and energy.

4.3

Uncertainty in model equation parameters

In this section, we quantify the uncertainty associated with the parameters in the behavioural equations of the SAFFIER model. First, we consider the special structure of these behavioural equations, after which we describe the techniques applied when estimating their parameters and the associated asymptotic covariance matrices.

Following Engle and Granger (1987), the behavioural equation of an endogenous variable in SAFFIER is mostly modelled by an error-correction specification with a long- and a short-term equation. The long-term equation presents a relation between the long-term equilibrium value and various explanatory variables, i.e.,

$$\ln y^*(t) = x_{lt}(t)^T \beta_{lt} + c, \qquad (4.10)$$

where $y^*(t) \in \mathbb{R}$ denotes the long-term equilibrium value of the endogenous variable $y(t)$, and $x_{lt}(t) \in \mathbb{R}^{k_{lt}}$ and $\beta_{lt} \in \mathbb{R}^{k_{lt}}$ denote vectors containing $k_{lt}$ explanatory variables and $k_{lt}$ parameters, respectively. c is a constant. The short-term equation determines the growth rate of the endogenous variable $y(t)$, thus capturing the short-term dynamics

$$\dot{y}(t) = x_{st}(t)^T \beta_{st} - \varepsilon \left(\ln y(t) - \ln y^*(t)\right)_{-1}, \quad \forall t = 1, \ldots, T, \qquad (4.11)$$

where $x_{st}(t) \in \mathbb{R}^{k_{st}}$ and $\beta_{st} \in \mathbb{R}^{k_{st}}$ denote vectors containing $k_{st}$ explanatory variables and $k_{st}$ parameters, respectively. The error correction term in the short-term equation partially corrects for deviations of the endogenous variable $y(t)$ from its equilibrium value $y^*(t)$. The parameter $\varepsilon$ determines the speed of this adjustment.

We assume that the parameters within a behavioural equation, both in the long- and short-term equations, can be correlated. Parameters between different behavioural equations, on the other hand, are considered uncorrelated. This assumption seems plausible, because in practice most parameter sets are estimated separately per behavioural equation. Hence, uncertainty in a parameter set of a behavioural equation is modelled as

$$(\beta_{lt}, c, \beta_{st}, \varepsilon)^T = \hat{\mu} + u, \quad u \sim N(0, \Sigma), \qquad (4.12)$$

where $\hat{\mu} = (\hat{\beta}_{lt}, \hat{c}, \hat{\beta}_{st}, \hat{\varepsilon})^T$ denotes the operational set of parameter values and $\Sigma$ denotes the covariance matrix of the disturbances on these values. Note that the parameters are assumed constant over time.
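A draw from (4.12) can be generated with a Cholesky factor of the covariance matrix. The sketch below is illustrative only: the helper names `cholesky` and `draw_parameters`, the two-parameter set and its covariance matrix are hypothetical and do not correspond to an actual SAFFIER equation.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def draw_parameters(mu_hat, sigma, rng):
    """One draw of the parameter vector: mu_hat + u with u ~ N(0, sigma)."""
    low = cholesky(sigma)
    z = [rng.gauss(0.0, 1.0) for _ in mu_hat]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mu_hat)]

# Hypothetical two-parameter set (e.g. a long-term elasticity and the
# error-correction speed) with an assumed covariance matrix.
mu_hat = [0.8, 0.25]
sigma = [[0.0100, 0.0012],
         [0.0012, 0.0025]]

rng = random.Random(7)
draws = [draw_parameters(mu_hat, sigma, rng) for _ in range(5_000)]
mean_draw = [sum(d[i] for d in draws) / len(draws) for i in range(2)]
print([round(m, 3) for m in mean_draw])
```

Each draw replaces the whole parameter set of one behavioural equation at once, so correlations within the set are preserved while sets of different equations remain independent.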


Within each behavioural equation, we distinguish two sets of parameters, viz. the estimated and the fixed parameters. The uncertainty in the estimated parameters is easily quantified by setting the covariance matrix in (4.12) equal to the estimator of the covariance matrix of these parameters conditional on their estimation method. These estimation methods and the covariance matrix estimators are discussed below. The uncertainty associated with the fixed parameters is obtained by asking several experts for their opinion on the variances of the disturbances on these parameters. Recall that the fixed parameters result from the iterative process described in Section 2.2. Naturally, the disturbances on the fixed and estimated parameters are correlated. However, we evaluate their effect on the total model uncertainty separately, assuming these correlations to be small. When questioned, the experts indicated these correlations by sign rather than by significance.

For estimating the parameters in the long- and short-term equations contained in SAFFIER, three different methods have been used: a two-step, a non-linear and a three-stage least-squares method. We acknowledge that these methods have some shortcomings when it comes to estimating systems like (4.10) and (4.11). However, we intend to work with the original estimation methods which led to the parameter estimates currently employed in SAFFIER. In appendix C, we briefly discuss the three estimation methods and address their advantages and disadvantages.

Implementation SAFFIER contains approximately 50 behavioural equations divided over several macroeconomic categories, viz. private consumption, government, investment, labour market etc. We consider 10 prominent equations from these categories. They are the equations for stock building, private consumption (V), imports of consumer goods, imports of investment goods, imports of travelling services, the intermediate imports of raw material and semi manufactured products excluding energy, imports of intermediary services (V), labour and capital demand, domestic dwellings (P), re-exports excluding energy (V), exports of services (V), exports of domestic origin (V, P) and the wage equation. Most equations can be subdivided in long- and short-term equations. In order of appearance, the private consumption until the price of domestic dwellings are estimated using 2SLS, the long- and short-term exports of services (V) and the re-exports excluding energy (V) are estimated simultaneously, the exports of domestic origin (V, P) and the wage equation apply the 3SLS technique. We apply these estimations to quantify the uncertainty associated with the various parameter sets employed in our model. Modelling uncertainty associated with estimated parameters seems natural. Along the same lines, the uncertainty associated with the fixed parameters in our model would be presumed absent. However, this assumption is invalid. Their values are fixed by expert opinion under uncertainty of the exact relation. Their uncertainty can even be more prominent than for the


estimated parameters, because they are mostly chosen to ensure a correct description of the modelled macroeconomic relation. As mentioned in Section 4.3, the influence of the uncertainty associated with the fixed parameters is analysed independently of the uncertainty of the estimated ones. In the remainder of this section, we concentrate on the estimated parameter sets.

The time range of the data series applied in the estimation of the various equations varies. Although more data points might increase the accuracy associated with the parameter set, it is more important that the estimation period is representative for current and future macroeconomic behaviour. For instance, do additional data points from the seventies increase accuracy when they introduce high inflation into our estimation data? We therefore adopt the data series from the original estimations. An equal time range for all equations would be preferable, but re-estimation of the equations would alter the currently operational model at the CPB and is therefore not advisable. The data series fall within the period 1971 to 2003, so at most 32 observations are included in the estimation process. The number of observations is not large, but it is not restrictive for identification of the parameter sets or the corresponding covariance matrices.

We conclude with three remarks. First, a parameter shock is permanent: the parameters are disturbed in the shock period and retain their adjusted values until the end of the simulation period. Second, the parameters per behavioural equation are shocked simultaneously by means of their multivariate distribution. Our analysis does not allow for the separate identification of the effect of a disturbance on a specific parameter in the set. Third, some parameters with high standard errors are restricted to lie within theoretically acceptable upper and lower bounds.

4.4

Uncertainty in the error terms in behavioural equations In this section, we derive the model specification of the uncertainty associated with the error terms in the behavioural equations. These error terms adjust the behavioural equations for misspecification or random events. In the forecasting process these terms can be used to add expert opinion to the model-forecast. A discussion on the effect of expert opinion is published in Franses, Kranendonk and Lanser (2007). However, we do not intend to model the uncertainty associated with this expert opinion. Instead, uncertainty associated with the error terms refers to a second interpretation: how to set the error terms when reproducing historical macroeconomic data with our model? When forecasting, the error terms are unknown and thus uncertain by definition. We assume that the error terms satisfy a zero mean condition. Moreover, we assume that the behavioural equations forecast the various macroeconomic variables without persistent over- or underestimation. The zero mean assumption and the specific form of the error terms can be tested and determined by means of historical data on the error terms. Assume we have n behavioural equations, each containing one error term. Let ri (t) denote


the error term of the i-th behavioural equation at time t. We model these error terms as

$$r_i(t) = \alpha_i + u_i(t), \quad \forall i = 1, \ldots, n, \ \forall t = 1, \ldots, T, \qquad (4.13)$$

where $\alpha_i$ is a constant for behavioural equation i and $u_i(t)$ is the residual at time t in equation i. Note the possible confusion in terminology between residuals and error terms: the $r_i(t)$ are addressed as error terms in behavioural equation i at time t, and $u_i(t)$ denotes the residual in the equation modelling the error term $r_i(t)$.

The residual terms $u_i(t)$ can contain an autoregressive part and can display contemporaneous cross-correlation between variables. First, we test for serial correlation for several different lags using the Gauss-Newton regression technique. A first indication of possible serial correlation is provided by a Durbin-Watson statistic for the AR(1)-specification of the residuals. Cross correlation is visualised by means of a cross-correlogram. The computed cross correlation coefficients are verified to fall within the approximate two standard error bounds computed as $\pm 2/\sqrt{T}$, where T represents the number of observations. Cross correlation coefficients exceeding these bounds differ significantly from zero and should therefore be controlled for in our model description. Consequently, four possible models for the residual terms result. Below, we discuss the corresponding estimation techniques and the specifications of the distribution of the error terms.

The first model assumes that the residuals for the various behavioural equations display no serial and no contemporaneous cross correlation. We estimate

$$r_i(t) = \alpha_i + u_i(t) \quad \text{with} \quad u_i(t) \sim IID\left(0, \sigma_i^2\right) \quad \text{for a given } i, \ \forall t = 1, \ldots, T. \qquad (4.14)$$

This model can easily be estimated using standard OLS, giving $\hat{\alpha}_i$ and $\hat{\sigma}_i^2$.

In the second model, the error terms display contemporaneous cross correlation and no serial correlation. We model these error terms as

$$r(t) = \alpha + u(t) \quad \text{with} \quad u(t) \sim IID(0, \Sigma), \quad t = 1, \ldots, T, \qquad (4.15)$$

where $r(t) = (r_1(t), \ldots, r_{n^*}(t))^T$, $\alpha = (\alpha_1, \ldots, \alpha_{n^*})^T$, and $u(t) = (u_1(t), \ldots, u_{n^*}(t))^T$. The behavioural equations are conveniently renumbered to let the index run from 1 to $n^*$. Contemporaneous cross-correlation in the residuals indicates $E(u_i(t)\, u_j(t)) = \sigma_{ij}$ and $E(u_i(t)\, u_j(s)) = 0$, $\forall s \neq t$. As estimation method, we use feasible generalized least squares, which yields a consistent and efficient estimator of $\alpha$. This estimation method requires an estimate of $\Sigma$. Generally, one takes

$$\hat{\Sigma} = \frac{1}{T-1}\, \hat{U}^T \hat{U}, \qquad (4.16)$$

where $\hat{U}$ is a $T \times n^*$ matrix with i-th column $\hat{u}_i$, the approximated residuals resulting from OLS regression of the separate behavioural equations. For our particular model, the covariance matrix of the feasible GLS estimator $\hat{\alpha}$ equals $\hat{\Sigma}$.
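The estimator (4.16) is straightforward to compute from the per-equation OLS residuals. In the sketch below, the function name `residual_covariance` and the example series are hypothetical; since each equation contains only a constant, the OLS estimate of the constant is the sample mean of the error term.

```python
def residual_covariance(error_terms):
    """Estimate Sigma as U'U / (T - 1) from per-equation error-term series.

    error_terms: list of n* series, each of length T. With a constant-only
    model r_i(t) = alpha_i + u_i(t), the OLS estimate of alpha_i is the mean.
    """
    t_obs = len(error_terms[0])
    alphas = [sum(series) / t_obs for series in error_terms]
    resid = [[r - a for r in series] for series, a in zip(error_terms, alphas)]
    n = len(resid)
    sigma = [[sum(resid[i][t] * resid[j][t] for t in range(t_obs)) / (t_obs - 1)
              for j in range(n)] for i in range(n)]
    return alphas, sigma

# Hypothetical error terms for two behavioural equations over three years.
alphas, sigma_hat = residual_covariance([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
print(alphas, sigma_hat)
```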


The third model describes the case where the error terms display serial correlation but no cross-correlation,

$$r_i(t) = \alpha_i + u_i(t) \quad \text{with} \quad u_i(t) = \rho_i u_i(t-1) + \varepsilon_i(t) \quad \text{and} \quad \varepsilon_i(t) \sim NID\left(0, \sigma_i^2\right). \qquad (4.17)$$

Following standard theory, we apply NLS when estimating (4.17). This estimation method simultaneously and consistently estimates $\alpha_i$ and $\rho_i$, giving $\hat{\alpha}_i$ and $\hat{\rho}_i$. The variance of the residual terms, $\sigma_i^2$, can then be consistently estimated using

$$\hat{\sigma}_i^2 = \frac{SSR(\hat{\alpha}_i, \hat{\rho}_i)}{T-3} = \frac{\sum_{t=2}^{T} \left(r_i(t) - \hat{\rho}_i r_i(t-1) - \hat{\alpha}_i + \hat{\rho}_i \hat{\alpha}_i\right)^2}{T-3}. \qquad (4.18)$$

$SSR(\alpha_i, \rho_i)$ denotes the sum of squared residuals. We divide by (T − 3), because there are two parameters in the regression function, i.e. $\alpha_i$ and $\rho_i$, and we incorporate (T − 1) observations per NLS estimation. Subsequently, we apply model (4.17), substituting $\rho_i = \hat{\rho}_i$, $\alpha_i = \hat{\alpha}_i$ and $\sigma_i^2 = \hat{\sigma}_i^2$, as a description of the uncertainty in the error terms. We note that other methods, for instance maximum likelihood or feasible GLS, would have sufficed as estimation methods as well. For a discussion of the pros and cons of the various methods, we refer to e.g. Davidson and MacKinnon (2004). Note that our data series are relatively short, i.e. at most 30 observations per behavioural equation, so feasible GLS and ML would be favourable on that point: these methods do not exclude the first observation from the estimation process.

The fourth model displays both serial correlation and contemporaneous cross correlation. We consider

$$r(t) = \alpha + u(t) \quad \text{with} \quad u(t) = M_\rho\, u(t-1) + \varepsilon(t) \quad \text{and} \quad \varepsilon(t) \sim NID(0, \Sigma), \qquad (4.19)$$

under the assumption of an AR(1)-process and contemporaneous cross correlation in the residuals $\varepsilon(t)$. We apply non-linear feasible GLS for estimating (4.19). This method first requires an estimate of the covariance matrix $\Sigma$, which is generated by NLS ignoring the cross correlation in the error terms. The resulting estimates are used to compute the residuals $\hat{\varepsilon}_i$ and $\hat{\Sigma}$ with elements $\hat{\sigma}_{ii} = \frac{1}{T-3}\, \hat{\varepsilon}_i^T \hat{\varepsilon}_i$, where $\hat{\varepsilon}_i(t) = r_i(t) - \hat{\rho}_i r_i(t-1) - \hat{\alpha}_i + \hat{\rho}_i \hat{\alpha}_i$, for all $t = 2, \ldots, T$ and $i = 1, \ldots, N$, with N the number of error terms, and $\hat{\sigma}_{ij} = 0$ for all $i \neq j$. T denotes the number of available observations. We then perform non-linear GLS to complete our estimation process, resulting in the estimates $\tilde{\alpha}$ and $\tilde{M}_\rho$. These estimates yield a consistent estimator of the covariance matrix, $\tilde{\Sigma}$, with elements $\tilde{\sigma}_{ij} = \frac{1}{T-3}\, \tilde{\varepsilon}_i^T \tilde{\varepsilon}_j$, where $\tilde{\varepsilon}_i(t) = r_i(t) - \tilde{\rho}_i r_i(t-1) - \tilde{\alpha}_i + \tilde{\rho}_i \tilde{\alpha}_i$, for all $t = 2, \ldots, T$ and $i = 1, \ldots, N$. Again, our model description (4.19) combined with the estimated $\tilde{M}_\rho$ and $\tilde{\Sigma}$ can be used as a description of the uncertainty associated with the error terms.
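The single-equation step (4.17)–(4.18), which also provides the first-stage estimates for the fourth model, can be sketched as follows. The sketch relies on a shortcut rather than an iterative NLS routine: writing $c = \alpha(1-\rho)$ makes the regression function linear in $(c, \rho)$, so that, provided $\hat{\rho} \neq 1$, minimising the sum of squared residuals is equivalent to an OLS regression of $r(t)$ on a constant and $r(t-1)$. The function name and the example series are hypothetical.

```python
def fit_ar1_error_term(r):
    """Estimate alpha, rho and sigma^2 in r(t) = alpha + u(t), u(t) = rho*u(t-1) + eps(t)."""
    y, x = r[1:], r[:-1]          # the first observation is lost, as with NLS
    m = len(y)                    # T - 1 usable observations
    mx, my = sum(x) / m, sum(y) / m
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    rho = sxy / sxx
    c = my - rho * mx             # c = alpha * (1 - rho)
    alpha = c / (1.0 - rho)
    ssr = sum((b - rho * a - alpha + rho * alpha) ** 2 for a, b in zip(x, y))
    sigma2 = ssr / (m - 2)        # equals SSR / (T - 3)
    return alpha, rho, sigma2

# Hypothetical noise-free series generated by r(t) = 1 + 0.5 r(t-1),
# i.e. alpha = 2 and rho = 0.5.
series = [0.0]
for _ in range(9):
    series.append(1.0 + 0.5 * series[-1])

alpha_hat, rho_hat, sigma2_hat = fit_ar1_error_term(series)
print(alpha_hat, rho_hat, sigma2_hat)
```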

Implementation

The error terms in the behavioural equations are investigated using data on realised error terms. When Statistics Netherlands publishes its National Accounts, our macroeconomic model is re-run to determine historically consistent error terms. Initial data, exogenous and endogenous variables are available, and the error terms are adjusted so as to ensure simultaneous realisation of all observed endogenous variables. Our data series range at least from 1990 until 2002. For some variables more extensive data series exist; however, the chosen period is considered representative for current error terms.

We focus on the error terms for 18 behavioural equations. They are labour supply, wealth of domestic dwellings, consumption excluding fixed charges (V), exports of services of domestic origin (V), exports excluding energy (P), imports of consumption goods excluding energy (V), of investment goods (V), of raw materials and semi-manufactures excluding energy (V), of energy (V), of services by the market sector (V), and of services by consumption (V), investment of the private sector (V), investment of firms in equipment (V), employment, contractual wages in the market sector, exports of services of domestic origin (P) and consumption excluding fixed charges (P).

We distinguished four possible models describing the uncertainty associated with the error terms. After testing for serial and contemporaneous cross correlation, two models are applicable, i.e. the first and the fourth model. The first fourteen error terms display no significant serial or cross correlation. The latter four terms can be divided into two systems. In addition, these error terms contain an autoregressive part.


5 Results

In this section, we present the results of the Monte Carlo experiments. We consider the sensitivity of nine endogenous variables with respect to uncertainty in the provisional data, exogenous variables, parameters and residuals. These nine variables comprise the main variables of interest in the analysis of the accuracy of CPB forecasts obtained with the macroeconomic model SAFFIER, see Kranendonk and Verbruggen (2006). In that study, realisations are compared to historical forecasts. We concentrate on gross domestic product (V, P), private consumption (V), investment (V), exports (V, P), employment and contractual wages in the market sector, and the consumer price index. Again, a V or a P indicates volume or price.

Our presentation is divided into five parts. First, we present the simulated standard errors of the endogenous variables for each Monte Carlo experiment, combined per source of uncertainty. The meaning of these standard errors is explained and some precautionary remarks are made about their applicability in the derivation of forecast intervals as opposed to point estimates. We also make comparisons with the study of Don (1994) and with forecast errors of CPB short-term and medium-term forecasts. Second, we verify the implications of the assumption that the sources of uncertainty occur linearly and independently in our model. This assumption is known to be violated; the question is how severely this violation affects our results. Third, we discuss the number of replications necessary for a sufficient approximation of the variances of the endogenous variables induced by the different sources of uncertainty. Fourth, we investigate the robustness of our results under variation in the central path underlying our macroeconomic model. SAFFIER is a non-linear model, so disturbances on a central path do not enter the resulting endogenous variables linearly. Moreover, a different central path does not imply the addition of a linear term to the endogenous variables, and consequently affects the distribution of the disturbances induced by the four sources of uncertainty. Finally, we combine all Monte Carlo experiments to identify the contributions of the four main sources of uncertainty to the total forecast error variances of the endogenous variables.

5.1 The Monte Carlo experiments: Technical details

Our basic model starts simulating in year 2010, generating output on a yearly basis. The sources of uncertainty are disturbed at the beginning of the year 2010. Disturbances on the provisional data are implemented in year 2008 and 2009. They impinge on predicted outcomes of the endogenous variables. The disturbances of the provisional data, exogenous variables and residuals act on the central path describing the deviation-free economy. We apply the central path taken from the Central Economic Plan, CPB (2006). Although our macroeconomic model allows for a longer time horizon, we simulate over a period of four years. The description of the uncertainty associated with the exogenous variables does not permit a longer time range,


because its distribution is calibrated over a four-year horizon. An extended forecast would require additional data and probably a different estimation strategy.

For each source of uncertainty, we run N simulations with SAFFIER for N different disturbances drawn from a prescribed distribution. The resulting N trials of the nine endogenous variables are then summarised in a sample mean, a sample variance and a standard error. As mentioned in Section 3, the sample variance of an endogenous variable is a measure of the uncertainty in this variable induced by the investigated source of uncertainty. The sample standard error $\hat\sigma$ is obtained as the square root of the sample variance. As a rule of thumb, a $2\hat\sigma$-interval around the sample mean economic path contains 95% of the possible economic outcomes under the applied uncertainty. Note that this rule of thumb stems from univariate normal theory. In case of a linear macroeconomic model, a multivariate normally distributed disturbance results in a multivariate normally distributed set of endogenous variables.

Each source of uncertainty is investigated separately. In total, we conduct 46 Monte Carlo experiments, ranging from disturbances on the exogenous long-term interest rate to the parameters of the wage equation.
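The procedure described here (draw N disturbances, run the model for each, then summarise the N paths) can be sketched in a few lines. The sketch below is an assumption-laden illustration: `toy_model` is a hypothetical one-variable stand-in for SAFFIER, and the shock distribution is invented for the example.

```python
import numpy as np

def toy_model(shock, horizon=4):
    """Hypothetical stand-in for SAFFIER: index path of one endogenous variable."""
    growth = 0.02 + 0.5 * shock          # the disturbance shifts annual growth
    return (1.0 + growth) ** np.arange(1, horizon + 1)

def monte_carlo(model, draw_shock, n_trials, rng):
    """Run the model once per drawn disturbance and summarise the trials per year."""
    paths = np.array([model(draw_shock(rng)) for _ in range(n_trials)])
    mean = paths.mean(axis=0)
    var = paths.var(axis=0, ddof=1)      # sample variance
    se = np.sqrt(var)                    # sample standard error
    return mean, var, se

rng = np.random.default_rng(0)
mean, var, se = monte_carlo(toy_model, lambda g: g.normal(0.0, 0.01), 2000, rng)

# Rule-of-thumb band of two standard errors around the sample mean path
lower, upper = mean - 2.0 * se, mean + 2.0 * se
for year, (lo, m, hi) in enumerate(zip(lower, mean, upper), start=1):
    print(f"year {year}: {lo:.4f}  {m:.4f}  {hi:.4f}")
```

The two-standard-error band is only the univariate normal rule of thumb mentioned above; for a non-linear model the simulated distribution need not be normal.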

5.2 Estimated standard errors

In Table 5.1, we present the standard errors of the nine endogenous variables induced by the four sources of uncertainty, computed by Monte Carlo simulation with 2000 replications per experiment. This table should be read as follows. SAFFIER generates a cumulative scaled path describing the development of the economy. Each endogenous variable is presented as an index value, viz. the endogenous variable is scaled by a reference observation. In our experiments, the observation preceding the first year of our simulation is chosen as a reference point. It concerns observations from the year 2010. Note that this reference point is unaffected by any disturbances in the shock year. The index representation is convenient for variant analysis and illustration of growth rates. For example, consider an output time series on private consumption volume which contains an entry 1 in 2010, the reference year, and an entry 1.03 in 2013. Within three years, consumption volume has grown by 3%. Our standard errors are computed around the mean of the index representation of the endogenous variables. An entry of 0.5 in Table 5.1, e.g. for consumption in year 2 (2012) under uncertainty in provisional data, thus indicates a standard error of 0.5% in that year. Note that for a linear model, the central path and the sample mean of the economy coincide given a sufficient number of replications. In case of our non-linear model, the differences between the sample mean and the central path turn out to be minor.

We assume that the various sources of uncertainty are independently distributed and that our macroeconomic model is linear in these sources. Consequently, we can obtain the sample

Table 5.1 Standard errors in % point induced by the four sources of uncertainty (N=2000).

| | Year 1 (2011) | Year 2 | Year 4 |
|---|---|---|---|
| Standard errors provisional data | | | |
| GDP (V) | 0.5 | 0.6 | 0.6 |
| Consumption (V) | 0.4 | 0.5 | 0.7 |
| Investment (V) | 4.0 | 4.2 | 4.2 |
| Exports (V) | 0.9 | 0.9 | 1.1 |
| Employment market sector | 0.7 | 0.6 | 0.4 |
| GDP (P) | 0.6 | 0.6 | 1.2 |
| CPI | 0.5 | 0.5 | 0.7 |
| Exports (P) | 0.3 | 0.3 | 0.5 |
| Contractual wages | 0.8 | 1.2 | 2.1 |
| Standard errors exogenous variables | | | |
| GDP (V) | 1.1 | 1.5 | 2.5 |
| Consumption (V) | 0.6 | 1.1 | 2.4 |
| Investment (V) | 2.1 | 4.8 | 6.9 |
| Exports (V) | 2.6 | 3.9 | 6.3 |
| Employment market sector | 0.3 | 1.2 | 2.1 |
| GDP (P) | 0.8 | 1.1 | 2.4 |
| CPI | 0.7 | 1.4 | 1.8 |
| Exports (P) | 1.6 | 2.8 | 3.6 |
| Contractual wages | 1.0 | 1.5 | 3.7 |
| Standard errors parameters | | | |
| GDP (V) | 0.3 | 0.5 | 0.9 |
| Consumption (V) | 0.6 | 0.9 | 1.5 |
| Investment (V) | 2.7 | 4.3 | 6.1 |
| Exports (V) | 0.5 | 0.9 | 1.5 |
| Employment market sector | 0.5 | 1.0 | 1.5 |
| GDP (P) | 0.4 | 0.8 | 1.8 |
| CPI | 0.3 | 0.5 | 1.0 |
| Exports (P) | 0.2 | 0.3 | 0.6 |
| Contractual wages | 1.0 | 1.7 | 3.7 |
| Standard errors residuals | | | |
| GDP (V) | 0.3 | 0.3 | 0.5 |
| Consumption (V) | 0.4 | 0.5 | 0.6 |
| Investment (V) | 1.0 | 1.6 | 2.0 |
| Exports (V) | 0.4 | 0.5 | 0.5 |
| Employment market sector | 0.9 | 1.2 | 1.1 |
| GDP (P) | 0.7 | 1.1 | 1.9 |
| CPI | 0.5 | 0.7 | 1.1 |
| Exports (P) | 0.2 | 0.3 | 0.5 |
| Contractual wages | 1.1 | 1.6 | 2.8 |
| Standard errors total | | | |
| GDP (V) | 1.3 | 1.7 | 2.8 |
| Consumption (V) | 1.0 | 1.6 | 3.0 |
| Investment (V) | 5.4 | 7.9 | 10.3 |
| Exports (V) | 2.8 | 4.1 | 6.5 |
| Employment market sector | 1.3 | 2.0 | 2.8 |
| GDP (P) | 1.3 | 1.9 | 3.8 |
| CPI | 1.1 | 1.7 | 2.5 |
| Exports (P) | 1.6 | 2.9 | 3.7 |
| Contractual wages | 1.9 | 3.0 | 6.3 |

variance of the endogenous variables under two or more sources of uncertainty by adding the sample variances of the separate sources. This principle is applied in Table 5.1. The sample variances are added, resulting in a variance for each main source of uncertainty. In turn, another addition results in the sample variance under total uncertainty.

We acknowledge that the linearity and independence assumption is easily violated. For instance, the assumption that parameter uncertainty and uncertainty in the residuals are uncorrelated is unlikely to hold. A yearly estimation of the parameters, when new data for the past in the national accounts are available, could lead to smaller residuals.12 Additionally, parameters and exogenous variables enter our model multiplicatively, violating the linearity assumption. The text box 'The assumptions behind an additive impact analysis on variances' demonstrates that neglecting the non-linear appearance of various sources of uncertainty can severely affect our outcomes. However, computations of the simultaneous investigation of parameters and exogenous variables show that these problems do not occur in our calculations. Computed variances differ from their additive derivation by separate investigation, but these differences can be accepted as minor. Similar conclusions are drawn in Don (1994).

We consider the standard errors under total uncertainty, which are given in the final block of Table 5.1. These errors illustrate that investment (V) is most sensitive to total uncertainty. Its standard error increases from 5.4% in year 1 to 10.3% in year 4. Exports (V) and contractual wages are also relatively sensitive to uncertainty in our model. Exports (V) displays a standard error of almost 3% in year 1, and contractual wages have a standard error of about 2% in that year. The other blocks unravel which source of uncertainty is responsible for these standard errors. Investment (V) suffers from the uncertainty in the provisional data, but becomes more affected by the uncertainty in the parameters and exogenous variables over time. The provisional data on

12 This is especially the case when revised figures back to 1969 are published, as a result of major revisions of the system of national accounts.

The assumptions behind an additive impact analysis on variances

The impact of the different sources of uncertainty on various endogenous variables is measured by their share in the variance of these variables. Consider, for instance, the effects of uncertainty in the parameters of the equation for exports of domestic origin and the effects of uncertainty in the exogenous data series on the price of exports of competitors. For these sources of uncertainty separate Monte Carlo experiments are conducted. The variance of the volume of exports is then found by addition of the variances approximated by the separate Monte Carlo experiments. Implicitly, we assume here that the sources of uncertainty are independent and that they linearly affect the endogenous variable of interest, as will be explained below.

Some sources of uncertainty, however, enter the equations of the endogenous variables non-linearly. We mention again the exports of domestic origin. This behavioural equation contains the parameters $\alpha_1$ and $\beta_2$ and the exogenous data series on prices of exports of foreign competitors, bfc, which are both included in our sensitivity analysis, in a multiplicative way. Potentially, the implicit linearity and independence assumption can severely affect the conclusion on the impact of the various sources of uncertainty on the variance of the endogenous variables. The following example serves as an illustration of such a deficiency. Assume that $x$ and $y$ are independent normally distributed variables, i.e. $x \sim N(0, \sigma_x^2)$ and $y \sim N(0, \sigma_y^2)$, and define $z_1 = x + y$ and $z_2 = xy$. Under independence and zero means, the variances of $z_1$ and $z_2$ are

$$\mathrm{Var}(z_1) = \mathrm{Var}(x + y) = \mathrm{Var}(x) + \mathrm{Var}(y) = \sigma_x^2 + \sigma_y^2 \qquad (5.1)$$

and

$$\mathrm{Var}(z_2) = \mathrm{Var}(xy) = \mathrm{Var}(x)\,\mathrm{Var}(y) = \sigma_x^2 \sigma_y^2. \qquad (5.2)$$

We first run a Monte Carlo experiment varying the $x$ component and keeping $y$ constant at its expected value $Ey = 0$. Asymptotically, the variance of $z_1$ will converge to the variance of $x$. Similarly, a Monte Carlo experiment varying $y$ and fixing $x$ at its expected value $Ex = 0$ results in a variance of $z_1$ converging to the variance of $y$. In this case, we can add the variances of $z_1$ found in the two Monte Carlo experiments to construct an estimate of the total variance of $z_1$. Asymptotically, we find $\mathrm{Var}(z_1) = \sigma_x^2 + \sigma_y^2$.

Two similar Monte Carlo experiments for the $z_2$ variable would yield a zero variance for $z_2$ in each Monte Carlo experiment. Note that by fixing one of the variables at its expected value, $z_2$ will vanish in each Monte Carlo experiment. Consequently, the total estimated error variance of $z_2$ will be zero. This variance does not converge to the value given by equation (5.2). When interpreting Table 5.1 and Table 5.6, we should keep in mind that the figures might suffer from non-linearity problems as described above.
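The example in the box is easy to reproduce numerically. The following sketch is illustrative only: the values of σx and σy, the sample size and the helper names are assumptions chosen for the demonstration. It compares the sum of two one-at-a-time experiments with a joint experiment, for both the additive and the multiplicative case.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
sigma_x, sigma_y = 2.0, 3.0              # assumed values for the illustration
x = rng.normal(0.0, sigma_x, n)
y = rng.normal(0.0, sigma_y, n)

def separate_then_add(f):
    """Vary one source at a time (other fixed at its mean 0) and add the variances."""
    var_x_only = np.var(f(x, 0.0), ddof=1)
    var_y_only = np.var(f(0.0, y), ddof=1)
    return float(var_x_only + var_y_only)

def joint(f):
    """Vary both sources simultaneously."""
    return float(np.var(f(x, y), ddof=1))

def add(a, b):
    return a + b

def mul(a, b):
    return a * b

print("z1 = x + y:", separate_then_add(add), joint(add))   # both near sigma_x^2 + sigma_y^2
print("z2 = x * y:", separate_then_add(mul), joint(mul))   # zero versus sigma_x^2 * sigma_y^2
```

For z1 the two approaches agree up to sampling noise, while for z2 the one-at-a-time experiments report no variance at all even though the joint variance is σx²σy².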

investment volume is highly uncertain. The estimated covariance matrix of the provisional data contains a large entry for the investment (V) variance. A more detailed breakdown of the Monte Carlo results shows that the uncertainty in the exogenous international prices and world trade volume is largely responsible for the standard error of investment (V). The parameters of the investment equation, with its relatively long lag structure, lead to an increase of the standard errors as well. Exports (V) react strongly to uncertainty in the international prices and world trade volume, which determine most of the foreign economic picture. Contractual wages (P) display a more even distribution of the standard errors over the sources of uncertainty.

The standard errors in Table 5.1 are not suitable for the construction of confidence intervals in short-term forecasts. An ongoing discussion among forecasting institutes concerns the presentation of point estimates or confidence intervals in their forecasting reports. Confidence intervals are considered to stress the uncertainty associated with economic forecasts. Others, in favour of point estimates, say that confidence intervals are difficult to interpret. More importantly, they consider the method of forecasting inappropriate for estimating confidence intervals, as forecasts are expert-based or stem from different countries, each using other quantitative tools. For an inventory of forecast representation by 16 large institutes, we refer to Thissen (2005).

We think that our standard errors have some additional interpretation problems. First, they are not computed under total uncertainty. Our analysis captures the most important sources of uncertainty but has no full coverage. More importantly, our analysis does not include a fifth source of forecast uncertainty, viz. uncertainty introduced by so-called adjustments (autonomous terms) based on expert opinion. Unforeseen events or measures are not explicitly modelled in SAFFIER. However, these terms are actually the main transmitters of uncertainty. Therefore, the presented standard errors are indicative of the uncertainty introduced by the considered sources of uncertainty, but do not suffice for a forecast interval interpretation.

5.3 Comparison with uncertainty standard errors published in Don (1994)

In Table 5.2 we compare the standard errors calculated with the current CPB macro-econometric model SAFFIER with the results Don found in 1994 with a small macro-model. All our standard errors are lower than in Don (1994).

Table 5.2 Standard errors compared with Don (1994)

| | SAFFIER, Year 1 | SAFFIER, Year 4 | Don (1994), Year 1 | Don (1994), Year 4 |
|---|---|---|---|---|
| GDP (V) a | 1.3 | 2.8 | 2.3 | 5.7 |
| Consumption (V) | 1.0 | 3.0 | 2.3 | 6.1 |
| Investments (V) | 5.4 | 10.3 | 9.6 | 17.0 |
| Exports (V) | 2.8 | 6.5 | 5.0 | 13.4 |
| CPI | 1.1 | 2.5 | 2.5 | 14.8 |
| Contractual wages | 1.9 | 6.3 | 2.1 | 15.2 |

a Don (1994): production enterprises

Especially the uncertainty related to the exogenous variables is much lower than some fifteen years ago. Higher volatility of (international) prices and world trade in the seventies and early eighties dominated the historical period relevant for the calculated standard errors of the exogenous variables in the study of Don (1994). Our sample period has relatively more years with moderate price and wage increases. Also, uncertainty associated with the residuals is lower today, probably because the parameters in the equations of the old model were not based on estimated equations but calibrated applying information from other models. Standard errors from provisional data are larger in our study, probably because our model contains more dynamics than the one applied in Don (1994). For most behavioural equations, our current model SAFFIER employs ECM specifications.

5.4 Model uncertainty and realised forecast errors

The introduction of this paper mentions that the analysis of ex-post evaluations of CPB forecasts gives an impression of forecast uncertainties. Is it possible to explain those forecast errors by the analysis of the Monte Carlo simulations? In Table 5.3 we present a comparison between the standard errors from the Monte Carlo simulations and the standard errors from the published short-term and medium-term CPB forecasts. The short-term forecasts used in this table are the forecasts for next year published in September in the Macro Economic Outlook (MEV). The medium-term forecasts are the forecasts with a horizon of four or sometimes five years ahead.

The statistics are rather comparable for most variables. The simulated uncertainty could in general be lower than the real-time errors, because not all sources of uncertainty are simulated in this study. As mentioned in paragraph 2.2, we excluded the uncertainty related to the (exogenous) policy variables. The effect of 'wrong' assumptions on government policy is probably larger in the medium term than for next year. There is a direct effect on GDP, by affecting government consumption, and an indirect effect, because fiscal policy influences the purchasing power of consumers and private consumption. We also excluded shocks on the interest rates and share prices, which are relevant for household expenditures.

Table 5.3 Forecast accuracy and model uncertainty

| | Monte Carlo simulations (standard errors), Year 1 | Monte Carlo simulations (standard errors), Year 4 | Forecast errors (RMSE b), Year 1 | Forecast errors (RMSE b), Year 4 |
|---|---|---|---|---|
| GDP (V) a | 1.4 | 0.8 | 1.6 | 1.3 |
| Consumption (V) | 1.0 | 0.8 | 1.8 | 1.6 |
| Investments (V) | 6.1 | 2.8 | 5.5 | 4.1 |
| Exports (V) | 2.9 | 1.5 | 4.6 | 1.8 |
| Employment | 1.3 | 0.7 | 1.0 | 0.8 |
| CPI | 1.0 | 0.8 | 1.1 | 1.4 |
| Exports (P) | 1.6 | 1.0 | 6.4 | 3.3 |
| Contractual wages | 1.9 | 1.9 | 1.5 | 2.1 |

a Don (1994): production enterprises
b Root Mean Square Error

The largest differences in Table 5.3 are related to the export prices, where the simulated uncertainty is much lower than the forecast errors in the past. This can be explained by the fact that the simulations were done with our current model, which uses the current energy intensities and price equations. This information is relevant for the near future, but cannot explain forecast errors in the past, when the energy intensity was much higher. For contractual wages and investments, the simulated standard errors are too high compared to realised errors for the short-term horizon. Probably, the explanation for the lower forecast errors is that we use the add-factors of some behavioural equations to introduce specific information like concluded wage contracts and approved building permits. Although simulated uncertainty for the expenditure categories and imports underestimates the real uncertainty, the total effect on GDP is rather small, especially for GDP growth next year.

5.5 Number of replications in the Monte Carlo experiments

The standard errors in Table 5.1 originate from Monte Carlo experiments with 2000 simulations. How many trials are necessary for an accurate estimation of the standard errors derived from Monte Carlo simulation? In a box, we briefly discuss some theoretical approaches to assess the relation between the number of replications and the accuracy of the standard errors.

We apply a more practical approach than discussed in the box to assess the relation between the number of trials and the accuracy of our results. First, the theoretical results described in the box are not directly applicable when the error concerns the error in the sample variance and the standard error. Although these sample statistics can be written in terms of a moment, the results do not apply, because they incorporate the sample mean. Second, the available computing budget is not restrictive, so the need for a thorough analysis of an adequate number of replications is not that pressing. For efficiency reasons, it can be desirable to remove unnecessary trials. Finally, the distributions of the sources of uncertainty already contain some inaccuracy. These distributions have been derived under strong and not always verifiable assumptions. Strong accuracy restrictions on the Monte Carlo outcomes are therefore redundant.

We decide on the number of replications by running our Monte Carlo experiments for an increasing number of replications. We repeatedly double the number of trials from N = 125 to N = 2000. In Table 5.4, we present the standard errors in GDP volume for each source of uncertainty in years 1, 2 and 4, respectively. The standard errors are given in % point. GDP volume is considered because its variation is indicative of variation in our model. GDP volume is composed of private consumption, investment, imports, exports and government expenditure. The results demonstrate that variations in the standard errors under increasing N are minor. The variations in the standard errors induced by uncertainty in the provisional data, parameters and residuals are negligible. The variation in the standard errors induced by the exogenous variables fluctuates around 1.1% point. The order of variation, hundredths of a percentage point, suggests that 2000 replications more than suffice for an accurate estimate of the standard errors.
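The doubling experiment can be mimicked on a toy problem. The sketch below is an illustration under assumptions, not a reproduction of Table 5.4: the outcome is simply a normal draw with an assumed true standard error of 0.5, and the function name is hypothetical. It reports the sample standard error for the first N draws, with N doubled from 125 to 2000.

```python
import numpy as np

def standard_errors_by_replications(draws, sizes):
    """Sample standard error computed from the first n draws, for each n in sizes."""
    return {n: float(np.std(draws[:n], ddof=1)) for n in sizes}

rng = np.random.default_rng(2)
true_se = 0.5                                   # assumed value for the illustration
draws = rng.normal(0.0, true_se, size=2000)     # stand-in for simulated model outcomes
sizes = [125, 250, 500, 1000, 2000]

table = standard_errors_by_replications(draws, sizes)
for n, se in table.items():
    print(f"N={n:5d}  standard error={se:.2f}")
```

If the printed values barely move as N doubles, additional replications add little, which is the practical criterion used in this section.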


Theoretical relation between number of replications and the standard errors

Most theory is concerned with the derivation of the integral

$$\mu = \int_{\mathbb{R}} k(x) f(x)\,dx, \qquad (5.3)$$

which denotes the expected value of a random variable $k(X)$ with probability density function $f(x)$. Let $\hat\mu_N$ denote the approximation of $\mu$ by means of a Monte Carlo experiment with $N$ trials. The error in $\hat\mu_N$ depends on the number of replications $N$. When $N$ increases, the error decreases. The error term is related to statistical sampling variation. The law of large numbers implies that when $N$ approaches infinity, $\hat\mu_N$ approaches $\mu$ with probability one. However, we are interested in an exact relation between a finite number of replications $N$ and the error term of $\hat\mu_N$. We discuss some common approaches to assess this relation.

The standard error of $k(X)$ provides a first estimate of the statistical sampling variation. Unfortunately, the standard error can be misleading, as it is derived by simulation itself, including the estimated mean $\hat\mu_N$. Confidence intervals serve as a better indicator of statistical sampling variation. They depict the sample bounds of a given $100(1-\delta)\%$-interval surrounding the sample mean. A third approach is based on Chebyshev's inequality, see e.g. Fishman (1996).

Proposition 1. Let $Z$ denote a random variable with distribution function $F$ defined on $(-\infty, \infty)$, $EZ = 0$, and $\sigma^2 = \mathrm{var}\,Z = EZ^2 < \infty$. Then, for $\beta > 0$,

$$P\left(\frac{|Z|}{\sigma} \geq \beta\right) \leq \frac{1}{\beta^2}.$$

Chebyshev's inequality implies convergence in probability of $\hat\mu_N$ to $\mu$. Chebyshev's inequality can be helpful to derive a so-called $(\varepsilon, \delta)$ absolute error criterion. For all sample sizes $N$ greater than or equal to $N^*$ the error specification $P\left[|\hat\mu_N - \mu| < \varepsilon\right] \geq (1 - \delta)$ is satisfied. For certain distributions, a closed form of $N^*$ can be found. Mostly, the absolute error criterion leads to a large worst-case sample size, motivating the need for alternative approaches. The Central Limit Theorem can facilitate the achievement of the $(\varepsilon, \delta)$ absolute error criterion under a smaller sample size $\tilde N$. Chebyshev's inequality is then rewritten in terms of a random variable subject to a Central Limit Theorem. For sufficiently small $\varepsilon$, the distribution of this variable approaches the standard normal distribution. $\tilde N$ will be given in terms of this limiting distribution. Unfortunately, it can be difficult to determine $\tilde N$ when $\varepsilon$ is sufficiently small. For a discussion of more advanced methods we refer to Fishman (1996) and Ripley (2006).
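To give a feeling for the gap between the two sample sizes, the sketch below evaluates the textbook bounds for a sample mean with known standard deviation σ: Chebyshev gives N* ≥ σ²/(ε²δ), and the normal approximation gives Ñ ≥ (z σ/ε)² with z the 1 − δ/2 quantile of the standard normal distribution. These formulas are standard results assumed here for illustration; they are not taken from the paper, and the function names and input values are hypothetical.

```python
import math
from statistics import NormalDist

def n_chebyshev(sigma, eps, delta):
    """Worst-case sample size from Chebyshev: P(|mean - mu| >= eps) <= sigma^2 / (N eps^2)."""
    return math.ceil(sigma**2 / (eps**2 * delta))

def n_clt(sigma, eps, delta):
    """Sample size from the normal approximation of the sample mean."""
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)   # two-sided standard normal quantile
    return math.ceil((z * sigma / eps) ** 2)

sigma, eps, delta = 1.0, 0.1, 0.05                # assumed values for the illustration
print("Chebyshev:", n_chebyshev(sigma, eps, delta))
print("CLT:      ", n_clt(sigma, eps, delta))
```

The Chebyshev bound holds for any distribution with finite variance, which is why it is much more conservative than the bound based on the limiting normal distribution.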

5.6 Robustness test for path independence

We have assumed that the sources of uncertainty appear linearly in our model, and that they are independent. However, our macroeconomic model is clearly non-linear in some of the sources, and non-linear in the endogenous variables. In this section, we first assess the latter non-linearity. When a non-linear model with different underlying central paths is exposed to Monte Carlo disturbances, the resulting sample variances of the endogenous variables will be different. Moreover, a differing undisturbed reference situation will result in different simulated variances. We generate two different undisturbed reference situations by varying the year in which the Monte Carlo disturbances are imposed. Our basic model starts simulating in 2007 and disturbs the economy in 2010. The underlying central (undisturbed) path displays a rise in unemployment from 0.7% in 2010 to 2.6% in 2014. The economic situations in these years

Table 5.4 Standard errors in GDP volume (in % point).

| | Year 1 (2011) | Year 2 | Year 4 |
|---|---|---|---|
| Standard errors provisional data, GDP (V) | | | |
| N=125 | 0.48 | 0.56 | 0.57 |
| N=250 | 0.48 | 0.55 | 0.58 |
| N=500 | 0.48 | 0.56 | 0.58 |
| N=1000 | 0.49 | 0.56 | 0.58 |
| N=2000 | 0.49 | 0.57 | 0.59 |
| Standard errors exogenous variables, GDP (V) | | | |
| N=125 | 1.03 | 1.46 | 2.35 |
| N=250 | 1.06 | 1.51 | 2.44 |
| N=500 | 1.05 | 1.50 | 2.41 |
| N=1000 | 1.05 | 1.49 | 2.43 |
| N=2000 | 1.08 | 1.54 | 2.51 |
| Standard errors parameters, GDP (V) | | | |
| N=125 | 0.30 | 0.45 | 0.80 |
| N=250 | 0.32 | 0.48 | 0.86 |
| N=500 | 0.34 | 0.52 | 0.91 |
| N=1000 | 0.33 | 0.51 | 0.88 |
| N=2000 | 0.33 | 0.50 | 0.87 |
| Standard errors residuals, GDP (V) | | | |
| N=125 | 0.29 | 0.31 | 0.42 |
| N=250 | 0.29 | 0.32 | 0.44 |
| N=500 | 0.29 | 0.32 | 0.45 |
| N=1000 | 0.30 | 0.33 | 0.45 |
| N=2000 | 0.30 | 0.33 | 0.46 |

differ significantly. A second round of Monte Carlo experiments therefore exposes the economy to its Monte Carlo shocks in 2014. In table 5.5, we present standard errors in GDP volume per source of uncertainty for two different shock years, viz. 2011 (year I) and 2014 (year II). We compare the induced standard errors after 1, 2 and 4 years respectively. The results demonstrate that under the non-linearity of our model, the choice of the central path influences the Monte Carlo outcomes. For instance, the standard errors induced by the parameter uncertainty are smaller for shocks in year I than in year II. The low unemployment in year I is exceptional and might induce the non-linear character of the model. The second assumption on non-linearity between the sources of uncertainty is verified by running our Monte Carlo experiments per source of uncertainty separately and simultaneously. Subsequently, their outcomes are tested for the additivity property of the variances under independence. The shock year is kept constant at 2011. In this way, we investigate the impact of

Table 5.5 Standard errors in GDP volume for two different shock years, year I (2011) and year II (2014) (N=2000).

| | Year 1 | Year 2 | Year 4 |
|---|---|---|---|
| Standard errors provisional data, GDP (V) | | | |
| Year I | 0.49 | 0.57 | 0.59 |
| Year II | 0.55 | 0.56 | 0.61 |
| Standard errors exogenous variables, GDP (V) | | | |
| Year I | 1.08 | 1.54 | 2.51 |
| Year II | 1.50 | 1.99 | 2.65 |
| Standard errors parameters, GDP (V) | | | |
| Year I | 0.33 | 0.50 | 0.87 |
| Year II | 0.44 | 0.64 | 1.05 |
| Standard errors residuals, GDP (V) | | | |
| Year I | 0.30 | 0.33 | 0.46 |
| Year II | 0.31 | 0.40 | 0.64 |

non-linearity between the parameters and residuals of the equations. In addition, these sources are expected to demonstrate some correlation as the distribution of the residual of an equation is affected by an accurate estimation of its parameters. The results demonstrate that the violation of the linearity assumption has no significant impact on our outcomes.
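Why the central path matters for a non-linear model but not for a linear one can be seen in a toy setting. The sketch below is purely illustrative and rests on assumptions: the two "models" are hypothetical one-line functions, and the central values 1.0 and 2.0 stand in for two different undisturbed reference situations. The same shocks are applied around both central values.

```python
import numpy as np

def simulated_se(model, central_value, shocks):
    """Standard error of the model outcome when shocks are added to a central value."""
    return float(np.std(model(central_value + shocks), ddof=1))

def linear(v):
    return 3.0 * v

def nonlinear(v):
    return v ** 2

rng = np.random.default_rng(3)
shocks = rng.normal(0.0, 0.1, size=2000)

for name, model in [("linear", linear), ("non-linear", nonlinear)]:
    se_path_1 = simulated_se(model, 1.0, shocks)
    se_path_2 = simulated_se(model, 2.0, shocks)
    print(f"{name:10s}  central path 1: {se_path_1:.3f}  central path 2: {se_path_2:.3f}")
```

In the linear case the standard error is the same on both paths, whereas in the non-linear case it scales with the slope of the model at the central value, which is the mechanism behind the differences between shock years reported in Table 5.5.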

5.7 Decomposition of the total error variance

The standard errors given in Table 5.1 directly quantify the uncertainty in the endogenous variables induced by the various sources of uncertainty. In this section, we extend this analysis by assessing the relative impact of the different sources. Table 5.6 presents the contributions of the four sources of uncertainty to the total error variance of nine endogenous variables. Consider, for instance, the endogenous variable GDP volume, GDP (V). The entries in Table 5.6 show that in 2011 the uncertainty in the provisional data, the exogenous variables, the parameters and the residuals contributes 15%, 73%, 7% and 6%, respectively, to the total error variance of GDP (V). In accordance with earlier reasoning, we derive these percentages under the assumption that the various sources of uncertainty are independent and occur linearly in our model. Consequently, we can add the separate Monte Carlo sample variances to obtain the total error variance of the endogenous variables. The various contributions are then easily computed. We present contribution data for various time periods after the imposed disturbances, viz. after 1, 2 and 4 years respectively. A detailed decomposition of the error variances per source of uncertainty is available in appendix A.
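The arithmetic behind the decomposition is short enough to sketch. The example below uses the year-1 standard errors of GDP (V) for N=2000 from Table 5.4; the function name is hypothetical, and the calculation assumes, as in the text, that the sources are independent and enter linearly so that variances add.

```python
import math

def decompose(standard_errors):
    """Total standard error and percentage variance shares, assuming additive variances."""
    variances = {source: se**2 for source, se in standard_errors.items()}
    total_variance = sum(variances.values())
    shares = {source: 100.0 * v / total_variance for source, v in variances.items()}
    return math.sqrt(total_variance), shares

# Year-1 standard errors of GDP (V), N=2000, as reported in Table 5.4
se_gdp_year1 = {
    "provisional data": 0.49,
    "exogenous variables": 1.08,
    "parameters": 0.33,
    "residuals": 0.30,
}

total_se, shares = decompose(se_gdp_year1)
print(f"total standard error: {total_se:.1f}")
for source, share in shares.items():
    print(f"{source:20s} {share:.0f}%")
```

Rounded, the result agrees with the total standard error of GDP (V) in Table 5.1 and with the 2011 contributions for GDP (V) in Table 5.6.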


The results demonstrate that the exogenous variables constitute most of the uncertainty in GDP (V, P), private consumption (V), CPI (P) and exports (V, P). The impact of the uncertainty associated with the exogenous variables, international prices and world trade volume, is considerable. The uncertainty in contractual wages is more or less evenly distributed over the four sources of uncertainty. The uncertainty in the provisional data is largely responsible for uncertainty in investment volume, although this impact diminishes over time. Employment in the market sector is relatively sensitive to uncertainty in the provisional data and the residuals.

In the second forecast year, the contribution pattern becomes more pronounced. As expected, the impact of the uncertainty in provisional data on the total error variances is reduced. Although still significant, even its contribution to the total error variance of investment is reduced from 55% to 16% within four years. The variance induced by uncertainty in provisional data increases over the forecast horizon, but its growth diminishes. The impact of uncertainty in the exogenous variables is strengthened over time. For nearly all nine variables, the main contribution to the total error variance of the four-year-ahead forecasts is induced by the exogenous variables. The only exception is contractual wages, for which the contribution of parameter uncertainty competes with that of uncertainty in the exogenous variables. Except for export prices, uncertainty in the exogenous variables is mostly felt on the price side of the economy, as this uncertainty resides in international prices. The parameter uncertainty is represented in private consumption (V), GDP (V) and CPI (P).


Table 5.6  Contributions of the four sources of uncertainty to the total error variance (in % of the total error variance), N = 2000

                                2011    2012    2014

Contributions provisional data
GDP (V)                           15      11       5
Consumption (V)                   15       9       5
Investment (V)                    55      29      16
Exports (V)                       10       5       3
Employment market sector          32       9       2
GDP (P)                           23      11      11
CPI                               24       8       9
Exports (P)                        4       1       2
Contractual wages                 18      15      12

Contributions exogenous variables
GDP (V)                           73      78      83
Consumption (V)                   38      49      66
Investment (V)                    16      37      44
Exports (V)                       84      89      92
Employment market sector           4      34      56
GDP (P)                           39      36      41
CPI                               48      67      53
Exports (P)                       93      97      94
Contractual wages                 28      24      35

Contributions parameters
GDP (V)                            7       8      10
Consumption (V)                   33      33      24
Investment (V)                    25      30      35
Exports (V)                        4       5       5
Employment market sector          17      22      27
GDP (P)                           12      19      23
CPI                                6       7      17
Exports (P)                        1       1       2
Contractual wages                 25      32      34

Contributions residuals
GDP (V)                            6       4       3
Consumption (V)                   14       9       5
Investment (V)                     4       4       4
Exports (V)                        2       1       1
Employment market sector          46      36      15
GDP (P)                           26      33      26
CPI                               22      18      20
Exports (P)                        1       1       2
Contractual wages                 29      28      19

6 Conclusions and Recommendations

Uncertainty is an inherent attribute of any macroeconomic forecast. An essential auxiliary task of a forecasting institute is to give its users insight into this uncertainty. To this end, we have analysed the impact of four sources of uncertainty on the CPB’s macroeconomic forecast model SAFFIER, viz. uncertainty in provisional data, uncertainty in exogenous variables, uncertainty in model parameters and uncertainty in error terms of the behavioural equations. The uncertainty in the model was assessed by means of the Monte Carlo method. For each source of uncertainty, the standard error was computed as a measure of the resulting uncertainty in the macroeconomic forecast of the nine most important endogenous variables: gross domestic product (V, P), private consumption (V), investment (V), exports (V, P), employment and contractual wages in the market sector, and the consumer price index. Furthermore, the relative impact of the four sources of uncertainty on the overall forecast error was assessed by computing the contribution of each source to the total error variance of each endogenous variable.

The results demonstrate that the main contribution to the total error variance of a four-year-ahead forecast emanates from uncertainty in the exogenous variables. The total error variance of a short-term forecast is mainly influenced by uncertainty in both the exogenous variables and the provisional data. Of all nine variables, investment volume is most sensitive to the four sources of uncertainty. In the medium term, exports and contractual wages exhibit large standard errors as well. Error term uncertainty and parameter uncertainty seem to be dominated by uncertainty in the exogenous variables.

Modellers often rely on model adjustments or re-estimations of equations to reduce the uncertainty in their forecasts. Although model evaluation and adjustments are relevant on their own merits, a reduction of the uncertainty in the exogenous variables is more pertinent. However, these variables are generally forecasts themselves, which hampers the reduction of their uncertainty. Moreover, their uncertainty is mainly attributable to foreign exogenous variables, such as energy and import prices, which are profoundly difficult to predict.

For future research, we recommend reconsidering the various models of uncertainty. In this paper, the impact of the various sources of uncertainty is measured by means of their contribution to the total error variance under the assumptions of model linearity, and linearity and independence of the sources of uncertainty. Is it possible to relax these assumptions? Can we determine a simultaneous density function combining the sources of uncertainty, or would the adoption of a bootstrapping approach be useful? Furthermore, are there alternative indicators to measure the relative impact of the different sources that avoid the restrictions imposed by linearity and independence?


References

Amano, R., K. McPhail, H. Pioro and A. Rennison, 2002, Evaluating the quarterly projection model: A preliminary investigation, Working Paper 2002-20, Bank of Canada.

Borbely, D. and C.P. Meier, 2003, Macroeconomic interval forecasting: The case of assessing the risk of deflation in Germany, Working Papers 1153, The Kiel Institute for World Economics.

Canova, F., 1995, Sensitivity analysis and model evaluation in simulated dynamic general equilibrium economies, International Economic Review, vol. 36, no. 2, pp. 477–501.

Clements, M. and D. Hendry, 1998, Forecasting Economic Time Series, Cambridge University Press, Cambridge, United Kingdom.

CPB, 2004, Macro Economic Outlook 2005.

CPB, 2006, Central Economic Plan 2006.

Davidson, R. and J. MacKinnon, 2004, Econometric theory and methods, Oxford University Press, Oxford.

Don, F., 1994, Forecast uncertainty in economics, in J. Grasman and G. van Straten, eds., Predictability and Nonlinear Modelling in Natural Sciences and Economics, Kluwer Academic Publishers.

Engle, R. and C. Granger, 1987, Co-integration and error-correction: Representation, estimation and testing, Econometrica, vol. 55, pp. 251–276.

Ericsson, N., 2001, Forecast uncertainty in economic modelling, Discussion Paper 697, FRB International Finance.

Fair, R., 1993, Estimating event probabilities in macroeconometric models using stochastic simulation, in J. Stock and M. Watson, eds., Business Cycles, Indicators and Forecasting, pp. 157–176, University of Chicago Press.

Fair, R., 2003, Bootstrapping macroeconometric models, Studies in Nonlinear Dynamics and Econometrics, vol. 7, no. 4.


Fishman, G., 1996, Monte Carlo: Concepts, Algorithms and Applications, Springer Series in Operations Research, Springer, Berlin.

Gallo, G. and F. Don, 1991, Forecast uncertainty due to unreliable data, Economic and financial computing, vol. 1, pp. 49–69.

Garratt, A., K. Lee, M. Pesaran and Y. Shin, 2003, Forecast uncertainties in macroeconometric modelling: An application to the UK economy, Journal of the American Statistical Association, vol. 98.

Hers, J., 1993, De voorspelkwaliteit van de middellange-termijn prognoses van het CPB, CPB Research Memorandum 107.

Kapetanios, G., 2000, Model selection uncertainty and dynamic models, Discussion Paper 165, NIESR.

Kolsrud, D., 1993a, Stochastic simulation of KVARTS91, Rapporter 93/20, Statistisk sentralbyrå.

Kolsrud, D., 1993b, Stochastic simulation of RIMINI 2.0, Arbeidsnotat 1993/6, Norges Bank, Oslo.

Kolsrud, D., 2004, On stochastic simulation of forward-looking models, Computational Economics, vol. 24, no. 2, pp. 159–183.

Kranendonk, H. and J. Verbruggen, 2005, Trefzekerheid van CPB-prognoses voor de jaren 1971-2003, CPB Document 77.

Kranendonk, H. and J. Verbruggen, 2006, Trefzekerheid van korte-termijnramingen van het CPB voor de jaren 1971–2004, CPB Document 106.

Kranendonk, H. and J. Verbruggen, 2007, SAFFIER, a ‘multi-purpose’ model of the Dutch economy for short-term and medium-term analyses, CPB Document 144.

Kranendonk, H. and J. Verbruggen, 2001, De nieuwe consumptiefunctie van SAFE, CPB Memorandum 18.


Kusters, A., M. Ligthart and J. Verbruggen, 2001, De nieuwe uitvoervergelijkingen van SAFE, CPB Memorandum 25.

Lynch, R. and C. Richardson, 2004, Discussion on Patterson and Heravi, Journal of Official Statistics, vol. 20, pp. 623–629.

Mensbrugghe, D. van der, J. Martin and J.M. Burniaux, 1990, How robust are WALRAS results?, in OECD Economic Studies, 13, pp. 173–204, OECD.

Meyermans, E. and P. van Brusselen, 2006, An evaluation of the risks surrounding the 2006-2012 NIME economic outlook: Illustrative stochastic simulations, Working Paper 02-06, Federal Planning Bureau.

Onatski, A. and N. Williams, 2003, Modelling model uncertainty, Journal of the European Economic Association, vol. 1, no. 5, pp. 1087–1122.

Pagan, A., 1984, Econometric issues in the analysis of regressions with generated regressors, International Economic Review, vol. 25, no. 1, pp. 221–247.

Patterson, K. and S. Heravi, 2004, Revisions to official data on U.S. GNP: A multivariate assessment of different vintages, Journal of Official Statistics, vol. 20, pp. 573–602.

Ripley, B., 2006, Stochastic Simulation, Probability and Stochastics, John Wiley and Sons, Inc., New York.

Rubinstein, R., 1981, Simulation and the Monte Carlo method, Wiley Series in probability and mathematical statistics, John Wiley and Sons, New York.

Sandee, J., F. Don and P. van den Berg, 1984, Adjustment of projections to recent observations, European Economic Review, vol. 26, pp. 153–166.

Thissen, L., 2005, Point estimates versus confidence intervals, CPB Memorandum 119.

Toedter, K.H., 1992, Structural estimation and stochastic simulation of large non-linear models, Economic Modelling, vol. 9, no. 2, pp. 121–128.

Vlimmeren, J. van, F. Don and V. Okker, 1991, Modelling data uncertainty due to revisions, CPB Memorandum 75.


Vlimmeren, J. van, F. Don, V. Okker and J. Blokdijk, 1993, Policy feedback and forecast uncertainty, Tech. Rep. I/1993/12, CPB.


Appendix A

Decomposition of the total error variance per source of uncertainty

Table A.1  Contributions per exogenous variable (system) to the total error variance induced by uncertainty associated with the total set of exogenous variables

                              Year 1  Year 2  Year 4

Contributions world trade and international prices
GDP (V)                           99      97      96
Consumption (V)                   70      74      92
Investment (V)                    90      92      92
Exports (V)                      100     100     100
Employment market sector          98      98      97
GDP (P)                          100     100      97
CPI                              100     100      98
Exports (P)                      100     100     100
Contractual wages                 98      96     100

Contributions long-term interest rate
GDP (V)                            1       3       4
Consumption (V)                   30      26       8
Investment (V)                    10       8       8
Exports (V)                        0       0       0
Employment market sector           2       2       3
GDP (P)                            0       0       3
CPI                                0       0       2
Exports (P)                        0       0       0
Contractual wages                  2       4       0

Contributions share price
GDP (V)                            0       0       0
Consumption (V)                    0       0       0
Investment (V)                     0       0       0
Exports (V)                        0       0       0
Employment market sector           0       0       0
GDP (P)                            0       0       0
CPI                                0       0       0
Exports (P)                        0       0       0
Contractual wages                  0       0       0

V: Volume; P: Price

Table A.2  Contributions per initial variable (or system of variables) to the total error variance induced by provisional data uncertainty

                              Year 1  Year 2  Year 4

Contributions employment
GDP (V)                           37      30      17
Consumption (V)                   12      29      33
Investment (V)                     4       2       1
Exports (V)                        5       4       4
Employment market sector          61      59      59
GDP (P)                           15      35      43
CPI                               29      26      29
Exports (P)                        1      19      41
Contractual wages                 34      51      47

Contributions exports of manufactured goods (volume)
GDP (V)                            3       2       2
Consumption (V)                    0       1       1
Investment (V)                     0       0       0
Exports (V)                       34      21      13
Employment market sector           2       3       2
GDP (P)                            0       0       0
CPI                                0       1       0
Exports (P)                        0       0       0
Contractual wages                  0       0       1

Contributions exports of manufactured goods (price)
GDP (V)                            2       3       8
Consumption (V)                   12       6       2
Investment (V)                     1       0       0
Exports (V)                        7       6       6
Employment market sector           1       0       8
GDP (P)                           50      40      15
CPI                               31      37      25
Exports (P)                       71      62      23
Contractual wages                 32      16       5

Contributions investment
GDP (V)                           34      41      50
Consumption (V)                   10      15      26
Investment (V)                    72      79      85
Exports (V)                        1       5      12
Employment market sector          12      15      20
GDP (P)                            1       3      19
CPI                                0       1      15
Exports (P)                        4       1      16
Contractual wages                  3      10      23

V: Volume; P: Price

Table A.2  Contributions per initial variable (or system of variables) to the total error variance induced by provisional data uncertainty, continued

                              Year 1  Year 2  Year 4

Contributions contractual wages
GDP (V)                            3       3       5
Consumption (V)                    6       3       2
Investment (V)                     2       1       0
Exports (V)                        0       0       0
Employment market sector           3       1       0
GDP (P)                           17      10       5
CPI                               25      21      13
Exports (P)                        5       3       4
Contractual wages                 16       6       3

Contributions imports of goods
GDP (V)                            7       7       7
Consumption (V)                    8      10      12
Investment (V)                    15      15      13
Exports (V)                       52      61      63
Employment market sector           5       6       4
GDP (P)                            2       2       7
CPI                                1       0       5
Exports (P)                        3       2       4
Contractual wages                  4       6      10

Contributions GDP (P)
GDP (V)                            1       1       2
Consumption (V)                    1       0       0
Investment (V)                     0       0       0
Exports (V)                        0       0       0
Employment market sector           0       0       1
GDP (P)                            8       6       2
CPI                                9       8       4
Exports (P)                        4       5       3
Contractual wages                  5       2       1

Contributions GDP and consumption (V)
GDP (V)                           14      13       8
Consumption (V)                   50      36      24
Investment (V)                     6       3       1
Exports (V)                        1       2       1
Employment market sector          17      15       6
GDP (P)                            7       4       9
CPI                                6       4       8
Exports (P)                       11       8      10
Contractual wages                  5       8      11

V: Volume; P: Price

Table A.3  Contributions per equation group to the total error variance induced by parameter uncertainty

                              Year 1  Year 2  Year 4

Contributions exports
GDP (V)                           63      52      44
Consumption (V)                    3       3       6
Investment (V)                     3       6       6
Exports (V)                       90      80      76
Employment market sector           1       8      10
GDP (P)                            9       4       2
CPI                               14       5       2
Exports (P)                       17      13       6
Contractual wages                  6       3       3

Contributions consumption
GDP (V)                           14      13      12
Consumption (V)                   90      85      70
Investment (V)                     0       1       1
Exports (V)                        0       1       0
Employment market sector           0       1       1
GDP (P)                            3       3       6
CPI                                3       5       6
Exports (P)                        1       1      10
Contractual wages                  1       1      12

Contributions imports
GDP (V)                            4       3       3
Consumption (V)                    0       0       0
Investment (V)                     0       0       0
Exports (V)                        0       0       0
Employment market sector           0       0       1
GDP (P)                            1       0       0
CPI                                1       1       0
Exports (P)                        1       0       0
Contractual wages                  0       0       0

Contributions production function (short-term)
GDP (V)                           19      29      30
Consumption (V)                    2       1       4
Investment (V)                    96      93      90
Exports (V)                        7      13       5
Employment market sector          95      85      71
GDP (P)                           23      14       7
CPI                               27      16       7
Exports (P)                       35      29       3
Contractual wages                 15       8       6

V: Volume; P: Price

Table A.3  Contributions per equation group to the total error variance induced by parameter uncertainty, continued

                              Year 1  Year 2  Year 4

Contributions wage equation
GDP (V)                            0       2      11
Consumption (V)                    5      10      20
Investment (V)                     0       1       3
Exports (V)                        2       6      18
Employment market sector           4       5      17
GDP (P)                           64      79      84
CPI                               53      73      85
Exports (P)                       46      58      80
Contractual wages                 79      87      78

V: Volume; P: Price

Table A.4  Contributions per residual (group) to the total error variance induced by the residual uncertainty

                              Year 1  Year 2  Year 4

Contributions contractual wages + employment
GDP (V)                            6      52      28
Consumption (V)                   13      54      75
Investment (V)                    18      24      52
Exports (V)                       34      75      74
Employment market sector          99      95      96
GDP (P)                           33      43      66
CPI                               18      30      70
Exports (P)                       51      74      80
Contractual wages                 73      78      82

Contributions private consumption + exports of services (P)
GDP (V)                            2      10      52
Consumption (V)                    4       6       8
Investment (V)                     5       5       1
Exports (V)                        1       1       3
Employment market sector           0       0       1
GDP (P)                           63      56      31
CPI                               79      69      27
Exports (P)                        8      13      15
Contractual wages                 23      21      12

Contributions labour supply
GDP (V)                            0       0       1
Consumption (V)                    0       0       0
Investment (V)                     0       0       1
Exports (V)                        0       0       3
Employment market sector           0       0       1
GDP (P)                            0       0       2
CPI                                0       0       2
Exports (P)                        0       0       2
Contractual wages                  0       1       3

Contributions consumption excluding fixed charges
GDP (V)                            6       3       1
Consumption (V)                   76      33      10
Investment (V)                     1       1       0
Exports (V)                        0       0       0
Employment market sector           0       0       0
GDP (P)                            0       0       0
CPI                                0       0       0
Exports (P)                        0       0       0
Contractual wages                  0       0       0

V: Volume; P: Price

Table A.4  Contributions per residual (group) to the total error variance induced by the residual uncertainty, continued

                              Year 1  Year 2  Year 4

Contributions exports of services
GDP (V)                           16       6       3
Consumption (V)                    1       1       1
Investment (V)                     4       4       0
Exports (V)                        2       1       0
Employment market sector           0       1       0
GDP (P)                            1       0       0
CPI                                1       0       0
Exports (P)                        1       0       0
Contractual wages                  1       0       0

Contributions exports of domestic origin
GDP (V)                           15       7       3
Consumption (V)                    2       1       2
Investment (V)                     5       5       1
Exports (V)                       54      16       9
Employment market sector           0       1       1
GDP (P)                            1       0       0
CPI                                1       0       0
Exports (P)                        4       1       0
Contractual wages                  1       0       0

Contributions price of exports excluding energy
GDP (V)                            1       1       2
Consumption (V)                    0       0       0
Investment (V)                     0       0       0
Exports (V)                        4       4       7
Employment market sector           0       0       0
GDP (P)                            1       0       0
CPI                                0       0       0
Exports (P)                       27       6       1
Contractual wages                  0       0       0

Contributions investment of the private sector
GDP (V)                            1       1       3
Consumption (V)                    0       0       0
Investment (V)                    53      48      44
Exports (V)                        0       0       2
Employment market sector           0       0       0
GDP (P)                            0       0       0
CPI                                0       0       0
Exports (P)                        0       0       0
Contractual wages                  0       0       0

V: Volume; P: Price

Table A.4  Contributions per residual (group) to the total error variance induced by the residual uncertainty, continued

                              Year 1  Year 2  Year 4

Contributions investment by companies in outillage etc.
GDP (V)                            0       0       0
Consumption (V)                    0       0       0
Investment (V)                     0       0       0
Exports (V)                        0       0       0
Employment market sector           0       0       0
GDP (P)                            0       0       0
CPI                                0       0       0
Exports (P)                        0       0       0
Contractual wages                  0       0       0

Contributions imports
GDP (V)                           53      19       7
Consumption (V)                    3       2       3
Investment (V)                    13      13       1
Exports (V)                        5       3       1
Employment market sector           0       3       1
GDP (P)                            1       0       1
CPI                                2       0       1
Exports (P)                        9       5       1
Contractual wages                  2       0       1

Contributions wealth of domestic dwellings
GDP (V)                            0       0       0
Consumption (V)                    1       3       1
Investment (V)                     0       0       0
Exports (V)                        0       0       0
Employment market sector           0       0       0
GDP (P)                            0       0       0
CPI                                0       0       0
Exports (P)                        0       0       0
Contractual wages                  0       0       0

V: Volume; P: Price

Appendix B

How to generate a sample from the multinormal distribution?

Our Monte Carlo experiments require several samples generated from a multivariate normal distribution. Although most programming languages do not offer direct sampling from this distribution, a sample can easily be derived when the language provides a univariate standard normal random number generator. In this appendix, we demonstrate this approach.

Let $Z$ be a multivariate normally distributed random vector with mean $\mu$ and covariance matrix $\Omega$, $Z \sim N(\mu, \Omega)$. The covariance matrix $\Omega$ is symmetric and positive definite. Consequently, $\Omega$ is invertible and its inverse can be decomposed using a Cholesky decomposition: there exists an $n \times n$ matrix $H$ such that $\Omega^{-1} = H^T H$. We apply this decomposition matrix $H$ by formulating the transformation

$$ Z = H^{-1} U + \mu, \tag{B.1} $$

where $U = (U_1, \ldots, U_n)^T$ denotes a random vector in $\mathbb{R}^n$ with independent standard normally distributed elements, $U_i \sim N(0, 1)$. The density of the random vector $U$ reads

$$ f_U(u) = (2\pi)^{-\frac{1}{2} n} \exp\left( -\tfrac{1}{2} u^T u \right). $$

Here we used that the elements of $U$ are independent and identically distributed, and that the density of a standard normal random variable is defined as $f_{U_i}(u_i) = (2\pi)^{-1/2} \exp(-u_i^2/2)$. We will demonstrate that the transformation (B.1) yields a random vector $Z$ which is multivariate normally distributed with mean $\mu$ and covariance matrix $\Omega$. We recall the following proposition.

Proposition 2. Consider two stochastic vectors $X$ and $Y$ related by the transformation

$$ Y = h(X). \tag{B.2} $$

Their densities satisfy

$$ f_X(x) = f_Y(h(x)) \left| \frac{\partial h}{\partial X} \right|, \tag{B.3} $$

where $f_X$ and $f_Y$ are the densities of the stochastic vectors $X$ and $Y$, and $\left| \partial h / \partial X \right|$ is the determinant (in absolute value) of the Jacobian of the transformation (B.2), given by

$$ \left| \frac{\partial h}{\partial X} \right| = \begin{vmatrix} \dfrac{\partial h_1}{\partial X_1} & \cdots & \dfrac{\partial h_1}{\partial X_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial h_n}{\partial X_1} & \cdots & \dfrac{\partial h_n}{\partial X_n} \end{vmatrix}. $$


We apply this proposition to the transformation (B.1) by first rewriting this equation as $U = H(Z - \mu)$. For the density of the random vector $Z$, we then find

$$ f_Z(z) = f_U(H(z - \mu)) \, |H|. \tag{B.4} $$

Substituting the density function $f_U$ into equation (B.4) gives

$$ f_Z(z) = (2\pi)^{-\frac{1}{2} n} |H| \exp\left( -\tfrac{1}{2} (H(z - \mu))^T (H(z - \mu)) \right) = (2\pi)^{-\frac{1}{2} n} |\Omega|^{-\frac{1}{2}} \exp\left( -\tfrac{1}{2} (z - \mu)^T \Omega^{-1} (z - \mu) \right), \tag{B.5} $$

where the second equality uses $H^T H = \Omega^{-1}$ and hence $|H| = |\Omega|^{-1/2}$. The resulting density is indeed the density of a multivariate normally distributed vector $Z$ with mean $\mu$ and covariance matrix $\Omega$.
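The transformation (B.1) can be sketched in a few lines of code. The sketch below is an illustration under stated assumptions, not the implementation used for the experiments in this paper. It computes the lower-triangular Cholesky factor $L$ of $\Omega$ itself, $\Omega = L L^T$. With $H = L^{-1}$ we have $H^T H = \Omega^{-1}$, so $Z = L U + \mu$ is the same transformation as (B.1). The function names, the seed and the example values of `mu` and `omega` are hypothetical.

```python
import math
import random

def cholesky(omega):
    """Return lower-triangular L with omega = L L^T (omega symmetric positive definite)."""
    n = len(omega)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(omega[i][i] - s)
            else:
                L[i][j] = (omega[i][j] - s) / L[j][j]
    return L

def sample_mvn(mu, omega, n_draws, rng):
    """Draw n_draws vectors Z = L U + mu, with U a vector of independent N(0, 1) draws."""
    L = cholesky(omega)
    n = len(mu)
    draws = []
    for _ in range(n_draws):
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        draws.append([mu[i] + sum(L[i][k] * u[k] for k in range(i + 1)) for i in range(n)])
    return draws

def sample_mean_cov(draws):
    """Return the sample mean vector and the sample covariance matrix of the draws."""
    m, n = len(draws), len(draws[0])
    mean = [sum(z[i] for z in draws) / m for i in range(n)]
    cov = [[sum((z[i] - mean[i]) * (z[j] - mean[j]) for z in draws) / (m - 1)
            for j in range(n)] for i in range(n)]
    return mean, cov

# Hypothetical example: a bivariate normal with correlated components.
rng = random.Random(2008)
mu = [1.0, -0.5]
omega = [[4.0, 1.2], [1.2, 1.0]]
draws = sample_mvn(mu, omega, 20000, rng)
mean, cov = sample_mean_cov(draws)
print("sample mean:", mean)
print("sample covariance:", cov)
```

For a large number of draws, the sample mean and sample covariance should be close to `mu` and `omega`, which gives a simple numerical check of the derivation above.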


Appendix C

Estimation of systems of equations

When estimating a system like (4.10) and (4.11), we regularly rely on a two-step method. First, we estimate the long-term value $\ln y^*(t)$ of the endogenous variable $\ln y(t)$ by regressing $\ln y(t)$ on the explanatory variables $x_{lt}(t)$. This stage results in the approximated equilibrium value

$$ \ln \hat{y}^* = (X_{lt} \;\; \iota) \, \hat{\beta}, \quad \text{where } \hat{\beta} = \begin{pmatrix} \hat{\beta}_{lt} \\ \hat{c} \end{pmatrix} \text{ with } \hat{\beta} = \left( (X_{lt} \;\; \iota)^T (X_{lt} \;\; \iota) \right)^{-1} (X_{lt} \;\; \iota)^T \ln y, $$

and $\hat{\eta}(t) = \ln y(t) - \ln \hat{y}^*(t)$. Here $\iota$ denotes a vector with every element equal to 1. In the second stage, we substitute $\hat{\eta}(t-1)$ as a regressor in the short-term equation and estimate $\beta_{st}$ and $\varepsilon$ by regressing $\dot{y}(t)$ on $x_{st}(t)$ and $\hat{\eta}(t-1)$. This yields the parameter estimates $\hat{\beta}_{st}$ and $\hat{\varepsilon}$.

A straightforward approach for estimating the covariance matrix associated with the parameters $\hat{\beta}_{lt}$, $\hat{\beta}_{st}$ and $\hat{\varepsilon}$ would be to apply the OLS covariance matrices resulting from the separate stages in the estimation process.[13] Under well-known assumptions, the OLS covariance matrix estimator is consistent. We rewrite our two-step method in matrix notation, which yields

$$ z_{lt} = X_{lt} \beta_{lt} + v, \tag{C.1} $$

$$ z_{st} = X_{st} \beta_{st} + \hat{\eta} \, \varepsilon + u, \tag{C.2} $$

where $z_{lt} = (\ln y(t_1), \ln y(t_2), \ldots, \ln y(t_n))^T$, $z_{st} = (\dot{y}(t_1), \dot{y}(t_2), \ldots, \dot{y}(t_n))^T$, $\hat{\eta} = (z_{lt}(t_1 - 1) - \hat{z}_{lt}(t_1 - 1), \ldots, z_{lt}(t_n - 1) - \hat{z}_{lt}(t_n - 1))^T$, and $u$ and $v$ are the residual vectors. $X_{lt}$ and $X_{st}$ are matrices with row $i$ equal to $x_{lt}(t_i)^T$ and $x_{st}(t_i)^T$, respectively. Adopting the OLS covariance matrix estimators of the separate OLS stages as estimators for the covariance matrices of $\hat{\beta}_{lt}$, $\hat{\beta}_{st}$ and $\hat{\varepsilon}$, we find

$$ \widehat{\mathrm{Var}}\left( \hat{\beta}_{lt} \right) = s_{lt}^2 \left( X_{lt}^T X_{lt} \right)^{-1}, \tag{C.3} $$

$$ \widehat{\mathrm{Var}}\left( \hat{\beta}_{st}, \hat{\varepsilon} \right) = s_{st}^2 \left( (X_{st} \;\; \hat{\eta})^T (X_{st} \;\; \hat{\eta}) \right)^{-1}. \tag{C.4} $$
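As an illustration, the two-step method and the separate-stage OLS covariance estimators (C.3) and (C.4) can be sketched as follows. This is a minimal sketch on made-up data, not the estimation code used for SAFFIER. The helper names `invert` and `ols`, the data-generating process and all numerical values are assumptions. The short-term regressor is taken to be the first difference of the long-term regressor purely for convenience.

```python
import random

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ols(X, y):
    """Return (beta_hat, residuals, s^2 (X^T X)^{-1}) for the regression y = X beta + u."""
    n, k = len(X), len(X[0])
    xtx = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(k)] for i in range(k)]
    xty = [sum(X[t][i] * y[t] for t in range(n)) for i in range(k)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[t] - sum(X[t][j] * beta[j] for j in range(k)) for t in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)  # degrees-of-freedom correction n - k
    return beta, resid, [[s2 * v for v in row] for row in xtx_inv]

# Made-up data (assumption): one long-term regressor plus a constant.
rng = random.Random(1)
n = 200
x_lt = [rng.gauss(0.0, 1.0) for _ in range(n)]
ln_y = [1.0 + 0.5 * x + rng.gauss(0.0, 0.1) for x in x_lt]

# Stage 1: regress ln y(t) on x_lt(t) and a constant; the residuals are eta_hat(t).
X1 = [[x, 1.0] for x in x_lt]
beta_lt, eta_hat, cov_lt = ols(X1, ln_y)

# Stage 2: regress y_dot(t) on x_st(t) and eta_hat(t - 1), following the sign
# convention of (C.2); the last coefficient is the estimate of epsilon.
y_dot = [ln_y[t] - ln_y[t - 1] for t in range(1, n)]
X2 = [[x_lt[t] - x_lt[t - 1], eta_hat[t - 1]] for t in range(1, n)]
beta_st_eps, _, cov_st = ols(X2, y_dot)

print("stage 1 estimates:", beta_lt)
print("stage 2 estimates:", beta_st_eps)
print("stage 2 standard errors:", [cov_st[i][i] ** 0.5 for i in range(2)])
```

Note that the second-stage covariance matrix computed here treats $\hat{\eta}$ as if it were observed data, which is exactly the simplification made in (C.4).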

Unfortunately, these matrices are not consistent. The special character of the two-step method compromises the consistency of the covariance matrices, as is shown in e.g. Davidson and MacKinnon (2004) for a regular 2SLS formulation and, for more specific examples such as ours, in Pagan (1984). Normally, the covariance matrices can be corrected to yield a consistent estimate, see again Davidson and MacKinnon (2004) and Pagan (1984). In macroeconomic modelling, consistency of an estimator is important though not decisive. A good fit and plausible coefficients are more critical.

Besides inconsistency, our system incorporates some additional problems. The endogenous variables $\ln y$ and $\dot{y}$ in the first and second stage are differentially related and therefore the two-step method might not be appropriate to estimate the parameters in equations (4.10) and (4.11). It can occur that the parameter estimators $\hat{\beta}_{lt}$, $\hat{\beta}_{st}$ and $\hat{\varepsilon}$ are inconsistent, as the first stage does not resolve all endogeneity between the error term and the regressors in the second stage. Note that, for a regular 2SLS system, consistency of the parameter estimators can easily be proven under the exogeneity assumption, see e.g. Davidson and MacKinnon (2004).

[13] Recall that the covariance matrix of a parameter estimate $\hat{\beta}$ for a regular OLS estimation of a system $y = X\beta + u$, with $y \in \mathbb{R}^n$ a vector of $n$ observations, $X$ an $n \times k$ matrix containing $k$ columns corresponding with the $k$ regressors, $\beta$ a vector containing the $k$ parameters and $u$ a vector of $n$ residuals, is given by $\widehat{\mathrm{Var}}_{OLS}(\hat{\beta}) = s^2 (X^T X)^{-1}$ with $s^2 = \frac{1}{n-k} \sum_{t=1}^{n} \hat{u}_t^2$ and $\hat{u} = M_X u$, where $M_X = I - X (X^T X)^{-1} X^T$. We assume that the error terms are independently and identically distributed with mean zero and unknown variance $\sigma^2$.

A second method applied when estimating parameters in systems (4.10) and (4.11) concerns a simultaneous equation approach. The long-term and short-term equations are estimated simultaneously using non-linear least squares. We estimate

$$ z = x(\beta_{lt}, \beta_{st}, \varepsilon) + u, \tag{C.5} $$

where $z = (\dot{y}(t_1), \ldots, \dot{y}(t_n))^T$, $x(\beta_{lt}, \beta_{st}, \varepsilon) = (x_{t_1}(\beta_{lt}, \beta_{st}, \varepsilon), \ldots, x_{t_n}(\beta_{lt}, \beta_{st}, \varepsilon))^T$ with

$$ x_{t_i}(\beta_{lt}, \beta_{st}, \varepsilon) = x_{st}^T(t_i) \beta_{st} - \varepsilon \left( \ln y(t_i - 1) - x_{lt}^T(t_i - 1) \beta_{lt} - c \right), $$

and $u = (u(t_1), \ldots, u(t_n))^T$. Assuming that the error term $u$ is independent and identically distributed with mean zero and unknown variance $\sigma^2$, we can derive a consistent estimator of the covariance matrix of $\hat{\beta} = (\hat{\beta}_{lt}, \hat{\beta}_{st}, \hat{\varepsilon})^T$,

$$ \widehat{\mathrm{Var}}\left( \hat{\beta} \right) = s^2 \left( \hat{X}^T \hat{X} \right)^{-1}, \tag{C.6} $$

where

$$ s^2 = \frac{1}{n - k_{lt} - k_{st} - 1} \sum_{t=1}^{n} \hat{u}_t^2 = \frac{1}{n - k_{lt} - k_{st} - 1} \sum_{t=1}^{n} \left( z_t - x_t(\hat{\beta}) \right)^2, \tag{C.7} $$

and $\hat{X} = X(\hat{\beta})$, which denotes an $n \times k$ matrix with row $X_t(\beta)$ containing the partial derivatives of $x_t(\beta)$ with respect to $\beta$. Again, we can question whether the error terms $u$ and the regressors $x$ satisfy the exogeneity assumption. If not, the covariance matrix given in (C.6) is inconsistent. However, based on similar arguments as before, we apply (C.6) as an estimator of the covariance matrix of the parameter $\hat{\beta}$ whenever we use NLS for its estimation.

Finally, we sometimes use three-stage least squares (3SLS). In that case, we focus on system (C.8) and (C.9),

$$ z^*_{lt} = X_{lt} \beta_{lt} + v, \tag{C.8} $$

$$ z_{st} = X_{st} \beta_{st} + \eta \, \varepsilon + u, \tag{C.9} $$

where $z^*_{lt} = (\ln y^*(t_1), \ln y^*(t_2), \ldots, \ln y^*(t_n))^T$, $\eta = (z_{lt}(t_1 - 1) - z^*_{lt}(t_1 - 1), \ldots, z_{lt}(t_n - 1) - z^*_{lt}(t_n - 1))^T$, and $u$ and $v$ are the residual vectors. This system differs from (C.1)–(C.2) in that the short-term and long-term parameters are estimated simultaneously, as is made explicit by the treatment of the error correction term. This error correction term is no longer fully determined before estimating the short-term equation. We assume that the error terms $v$ and $u$ are correlated and that there might exist both heteroskedasticity and contemporaneous correlation in the residuals. Let $\Omega$ denote the covariance matrix between the residuals $u$ and $v$. Furthermore, we assume that the regressors $X_{lt}$, $X_{st}$ and $\eta$ might be correlated with the error terms $u$ and $v$. In other words, the exogeneity assumption is not satisfied.

We first estimate system (C.8)–(C.9) using a 2SLS (generalised instrumental variable) method. The resulting 2SLS residuals $\hat{u}$ and $\hat{v}$ are subsequently used to derive an estimate of the covariance matrix of $u$ and $v$. Substitution of this matrix into the efficient GMM estimator of system (C.8)–(C.9) results in the parameter estimate $\hat{\beta}_{3SLS} = (\hat{\beta}_{lt,3SLS}, \hat{\beta}_{st,3SLS}, \hat{\varepsilon}_{3SLS})$. For a thorough description of the 3SLS technique, we refer to Davidson and MacKinnon (2004). The catch is in the identification of appropriate instruments for the 2SLS stage. The choice of a particular instrument is often based on the literature. The covariance matrix of the 3SLS estimator is consistent and can be used in a description of parameter uncertainty.
