Computational Statistics & Data Analysis 50 (2006) 1678 – 1699 www.elsevier.com/locate/csda

Comparing stochastic volatility models through Monte Carlo simulations

Davide Raggi a,∗, Silvano Bordignon b

a Department of Economics, University of Verona, 37129 Verona, Italy
b Department of Statistics, University of Padova, 35121 Padova, Italy

∗ Corresponding author. Tel.: +39 049 8274174; fax: +39 049 8274170. E-mail address: [email protected] (D. Raggi).

doi:10.1016/j.csda.2005.02.004

Received 24 December 2004; received in revised form 10 February 2005; accepted 10 February 2005 Available online 11 March 2005

Abstract

Stochastic volatility models are important tools for studying the behavior of many financial markets. For this reason a number of versions have been introduced and studied in the recent literature. The goal is to review and compare some of these alternatives by using Bayesian procedures. The quantity used to assess the goodness-of-fit is the Bayes factor, whereas the ability to forecast the volatility has been tested through the computation of the one-step-ahead value-at-risk (VaR). Model estimation has been carried out through adaptive Markov chain Monte Carlo (MCMC) procedures. The marginal likelihood, necessary to compute the Bayes factor, has been computed through reduced runs of the same MCMC algorithm and through an auxiliary particle filter. The empirical analysis is based on the study of three international financial indexes.
© 2005 Elsevier B.V. All rights reserved.

Keywords: Jump diffusion; Stochastic volatility; MCMC; Particle filters; Bayes factor; VaR

1. Introduction

Stochastic variance models are an important tool to describe financial time series due to their flexibility in modeling time-varying volatilities, which is a typical pattern in financial applications.



In general such models are specified in continuous time and are used to price financial derivatives. In this paper we study some different non-nested parameterizations of stochastic volatility models recently proposed. More precisely, we consider models with and without jumps and with different specifications of the volatility equation. Following Chernov et al. (2003) we distinguish between logarithmic and affine models. Furthermore, the introduction of jumps in modeling financial returns seems appropriate in order to describe rare events like crashes in the market. On the other hand, it is not obvious whether the volatility process itself jumps or not. Sometimes there is evidence that the volatility dynamics hardly follow a purely diffusive behavior and tend to increase sharply when a jump is observed in the return series. Some empirical evidence shows that taking into account jumps, together with stochastic volatility, leads to an improved fit of the data on derivatives (see Bakshi et al., 1997). In any case, it is not clear whether the introduction of a jump component improves the description of the underlying financial returns.

Comparing stochastic volatility models is not an easy task, due to the presence of one or more latent factors in their specification. In the recent literature, Chernov et al. (2003) compare non-nested parameterizations using the efficient method of moments (EMM). In their analysis, models with and without jumps based on affine and logarithmic dynamics have been included. From a different point of view, simulation techniques appear to be a viable and simpler solution to rank alternative specifications. More precisely, sequential procedures and Markov chain Monte Carlo (MCMC) algorithms are useful to compute, respectively, the likelihood function and the posterior distributions. These Monte Carlo strategies naturally lead to the use of Bayesian tools for the analysis. For example, Bayes factors have been used in Chib et al. (2002) and in Eraker et al. (2003) to compare, respectively, logarithmic and affine models. A different approach based on the deviance information criterion (DIC) has been recently introduced in Berg et al. (2004) and has been applied to logarithmic parameterizations; their analysis does not include affine models. All of these papers analyze the models only according to their ability to fit the data. From a completely different point of view, Eberlein et al. (2003) analyze some models, not necessarily based on stochastic variances, according to their forecasting performance. This seems a reasonable approach if we would like to focus on risk management applications.

We face the problem of the comparison taking into account both goodness-of-fit statistics and forecasting performance. The first criterion we use is based on the Bayes factor, which seems a natural choice from a Bayesian point of view. The second criterion relies on the ability to forecast conditional variances. We test the forecasting performance by studying the value-at-risk (VaR) statistics. Our evaluation has been carried out through MCMC and particle filtering procedures. In order to make inference we adopt an efficient MCMC strategy. The inferential procedure is based on the delayed rejection (DR) algorithm, proposed in Tierney and Mira (1999) in a medical context and originally applied to affine models with jumps in Raggi (2004). To compute the likelihood function and the one-day-ahead forecasts we adopt a version of the auxiliary particle filter (Pitt and Shephard, 1999). The use of particle filters for stochastic volatility models with jumps has been introduced in Chib et al. (2002) and in Johannes et al. (2003). In those works volatilities and returns are uncorrelated. We adopt a particle filter that takes into account the feedback effect induced by the correlation between the processes.


The first goal of this paper is to review some recent models that describe the conditional variance of asset returns. Furthermore, we are interested in detecting whether the introduction of more general dynamics, i.e. the inclusion of jumps, improves the fit to the data. Second, we aim at assessing some convergence properties of the algorithms used, through some well known convergence diagnostic tests. Third, we test the empirical performance of the procedures described by analyzing three international financial indexes, i.e. the Dow Jones Industrial, the Nikkei 225 and the CAC 40.

The remainder of this paper is organized as follows. In Section 2 various stochastic volatility models are introduced. In Section 3 the inferential solution is described, and some simulation results about the performance of the estimation procedure are also provided. In Section 4, model selection strategies are described together with the methodology proposed to evaluate them. More precisely, the Bayes factor and the VaR are computed through sequential Monte Carlo algorithms and MCMC strategies. Finally, in Section 5 the empirical results are illustrated.

2. Stochastic volatility models

In this section we introduce the models we consider in our analysis. Adopting the notation used in Chernov et al. (2003), we study dynamics belonging to the logarithmic and to the affine class. In the logarithmic models the variance is expressed as an exponential function of the latent factors whereas, in the affine framework, drift and diffusion are linear functions of the non-observable state. An example of logarithmic models is given in Hull and White (1987), while a general treatment of affine dynamics is given in Dai and Singleton (2000) and Duffie et al. (2000).

We first consider a discrete time stochastic volatility model that belongs to the logarithmic class. It is an approximation, evaluated on a set of discrete points {τ_i : i = 1, ..., N}, of the popular continuous time diffusion driven by a Brownian motion proposed in Hull and White (1987). In the recent econometric literature this model and some extensions have been extensively analyzed in Jacquier et al. (1994), Kim et al. (1998) and Jacquier et al. (2004), among others. For this reason we consider this parameterization as a benchmark for the other models analyzed in this paper. Returns Y_{i+1} and the log-volatilities Z_{i+1} = log V_{i+1} are described by

$$Y_{i+1} = \exp(Z_{i+1}/2)\,\lambda_{i+1}^{-1/2}\,\epsilon_{i+1}, \qquad Z_{i+1} = \alpha + \phi Z_i + \sigma_v \eta_{i+1}, \qquad (1)$$

in which (ε_{i+1}, η_{i+1}) is a bivariate Normal random variable with correlation ρ. The parameter φ is the persistence of the volatility process, α determines its expected value and σ_v can be interpreted as the volatility of the volatility. The random variable λ_{i+1} ∼ Gamma(ν/2, ν/2) allows us to extend the error term to a Student t with ν degrees of freedom. In the financial literature this latter specification is also appealing because it links stochastic volatility models to Lévy processes (see, in this context, Barndorff-Nielsen et al., 2001, for a survey on this topic).
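For concreteness, a minimal simulation sketch of model (1) is given below. This is not the authors' code: the function name, the parameter values and the initialization at the stationary mean are illustrative assumptions only.

```python
import numpy as np

def simulate_log_sv(T, alpha=-0.02, phi=0.98, sigma_v=0.15, rho=-0.3, nu=None, seed=0):
    """Simulate returns Y and log-volatilities Z from the logarithmic SV model (1).

    nu=None gives Gaussian errors; a finite nu adds the Gamma(nu/2, nu/2) mixing
    variable lambda_t, i.e. Student-t distributed return errors.
    Parameter values and initialization are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    Z = np.empty(T)
    Y = np.empty(T)
    # correlated shocks (eps, eta) with correlation rho
    cov = np.array([[1.0, rho], [rho, 1.0]])
    shocks = rng.multivariate_normal(np.zeros(2), cov, size=T)
    lam = rng.gamma(nu / 2.0, 2.0 / nu, size=T) if nu else np.ones(T)
    Z[0] = alpha / (1.0 - phi)              # start at the stationary mean
    for t in range(T):
        eps, eta = shocks[t]
        if t > 0:
            Z[t] = alpha + phi * Z[t - 1] + sigma_v * eta
        Y[t] = np.exp(Z[t] / 2.0) * lam[t] ** (-0.5) * eps
    return Y, Z

Y, Z = simulate_log_sv(2000, nu=8)
print(Y[:5], Z[:5])
```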


Of course, if we constrain λ_{i+1} = 1 we return to the original model driven by a Brownian motion.

Recently, more general dynamics have been proposed to model financial assets. The class of models we consider belongs to the affine jump diffusion family, introduced in Duffie et al. (2000) and extensively analyzed in Eraker et al. (2003) and in Eraker (2004). In this framework log-prices Y_t and log-volatilities Z_t are modeled as

$$dY_t = \left(\mu - \tfrac12 e^{Z_{t-}}\right)dt + e^{Z_{t-}/2}\, dW_{1,t} + H_t^Y\, dN_t,$$
$$dZ_t = \left[\kappa\left(\theta e^{-Z_{t-}} - 1\right) - \tfrac12 \sigma_v^2 e^{-Z_{t-}}\right]dt + \sigma_v e^{-Z_{t-}/2}\, dW_{2,t} + \log\!\left(1 + H_t^V e^{-Z_{t-}}\right) dN_t, \qquad (2)$$

where (W_{1,t}, W_{2,t}) are Brownian motions such that dW_{1,t} dW_{2,t} = ρ dt. The log-volatility equation is obtained by applying a general version of the Itô lemma to

$$dV_t = \kappa(\theta - V_{t-})\,dt + \sigma_v \sqrt{V_{t-}}\, dW_{2,t} + H_t^V\, dN_t. \qquad (3)$$

For all the models described up to now, a negative correlation ρ can be interpreted as the leverage effect, that is, the asymmetric response to positive and negative shocks. Here Z_{t−} indicates the left limit of a trajectory. N_t is a counting process whereas H_t^i, i = Y, V, are the jump sizes or marks. The jump components $\int H_t^i\, dN_t = \sum_{j=0}^{N_t} H_j^i$, i = Y, V, are marked point processes. In this version, indexed by DPS, H^V ∼ Exp(μ_v) and H^Y | H^V ∼ N(μ_y + ρ_j H^V, σ_y²). It is worth noting that the jump component is common to both equations but the jump sizes are different and correlated. The intensity λ of the marked point process is assumed constant over time. Some conditional moments for this model are derived in Eraker et al. (2003).

This specification is certainly appealing in mathematical finance, since it allows the pricing of financial derivatives in semi-closed form. In the affine framework it is in fact possible to compute the Fourier transform of Y_T | F_t and then evaluate a discounted payoff through numerical integration methods. It is also interesting to note that model (2) encompasses many other well known specifications. For example Heston's model (Heston, 1993) can be obtained simply by constraining λ = 0, whereas Bates's model (Bates, 1996) can be obtained by imposing H_t^V = 0.

For jump diffusion models in general the discretization procedure is not trivial, due to the dependence between the jump size and the state Z. In the general model defined in (2), the jump size is in fact dependent on the state Z_{t−}. For this reason, to approximate the continuous trajectories, we need to use a general Euler scheme. To be more precise we apply a particular case of the scheme proposed in Glasserman and Merener (2001), which leads to

$$Y_{i+1} = \left(\mu - \tfrac12 e^{Z_i}\right)(\tau_{i+1} - \tau_i) + e^{Z_i/2}\left(W_{1,i+1} - W_{1,i}\right) + H^Y_{i+1}\left(N_{i+1} - N_i\right), \qquad (4)$$
$$Z^-_{i+1} = Z_i + f_0(Z_i)(\tau_{i+1} - \tau_i) + f_1(Z_i)\left(W_{2,i+1} - W_{2,i}\right), \qquad (5)$$
$$Z_{i+1} = Z^-_{i+1} + \log\!\left(1 + H^V_{i+1} e^{-Z^-_{i+1}}\right)\left(N_{i+1} - N_i\right), \qquad (6)$$

in which τ_{i+1} − τ_i = Δ, Y_{i+1} = log(S_{i+1}/S_i) and J_{i+1} = N_{i+1} − N_i is described by a Binomial distribution Bi(1, λΔ). The error terms are such that (ε^y_{i+1}, ε^v_{i+1}) is a bivariate Gaussian random variable with zero mean and correlation ρ. To fix the notation, Y = (Y_1, ..., Y_N), whereas Y_i = (Y_1, ..., Y_i) indicates past observations up to time τ_i. Furthermore, f_0(Z_i) = κ(θe^{−Z_i} − 1) − ½σ_v² e^{−Z_i} and f_1(Z_i) = σ_v e^{−Z_i/2}. In practice, the use of daily intervals, i.e. Δ = 1, seems to produce small biases with respect to the continuous model, as stated in Eraker et al. (2003). For this reason we substitute τ_i with t for the rest of the paper.
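To illustrate the Euler scheme (4)–(6) with Δ = 1 (the daily interval used below), here is a minimal simulation sketch. It is not the authors' implementation; the parameter values and the initialization Z_0 = log θ are illustrative assumptions.

```python
import numpy as np

def simulate_dps(T, mu=0.03, kappa=0.02, theta=0.8, sigma_v=0.12, rho=-0.4,
                 lam=0.01, mu_v=1.0, mu_y=-2.0, rho_j=-0.5, sigma_y=2.0, seed=1):
    """Euler scheme (4)-(6) for the DPS model with Delta = 1 (daily step)."""
    rng = np.random.default_rng(seed)
    Y = np.empty(T)            # daily log-returns
    Z = np.empty(T + 1)        # log-variance
    Z[0] = np.log(theta)       # illustrative starting value
    cov = np.array([[1.0, rho], [rho, 1.0]])
    eps = rng.multivariate_normal(np.zeros(2), cov, size=T)  # (eps_y, eps_v)
    for t in range(T):
        V = np.exp(Z[t])
        J = rng.binomial(1, lam)                         # common jump indicator, Bi(1, lambda)
        HV = rng.exponential(mu_v) if J else 0.0         # H^V ~ Exp(mu_v)
        HY = rng.normal(mu_y + rho_j * HV, sigma_y) if J else 0.0
        # return equation (4)
        Y[t] = (mu - 0.5 * V) + np.sqrt(V) * eps[t, 0] + HY * J
        # drift and diffusion of the log-variance, Eq. (5)
        f0 = kappa * (theta * np.exp(-Z[t]) - 1.0) - 0.5 * sigma_v**2 * np.exp(-Z[t])
        f1 = sigma_v * np.exp(-0.5 * Z[t])
        Z_minus = Z[t] + f0 + f1 * eps[t, 1]
        # jump in the log-variance, Eq. (6)
        Z[t + 1] = Z_minus + np.log(1.0 + HV * np.exp(-Z_minus)) * J
    return Y, Z

Y, Z = simulate_dps(2500)
print(Y.mean(), np.exp(Z).mean())
```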

3. Inference

We estimate all the models described in Section 2 through an efficient MCMC strategy. The goal of the inferential procedure is to estimate parameters and latent processes, which in the more complex case are jumps and volatilities. Since volatilities and jumps are not observed, they are treated as missing data in an MCMC setup where the target distribution is π(θ, V, J | Y). The vector V represents the stochastic volatilities, J is the jump process and θ is the set of parameters. The technique aims at estimating the distribution p(θ | Y) simply by recursively simulating the latent processes and the parameters. This approach was introduced by Carlin et al. (1992) in state space modeling and applied to the stochastic volatility framework in Jacquier et al. (1994, 2004) and Kim et al. (1998).

We adopt an MCMC strategy based on the DR method introduced in Tierney and Mira (1999) and Mira (2002) and applied to affine jump models in Raggi (2004). The idea builds on componentwise Metropolis–Hastings strategies (see Levine et al. (2005) for a treatment of these topics). The goal of this algorithm is to approach the so-called Peskun optimality (Peskun, 1973) by reducing the number of rejections at each step of the routine. The procedure works as follows. If the MH algorithm rejects a candidate, we resample the new state of the chain from a different proposal, taking into account the information contained in the previous stages. In practice, if during the run of the chain a candidate y_1 for X_t is rejected, a further MH step according to a new proposal is appended. To be more precise, the proposal at the ith step depends on the former rejections y_1, ..., y_{i−1}, i.e. q_i(x, y_1, y_2, ..., y_i). In the general case Tierney and Mira (1999) and Mira (2002) prove that for each sub-step of the MH algorithm the acceptance probability α_i(x, y_1, ..., y_i) that maintains the reversibility of the chain is

$$\alpha_i(x, y_1, \ldots, y_i) = \min\left\{\frac{\pi(y_i)\,q_1(y_i, y_{i-1})\,q_2(y_i, y_{i-1}, y_{i-2})\cdots q_i(y_i, y_{i-1}, \ldots, x)}{\pi(x)\,q_1(x, y_1)\,q_2(x, y_1, y_2)\cdots q_i(x, y_1, \ldots, y_i)}\right.$$
$$\left.\times\;\frac{[1-\alpha_1(y_i, y_{i-1})]\,[1-\alpha_2(y_i, y_{i-1}, y_{i-2})]\cdots[1-\alpha_{i-1}(y_i, \ldots, y_1)]}{[1-\alpha_1(x, y_1)]\,[1-\alpha_2(x, y_1, y_2)]\cdots[1-\alpha_{i-1}(x, y_1, \ldots, y_{i-1})]},\;1\right\}. \qquad (7)$$


The sampling scheme can be summarized as follows.

Delayed rejection algorithm
if y_1 ∼ q_1(x, ·) is rejected in the MH step then
  for i = 2, ..., k do
    Sample y_i from q_i(x, y_1, ..., y_{i−1}, ·);
    Accept y_i with probability α_i(x, y_1, ..., y_i) given by (7);
    if y_i is accepted then
      X_{t+1} = y_i; exit the loop;
    end if
  end for
end if
X_{t+1} = y_i (if some candidate has been accepted) or X_{t+1} = X_t (otherwise).

We use the DR to update the volatility process, since simpler strategies based on independence and random walk proposals seem to give a poor approximation of the conditional distribution of V_t. We found the independence proposal inefficient, and we adopt the DR to improve on the random walk proposal of Eraker et al. (2003). It is interesting to note that the method allows for a lot of flexibility, since there are no constraints on the selection of the proposals. Furthermore, this way of operating maintains the Markov property of the chain. Each sweep of the algorithm can be described as follows:

Update Z_t, t = 1, ..., T, from p(Z_t | J, θ, Z_{−t}, H^V, H^Y, Y) via DR.
Update the jump times J_t, t = 1, ..., T, from p(J_t | Z, θ, H^V, H^Y, Y).
Update the sizes H_t^V, t = 1, ..., T, from p(H_t^V | Z, θ, J_t, H^Y, Y).
Update the sizes H_t^Y, t = 1, ..., T, from p(H_t^Y | Z, θ, J, H^V, Y).
Update θ from p(θ_k | θ_{−k}, Z, J, H^V, H^Y, Y), where θ is the vector of the parameters, θ_k is its kth element and θ_{−k} is the vector without the kth element.

If we look at the DPS model, the priors for the parameters are the following: μ ∼ N(0, 25), κ ∼ N(0, 1), θ ∼ N(0, 1), σ_v² ∼ U(0, 1), ρ ∼ U(−1, 1), λ ∼ Beta(2, 40), ρ_j ∼ N(0, 4), μ_v ∼ IG(20, 10), σ_y² ∼ IG(5, 20), μ_y ∼ N(0, 100), ν ∼ U(5, 45). This choice has been made in Eraker et al. (2003) and basically leads to a closed form for the full conditional of many parameters, which is a great advantage from a computational point of view. The only two exceptions are ρ and σ_v². To update them we apply the adaptive rejection Metropolis sampling (ARMS) algorithm of Gilks et al. (1995).

In order to check the level of accuracy of the algorithm we apply the DR methodology to some simulated data sets. We describe the results for the general model with two jump processes, because from a statistical and a computational point of view it is the most general. The main improvement provided by the DR proposal for V_t concerns the reduction of the rejections at each step of the procedure.
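To make the delayed rejection step concrete, the following is a minimal two-stage sketch for a scalar component, assuming symmetric Gaussian random-walk proposals at both stages (so several proposal terms in (7) cancel). It is an illustrative sketch, not the proposal actually used for V_t, which is described below.

```python
import numpy as np

def dr_step(x, log_target, s1=1.0, s2=0.5, rng=None):
    """One delayed-rejection Metropolis step with two Gaussian random-walk stages.

    Stage-1 scale s1; if the stage-1 candidate is rejected, a second candidate is
    drawn with the (smaller) scale s2 and accepted with the probability implied
    by (7) for this two-stage, symmetric-proposal case.
    """
    rng = rng or np.random.default_rng()
    lp_x = log_target(x)

    # first stage: plain random-walk Metropolis
    y1 = x + s1 * rng.standard_normal()
    lp_y1 = log_target(y1)
    a1_x = min(1.0, np.exp(lp_y1 - lp_x))          # alpha_1(x, y1)
    if rng.uniform() < a1_x:
        return y1

    # second stage: new proposal from x; acceptance keeps reversibility
    y2 = x + s2 * rng.standard_normal()
    lp_y2 = log_target(y2)
    a1_y2 = min(1.0, np.exp(lp_y1 - lp_y2))        # alpha_1(y2, y1), reverse path
    # stage-1 proposal densities q1(y2, y1) and q1(x, y1) do not cancel
    log_q1_rev = -0.5 * ((y1 - y2) / s1) ** 2
    log_q1_fwd = -0.5 * ((y1 - x) / s1) ** 2
    num = lp_y2 + log_q1_rev + np.log(max(1.0 - a1_y2, 1e-300))
    den = lp_x + log_q1_fwd + np.log(max(1.0 - a1_x, 1e-300))
    if rng.uniform() < min(1.0, np.exp(num - den)):
        return y2
    return x

# toy usage: sample from a standard normal target
rng = np.random.default_rng(0)
x, draws = 0.0, []
for _ in range(5000):
    x = dr_step(x, lambda v: -0.5 * v * v, rng=rng)
    draws.append(x)
print(np.mean(draws), np.std(draws))
```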

Fig. 1. Correlogram of $T^{-1}\sum_{t=1}^{T} V_t$: random walk vs. delayed rejection proposal.

In our analysis we adopt a version of the method based on three stages, even though the algorithm works reasonably well with just two steps. At the first level we adopt an independence proposal based on the results of Eraker (2001). At this stage, the proposal for V_t is a mixture of Normal random variables with common mean μ_t = 0.5(V_{t+1} + V_{t−1}), variances σ_t² = σ_v²(1 − ρ²)V_{t−1}/2 or σ_t² = 20, and weights 0.95 and 0.05, respectively. This proposal seems appropriate to take care of the tails of the conditional distribution. We tried other heavy-tailed proposals, though we did not observe any significant improvement. At the second and third steps we used a random walk proposal with the same variance adopted before. These extra steps aim at avoiding a possibly bad behavior of the independence chain. With this setup, the proportion of rejections at each sweep of the algorithm drops from 30–40% to about 5%. This leads to a sensible reduction of the autocorrelation of the chain, as shown in Fig. 1. These results are consistent with the ones obtained for real data. The reduction is considerable, and in our experience it is necessary to record one draw in every 3 or 4 to obtain a similar estimate of the autocorrelation function using a random walk proposal.

In order to assess the asymptotic properties of the algorithm, we used some well known convergence diagnostic tests implemented in the language R through the routine BOA (Bayesian Output Analysis Program) provided by Smith (2004). This strategy provides some suggestions about the number of iterations needed for convergence to the invariant distribution. For some details on the diagnostics adopted here see Brooks and Roberts (1998). To perform such an analysis we run parallel and independent chains with different starting values dispersed over the domain of the parameters. In order to choose a proper burn-in we analyze the Gelman and Rubin shrink factors (Gelman and Rubin, 1992). This diagnostic test is based on the comparison of the within-chain and the between-chain variance for each parameter of interest. The test gives as output the so-called shrink factor related to some given quantile (in our analysis the 50% and the 97.5%). As a rule of thumb, these quantities have to be approximately close to 1 or less than 1.2, respectively, for the median and for the 97.5% quantile. In that case we can say that the samples are drawn from the true posterior. To compute the test, we run 10 independent chains for 60,000 iterations with a burn-in of 10,000. This should be an appropriate choice, since pilot runs of the algorithm show no multi-modality in the estimated posterior. The true parameters are reported in the second column of Table 1, whereas the estimates for the shrink factors are shown in the third and fourth columns.


Table 1
Convergence checking statistics

                    Gelman–Rubin                      Raftery–Lewis
         True       Median     0.975 Quantile         Burn-in    Total
μ        0.05       1.002      1.005                  13         1219
κ        0.03       1.006      1.013                  34         3038
θ        0.5        1.004      1.008                  5          456
σ_v      0.1        1.005      1.010                  247        19931
ρ        −0.5       1.004      1.008                  378        23732
λ        0.008      1.014      1.029                  41         3591
ρ_j      −0.4       1.067      1.126                  501        31671
μ_v      1          1.009      1.018                  133        10937
σ_y      3.5        1.009      1.019                  215        14733
μ_y      −2         1.044      1.086                  1337       83768

All the factors are close to 1 and therefore, according to the test, we accept the hypothesis of convergence. We report the graphical analysis of the shrink factors in Fig. 2. It is clear that a burn-in of 10,000–15,000 iterations is necessary to reach convergence for each parameter. We also compute the Raftery and Lewis test (Raftery and Lewis, 1992), which consists of estimating a given percentile of a parameter of interest with a predetermined precision. We estimate the 2.5% quantile with a precision of 2%. This procedure suggests the length of the burn-in and the number of iterations required to reach the given accuracy. The results are reported in Table 1, in the fifth and sixth columns. They have been computed by averaging the burn-in and the total number of iterations suggested for each parameter and for each of the 10 chains. It is evident that the burn-in required is very small, even though for some parameters, i.e. μ_y, it is necessary to run a longer chain to obtain the expected accuracy. This is basically due to the high autocorrelation among draws. However, we observed this bad behavior in just a few experiments. Finally we consider the Geweke Z statistic (Geweke, 1992), which tests the stationarity of the chain by monitoring the behavior of the average of the draws. In practice, after splitting the chain into two windows, it compares the means of the first and the last iterations. If the p-value is bigger than 5% we accept the hypothesis of convergence. In our analysis the test accepts the hypothesis of convergence very often. However, in a few simulations it seems that some care has to be taken for the parameters μ_y and μ_v, in which case this diagnostic test suggests a slightly longer burn-in period than 10,000 iterations. The convergence analysis suggests that the algorithm is in general accurate and that a burn-in of 10,000–15,000 iterations guarantees an appropriate convergence. On some occasions difficulties arise for the parameters related to the jump processes, in which case some care has to be taken in the empirical analysis. The reason for these occasional pitfalls probably lies in the nature of jumps, which are rare events; for this reason it is difficult to identify the parameters through the data. This lack of identification has also been noticed in Chib et al. (2002) and Eraker et al. (2003).
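For reference, the shrink factor underlying the Gelman and Rubin diagnostic can be computed from parallel chains as follows; this is a sketch of the standard formula, not the BOA routine itself.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction ("shrink") factor for parallel chains.

    chains: array of shape (m, n), i.e. m independent chains of length n,
    with the burn-in already discarded.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)            # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()      # within-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return np.sqrt(var_hat / W)                # values close to 1 indicate convergence

# toy check with well-mixed chains (should be close to 1)
rng = np.random.default_rng(0)
print(gelman_rubin(rng.normal(size=(10, 5000))))
```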



Fig. 2. Gelman and Rubin shrink factors for each parameter.

4. Model selection

In this section we describe the procedures implemented to evaluate and compare the models defined by (1) and (4)–(6). We estimate the Bayes factor to evaluate the goodness-of-fit of the models.


Furthermore, the use of a VaR forecasting procedure aims at evaluating the out-of-sample ability to predict the conditional variances. To fix the notation, we index the jth model by M_j and its vector of parameters by θ_j.

4.1. Bayes factor

There are different approaches available in Bayesian statistics to evaluate the best model that fits the data. Recently, the DIC proposed in Spiegelhalter et al. (2002) has been applied to stochastic volatility models in Berg et al. (2004). We rely on another criterion to perform model selection, i.e. the Bayes factor, which is defined as follows:

$$B_{ij} = \frac{m(Y \mid M_i)}{m(Y \mid M_j)} = \frac{p(M_i \mid Y)/p(M_j \mid Y)}{p(M_i)/p(M_j)}, \qquad (8)$$

where

$$m(Y \mid M_i) = \int p(Y \mid M_i, \theta_i)\,\pi(\theta_i \mid M_i)\, d\theta_i \qquad (9)$$

is the marginal likelihood of the ith model. This can be decomposed as

$$m(Y \mid M_i) = \frac{p(Y \mid M_i, \theta_i)\,\pi(\theta_i \mid M_i)}{\pi(\theta_i \mid Y, M_i)} \qquad \forall\, \theta_i \in \mathrm{supp}(\theta_i). \qquad (10)$$
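In log form, the decomposition (10) reduces the computation of each marginal likelihood, and hence of (8), to three ordinates evaluated at a single point; the following schematic sketch (with placeholder numbers, not estimates from this paper) shows how the pieces are assembled.

```python
def log_marginal_likelihood(loglik_at_theta, logprior_at_theta, logpost_at_theta):
    """Evaluation of (10) in logs: log m(Y) = log p(Y|theta*) + log pi(theta*) - log pi(theta*|Y).

    The three ordinates are evaluated at a single high-density point theta*:
    the likelihood via the particle filter, the posterior ordinate via reduced
    MCMC runs, and the prior analytically.
    """
    return loglik_at_theta + logprior_at_theta - logpost_at_theta

def log_bayes_factor(logm_i, logm_j):
    """log B_ij = log m(Y|M_i) - log m(Y|M_j), cf. (8)."""
    return logm_i - logm_j

# purely illustrative placeholder numbers
logm_model_i = log_marginal_likelihood(-5010.2, -14.6, 9.8)
logm_model_j = log_marginal_likelihood(-5032.7, -15.1, 10.4)
print(log_bayes_factor(logm_model_i, logm_model_j))
```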

The use of Bayes factors has recently been adopted in Chib et al. (2002) and in Eraker et al. (2003) in a stochastic volatility framework. When a latent process is considered, the computation of the Bayes factor becomes a challenging task, since many components of Eq. (10) are not available in closed form. This is the case of stochastic volatility models, in which conditional variances and jumps are not observed. In practical applications, it is convenient to use the decomposition stated in (10) and evaluate it at a given point θ̂_i. The first problem is to estimate the posterior ordinate π(θ̂_i | Y, M_i). This quantity is computed using the method proposed in Chib and Jeliazkov (2001), based on a sequence of reduced runs of the same MCMC algorithm used for the inference. The second problem arises when computing the likelihood p(Y | M_i, θ̂_i), since the latent component has to be integrated out. We estimate the likelihood through a version of the auxiliary particle filter proposed in Pitt and Shephard (1999). A similar solution has been recently proposed for different models in Chib et al. (2002) and in Johannes et al. (2003). The main difference in this application is the introduction of a feedback effect between the equations induced by ρ. In this case the covariance matrix of the error term (ε_t^y, ε_t^v) has to be rewritten according to its Cholesky decomposition. The jump process has been integrated out. In this way, the processes estimated through the filter reduce to X_t = (Z_{t−1}, H_t^Y, H_t^V).


The auxiliary particle filter for this application is summarized as follows.

Auxiliary particle filter
for t = 1, ..., T do
  Given a sample {(Z_{t−1}^i, H_t^{Y,i}, H_t^{V,i}) : i = 1, ..., M} from p(X_t | Y_t);
  Compute X̄_{t+1}^i = (Z̄_t^i, H̄_t^{Y,i}, H̄_t^{V,i}), with Z̄_t^i = E[Z_t | Z_{t−1}^i] and H̄_t^{Y,i} = H̄_t^{V,i} = 0;
  Compute π_i = p(Y_{t+1} | X̄_{t+1}^i) / Σ_{i=1}^{M} p(Y_{t+1} | X̄_{t+1}^i);
  Resample R times the indexes i with weights π_i, i.e. {m_j : j = 1, ..., R};
  Draw X_{t+1}^{*j} from p(X_{t+1} | X_t^{m_j});
  Compute π̂_j = p(Y_{t+1} | X_{t+1}^{*j}) / p(Y_{t+1} | X̄_{t+1}^{m_j}), normalized over j;
  Resample M times the indexes j = 1, ..., R with weights π̂_j;
  Store {X_{t+1}^i : i = 1, ..., M} from p(X_{t+1} | Y_{t+1});
end for

Once the states are filtered, it is easy to evaluate the likelihood by

$$p(Y_T \mid \hat\theta_i) = \prod_{t=1}^{T} \hat{p}(Y_t \mid Y_{t-1}), \qquad (11)$$

where $p(Y_t \mid Y_{t-1}) = \int p(Y_t \mid X_t, \hat\theta_i)\,p(X_t \mid Y_{t-1})\,dX_t$ can be estimated through a Monte Carlo procedure, simulating the state X_t from p(X_t | X_{t−1}^i), i = 1, ..., M, in which X_{t−1}^i is the outcome of the filtering procedure at time t − 1. The estimated likelihood is needed to compute the Bayes factor.
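As an illustration of how the filter delivers the likelihood (11), here is a stripped-down auxiliary particle filter for the plain logarithmic SV model without jumps or leverage, so the feedback correction and the jump components described above are omitted. It is a sketch under these simplifying assumptions, not the authors' implementation (which, among other things, uses R = 5 × M first-stage draws).

```python
import numpy as np
from scipy.stats import norm

def apf_loglik(y, alpha, phi, sigma_v, M=2000, seed=0):
    """Auxiliary particle filter log-likelihood for the basic logarithmic SV model:
        Z_t = alpha + phi * Z_{t-1} + sigma_v * eta_t,   Y_t = exp(Z_t/2) * eps_t,
    with independent Gaussian errors (no jumps, no leverage).
    """
    rng = np.random.default_rng(seed)
    T = len(y)
    # initialise particles from the stationary distribution of Z
    z = rng.normal(alpha / (1 - phi), sigma_v / np.sqrt(1 - phi**2), size=M)
    loglik = 0.0
    for t in range(T):
        # first stage: weight by the likelihood at the predicted mean of Z_t
        z_pred = alpha + phi * z
        lw1 = norm.logpdf(y[t], scale=np.exp(z_pred / 2))
        w1 = np.exp(lw1 - lw1.max())
        idx = rng.choice(M, size=M, p=w1 / w1.sum())
        # propagate the selected particles
        z_new = alpha + phi * z[idx] + sigma_v * rng.standard_normal(M)
        # second stage: correct for the difference between the two likelihoods
        lw2 = norm.logpdf(y[t], scale=np.exp(z_new / 2)) - lw1[idx]
        w2 = np.exp(lw2 - lw2.max())
        # likelihood increment p_hat(y_t | y_{1:t-1})
        loglik += np.log(w1.mean()) + lw1.max() + np.log(w2.mean()) + lw2.max()
        # resample to an equally weighted particle set
        z = z_new[rng.choice(M, size=M, p=w2 / w2.sum())]
    return loglik

# toy usage on simulated data
rng = np.random.default_rng(1)
z = np.zeros(500); y = np.zeros(500)
for t in range(1, 500):
    z[t] = -0.02 + 0.97 * z[t - 1] + 0.15 * rng.standard_normal()
    y[t] = np.exp(z[t] / 2) * rng.standard_normal()
print(apf_loglik(y, -0.02, 0.97, 0.15))
```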



In practice, the Bayes factor evaluation is synthesized by:

• posterior ordinate π(θ̂_i | Y, M_i) ⇒ reduced MCMC runs;
• likelihood p(Y | M_i, θ̂_i) ⇒ auxiliary particle filter.

4.2. Value-at-risk

In order to evaluate the ability of the models to predict the future behavior of the volatility process, we study the forecasted VaR, which is also a common tool to measure financial and market risks. According to Jorion (1997), VaR measures the worst expected loss over a given time interval under normal market conditions at a given confidence level. To compute the VaR it is then important to fix a confidence level and a time interval that describes the number of days we need to hold a given portfolio and for which we are interested in evaluating the risk. In practice, VaR traces back to the computation of a quantile of interest that represents the probability associated with a certain extreme loss. The i-days-ahead VaR forecast is defined by

$$\alpha = \Pr_{t-i}\{Y_t \le VaR_t(\alpha)\}, \qquad (12)$$


where i represents the forecasting time horizon, which for the rest of the paper is fixed to 1. The probability α is the confidence or coverage level for the VaR and is often taken to be 1% or 5%. Following Niguez (2002), the one-day-ahead VaR forecast is computed as

$$VaR_{t+1}(\alpha) = \hat\mu_{t+1} + \hat\sigma_{t+1}\, z_\alpha, \qquad (13)$$

t+1 and  t+1 where z is the quantile associated to the probability level . The quantities  are the forecasts of the conditional mean and of the conditional variance for the model considered. In order to forecast the volatility and the drift, we adopt the same auxiliary particle filter used for evaluating the likelihood. Once the log-volatility Zt is estimated by the filter, we simulate the one-day-ahead forecast for t+1 and t+1 . Then, according to (13), we compute the quantile V aR t+1 (). To verify the accuracy of the VaR estimates for the analyzed models, we adopt a backtesting procedure, based on the counting of the number of excess of losses observed on a given sample. We exploit the procedure proposed in Christoffersen (1998) and Christoffersen and Pelletier (2004). This procedure is based on the study of the random sequence  1 if Yt < V aR t (), It = (14) 0 else. A sequence of VaR forecasts has correct conditional coverage if It is an i.i.d. sequence of Binomial random variables with parameter . In practice, this hypothesis can be verified by testing jointly the independence on the series and the unconditional coverage of the VaR forecasts, i.e. E [It ] = . If we assume that It follows a first order Markov sequence, with transition probability   1 − p01 p01 (15) P= 1 − p11 p11 then the null hypothesis of the conditional coverage test is H0 : p01 =p11 =p. The likelihood ratio test statistic is distributed according to a 2 (2) random variable. 5. Empirical results Our empirical analysis focuses on the study of three international financial indexes: the Dow Jones Industrial (1/1/1983–24/5/2004), the Nikkei 225 (4/1/1991–24/5/2004) and the CAC 40 (2/1/1990–24/5/2004). The data sets have been  downloaded from Datastream. The returns are defined as Yt =100× log St − log St−1 . We used the last 1500 observations to perform the out-of-sample analysis whereas we used the others (until 24th August 1998) to estimate the parameters and to evaluate the goodness-of-fit statistics. The plot of the time series is shown in Fig. 3. All the calculations made in this section are based on software written using the Ox䉷 3.0 language of Doornik (2001). The graph illustrates that the volatility is not constant. For the Dow Jones series the big crash dated the 19th October 1987 is evident. Eraker et al. (2003) notice that in correspondence with that jump event, it is also possible to observe a jump behavior on the volatilities. For this reason it seems sensible to introduce a jump component also on the variance equation.

5. Empirical results

Our empirical analysis focuses on the study of three international financial indexes: the Dow Jones Industrial (1/1/1983–24/5/2004), the Nikkei 225 (4/1/1991–24/5/2004) and the CAC 40 (2/1/1990–24/5/2004). The data sets have been downloaded from Datastream. The returns are defined as Y_t = 100 × (log S_t − log S_{t−1}). We used the last 1500 observations to perform the out-of-sample analysis, whereas we used the others (until 24th August 1998) to estimate the parameters and to evaluate the goodness-of-fit statistics. The plot of the time series is shown in Fig. 3. All the calculations made in this section are based on software written using the Ox 3.0 language of Doornik (2001). The graph illustrates that the volatility is not constant. For the Dow Jones series the big crash dated 19th October 1987 is evident. Eraker et al. (2003) notice that in correspondence with that jump event it is also possible to observe a jump behavior in the volatilities. For this reason it seems sensible to introduce a jump component also in the variance equation.

Fig. 3. Returns (panels: Nikkei 225, CAC, Dow Jones Industrial; the vertical marker at 24 August 1998 separates the estimation and the out-of-sample periods).

We estimate six different stochastic volatility models belonging to the different classes described in Section 2. We also estimate two APARCH models, which will be used as a benchmark for the others in order to evaluate the VaR. First of all we report the estimates obtained for the affine models. They are, respectively, the stochastic volatility model proposed in Heston (1993), the stochastic volatility model with a jump component in the returns introduced in Bates (1996) and the contemporaneous-jump stochastic volatility model (DPS) of Duffie et al. (2000). This latter model has also been extended to allow for fat tails in the returns. In this case we assume a Student t error term with ν degrees of freedom. The model is indexed by DPS t.

Inference has been performed by running the algorithm for 100,000 iterations with a burn-in of 50,000. This seems to be an appropriate choice, considering the results obtained in Section 3. The results for the three indexes are reported in Tables 2–4. We conduct the analysis for the Dow Jones series; for the other indexes the conclusions are quite similar. The main difference is due to the different impacts of the jumps on the volatility. In fact, for the last two series, we consider different time intervals that do not include the crash that occurred in October 1987.

For the Heston model, the average annualized volatility is $\sqrt{260\,\theta}$. The estimate of this quantity is 13.94, which is fairly different with respect to the sample volatility of 16.17. The parameter κ is the mean reversion of the volatility equation and is 0.051. The correlation parameter ρ is close to zero. This result seems surprising with respect to the evidence provided in the literature. However, this behavior is most likely due to the presence of the volatility factor in the return's drift. This finding is confirmed by many Monte Carlo experiments (not reported here). The variance of the volatility process is σ_v² V_t, so the parameter σ_v, which is 0.157, is important to assess the volatility of the volatility. If the first jump component is taken into account, the expected volatility decreases.


Table 2
Dow Jones Industrial—affine models

          Heston             Bates              DPS                DPS t
μ         0.2955 (0.0004)    0.2761 (0.0006)    0.2719 (0.0009)    0.2519 (0.0004)
κ         0.0514 (0.0007)    0.0352 (0.0007)    0.0802 (0.0020)    0.0520 (0.0012)
θ         0.7481 (0.0013)    0.7104 (0.0053)    0.5321 (0.0031)    0.5432 (0.0038)
σ_v       0.1578 (0.0011)    0.1139 (0.0008)    0.1361 (0.0017)    0.1124 (0.0006)
ρ         0.0182 (0.0027)    −0.0370 (0.0055)   −0.0390 (0.0051)   −0.0340 (0.0069)
λ         —                  0.0131 (0.0003)    0.0194 (0.0005)    0.0057 (0.0003)
ν         —                  —                  —                  15.8617 (0.1198)
E[H^Y]    —                  −0.3505 (0.0465)   −0.3103 (0.0381)   −3.0476 (0.3120)
E[H^V]    —                  —                  0.5060 (0.0083)    0.6208 (0.0102)
σ_y       —                  1.9045 (0.0352)    2.4270 (0.0698)    4.3495 (0.0634)

In parenthesis are reported the Monte Carlo standard errors.

Table 3
Nikkei Index—affine models

          Heston             Bates              DPS                DPS t
μ         0.6762 (0.0027)    0.4322 (0.0030)    0.4834 (0.0019)    0.4005 (0.0029)
κ         0.0032 (0.0005)    0.0100 (0.0011)    0.1067 (0.0027)    0.0196 (0.0018)
θ         0.3728 (0.2286)    0.9856 (0.2697)    1.1517 (0.0078)    1.3577 (0.0941)
σ_v       0.1397 (0.0015)    0.1362 (0.0020)    0.2600 (0.0029)    0.1377 (0.0025)
ρ         −0.3485 (0.0059)   −0.3512 (0.0155)   −0.0922 (0.0071)   −0.3877 (0.0144)
λ         —                  0.0851 (0.0014)    0.0554 (0.0012)    0.0127 (0.0011)
ν         —                  —                  —                  5.1035 (0.0023)
E[H^Y]    —                  1.4041 (0.0213)    2.2632 (0.0516)    1.4167 (0.2577)
E[H^V]    —                  —                  0.4810 (0.0062)    0.4732 (0.0064)
σ_y       —                  2.1391 (0.0111)    1.8384 (0.0163)    1.9953 (0.0412)

In parenthesis are reported the Monte Carlo standard errors.

This is because of the introduction of a jump component: in fact it explains part of the volatility that in the previous model was described just by a diffusive process. The parameter λ is 0.013. This means that the model expects 3.4 jumps per year. The annualized spot volatility, i.e. $\sqrt{260\,(\theta + \mu_v\lambda/\kappa)}$, reduces to 13.59%, and the annualized total volatility, that is the mean square error of the returns taking into account the jump component, is 15.79%. In percentage terms, the average effect of the jump component on the total volatility is approximately 25.9%.

The DPS model is now analyzed. The introduction of the second jump component increases the parameter κ, and then the mean reversion phenomenon sensibly speeds up. The parameter λ basically remains unchanged with respect to Bates's model. The annualized spot volatility decreases with respect to the other models and is 13.0%. On the other hand, the total volatility is 13.7%, increased by around 10% by the return jumps. It is evident that there exists a reduction with


Table 4
CAC Index—affine models

          Heston             Bates              DPS                DPS t
μ         0.6351 (0.0017)    0.5670 (0.0018)    0.5886 (0.0018)    0.4685 (0.0016)
κ         0.0194 (0.0019)    0.0186 (0.0018)    0.0453 (0.0044)    0.0254 (0.0017)
θ         1.3553 (0.3012)    2.2830 (0.1556)    2.2568 (0.0781)    1.9188 (0.0508)
σ_v       0.1616 (0.0027)    0.1371 (0.0026)    0.1730 (0.0040)    0.1321 (0.0020)
ρ         −0.2683 (0.0092)   −0.1643 (0.0099)   −0.1167 (0.0102)   −0.1456 (0.0077)
λ         —                  0.0235 (0.0005)    0.0151 (0.0004)    0.0046 (0.0003)
ν         —                  —                  —                  5.6554 (0.0057)
E[H^Y]    —                  1.2168 (0.0578)    3.1061 (0.0371)    1.1524 (0.093408)
E[H^V]    —                  —                  0.1729 (0.3777)    0.4655 (0.0077)
σ_y       —                  0.5202 (0.0077)    3.4260 (0.0669)    2.0123 (0.0388)

In parenthesis are reported the Monte Carlo standard errors.

It is evident that there exists a reduction with respect to the model proposed by Bates. This is due to the extra jump component that explains part of the volatility of the complete model. In fact, the contribution of the jumps on volatilities shifts $\sqrt{V_{t-}}$ to $\sqrt{V_{t-} + H_t^V}$. The effect is mild with respect to the findings shown in Eraker et al. (2003) and in this work is approximately 4%. The introduction of the Student t distribution reduces the intensity λ, which now is 0.0057. The estimate for the degrees of freedom is 15.8. This result in some way rejects the hypothesis of normality. Furthermore, the jump sizes increase in absolute terms. We find that the expected number of jumps is much smaller with respect to DPS but with a higher impact. This is because the introduction of the new error term explains the behavior of many observations that are in the tails but that are not too extreme. On the other hand, the jump parameters are influenced by just a few returns that are really extreme. An example is represented by the crash observed in October 1987.

The second group of models includes the one defined by Eq. (1) and is indexed by HW. Its generalization with heavy tails is HW t. The prior distributions are: φ ∼ N(0, 10)I_{[−1,1]}, α ∼ N(0, 100), ω ∼ IG(1, 0.005), ψ | ω ∼ N(0, ω/2), ν ∼ U(5, 45), where ω and ψ are such that σ_v² = ψ² + ω and ρ = ψ/σ_v. This choice is the same adopted in Jacquier et al. (2004) and simplifies the updating scheme for the parameters. The results for the three series are reported in Table 5. Here, there is significant evidence of fat tails in the returns. The parameter ν is in general smaller than 10, apart from the CAC 40 series, signaling departures from the Normal assumption in favor of a distribution with fatter tails. The parameter φ is in general close to 0.97 for the model with Gaussian errors and shifts towards 1 when the Student t is introduced. This is a standard result for this type of model. Thus for this data set the t specification seems the correct one. The posterior for σ_v is also affected in a dramatic way by the normality assumption. The correlation ρ is negative, suggesting a strong leverage effect. These estimates are different from the results reported in Tables 2–4. This difference depends on whether or not the volatility enters the drift term. This finding has been noticed through an extensive Monte Carlo analysis, available upon request.
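The reparameterization above maps posterior draws of (ω, ψ) back to (σ_v, ρ); a one-line illustration with hypothetical draw values:

```python
import numpy as np

# map a posterior draw of (omega, psi) back to (sigma_v, rho), cf. sigma_v^2 = psi^2 + omega
omega, psi = 0.02, -0.04          # hypothetical draw values
sigma_v = np.sqrt(psi**2 + omega)
rho = psi / sigma_v
print(sigma_v, rho)
```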


Table 5
Results on Hull–White-type models

          Dow Jones                            Nikkei                               CAC
          HW                HW t               HW                HW t               HW                HW t
φ         0.9757 (0.0003)   0.9946 (0.0001)    0.9796 (0.0006)   0.9887 (0.0003)    0.9766 (0.0008)   0.9849 (0.0006)
α         −0.0104 (0.0001)  −0.0038 (0.0001)   0.0075 (0.0002)   −0.0007 (0.0001)   0.0070 (0.0002)   0.0026 (0.0002)
σ_v       0.1657 (0.0014)   0.0755 (0.0010)    0.1697 (0.0028)   0.1194 (0.0020)    0.1417 (0.0027)   0.1074 (0.0027)
ρ         −0.2756 (0.0033)  −0.2800 (0.0092)   −0.4431 (0.0101)  −0.6532 (0.0091)   −0.4274 (0.0082)  −0.4970 (0.0099)
ν         —                 5.907 (0.045)      —                 7.2652 (0.1982)    —                 15.904 (0.7458)



Table 6
Marginal likelihoods

          Dow Jones                 Nikkei                    CAC
          log-lik     Marg.-lik     log-lik     Marg.-lik     log-lik     Marg.-lik
DPS t     −5001.6     −5027.47      −3272.4     −3277.84      −3349.6     −3347.38
DPS       −5031.3     −5058.14      −3266.1     −3271.87      −3348.2     −3336.55
Bates     −5046.9     −5075.50      −3267.5     −3273.09      −3359.5     −3362.00
Heston    −5144.8     −5162.54      −3353.4     −3354.81      −3376.0     −3380.07
HW t      −4923.9     −4948.42      −3261.7     −3242.01      −3338.1     −3355.73
HW        −4987.4     −5007.33      −3277.0     −3297.36      −3341.6     −3361.84

We now compare the different model dynamics. The first task is to compare the six models according to their goodness-of-fit. In order to do this we compute the Bayes factors. We basically need to estimate the posterior distribution and the likelihood at a given point. Following Chib et al. (2002) we compute the posterior ordinate using the solution proposed in Chib and Jeliazkov (2001) and then we evaluate the likelihood using a suitable version of the auxiliary particle filter able to handle jump dynamics. The posterior has been evaluated by running the full Gibbs iterations with a burn-in of 10,000 and each reduced run for the same number of iterations. To evaluate the likelihood we used M = 20,000 particles and we added at each step a resampling procedure based on R = 5 × M draws to avoid degeneracy problems of the filter (see Maskell and Gordon, 2001 for a survey on this problem). The results are reported in Table 6.

For the Dow Jones series there is some evidence that the second class of models is preferred to the first one. The HW and HW t models have larger marginal likelihoods, indicating a better fit of the models to the data. For the Nikkei series the results are similar. Here the best model is still the Hull and White type with t errors, but the second best model is the DPS, followed by the DPS t.


Finally, for the CAC 40 series the results are slightly different. The best models are the DPS and the DPS t, followed by the HW t. These findings are in a way surprising. The models with jumps are described by more complex dynamics and we expected a better fit with respect to the logarithmic ones. It also seems interesting to note that the introduction of heavy tails appears more important than the jump processes. These results, however, are consistent with some findings evidenced in Chib et al. (2002) with different jump models.

Finally we compare the models according to their ability to forecast the one-day-ahead VaR. We compute the forecasts of the volatility as a byproduct of the filtering procedure and then we evaluate the required quantiles according to Eq. (13). We validate the models according to the backtesting procedure described in Section 4.2, evaluating the conditional coverage of an interval forecast. To make the comparison fairer, we also compute the one-day-ahead prediction of the VaR through the asymmetric power auto-regressive conditional heteroskedastic (APARCH) model proposed in Ding et al. (1993). These models encompass many well known GARCH parameterizations and have recently received extensive attention in the literature (see for example Laurent and Peters, 2002; Giot and Laurent, 2003). The APARCH(1,1) is described by

$$Y_t = \sigma_t\,\varepsilon_t, \qquad \sigma_t^{\delta} = \omega + \alpha\left(|Y_{t-1}| - \gamma Y_{t-1}\right)^{\delta} + \beta\,\sigma_{t-1}^{\delta}.$$

(16)
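A minimal sketch combining the APARCH(1,1) recursion (16) with the one-day-ahead VaR formula (13) is given below; the zero conditional mean, the Gaussian quantile, the crude initialization of the volatility and the parameter values are illustrative assumptions, not the G@RCH implementation.

```python
import numpy as np
from scipy.stats import norm

def aparch_var_forecast(y, omega, beta, alpha1, gamma, delta, level=0.01):
    """Filter sigma_t^delta through (16) and return the one-day-ahead VaR (13)
    with a Gaussian quantile and a zero conditional mean (illustrative assumptions).
    """
    s_delta = np.empty(len(y) + 1)
    s_delta[0] = np.mean(np.abs(y)) ** delta          # crude initialisation
    for t in range(len(y)):
        s_delta[t + 1] = omega + alpha1 * (abs(y[t]) - gamma * y[t]) ** delta + beta * s_delta[t]
    sigma_next = s_delta[-1] ** (1.0 / delta)
    return sigma_next * norm.ppf(level)               # VaR_{t+1}(level) with mu_hat = 0

rng = np.random.default_rng(3)
y = rng.standard_normal(1000)
print(aparch_var_forecast(y, omega=0.02, beta=0.93, alpha1=0.06, gamma=0.5, delta=1.3))
```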

We consider two versions, with the error term ε_t distributed as a Normal and as a Student t, respectively. The APARCH model will be used as a benchmark to compare the one-step-ahead forecasting performance of the other models described in Section 2. The APARCH class has been proposed to forecast VaR in Giot and Laurent (2003). The estimation of the parameters has been carried out with the G@RCH 2.3 package developed by Laurent and Peters (2002). The maximum likelihood estimates are reported in Table 7.

In order to obtain more accurate volatility forecasts, we re-estimate the parameters of all the models every 50 days. This temporal lag has been suggested in Giot and Laurent (2003). In our experience, using smaller intervals does not seem to produce sensible differences in the parameter estimates for these data. The results on VaR are reported in Tables 8–10. In the second column the coverage levels are reported, and then the values of the conditional coverage test with their p-values. In general the best forecasts are obtained through models with heavy tails. This finding is consistent with the results stated in Eberlein et al. (2003), where the models based on Gaussian assumptions are in general rejected compared to hyperbolic laws. However, the Normal APARCH seems to work properly. This finding is different with respect to the results shown in Giot and Laurent (2003), where the use of a more flexible error term outperformed the Gaussian one.

For the Dow Jones series the HW t model gives the best performance. In fact the conditional coverage test always accepts the null hypothesis of i.i.d. behavior for I_t. If we consider just the out-of-sample results, the DPS t and the APARCH work properly at a coverage level of 99%. At the 95% level the Gaussian stochastic volatility models seem to be sufficient. For the Nikkei series, the DPS t model is the best. In fact, the HW t alternative seems to be less precise in forecasting the VaR at the 95% level. The Gaussian APARCH works well at the 95% level but it does not work with a 99% threshold.


Table 7
Results for APARCH models

          Dow Jones                            Nikkei                               CAC
          APARCH            APARCH t           APARCH            APARCH t           APARCH            APARCH t
ω         0.0248 (0.0050)   0.0124 (0.0037)    0.0236 (0.0063)   0.0168 (0.0066)    0.0517 (0.0237)   0.0230 (0.0107)
β         0.9122 (0.0101)   0.9492 (0.0084)    0.9380 (0.0111)   0.9388 (0.0118)    0.9039 (0.0286)   0.9393 (0.0156)
α         0.0773 (0.0087)   0.0458 (0.0076)    0.0514 (0.0115)   0.0572 (0.0128)    0.0613 (0.0153)   0.0506 (0.0123)
γ         0.4934 (0.0766)   0.3327 (0.1101)    0.7477 (0.1875)   0.6967 (0.1816)    0.5720 (0.1477)   0.5559 (0.1570)
δ         1.2629 (0.1309)   1.3227 (0.1656)    1.3516 (0.1897)   1.3661 (0.2442)    1.5608 (0.1477)   1.3276 (0.2290)
ν         —                 4.9683 (0.3944)    —                 6.003 (0.7836)     —                 9.2373 (1.5756)
log-lik   −5148.61          −4917.3            −3312.05          −3253.21           −3362.64          −3332.68

In parenthesis are reported the standard errors.

Table 8
Dow Jones Industrial—VaR forecasts through VaR_{t+1}(α) = μ̂_{t+1} + σ̂_{t+1} z_α

                     In-sample (4080 obs.)         Out-of-sample (1500 obs.)
          1−α (%)    Cond. coverage   p-value      Cond. coverage   p-value
DPS t     99         11.359           0.003        5.395            0.067
DPS       99         33.985           0.000        26.978           0.000
Bates     99         18.909           0.000        14.490           0.001
Heston    99         18.909           0.000        11.512           0.003
HW t      99         1.006            0.604        2.041            0.360
HW        99         25.899           0.000        14.490           0.001
APARCH t  99         19.93            0.000        5.438            0.000
APARCH    99         5.182            0.022        1.481            0.476
DPS t     95         0.400            0.818        7.428            0.024
DPS       95         1.332            0.513        20.501           0.000
Bates     95         1.189            0.551        6.557            0.031
Heston    95         1.694            0.428        2.187            0.334
HW t      95         4.420            0.109        0.342            0.847
HW        95         2.539            0.281        5.402            0.067
APARCH t  95         73.78            0.000        17.302           0.000
APARCH    95         7.047            0.029        0.181            0.991

For the CAC series, many models work properly out-of-sample. At the 95% level the backtesting procedure rejects just the Bates and the Heston models, whereas at the 99% level the heavy-tailed parameterizations give more reliable results.


Table 9
Nikkei Index—VaR forecasts through VaR_{t+1}(α) = μ̂_{t+1} + σ̂_{t+1} z_α

                     In-sample (1990 obs.)         Out-of-sample (1500 obs.)
          1−α (%)    Cond. coverage   p-value      Cond. coverage   p-value
DPS t     99         2.925            0.231        0.509            0.775
DPS       99         25.240           0.000        47.09            0.000
Bates     99         73.114           0.000        80.692           0.000
Heston    99         13.242           0.001        14.490           0.000
HW t      99         0.708            0.701        2.752            0.252
HW        99         13.246           0.001        16.082           0.000
APARCH t  99         16.140           0.000        5.438            0.066
APARCH    99         1.851            0.396        6.456            0.039
DPS t     95         2.215            0.330        3.270            0.194
DPS       95         16.632           0.000        45.24            0.000
Bates     95         5.720            0.000        44.292           0.000
Heston    95         10.078           0.006        7.269            0.026
HW t      95         6.380            0.041        5.965            0.050
HW        95         8.132            0.017        7.920            0.019
APARCH t  95         21.688           0.000        11.032           0.004
APARCH    95         0.073            0.963        2.567            0.277

Table 10
CAC Index—VaR forecasts through VaR_{t+1}(α) = μ̂_{t+1} + σ̂_{t+1} z_α

                     In-sample (2104 obs.)         Out-of-sample (1500 obs.)
          1−α (%)    Cond. coverage   p-value      Cond. coverage   p-value
DPS t     99         10.751           0.004        4.061            0.131
DPS       99         2.161            0.339        11.512           0.003
Bates     99         1.586            0.452        14.490           0.001
Heston    99         33.003           0.000        12.965           0.001
HW t      99         5.281            0.071        2.064            0.356
HW        99         10.912           0.004        11.512           0.003
APARCH t  99         5.985            0.050        4.061            0.131
APARCH    99         5.394            0.067        5.395            0.067
DPS t     95         21.424           0.000        0.156            0.924
DPS       95         12.718           0.001        3.436            0.179
Bates     95         4.007            0.134        7.176            0.027
Heston    95         15.702           0.000        4.866            0.087
HW t      95         3.055            0.217        2.716            0.257
HW        95         3.555            0.168        7.280            0.026
APARCH t  95         9.510            0.008        2.016            0.364
APARCH    95         0.600            0.740        4.359            0.113


6. Conclusions

In this paper we have studied an efficient MCMC procedure to estimate some non-nested stochastic volatility models. More precisely, we have assessed some convergence properties of the algorithm according to some well known diagnostic tests. The use of an adaptive method increases the efficiency of the estimates for the latent processes, at least in the affine jump diffusion specification. An important advantage of the delayed rejection procedure is that it maintains the properties and the tractability of the plain Metropolis–Hastings algorithm, while avoiding the bad behavior that can arise from an imprecise choice of the proposal distributions. On the other hand, the introduction of further Metropolis–Hastings steps slows down the algorithm if the number of iterations is kept fixed. For these reasons some care has to be taken when the algorithm is planned.

Another important point is related to the use of the auxiliary particle filter to estimate the likelihoods and to forecast the volatility dynamics. These sequential Monte Carlo methods seem to be reliable and also faster, from a computational point of view, than MCMC methods. It would be interesting to apply filtering procedures to make inference for the models analyzed in this paper.

We have applied these Monte Carlo methods to the study of various international indexes. Surprisingly, we frequently observed that the best models are the simpler ones, both in terms of in-sample goodness-of-fit and of forecasting performance. Probably the impact of the jump processes in these data sets is not relevant; the introduction of heavy tails for the error term appears to be more important. For this reason a possible extension of the results obtained includes the study of models based on more flexible stochastic dynamics. Many different alternatives based on Lévy processes are growing in importance. For this purpose, sequential Monte Carlo methods seem to be appropriate for carrying out inference. These methods could be appropriate also for the analysis of derivative assets.

Acknowledgements

Financial support from the "G. Ferrari Fund" (Raggi) and from MIUR, Grant 2002135473 (Raggi and Bordignon) is gratefully acknowledged. We wish to thank Bjørn Eraker, Stuart Coles, Nunzio Cappuccio, Diego Lubian, Fabio Rigat, Alessandro Palandri and Giacomo Chiozza for their helpful comments. We also thank the referees and the Editor, whose comments have resulted in many improvements. All errors are our own.

References

Bakshi, G., Cao, C., Chen, Z., 1997. Empirical performance of alternative option pricing models. J. Finance 52, 2003–2049.
Barndorff-Nielsen, O.E., Mikosch, T., Resnick, S.I., 2001. Lévy Processes: Theory and Applications. Birkhäuser, Boston.
Bates, D., 1996. Jumps and stochastic volatility: exchange rate processes implicit in deutsche mark options. Rev. Financial Stud. 9, 69–107.
Berg, A., Meyer, R., Yu, J., 2004. Deviance information criterion for comparing stochastic volatility models. J. Bus. Econom. Statist. 22, 107–120.


Brooks, S.P., Roberts, G.O., 1998. Convergence assessment techniques for Markov chain Monte Carlo. Stat. Comput. 8, 319–335.
Carlin, B.P., Polson, N.G., Stoffer, D.S., 1992. A Monte Carlo approach to nonnormal and nonlinear state-space modeling. J. Amer. Statist. Assoc. 87, 493–500.
Chernov, M., Gallant, A.R., Ghysels, E., Tauchen, G., 2003. Alternative models for stock price dynamics. J. Econometrics 116, 225–257.
Chib, S., Jeliazkov, I., 2001. Marginal likelihood from the Metropolis–Hastings output. J. Amer. Statist. Assoc. 96, 270–281.
Chib, S., Nardari, F., Shephard, N., 2002. Markov chain Monte Carlo methods for stochastic volatility models. J. Econometrics 108, 281–316.
Christoffersen, P.F., 1998. Evaluating interval forecasts. Internat. Econom. Rev. 39, 841–861.
Christoffersen, P.F., Pelletier, D., 2004. Backtesting value-at-risk: a duration based approach. J. Financial Econometrics 2, 84–108.
Dai, Q., Singleton, K., 2000. Specification analysis of affine term structure models. J. Finance 55, 1943–1978.
Ding, Z., Granger, C.W.J., Engle, R., 1993. A long memory property of stock market returns and a new model. J. Empirical Finance 1, 83–106.
Doornik, J.A., 2001. Ox: An Object-Oriented Matrix Programming Language. Timberlake Consultants Press, London.
Duffie, D., Pan, J., Singleton, K., 2000. Transform analysis and asset pricing for affine jump-diffusions. Econometrica 68, 1343–1376.
Eberlein, E., Kallsen, J., Kristen, J., 2003. Risk management based on stochastic volatility. J. Risk 5, 19–44.
Eraker, B., 2001. MCMC analysis of diffusion processes with application to finance. J. Bus. Econom. Statist. 19, 177–191.
Eraker, B., 2004. Do stock prices and volatility jump? Reconciling evidence from spot and option prices. J. Finance 59, 1367–1403.
Eraker, B., Johannes, M., Polson, N., 2003. The impact of jumps in volatility and returns. J. Finance 58, 1269–1300.
Gelman, A., Rubin, D.B., 1992. Inference for iterative simulation using multiple sequences. Stat. Sci. 7, 457–472.
Geweke, J., 1992. Evaluating the accuracy of sampling-based approaches to calculate posterior moments. In: Bernardo, J., Berger, J., Dawid, A., Smith, A. (Eds.), Bayesian Statistics 4. Oxford University Press, Oxford, pp. 169–193.
Gilks, W.R., Best, N.G., Tan, K.K.C., 1995. Adaptive rejection Metropolis sampling within Gibbs sampling. Appl. Statist. 44, 455–472.
Giot, P., Laurent, S., 2003. Value-at-risk for long and short trading positions. J. Appl. Econometrics 18, 641–664.
Glasserman, P., Merener, N., 2001. Convergence of a discretization scheme for jump-diffusion processes with state-dependent intensities. Working Paper, Columbia University.
Heston, S., 1993. A closed-form solution for options with stochastic volatility with applications to bond and currency options. Rev. Financial Stud. 6, 327–343.
Hull, J., White, A., 1987. The pricing of options on assets with stochastic volatilities. J. Finance 42, 281–300.
Jacquier, E., Polson, N.G., Rossi, P., 1994. Bayesian analysis of stochastic volatility models (with discussion). J. Bus. Econom. Statist. 12, 371–417.
Jacquier, E., Polson, N.G., Rossi, P., 2004. Bayesian analysis of stochastic volatility models with fat-tails and correlated errors. J. Econometrics 122, 185–212.
Johannes, M., Polson, N., Stroud, J., 2003. Nonlinear filtering of stochastic differential equations with jumps. Working Paper, Wharton School, University of Pennsylvania.
Jorion, P., 1997. Value at Risk. McGraw-Hill, New York.
Kim, S.N., Shephard, N., Chib, S., 1998. Stochastic volatility: likelihood inference and comparison with ARCH models. Rev. Econom. Stud. 65, 361–393.
Laurent, S., Peters, J.P., 2002. G@RCH 2.2: an Ox package for estimating and forecasting various ARCH models. J. Economic Surveys 16, 447–485.
Levine, R.A., Yu, Z., Hanley, W.G., Nitao, J.J., 2005. Implementing componentwise Hastings algorithms. Comput. Stat. Data Anal. 48, 363–389.
Maskell, S., Gordon, N., 2001. A tutorial on particle filters for on-line nonlinear/non-Gaussian Bayesian tracking. Working Paper, QinetiQ Ltd.
Mira, A., 2002. On Metropolis–Hastings algorithms with delayed rejections. Working Paper, Università dell'Insubria, Varese.

D. Raggi, S. Bordignon / Computational Statistics & Data Analysis 50 (2006) 1678 – 1699

1699

Niguez, T.M., 2002. Modelling daily value-at-risk using FIGARCH type models. Working Paper, University of Alicante.
Peskun, P.H., 1973. Optimum Monte-Carlo sampling using Markov chains. Biometrika 60, 607–612.
Pitt, M.K., Shephard, N., 1999. Filtering via simulation: auxiliary particle filters. J. Amer. Statist. Assoc. 94 (446), 590–599.
Raftery, A.L., Lewis, S., 1992. How many iterations in the Gibbs sampler? In: Bernardo, J., Berger, J., Dawid, A., Smith, A. (Eds.), Bayesian Statistics 4. Oxford University Press, Oxford, pp. 763–764.
Raggi, D., 2004. Adaptive MCMC methods for inference on discretely observed affine jump diffusion models. Working Paper 1-2004, Dipartimento di Scienze Statistiche, Università degli Studi di Padova.
Smith, B.J., 2004. Bayesian output analysis program (BOA) version 1.1 user's manual. Technical Report, University of Iowa.
Spiegelhalter, D.J., Best, N., Carlin, B., van der Linde, A., 2002. Bayesian measures of model complexity and fit (with comments). J. Roy. Statist. Soc. Ser. B 64, 583–639.
Tierney, L., Mira, A., 1999. Some adaptive Monte Carlo methods for Bayesian inference. Statist. Med. 18, 2507–2515.