MaxVaR for Non-normal and Heteroskedastic Returns∗

Malay Bhattacharyya†
Professor, Quantitative Methods and Information Systems
Indian Institute of Management Bangalore
Bannerghatta Road
Bangalore 560 076, India

Nityanand Misra
Associate, FICCS Core Quant Modeling
Goldman Sachs India Securities Pvt Ltd
Embassy Golf Links Business Park
Bangalore 560 071, India

Bharat Kodase
Senior Analyst, Commodities Structuring
Lehman Brothers Ltd
25 Bank Street
London E14 5LE, UK

December 12, 2007

∗ This work was carried out when Nityanand Misra and Bharat Kodase were graduate students at the Indian Institute of Management Bangalore.
† Corresponding author. Email: [email protected]. Phone: 91-80-26993311. Fax: 91-80-26584050.

Abstract

In this work we propose Monte Carlo simulation models for dynamically computing MaxVaR for a financial return series. This dynamic MaxVaR takes into account the time-varying volatility as well as the non-normality of returns or innovations. We apply this methodology to five stock market indices. To validate the proposed methods we compute the number of MaxVaR violations and compare them with the expected number. We also compute the MaxVaR to VaR ratio and find that, on average, dynamic MaxVaR exceeds dynamic VaR by 5–7% at the 1% significance level, and by 12–14% at the 5% significance level for the selected indices.

Keywords: Risk Management, MaxVaR, GARCH, Pearson Type IV distribution, Monte Carlo simulation.

1 Introduction

One of the most popular risk management techniques is the Value-at-Risk (VaR) method, which evaluates an investor’s exposure to market risk (Hull, 2006). VaR is the maximum loss that may be incurred on a portfolio at the end of a specified period, for a given confidence level. VaR is appropriate for situations where one is certain that the asset will be held till the end of the VaR horizon, or that there will be no cash outflow as a result of asset price movements during the intervening period. However, in the mark-to-market environment of derivatives, it is essential to respond to margin calls when an investor’s position undergoes big losses. If there is a possibility of intermediate trading and cash outflow during the holding period, it is vital to know the risk exposure not just at the end but also during this holding period. The VaR method is a standard tool used by regulatory authorities to determine initial margins and maintenance margins. Margins based on VaR may tend to be a conservative estimate to mitigate counter-party risks in mark-to-market scenarios.

These problems can be addressed by using the MaxVaR measure proposed by Boudoukh et al. (2004). MaxVaR serves the same purpose as VaR, but differs from VaR in that it is an estimate of risk during a specified time period and not just at the end of it. MaxVaR is a more appropriate estimate of market risk than VaR if the holding duration of the asset is uncertain, or if there is a probability of intermediate cash outflows. MaxVaR may prove to be very useful for futures trading, dynamic portfolio rebalancing, American call option writing, etc.

To compute MaxVaR, Boudoukh et al. (2004) make simplifying assumptions such as independence of daily returns, constant mean and constant volatility of the return series, and normality of the residuals. These assumptions fail to capture commonly observed properties of asset returns like heteroskedasticity, negative skewness, and thick tails. As a result, the MaxVaR estimate of Boudoukh et al. (2004) may understate the actual portfolio risk.¹ In this work, we relax these assumptions and develop a dynamic MaxVaR estimate as a risk management tool. In our proposed models, we assume a simple GARCH-based time-varying variance of the returns process. To capture the conditional asymmetry and excess kurtosis of the innovations of the GARCH process, we use the Pearson Type IV distribution.

The organization of this paper is as follows. Section 2 introduces the concept of MaxVaR briefly, with some properties and calculation methods. The Pearson Type IV distribution is summarized in section 3. Section 4 describes our simulation framework for MaxVaR computation. The results of applying our models to five stock indices are presented in section 5. Section 6 concludes with some proposals for future work.

¹ In a heteroskedastic environment, their estimate may also overstate this risk in low volatility periods.

2 MaxVaR: Definition, Properties and Computation

Boudoukh et al. (2004) define MaxVaR as the measure of loss that may be incurred during the holding period (including the last day), for a given confidence level. For example, if a bank has a 10-day MaxVaR of $15 million at 99% confidence level, it means that the probability of incurring a loss of $15 million or more during the next 10 days is less than 1%. Burckhardt (2005) explores some properties of MaxVaR, and observes the following:

1. All other parameters held constant, an increase in expected return (µ) leads to an increase in the MaxVaR to VaR ratio, although both MaxVaR and VaR individually decrease.
2. Everything else remaining constant, an increase in volatility (σ) also leads to an increase in the MaxVaR to VaR ratio, while both MaxVaR and VaR increase.
3. An increase in time horizon (T) increases the MaxVaR to VaR ratio. As expected, both MaxVaR and VaR individually increase.
4. Differences between MaxVaR and VaR become smaller at higher confidence levels.

Boudoukh et al. (2004) estimate MaxVaR for a hypothetical asset, the returns of which are independent and normally distributed with constant mean and volatility. They derive an expression for MaxVaR for a continuous price process based on the first passage time of a Brownian motion with drift.² Discrete MaxVaR³ is higher than continuous MaxVaR, since discrete sampling may miss the minima of a continuous process. Due to the lack of a similar expression in the discrete case, simulation is perhaps the only method to compute discrete MaxVaR.⁴

To the best of our knowledge, the MaxVaR estimate has not hitherto been applied to a real-life scenario. The observation of volatility clustering in asset returns invalidates the assumption of constant volatility and independence of returns. In most markets, the assumption of normality for returns is also not found to hold, owing to the observation of leptokurtosis, or fat tails. To get a precise estimate of MaxVaR for a real-life scenario, one has to address heteroskedasticity, serial dependence and non-normality of returns.
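The difference between VaR and discrete MaxVaR under the simplifying assumptions of Boudoukh et al. (2004) can be illustrated with a short simulation. The following Python sketch is not the authors' code: the function name, the default number of paths and the parameter values in the usage note are illustrative assumptions. It draws independent normal daily returns with constant mean and volatility, records both the end-of-horizon cumulative return and the lowest cumulative return along each path, and reads off the same percentile from each.

```python
import random


def simulate_var_and_maxvar(mu, sigma, horizon=10, p=0.01, n_paths=20000, seed=0):
    """Return (VaR, MaxVaR) as p-th percentile cumulative returns.

    Daily returns are i.i.d. N(mu, sigma^2), i.e. the static setting of
    Boudoukh et al. (2004). Losses appear as negative numbers.
    """
    rng = random.Random(seed)
    end_returns, min_returns = [], []
    for _ in range(n_paths):
        cumulative, lowest = 0.0, float("inf")
        for _ in range(horizon):
            cumulative += rng.gauss(mu, sigma)
            lowest = min(lowest, cumulative)  # minimum over days 1..horizon
        end_returns.append(cumulative)
        min_returns.append(lowest)
    end_returns.sort()
    min_returns.sort()
    index = max(int(p * n_paths) - 1, 0)  # empirical p-th percentile
    return end_returns[index], min_returns[index]
```

For example, `simulate_var_and_maxvar(0.0005, 0.01)` returns a pair in which the MaxVaR figure is never above the VaR figure, because the lowest cumulative return on every path is at most its end-of-horizon value.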

² The expression is not a closed form solution for an asset’s MaxVaR, but a numerical approach like the Newton-Raphson method can be used to solve for MaxVaR.
³ A financial institution would want to compute discrete MaxVaR in an end-of-day mark-to-market scenario.
⁴ Boudoukh et al. (2004) use 50,000 Monte Carlo simulations to achieve this.

3 The Pearson Type IV distribution

In ARCH and GARCH models (Engle, 2001) the log return at time t, $r_t$, is modeled as

$$r_t = \mu_t + \epsilon_t = \mu_t + \sqrt{h_t}\,\eta_t \tag{1}$$

where $\mu_t$ and $h_t$ represent respectively the mean and variance conditional upon the information up to time t − 1, $\epsilon_t$ is the innovation (or shock) at time t and $\eta_t$ are i.i.d. standardized innovations. The presence of conditional skewness (Grigoletto and Lisi, 2006) and conditional excess kurtosis suggests the non-normal nature of the innovations $\eta_t$.

Several distributions have been used in the literature to model $\eta_t$ in equation (1). Verhoeven and McAleer (2004) try nine different distributions: Normal, (symmetric and asymmetric) Student-t, (symmetric and asymmetric) generalized t-distribution, (symmetric and asymmetric) generalized error distribution, Pearson Type IV and Gram-Charlier. Yan (2005) compares the results of modeling the distribution of $\eta_t$ as Hyperbolic, Pearson Type IV and Johnson $S_U$. A mixture of scaled and shifted normal distributions (Alexander and Lazar, 2006) and the use of Box-Cox transforms have also been suggested elsewhere. We choose the Pearson Type IV distribution for $\eta_t$ in our models, because not only can it account for a very wide range of skewness and kurtosis values among the known distributions in the literature (Johnson and Kotz, 1970; Yan, 2005),⁵ but its functional form and properties are also very well documented (Nagahara, 1999; Heinrich, 2004).

The Pearson family of curves (Johnson and Kotz, 1970) encompasses a wide range of distributions such as Normal, Student-t, Gamma, Beta, F, Pareto, Inverse Gaussian etc. The family of Pearson curves satisfies the following differential equation for the probability density function (PDF) f(x)

$$\frac{1}{f(x)}\frac{df(x)}{dx} = -\frac{a_0 + x}{c_0 + c_1 x + c_2 x^2} \tag{2}$$

Depending on the values of $a_0$, $c_0$, $c_1$ and $c_2$, f(x) turns out to be Normal, Gamma, Beta or another distribution. As an example, when $c_1$ and $c_2$ are zero, f(x) becomes a normal distribution. For the Pearson Type IV distribution, $c_1$ and $c_2$ are both non-zero and the roots of the quadratic equation in the denominator are complex. Following the notation in Heinrich (2004), the functional form of the Pearson Type IV PDF is

$$f(x)\,dx = k\left[1 + \left(\frac{x-\lambda}{a}\right)^2\right]^{-m} \exp\left[-\nu \tan^{-1}\left(\frac{x-\lambda}{a}\right)\right] dx \tag{3}$$

where −∞ < x < ∞. m, ν, a and λ are real parameters (functions of $a_0$, $c_0$, $c_1$ and $c_2$) with m > 1/2 and, by convention, a > 0. k is the normalizing constant whose value depends on m, ν and a.

The Type IV distribution can be considered as a skewed version of the Pearson Type VII distribution (Nagahara, 1999). The parameter m is related to the kurtosis of the distribution. Small values of m indicate longer and heavier tails. ν is related to the skewness: the distribution is positively or negatively skewed depending on whether ν is negative or positive. a and λ are the scale and the location parameters respectively. The Pearson Type IV distribution is unimodal, as can be seen from equation (3). The parameters m, ν, a and λ are uniquely related to the first four moments of the distribution (refer to section 4.2.2).

⁵ The Johnson $S_U$ distribution can model a slightly wider range, see Yan (2005).
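As an illustration of equation (3), the density can be evaluated with a few lines of Python. This is a minimal sketch, not the C code of Heinrich (2004) used in this work: instead of the gamma-function expression for k, it obtains the normalizing constant by numerical integration after the substitution x = λ + a tan θ, which turns the integral of f(x) into k·a times the integral of cos(θ)^(2m−2) exp(−νθ) over (−π/2, π/2). The function names, the Simpson step count and the restriction to m ≥ 1 (which keeps the integrand bounded at the end points) are assumptions of this sketch.

```python
import math


def pearson4_norm_const(m, nu, a, n_steps=2000):
    """Normalizing constant k of equation (3), by composite Simpson's rule.

    Uses x = lam + a*tan(theta), so that the integral of f(x) dx equals
    k * a * integral of cos(theta)**(2m-2) * exp(-nu*theta) on (-pi/2, pi/2).
    """
    if m < 1.0 or a <= 0.0:
        raise ValueError("this sketch assumes m >= 1 and a > 0")
    if n_steps % 2:
        n_steps += 1  # Simpson's rule needs an even number of intervals
    lo = -math.pi / 2.0
    step = math.pi / n_steps

    def integrand(theta):
        c = max(math.cos(theta), 0.0)  # guard against tiny negative round-off
        return c ** (2.0 * m - 2.0) * math.exp(-nu * theta)

    total = integrand(lo) + integrand(lo + n_steps * step)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(lo + i * step)
    return 1.0 / (a * total * step / 3.0)


def pearson4_pdf(x, m, nu, a, lam, k=None):
    """Pearson Type IV density of equation (3) at the point x."""
    if k is None:
        k = pearson4_norm_const(m, nu, a)
    z = (x - lam) / a
    return k * (1.0 + z * z) ** (-m) * math.exp(-nu * math.atan(z))
```

With m = 1 and ν = 0 the density reduces to the Cauchy distribution, so `pearson4_norm_const(1.0, 0.0, 1.0)` should agree with 1/π; this gives a quick sanity check of the integration.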

4 Dynamic MaxVaR model

In this section we describe our simulation models for dynamic MaxVaR. The models are developed using a MATLAB 7.3 system. Pearson Type IV normalizing constant computation and Type IV random number generation are implemented in C. The entire MATLAB and C source code is freely available under the GNU General Public License at http://nmisra.googlepages.com/, with sample data and results files.

4.1 Models for conditional mean and variance

We choose a constant mean model for the conditional mean, and a GARCH(1,1) model for the conditional variance. The $\mu_t$ and $h_t$ terms in equation (1) become

$$\mu_t = C \tag{4}$$

$$h_t = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 h_{t-1} \tag{5}$$

The reason behind choosing these simple models is to keep the number of parameters to a minimum. In addition, these models are found to be appropriate for all the five indices used to test the model (refer to section 5.1), with both Gaussian and Pearson Type IV assumptions for $\eta_t$. Advanced models for the conditional mean like AR(1) and ARMA(1,1) do not result in any significant improvement in terms of the log-likelihood (LL) and the Akaike and Schwarz Information Criteria (AIC and SIC). Moreover, for some return series, the AR and MA coefficients for the AR(1) and ARMA(1,1) models are found to be significant for the GARCH-N model (section 4.2.1), but insignificant for the GARCH-PIV model (section 4.2.3).

The GARCH(1,1) model may be replaced by more advanced models such as a GARCH-GJR, EGARCH or IGARCH process for the conditional variance. This work focusses on the use of the Pearson Type IV distribution to model the innovation distribution, so we keep our ARMA-GARCH processes simple.

4.2 Models for innovations

4.2.1 GARCH-N Model

In this model, $\eta_t$ in equation (1) is assumed to follow the standard normal distribution. The PDF for the innovations $\epsilon_t$ can be written from the standard normal density function, with the second argument of N denoting the standard deviation:

$$\eta_t \sim N(0, 1) \;\Rightarrow\; \epsilon_t \sim N\left(0, \sqrt{h_t}\right) \tag{6}$$

$$f(\epsilon_t) = \frac{1}{\sqrt{2\pi h_t}} \exp\left(-\frac{\epsilon_t^2}{2 h_t}\right) \tag{7}$$

The LL is given by $L = \sum_t l_t$, where

$$l_t = -\frac{1}{2}\left[\log(2\pi h_t) + \frac{\epsilon_t^2}{h_t}\right] \tag{8}$$

The maximum likelihood estimates (MLEs) for C, $\alpha_0$, $\alpha_1$ and $\beta_1$ are computed by minimizing the negative of the LL using the MATLAB function fmincon. The algorithm chosen is Sequential Quadratic Programming (SQP), a generalization of Newton’s method. After obtaining a set of parameter estimates from the SQP algorithm, the innovations $\hat{\epsilon}_t$ and the conditional variances $\hat{h}_t$ are computed using inverse filtering and the GARCH recursive equations (4) and (5).⁶ The LL can then be easily computed.
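The objective function passed to the optimizer can be sketched as follows. This Python fragment is an illustration under stated assumptions rather than the MATLAB code used in this work: the function name and argument order are hypothetical, the conditional variance is initialized with the sample mean of the squared innovations, and the minimization itself (fmincon with SQP in the paper) is left to whatever constrained optimizer is available.

```python
import math


def garch_n_negloglik(params, returns):
    """Negative log-likelihood of the constant-mean GARCH(1,1)-N model.

    params = (C, alpha0, alpha1, beta1); returns is a sequence of log returns.
    """
    c, alpha0, alpha1, beta1 = params
    eps = [r - c for r in returns]              # inverse filtering with mu_t = C, eq. (4)
    h = sum(e * e for e in eps) / len(eps)      # initial variance: mean of eps^2 (assumption)
    nll = 0.0
    for e in eps:
        nll += 0.5 * (math.log(2.0 * math.pi * h) + e * e / h)  # minus l_t of eq. (8)
        h = alpha0 + alpha1 * e * e + beta1 * h                 # GARCH recursion, eq. (5)
    return nll
```

Setting `alpha1 = beta1 = 0` and `alpha0` equal to the sample variance of the innovations makes the conditional variance constant, in which case the function returns the ordinary Gaussian negative log-likelihood; this is a convenient check before handing the function to an optimizer.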

4.2.2 GARCH-N-PIV Model

In this model the distribution of $\eta_t$ in equation (1) is assumed to be Pearson Type IV. We use the Pseudo Maximum Likelihood (PML) method and estimate the model coefficients in two steps. First, we estimate the mean and GARCH parameters of the return series assuming a standard normal distribution for $\eta_t$, as in the previous model. Using these PML estimates (PMLEs),⁷ we obtain the estimates of the standardised residuals $\hat{\eta}_t$ and fit a Pearson Type IV distribution to $\hat{\eta}_t$ by the method of moments (MOM). The PML method yields consistent and asymptotically normal estimators (Gouriéroux, 1997). It is used by McNeil and Frey (2000) to fit a GARCH(1,1) model assuming a normal distribution for $\eta_t$ and later model the tail of $\hat{\eta}_t$ (estimated innovations) using Extreme Value Theory.

⁶ The expected value of $\hat{\epsilon}_t^2$ is used to initialize $\hat{h}_t$.
⁷ Note that the PMLEs for C, $\alpha_0$, $\alpha_1$ and $\beta_1$ in this model are the same as the MLEs for the respective parameters under the GARCH-N model.

The MOM estimates the Pearson Type IV parameters r, ν, a and λ by matching the first four empirical and implied moments, using the following equations (Heinrich, 2004),

$$r = 2(m - 1) = \frac{6(\beta_2 - \beta_1 - 1)}{2\beta_2 - 3\beta_1 - 6} \tag{9}$$

$$\nu = -\frac{r(r - 2)\sqrt{\beta_1}}{\sqrt{16(r - 1) - \beta_1 (r - 2)^2}} \tag{10}$$

$$a = \frac{\sqrt{\mu_2 \left[16(r - 1) - \beta_1 (r - 2)^2\right]}}{4} \tag{11}$$

$$\lambda = \langle x \rangle - \frac{(r - 2)\sqrt{\beta_1}\sqrt{\mu_2}}{4} \tag{12}$$

where $\langle x \rangle$, $\mu_2$, $\sqrt{\beta_1}$ and $\beta_2$ are respectively the mean, variance, skewness and kurtosis of $\hat{\eta}_t$. The normalizing constant k is computed from the values of m, ν and a using the C code in Heinrich (2004). One of the reasons for using this model and employing PML and MOM to estimate model parameters is that the parameters can be easily estimated, as shall be seen in later sections.
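The moment-matching step amounts to a direct evaluation of the Pearson Type IV moment relations of Heinrich (2004), in which the factor (r − 2) appears in the expressions for ν and λ. The Python sketch below is illustrative only: the function name is hypothetical, the skewness argument is the signed quantity whose square is β1, and the validity check simply rejects moment combinations for which the square roots are undefined.

```python
import math


def pearson4_from_moments(mean, var, skew, kurt):
    """Method-of-moments Pearson Type IV parameters (m, nu, a, lam).

    skew is the signed skewness (its square is beta1), kurt is beta2.
    """
    beta1 = skew * skew
    beta2 = kurt
    denom = 2.0 * beta2 - 3.0 * beta1 - 6.0
    if denom <= 0.0:
        raise ValueError("moments are outside the Pearson Type IV region")
    r = 6.0 * (beta2 - beta1 - 1.0) / denom
    m = 1.0 + r / 2.0                          # from r = 2(m - 1)
    disc = 16.0 * (r - 1.0) - beta1 * (r - 2.0) ** 2
    if disc <= 0.0:
        raise ValueError("moments are outside the Pearson Type IV region")
    nu = -r * (r - 2.0) * skew / math.sqrt(disc)
    a = math.sqrt(var * disc) / 4.0
    lam = mean - (r - 2.0) * skew * math.sqrt(var) / 4.0
    return m, nu, a, lam
```

As a check, a symmetric input with unit variance and kurtosis 4.5, `pearson4_from_moments(0.0, 1.0, 0.0, 4.5)`, gives m = 4.5, ν = 0, a = √6 and λ = 0, which is a scaled Student-t distribution with 8 degrees of freedom.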

4.2.3 GARCH-PIV Model

In this model, we assume a Pearson Type IV distribution for $\eta_t$ in equation (1), with the parameters m, ν, a and λ. As the normalizing constant k is inversely proportional to the scale parameter a,⁸

$$\eta_t \sim PIV(k, m, \nu, a, \lambda) \;\Rightarrow\; \epsilon_t \sim PIV\left(\frac{k}{\sqrt{h_t}}, m, \nu, a\sqrt{h_t}, \lambda\sqrt{h_t}\right) \tag{13}$$

From equation (3), the PDF for $\epsilon_t$ can be written as

$$f(\epsilon_t) = \frac{k}{\sqrt{h_t}} \left[1 + \left(\frac{\epsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right)^2\right]^{-m} \exp\left[-\nu \tan^{-1}\left(\frac{\epsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right)\right] \tag{14}$$

MLEs of C, $\alpha_0$, $\alpha_1$, $\beta_1$, r, ν, a and λ are computed. The LL is given by $L = \sum_t l_t$, where

$$l_t = \log k - \frac{1}{2}\log h_t - m \log\left[1 + \left(\frac{\epsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right)^2\right] - \nu \tan^{-1}\left(\frac{\epsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right) \tag{15}$$

As in the GARCH-N model, we use the fmincon function with the SQP algorithm to minimize the negative of the LL. The MLEs from the GARCH-N model and the MOM coefficients from the GARCH-N-PIV model are used as the initial estimates for the maximum likelihood (ML) method.⁹ Inverse filtering, GARCH recursion and initialization of $\hat{h}_t$ are the same as in the GARCH-N model.

The location parameter λ is required for consistency of the MLEs, as pointed out by Newey and Steigerwald (1997). It is noteworthy that Premaratne and Bera (2001) do not estimate $\alpha_0$, while Verhoeven and McAleer (2004) do not estimate a. We look at the t-statistics of the estimates and also the LL, AIC and SIC values to decide whether or not to include a and/or λ in the model. If the estimates of a and/or λ turn out to be insignificant, we exclude a and/or λ from estimation and re-estimate the model, again checking the significance, LL, AIC and SIC values. When not estimated, a is set equal to 1 and λ is set equal to 0.

⁸ The exact relation is given by $k = \dfrac{2^{2m-2}\,|\Gamma(m + i\nu/2)|^2}{\pi a\,\Gamma(2m - 1)}$, where $i = \sqrt{-1}$ (Heinrich, 2004).
⁹ Heinrich (2004) notes that “. . . [the method of moments] is not really adequate in many cases, but may be used to provide starting values to a maximum likelihood fitter.”

4.3 Monte Carlo simulation to compute MaxVaR

The three models, GARCH-N, GARCH-N-PIV and GARCH-PIV, are fitted to a return series $r_t$. The conditional variance series $\hat{h}_t$ and the innovation series $\hat{\epsilon}_t$ are obtained from the constant mean and GARCH estimates. We calculate the 10-day MaxVaR at the end of day t as follows:

1. Ten random deviates are generated. The distributions used are N(0, 1) for the GARCH-N model, and PIV(k, m, ν, a, λ) for the GARCH-N-PIV and GARCH-PIV models. To generate Pearson Type IV deviates, the C code in Heinrich (2004) is used. This code requires uniform random deviates, for which we use ran3, a C implementation of Knuth’s subtractive method (Press et al., 1992).

2. Taking these deviates as standardized innovations ($\hat{\eta}_{t+1}$ to $\hat{\eta}_{t+10}$), a 10-day path for the return is simulated, starting from $r_t$, $\hat{h}_t$ and $\hat{\epsilon}_t$. Using the recursive equations (4) and (5), $\hat{\mu}_{t+i}$ and $\hat{h}_{t+i}$ can be computed from $\hat{r}_{t+i-1}$, $\hat{h}_{t+i-1}$ and $\hat{\eta}_{t+i-1}$. Equation (1) can then be used to get $\hat{r}_{t+i}$ from $\hat{\mu}_{t+i}$, $\hat{h}_{t+i}$ and $\hat{\eta}_{t+i}$. The lowest cumulative return along the path, $\min_i \left( \sum_{j=t+1}^{t+i} \hat{r}_j \right)$ for i = 1, 2, ..., 10, is recorded.

3. On generating N such paths, we get N possible values of the lowest cumulative return during the days t + 1 to t + 10. We take the pth percentile from these N values as the 10-day p% MaxVaR.
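The three steps above can be sketched in Python as follows. This is a simplified illustration, not the MATLAB and C implementation used in this work: the function name and its arguments are hypothetical, standard normal deviates from Python's `random` module stand in for the innovation generator, and the optional `draw` argument is only a placeholder through which a Pearson Type IV generator could be supplied.

```python
import random


def garch_maxvar(c, alpha0, alpha1, beta1, h_t, eps_t, p=0.05,
                 horizon=10, n_paths=10000, draw=None, seed=0):
    """p-th percentile of the lowest cumulative return over the horizon.

    h_t and eps_t are the fitted conditional variance and innovation on day t.
    draw() must return one standardized innovation; N(0, 1) is the default.
    """
    rng = random.Random(seed)
    if draw is None:
        draw = lambda: rng.gauss(0.0, 1.0)
    lows = []
    for _ in range(n_paths):
        h, eps = h_t, eps_t
        cumulative, lowest = 0.0, float("inf")
        for _ in range(horizon):
            h = alpha0 + alpha1 * eps * eps + beta1 * h  # eq. (5)
            eps = math_sqrt(h) * draw()                  # innovation = sqrt(h) * eta
            cumulative += c + eps                        # eq. (1) with mu = C, eq. (4)
            lowest = min(lowest, cumulative)
        lows.append(lowest)
    lows.sort()
    return lows[max(int(p * n_paths) - 1, 0)]


def math_sqrt(value):
    """Square root via exponentiation, kept local so the sketch has one import."""
    return value ** 0.5
```

Because the same seed reproduces the same set of paths, calling the function with p = 0.01 and p = 0.05 on identical inputs yields a 1% MaxVaR that is never above the 5% MaxVaR, which is a simple consistency check on the percentile step.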

4.4 In-sample and out-of-sample testing

We divide the return series into two parts, in-sample and out-of-sample. We compute MaxVaRs for the in-sample points as described above. For every out-of-sample point, we take the returns from the previous W days, $r_{t-W+1}$ to $r_t$, and re-estimate the model parameters using ML or PML and MOM. Simulations are now run using these new parameters, and MaxVaR is calculated as before.

Different values of W can be tried on past data to determine the right window size. We note that a large value of W implies very gradual changes, while a small value of W may present parameter estimation problems with the ML method. We have observed that for the GARCH-N and GARCH-N-PIV models, 500 data points are usually sufficient for convergence of the ML and PML methods, respectively. In contrast, in the case of the GARCH-PIV model, we run into convergence problems and some ML coefficients are found to be insignificant if fewer than 1000 data points are used.
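The moving-window bookkeeping can be made explicit with a small generator. The sketch below is only one possible indexing convention and is not taken from the authors' code: the function name is hypothetical, indices are zero-based, and the window is assumed to be no longer than the in-sample part so that every out-of-sample point has a full window of W returns ending at that point.

```python
def moving_windows(returns, n_in_sample, window):
    """Yield (t, returns[t-window+1 .. t]) for every out-of-sample index t."""
    if window > n_in_sample:
        raise ValueError("window must not exceed the in-sample length")
    for t in range(n_in_sample, len(returns)):
        # the W most recent returns up to and including day t
        yield t, returns[t - window + 1 : t + 1]
```

Each yielded slice would then be passed to the estimation routine of the chosen model before the simulation of section 4.3 is rerun for that day.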

5 Results

5.1 Data and model parameters

We test the model on five stock indices: the All Ordinaries Index (AORD), the Dow Jones Industrial Average (DJIA), the CAC 40 index (FCHI), the FTSE 100 index (FTSE 100), and the Swiss Market Index (SSMI). The daily closing prices for these indices have been obtained from http://finance.yahoo.com. The individual return series are summarized in Table 1. The sample data include the dot com crash of 2000 for all indices and also the 1987 Black Monday crash in the case of FTSE 100. For some other stock indices, we have found that the ML (PML) method converges with significant coefficients in the case of the GARCH-N (GARCH-N-PIV) model, but for the GARCH-PIV model we get insignificant coefficients. This shows that the GARCH-N and the GARCH-N-PIV models can be applied to a larger number of return series than the GARCH-PIV model.

Table 2 shows the test statistics proposed by Bai and Ng (2005) for unconditional skewness, kurtosis and normality for the five return series. At the 5% significance level, we find that the null hypothesis of symmetry is not rejected for any of the return series. A similar result is obtained by Grigoletto and Lisi (2006) for 8 out of 9 return series. Kurtosis tests for all series show significant excess kurtosis at the 5% significance level. The normality hypothesis is rejected for three out of five series by both normality tests, not rejected by one test for AORD, and not rejected by both tests for FTSE 100. These test statistics suggest lack of unconditional skewness,¹⁰ but significant excess kurtosis in all return series.

We fit the three models described in section 4.2 to the in-sample part of each series. The exact in-sample length is taken after some trial-and-error, ensuring that the GARCH-PIV MLEs are all significant at the 5% level. The significance tests are one-sided for $\alpha_0$, $\alpha_1$ and $\beta_1$ since they are all defined to be positive. The test for r is also one-sided because by definition r > −1.¹¹ The tests for C, a, ν and λ are two-sided.¹² Table 3 shows the model fit statistics for in-sample data. The MLEs for GARCH-N and GARCH-PIV (with t-statistics), and the MOM estimates for GARCH-N-PIV models are shown in Table 4.¹³ As one of the reviewers points out, since the GARCH parameter estimates are sensitive to outliers, a leptokurtic distribution for $\eta_t$ should reduce the effect of outliers on parameter estimates. The smaller estimates for $\alpha_0$ in Table 4 for the GARCH-PIV model compared to those for the GARCH-N model confirm this. The location parameter λ turns out to be insignificant for CAC 40 and DJIA, so we set it to 0 and re-compute the parameters using ML.

¹⁰ However, conditional skewness may still be significant even if unconditional skewness is not; refer to Grigoletto and Lisi (2006).
¹¹ m is defined to be greater than 1/2. Since r = 2(m − 1), r must be greater than −1.
¹² Although a is defined to be positive, the null hypothesis here is a = 1, against the alternate hypothesis a ≠ 1. Therefore, the test is two-sided.
¹³ The PMLEs for the GARCH-N-PIV model are not shown in Table 4; they are equal to the respective MLEs for the GARCH-N model.

The empirical standardized residuals from the GARCH-N model ($\hat{\eta}_t$) do not follow a normal distribution, as shown by the K-S statistics for the GARCH-N model in Table 5. Table 5 shows the descriptives (both implied and empirical) of the in-sample standardized residuals obtained from the GARCH-N and GARCH-PIV models. The GARCH-N standardized residuals have mean and variance very close to 0 and 1 respectively, but exhibit negative skewness and excess kurtosis. The GARCH-PIV standardized residuals have the first two moments as implied by the MLEs, but the implied skewness and kurtosis are different from the empirical values. The only exception is FTSE 100, where the implied skewness is almost equal to the empirical value.¹⁴ However, the p-values from the Kolmogorov-Smirnov (K-S) test show that the ML fit of the GARCH-PIV model is better than the MOM fit of the GARCH-N-PIV model, although the MOM method matches all four empirical and implied moments.

We use N = 10,000 simulations to calculate MaxVaRs for the three models, and call them GARCH-N MaxVaR, GARCH-N-PIV MaxVaR and GARCH-PIV MaxVaR accordingly. After computing in-sample MaxVaRs, we use a moving window with the same length as the in-sample data to compute MaxVaRs for out-of-sample data, i.e., W is set to 3500 for AORD, CAC 40 and DJIA, 4500 for FTSE 100, and 3550 for SSMI. These large values for W ensure that the ML method for the GARCH-PIV model converges (refer to section 4.4) at every out-of-sample point.¹⁵ To reduce computational effort, we estimate the same model at every out-of-sample point as the one chosen for the in-sample data. For example, for CAC 40 and DJIA the location parameter λ, found to be insignificant for the in-sample data, is not estimated at any of the out-of-sample points.

We also compute MaxVaR using 10,000 simulations with the constant mean and variance assumption, as in Boudoukh et al. (2004), from the MLEs of mean and variance of the in-sample or the moving window data. We call this estimate Boudoukh MaxVaR. Computing all four MaxVaRs on an IBM Thinkpad R51 laptop¹⁶ with MATLAB 7.0 (R14) takes around 45 minutes for 3500 in-sample points and around 95 minutes for the 533 out-of-sample points for the DJIA index. This corresponds to less than 1 second for every in-sample point and 10-11 seconds for every out-of-sample point.¹⁷

¹⁴ Premaratne and Bera (2001) report in their results for NYSE equal weighted returns that the empirical and implied skewness values are almost equal for the Type IV-AR-GARCH model, but the kurtosis values are not.
¹⁵ For every index, we use the same value of W in all three models (GARCH-N, GARCH-N-PIV and GARCH-PIV), so that the results from the three models can be compared.
¹⁶ The system has a 1.6 GHz Intel Pentium M Processor and 512 MB of RAM.
¹⁷ Most of this time (10-11 seconds) for each out-of-sample point is taken up by the ML estimation for the GARCH-N and GARCH-PIV models.

5.2 MaxVaR violations and model comparisons

The four MaxVaR values are then checked for violations against the minimum cumulative return over the next 10 days for every in-sample and out-of-sample point. The numbers of violations of in-sample and out-of-sample MaxVaR estimates for three values of p are shown in Table 6. As an example, Figure 1 shows the 5% MaxVaR values, with the actual minimum cumulative return over the next 10 days for the DJIA index. The vertical black line separates the in-sample data from the out-of-sample data. The graphs (not included) for other indices are more or less similar.

The numbers of in-sample violations in Table 6 for p = 1% and p = 2.5% show that GARCH-N, GARCH-N-PIV and GARCH-PIV MaxVaRs are superior to Boudoukh MaxVaR in modeling the tail of the innovations distribution. For p = 5%, Boudoukh MaxVaR performs quite well for CAC 40 and FTSE 100, but underestimates the MaxVaR in the other cases. For out-of-sample data, Boudoukh MaxVaR performs well only for AORD; for other indices it is mostly an overestimate. GARCH-N-PIV MaxVaR gives the best results for AORD, and GARCH-PIV MaxVaR for CAC 40 and FTSE 100. In the case of DJIA, all in-sample MaxVaR estimates underestimate the risk for p = 1%. However, the in-sample and out-of-sample violations for GARCH-N-PIV MaxVaR and GARCH-PIV MaxVaR, respectively, are very close to the expected numbers. This shows the superiority of the dynamic MaxVaR estimates. No measure performs well for the SSMI index, although the number of violations for the GARCH-PIV MaxVaR is closest to the expected number.

To statistically test if our observed MaxVaR violations are different from the expected number, we note that the number of violations N in a sample of size T is distributed binomially as N ∼ B(T, p). Following Angelidis et al. (2004), under the null hypothesis N/T = p, the following likelihood ratio statistic asymptotically follows a chi-square distribution with one degree of freedom

$$2 \ln\left[\left(1 - \frac{N}{T}\right)^{T-N} \left(\frac{N}{T}\right)^{N}\right] - 2 \ln\left[(1 - p)^{T-N} p^{N}\right] \sim \chi^2(1) \tag{16}$$
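The statistic in equation (16) is straightforward to evaluate. The following Python sketch uses a hypothetical function name and treats 0·ln 0 as zero so that samples with no violations, or only violations, do not cause a domain error; the value is then compared with the χ²(1) critical value, which is 3.841 at the 5% level.

```python
import math


def violation_lr_stat(n_viol, n_obs, p):
    """Likelihood ratio statistic of equation (16) for n_viol violations in n_obs points."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")

    def loglik(q):
        # binomial log-likelihood kernel, with 0 * log(0) taken as 0
        value = 0.0
        if n_obs - n_viol > 0:
            value += (n_obs - n_viol) * math.log(1.0 - q)
        if n_viol > 0:
            value += n_viol * math.log(q)
        return value

    return 2.0 * loglik(n_viol / n_obs) - 2.0 * loglik(p)
```

For instance, 35 violations in 3500 points at p = 1% match the expected number exactly and give a statistic of zero, whereas 70 violations give a value far above 3.841, so the null hypothesis N/T = p would be rejected at the 5% level.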

The test statistics corresponding to the number of violations in Table 6 are shown in Table 7. Tables 6 and 7 show that although no single model uniformly dominates another, the GARCH-PIV and GARCH-N-PIV models perform better than the GARCH-N model in most cases. Also, the GARCH-N MaxVaR estimate outperforms Boudoukh’s static estimate, as expected. An attractive feature of the GARCH-N-PIV model is that its parameters can be estimated with relative computational ease and with fewer data points compared to those of the GARCH-PIV model.

Although the in-sample performance of Boudoukh MaxVaR is as expected, it seems counterintuitive that it should be an overestimate for most out-of-sample points. Figure 1 shows that the out-of-sample period returns are relatively less volatile than the in-sample returns.¹⁸ As a result, if the moving window size is large, the variance used to compute Boudoukh MaxVaR will be much higher than the conditional variance of the GARCH models, and so Boudoukh MaxVaR tends to be an overestimate.

¹⁸ The out-of-sample data consists of returns from November 2004 onwards (October 2002 onwards for FTSE 100), while the dot com bubble (with the dot com crash) and the early 2000s recession are a part of the in-sample data.

5.3 MaxVaR to VaR ratios

To assess by how much the traditional VaR underestimates risk in a mark-to-market environment, we observe the MaxVaR to VaR ratio. Boudoukh et al. (2004) state that MaxVaR may exceed VaR by over 40% for reasonable assumptions, and by as much as 80% for high Sharpe Ratio portfolios. We compare our dynamic MaxVaR to dynamic VaR, since it would not make sense to compare a dynamic measure with a static one. Using the same algorithm with 10,000 simulations as described in section 4.3, we compute dynamic VaR using the GARCH-PIV model.¹⁹ We call this VaR estimate GARCH-PIV VaR.

For FTSE 100, the numbers of violations for GARCH-PIV VaR are 49, 123 and 232 for in-sample data and 15, 30 and 53 for out-of-sample data for p = 1%, 2.5% and 5%, respectively. These are very close to the expected numbers for FTSE 100, shown in Table 6.²⁰ This implies that both GARCH-PIV MaxVaR and GARCH-PIV VaR are reasonably good estimates of actual MaxVaR and VaR, respectively, for FTSE 100. The average MaxVaR to VaR ratios for FTSE 100 are 1.069, 1.097 and 1.138 for p = 1%, 2.5% and 5%, respectively. We find that for the five indices considered, MaxVaR exceeds VaR by 5–7% at the 1% significance level, and by 12–14% at the 5% significance level.

Figure 2 plots the MaxVaR to VaR ratio and the conditional standard deviation ($\sigma_t$) versus time, for p = 5%. The ratio is high in low volatility periods and low in high volatility periods.²¹ This implies that the use of a single multiplier, as suggested by Boudoukh et al. (2004), may not be sufficient to compute MaxVaR from VaR. Graphs for other indices (not included) show similar results.

¹⁹ The pth percentile of the distribution of cumulative return at the horizon is taken as the dynamic VaR.
²⁰ The expected number of violations is the same for VaR and MaxVaR at a particular significance level.
²¹ Boudoukh et al. (2004) also find that the adjustment factor decreases with an increase in standard deviation (σ).

6 Conclusions

In this work we adapt the concept of MaxVaR proposed by Boudoukh et al. (2004) to real financial return series, characterized by volatility clustering, asymmetry and leptokurtosis. We use a GARCH model for conditional variance with a Pearson Type IV distribution for the standardized residuals. With the moving window feature of our model, a financial institution can compute MaxVaR dynamically for real-life risk assessment and control. We also show that the MaxVaR to VaR ratio is higher in periods of low volatility and lower in periods of high volatility. This implies that an ad-hoc measure like multiplying the traditional VaR by an adjustment factor can underestimate or overestimate the continuous risk in low or high volatility periods, respectively.

Our model gives robust results for some indices, and can easily be modified to suit other financial return series for superior performance. Some of the possible improvements to our model are trivial, like re-estimating the best model (by varying window size and/or number of parameters) at every out-of-sample point, or increasing the number of simulations.

Some other alternatives that can be used to build further upon the model are:

1. Higher order ARMA and GARCH models or advanced GARCH models like EGARCH, IGARCH or GARCH-GJR may be used. Verhoeven and McAleer (2004) present an AR(1)-GJR-GARCH(1,1) model with the Pearson Type IV distribution and test it on three indices.

2. Instead of modeling the conditional skewness and kurtosis of the Pearson Type IV innovations ($\eta_t$) as static, one may use a dynamic skewness and kurtosis model. Yan (2005) discusses two approaches, Autoregressive Conditional Density (ARCD) and Autoregressive Conditional Moments (ARCM), to impose dynamic skewness and kurtosis. The author also presents an ARCD model for Pearson Type IV innovations.

3. ML estimation could be performed using the method of conjugate gradients or the Levenberg-Marquardt algorithm. Alternatives to ML estimation can also be used to estimate GARCH-PIV model parameters. Grigoletto and Lisi (2005) present a GARCH-type model for dynamic skewness and kurtosis, and estimate it using a Markov Chain Monte Carlo (MCMC) approach in a Bayesian framework.

4. The Johnson $S_U$ distribution can be used to model the innovation series as in Yan (2005). The $S_U$ distribution also has four parameters, and three of them are notionally similar to those of the Pearson Type IV distribution. Our future work is intended to compare the effectiveness of the Pearson Type IV and Johnson $S_U$ distributions in modelling financial return series.

Acknowledgements

The authors wish to thank two anonymous reviewers for their valuable comments and suggestions.


References

Alexander, C., Lazar, E., 2006. Normal Mixture GARCH(1,1): Applications to Exchange Rate Modelling. Journal of Applied Econometrics 21 (3), 307–336.
Angelidis, T., Benos, A., Degiannakis, S., 2004. The Use of GARCH Models in VaR Estimation. Statistical Methodology 1 (1-2), 105–128.
Bai, J., Ng, S., 2005. Tests for Skewness, Kurtosis, and Normality for Time Series Data. Journal of Business & Economic Statistics 23, 49–60.
Boudoukh, J., Richardson, M., Stanton, R., Whitelaw, R. F., 2004. MaxVaR: Long Horizon Value at Risk in a Mark-to-Market Environment. Journal of Investment Management 2 (3), 14–19.
Burckhardt, N., 2005. Time dependency of Risk Measures. International Finance Seminar, University of St. Gallen. http://www.sbf.unisg.ch/org/sbf/web.nsf/c2d5250e0954edd3c12568e40027f306/c838918233f320a7c125703c005da569/$FILE/Burckhardt%20Nicolas_paper.pdf.
Engle, R., 2001. GARCH 101: The Use of ARCH/GARCH Models in Applied Econometrics. Journal of Economic Perspectives 15 (4), 157–168.
Gouriéroux, C., 1997. ARCH-Models and Financial Applications. Springer Series in Statistics. Springer, Ch. 4, pp. 43–52.
Grigoletto, M., Lisi, F., September 2005. A GARCH-type Model with Dynamic Skewness and Kurtosis. In: Proceedings of the Meeting on Modelli Complessi e Metodi Computazionali Intensivi per la Stima e la Previsione. Department of Statistical Sciences, University of Padua, Italy, http://homes.stat.unipd.it/lisif/sco05.pdf.
Grigoletto, M., Lisi, F., 2006. Looking for skewness in financial time series, Working Paper 7-2006, Department of Statistical Sciences, University of Padua, Italy.
Heinrich, J., 2004. A Guide to the Pearson Type IV Distribution. Tech. Rep. CDF/MEMO/STATISTICS/PUBLIC/6820, University of Pennsylvania, http://www-cdf.fnal.gov/publications/cdf6820_pearson4.pdf.
Hull, J. C., 2006. Options, Futures and Other Derivatives, Sixth Edition. Prentice-Hall of India Private Limited, Ch. 18, pp. 435–459.
Johnson, N. L., Kotz, S., 1970. Distributions in Statistics: Continuous Univariate Distributions-1. John Wiley & Sons, Ch. 12, pp. 9–15.
McNeil, A. J., Frey, R., 2000. Estimation of Tail-Related Risk Measures for Heteroscedastic Financial Time Series: An Extreme Value Approach. Journal of Empirical Finance 7 (3-4), 271–300.
Nagahara, Y., 1999. The PDF and CF of Pearson type IV Distributions and the ML Estimation of the Parameters. Statistics & Probability Letters 43 (3), 251–264.
Newey, W. K., Steigerwald, D. G., 1997. Asymptotic Bias for Quasi-Maximum-Likelihood Estimators in Conditional Heteroskedasticity Models. Econometrica 65 (3), 587–599.
Premaratne, G., Bera, A. K., 2001. Modeling Asymmetry and Excess Kurtosis in Stock Return Data. Illinois Research & Reference Working Paper No. 00-123. http://ssrn.com/abstract=259009.
Press, W. H., Teukolsky, S. A., Vetterling, W. T., Flannery, B. P., 1992. Numerical Recipes in C: The Art of Scientific Computing, Second Edition. Cambridge University Press, Ch. 7, pp. 282–286.
Verhoeven, P., McAleer, M., 2004. Fat Tails and Asymmetry in Financial Volatility Models. Mathematics and Computers in Simulation 64 (3-4), 351–361.
Yan, J., 2005. Asymmetry, Fat-tail, and Autoregressive Conditional Density in Financial Return Data with Systems of Frequency Curves. Tech. Rep. 355, Department of Statistics and Actuarial Science, University of Iowa, http://www.stat.uiowa.edu/techrep/tr355.pdf.


                 AORD        CAC 40      DJIA        FTSE 100    SSMI
Observations     4053        4037        4033        5558        4026
Dates            1991-2006   1991-2006   1991-2006   1985-2006   1991-2006
Mean (%)         0.0365      0.0323      0.0388      0.0293      0.0461
Std Dev (%)      0.7675      1.3160      0.9796      1.0204      1.1251
Skewness         -0.4463     -0.1130     -0.2207     -0.5615     -0.2195
Kurtosis         8.3448      5.9587      7.9774      11.3440     7.8967

Table 1: Summary of return series

Skewness (ˆ π3 ) Kurtosis (ˆ π4 ) Normality (ˆ π34 ) Normality (ˆ µ34 )

AORD

CAC 40

DJIA

-1.5476 (0.0609) 2.2017 (0.0138) 7.2405 (0.0268) 4.8234 (0.0897)

-0.9327 (0.1755) 5.8912 (