Estimation of quantiles and confidence intervals for the ... - Springer Link

5 downloads 66 Views 928KB Size Report
The log-Gumbel distribution is related to the type II General Extreme Value (GEV) distribution or GEV-2. Prescott and Walden (1980) derived the expected values.
Stochastic Hydrologyand Hydraulics 10:187-207 © Springer-Verlag 1996

Estimation of Quantiles and Confidence Intervals for the Log-Gumbel Distribution J.-H. Heo Department of Civil Engineering, Yonsei University, Seoul, Korea J. 1). Salas Hydrolic Science and Engineering Program, Department of Civil Engineering, Colorado State University, Fort Collins, CO 80523, USA

Abstract: The log-Gumbel distribution is one of the extreme value distributions which has been widely used in flood frequency analysis. This distribution has been examined in this paper regarding quantile estimation and confidence intervals of quantiles. Specific estimation Mgorithms based on the methods of moments (MOM), probability weighted moments (PWM) and maximum likelihood (ML) are presented. The applicability of the estimation procedures and comparison among the methods have been illustrated based on an application example considering the flood data of the St. Mary's River.

Key words: log-Gumbel distribution, flood frequency analysis, quantile estimation, confidence intervals. 1 Introduction The log-Gumbel distribution is one of the many distributions commonly used for frequency analysis in hydrology. In the hydrologic literature, the log-Gumbel distribution is also known as the Frechet distribution (NERC,1975). The importance of using the Gumbel and log-Gumbel distributions for flood frequency analysis was examined by Shen et al. (1980) and Ochoa et at. (1980). They studied the effect of the tail behavior assumptions of these distributions for fitting the annual floods of more than 200 stations in Texas, New Mexico, and Colorado. They concluded that the log-Gumbel distribution provided a better fit for more than two-thirds of all the stations. They also noted that the log-Gumbel distribution generally gave greater estimates of extreme flood magnitudes than the Gumbel distribution. The log-Gumbel distribution is related to the type II General Extreme Value (GEV) distribution or GEV-2. Prescott and Walden (1980) derived the expected values of the second order partial derivatives of the log-likelihood function of the GEV distribution with respect to the parameters. These expected values are elements of the well known Fisher's information matrix which gives the asymptotic variancecovariance matrix of the maximum likelihood estimators (Kendall and Stuart, 1979). Subsequently, Prescott and Walden (1983) gave an iterative procedure for obtaining

188

maximum likelihood estimates of the parameters of the GEV distribution and derived the observed information matrix of the censored/ complete samples as a reasonable approximation for the maximum likelihood estimates. They compared the observed information matrix with Fisher's information matrix using simulation experiments and concluded that the observed information matrix is preferable to estimate the variances and covariances of the maximum likelihood estimators. Hosking et al. (1985) showed how to estimate parameters of the GEV distribution based on the method of probability weighted moments (PWM) and gave the asymptotic variances of the parameters and a table which can be used to obtain the asymptotic variances of the PWM quantile estimators. They also compared the methods of PWM, maximum likelihood, and aenkinson's procedure by using simulation experiments and concluded that PWM estimators are good for small sample size (N < 100). Recently, Liu and Stedinger (1992) compared alternative variances of the PWM quantile estimator for the two- and three-parameter GEV distributions. This paper discusses three methods of quantile estimation for the log-Gumbet distribution, namely, the method of moments (MOM), the probability weighted moments (PWM) method, and the maximum likelihood (ML) method. In addition, the confidence limits on population quantiles based on the MOM, PWM and ML methods are derived by using the corresponding asymptotic variances. Finally, the applicability of the proposed estimation procedures are illustrated by using observed flood data. 2 Model definition

Consider that random variables X and Y are related as Y=gn(X-xo), in which Xo is a lower bound parameter. It may shown that Y is Gurnbel distributed with location parameter yo and scale parameter ct, if X is log-Gumbel distributed with parameters Xo, yo and c~. Thus, the cumulative distribution flmction of the log-Gumbel distribution is given by F(x) = e x p ( - e x p [ - • n ( X - X o ) - - y o ] ) for x > Xo and c~ > 0. Assuming that c~ = 1//3 andyo = g n ( O - x o ) , i t shown that Eq. (1) takes the form

xp( r0 xol) LX-XoJ

(1) may be

(2)

in which X o < X < e C , 0 > X o and /3 > 0. Equation (2) is another form of thelogGumbel distribution. The tog-Gumbet distribution is related to the GEV-2 distribution. For instance, by assuming that xo = x'o + ce'/[3', c~ = -fl' and yo = gn(-a"//3'), it may be shown that the CDF given by Eq. (1) can be written in the form of the CDF of the GEV-2 distri' a' and/3' are respectively the location, scale, and shape paramebution in which xo, ters of such GEV-2 distribution and the shape parameter/3' is negative. In addition, it may be also shown that by assuming/3 = -1//3', 0 = X'o and Xo = X'o+ a'//3', the CDF of Eq. (2) takes the form of the CDF of the GEV-2 distribution in which Xo, / Cr! and /3' are the GEV-2 parameters as above defined. In the remainder of this paper, we will use the Iog-Gumbel model given by Eq. (2). The derivative of F(x) of Eq. (2) gives the probability density function (PDF) as

189

f(x) -

(x-

, IO-xol, (_f°xo1% Xo) t x -

Xoj

exp

tx-

Xoj /

(3)

Figure 1 shows some typical shapes of the PDF of the log-Gumbel distribution. Likewise, the r-th moment of X around Xo can be shown to be E[(X - Xo)r] = (O - Xo)r F(1 - r//3)

(4)

where 12(.) denotes the complete g a m m a function. Note that such r-th moment exists only if/3 > r. The mean and variance can be obtained from Eq. (4) respectively as # = xo + ( 0 - X o ) r ( 1 -

1//3)

(5)

f o r 3 > 1, and

au = (0 - xo) ~ IF(1 - 2/3) - F2(1 - 1//3)]

(6)

for/3 > 2. Likewise, the skewness coefficient is given by r(X - 3/¢~) - 3r(1 - 2 / 3 ) r ( 1 - 1/3) + 2ra(1 - 1//3) ht =

[F(x - 2/3)

- r~(1

-

1//3)]a/~

(7)

for /3 > 3. It may be shown that the skewness coefficient of the log-Gumbel distribution is greater than 1.1396 (recall that the skewness coefficient of the Gumbel distribution is exactly equal to 1.1396). Also, the expression of the skewness coefficient of Eq. (7) is the same as that of the GEV-2 distribution if the parameter/3 in Eq. (7) is replaced by -1//3' in which /3' is the shape parameter of the GEV-2 distribution and has a negative value. Additionally, the mode is given by mode(x) = X o + ( 0 - Xo)[1+~/3] -1/0 - [ P J

(8)

3 E s t i m a t i o n of quantiles The quantile estimator :KT of the log-Gumbel distribution can be obtained from Eq. (2) by replacing F(x) by ( t - l / T ) as ~:T = ~o + (0 - % ) [ - & ( 1

- l / T ) ] -1/a

(9)

where xo, ~ and/~ are the estimators of the parameters. Also, the estimator )(T may be generally written in terms of the sample mean fi, the sample standard deviation 3-, and the frequency factor I(T (Chow, 1951) as )(w = /2 + KT3-

(10)

in which I~w may be obtained from gqs. (5), (6) and (9) as 14T =

[--gn(1 -- l/T)] -1/a - F(1 - 1//3) [r(1 - 2 / ~ ) - r~(1 - 2/~)11n

(11)

Note that 9 o m Eqs.(7) and (11), the frequency factor I4T is a function of the skewness coemcient and the return period. Numerical values for such a function are shown in Tabte 1.

190 T a b l e 1. Frequency factors for the log-Gumbel distribution

Coeff. of Sknewness

0.2

0,5

1,01

1.25

2

-1~6192 -I.6107 -1.5859 -1.5450 --1.5152 -1.4919 -1.4660 -1A382 -1.4096 -1,3816 -1.3557 -1.3331 -1,3143 -i,2994 -1,2880 -1,2790 -1,2717 -1,2651 -1,2584 -1.2509 -1.2422 -1.2319 -1.2198 -1,2057 -1.1897 -1.1720 -1.1531 -1,1334 ~4.I141 -1.0960 -1.0806 -1,0689 -1.0620 -1.0602 -1.0634 -1,0705 -1.0800 -1.0900 -1.0980 -1.1013

-.8175 -.8158 -.8112 -.8030 -,7967 -.7915 -.7855 -.7788 -.7715 -.7641 -.7570 -.7506 -.7451 -.7406 -.7371 -.7343 -.7320 -.7299 -.7278 -.7253 -.7225 -.7191 -.7151 -.7103 -.7047 -.6985 -.6916 -.6844 -.6770 -.6700 -.6640 -.6593 -.6565 -,6558 -.6571 -.6599 -.6637 -.6677 -.6708 -.6721

-.1692 -,1709 -,1761 -.1846 -,1905 -,1949 -.1998 -.2048 -,2097 -,2143 -,2183 -.2217 -,2244 -.2264 -.2280 -.2291 -,2301 -2309 -.2317 -,2326 -,2336 -.2348 ~.2361 -,2376 -.2392 -.2408 -.2424 -,2439 -.2453 -,2464 -,2472 -,2478 -,2481 -,2481 -.2480 -,2477 -,2472 -.2467 -.2462 -,2461

"y

1.14 1.20 1.30 1.40 1.50 1.60 1.70 1.80 1.90 2.00 2.10 2.20 2.30 2~40 2.50 2.60 2.70 2.80 290 3.00 3.10 3.20 3.30 3.40 3.50 3.60 3.70 3.80 3.90 4.00 4.10 4.20 4.30 4.40 4.50 4.60 4.70 4.80 4.90 5.00

Nonexceedance Probability (q) 0.8 0,9 0.95 0,98

0,01

Corresponding Return Period (T) 5 10 20 50 .7115 .7083 .6988 .6826 ,6705 .6608 ,6497 .6376 .6248 .6121 .6001 .5894 .5804 .5732 .5676 .5632 .5596 ,5564 .5530 .5493 ,5450 .5398 .5337 .5265 .5183 .5092 .4994 A891 .4789 .4694 .4612 .4550 .4513 .4503 .4520 ,4558 .4609 .4662 .4704 .4722

1,2999 1,2977 1.2913 1.2798 1.2705 1~2628 t,2536 1,2431 1,2316 1,2197 1.2080 1.1972 1.1880 1.1804 11745 1.1697 1.1658 1,1622 1,1585 1.1543 1.1494 1,1435 1.1365 1,1281 1.1183 1.1073 1.0951 1.0822 1.0691 1.0565 1,0456 1,0372 1.0321 1.0308 1,0332 1.0383 1.0452 1.0523 1.0579 1.0603

1.8684 1.8686 1.8698 1.8704 1.8697 1.8684 1.8661 1.8626 1,8580 1.8523 1.8461 1.8398 1.8340 1.8290 1.8249 1.8215 1.8187 1.8160 1.8133 1.8102 1.8064 1.8017 1.7960 1.7891 1.7808 1.7711 1.7601 1.7480 1.7354 1.7230 1.7119 1.7033 1.6980 1.6966 1.6991 t.7044 1.7115 1.7187 1.7244 1.7267

2.6101 2.6157 2.6336 2.6618 2.6810 2,6952 2.7100 2.7248 2.7386 2,7506 2.7603 2.7676 2.7729 2,7765 2.7789 2.7805 2.7817 2.7827 2.7836 2.7844 2,7852 2,7859 2,7864 2.7864 2.7857 2.7841 2.7813 2,7773 2,7721 2,7662 2.7602 2.7552 2.7519 2.7511 2~7526 2,7559 2.7600 2.7639 2,7669 2.7680

0.99

0.999

100

1000

3.1704 3.1817 3.2172 3.2753 3.3166 3.3483 3.3827 3.4186 3.4542 3.4876 3.5171 3.5418 3.5612 3.5761 3.5871 3.5954 3.6021 3.6080 3.6138 3.6202 3.6273 3.6355 3.6447 3.6548 3.6654 3.6760 3.6861 3.6950 3.7023 3.7076 3.7109 3.7127 3.7135 3.7136 3.7133 3.7125 3.7110 3.7090 3.7071 3.7062

5.0486 5.0894 5.2169 5.4344 5.5978 5,7284 5.8761 6.0379 6.2072 6.3754 6.5331 6.6725 6.7889 6.8816 6.9534 7.0093 7,0552 7.0967 7.1389 7.1858 7.2404 7.3052 7.3814 7.4698 7.5698 7.6800 7.7974 7,9179 8.0355 8.1435 8.2346 8.3024 8.3423 8.3524 8.3344 8.2933 8.2378 8,1793 8.1318 8.1119

191

1.5

/ l l

1.0

", ,

LG( Xo, '0, # ) . LG(I.0,2.0,I .0) LG(1.0,2.0,2.0) . . . . . LG(1.0,2.0,3,0)

',,

f(x) ,

.-Z/]

0.5

0.0

%

",",

i ",2,'.,.,,

I

1 x

Figure 1. Some example of the probability density function of the log-Gumbel distribution of Eq.

(a)

Therefore, the estimation of quantiles requires estimating the parameters of the log-Gumbel model. Three estimation procedures are presented for the log-Gumbel distribution; the methods of moments, probability weighted moments, and maximum likelihood.

3. I Method of moments (MOM) By substituting #, ~ and "~ in Eqs. (5), (6) and (7) by their corresponding sample estimators ,5, ff and "~ the moment estimators 0, fl and io can be obtained. The skewness coefficient in Eq. (7) is only a function of the shape parameter ft. Thus, the moment estimator of the shape parameter,/) can be obtained from the approximate regression equations given by ~) = 222.5222 - 313.1802 ~/+ 179.5053 ~2 _ 50.6058 ;/a + 6.978542 ~4 _ 0.376228 ~5

(12a)

which is valid for 1.48 < ~ < 5,4 and = 1731,6756 - 2342.8143 ~ + 802.1566 ~2

(12h)

valid for 1.1396 < ~ ~ 1.48, in which "~ is the sample skewness coefficient. For a more precise solution of fl, Eq. (7) can be solved numerically by the Newton-Raphson method. For this purpose, Eq. (7) is rewritten as

192 0(8)

=

r(1 - a//}) - at0

- 1//}) + 2I'3(1 - 1//})

- 2//})r(~

[P(1 - 2/;?) - r~(1

_ #

(13)

- t//})l,/~

and the first derivative of G(/)) with respect to ¢) is given by

1 G'(/}) = /)2IF( 1 _ 2//)) - P2(1 - 1//))] s/2 x {jar'(1

- a//}) - 6r'(1

- 2//})r(1

+ 612'(t - 1//))I~2(1 - 1//))][r(1 -

ap(1

- 2//})r(1

- 1//))

-

ap'(1

- 1//))r(1

- 1//})]}

- 1//)) - at'0 -

+ 2r~(1

2//))

-

- t//))r(1

- 2//})

F2(1 - t//))] - [P(t - a//))

- 1//})][at'(1

- 2//))

where F'(.) is the first derivative of the g a m m a function. equation to estimate/} in the iteration i + l is

(14)

Therefore, the recursive

/)i+, = /)~- G(/)i)/G'(/)i) The iteration proceeds until the error criterion

/}i+1 -/)i