Estimation and spatial interpolation of rainfall intensity ... - Springer Link

3 downloads 0 Views 1MB Size Report
Feb 17, 2009 - ORIGINAL PAPER. Estimation and spatial interpolation of rainfall intensity distribution from the effective rate of precipitation. Ming Li Ж Quanxi ...
Stoch Environ Res Risk Assess (2010) 24:117–130 DOI 10.1007/s00477-009-0305-3

ORIGINAL PAPER

Estimation and spatial interpolation of rainfall intensity distribution from the effective rate of precipitation Ming Li Æ Quanxi Shao Æ Luigi Renzullo

Published online: 17 February 2009 Ó Springer-Verlag 2009

Abstract Great emphasis is being placed on the use of rainfall intensity data at short time intervals to accurately model the dynamics of modern cropping systems, runoff, erosion and pollutant transport. However, rainfall data are often readily available at more aggregated level of time scale and measurements of rainfall intensity at higher resolution are available only at limited stations. A distribution approach is a good compromise between fine-scale (e.g. sub-daily) models and coarse-scale (e.g. daily) rainfall data, because the use of rainfall intensity distribution could substantially improve hydrological models. In the distribution approach, the cumulative distribution function of rainfall intensity is employed to represent the effect of the within-day temporal variability of rainfall and a disaggregation model (i.e. a model disaggregates time series into sets of higher solution) is used to estimate distribution parameters from the daily average effective precipitation. Scaling problems in hydrologic applications often occur at both space and time dimensions and temporal scaling effects on hydrologic responses may exhibit great spatial variability. Transferring disaggregation model parameter values from one station to an arbitrary position is prone to error, thus a satisfactory alternative is to employ spatial interpolation between stations. This study investigates the spatial interpolation of the probability-based disaggregation model. Rainfall intensity observations are represented

M. Li (&)  Q. Shao CSIRO Mathematical and Information Sciences, Private Bag No. 5, Wembley, WA 6913, Australia e-mail: [email protected] L. Renzullo CSIRO Land and Water, GPO Box 1666, Canberra, ACT 2601, Australia

as a two-parameter lognormal distribution and methods are developed to estimate distribution parameters from either high-resolution rainfall data or coarse-scale precipitation information such as effective intensity rates. Model parameters are spatially interpolated by kriging to obtain the rainfall intensity distribution when only daily totals are available. The method was applied to 56 pluviometer stations in Western Australia. Two goodness-of-fit statistics were used to evaluate the skill—daily and quantile coefficient of efficiency between simulations and observations. Simulations based on cross-validation show that kriging performed better than other two spatial interpolation approaches (B-splines and thin-plate splines). Keywords Rainfall intensity  Spatial interpolation  Distribution approach  Pluviograph

1 Introduction Great emphasis is being placed on the use of rainfall intensity data at short time intervals to accurately model the dynamics of modern cropping systems, runoff, erosion and pollutant transport. Many contemporary models require sub-daily rainfall records as inputs, such as soil water infiltration and movement (SWIM; Ross 1990), water erosion prediction project (WEPP; Lane and Nearing 1989), FLOw in CRacking soil (FLOCR; Oostindie and Bronswijk 1992) and water and agrochemicals in the vadose environment (WAVE; Vanclooster et al. 1994). It has been reported that the prediction skill can be substantially improved when higher resolution rainfall data (i.e. rainfall intensity data) are available (e.g. Dodov and Foufoula-Georgiou 2005; Kandel et al. 2005; Mertens et al. 2002).

123

118

Most rainfall data, however, are only available at more aggregated time scales (e.g. daily) and comparatively few locations measure rainfall intensity at higher frequency. For example, in Australia few breakpoint data (i.e. precipitation records every a few minutes) are longer than 25 years and many stations have much shorter records or only record daily rainfall. Conventionally, the effective rainfall intensity, defined as the average of above-threshold rainfall intensity (see Sect. 2.1 for more discussion) has been used as an alternative to breakpoint data, but it may be unsatisfactory because it fails to capture information about peaks, variance and so on. Kandel et al. (2004) reported that the coefficient of efficiency (CoE) for erosion simulations based on a Nepalese catchment was 0.80 by using 2-min rainfall intensity observations but only 0.26 by the effective daily rainfall intensity. This highlights the potential for mismatch between fine-scale models and coarse-scale rainfall data, and calls for the development of approaches that predict sub-daily or even sub-hourly rainfall intensity characteristics from more aggregated rainfall information such as effective rainfall intensity. One popular solution to this problem is simulation-based algorithms that disaggregate daily rainfall into sequences of individual showers. For example, Hershenhorn and Woolhiser (1987) derived daily rainfall using in excess of 20 years of summer breakpoint data from the Walnut Gulch Experimental Watershed, and managed to simulate the number of events per day, as well as the amount of rainfall, shower duration and starting time. Connolly et al. (1998) included seasonal patterns into this simulationbased disaggregation approach and applied it to Australian rainfall conditions. They found that the model was able to reproduce characteristics of rainfall from four locations in Australia with different climate conditions. However, they also admitted that it was difficult to represent occurrences of low intensity, long duration and multiple peaks in an event. Socolofsky et al. (2001) presented a stochastic selection method to break daily measurements into finer plausible storm intensity by selecting samples from nearby pluviometer stations with the same climate condition. Sivakumar and Sharma (2008) introduced a cascade approach to disaggregate low-resolution temporal rainfall into finer resolution sequences. This method is based on two steps: (1) identification of the presence of scaling behaviour by the statistical moment scaling function; and (2) generation of synthetic data with same/similar scaling properties to the observed process. However, the task of validation and calibration of multiplicative cascade models could be expensive. In many occasions, the assumptions of simulation-based methods are complex and hard to justify. Also, stochastic generation would produce different subdaily or sub-hourly intensity patterns from one simulation to another because of randomness.

123

Stoch Environ Res Risk Assess (2010) 24:117–130

Alternatively, Kandel et al. (2005) introduced a distribution approach as a good compromise between fine-scale (e.g. sub-daily) models and coarse-scale (e.g. daily) rainfall data. In this approach, the cumulative distribution function (CDF) of rainfall intensity is employed to represent the effect of within-day temporal variability of rainfall and a disaggregation model is used to estimate distribution parameters from the effective rate of precipitation. Kandel et al. (2005) found such a probability-based model compares favourably with a sub-hourly model when equivalent process descriptions were used in different Australian and Nepalese climate conditions. They concluded that the within-day distribution of rainfall intensity and total rainfall amounts are of most importance, and that the time sequence of intensity may be secondary. Various probability distributions have been used to represent the distribution of rainfall intensity, including exponential (Eagleson 1972; Woolhiser and Roldan 1982; Foufoula-Georgious and Lettenmaier 1987; Connolly et al. 1998; Van Dijk et al. 2005), gamma (Katz 1977), beta (Garciaguzman and Arandaoliver 1993), log-Poisson (Sivakumar and Sharma 2008), lognormal (Onof et al. 1998; Kandel et al. 2004) and generalized Pareto distributions (Cameron et al. 2000, 2001). Anderson et al. (2006) analysed high resolution (6 min) rainfall data recorded at 42 pluviograph stations across Australia, and concluded that the three-parameter generalized Pareto distribution provided the best fit, followed by the twoparameter generalized Pareto, exponential and gamma distributions. Using the average of the five highest intensity intervals (6 min) in a day, Anderson et al. (2006) is strongly in favour of extreme value distributions. However, the average of the five highest intensities represents a quantile of less than 1% extreme large values and is not the highest intensity of the contiguous 30-min averages, which is of interest in hydrological models. However, there is no common agreement amongst researchers as to which distribution is superior to others and the choice of distribution depends on motivation and regions where the model is being applied. The lognormal distribution with two parameters is employed in this study to represent the within-day rainfall distribution. A very large collection of literature in atmospheric sciences (Biondid 1976; Lopez 1997; Houze and Cheng 1977; Crane 1985; Kedem et al. 1990) shows the lognormal distribution describes rainfall statistics reasonably well from empirical studies. The occurrence of this particular distribution was explained in terms of physical principles by Ramond (1997). He proved that solutions for the homogeneous part of the moisture fluxes and the diabatic contribution to the ageostrophic wind approaches to a lognormal distribution asymptotically. In our study, chi-squared goodness-of-fit tests showed that the rainfall

Stoch Environ Res Risk Assess (2010) 24:117–130

intensity followed the lognormal distribution with the average of p values across sites equal to 0.40. Furthermore, the parsimonious form with only two parameters allows the lognormal distribution to be advantageous in model regionalization. Scaling in both space and time is a common problem in many hydrological applications, and temporal scaling effects on hydrologic responses may exhibit a great deal of spatial variability. As a result, the parameter values of this distribution-based disaggregation approach may spatially vary and it could be potentially dangerous to transfer them from one station to another. An estimation procedure is typically used to specify these parameters based on observed high-resolution rainfall data. As a result, distribution parameters are not able to be estimated through estimation at stations without pluviograph data. Therefore, spatial-interpolation techniques are required to obtain distribution parameters at unsampled locations without breakpoint data. This study provides an integrative approach for spatiotemporal downscaling rainfall data by combining a distribution-based disaggregation model and spatial interpolation. The proposed approach is based on two steps: (1) estimation of the probability distribution of rainfall data at sample sites (i.e. pluviometer stations) with several parameters; and (2) spatial interpolation of these parameters over space. Rainfall intensity data collected from pluviometer stations in a selected region in Western Australia are used to demonstrate and test the method. The effectiveness of spatial interpolation method is compared with other two methods by a numerical study. Future research directions are recommended in the discussion section.

2 Methods 2.1 Step 1: distribution-based rainfall disaggregation A two-parameter lognormal distribution is used to represent the daily rainfall intensity distribution. Our method is based on Kandel et al. (2004) but has three major modifications: (a) distribution parameters are estimated by the least-squares method instead of the method of moments (MoM); (b) a linear relationship between the average effective intensity and log-mean rainfall (instead of the square root of log-variance) is established; (c) the wet fraction is assumed known. The modification (a) allows the estimated parameters to minimise the mean squared-error, which is the most widely used performance indicators. The MoM used in Kandel et al. (2004) is computationally convenient but may not be as efficient as the least-squared estimation at finite sample. The second modification (b) is

119

motivated by the closer relationship between average effective intensity and log-mean rainfall because both describe the central tendency of the rainfall amount. With regards to the last modification (c), we noted that Kandel et al. (2004) classified storm types to estimate the wet fraction, but unfortunately this estimation performed very poorly for our datasets (results not given in this paper). Therefore, it is of great importance yet elusive to estimate wet fractions efficiently if pluviograph data are not available. This will be addressed in a subsequent paper. The importance of the wet fraction is also discussed in Sect. 5. For each pluviometer site on a given day, let P1  P2      Pm be the ordered above-threshold rainfall intensity, with a maximum m = 240 six-minutely observations. The average effective intensity at this day P is defined by the average of P1 ; P2 ; . . .; Pm , i.e. the average of above-threshold rainfall intensity. The wet fraction x is estimated as m/240. A lognormal distribution with two parameters l (log-mean) and r2 (log-variance) is used to fit the distribution of P1 ; P2 ; . . .; Pm . These distribution parameters are estimated from the pluviograph data by the least-squares method which minimises the objective function m  X

 2 Pi  q xi ; l; r2

ð1Þ

i¼1

where qðxi ; l; r2 Þ ¼ 12 þ 12 erf flnðxi Þl=rpffiffi2g with erfðxÞ ¼ R 2 x t2 dt: p 0 e Here xi ¼ ði  0:4Þ=ðm þ 0:2Þ is the corresponding probability of exceedance (Cunnane 1978) and q(x; l, r2) is the quantile function of Log - N(l, r2). In order to distinguish from the parameters estimated from the effective rainfall intensity, hereafter we call the parameter estimated directly from minimising (1) as the fitted parameters. The least-squares estimation guarantees that the fitted parameters yield the largest daily CoE (DCoE), which  is the CoE between the observed (Pi) and estimated P^i quantile of the daily rainfall intensity distribution and is given by  Pm  ^ 2 i¼1 Pi  Pi DCoE ¼ 1  Pm : ð2Þ 2 ðPi  PÞ i¼1

In contrast, other conventional parameter estimations do not hold this property in general. If other measures of efficiency are desired, it is recommended that the objective function be modified to match the performance indicator. For example, if the modified CoE (Legates and McCabe 1999) is used to evaluate performance, the sum of absolute error is the best objective function to estimate parameters. For a given site, we assume that the log-transformed average effective rainfall intensity is related to the logmean parameter l in a linear manner:

123

120

 þ e; l ¼ a þ b lnðPÞ

Stoch Environ Res Risk Assess (2010) 24:117–130

ð3Þ

where e is a random error. Two linear coefficients are assumed to be constants over time and are determined as ^a ^ by applying the ordinary linear regression to and b historical sub-daily rainfall data. Then the log-mean can be estimated from the average effective rainfall intensity by ^ lnðPÞ:  ^¼^ l aþb

ð4Þ

Inspired by the lognormal distribution expression for mean  ¼ expðl þ r2 =2Þ; EðPÞ

ð5Þ

we estimate the log-variance from the average effective intensity P by  l ^2 ¼ 2flnðPÞ ^g: r

ð6Þ

Kandel et al. (2004) related r to the average effective rainfall intensity by the linear regression and estimated l from the relationship implied by the expression for mean. Because the average effective rainfall depicts the mean, it could be more rational to link P with the log-mean parameter from a statistical point of view. Moreover, Sect. 4.1 shows that for our selected stations the estimation from (4) and (6) generally yields smaller MCoE (the median of DCoE) than that from Kandel et al. (2004), supporting our modification in providing more efficient estimation of distribution parameter. To evaluate model performance, we examine two performance indicators: (a) the CoE between the observed and estimated daily rainfall intensity distribution (DCoE) as defined by (2); (b) For a particular percentile q, quantile CoE (QCoE), which  is the CoE between the observed (Pq,i) and estimated P^q;i qth-quantile over all n days and is defined as:  Pn  ^ 2 i¼1 Pq;i  Pq;i ð7Þ QCOE ¼ 1  Pn   ;  2 i¼1 Pq;i  Pq where Pq is the average of observed qth-quantile. The value of DCoE quantifies how well the shape of a lognormal distribution matches the within-day rainfall distribution in an overall sense. A value of DCoE approaching unity indicates that the lognormal distribution with fitted/estimated parameters approximates the rainfall intensity CDF well for that given day. The difference between two performance indicators is that the DCoE investigates the goodness-of-fit within a particular day, whereas the QCoE aims to quantify the performance between days for a given percentile. As a result, for a given station one value of DCoE is calculated at a daily basis but one value of QCoE is produced for every percentile of interest. Though DCoE is the most common performance

123

indicator, the use of QCoE allows us to have a better understanding of model capability to fit low, moderate or high rainfall intensity events. 2.2 Step 2: spatial interpolation The above-mentioned estimation procedure requires high temporal resolution rainfall observations (e.g. minutes) and cannot be applied to aggregated (e.g. daily) precipitation measurements. In order to disaggregate the rainfall by the proposed distribution approach, it is essential that model parameters a and b are available at any positions of interest. This task relies on interpolating parameter values at the pluviograph locations over space. Spatial correlations in these two model parameters suggest kriging as the spatial interpolation method used in this study. Ordinary kriging is the most widely used geostatistical method (the best linear unbiased prediction) as discussed in Cressie (1993). It serves to estimate a value at a point from the optimal linear combination of data in the neighbourhood, provided that the variogram is known. The spatial variability is statistically accounted by covariance or variogram provided the process is stationary in secondorder. Let Z(s) be a spatial process at location s and define the variogram by cðsi  sj Þ ¼ 12 VarðZðsi Þ  Zðsj ÞÞ: The ordinary kriging estimate is given by ZðsÞ ¼

n X

ki Zðsi Þ;

ð8Þ

i¼1

where the weight is derived from the kriging system  Pn ki cðsi  sj Þ þ l ¼ cðsi  s0 Þ for i ¼ 1; . . .; n Pi¼1 ð9Þ n i¼1 ki ¼ 1: Typically, a variogram model fitted to the experimental variogram is used in the kriging system Eqn (9). The use of different variogram functions generally leads similar interpolation result (Cressie 1993). This study applied an exponential variogram model defined by  c0 þ c1 f1  expðr=RÞg if r [ 0 cðrÞ ¼ ð10Þ 0 if r ¼ 0; where c0, c1 and R are the nugget, sill and range, respectively. The presence of nugget allows the variogram to be discontinuous and accounts for measurement errors that cannot be resolved by the observation network. The range depicts the separation distance at which the spatial correlation no longer exists or is strong enough. The nugget-to-sill ratio controls the smoothness of estimated fields. These three parameters are estimated by the maximum likelihood estimation. R statistical package gstat (http://cran.r-project.org/

Stoch Environ Res Risk Assess (2010) 24:117–130

121

web/packages/gstat/) is used in this study to implement the algorithm. In order to compare the performance of the ordinary kriging, another two widely used interpolation approaches: B splines (Lee et al. 1997) and thin-plate splines (Wahba 1990). The surfaces interpolated by B-splines pass all data points and might show least fluctuation in all three interpolated surfaces. Thin-plate splines are known as a generalization of one-dimensional smoothing splines to high-dimensional spaces. The thin-plate spline estimation minimises the squared error penalized by the roughness and may not pass data points. R statistical packages MBA (http://cran.r-project.org/web/packages/MBA/), and fields (htttp://cran.r-project.org/web/packages/fields) are used to construct B-splines and thin-plate splines and kriging surfaces. A smoothing parameter is used in the thin-plate splines to control the amount that the data is smoothed. The value of the smoothness parameter is chosen by the generalized cross-validation.

3 Data A rectangular study region is situated within 31.2°–33.0° south and 115.60°–117.75° east and covers Perth

metropolitan area in Western Australia. There are 56 pluviometer stations falling in this 387 km2 region. Figure 1 shows the locations of the study area, the pluviometer stations with average rainfall amounts. The records span 45 years from 1961 to 2006. For each station, there are at most 240 6-min pluviograph data in each day. Due to some physical limitations of the tipping bucket mechanism in the pluviograph we only considered intervals where observations exceed 1 mm/h to fit the probability distribution. Furthermore, wet fraction was defined as the proportion of rain above-threshold (1 mm/h) instead of zero. The effect is illustrated in the times series plot for Perth Airport (station number: 009021) shown in Fig. 2. Figure 2a suggests that the primary pattern of non-zero rain proportion changed in the middle of 1980s due to the change of measurement devices (Dines Pluviographs was replaced by tipping bucket devices). The non-zero rain proportions are virtually less than 60% before a changing point but may approach to 100% afterward. In the meanwhile, daily rainfall amounts look very stable over all observed years in Fig. 2b. As a result, the daily average of non-zero rainfall intensities looks considerably higher in the period of 1960–1985 in Fig. 2c. However, the wet fractions ([1 mm/h) are not affected much by device change; see Fig. 2d. Therefore, removing extremely low

Fig. 1 Locations of 56 pluviometer stations with average rainfall amounts Station locations with average rainfall amounts

-32.0 -32.5

15mm/day 20mm/day 25mm/day 30mm/day

-33.0

Latitude

-31.5

-31.0

Perth

115.0

115.5

116.0

116.5

117.0

117.5

118.0

Longitude

123

(b)

1.2

Dines Pluviograph

Tipping bucket

1970

1980

1990

2000

(c)

100 80 60 1960

1970

1980

1990

2000

(d) Tipping bucket

0.0

10

0.2

20

The Wet Fraction

Dines Pluviograph 0.6

Tipping bucket

30

Dines Pluviograph

0

Average rainfall intensity(mm/hr)

40 0

0.0 1960

1960

1970

1980

intensity allows us to eliminate the data uncertainty brought from both adjustment methods and device change. Further data quality controls were conducted prior to use: (1) only days with complete pluviograph records were used to avoid gap-filling problems; (2) only rainy days with daily total equalled or exceeded 10 mm were considered because of high relative error from low intensity; and (3) at least four distinct above-threshold intervals were required to calculate probability moments. A day is called a valid day if all three requirements are satisfied.

4 Results 4.1 Results of disaggregation at sample sites We start from the estimation results obtained from Perth Airport (009021) to illustrate the estimation procedure and test performance. The station at Perth Airport is located in open ground so as to avoid exposure to wind effects or interception by trees or surrounding buildings. At each valid day, the lognormal parameters (a and b) were estimated by the least-squares method introduced in Sect. 2.1. The boxplots in Fig. 3a show that the fitted mean

123

Tipping bucket

20

0.2

0.4

0.6

0.8

Daily rainfall total(mm)

1.0

Dines Pluviograph

0.4

(a) Non-zero rainfall fraction

Fig. 2 Time series plots of a non-zero rainfall fraction, b daily rainfall amount (mm), c daily average of non-zero rainfall intensity (mm/h) and d the wet fractions defined by the above-threshold ([1 mm/h) rainfall proportion at Perth Airport (station number: 009021)

Stoch Environ Res Risk Assess (2010) 24:117–130

120

122

1990

2000

1960

1970

1980

1990

2000

parameters were fairly symmetric but the fitted variance parameters looked heavily positively skewed from 1,003 valid data. Furthermore, there is no clear evidence in Fig. 3b, c to support the existence of any long-term trends of two parameters. These two parameters look generally positively associated with daily rainfall amount. The apparent outliers highlighted by red circles in Fig. 3b, c occurred at 16 May 2005. The empirical rainfall intensity distribution for 16 May 2005 as well as the lognormal distribution curve with the fitted parameters is shown in Fig. 3d. It is observed that the value of one intensity record is substantially larger than others and induces the distribution to be heavily skewed to right. Interestingly, the value of the performance indicator (DCoE = 0.986) is still very high for this day. This implies that the lognormal is still flexible to represent the sub-daily rainfall intensity distribution in such a circumstance. Figure 4a, b illustrates that the effective intensity was well associated with the fitted mean parameter, despite a few outliers, by the following linear regression:  ^ ¼ 0:295 þ 0:87 lnðPÞ l

ð11Þ

with R2 = 0.609. A significant test based on F statistics showed the linear relationship in (11) is significant with p

(b)

1 0 -2

-1

5

log-mean

10

2

15

(a)

-3

0

1980

1990

0.2

0.4

0.6

2000

60

Intensity (mm/hr)

15 10

0

5

1970

80

1960

(d)

0

log-variance

(c)

log-variance

40

log-mean

20

Fig. 3 For Perth Airport station (009021) between 1961 and 2006 a boxplots of fitted mean and variance parameters; b time series of fitted mean; and c time series of fitted variance; d the empirical intensity distribution and the lognormal distribution fitting with least-squares parameter estimates at 16/05/ 2005

123

3

Stoch Environ Res Risk Assess (2010) 24:117–130

1960

1970

1980

1990

2000

0.0

0.8

1.0

Percentile

value almost equal to 0. The variance parameters seem to be a little underestimated when the fitted parameter is small. Further endeavour should be made to estimate the variance parameter more efficiently in the future. The relationship between the daily rainfall amount and the wet fraction in Fig. 4c is shown positively associated. However, the wet fraction is assumed to be given in this study and how to determine the average effective intensity from daily rainfall amount is left as an open question. The MoM is a convenient and widely used method to estimate distribution parameters. Using Perth airport station as an illustrative example, we compare the MoM and the least squares estimation and justify why the latter method is more appropriate to our problems. All scatter points in Fig. 5a are located below the 1:1 line. This finding verifies that the least-squares estimators lead substantially greater value of DCoE than MoM and are in line with the fact that the least-squares estimators are actually obtained through maximising the DCoE. When the parameters are estimated from the average effective intensity via the algorithm proposed above, the leastsquares estimation still results in a better fitting performance than MoM, which is illustrated by Fig. 5b. The boxplots presented in Fig. 5c suggest that the variation of the DCoE obtained from MoM is significantly higher than from the least-squares methods. It is interesting to see

that for MoM, the fitting performance improves when the parameters are estimated from the average effective intensity. Figure 5d compares the performance between valid days for various methods. The least-squares fitting method is shown to predict moderate to high rainfall intensity most efficiently in four methods, unfortunately at a cost of poor prediction at low rainfall. High rainfall intensity is closely related to the highest contiguous 30-min rainfall intensity and thus is of most importance for the use of rainfall-runoff-erosion models. These linear coefficients were calculated across all 56 stations based on the least-squares fitted parameters and were ready to use for the model interpolation. The linear relationships in 55 of 56 stations are significant with p values less than 0.05. The only one exception occurs at station 010009 where only five valid days are available. More observations would be useful to establish the relationship between the distribution parameter and effective rainfall intensity. At a given station, the median of DCoE (MCoE) was used as an overall performance index to assess how well sub-daily rainfall intensity distributions are fitted by lognormal distributions. Medians were used to describe the central tendency since they tend to be more robust to outliers than means. Figure 6 compares the performance of our estimation (i.e. Eqs. 4, 6) to distribution parameters and the estimation

123

(b)

Estimated variance 4 6 8

2 1 0 -1

2

-2

Calibrated mean

3

(a)

0

-3

Fig. 4 a The average effective intensity against the fitted logmean parameter directly from pluviometer data; b the fitted log-variance parameter directly from pluviometer data against the estimated log-variance parameter from the average effective intensity; and c the daily rain amount against the wet fraction. The data is for Perth Airport station (009021) between 1961 and 2006

Stoch Environ Res Risk Assess (2010) 24:117–130

10

124

0

5

10

15

20

25

30

5

10

15

Calibrated variance

0.4 0.3 0.1

0.2

Wet fraction

0.5

(c)

0

0.6

Effective intensity

20

40

60

80

100

120

Rain amount

introduced by Kandel et al. (2004). Although two lines are fairly close to each other, our method generally yields higher MCoE than Kandel et al. (2004). Therefore, for our data it is better to link the average effective intensity with the log-mean parameter than the squared root of log-variance. It supports one of our modifications to the method used by Kandel et al. (2004). Figure 7 summarises the overall fitting performance by different types of dots. Figure 7a shows that when the fitted parameters are applied, all values of MCoE are at least 0.90 and even greater than 0.95 at more than half stations. Therefore, the proposed two-parameter lognormal distribution fitted rainfall intensity distribution extraordinarily well when a pair of fitted parameters was incorporated. When two parameters in a lognormal distribution were estimated from the average effective intensity, all MCoE were virtually greater than 0.85 except for three stations on the region boundary in Fig. 7c. Figure 7b, d indicate that most QCoE (90%) were above 0.85 with fitted parameters and were more than 0.80 with estimated parameters. It seems more challenging to predict 90% upper high intensity in west than in east. In particular, heavy storms at eight stations close to the western coastline were predicted poorer than others with QCoE (90%) less than 0.8.

123

4.2 Results of spatial interpolation Estimated surfaces of the parameters a and b from the ordinary kriging and other two interpolation methods are illustrated in Fig. 8. The B-splines approach is an exact interpolation, which keeps all observations across the estimate surface without taking any error into account. As a result, the interpolation surface provided by the B-splines (Fig. 8a, b) produced the roughest surface. Instead, the thin-plate splines minimise the mean squared error with a penalty on roughness and the resulted mapping may not be necessary to pass observations. Figure 8c, d look smoother than other interpolated surfaces. The kriging with a nugget component is also able to accommodate the measurement error and may not be an exact interpolation. The kriging surfaces shown in Fig. 8e, f has less fluctuations than the thin-plate splines but more than the B-splines. In all panels of Fig. 8, the value of a is largest in northeast, where the value of b is smallest. The leave-one-out cross-validation was carried out to assess the prediction capability of three interpolation approaches. The coefficients a and b for a given station were predicted from information of all other stations. To evaluate the prediction performance, we compare DCoE

Stoch Environ Res Risk Assess (2010) 24:117–130

(a)

(b)

0.90

0.95

0.0

(d)

0.86

0.2

0.4

0.6

0.8

1.0

Quantile CoE

0.0

LS fit MoM fit LS est. MoM est.

-1.0

-0.5

Quantile CoE

0.5

0 -2 -4 -6 0.6

0.8

1.0

MoM fit

0.4

Cumulative frequency

0.2

0.5 -0.4

LS estimating

Boxplots of daily CoE

0.0

0.84

0.0

1.00

LS fitting

Proposed estimation Estimation by Kandel et al. (2004)

0.82

-0.5

MoM estimating

-1.5 0.85

LS fit

0.80

-1.0

0 -2

MoM fitting

-4 0.80

(c)

0.78

Daily CoE 1.0

Daily CoE

-6

Fig. 5 a Daily CoEs if the distribution parameters are fitted by the least-squares method and by the method of moments; b daily CoEs if the distribution parameters are estimated from Eqs. 4 and 6 where the linear coefficients are estimated from the least-squares estimates and from the method of moments. c Boxplots of daily CoEs obtained from various distribution parameter estimates; d Quantile CoEs obtained from various distribution parameter estimates

125

0.88

0.90

0.92

MCoE

Fig. 6 Cumulative distribution curves of MCoE with distribution parameters obtain from the proposed estimation (i.e. Eqs. 4, 6) and the method by Kandel et al. (2004)

and QCoE for all three interpolation methods and report the averages (in terms of stations) of mean, standard deviation and median of these performance indicators in

LS est.

MoM est.

0.2

0.4

0.6

0.8

Percentile

Table 1. Kriging was proved to be most reliable (least standard deviation) and most accurate (greatest mean and median), followed by thin-plate splines. The B-splines interpolation was inferior to other estimations possibly because of failure to model randomness and covariance structure. However, the DCoE performed very similar across all methods. Moreover, the summary and boxplots of QCoE for 25, 50, 75 and 90% presented in Table 2 and Fig. 9 show that different interpolations could lead very different performance to fit a specific quantile. For example, thin-plate splines lead the best fitting in low rainfall whereas the other two methods generally provided very coarse estimates. The lognormal distribution with interpolated parameters was not able to produce the median (50% quantile) of the daily rainfall intensity very well for all methods, though kriging is still the preferred option because of the smallest observed variation. All methods resulted in larger values of QCoE as the percentile increased, because least-square estimation favours heavy storms. The most impressive feature of the boxplots in Fig. 9 is that the thin-plate splines yielded higher variations of QCoE than other methods, though it provided the similar coefficient estimates (Fig. 8) and DCoE (Table 1) to kriging.

123

126

MCoE (fitted)

-31.0 -32.5

-32.0

Latitude

-31.5

-31.5 -32.0 -32.5

Latitude

QCoE(90% ) (fitted)

(b)

-31.0

(a)

-33.0

-33.0

Fig. 7 MCoE and QCoE (90%) obtained from fitted (from pluviometer data directly) and estimated (from the average effective intensity) parameters (diamond above 0.95, X-shape cross between 0.90 and 0.95, cross between 0.85 and 0.90, triangle between 0.80 and 0.85, circle below 0.80)

Stoch Environ Res Risk Assess (2010) 24:117–130

115.0

116.0

117.0

115.0

118.0

Longitude

(c)

(d)

118.0

-31.0

QCoE(90% ) (estimated)

Latitude

-32.5

-32.0

-31.5

-31.0 -31.5 -32.0 -33.0

-33.0

-32.5

Latitude

117.0

Longitude

MCoE (estimated)

115.0

116.0

117.0

Longitude

5 Discussion and summary This paper concerns a distribution-based disaggregation of daily rainfall time series into higher solution. The probability distribution of 6-min rainfall data is estimated from the daily total and wet fraction. The proposed model is summarised in the following flow chart (Fig. 10). The rainfall intensity distribution is represented by a two-parameter lognormal distribution and we suggest the distribution parameters be fitted using least-squares estimation to optimise the ordinary daily coefficient of efficiency (DCoE). Numerical studies proved that the leastsquares estimation yields much better overall performance than the MoM. However, we observed the lognormal distribution with the least-squares estimators fitted low to moderate rainfall relatively poorly. More flexible probability distributions could improve the model fitting but may bring more difficulties in the interpolation procedure due to the increased number of fitted model parameters. In particular, when above-threshold rainfall intensity observations are limited, models involving more parameters would be a challenge to estimate efficiently. This study

123

116.0

118.0

115.0

116.0

117.0

118.0

Longitude

shows that the two-parameter lognormal distribution is a reasonably good compromise between model complexity and model performance. We assume the wet fraction is known in this study because it is still very hard to estimate this parameter efficiently. The importance of the wet fraction in our proposed disaggregation approach is twofold: (1) rainfall data contain both below-threshold and above-threshold observations. This manuscript mainly focuses on the estimation of the distribution of above-threshold observations. A complete rainfall intensity distribution should include the proportion of above-threshold observations in a day. It is not necessary to estimate the distribution of belowthreshold ones, as they have little impact on hydrological consequences. (2) The effective rainfall is the average of the above-threshold rainfall intensity. The rain amount below-threshold (less than 1 mm/h) is almost negligible and the effective rainfall is approximated by the daily total divided by the wet fraction. A reliable estimation of the rainfall intensity distribution from cumulative rain totals is highly dependent on the efficient estimation of the average effective intensity rate. Kandel et al. (2004) used an antecedent precipitation

Stoch Environ Res Risk Assess (2010) 24:117–130 B-spline for α

(a)

B-spline for β

(b) 0.2 0.0

1.4 -31.5

-31.5

Fig. 8 Surface estimates of a and b by B-splines, thin-plate splines and kriging

127

1.2

-0.4

-32.0

-32.0

-0.2

1.0

-32.5

-0.8

-32.5

-0.6

0.8

-1.0 -1.2 116.0

116.5

117.0

117.5

116.0

Thin-plate spline for α

(c)

0.6 116.5

117.0

117.5

Thin-plate spine for β

(d) 0.0

1.4 -31.5

-31.5

0.2

1.2

-0.4

-32.0

-32.0

-0.2

1.0

-0.8

-32.5

-32.5

-0.6

0.8

-1.0 -1.2 116.0

116.5

117.0

117.5

116.0

Kriging for α

(e)

0.6 116.5

117.0

117.5

Kriging for β

(f) 0.0

1.4 -31.5

-31.5

0.2

1.2

-0.4

-32.0

-32.0

-0.2

1.0

-0.8

-32.5

-32.5

-0.6

0.8

-1.0 -1.2 116.0

116.5

117.0

Table 1 Summary of daily CoE (DCoE) for three interpolation methods DCoE

Kriging

B-spline

Thin-plate

Mean

0.831

0.820

0.828

St. dev.

0.187

0.189

0.189

Median

0.881

0.876

0.881

index (API) to separate intense storms from less-intense ones. This method is typically applied to discriminate between different types of storm in tropical regions, but

117.5

0.6 116.0

116.5

117.0

117.5

may not be a complete indictor for the Mediterranean climate region. Incomplete observations at the day preceding the valid day of interest also make harder for the calculation of API in many cases. No sufficient evidence justifies that seasonal factors such as day of year, month or season could contribute to the effective rate estimation. This could be an interesting topic of the future research. The two parameters of the lognormal distribution are linked with the effective intensity rate. The spatial interpolation of rainfall intensity distribution is implemented by predicting the values of two parameters at arbitrary points in the landscape. The proposed method based on the

123

128

Stoch Environ Res Risk Assess (2010) 24:117–130

Table 2 Summary of quantile CoE (QCoE) for three interpolation methods Kriging

B-spline

Thin-plate

Mean

0.243

0.055

0.600

St. dev.

0.379

0.844

0.335

Median

0.344

0.293

0.749

Mean

0.574

0.489

0.585

St. dev. Median

0.184 0.623

0.398 0.604

0.388 0.698

Percentile = 25%

Percentile = 50%

Percentile = 75% Mean

0.790

0.764

0.632

St. dev.

0.087

0.137

0.271

Median

0.804

0.801

0.702

Mean

0.861

0.856

0.635

St. dev.

0.063

0.066

0.297

Median

0.865

0.862

0.740

Percentile = 90%

(b)

0.5

0

0.0 -1.0 -0.5

-2 -3

-1.5

-4 -5

B-spline

Tps

Kriging

Tps

Kriging

0.5

1.0

QCoE (90%)

0.0

0.5

-1.5 -1.0 -0.5

0.0 -0.5 -1.0

B-spline

123

B-spline

(d)

QCoE (75%)

1.0

(c)

QCoE (50%)

1.0

QCoE (25%)

1

(a)

-1

Fig. 9 Boxplots of QCoE (25, 50, 75 and 90% quantiles) obtained from B-splines, thin-plate splines and kriging interpolations

ordinary kriging performed better than its competitors: B-splines, thin-plate splines and kriging. Though all three methods performed very similarly in terms of DCoE, kriging was shown to be superior. More substantial advantage of kriging was found from the quantile coefficient of efficiency (QCoE). The comparison showed that kriging lead the most reliable and accurate estimates, followed by B-splines. Thin-plate splines fitted less-intense storms better than other approaches, but it generally produced relatively large variation in the estimates for moderate to high intense rainfall. An integrative approach is proposed for spatiotemporal downscaling rainfall data within a probability distribution framework. A peer reviewer suggests it would be interesting to compare the merits with other alternatives: (a) interpolate the aggregate rainfall at any sites using the sampled data, then assume pdf to get sub-day rainfall at any sites; (b) assume the calibrate pdf of rainfall at sample sites then interpolate the phdf to any sites. These two options do not assume the aggregate rainfall (such as daily total) is known and the comparison would be very useful to

Tps

Kriging

B-spline

Tps

Kriging

Stoch Environ Res Risk Assess (2010) 24:117–130 Fig. 10 The flow chart of the distribution-based spatiotemporal disaggregation approach

129

At each pluviometer site, two distribution parameters are estimated by the high-solution observations using the least-squares method.

Establish a relationship between the mean parameter and effective rainfall intensity by the linear regression

Interpolate the linear coefficients derived at each pluviometer site spatially.

Any site with observed daily rainfall amount and wet fraction.

assess the value of aggregate rainfall data. This would be a very interesting topic for a further research.

References Anderson B, Siriwardena L, Western A, Chiew F, Seed A Blo¨schl B (2006) Which theoretical distribution best fits measured withinday rainfall distributions across Australia? In: 30th hydrology and water resource symposium, Institution of Engineers, Australia, Launceston, Australia, 4–7 Dec 2006 Biondid R (1976) Cloud motion and rainfall statistics. J Appl Meteorol 15:205–224 Cameron D, Beven K, Tawn J (2000) An evaluation of three stochastic rainfall models. J Hydrol 228:130–149 Cameron D, Beven K, Tawn J (2001) Modelling extreme rainfalls using a modified random puse Barlett-Lewis stochastic rainfall model (with uncertainty). Adv Water Resour 24:203–211 Connolly RD, Schirmer J, Dunn PK (1998) A daily rainfall disaggregation model. Agric For Meteorol 92(2):105–117 Crane RK (1985) Evaluation of global and CCIR models for estimation of rain rate statistics. Radio Sci 20:865–879 Cressie NAC (1993) Statistics for spatial data (revised edition). Wiley-Interscience, New York Cunnane C (1978) Unbiased plotting positions—a review. J Hydrol 37:205–222 Dodov B, Foufoula-Georgiou E (2005) Fluvial processes and streamflow variability: interplay in the scale-frequency continuum and implications for scaling. Water Resour Res 28:711–728 Eagleson P (1972) Dynamics of food frequency. Water Resour Res 8:878–898 Foufoula-Georgious E, Lettenmaier D (1987) A Markov renewal model for rainfall occurrences. Water Resour Res 23:875–884 Garciaguzman A, Arandaoliver E (1993) A stochastic model of dimensionless hyetograph. Water Resour Res 29:2363–2370 Hershenhorn J, Woolhiser D (1987) Disaggregation of daily rainfall. J Hydrol 95:299–322

Obtain the distribution parameters from the interpolated model parameters.

Houze RA, Cheng CP (1977) Radar characteristics of tropical convection observations observed during GATE: mean properties and trends over the summer season. Mon Weather Rev 105:964–980 Kandel DD, Western AW, Grayson RB, Turral HN (2004) Process parameterization and temporal scaling in surface runoff and erosion modelling. Hydrol Process 18:1423–1446 Kandel DD, Western AW, Grayson RB (2005) Scaling from process timescales to daily timesteps: a distribution function approach. Water Resour Res 41:W02003. doi:10.1029/2004WR003380 Katz R (1977) Precipitation as a chain-dependent process. J Meteorol 16:671–676 Kedem B, Chiu LS, North GR (1990) Estimation of mean rain rate: application to satellite observations. J Geophys Res 95D:965– 1972 Lane LJ, Nearing MA (eds) (1989) USDA—water erosion prediction project: hillslope profile model documentation. NSERL Report No. 2, USDA-ARS National Soil Erosion Research Laboratory, West Lafayette, IN, USA Lee S, Wolberg G, Shin SY (1997) Scattered data interpolation with multilevel B-splines. IEEE Trans Vis Comput Graph 3:228–243 Legates DR, McCabe GJ (1999) Evaluating the use of ‘‘goodness of fit’’ measures in hydrologic and hydroclimatic model validation. Water Resour Res 35:233–241 Lopez RE (1997) The lognormal distribution and cumulus cloud populations. Mon Weather Rev 105:865–872 Mertens J, Raes D, Feyen J (2002) Incorporating rainfall intensity into daily rainfall records for simulating runoff and infiltration into soil profiles. Hydrol Process 16:731–739 Onof C, Mackey NG, Oh L, Wheater HS (1998) An improved rainfall disaggregation technique for GCMS. J Geophys Res Atmos 103:19577–19685 Oostindie K, Bronswijk JJB (1992) FLOCR—a simulation model for the calculation of water balance, cracking and surface subsidence of clay soils. DLO Winand Staring Centre Report No. 47, Wageningen, Netherlands Ramond WH (1997) A theoretical evaluation of the relevance of lognormal distributions for the moisture flux and wind components. Mon Weather Rev 125:3018–3023

123

130 Ross PJ (1990) SWIM: a simulation model for soil water infiltration and movement: reference manual. CSIRO Division of Soils, Townsville, Queensland, 59 pp Sivakumar B, Sharma A (2008) A cascade approach to continuous rainfall data generation at point locations. Stochastic Environ Res Rish Assess 22:451–459 Socolofsky S, Adams E, Entekhabi E (2001) Disaggregation of daily rainfall for continuous watershed modelling. J Hydrol Eng 6:300–309 Van Dijk AIJM, Meesters AGCA, Schellekens J, Brunijnzeel LA (2005) A two-parameter exponential rainfall depth-intensity

123

Stoch Environ Res Risk Assess (2010) 24:117–130 distribution applied to runoff and erosion modelling. J Hydrol 300:155–171 Vanclooster M, Viaene P, Diels J (1994) WAVE: a mathematical model for simulating water and agrochemicals in the vadose environment: reference and users manual (release 2.0). Katholieke Universiteit, Leuven Wahba G (1990) Spline models for observational data. SIAM, Philadelphia Woolhiser D, Roldan J (1982) Stochastic daily precipitation models. 2. A comparison of distribution of amounts. Water Resour Res 18:1461–1468