Simple Disaggregation by Accurate

0 downloads 0 Views 1MB Size Report
KOUTSOYIANNIS. AND MANETAS: SIMPLE DISAGGREGATION. (also known as PAR(l)) model. Thus the overall model uses exactly the same parameter set as ...
WATER RESOURCES RESEARCH, VOL. 32, NO. 7, PAGES 2105-2117, JULY 1996

Simple disaggregationby accurate adjusting procedures Demetris Koutsoyiannisand AlexandrosManetas Departmentof Water Resources,Facultyof Civil Engineering,National TechnicalUniversity,Athens,Greece

Abstract. A multivariatedisaggregation methodis developedfor stochasticsimulationof hydrologicseries.The methodis basedon three simpleideasthat havebeen proven effective.First, it startsusingdirectlya typicalPAR(l) model and keepsits formalismand parameterset,whichis the mostparsimonious amonglinear stochasticmodels.This model is run for the lower-levelvariableswithout any referenceto the knownhigher-level variables.Second,it usesaccurateadjustingproceduresto allocatethe error in the additiveproperty,i.e., the departureof the sumof lower-levelvariableswithin a period from the corresponding higher-levelvariable.They are accuratein the sensethat they preserveexplicitlycertainstatisticsor eventhe completedistributionof lower-level variables.Three suchprocedureshavebeen developedand studiedin this paper,both theoreticallyand empirically.Third, it usesrepetitivesamplingin order to improvethe approximations of statisticsthat are not explicitlypreservedby the adjustingprocedures. The model,owingto the wide rangeof probabilitydistributionsit can handle(from bellshapedto J-shaped)and to its multivariateframework,is usefulfor a plethoraof hydrologicapplicationssuchas disaggregation of annualrainfall or runoffinto monthlyor weekly amounts,and disaggregation of eventrainfall depthsinto partial amountsof hourly or even lessduration.Suchreal-worldhydrologicapplicationshave been exploredin this studyto test the model performance,whichhasprovenvery satisfactory. 1.

Introduction

Vogel, 1984; Grygierand Stedinger,1988; Lane and Frevert, 1990] disaggregate higher-levelvariablesat one or more sites

The use of disaggregation techniquesin hydrologicapplica- to lower-level variables at those and other sites in two or more tionshasbecomemore and more widespreaddue to the ability steps. The condenseddisaggregationmodels [Lane, 1979, of thesetechniquesto increasethe time or spaceresolutionof 1982; Peteira et al., 1984; Oliveira et al., 1988; Stedingerand certainhydrologicprocesses, suchas rainfall and runoff.Sev- Vogel,1984;Stedinger et al., 1985;Grygierand Stedinger, 1988] eral disaggregationmodels have been developed,which can reducethe number of required parametersby explicitlymoddivideknownhigher-levelamounts(e.g., annual) into lower- elingfewer of the correlationsamongthe lower-levelvariables. 1½v½1 ones(e.g.,monthly,daily,½t.c.). The higher-level amounts Stepwisedisaggregationschemes[Santosand Salas, 1992; are usuallyderivedby typicalsequentiallinear modelssuchas Salas,1993,p. 19.34]performthe disaggregation alwaysin two autoregressive (AR) or autoregressive moving average partsor two seasons. For examplean annualvalueis disaggre(ARMA) or by othertypesof models.The couplingof sequengatedinto 12 monthlyvaluesby first disaggregating the annual tial modelsfor the higher-levelvariableswith disaggregation value into the first monthly and the sum of the remaining 11 modelsresultsin a multiple-scalepreservationof the stochastic months.Then the latter sum is disaggregated into the second structureof hydrologicprocesses. monthlyvalue and the sumof the remaining10 values,and so The first developeddisaggregation modelswere multivarion until all monthlyvaluesare obtained.Salas[1993,p. 19.36] ate, i.e., performed a simultaneousdisaggregationat several notesthat the Santosand Salasstepmodelisverysimilarto the sites.Typicalexamplesof suchmodelsare thoseby Valencia et al. [1985]. and Schaake[1972, 1973],Mefia and Rousselle[1976],Hoshi condensedalgorithmof Stedinger Another stepwise disaggregation approachwasproposedby andBurges[1979],Stedinger and Vogel[1984],Tao andDelleur Koutsoyiannis [1988] (see also Koutsoyiannis andXanthopoulos [1976] and Todini [1980]. These modelsexpressthe vector [1990]) and further developed by Koutsoyiannis [1992]. This containingall unknownlower-levelvariablesas a linear funcapproach, called the dynamic disaggregation model (DDM), at tion of the higher-levelvariablesof all sitesand someinnovaeach step disaggregates a given amount into two parts, the tion vatlares.Thus they attempt to reproduceall covarianc½ propertiesbetweenlower-levelvariablesas well as thosebe- variableof the nextperiodand the amountto be allocated(in steps)acrossthe remainingperiods.In thisrespect, tween lower-leveland higher-levelvariablesamong all sites subsequent DDM is very similar to the Santosand Salas stepwisedisagand time steps.This resultsin a huge numberof parameters gregation scheme. However, there are some differencesbebecauseof the large number of crosscorrelationsthat they tween the two schemes.At each step,DDM usesa nonlinear attempt to reproduce. the given amount into Different proceduresreducingthe requirednumber of pa- generationmodule that disaggregates two parts. This module adds notable mathematical complexity, rameters have been developed. The staged disaggregation but it is capable of preserving the third moments of the varimodels [Lane, 1979, 1982; Salas et al., 1980; Stedingerand ables,if they are not normally distributed.Apart from that Copyright1996by the AmericanGeophysicalUnion. generationmodule,DDM has anothermodule,whichat each step determinesthe parametersrequired for the generation. Paper number96WR00488. 0043-1397/96/96WR-00488509.00 Thismoduleis linearandis closelyrelatedto a seasonal AR(1) 2105

2106

KOUTSOYIANNIS

AND

MANETAS:

SIMPLE

DISAGGREGATION

(also knownas PAR(l)) model.Thus the overallmodeluses txl vectorof lower-level variables of all sub-periods at exactlythe sameparameterset as the PAR(l) model,which locationl (sizek); involvesthe minimum number of parameters,significantly tZ vectorof higher-level variablesat all locations(sizen). lower than direct(all-at-once)disaggregation schemes. The vectors,Z and ,Xs are relatedby the additiveproperty,i.e., In this paper we present a new method that keeps some k ideas of the DDM approachbut is simpler. This method initially retainsthe formalism,the parameterset, and the generZ tXs: tz (1) s=l ation routineof the PAR(l) model.Then it usesan adjusting procedureto achievethe consistency of lower-levelandhigherTo simplifythe notation,we can omit the subscriptprefixof level variables.The adjustingproceduremodifiesthe values generatedby the PAR(l) modelto preservethe additiveprop- the higher- and lower-levelvariablesthat correspondsto the erty. The idea of adjustingproceduresis not new in the liter- periodt, asgenerallythere is no ambiguitybecausewe refer to ature [see, e.g., Grygierand Stedinger,1988, 1990; Lane and a specificperiod. However, in some casesit is necessaryto Frevert,1990,p. V-22]. However,here we introducesomedif- refer to lower-levelvariablesof other periodsthat are closeto ferent proceduresthat have been proved,both theoretically the start or the end of the examinedperiod t, i.e., the periods and empirically,to be accuratein the sensethat theypreserve t - 1 and t + 1. In sucha casewe usethe subperiodindex0 certainstatistics (and in specialcasesthe completedistribution for the last subperiodin the previousperiod and alternatively lower-levelvariableas X/0 '-function)of lower-levelvariables.Anotheridea that is usedin denotethe corresponding we denotethe firstsubperiod of the next the presentmethod is repetition:Instead of runningthe gen- t-•X•. Similarly, thisnotation, thelowererationroutineof the PAR(l) modeloncefor eachperiod,we periodasX• +•: = t+•X/•ßExtending level variables of the previous period are denoted as run it severaltimesand choosethat combinationof generated xl_k+•,''', X l •, Xlo and those of the next period are devalues,which is in closeragreementwith the known value of 1 ß the higher-levelvariable. With this techniquewe can obtain notedasXJ,+•, X•:+2, '", X•2k The location index may be viewed in a wide manner; e.g., good approximationsof the statisticsthat are not explicitly preservedby the adjustingprocedure.Overall,the three simple differentvalues of l may describedifferenthydrologicpro(suchas contemporary rainfalland runoff)at the same ideasof the developedapproach,i.e., the PAR(l) formalism, cesses the accurateadjustingprocedures,and repetition,haveproved location.Also, l may be omitted if we refer to a single-site problemor, more generally,if there is no ambiguity.In that very efficient,aswill be shownthroughoutthis paper. The PAR(l) formalismwas selectedfor the sakeof parsi- casethere is only a scalarhigher-levelvariableZ and a vector X = [X•, ..., Xs,'", X•] r, which mony of parameters.Apparently,the techniquemay also be of lowerlevelvariables contains the lower-level variables of a specificperiodat a single coupledwith other sequentiallinear models such as PARMA(p, q). In addition,the developedadjustingtechniques location. Overall, the notation is oriented toward multivariate but it canbe easilymodifiedfor spaand the repetitionmaybe usefulin other disaggregation mod- temporaldisaggregation, tial disaggregation. elsin casethey do not explicitlypreservethe additiveproperty. We will study in detail in this paper the casewhere the This paper is structuredas follows.Section2 introducesthe seanotation and the general methodology.In section3, which is lower-levelvariablesare related by a contemporaneous the theoreticalcore of our model, we presentthree different sonalAR(1) (or PAR(l)) model,i.e., adjustingprocedures.In section4 we discussseveralcompuXs = asXs-•+ bsVs (2) tational issuesfor the applicationof the method.In section5 we give some resultsof the method for real-word situations whereasisa (n x n) diagonal matrix,i.e.,as= diag(a•,..., and in section 6 we draw some conclusions.In addition, there

a•n);bs- [b?]is a (n x n) matrixof coefficients; andVs=

are two appendiceswhere we give details of mathematical [V•, ..., V•] r is a vectorof independent (bothin timeand developmentsand proofs. location)randomvariates,not necessarily Gaussian.Equation (2) is not a structuralconstraintof the proposedmethodbut rather is a simplificationfor mathematicalconvenienceand 2. Notation and Methodology parsimonyof parameters.Higher-order PAR or PARMA can We considera specifichigher-leveltime stepor period(e.g., be usedinsteadof (2) if necessary, and a more complexdis1 year), denotedwith an indext = 1, 2, ..., anda subdivision aggregationmodel canbe formed.However,aswas explained of the periodin k lower-leveltime stepsor subperiods (e.g.,12 aboveour objectivein this studyis to build a model as simple months),each denotedwith an index s - 1, ..., k. Let a and parameterparsimoniousas possible,and this objectiveis specifichydrologicprocess(rainfall,runoff,etc.) be definedat better attainedby the PAR(l) model. The parametersthat n locationsspecifiedby an indexl = 1, ..., n. We denotethis thisspecificmodelexplicitlypreserves are (1) the meanvalues processwith the symbolsX and Z for the lower- and higher- of lower-levelvariables,i.e., the k vectorsl•s = E[Xs] of size level time step, respectively.Generally,we use uppercaselet- n each; (2) the variancesand lag-zerocross-covariances of ters for random variables, and lowercase letters for values, lower-levelvariables,i.e., the k matrices•rss=Cov [Xs, Xs] parameters,or constants.Furthermore,we usebold lettersfor E[(Xs - l•s)(Xs- l•s)r] of size(n x n) each;(3) thelag-1 arrays or vectors,and normal letters for their elements.In autocovariances of lower-levelvariables,i.e., thek vectors(i.e., particular, onefor eachsubperiod) [Cov[xls,xls_i], l = 1,..., n] =

tXls lower-level variableat periodt, sub-period s and location l;

[E[(Xls_ •s)(X's-i l i _ Cs-i)], l l - 1, '", n] of sizen each; and (4) the third momentsof lower-levelvariables,i.e., the k

tZI higher-level variableat periodt andlocation l;

vectors % = [E[(Xls- ½/s)3], l = 1, ..., n] of sizen each.

,Xs vector of lower-levelvariablesof sub-periods at all locations(sizen);

In a similar situationKoutsoyiannis [1992] showedthat the total number of second-orderparametersof this model con-

KOUTSOYIANNIS

AND

MANETAS:

figuration(i.e., parametergroupsof points2 and 3 above)for all subperiods and locationsis kn(n + 3)/2. Also, he madea comparisonwith the total numberof parametersin other disaggregationschemes,resultingthat, for instance,whenk = 12 and n = 3 the direct (all-at-once),condensed,and staged disaggregation schemesinvolvea numberof parameterstwice (for the stagedscheme)to 20 times(for the Mejia-Rousselle scheme)more than that of the PAR(l) scheme.We note however,that the parsimonyof parametersof the PAR(l) scheme involvessomerisk of leavingout importantparametersof the long-termstochasticstructureof the process. The parametersas and bsof (2) are relatedwith parameter groups2 and 3 aboveby as• '

II

b•b•= {r•- a•tr•_l,•_•a•

DISAGGREGATION

2107

wereplaceXswithX**.Equation(7) isreplaced bytheassumption that Vs is Gaussian(zero third moments). The modelsdefinedby (2) and (8)-(9) are properfor sequentialgenerationof Xs but they do not take accountof the knownhigher-levelvariablesZ and apparentlyare not disaggregationmodels.What disaggregation modelstypicallydo is to incorporateZ into the model equationsand perform a sort of conditionalgenerationusingthe givenvaluesof the higherlevel variables.The method presentedhere is different and muchsimpler.It consistsof the followingsteps: 1. Use (2) or (8)-(9) to generatedirectlysomeXs within a period (s = 1, ..., k) withoutreferenceto the givenhigherlevel variablesZ of that period.

2. Calculate 9,= Ek •s andthedistance AZ = IIz -

Cov[XJ,X•_l]/Var [XJ_i] = COV[X•, I

SIMPLE

(3)

(4)

consideringall sites(see section4 for detailsabout this distance). 3. Repeat steps1 and 2 until the distanceAZ is lessthan an acceptedlimit, and choosethe final set of Xs and Z, which

hastheminimum distance (thisstepisoptional; seesection 4_).

Theseequationsare extensions for the seasonalmodelof those 4. Apply an adjustingprocedureto correctthe chosenXs for the stationaryMarkov modelgivenby Matalasand Wallis andobtainXs suchthatZ•=• k X• = Z. [1976, p. 63]. They can also be obtainedby simplifyingthe 5. Repeat steps1-4 for all periodsthat have givenZ. equationsof the standardPAR(l) model[Salase! al., 1980,p. An adjustingproceduremay be viewedas a transformation 381; Salas, 1993, p. 19.31] for diagonal%. The matrix bs is or function

obtainedfrom bsb, r by decomposition, eithertriangularor Xs = f(X, Z, Z, statistics)

singularvalue.

Furthermore, thestatistics of thevariates V•sthatareneeded for the completedeterminationof (2) are givenby

E[Vs] = b•-•{E[Xs]- asE[Xs_•])

(5)

Var [ •] = 1

(6)

(lO)

whichgivenits argumentsreturnsthe correctlower-levelvariablesthat satisfy(1). The easiestway to expressit is to usen single-siteequations,one for eachlocation,whichmeansthat in each location the adjustmentis done separately.Specific formsof that equationthat preserveaccuratelycertain statistics are studied in the next section.

/x3[Vs] = (b?))-l('ys-a?)'ys-0

(7)

where/x3[Vs]is the vectorof third centralmomentsof Vs and the superscript (3) in a matrixindicatesthat thismatrixhasto be cubedelementby element.Equation(5) indicatesthat the variablesXs are not translatedto have zero mean, and this happensalsowithVs. Other modelsassumevariableswith zero mean and avoid equation (5). However, in our model it is desirableto use the real quantitiesfor Xs, as certain adjusting proceduresrequire the variablesto be in this form (see sections3.1 and 3.3). Equation(7) is a directextensionof that of the stationaryMarkov model given by Matalas and Wallis [1976,p. 64]. It is appropriatefor the seasonalmodeland does not requirethe matrixbs to be lower triangular. A casealternativeto (2) which has alsobeen examinedin this studyis the PAR(l) model for somenonlineartransformations of the lower-level variables, such as

X •*'= ln(Xs- c0

(8)

3.

Accurate Adjusting Procedures

In this sectionwe will presentthree adjustingprocedures alongwith their theoreticalbackground.First is the proportional procedurethat is appropriatefor lower-levelvariables with gamma distributionsthat satisfycertain constraints.Second is the linear procedure,which is more general,as it can apply to any distributionand it can preservethe first and secondmomentsregardlessof the type of the distribution. Third is the powerprocedure,whichis a modificationof the linear one for positivelower-levelvariablesand alsoincorporates, as a specialcase,the proportionalprocedure.Sinceall the examinedproceduresapplyto eachlocationseparately,in this sectionwe will use the single-sitenotation, omitting the locationindexfor the sakeof simplicity.It is emphasizedthat all the adjustingproceduresstudied can be used with any disaggregation modelthat needsadjustmentfor the variablesit generates.

wherecsis a vectorof parametersto be estimatedin a manner 3.1. Preservation of Gamma Marginal Distributions

thattheelements of X**areGaussian. Standard methods of the

three-parameter lognormal distribution such as those describedby Kite [1988,p. 72] and Stedinger et al. [1993,p. 18.15] i for each location. (amongothers)canbe utilizedto estimatecs In this casethe PAR(l) modelis

X*s= asX* s-1+ bsVs

(9)

The statisticsof interestare similarto parametergroups1-3 above,but now they are concernedwith the transformedlow-

er-levelvariables X•*.The thirdmoments (group4) are now replacedwith parameterscs. Equations(3)-(6) are stillvalidif

(Proportional Adjusting Procedure)

In a recentpaper [Koutsoyiannis, 1994] the followingproposition was studied:

Proposition 1. Let X s be independent variables with gammadistributionfunctionsand parameters% and A (s = 1, ..., k). Let alsoZ be a variable independentof X s with gammadistributionand parameters k

• '= Z •s s=l

(11)

2108

KOUTSOYIANNIS

AND

MANETAS:

and X. Then the variables

SIMPLE

DISAGGREGATION

E[(_X s - •s)(Xi - •)]. LetalsoZ bea variable independent of X s with mean k

k

E[Z] = j=l

are independentand have gammadistributionswith parameters Ksand X. The proof of thispropositionis givenbyKoutsoyiannis [1994, AppendixB]. Note that the variablesX s have the samejoint distributionfunctionas their corresponding Xs and they add up to Z. Also, note that the attainedresultcannotbe extended theoreticallyfor gammavariableswith differentscaleparameters or for normal variableseven if they obey quite similar restrictions.This propositiongivesrise to the adjustingprocedure definedby

x=2x

(13)

k

s=l

Then

3.2.

Preservation

of Second-Order

Statistics

in the General

(15)

j=l

the variables

s=l,...,k

(16)

j=l

mean values

and variance-covariance

matrix

with thoseof X s, if Xs is definedproperly,that is,

As'= rrs./rr..

(17)

where k

1988;LaneandFrevert, 1990,p. V522 ] because it provides the mostnaturalway of transformingX s into X s. The contribution of proposition1 is that it revealsa set of conditionsthat make the procedureexactin a strictmathematicalsense.Apparently, thissetof conditionsintroducesseverelimitations,i.e., (1) the variablesX s shouldbe two-parametergammadistributed,(2) they alsoshouldhavecommonscaleparameter,whichrequires that E[Xs]/Var [Xs] be constantfor all subperiods, and (3) all X s shouldbe mutuallyindependent. However, the proportional adjustmenthas been used in a wide range of casesnot obeyingthe aboveseverelimitations, with satisfactory results.Grygierand Stedinger[1988] provide an empiricalinvestigationof its performance,alsocomparingit with other adjustmentschemes. We note that their resultsmay not be directly applicablein our model, which unlike other modelsdoesnot assumean explicit dependenceof lower- to higher-levelvariables.Koutsoyiannis [1994] after an empirical investigation proposedthe followingrelaxedrestrictions in order for the proportionaladjustingprocedureto give satisfactory results,when appliedwithout explicitdependenceof lower- to higher-levelvariables:(1) the variablesX s have a distribution approaching the two-parameter gamma(e.g.,a three-parameter gamma distribution),(2) the statisticsE[Xs]/Var [Xs] are close to each other for different subperiodss, and (3) the variablesXs are correlatedwith Corr[Xs_•, Xs] not too large (a value as high as 0.60-0.70 is preservedby the procedure without problems).All theserelaxedrestrictionsare satisfied in the caseof short-scale(e.g., hourly)rainfall series,and thus the procedurewasusedfor the disaggregation of total amounts of rainfalleventsinto hourlydepthsby couplingthisprocedure with a stationaryAR(1) model.

k

tr..'= Var[Z]: • • O'sj

where2 denotes thesumof all•s. Thisis thewell-known proportional adjustingprocedure, which has been used in manyother disaggregation models[e.g.,Grygierand Stedinger,

(14)

and variance

have identical

Z

es s=l

,

'= Y; O"s. j=l

The proof of this propositionis givenin AppendixA. This givesrise to the adjustingproceduredefinedby

Xs: Xs + hs(Z - Z)

(19)

where Z denotesthe sum of all X s. This procedurewill be referred to as the linear adjustingprocedure. It is obviousfrom (17), (18), and (15) that As (s = 1, ..', k) alwaysaddup to unity,whichassures the preservationof the additivepropertyby the procedure.Furthermore,as a direct consequence of the preservationof both the additiveproperty and the variance-covariance matrix, o-s. equalsthe covariance of the lower-levelvariableX s with the higher levelvariableZ. Indeed, by multiplyingboth sidesof k

z- e.: Z (x:- 6)

(20)

j=l

by (X s - •s) and then takingexpectedvalues,we obtainthat the right-handpart of (18) equalsCoy [Xs, Z]. Thusequation (17) maybe alternatively writtenasAs=Cov [Xs, Z]/Var [Z]. The linear adjustingprocedureappearsalsoin other disaggregationmodels[e.g.,Grygierand $tedinger,1988;Lane and Frevert,1990,p. V-22], but with coefficients Xsdefinedin terms of the standard deviations

of the lower-level

variables rather

than covariances with the higher-levelvariables,as in (17). Using our notationwith oM denotingthe varianceof the lower-

levelvariable Xsand•

denoting itsstandard deviation, the

adjustingcoetficients usedby thosemodels,whichare in proportion to the standarddeviation,are

Case (Linear Adjusting Procedure)

The secondadjustingprocedureis basedon the following (21) Xs= proposition: j=l Proposition2. LetX s (s = 1,..., k) be any random variables with mean values •s = E[Xs] and variance- The differencebetween(21) and (17) is moreobviouswhenXs covariance matrix{r with elementso-si= Coy IXs, Xi] = are independent,in whichcase(17) yields

KOUTSOYIANNIS

AND

MANETAS:

SIMPLE

DISAGGREGATION

2109

whichis consequence of the PAR(l) model.Similar(but more complex)relationscanbe written for othersequentialmodels.

Jk

Afterdetermining all rrsi,weuse(18)to determine thecovari-

j=l

That meansthat the adjustingcoefficients Asfor independent variablesare proportionalto the variances, not to the standard deviations.Interestingly,a resultsimilarto (22) is obtainedin anothersituationby the approachof Pereiraet al. [1984].This situationis concernedwith the correctionof negativeflows, whichare possiblygeneratedby disaggregation models.Pereira et al. [1984]proposedan adjustingschemebasedon the min-

ancesOs..If the PAR(I) model is a goodchoice,i.e., representswell the covariancestructure,then the covariances•rs.

will add up to the historicalvarianceof the higher-level variable. Nevertheless, the procedureutilizesnot the covariances Os. but their ratios to rr..,which are not anticipatedto be

seriouslyaffectedby a possiblepoor representationof the processby the PAR(I) model.Furthermore,once As are determinedby (17) with rr..determinedasthe sumgivenin (15), imization of 5•=• (Xi - .•i)2/o'ii(equation (21)of their they will add up to unity even if there is some departure paper). It can be shownthat the solutionof this problem betweenrr..and the historicalvarianceof the higher-levelvariresultsin adjustmentsproportionalto the variancesfor those able. variablesthat are greaterthan zero. We emphasize,though,that our linear adjustmentscheme 3.3. Modification for Positive Lower Level Variables usesthe coefficients Asas definedby (17) and not thosegiven (Power Adjusting Procedure) by (22). The latter are correctonlyin the special(and unusual) A weaknessof the linear procedureis that it may resultin casethat the lower-levelvariablesare independent.Further- negativevaluesof lower-levelvariablesas we can observein more,we provein AppendixA that the coefficients Asgivenby (19), while most hydrologicalvaluesmust be positive.The (17) are the only onesthat preservethe variance-covarianceproportionalprocedurealwaysresultsin positivevariables,but matrix

as we have discussedabove,it is strictlyexact only in some specialcases.Here we will combineboth proceduresto form a third one,whichmust(1) resultin positivevaluesonly, (2) be identical to the proportionalprocedureif the related conserves both the mean values and the variance-covariance mastraintsare satisfied,and (3) be identicalto the linear procetrix of the lower-level variables. Notably, the procedure durein someareaandpreferablyin theneighborhood of mean involvesonlylinear transformations of the variables,as (19) is values,i.e., near (Xs, Z, Z) - ( •s, •., •.). linear. This leadsto the preservationof the completemultiWe observethat both procedures1 and 2 can be written in

The linear procedurewith adjustingcoefficients definedby (17) is very generaland can be appliedregardlessof the distribution function or the covariancestructureof X s. It pre-

variate

distribution

function

of the lower-level

variables

in

a common

form

single-siteproblems,if they are Gaussian.Indeed, if the higher-levelvariableZ and the initial lower-levelvariablesX s have been generatedwith Gaussiandistribution,then the adjusted variables will also have Gaussian distribution.

Since the mul-

tivariateGaussiandistributionis completelydeterminedby its

•ss =fs•,• where

mean values and the variance-covariance matrix, which are

both preservedby the procedure,the completemultivariate distributionfunctionwill be preserved. We clarifythat the proceduremustbe appliedfor the real higher-and lower-levelvariablesin order for (14) and (15) to be valid.If transformations (suchaslogarithmic)are appliedto the variables,then the inversetransformationmustbe applied beforethe procedureis used. In the abovepresentationof the procedure,we have not made any assumptionabout the covariancestructureof the lower-levelvariables,which can be any kind. In the general casethe procedure,as a stand-aloneone, needsthe covariancesof the lower-levelvariables,with the higher-levelvariableto be specified.The historicalcovariances maybe usedfor that purpose.However,if we assumea particularcovariance

(24)

rs(U, w) = u/w

(25)

for the first procedure,and rs(u, w)= 1 + X,(u- w)

(26)

for the secondprocedure. By analogywe set for the new procedure

fs(U, w) =

(27)

wheregs and Vsare parametersto be estimated.Thisequation alwaysresultsin positivelower-levelvalues and it is more generalthan (25). If the lower-levelvaluesare independent and

•slrrss = •./rr..

(28)

structure of the lower-level variables, the covariancescan be

then consistency with (25) demandsgs = Vs= 1. To examine calculatedusingthisstructure,andthereis no needto consider the consistency with (26), we linearize (27) in the neighborthem as extra parametersestimatedfrom the historicaldata. hood of the means,taking the first three terms of its Taylor This is our case,sincefor the entire disaggregation schemewe seriesabout the point (u, w) = (1/•s, 1/•s ), where •s = have assumedthat the covariancestructureis describedby a PAR(I) model.The independentparametersin the PAR(i)

modelare the lag-1covariances only(rrs,s_•).Any othercovariance o'siforj > s + 1 canbecomputed bycombining lag-1 covariances, i.e., using

=

O's,s+lO's+l,s+2,'''

O's+1,s+1,'''

, O'j-l,j

, O'j-l,j-1

• (1/•'-•')(1-

(23)

#,•+ v, + #,,r•su- v,r•,w)

By comparisonof (29) with (26) we directlyobtain

(29)

2110

KOUTSOYIANNIS

DISTRIBUTION OF LOWER-LEVEL

TRANSFORMATION

OF

LOWER-LEVEL

VARIABLES

AND

MANETAS:

SIMPLE

DISAGGREGATION

ADJUSTING Iturbe, 1985,p. 121] can be usedto determinethe momentsof PROCEDURE the untransformedvaluesgiventhoseof the transformedones:

VARIABLES

•s- Cs= exp(•* + o's*s/2) s

Normal

Linear

o'sj - exp[•s*+ •?+ (o's*s + tr•)/2][exp(o'}) - 1]

(37)

(38)

(where(38) isvalidalsoforj = s). Thento evaluateX*•,we Gamma

Proportional

calculateO's.,O'..,and •. from (18), (15), and (14), respectively. 4.

Log-normal

4.1.

Power

Figure 1. Suitability of adjustingproceduresfor the most commontypesof marginaldistributionof the lower-levelvariables.

=

= Xs/s

(30)

For independentlower-levelvariablessatisfying(28), (30) becomes

=

•.s

o's.•.

ns

tr.. •s

= ........

o'ss•.

as was required. Hence

L(u, w)= (u/w)

= (u/w)

Practical

(32)

Finally,the adjustingprocedure,whichwill be referredto as power adjustingprocedure,is

Xs-- •s(ZI2) •s/•s = •s(ZI2) (•s'/•")(•'/•s) (33)

Issues

Selectionof Adjusting Procedure

Selectionof the appropriateadjustingproceduredependson the typeof the distributionfunctionof the lower-levelvariables (see Figure 1). For normal variablesthe best choiceis the linear procedure,which preservesthe entire multivariatenormal distribution.If negativevaluesare meaningless, anygenerated negativevalue is either rejected (and the generation repeated)or setequalto zero. In the latter casethe produced error is correctedby reapplyingany of the adjustingprocedures.Particularly,if the linear procedureis chosenfor that correction,it will require someiterationsbecauseeachtime it is executedit resultsagain in a negativevalue. The iterations stopwhenthe resultingnegativevalueis very closeto zero and can be neglected.The power adjustingprocedureis alsoiterative, as was explainedin the previoussection.On the other hand, the proportionaladjustingprocedureperformsthe correction without

iterations.

For two-parameteror three-parametergamma distributed variables,all three procedurescan be used.For suchvariables we have avoided approximatetransformations,such as the Wilson-Hilferty, once our model can handle explicitly the skewhess coefficients

of the variables

without

transformations.

This adjustingproceduredoesnot preservethe additiveproperty at once.Thus its applicationmustbe iterative,until the calculatedsum of the lower-levelvariablesare equal to the givenZ. Due to the iterative applicationand the approximationsmadefor its development,the procedureis not exactin strictsense,exceptfor specialcases.However,the power adjustingproceduremaybe a usefulapproximategeneralization of the proportionalprocedureretainingthe advantageof returningpositivevalues. In casethat X s and Z have lower boundscs and c (4: 0), respectively, (33) can be directlymodifiedto

If the constraintsof the proportionaladjustingprocedureare satisfied(even in the relaxedversion),then this procedureis the bestand simplestchoice.Otherwise,we canuseone of the otherprocedures.The linear adjustmentis a goodchoiceif the probabilityof generatingnegativevaluesis adequatelysmall, whichhappensif the coefficients of variationof all variablesare relativelylow. The power adjustingprocedurehas no limitationsandwill work in any case.Its weaknesses are that it is not exactin a strictsenseand that it requiresiterationsto fulfill the additivepropertyandthusis slowerthanthe otherprocedures. For lognormalvariablesone couldfollow the samemethod as in the caseof gammavariablesand use either the linear or (34) the power procedure.However, it is more reasonableto use Xs-Cs=(Xs-cO2the logarithmictransformationof variablesand perform the generationof transformedvariables.In that casewe use the with *isnowdefinedas *is = (•,. - Cs)/(•. - c). Interestingly, powerprocedurein its linear form (35). if we takelogarithms in (34) anddenoteX*• = In (Xs - Cs), Finally,for any other type of distribution,we canproceedas Z* = In (Z - c), etc., we obtain in the closestof the abovethree cases,entrustingthe PAR(l) Xs*=Xs + Xs*(Z*- 2*) (35) modelthe taskto generaterandomvariablesfrom that specific distribution. with

X* '= s

/•s

*Is

=

o's. •.-- C

o'.. •s- cs

4.2. Repetition and Relevant Criteria

(36)

Equation (35) will be referred to as linearizedform of the

Until now we have not examinedthe preservationof the skewnessof the variables, nor the cross-correlation coefficients

amongvariablesof differentlocations.The reasonis that they

poweradjusting procedure. We mustemphasize that X*•de- are not preservedin a strict senseby the studiedadjusting pendson the moments(firstand secondorder) of the untrans- procedures,apart from very specialcases.For example,the variables. In proportional procedure preservesthe skewnessesfor twocasethat the lower-levelvariablesare assumedlognormal,the parametergammavariables(as well as the entire distribution followingequationsof the literature [e.g.,Brasand Rodriguez- functions)if the relevant constraintsare satisfied.Also, the formed variables and not those of the transformed

KOUTSOYIANNIS

AND

MANETAS:

SIMPLE

DISAGGREGATION

2111

1 n Zi - 21 n _- x/Var[Z•]

linear procedurepreserveszero skewnesses. It can be shown that the crosscorrelationsare preservedonly in two extreme cases:(1) if the variablesof differentlocationsare independent (zero crosscorrelations)and (2) if the variablesof different locationsare fully dependent(all crosscorrelationsequal to unity). In the generalcasethe adjustingproceduresgivesome approximationsof thesestatistics,which tend to be lower than the correctvaluesand maynot be adequate.Specifically, if we

(40)

whichwas finally usedin our model. Another problem that is met in almost all disaggregation modelsconcernsthe preservationof the correlationcoefficient of the first variable of a period with the last variable of the previousperiod. This problem was first reported by Lane consider thelinearadjusting procedure givenby(19),weob- [1980,1982]and discussed by Stedinger and Vogel[1984],while servethat the initial variableXs will have the correctskewness contributionsto overcomeit havebeenmadebyLin [1990]and for the subperiods. The term ,•s(Z - Z) will havemuchlower Koutsoyiannis [1992]. In our disaggregation methodwe do not skewness asimpliedbythecentrallimittheorem_ (in fact,this use statisticssuchas laggedcovariancesbetween higher-level term is a linearcombination2k variablesX, andX,). Thusthe and lower-levelvariables,which are generallyresponsiblefor sumof theterms-•s and•s(_Z- 2) willhaveskewness this problem [Stedinger and Vogel,1984]. However, we still coefficientsmallerthan that of X s (consideragainthe central expectto have this problem.The situationwith our method is limit theorem). the following:When we startthe generationat the periodt, we However, we can improve the approximationof these sta- know the lower-levelvariablesof all subperiodsof the period tisticsby repetition.The idea behindrepetitionis that of con- t - 1. Then we use the lower-level variables of the last subditional sampling.The disaggregation problemmay be viewed periodk of period t - 1 to generatewith the PAR(l) model asthe problemof generatingvectorsof lower-levelvariablesXs the lower-levelvariablesof the first subperiodof period t. At conditionalon a givenvector of higher-levelvariablesZ such this phasethe correlationbetweentheselower-levelvariables

that 23s•= • Xs = Z. An analytical solutionto that problem is preserved.However, applicationof the adjustingprocedure

would

be the

determination

of the

conditional

distribution

affectsonly the lower-levelvariableof periodt, thusintroducing bias to its correlationwith the variable of period t - 1. This biasmay be transferredto the neighboringsubperiodsas well, but with a smallermagnitude.Repetition and minimizaF X1,''',Xk Xs=Z tion of the distanceAZ is a simplesolutionto this problem s=l also,althoughmore sophisticated procedurescouldbe formuand the samplingof Xs from that distribution.Sucha solution lated. In section5 we will see that this problem is the most is possiblein limited simplecasesonly.Another solutionis the difficultto fix amongthe other problemsdiscussed above. repetitivegenerationof Xs from their unconditionaljoint distributionfunctionuntil their sumis equal to the givenZ. This 4.3. Problems Associated With the PAR Model apparentlywould require an infinite number of repetitions. There are two well-known potential problems associated The proposedalternativeis to combineconditionalsampling with adjusting,that is to generateXs from their unconditional with the PAR(l) model(aswell asanyother linear model):the function

(

)

in decomposition of thematrixds = bsbf,andthe joint distributionfunctionuntil the error in the additivecon- difficulties potentially high coefficients of skewnessof the auxiliaryvaridition is smalland to applyan adjustingprocedureto allocate ables V s. Here both problems have been fixed by weakening this error amongthe differentsubperiods.We anticipatethat the smaller this error is, the lower is the bias in the statistics the off-diagonal elements of the variance-covariancematrix. problem that are not explicitlypreservedby the adjustingprocedure. Sucha methodis appropriatefor the decomposition and can also fix the skewness problem because skewness coefFor the hypotheticalcasethat the error is zero for all locations, no bias would be introduced

to either the skewness coefficients

or the cross-correlation coefficientsamongdifferentlocations. Here we have adopteda very simplerepetition schemeas describedin section2. This methodhasbeen provenefficient, aswill be shownin section5, where we give someindicationof its performance.For each period we perform an unspecified number of generationsthroughthe PAR(l) model until the

ficientsare proportional to the inverseof matrixb•(3) (see equation(7)). Our methodadaptsan idea byMejœa and Millan [1974]quotedbyBrasandRodriguez-Iturbe [1985,p. 98]. Their idea is to strengthenthe diagonalof the matrix by repeatedly addinga positivenumber (equal to the absolutevalue of the

mostnegativeeigenvalue) to eachdiagonalelementof bsbf,

until the modifiedmatrixis positivedefinite. distance AZ '= IZ - •;1[dropsbelowa specified tolerance Before describingour methodto remedytheseproblems,we level. The tolerancelevel dependson the desiredaccuracyof mustquantifythe requirements.For the first problemit is well the results.The lower the tolerancelevel is the greater is the known that ds must be positivesemidefinite.For the second accuracyof the resultsand, also,the numberof repetitions.To problemwe must set an acceptedupper bound to the coeffiof the auxiliaryvariablesVs. To this aim, we avoid a huge number of repetitionsfor some periods, it is cient of skewness recall that in a finite sampleof sizeN the coefficientof skewadvisableto set also a maximum allowed number of repetinessCs is upper bounded [Wallis et al., 1974; Kirby, 1974; tions. Todini, 1980]by The distance AZ can be defined as the Euclidean norm N-2

AZ = • (Z•- 2•)2

However, it is preferableto determinethe distancein a manner dimensionless andindependentof the numberof locations, such as

C....=x/N•• •

(39)

/=1

(41)

Thus an appropriateacceptableupper bound for the model would be

C.... -- ES....

(42)

2112

KOUTSOYIANNIS

AND

MANETAS:

where0 < e < 1 (e.g.,e = 0.5-0.75). A weakness of thisbound is its dependenceon the size,N of the samplethat will be generated.However,it is sufficientlyjustifiedwhile any other boundwould be arbitrary. Now we can proceedto the algorithmof the method,which has the followingsteps: 1. Select a constante and calculatethe acceptedupper bound of the skewnesscoefficientsof the auxiliaryvariables using(41) and (42) for N equal to the size of the synthetic samplethat will be generated. 2. Selecta constant{bcloseto (and lessthan) 1, e.g.,{b= 0.99.

3. Compute the eigenvaluesof ds. If any eigenvalueis negative(which meansthat ds is not positivesemidefinite), proceedto step 4; otherwisego to step 5. 4. Multiply all the off-diagonalitemsof dsby the constant {band return to step 3. 5. Calculatebs usingthe singularvalue decomposition. 6. Calculatethe skewness coefficients of Vsusing(7). If any coefficientis greater (in absolutevalue) than the accepted bound, then return to step4; otherwisego to step7. 7.

Done.

The singularvalue decomposition(step 5) has been preferredbecauseit generallyresultsin lowercoefficients of skewnessthan triangulardecomposition. We emphasizethat the method doesnot modify the variancesnor the coefficientsof skewnessof the lower-levelvariables.However, if step 4 is executed even once, the method results in a reduction of the

SIMPLE

DISAGGREGATION

mented in our model that is orientedtoward multivariateapplications.

5.

Applications

In this sectionwe presentthree applicationsof the model indicatingits performancein preservingthe statisticsof interest. Application 1 is involvedwith the simulationof monthly rainfall at three rain gauges,Aliartos, Mouriki, and Kallithea, located in the Lake Iliki basin near Athens, Greece. The mean

annual rainfall depth for the three rain gaugesrangesfrom about 420 mm to 610 mm, and the mean monthlydistribution has a peak at December rangingfrom 70 to 100 mm and a minimumat Julyor Augustaround5 mm. The hydrologicyear for thisregionis consideredto starton October1 (thusmonth 1 in Figure2 is October).In the summermonthsthe marginal distribution of rainfall is quite skewed (Figure 2b). The monthly cross-correlationcoefficientsreach a value of 0.8 for autumnand springmonths(Figure 2d)while the lag-1 autocorrelation coefficientsare low as usual for monthly rainfall (Figure 2c). For this applicationwe have adoptedthe gamma distribution functionfor both annualand monthlyrainfall depths,an assumptionconsistentwith the data (as verified by statistical testsand empirically;seeexamplein Figure3). We generated 8000yearsof syntheticannualrainfallfor the three rain gauges usinga multivariateAR(1) modelandthenwe disaggregated it into monthlyamounts.The specificassumptions for the disaggregation are summarizedin Table 1. To explore the two differentmodel modeswe performedtwo runs of the model, onewithoutrepetition(case1.1) and onewith repetition(case 1.2). In addition,we performeda run of the PAR(i) model

lag-zerocrosscorrelationsbetweenvariablesof differentlocations,but thisisunavoidable.In our applications thishappened very rarely, and when it happened,the problemwas usually resolvedin two to four iterations,which meansan insignificant withoutdisaggregation for the sakeof comparison (case1.3). 2- 4% reduction in cross correlations. Someresultingstatistics of the syntheticdatafor all threecases are depictedin Figure 2 in comparisonwith thoseof historical 4.4. Random Number Generation data. We observethat the model, even without repetition, The model incorporatesnumerousexact random number approachesacceptablythe statisticsof interest. The model generators,avoidingthe use of approximategenerators(such performancein preservingthe coefficientsof skewness and the asthe onebasedon the Wilson-Hilfertytransformation). Thus cross-correlation coefficients is improvedwhen we userepetifor normal (and log-normal)variablesthe polar coordinates tion. Generally,the performanceof the modelwith repetition exact algorithmis used [Papoulis,1990, p. 266]. For gamma is very closeto that of the typicalPAR(l) model (without variableswe have usedthe Whittaker[1973] exactalgorithm. disaggregation), whichis a standardvalid modelfor suchpurWe have alsodeveloped(seeAppendixB) a new exactalgo- poses.Furthermore, in Figure 3 we comparegraphicallythe rithm for the gammadistribution,whichis very efficient,espe- empirical distributionof syntheticdata for one location and ciallyfor very highcoefficients of skewness. We note that both one subperiod(March) with the corresponding empiricaldisalgorithmsfor gamma distributionare exactonly for the vari- tribution of historicaldata and the assumedtheoreticalgamma ables that are directly generated, in our case the auxiliary distributionfunction.The resultsof the comparisonare satisvariablesVs. This meansthat the strict exactnessis lost when factory. Notable is the reasonablebehavior of the synthetic we add severalvariablesto determine Xs. For the latter the seriesat the tails, e.g., at probabilitiesof exceedanceor nonmodelmanagesto preservethe first three momentsand gives exceedancelessthan 5% (or normalizedvalueshigherthan adequateapproximations of the distributionfunction(seeFig- 1.65and lessthan -1.65, respectively). A departureappearing ure 3). An alternativeto preserveexactlythe distributionof Xs at the very end of the upper tail, i.e., at normalizedvalue 3.66 is to useanotherclassof modelssuchasgammaautoregressive (corresponding to empiricalexceedance probability1/8000),is (GAR) or periodic gamma autoregressive(PAGAR, PM- causedby one point only,whosevalue is about500 mm (the GAR) models.Suchmodelshavetheoreticallystudiedfor sin- maximumgeneratedvalue).The secondin order point (corregle-sitestationaryprocesses [Lawranceand Lewis, 1981] and spondingto empiricalexceedanceprobability2/8000 and norsingle-siteperiodicprocesses [Fernandez and Salas,1986]with malizedvalue 3.48) is in well agreementwith the theoretical gammamarginaldistributionsand havebeenappliedfor mod- distribution.The probabilitythat one out of 8000 March rainelingof streamflows [e.g.,Fernandezand Salas,1990].To the fall depthsexceedsor equals500 mm was theoreticallyestiauthors'knowledge,similarmodelshave not been constructed mated to about0.25, whichmeansthat it is not so unlikelyto for multivariateprocesses (for example,Salas[1993,pp. 19.22, have one generatedvalue of 500 mm. 19.27] includesthesemodelsonly in the sectionsabout modApplication 2 is involvedwith the simulationof monthly eling of singletime series),and thus they were not imple- discharge of the riverNile at Aswan(onelocation).In a similar

KOUTSOYIANNIS

AND MANETAS: SIMPLE DISAGGREGATION

situation,Todini[1980]studiedthiscasethoroughly, using105 years of naturalizeddischarges. His main objectivewas to preservethe skewness of monthlysynthetic datagenerated by disaggregation models.From the tablesand figuresof that

,_,

60.0

'•

50.0

1:3 30.0 •

studywe extractedthe statisticsof interestfor our model. The

ß

20.0 lO.O

mean monthlydistributionof dischargehas a peak of about

23,000m3/sat September anda minimum of about1,600m3/s at May. The hydrologic yearis considered to starton July1

....

1

2

4

5

6

7

8

9

10

11

12

5.0

(b)

:

3.0 2.0

tion, we haveadoptedthe gammadistributionfunctionandthe

1.0

linearadjusting procedure(seeTable 1). We generated10,000 yearsof synthetic annualandthenmonthlydischarges for three cases,oneby disaggregation withoutrepetition(case2.1),one by disaggregation with repetition(case2.2), and onewithout disaggregation (for comparison; case2.3). As depictedin Figure 4, the modelperformance is as satisfactory asin application1. Repetitionimproves themodelperformance in preservingskewness (Figure4b). Also,it reducesthe departureof the syntheticvalue of the first lag-1 autocorrelationcoefficient fromthehistorical value(Figure4c,at month1),whileincreas-

0.0

1

2

3

4

5

6

7

8

9

10

11

12

Month,

0

0.0 1

ing the departurefrom the historicalvalue in month 2. How-

2

ever,thisdepartureis stillapparentin Figure4c,whilebeyond the third subperiodthe linesthat representthe historicaland syntheticvaluesare indistinguishable. In a similar situation, Koutsoyiannis [1992,Figure4] foundafter a numericalinves-

3

4

5

6

7

8

9

10

11

12

Month, s

i

tigation that a value of this autocorrelationcoefficientcloseto

!(d).

i

0.4

0.5 is the mostdifficultto preserve,whereasvaluescloserto zero or 1 are approximated better.Note that in application2 the first lag-1 autocorrelationcoefficientis about 0.5. To explorethe modelperformanceunder the logarithmic transformation of variablesand the power adjustingproce-

0.2 0.0 1

2

3

4

5

6

7

8

9

10

11

12

Month, s

Legend

dure,we madeanotherapplication (aPpliCation 3; seeTable 1), againfor the Nile Riverat Aswan.Thisis similarto appli-

almostzero or slightlynegative,we set skewness equalto a smallpositivevalue(0.01).In Figure5 we presentthe statistics of the logarithms of the generateddata (8000years)versus

3

Month,

(thusmonth1 in Figures4 and5 isJuly).Duringthe monthsof low flowsthe coefficients of skewness are high,exceeding 2.0 for April and May (Figure4b). The monthlylag-1autocorrelation coefficients are veryhigh,reachinga valueof 0.939for Februaryand March (Figure4c). As in the previousapplica-

cation2 exceptthat it assumes lognormaldistributionof lowerlevel variables,usesthe transformation (8), and applies•the poweradjustingprocedure.This applicationmustbe viewed not asa real'worldapplicationbut ratherasan exerciseof our methodundersomespecialconditions.As we did not havethe historicalrecordof discharges available,we usedthe method of momentsto obtainthe necessary statistics of the logarithms of discharges. For subperiods2 and 3, whoseskewnesses were

2113

Historical



Disaggregationwithoutrepetition(Case 1.1) *

'Disaggregationwith repetition (Case 1.2)

...... -a......PAR(l) withoutdisaggregation(Case 1.3)

Figure 2.

Comparison of various statisticsof historic and

generated. monthlyrainfalldata at the Iliki basin(application

1;simulation periodis8000years):(a) standard deviation for location3, (b) coefficient of skewness for location3, (c) lag-1 autocorrelationcbefficientfor location3, and (d) lag-zero cross-correlation

coefficient for locations 2 and 3.

thoseof historical data.We Observe thatthe disaggregation modelwithoutrepetitiondoesnot perform as well as in the procedure. This application is concernedwith short-scale previoustwo applications.The reasonfor this lower achieve- (hourly),one-dimensional temporaldisaggregation of rainfall ment is the featureof the poweradjustingprocedure,whichis with J-shapedgammadistributionof lower-levelvariables. not strictlyexact,as was explainedabove.However,this situ-

ationisgreatly improved whenweuserepetition, inwhichcase

we achieveperformance similarto the previousapplication. 6. Concluding Remarks In Conclusion, the aboveapplications indicatea reasonable The developeddisaggregation methodis basedon three performanceof the modelin reproducing summarystatistics simpleideas that have been proven effective.First, it starts andmarginaldistribution functionsof the processes examiried. usingdirectlya sequentialPAR(i) modeland keepsits forWe note,though,that onlymodelverification wasdone,not malismand parameterset,whichis the mostparsimonious. validation.For example,the behaviorof syntheticseriesin This modelis run for the lower-levelvariableswithoutany reproducinglong-termdroughtswas not examined.

reference to theknownhigher-level variables. Second, it uses

Finally,werefertoKoutsoyiannis [1994]for an application of accurateadjustingproceduresto allocatethe error in the ada specialcaseof the modelusing.the.proportional adjusting ditive property,i.e., the departureof the sum of lower-level

2114

KOUTSOYIANNIS

AND

MANETAS:

SIMPLE

DISAGGREGATION

Table 1. Summaryof Characteristicsof the Model Applications Simulation

Assumed

Period, years

Case

Procedure

Marginal Distribution

Transformation Type

Adjusting Procedure

for

Maximum

Correctionof NegativeValues

Maximum

Allowed DistanceAZ*

Allowed Repetitions

1.1 1.2 1.3

8 000 8 000 8 000

gamma gamma gamma

none none none

Application1 linear linear ...

linear linear not applied

-.. 0.4 ......

1 1000

2.1 2.2 2.3

10 000 10 000 10 000

gamma gamma gamma

none none none

Application2 linear linear -.-

linear linear not applied

... 0.01 ......

1 3000

3.1 3.2

8 000 8 000

lognormal lognormal

logarithmic logarithmic

Application3 power power

not necessary not necessary

--. 0.0007

1 3000

Application1 is a simulationof rainfall data from three rain gaugesin the Iliki basin,Greece(historicrecordlength,38 years).Applications 2 and 3 simulateNile River dischargeat Aswan(historicdata adaptedfrom 105-yearrecordof Todini[1980]).

*Following a previous versionof theprogram, AZ wascalculated in theseapplications by theexpression AZ = •nl=1 [Z 1 _ 2 l /Zt instead of the more appropriateexpression (40). The differentorder of magnitudeof AZ in thesethree applications is due to the differentmagnitudes of standarddeviationsof the higher-levelvariables.

variableswithin a period from the correspondinghigher-level variable. They are accuratein the sensethat they preserve explicitlycertainstatisticsof lower levelvariables(and in special casesthe completedistribution).Three suchprocedures have been developedand studiedin this paper both theoretically and empirically.Third, it usesrepetition in order to improve the approximationsof statisticsthat are not explicitly preservedby the adjustingprocedures. The model,owingto the wide rangeof probabilitydistributionsit can handle(from bell-shapedto J-shaped)and to its multivariateframework,is usefulfor a plethora of hydrologic applicationssuchas disaggregation of annualrainfall or runoff into monthly or weekly amounts,or disaggregationof event rainfall depths into partial amounts of hourly or even less duration. Such real-world hydrologicapplicationshave been exploredin thisstudyto testthe modelperformance,whichhas been proven satisfactory. The main advantagesof the model are the simplicity,parsimony of parameters, and mathematical and computational convenience(due to the simplicityof equationsand the reduc-

-•'søøø ........ i i ...... i i 3000

2000 1000 0

I

2

3

4

5

6

7

8

9

10

11

12

Month, s ,-,

2.5

.•. 2.0 "

1.5 1.0

0.5 0.0 -0.5

I

2

3

4

5

6

7

8

9

10

11

12

Month, s

:: .

:. ,_.,__;.......

.•. 0.t5

-

-

o

500

0

450 •'

400



350

.........

0.2 •

...................

2

3

4

5

6

7

8

9

10

11

12

Month, s 250

'• 200

,==,

Legend Historical

o



50

...

Disaggregationwithout repetition (Case 2.1)

0 -4

-3

-2

-1

0

I

Normal

2

reduced

3

Disaggregationwith repetition (Case 2.2)

4

PAR(l) withoutdisaggregation(Case 2.3)

variate

Figure 3. Theoretical gamma distribution function (thick solidline) and empiricaldistributionfunctionsof historicdata (triangles)and syntheticdata obtainedby disaggregation with repetition (thin line) for application1 (case 1.2; simulation period is 8000 years),location3, and subperiod6.

Figure 4. Comparison of various statisticsof historic and generatedmonthlydischarges for the Nile River (application 2; simulationperiod is 10,000years): (a) standarddeviation, (b) coefficientof skewness, and (c) lag-1 autocorrelation coefficient.

KOUTSOYIANNIS

,_,

AND

MANETAS:

SIMPLE

DISAGGREGATION

2115

thosewe studied in section5, takes only a few minutes to an

0.7

hour on a modern(486) PC. The secondissueis the adequacyof the PAR(l) schemein e 0.4 hydrologicproblems.The adoptionof the PAR(l) schemehas .,., 0.:3 someadvantagesand somedisadvantages. The advantagesare • 0.2 o. 1 the structural simplicity and the parsimony of parameters, o.o which often lead to a more accurateestimationof parameters. lO 11 12 I 2 3 4 5 6 7 8 9 The disadvantagesare related with the risk of leaving out importantexplanatoryvariablesand the misspecification of the Month, s long-term stochasticstructure of the process.However, the (b) incorporationof the PAR(I) module into a disaggregation 0.5.................................... r-- :................................................................ scheme,whosehigher-levelvariablesare generatedindepen0.0 ' ; '--5---'---a .... = dently (possiblyusing a more complexmodel), reducesthe i : effectsof these disadvantages. The lower-levelvariablesgenerated by the PAR(l) model are adjustedwithin the disaggre:: i • i gationschemeto agreewith the higher-levelvariables.Thuswe 9 10 11 12 I 2 3 4 5 6 7 8 do not haveaccumulationof the possibleerrorsdue to model Month, s misspecification. Surely,thereare situationswherethe PAR(l) model may not work well, such as snowmelt-dominated streamflowswith a large time delay of the snowmelt.For such :• 0.8 situationsthe adoptionof anothermodel,suchasPARMA(p, *• 0.6 q), for the lower-levelvariableswill be more appropriate.As '•' 0.4 b was discussed above,the PAR(I) module is not a structural o 0.2 constraintof the presenteddisaggregation frameworkand can 0.0 be replaced. 10 11 12 I 2 3 4 5 6 7 8 9 :

:

:

Month, s

Appendix A: Proof of Proposition 2 Legend

Taking expectedvaluesin (16) and using(14), we directly Historical

obtain

Disaggregationwithoutrepetition(Case 3.1)

E[Xs] = E[Xs]

Disaggregationwith repetition(Case 3.2)

(A1)

whichprovesthe firstpart of the propositionthat concernsthe Figure 5. Comparison of various statisticsof historic and equalityof the meanvalues.Thususing(14) and (A1), we can generated monthly dischargesfor the Nile River using loga- write (16) as rithmic transformation(application 3; simulationperiod is 8000 years):(a) standarddeviationof logarithms,(b) coeffi- (Xs- E[Xs]) = (Xs - E[Xs]) cient of skewness of logarithms,and (c) lag-1 autocorrelation coefficientof logarithms. /=1

tion in the sizeof matrices).Most of all, the model combines simple individual modules in a general framework. Certain items of the model can be easilymodifiedor replacedto cover specificneeds,without changingthe framework.For example, the studiedPAR(l) schemecanbe directlyreplacedby a more complicatedlinear scheme(e.g., PARMA(p, q)) that maintains a greater set of statistics.In addition,certain modulesof the method, suchas the accurateadjustingprocedures,can be incorporatedin anyother disaggregation model that makesuse of adjustment. As incidentalcontributionsof the paper we mention here the algorithm for fixing the problems associatedwith the PAR(l) model (correctionof nonpositivesemidefinitematricesand of very high coefficients of skewness for the auxiliary variables)and a newalgorithmfor generationof randomnumbers with gamma distribution. Finally, we wishto commenton two issuesthat may giverise to criticism.The first is the speedof the model. Indeed, repetition reducesthe model speed.However, due to the overall simplicityof the modeland the exponentiallyincreasingspeeds of computers,a typical10,000-yearrun with repetition,suchas

Denoting X; = X s - E[Xs], X; = X s - E[Xs], Z' =

Z - E[Z], andintroducing thevectors X' = [X[, --., X•,] r

•, = [•, '", •,]r, X = [3•, '", 3v,]r andq•= [l,'--, 1] (a row vector with k elements)we can write (A2) in matrix notation

as

X' = X' - Xq•X' + Z'X

(A3)

Then transposing both partsof (A3) and postmultiplying with (A3), we get

(A4)

After algebraicmanipulations, (A4) becomes

-{-•.lhO • t• tTIho T•.T-- •.lho (Z ' • t) •.T-{-•.(Z ' • iT)

-- •.(Zt•tT)l•T•.Tq-Zt2•.•.T

(A5)

Taking expected values in (A5) and observing that

E[•'• 'T] = o',E[Z'•'] = 0 (a vector withzeroelements,

2116

KOUTSOYIANNIS

AND

MANETAS:

SIMPLE

DISAGGREGATION

owing totheindependence of-•sandZ) andE[Z '2] = 00.., we that •