Meteorologische Zeitschrift, Vol. 22, No. 5, 561–567 (published online January 2014) © by Gebrüder Borntraeger 2013

Open Access Article

The relationship between the state of the stratosphere and the occurrence of meteorological drought in Poland

Krystyna Pianko-Kluczynska*

Institute of Meteorology and Water Management – National Research Institute, Warsaw, Poland

(Manuscript received December 26, 2012; in revised form August 9, 2013; accepted October 18, 2013)

Abstract

Meteorological drought has a strong impact on many areas of life. This paper examines the phenomenon during the warm and cold halves of the year at 11 synoptic stations in Poland. Five classes of drought are specified by the value of the standardized precipitation index (SPI). The article attempts to verify the hypothesis of a connection between the class of drought and changes in the stratosphere. The state of the stratosphere is defined based on temperature, geopotential height, and the zonal wind component at levels from 10 hPa to 250 hPa. Potential predictors are selected based on their correlation with the SPI. The information capacity of the full set of predictors is compared with the capacity of the selected explaining variables. The self-organizing map (SOM) method was used to determine the input stratospheric patterns for each drought class. Then, using a fuzzy classification, drought classes were reconstructed based on the pre-defined patterns. Next, the quality of the stratosphere–drought model was verified for various configurations, indicating the best model parametrisation. The results are promising. However, it is crucial that the validation be carried out separately for each synoptic station and updated as extra data are added.

Keywords: Stratospheric Patterns, Drought, SOMs, Fuzzy Grouping.

Introduction

Building an effective seasonal weather forecasting model is an interesting and important, but very difficult, task. Customers for such products are particularly interested in information about the risk of weather events that are unfavourable to the environment and that affect many areas of their lives. Meteorological drought (resulting from a periodic deficiency of rainfall in a given location) is just such a severe weather phenomenon. Unfortunately, there is no clear definition of meteorological drought (AGNEW, 2000). In this paper, the concept refers to the nature of the seasonal precipitation for each year (KANECKA-GESZKE and SMARZYNSKA, 2007). It is therefore important to determine the patterns in the atmosphere that accompany different drought classes. The quality of the results depends on the choice of input data, the proposed methods of searching for the patterns and the reconstruction of the output data. Many authors suggest that to understand the causes of weather phenomena, the stratosphere needs to be examined (MAYCOCK, 2008), and that the polar region is particularly important (MAYCOCK, 2008; HORSTMAN, 2003). The correctness of the selection of input variables can be assessed by various indices, for example, Hellwig's information capacity (HC) (HELLWIG, 1969). To describe the relationship between the weather and the different parameters of the atmosphere, self-organizing maps (SOMs) (SKUBALSKA-RAFAJLOWICZ, 2000), as well as fuzzy logic (DIKBAS et al., 2012), are often used.

This paper is the first in a series of studies that focus on the analysis of the relationship between drought classes and the dynamics of the stratosphere. Characteristics of rainfall and the state of the stratosphere are defined for the warm (April–September) and cold (October–March) halves of the year.

Data

Information on the state of the stratosphere comes from the NCEP/NCAR re-analysis (KALNAY et al., 1996),¹ which is available four times a day for the period 1948–2011 on a 2.5° × 2.5° grid. Two domains are examined for the input information: BR60 – a large sub-polar domain (90°N–60°N; 40°W–77.5°E), and PL – the local domain covering Poland (45°N–55°N; 10°E–25°E).

* Krystyna Pianko-Kluczynska, Institute of Meteorology and Water Management – National Research Institute, Podlesna 61 str, 01-673 Warsaw, Poland

¹ NCEP re-analysis data is provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, from their Web site at www.esrl.noaa.gov/psd/.

DOI 10.1127/0941-2948/2013/0446

0941-2948/2013/0446 $ 3.15 © Gebrüder Borntraeger, Stuttgart 2013


K. Pianko-Kluczynska: Stratosphere and meteorological drought in Poland

The state of the stratosphere is defined based on temperature, geopotential height, and the zonal wind component at levels of 10 hPa, 20 hPa, 30 hPa, 50 hPa, 70 hPa, 100 hPa, 150 hPa, 200 hPa, and 250 hPa. The relationship between the predictors and predictands is described by data-driven models (SOLOMATINE et al., 2008), using applied mathematics. This is an unconventional approach in which the form of the input information is far from that typically used by agro-meteorologists and climatologists. The input (explaining) variable is the number of cases (for each domain, point, level, and meteorological variable in a half of the year) in which the six-hour temporal change in variable Z is negative (that is, a temperature decrease that is not related to daily variability, a reduction of the geopotential height, or a decrease in U, the zonal wind component). The temporal change of a variable is defined as $\Delta Z(A, t) = Z(A, t + \Delta t) - Z(A, t)$, where $\Delta t$ is a six-hour time step and A is the set of attributes of variable Z: A = {D, V, H, fi, la, HY}, where D is the domain, V is the meteorological element, H is the level, fi is the latitude, la is the longitude, and HY is a half of the year. A negative change in U, the zonal wind component, requires additional comment: U describes not only the wind speed but also its direction (the component is negative for wind blowing from east to west).

One of the most popular indices defining the drought class (the precipitation deficit) during a period is the Standardized Precipitation Index (SPI) (MCKEE et al., 1993). In this paper, a modified SPI is used, which employs a transformation of precipitation (GASIOREK and MUSIAL, 2011). The SPI is determined for 11 Polish synoptic stations representing different climatic conditions (Poznan, Krakow, Warszawa, Koszalin, Zielona Gora, Wroclaw, Lublin, Chojnice, Gdansk, Lodz, and Suwalki). The index is based on the six-month precipitation totals observed at each station.
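The construction of a single predictor can be illustrated with a minimal sketch. It assumes one six-hourly series for a fixed set of attributes A and simply counts the steps with $\Delta Z < 0$; the function name and the sample values are hypothetical, and the removal of daily variability mentioned above is not modelled here.

```python
from typing import Sequence


def count_negative_changes(series: Sequence[float]) -> int:
    """Count time steps where Z(t + dt) - Z(t) < 0 in a six-hourly series."""
    return sum(1 for prev, curr in zip(series, series[1:]) if curr - prev < 0)


# Hypothetical six-hourly temperatures (K) at one grid point and level.
temps = [215.0, 214.2, 214.9, 213.5, 213.5, 212.8]
n_negative = count_negative_changes(temps)
print(n_negative)
```

In the setting described above, one such count would be produced for every combination of domain, grid point, level, meteorological element, and half-year.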

Methodology

The following steps were undertaken:

(a) The calculation of the SPI values and their classification: The procedure covers the period 1948–2011 (the cold half of 2010 ends in 2011) and the selected locations. The SPI determines the following classes of meteorological drought: (0) no drought (SPI > 0); (1) mild (SPI from −0.99 to 0); (2) moderate (SPI from −1.49 to −1.00); (3) strong (SPI from −1.99 to −1.50); and (4) extreme (SPI ≤ −2.00).
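The class boundaries of step (a) can be written as a small lookup. This is only a sketch: the function name is hypothetical, and it assumes that SPI values falling between the two-decimal bounds (for example −0.995) belong to the milder neighbouring class, which the text does not specify.

```python
def drought_class(spi: float) -> int:
    """Map an SPI value to the drought classes 0-4 of step (a)."""
    if spi > 0.0:
        return 0  # no drought
    if spi > -1.0:
        return 1  # mild: -0.99 to 0
    if spi > -1.5:
        return 2  # moderate: -1.49 to -1.00
    if spi > -2.0:
        return 3  # strong: -1.99 to -1.50
    return 4      # extreme: SPI <= -2.00


# Hypothetical half-year SPI values for one station.
classes = [drought_class(v) for v in (0.8, -0.3, -1.2, -1.75, -2.4)]
print(classes)
```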

Meteorol. Z., 22, 2013

(b) Identification of the variables with the strongest connection to the SPI: The analysis is performed separately for each station and for the warm and cold half-years. Its purpose is to limit the amount of processed input information (input variables). This is achieved by imposing a condition: the correlation of a variable with the predictand must be greater than a threshold (0.3 for the domain PL and 0.4 for BR60).

(c) Hellwig's information capacity (HC): This indicator (HELLWIG, 1969) is applied to determine the strength of the relationship between the state of the stratosphere and the SPI. Hellwig's measure takes into account not only the correlation of each input variable with the output, but also the system overhead due to the correlation between input variables. Let n be the number of all available describing variables $X_1, X_2, \ldots, X_n$, and let $X_j$, $X_i$ ($i, j = 1, 2, \ldots, n$) be any pair of variables describing variable Y. They can be correlated, which means that $X_j$ contains information from $X_i$. Let $r_j$ denote the correlation between $X_j$ and Y, and $r_{ij}$ the correlation between the explanatory variables $X_j$ and $X_i$. The individual contamination of a describing variable can now be defined as:

$$g_j = \frac{1}{n-1} \sum_{i \neq j} |r_{ij}|$$
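As a rough illustration of how the contamination $g_j$ enters Hellwig's measure, the sketch below computes $g_j$, the individual capacity $r_j^2 / (1 + (n-1) g_j)$ and their sum HC, as this step defines them. It is a minimal sketch in plain Python with a hand-written Pearson correlation; the function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def hellwig_capacity(predictors: List[Sequence[float]], y: Sequence[float]) -> float:
    """Integral information capacity HC of a set of predictors for y."""
    n = len(predictors)
    total = 0.0
    for j in range(n):
        r_j = pearson(predictors[j], y)
        # Individual contamination g_j: mean |r_ij| over the other predictors.
        g_j = 0.0
        if n > 1:
            g_j = sum(
                abs(pearson(predictors[i], predictors[j]))
                for i in range(n) if i != j
            ) / (n - 1)
        # Individual capacity h_j = r_j^2 / (1 + (n - 1) * g_j).
        total += r_j ** 2 / (1.0 + (n - 1) * g_j)
    return total


# Toy data: x2 is uncorrelated with x1, and y equals x1.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, -1.0, 1.0]
hc_all = hellwig_capacity([x1, x2], x1)
hc_duplicated = hellwig_capacity([x1, x1], x1)
print(hc_all, hc_duplicated)
```

Duplicating a predictor adds no capacity in this example: the two copies contaminate each other fully, so each contributes only half of what the single variable would.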

The ideal situation is when $g_j = 0$. In the case of total contamination ($g_j = 1$), the variable $X_j$ does not provide additional knowledge about Y. Based on $g_j$, the individual information capacity can be calculated:

$$h_j = \frac{r_j^2}{1 + (n-1)\, g_j}$$

Next, two integral information capacities are calculated: $HC(n) = \sum_{j=1}^{n} h_j$ for the set of all potential input variables (the size of the set is n), and HC(lpr) for the predictors selected by the correlation (the size of this set is lpr). These two values are then compared. If HC(lpr) > HC(n), the selected predictors carry more information about the predictand than the set of all potential explanatory variables. If HC(lpr) < HC(n), the removed input variables carry important information about drought.

(d) Pattern detection using SOMs: This method is an Artificial Intelligence model (a neural network trained without a teacher). It is applied to extract the stratospheric patterns for each type of meteorological drought. From the rich set of available algorithms, Winner Takes All is chosen (the version in which the weights of the winner's neighbours are not modified) (MASTERS, 1996; SKUBALSKA-RAFAJŁOWICZ, 2000). The network structure is presented using the example of the k-th class of drought, k = 0, 1, 2, 3, 4. This type of network has two layers: the input vector of the explanatory variables, $X = (X_1, X_2, \ldots, X_{lpr})$, and the output layer of competing neurons (weights), $W_j = (W_{j1}, W_{j2}, \ldots, W_{j\,lpr})$, $j = 1, 2, \ldots, d$, where lpr is the number of predictors and d is the number of cases with the k-th drought class in the validation material. The weights are initialised randomly at the beginning of training. The model looks for the neuron that is closest to the vector X (in terms of Euclidean distance): $nw = \arg\min_i \|X - W_i\|$. The components of vector $W_{nw}$ are modified according to the formula $W_{nw,j}(t) = W_{nw,j}(t-1) + a(t)\,[X_j - W_{nw,j}(t-1)]$, where $j = 1, 2, \ldots, lpr$ and t is the number of the training epoch. The learning factor $a(t)$ controls the speed of the modification of neurons; during training, it is adjusted for epoch t according to the formula:

$$a(t) = \frac{a(t-1)}{a(t-1) + 1}$$
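The winner-takes-all update and the decay of the learning factor can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes that every training vector is presented once per epoch, that the initial value of the learning factor is used unchanged in the first epoch, and that the random initial weights are drawn from the range of the data. The function names, the default epoch count and the sample vectors are hypothetical choices for the example.

```python
import math
import random
from typing import List, Sequence


def next_learning_factor(a: float) -> float:
    """Decay rule a(t) = a(t-1) / (a(t-1) + 1)."""
    return a / (a + 1.0)


def train_wta(samples: Sequence[Sequence[float]], n_neurons: int, a0: float,
              epochs: int = 3500, seed: int = 0) -> List[List[float]]:
    """Winner-takes-all training; only the winning neuron is updated."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Assumption: random initial weights drawn from the range of the data.
    lo = [min(s[j] for s in samples) for j in range(dim)]
    hi = [max(s[j] for s in samples) for j in range(dim)]
    weights = [[rng.uniform(lo[j], hi[j]) for j in range(dim)]
               for _ in range(n_neurons)]
    a = a0
    for _ in range(epochs):
        for x in samples:
            # Winner: neuron with the smallest Euclidean distance to x.
            nw = min(range(n_neurons), key=lambda i: math.dist(x, weights[i]))
            w = weights[nw]
            for j in range(dim):
                w[j] += a * (x[j] - w[j])
        a = next_learning_factor(a)
    return weights


# Hypothetical two-dimensional training vectors for one drought class.
samples = [(0.0, 0.0), (0.2, 0.1), (10.0, 10.0), (9.8, 10.1)]
patterns = train_wta(samples, n_neurons=2, a0=0.5)
print(patterns)
```

Because neighbours are never updated in this variant, a neuron that never wins keeps its initial weights, which is one reason to repeat the experiment with several initial settings.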

Training ends after 3500 epochs, when the changes in the neurons are close to zero. At that moment, the neurons $W_j$ become the patterns $C_{k,j}$ for the k-th drought class ($j = 1, 2, \ldots, c$, where c is the number of patterns). The experiments are repeated with different initial values $a_0$ of the parameter $a(t)$ ($a_0$ = 0.01, 0.1, 0.25, 0.5, 0.9) and different numbers of neurons c.

(e) Fuzzy classification: For all of the material, irrespective of the drought class, the fuzzy measures of compatibility (fuzzy memberships) are calculated between the explanatory vector X and all the pre-defined stratospheric patterns C (RUTKOWSKI, 2005; DIKBAS et al., 2012). In this paper, the fuzzy memberships take the form (with m = 2):

$$U_j = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\|x - c_j\|}{\|x - c_k\|} \right)^{\frac{2}{m-1}}}$$
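The membership formula can be checked numerically with a short sketch. The function name and the sample points are hypothetical, and the handling of an input that coincides exactly with a pattern (zero distance) is an assumption added here to avoid division by zero; the text does not discuss that case.

```python
import math
from typing import List, Sequence


def fuzzy_memberships(x: Sequence[float],
                      patterns: Sequence[Sequence[float]],
                      m: float = 2.0) -> List[float]:
    """Membership of x in each pattern, following the formula above."""
    dists = [math.dist(x, c) for c in patterns]
    # Assumption: if x coincides with one or more patterns, they share
    # full membership and all other patterns get zero.
    zero_hits = [d == 0.0 for d in dists]
    if any(zero_hits):
        share = 1.0 / sum(zero_hits)
        return [share if hit else 0.0 for hit in zero_hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
            for d_j in dists]


# Hypothetical input vector and two patterns at distances 1 and 3.
u = fuzzy_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
best = max(range(len(u)), key=lambda j: u[j])
print(u, best)
```

With m = 2 the memberships are inversely proportional to the squared distances and sum to one, so the nearer pattern receives 0.9 and the farther one 0.1 in this example.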

The pattern that shows the greatest compliance with the input information determines the class to which the object is assigned.

(f) Verification of the reconstruction of the drought classes: Very high compliance is expected between the reconstructed and observed values of the output variable. The quality of the model determines the optimal number of patterns (based on the validation material) and indicates the best model parameterisation (based on the verification material; this is done separately for each station). For this purpose, two research tools are used:

– Jaccard's index, which describes the strength of the relationship between the occurrence of event A (the state of the stratosphere) and event B (the class of drought). It is the ratio, expressed as a percentage, of the number of simultaneous occurrences of events A and B to the number of cases in which at least one of the events A or B occurs (REAL & VARGAS, 1996). This measure was used to determine the correctness of the reconstruction of the situation when there was no drought (drought class "0").

– Success rate: the percentage of correctly reconstructed drought classes based on data from the stratosphere.

Figure 1: Number of variables for the domain BR60, after selection based on the correlation, that describe the drought in the warm half of the year. The states of the stratosphere come from the cold and warm seasons.
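The two verification measures of step (f) can be sketched in a few lines. The sketch makes one interpretive assumption: event A is taken to mean "the model reconstructs the given class from the stratospheric data" and event B "the given class was observed". The function names and the sample class sequences are hypothetical.

```python
from typing import Sequence


def jaccard_percent(observed: Sequence[int], reconstructed: Sequence[int],
                    event: int = 0) -> float:
    """Jaccard's index (%) for one drought class, by default class 0."""
    pairs = list(zip(observed, reconstructed))
    both = sum(1 for o, r in pairs if o == event and r == event)
    either = sum(1 for o, r in pairs if o == event or r == event)
    return 100.0 * both / either if either else float("nan")


def success_rate(observed: Sequence[int], reconstructed: Sequence[int]) -> float:
    """Percentage of cases whose drought class is reconstructed correctly."""
    hits = sum(1 for o, r in zip(observed, reconstructed) if o == r)
    return 100.0 * hits / len(observed)


# Hypothetical observed and reconstructed drought classes for six half-years.
observed = [0, 0, 1, 2, 0, 1]
reconstructed = [0, 1, 1, 0, 0, 1]
jac = jaccard_percent(observed, reconstructed)
rate = success_rate(observed, reconstructed)
print(jac, rate)
```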

Results and Recommendations

The available material is divided into two parts: the training (validation) material, on which the neural network is trained, and the verification material, which is used to assess the quality of the constructed model. To identify the experiments, the label "ss_sd" is used. The variable ss represents the half of the year for the state of the stratosphere (input data) and sd the half of the year for the drought (the symbol "h" means the warm half-year and "c" the cold half-year). One of the issues in this study is that there is too much input information. There are 16,845 potential predictors for the domain BR60 (3 meteorological elements at 9 levels of the atmosphere; the domain covers 13 values of latitude and 48 values of longitude) and 945 potential predictors for the local domain PL (the smaller domain PL has only 5 latitudes and 7 longitudes). The describing variables are therefore selected based on the correlations between the predictors and the predictand (which must be strong enough). This significantly reduces the number of describing variables. Fig. 1 (models BR60_c_h and BR60_h_h), Fig. 2 (models PL_c_h and PL_h_h), Fig. 3 (BR60_c_c and BR60_h_c), and Fig. 4 (PL_c_c and PL_h_c) show the numbers of selected predictors (separately for each station). The HC confirmed that reducing the number of input variables based on their correlation with the model output is the right approach. The values obtained after


Figure 2: Number of variables for the domain PL, after selection based on the correlation, that describe the drought in the warm half of the year. The states of the stratosphere come from the cold and warm seasons.

Figure 3: Number of variables for the domain BR60, after selection based on the correlation, that describe the drought in the cold half of the year. The states of the stratosphere come from the cold and warm seasons.


The SOMs produced from 7 to 24 patterns (within an acceptable range of 5–40 patterns). Cases in which the number of patterns is less than 10 occur only occasionally. The number of patterns depends on the domain, the parameter $a_0$ and the station. Unfortunately, the stratosphere–drought relationship turns out to be strongly fuzzy. For most of the classified objects, one pattern has a membership measure significantly larger than those of the other patterns (but all measures are greater than zero, meaning that each pattern carries some information about the object). The most important measure of accuracy is the percentage of success in reproducing the output variable. Of particular interest is the ability to reconstruct drought classes for the independent part of the material. The percentage of correctly reconstructed drought classes for the validation material is more than 75%. For the verification material, the experiments achieved at least 50% accuracy at all stations. These results (the highest quality of reconstruction for the verification material) are shown in Fig. 5 and Fig. 6. Thus, it is always possible to identify a good configuration of the model for the relationship between the state of the stratosphere and meteorological drought. Another issue relates to the correctness of indicating the class "no drought" (there was no precipitation deficit, and it cannot be excluded that precipitation was even in excess). Let us examine Jaccard's index (Jac) for two of the suggested models: PL_h_c and BR60_c_h. Its values depend on the parameter $a_0$. A new variable Jac0 is defined as follows: Jac0 = 1, when for all $a_0$ the index Jac is less than 50%, Jac0 = 2, when 50  Jac