European Journal of Operational Research 176 (2007) 1052–1065 www.elsevier.com/locate/ejor

Stochastics and Statistics

Improving forecasting performance by employing the Taguchi method Tai-Yue Wang *, Chien-Yu Huang Department of Industrial and Information Management, National Cheng Kung University, Tainan 701, Taiwan Received 23 August 2004; accepted 3 August 2005 Available online 22 November 2005

Abstract

To satisfy the volatile nature of today's markets, businesses require a significant reduction in product development lead times. Consequently, the ability to develop precise product sales forecasts is of fundamental importance to decision-makers. Over the years, many forecasting techniques of varying capabilities have been introduced. Although various forecasting factors have been explored in previous studies, the precise extent of their influences, and the interactions between them, has never been fully clarified. Accordingly, this study adopts the Taguchi method to calibrate the controllable factors of a forecasting model. An $L_9(3^4)$ inner orthogonal array is constructed for the controllable factors of data period, horizon length, and number of observations required. An experimental design is then performed to establish the appropriate levels for each factor. At the same time, an $L_4(2^3)$ outer orthogonal array is used to treat the inherent parameters of the forecasting method as the noise factors of the Taguchi method. An illustrated example, employing data from a power company, serves to demonstrate the approach. The results show that the proposed model permits the construction of a highly efficient forecasting model through the suggested data collection method. © 2005 Elsevier B.V. All rights reserved.

Keywords: Forecasting; Taguchi method; Orthogonal array

* Corresponding author. Tel.: +886 6 275 7575; fax: +886 6 236 2162. E-mail address: [email protected] (T.-Y. Wang).

0377-2217/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2005.08.020

1. Introduction

Many forecasting approaches of differing capabilities have been introduced for a variety of applications (Ko et al., 1999; Kumar and Jain, 1999; Prybutok et al., 2000). The performance and feasibility of these approaches depend significantly upon adopted horizon lengths, accuracy of each horizon, cost of development, data period, frequency of revision, type of application, potential for automation, external and subjective data, pattern recognition capability, and the number of observations required (Delurgio, 1998). Although previous studies have addressed some of these issues, further study is still required in order to develop methods to screen key controllable factors, to calculate interactions between them, to assess their effectiveness, and to verify their optimal level settings (Radaev, 1998; Russell et al., 2000).

Decision-makers are judged upon the quality of the decisions they make, and hence they demand highly precise forecasting models. Even a small improvement in forecasting accuracy can yield considerable time and effort savings (Makridakis et al., 1982). Previous studies have investigated the relative capabilities of different forecasting methods and have considered the effects of various horizon lengths and data periods. Makridakis et al. (1982) compared the performances of several leading forecasting methods and developed general guidelines for the selection of appropriate forecasting models for various applications. Data type (i.e. micro or macro), horizon length, and data period exert a considerable effect on the performance of a forecasting model. Although data type is inherently dictated by the chosen data, horizon length and data period are controllable factors, which can be specified by the decision-maker. More recently, Radaev (1998) proposed an approach to determine the number of observations required when developing a forecasting model.

When performing data collection, it is essential to establish appropriate values for horizon length, number of observations required, and data period in order to enhance forecasting accuracy. However, previous studies have not fully explored the impact and interactions of these controllable factors when they act simultaneously. Some settings may satisfy one particular requirement, while others may produce a negative influence. The presence or absence of such interaction effects makes it difficult to determine the optimal robust settings of the controllable factors of the forecasting model (Chen, 1997).
Therefore, it is essential to develop a robust method to rank and screen controllable factors based upon a thorough investigation of their main and interaction effects. Selecting the appropriate forecasting model and then establishing suitable modeling settings is a challenging task even for the most experienced decision-makers. Consequently, these individuals generally apply a trial-and-error approach when attempting to determine an optimal combination of controllable factors in their design experiments.

The Taguchi experimental design method (Taguchi, 1986; Tan and Tang, 2001) reduces cost, improves quality, and provides robust design solutions. Khoei et al. (2002) employed this method to determine the optimal configuration of performance, quality, and cost design parameters in the aluminum recycling process. As demonstrated by Casab et al. (2003) in their study of enzyme linked immunosorbent assay (ELISA) optimization, the Taguchi method is capable of establishing an optimal design configuration, even when significant interactions exist between and among the control variables. The Taguchi method can also be applied to designing factorial experiments and analyzing their outcomes. For example, McMillan et al. (1998) employed the Taguchi method to design a two-level screening experiment in order to determine the significant factors in an environmental stress screening application. The Taguchi method has evolved into an established approach for analyzing interaction effects when ranking and screening various controllable factors. Moreover, this method is applicable to solving a variety of problems involving continuous, discrete and qualitative design variables (Lin and Tseng, 2000). Therefore, the present study adopts the Taguchi method to investigate the main effects and interactions of three controllable factors of a forecasting model, namely data period, horizon length, and number of observations required.

Noise factors, according to the Taguchi method, are factors that influence the response of a process but cannot be economically controlled. They are usually the prime sources of variation. The experimenter therefore needs a design that is affected as little as possible by noise. As to the adopted forecasting method, the parameters of certain forecasting methods have a significant effect upon the performance of the model.
Therefore, when conducting an experimental design for a forecasting model, the parameters of the adopted forecasting method (for example, the network topology of a back-propagation network, BPN, and the p, d, q orders of an autoregressive integrated moving average, ARIMA, model) may act as noise over which an experimenter has no direct control and which may vary with the adopted forecasting methods or models themselves. Experiments may be sensitive simultaneously to all forms of noise. Consequently, noise influences the forecasting objective. Previous research has shown that noise factors might affect the response of the experiments (Lin and Tseng, 2000; Roy, 1990). In this study, we are interested not only in the effect of controllable factors, but also in controlling or at least limiting noise. The Taguchi method deploys an outer orthogonal array and transforms the repetition data into another value, the signal-to-noise ratio (S/N), which is a measure of the variation present. In this study, the parameters of the adopted forecasting method are placed in the outer orthogonal array of the parameter design layout in order to obtain robust forecasting results.

When investigating the main effects and interactions of controllable factors, previous researchers have frequently attempted to reduce experimental time and costs by implementing a Taguchi inner orthogonal array (Reh and Ye, 2000; Seong et al., 2003; Syrcos, 2003). However, since this approach fails to consider the influence of noise, the experimental design is usually not robust. Therefore, the present study configures the controllable factors of the forecasting model, i.e., data period, horizon length, and number of observations required, in an inner orthogonal array. Parameters of the adopted forecasting method are then located in an outer orthogonal array so that their influence in determining optimal level settings of controllable factors can be thoroughly explored. Meanwhile, we employ the S/N and analysis of variance (ANOVA) (Roy, 1990) approach to investigate their main effects and interactions. We anticipated that the optimal combination of controllable factors and appropriate settings of each would yield a superior forecasting result.

The objectives of the present study can be summarized as follows:

1. To investigate the main effects and interactions of the specified time-series-related controllable factors in forecasting.
2. To investigate the influence of the adopted forecasting method's parameters on its model, and to adopt them as the noise factors.
3. To investigate the optimal combination of controllable factors and their level settings through a simultaneous deployment of inner and outer orthogonal arrays.
4. To permit decision-makers to specify an appropriate data period, horizon length, and number of observations required in order to establish the data collection method, and then to determine the construction of a highly efficient forecasting model.

The basic elements of the present study are presented in Fig. 1, and can be briefly described as follows:

Step 1. Determine the controllable factors of the forecasting model and the appropriate number of level settings for each factor. The data period, horizon length, and number of observations required are identified in this study.
Step 2. Determine parameters of the adopted forecasting method. These factors represent noise in the forecasting model and are configured in an appropriate outer orthogonal array.
Step 3. Construct the complete parameter design layout. The controllable factors of the forecasting model are configured in an inner orthogonal array, in accordance with the Taguchi method, while noise factors associated with the parameters of the adopted forecasting method are deployed in an outer orthogonal array. We use S/N as the determining variance index. The greater this value, the smaller the product variance around the target value. The S/N concept will be used throughout this study.
Step 4. Rank and screen selected factors. This study applies the analysis of means (ANOM) (Nelson and Dudewicz, 2002) and ANOVA to the main effects of individual controllable factors and the interactions between them for different noise conditions. By doing so, it is possible to identify the significance of each controllable factor.

Fig. 1. Outline of present research. (Flowchart; box labels: Identify the Problem; Parameter Design; Determine the Needed Factors and Level Settings of Forecasting (Control Factors); Determine Parameters of the Adopted Forecasting Method (Noise Factors); Construct Inner Orthogonal Array; Construct Outer Orthogonal Array; Construct Orthogonal Array; Illustrative Data: Taiwan Power Company; Establish Criteria of Performance; Implement an Experiment; Rank the Selected Factors; Pool Factors? (Yes: Pool Factors; No: Optimize Factors' Level Settings); Establish Forecasting Model.)

Step 5. Evaluate the selected factors. The forecasting model can be improved by widespread data collection, but the costs involved may be significant. This step, therefore, determines which data collection need not be performed, in order to reach a compromise between forecasting performance and data collection costs.
Step 6. Optimize factor level settings and establish the forecasting model. Having ranked and

screened potential controllable factors, the optimal level settings of controllable factors are determined by further analyzing their main effects and the interactions between them.

2. Methodology

This study employs the Taguchi method to rank and explore the main and interaction effects of controllable forecasting model factors. Initially, quantitative factors and level settings are determined; an orthogonal array is then adopted in order to determine appropriate treatment combinations. The response table of the design experiment is analyzed afterward in order to rank and screen controllable factors. A compromise is ultimately achieved between forecasting accuracy and data collection costs by specifying optimal level settings for each significant controllable factor.

2.1. Taguchi method

The Taguchi method is a commonly adopted approach for optimizing design parameters. The method was originally proposed as a means of improving the quality of products through the application of statistical and engineering concepts. Since experimental procedures are generally expensive and time consuming, the need to satisfy the design objectives with the least number of tests is clearly an important requirement. The Taguchi method involves laying out the experimental conditions using specially constructed tables known as "orthogonal arrays". The use of these tables ensures that the experimental design is both straightforward and consistent. With the Taguchi approach, the number of analytical explorations required to develop a robust design is significantly reduced, with the result that both the overall testing time and the experimental costs are minimized.

2.2. Determine controllable factors and their initial level settings for inner orthogonal array

In conducting an experimental design, the decision-maker is concerned primarily with establishing the relative influences of various controllable factor level settings, and with adjusting level settings so that they are less sensitive to the effects of noise. The present study specifies the data period, horizon length, and number of observations required as controllable factors in the forecasting model. These factors are then configured within an inner orthogonal array in order to explore the influence of main and interaction effects on the accuracy of the forecasting model. The level settings of these controllable factors define the solution space for optimizing the forecasting model. Moreover, these settings define the extent of the data collection required for each controllable factor.

2.3. Determine the parameters of the adopted forecasting method for outer orthogonal array

When a Taguchi experiment is conducted, numerous external factors not built into the experiment influence its outcome. These effects strongly influence the forecast and disturb its robustness. Forecasting performance also depends on the choice of a suitable forecasting method, and part of that performance is affected by the parameters of the adopted method. Studies have shown that different parameter designs of forecasting methods produce different forecasting results (Niska et al., 2004; Pitcher et al., 2002). In order to prevent forecasting deviations resulting from the adopted forecasting method itself, we treat the parameters of the forecasting method as noise factors. In fact, variation occurs mostly because of uncontrollable noise factors. By expanding the design of the experiment to include noise, optimal conditions that are insensitive to the noise factors can be found. Therefore, the noise introduced by the level settings of the adopted forecasting method must be limited when it is incorporated into the experiment. To this end, an outer orthogonal array can be deployed, which also determines the number of repetitions of each trial run.

2.4. Using Taguchi method to evaluate controllable factors and levels

The Taguchi method reduces experimental costs by using orthogonal arrays to extract useful and sufficient information from a minimum set of test data. Through an analysis of the main effects and interactions of various controllable factors, the use of an orthogonal array enables a decision-maker to screen and rank individual factors in order to establish their optimal level settings. Furthermore, orthogonal arrays ensure consistent results when different experimenters conduct the design experiment. In the Taguchi method, experimental robustness is enhanced through the systematic design of the system, its parameters, and its tolerance. This study focuses primarily upon developing an optimal parameter design, thereby enhancing the performance of the forecasting model, while simultaneously reducing associated experimental costs.

2.4.1. Construction of an orthogonal design

In a design experiment, an appropriate choice of orthogonal array depends upon the degrees of freedom of the particular experiment. Therefore, a linear graph is employed to identify suitable treatment and level settings of the array. The current study considers three controllable factors and one interaction, and therefore adopts an $L_9(3^4)$ inner orthogonal array to calibrate factor level settings. Rotating the factors' locations in the $L_9(3^4)$ array permits the investigation of other significant interaction effects between and among controllable factors. Meanwhile, we configure the noise factors, i.e., the parameters of the adopted forecasting method, in an outer array.

2.4.2. Implementation of an experiment

The $L_9(3^4)$ inner orthogonal array permits the analysis of four three-level factors. The array comprises four columns, within which each entry can be assigned to one of three different levels. Additionally, the array consists of nine rows, where each row represents a trial condition with a particular combination of factor level settings. In other words, the information required to analyze the factors and their respective level settings can be obtained from just nine experimental trials, each trial being repeated under the conditions of the outer array.

In the Taguchi approach, the controllable factors are assigned to particular columns of the inner orthogonal array. Accordingly, the present study assigns the data period (Factor A), horizon length (Factor B), and number of observations required (Factor C) to Columns 1, 2, and 4 of the adopted $L_9(3^4)$ inner orthogonal array, respectively (Table 1). Each trial is replicated several times with the different level settings of the adopted forecasting method's parameters given by the outer array. In practice, these factors can be assigned arbitrarily to any of the array's columns, provided that all combinations are included.

After assigning appropriate level settings, an S/N analysis is needed to evaluate the experimental results. In S/N analysis, the greater the S/N, the better the experimental results:

$$S/N = -10 \log_{10}(\mathrm{MSD}),$$

where MSD is the mean squared deviation from the target value of the quality characteristic. We define

$$\mathrm{MSD} = (y_1^2 + y_2^2 + y_3^2 + \cdots)/n,$$

where $y_1, y_2, \ldots$ are the results of the experiments and $n$ is the number of repetitions ($y_i$).
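As a minimal sketch of the S/N calculation, the Python snippet below evaluates the smaller-the-better ratio, taken here as $-10 \log_{10}$ of the mean squared result, for the four replicate results of trial 1 in Table 1. The function name `signal_to_noise` is illustrative. Dividing the tabulated results by 100 (treating them as percentages) is an assumption made here because it reproduces the tabulated S/N of about 26.82; the text does not state the scale explicitly.

```python
import math


def signal_to_noise(results):
    """Smaller-the-better S/N: -10 * log10 of the mean squared result."""
    msd = sum(y * y for y in results) / len(results)
    return -10.0 * math.log10(msd)


# Four replicate results of trial 1 in Table 1 (one per outer-array run).
trial_1 = [4.9129, 4.7203, 4.1752, 4.3961]

# Assumption: results are percentages and are rescaled to fractions first.
sn_1 = signal_to_noise([y / 100.0 for y in trial_1])
print(f"S/N for trial 1: {sn_1:.2f}")
```

Because the ratio is a negated logarithm of the mean squared result, a trial with smaller and more consistent errors across the noise conditions receives a larger S/N.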

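Table 1. $L_9(3^4)$ inner orthogonal array for forecasting model (BPN)

| Experiment | A (Column 1) | B (Column 2) | A×B (Column 3) | C (Column 4) | Y1 | Y2 | Y3 | Y4 | Ȳ | S/N_i |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 4.9129 | 4.7203 | 4.1752 | 4.3961 | 4.5511 | 26.821 |
| 2 | 1 | 2 | 2 | 2 | 10.6529 | 9.3897 | 9.5462 | 10.073 | 9.9154 | 20.063 |
| 3 | 1 | 3 | 3 | 3 | 13.5263 | 27.397 | 15.3083 | 11.6487 | 16.9701 | 14.869 |
| 4 | 2 | 1 | 2 | 3 | 5.0397 | 2.2619 | 7.1054 | 3.6721 | 4.5198 | 26.267 |
| 5 | 2 | 2 | 3 | 1 | 7.5895 | 7.4801 | 10.1366 | 9.4406 | 8.6617 | 21.171 |
| 6 | 2 | 3 | 1 | 2 | 10.4738 | 12.2062 | 13.5069 | 13.6117 | 12.4497 | 18.052 |
| 7 | 3 | 1 | 3 | 2 | 1.8902 | 0.9949 | 4.4845 | 3.4048 | 2.6936 | 30.426 |
| 8 | 3 | 2 | 1 | 3 | 7.5483 | 8.074 | 14.7624 | 9.2666 | 9.9128 | 19.727 |
| 9 | 3 | 3 | 2 | 1 | 10.3526 | 14.4084 | 13.5689 | 11.6791 | 12.5022 | 17.991 |

A: Data period used, B: Horizon length, C: Number of observations required. Y1 to Y4 are the results of the four repetitions of each experiment and Ȳ is their mean.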
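The balance property that makes nine trials sufficient can be checked directly on the design columns of Table 1. The sketch below, in Python, tests that every pair of columns contains each of the nine possible level pairs exactly once; the helper name `is_orthogonal` is illustrative and not part of the study.

```python
from itertools import combinations, product

# Design columns of the L9(3^4) array as printed in Table 1, one tuple per trial.
L9 = [
    (1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3),
    (2, 1, 2, 3), (2, 2, 3, 1), (2, 3, 1, 2),
    (3, 1, 3, 2), (3, 2, 1, 3), (3, 3, 2, 1),
]


def is_orthogonal(array, levels=(1, 2, 3)):
    """True if every pair of columns holds each level pair exactly once."""
    expected = sorted(product(levels, repeat=2))
    n_cols = len(array[0])
    for i, j in combinations(range(n_cols), 2):
        pairs = sorted((row[i], row[j]) for row in array)
        if pairs != expected:
            return False
    return True


print(is_orthogonal(L9))
```

This pairwise balance is what allows the average response at each level of one column to be compared without being biased by the levels of the other columns.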

2.4.3. Ranking of selected factors

The orthogonal arrays described above permit the systematic testing of various combinations of controllable factor level settings under all the combinations of noise considered. In this study, the results of each experimental run are analyzed using the ANOM approach in order to establish optimal conditions. Controllable factors are screened and ranked by ANOVA so as to analyze the main effects of each factor on the response table data. We can then identify the significance of each controllable factor for overall performance and establish appropriate level settings.

2.4.4. Pooling of factors

The ranking and screening activities described above may prompt the experimenter to pool one or more controllable factors in order to minimize costs. It may be necessary to conduct a further optimization process with the remaining factors. Because experimental costs govern the termination of ranking and screening operations, a combination of factors may also yield a satisfactory result. The experimenter should aim for a compromise between solution quality and experimental cost, specifying system parameters accordingly.

2.4.5. Optimization of factor level settings and construction of forecasting model

Following further optimization experiments with the surviving or renewed factors, we analyze the final main effects table to obtain an optimal experimental design. It is important to identify optimal factor level settings that improve average response consistency. These level settings are determined by analyzing the final response table.

2.4.6. Validation of experiment

Once the optimal condition is determined, we conduct a validation experiment. The experimental results may be implemented at a non-optimal condition in order to estimate forecast performance. The percentage mean absolute error, defined as

$$\%\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{x_i - \tilde{x}_i}{x_i} \right| \times 100\%$$

(Saab et al., 2001), is proposed as a performance index for solution quality. These indexes serve as the response table output and main effect table for the $L_9(3^4)$ experiment. The decision-maker can perform the experiment more economically by factoring in performance criteria and overall cost.
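A minimal sketch of the %MAE index in Python follows. It assumes that $x_i$ denotes the observed value and $\tilde{x}_i$ the corresponding forecast, which the text does not spell out. The function name `percent_mae` and the sample numbers are made up for illustration and are not data from the study.

```python
def percent_mae(actual, forecast):
    """Mean of |(x_i - x~_i) / x_i| over all i, expressed in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs((x - f) / x) for x, f in zip(actual, forecast))
    return total / len(actual) * 100.0


# Made-up values for illustration only (not taken from the study).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]

# Relative errors are 10%, 5% and 0%, so the index is about 5%.
print(f"%MAE = {percent_mae(actual, forecast):.2f}")
```

Since each error is divided by the observed value, the index is undefined when an observation equals zero, which is not a concern for positive quantities such as electricity sales.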

3. An illustrated example

To illustrate the methodology presented in this study, we selected the Taiwan Power Company (ROC) for analysis. The data extend over a 180-month period, January 1987–December 2001. Forecasts are based on the BPN forecasting method, which has been extensively adopted in a variety of applications. BPN-based forecasting models tend to yield superior forecasting solutions for many applications (Abhijit and Hojjat, 2003; Kalaitzakis et al., 2002; Kyong and Ingoo, 2000).

3.1. Data collection

A steadily increasing population, coupled with demands for an increased standard of living and an emphasis on large-scale industrialization, has prompted a steep increase in Taiwan's energy consumption. Therefore, an effective forecasting approach is essential if investment planning in power generation and distribution infrastructures is to prove efficient and cost-flexible. The data considered in the present case study relate to Taiwan Power Company, a state-owned enterprise serving the island of Taiwan. In recent years, the installed nameplate capacity, energy generation, and total energy sales of the Taiwan Power Company have been rising steadily. Table 2

Table 2. Status of Taiwan Power Company at year-ends 1998–2002

| Description | 1998 | 1999 | 2000 | 2001 | 2002 | Average growth rate (%) |
|---|---|---|---|---|---|---|
| Installed nameplate capacity (MW) | 26,680 | 28,480 | 29,634 | 30,136 | 31,915 | 4.4 |
| Energy production (million KWH) | 1430 | 1458 | 1565 | 1581 | 1659 | 4.6 |
| Sales (million KWH) | 1281 | 1317 | 1424 | 1436 | 1512 | 5.0 |

describes the recent status of this enterprise. As can be seen, the total installed nameplate capacity at the end of 2002 was 31,915 MW (megawatts), which represented a 5.9% increase over the previous year. Meanwhile, the average growth rate over five years was 4.4%. Energy production reached 1659 million KWH (kilowatt hours) by the end of 2002, a 4.9% increase from 2001. Finally, total energy sales amounted to 1512 million KWH, representing a 5.3% increase over the previous year. The average growth rate of electricity demand exceeds the average growth rate of Taiwan's gross national product (3.3%). Hence, electrical supply has become a leading economic index in Taiwan.

3.2. Determine level settings of controllable factors

The collected data include monthly, quarterly, semi-annual, and annual data, following the study of Makridakis et al. (1982). Because yearly data can be represented by monthly, quarterly and half-yearly data, the data period level settings were fixed as monthly, quarterly, and semi-annual (1, 3, 6 months). We chose horizon length level settings of one to three years (12, 24, 36 months) because energy consumption forecasting involves long-term capital expenditure. In order to obtain objective forecasting results, sufficient data are required for training the BPN. On the other hand, the data must be timely in order to keep older data from compromising forecasting performance. Therefore, the level settings for the number of observations required were set to 96, 120, and 144 months. Controllable factor level settings were specified as follows: Factor A {1, 3, 6}, Factor B {12, 24, 36}, and Factor C {96, 120, 144}. A summary of factor level settings is provided in Table 3.

Table 3. Parameter level settings for forecasting model (month)

| Name of parameter | Level settings (month) |
|---|---|
| Data period | 1, 3, 6 |
| Horizon length | 12, 24, 36 |
| Number of observations required | 96, 120, 144 |
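To make the link between the level settings and the experimental trials concrete, the Python sketch below maps each row of the $L_9(3^4)$ array (Columns 1, 2, and 4 of Table 1) to the settings of Table 3. The dictionary keys and the function name `trial_settings` are illustrative choices, not names used in the study.

```python
# Level settings from Table 3, in months.
LEVELS = {
    "data_period": (1, 3, 6),
    "horizon_length": (12, 24, 36),
    "n_observations": (96, 120, 144),
}

# Levels of Factors A, B, C per trial: Columns 1, 2 and 4 of Table 1.
L9_ABC = [
    (1, 1, 1), (1, 2, 2), (1, 3, 3),
    (2, 1, 3), (2, 2, 1), (2, 3, 2),
    (3, 1, 2), (3, 2, 3), (3, 3, 1),
]


def trial_settings(trial):
    """Return the concrete settings (in months) for trial number 1..9."""
    a, b, c = L9_ABC[trial - 1]
    return {
        "data_period": LEVELS["data_period"][a - 1],
        "horizon_length": LEVELS["horizon_length"][b - 1],
        "n_observations": LEVELS["n_observations"][c - 1],
    }


# A full factorial over three three-level factors would need 3**3 runs.
full_factorial = 3 ** len(LEVELS)
print(f"{len(L9_ABC)} trials instead of {full_factorial}")
print(trial_settings(7))
```

Trial 7, for example, corresponds to semi-annual data, a 12-month horizon, and 120 months of observations.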

3.3. Determine BPN's topology and parameters for outer orthogonal array

A BPN consists of a number of interconnected neurons distributed across a specific number of discrete layers. It might be expected that a greater number of neurons would yield a superior forecasting result. However, in practice, an excessive number of neurons prevents the network from converging and unnecessarily lengthens the training time. Conversely, an overly simplified network has an adverse effect upon forecasting performance. BPN's topology and parameters were therefore considered noise factors and replicated using an outer orthogonal array. For the outer orthogonal array, experienced judgment and systematic forecasting expertise are needed to reduce noise. In this study, we specified the variables and factor levels of the outer orthogonal array in advance in order to determine and isolate the effect of each noise factor. Accordingly, they are arranged in an $L_4(2^3)$ outer orthogonal array, where the three essential factors correspond to the number of hidden layers, the number of neurons in each hidden layer, and the learning rate. Using this array, each noise factor can be specified at one of two different level settings (Table 4). When the $L_4(2^3)$ array is deployed, the number of hidden layers (Factor A), the number of neurons in the hidden layer (Factor B), and the learning rate (Factor C) are assigned to Columns 1, 2, and 3 of the outer array, respectively (Table 5).

3.4. Construct orthogonal designs and implement an experiment

As described in Section 2.4.2 ("Implementation of an experiment"), after an appropriate number of factor level settings had been established, the controllable factors of the forecasting model were assigned to Columns 1, 2, and 4 of the $L_9(3^4)$ inner orthogonal array (shown in Table 1), while the interaction between Factor A and Factor B was assigned to Column 3. Table 1 also shows the response results for the nine experimental trials of the inner array; each trial is replicated four times with different level settings

Table 4. Noise factors (BPN's topology and parameters) and levels

| Variables | Level 1 | Level 2 |
|---|---|---|
| A: number of hidden layers | 1 or 2 | 3 or 4 |
| B: number of neurons in hidden layer | Number of input neurons × 1.0 | Number of input neurons × 1.5 |
| C: learning rate | <0.5 | >0.5 |

Table 5. $L_4(2^3)$ outer orthogonal array for BPN noise factors

| Experiment | A (Column 1) | B (Column 2) | C (Column 3) |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 2 |
| 3 | 2 | 1 | 2 |
| 4 | 2 | 2 | 1 |

A: number of hidden layers; B: number of neurons in hidden layer; C: learning rate.

Table 6. Analysis of the time series related factors experiment (ANOVA)

| Column | Factors | f | S | V | P |
|---|---|---|---|---|---|
| 1 | Factor A | 2 | 6.8718 | 3.4359 | 3.37 |
| 2 | Factor B | 2 | 185.8200 | 92.9110 | 91.18 |
| 3 | Interaction A × B | 2 | 0.9073 | 0.4537 | 0.45 |
| 4 | Factor C | 2 | 10.1890 | 5.0945 | 5.00 |
| | All other/error | 0 | 0 | 0 | 0 |
| | Total | 8 | 203.7881 | | 100% |

of BPN's topology and parameters from the outer array for the three controllable factors and their interaction. The results of the S/N analysis appear in the right-hand column of Table 1, where

$$S/N = -10 \log_{10}(\mathrm{MSD}) \quad \text{and} \quad \mathrm{MSD} = (y_1^2 + y_2^2 + y_3^2 + y_4^2)/4,$$

and $y_1, y_2, y_3, y_4$ are the results of each experiment of the $L_9(3^4)$ array.

3.5. Rank and screen the selected factors

Having conducted the experimental design, we surveyed the main effects in order to rank controllable factors. ANOVA establishes the relative significance of individual factors and interaction effects. The symbols in Tables 6 and 7 are defined as follows:

f: degrees of freedom
S: sum of squares
V: mean squares (variance)
F: variance ratio
P: percent contribution

In Table 6, one can read the contributions of Factor A (3.37%), Factor B (91.18%), interaction A × B (0.45%), and Factor C (5.00%) from the right-hand column. According to this result, controllable factors can be ranked as follows: Factor B, Factor C, and Factor A. One might conclude that Factor B has a stronger influence than Factors A and C and interaction A × B on the overall performance of these forecasting approaches. Factors A, C, and interaction A × B were then pooled for the optimal condition.

3.6. Pool factors

Upon analyzing the main effects, we see that Factor B contributes more than Factor A and Factor C do. These two factors can then be pooled to obtain a new ANOVA table, shown in Table 7. Factors A and C can thus be pooled when controlling them proves too difficult or too expensive. In other words, the decision-maker can arbitrarily assign level settings of Factor A and Factor C without significantly influencing the performance of the forecasting model. However, the decision to eliminate them rests with the decision-maker.

3.7. Optimize factor level settings and establish a forecasting model

If the decision-maker decides not to pool Factors A and C, we optimize their level settings and construct a forecasting model. The ANOM

approach is used to establish optimal level settings for each factor, i.e., the settings that yield the highest average S/N for each factor during the experimental trials. In the column for Factor A in Table 1, level 1 occurs in experiments 1, 2, and 3. The average effect of $A_1$ (level 1 of Factor A) is therefore calculated by averaging the S/N results of these three experiments:

$$A_1 = (SN_1 + SN_2 + SN_3)/3 = (26.821 + 20.063 + 14.869)/3 = 20.584,$$

which is the Level 1 entry for Factor A in Table 8. The average effects of the other factors are computed in a similar manner and are shown in Table 8. Table 8 also shows that the highest S/N values occur at level 3 (22.715), level 1 (27.838), and level 2 (22.847) for Factors A, B, and C, respectively. The optimal level settings are therefore Factor A: Level 3, Factor B: Level 1, and Factor C: Level 2. Coincidentally, experiment number 7 tested these conditions and produced the best result. With these optimal level settings and the %MAE defined in Section 2.4.6, one obtains a forecasting model performance index of %MAE = 2.6936. This value represents a significant improvement over the performance index of %MAE = 9.1307 obtained before the optimization process is applied. These results confirm that the Taguchi method can indeed expand the search space of the solution while BPN is deployed to search for feasible solutions, and indicate that the optimal forecasting model settings are a data period of 6 months, a horizon length of 12 months, and 120 months of required observations.

Table 7. Analysis of the time series related factors experiment (pooled ANOVA)

| Column | Factors | f | S | V | F | P |
|---|---|---|---|---|---|---|
| 1 | Factor A | (2) | (6.8718) | Pooled | | |
| 2 | Factor B | 2 | 185.8200 | 92.9110 | 31.0250 | 88.24 |
| 3 | Interaction A × B | (2) | (0.9073) | Pooled | | |
| 4 | Factor C | (2) | (10.1890) | Pooled | | |
| | All other/error | 6 | 17.9681 | 2.9947 | | 11.76 |
| | Total | 8 | 203.7881 | | | 100% |

3.8. Validation of experiment

Table 8 Analysis of the time series related factors experiment (main effect)

Factor/level                          Level 1   Level 2   Level 3
Data period (A)                       20.584    21.830    22.715
Horizon length (B)                    27.838    20.320    16.971
Interaction (AB)                      21.533    21.440    22.155
Number of observations required (C)   21.994    22.847    20.288
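The pooled ANOVA of Table 7 can be re-derived from the four sums of squares alone. The sketch below is a minimal illustration, not the authors' code: it assumes the usual Taguchi convention in which the percent contribution of a retained factor is based on its pure sum of squares, S − f·V_e, and all variable names are hypothetical. Under that assumption the computed values agree with the V, F, and P entries of Table 7 and with the unpooled contributions quoted from Table 6.

```python
# Sums of squares (S) and degrees of freedom (f) as reported in Table 7.
ss = {"A": 6.8718, "B": 185.8200, "AxB": 0.9073, "C": 10.1890}
dof = {"A": 2, "B": 2, "AxB": 2, "C": 2}
ss_total = sum(ss.values())

# Unpooled percent contributions (simple ratio S / S_total).
contribution = {name: 100.0 * s / ss_total for name, s in ss.items()}

# Pool the weak sources (A, AxB, C) into the error term.
pooled = ["A", "AxB", "C"]
ss_error = sum(ss[name] for name in pooled)
dof_error = sum(dof[name] for name in pooled)
v_error = ss_error / dof_error          # error variance V_e

# Variance and F-ratio of the retained factor B.
v_b = ss["B"] / dof["B"]
f_b = v_b / v_error

# Assumed convention: percent contribution from the pure sum of squares.
pure_ss_b = ss["B"] - dof["B"] * v_error
p_b = 100.0 * pure_ss_b / ss_total
p_error = 100.0 - p_b

print({k: round(v, 2) for k, v in contribution.items()})
print(round(v_error, 4), round(f_b, 3), round(p_b, 2), round(p_error, 2))
```

The F-ratio of roughly 31 for Factor B against a pooled error with 6 degrees of freedom is what justifies concentrating calibration effort on the horizon length.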

The preceding analysis showed that Factor B is more significant than either Factor C or Factor A. If Factor A is set at any of its levels (1, 3, or 6) and Factor C is set anywhere from 96 to 144 months, the performance index of the validating experiments is %MAE = 3.9215. The difference between 2.6936 (controlling Factors A, B, and C, according to Table 6) and 3.9215 (controlling only Factor B, according to Table 7) is insignificant. Therefore, when the experimental budget is limited, the decision-maker should first rank and screen all controllable factors, then pool the insignificant factor(s), and finally focus on those having a significant effect. Fig. 2 compares the forecasting performance obtained with BPN for different numbers of calibrated factors. The first bar (%MAE = 9.1307) shows the non-calibrated performance, the second bar (%MAE = 2.6936) shows the performance with all three controllable factors calibrated, and the third bar (%MAE = 3.9215) shows the performance when only Factor B is calibrated.

Fig. 2. Comparison of forecasting performance of different controllable factors (bar chart of percentage mean absolute error: Non-Calibrated, 9.1307; Calibrating Factors A, B, C, 2.6936; Calibrating Factor B, 3.9215).

3.8.1. Evaluation and comparison

The forecasting ability of the multiplicative seasonal autoregressive integrated moving average (SARIMA) model built in this section is evaluated and compared with that of the proposed BPN model using the same database. To formulate and fit a SARIMA model to the problem described above, the SARIMA (p, d, q)(P, D, Q)_s form is adopted (Box and Jenkins, 1976; Lim and McAleer, 1999). The general Box–Jenkins model is given by

\[ \Phi_P(B^s)\,\phi_p(B)\,(1 - B^s)^D (1 - B)^d Z_t = C + \Theta_Q(B^s)\,\theta_q(B)\,a_t. \]

It is important to investigate the extent to which the model fits the given time series. Hence, this study examined the behavior of the residuals by testing the joint null hypothesis that the residual autocorrelations are zero, i.e., that the residuals are white noise. The Ljung–Box test (Ljung and Box, 1978; Prybutok et al., 2000) for the SARIMA residual autocorrelation was used to check the adequacy of the tentatively identified model. In the Ljung–Box chi-square test, all of the p-values obtained for the model were greater than 0.05. Hence, the null hypothesis of white-noise residuals cannot be rejected at the 0.05 significance level, and the model is considered adequate. In other words, the following model is suitable: SARIMA (2, 1, 1)(0, 1, 0)_12.

Forecasts of the monthly aggregate retail sales of Taiwan Power Company were made using the above SARIMA model, and the forecasting performance of the proposed BPN model and the SARIMA model was evaluated. Fig. 3 presents a graphical comparison of these two forecasting models and indicates that the proposed BPN model outperforms the SARIMA model. The %MAE of the estimates for the five-year period (1997–2001) is 2.696 for the BPN model and 4.367 for the SARIMA model. In addition to %MAE, the mean absolute deviation (MAD) of the proposed model is also better than that of SARIMA (17.909 versus 29.002 million kWh). To compare the relative effectiveness of the two models, the Wilcoxon signed-rank test (one-tailed, at the 0.05 level of significance) was performed in a matched-pair experiment, and the rank sums of the negative and positive differences were calculated (Lehmann, 1975). The results indicated that the proposed BPN model outperformed the SARIMA model.

Fig. 3. Comparison of Taiwan Power Company sale forecasting model (line chart of actual sales with BPN and ARIMA forecasts; vertical axis: Sale (log thousand kWh), 5.65 to 5.95; horizontal axis: Date, the first half year of 1997 to the second half year of 2001, periods 1 to 10).
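As a worked instance of the general Box–Jenkins form, the selected SARIMA (2, 1, 1)(0, 1, 0)_12 model can be written out by substituting p = 2, d = 1, q = 1, P = 0, D = 1, Q = 0, and s = 12. The expansion below is a sketch that assumes the usual Box–Jenkins sign convention, \phi_p(B) = 1 - \phi_1 B - ... - \phi_p B^p and \theta_q(B) = 1 - \theta_1 B - ... - \theta_q B^q, which the text does not state explicitly. With P = Q = 0, the seasonal polynomials \Phi_P(B^s) and \Theta_Q(B^s) reduce to 1.

```latex
\[
(1 - \phi_1 B - \phi_2 B^2)\,(1 - B^{12})\,(1 - B)\,Z_t
  = C + (1 - \theta_1 B)\,a_t
\]
```

Here B is the backshift operator (B Z_t = Z_{t-1}), so the series is differenced once at lag 1 and once at the seasonal lag 12 before the AR(2) and MA(1) filters are applied.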


3.8.2. Generalization and contrast with the combination of SARIMA and Taguchi designs

To generalize the approach, the experimental procedure described above for the combined BPN and Taguchi method was repeated for a combined SARIMA and Taguchi method. Clearly, different selections of the SARIMA model will have different forecasting accuracies because of the various data periods, horizon lengths, and numbers of observations involved. For instance, if the data period is monthly or quarterly, the corresponding seasonal period s of the SARIMA model gives (p, d, q)(P, D, Q)_12 or (p, d, q)(P, D, Q)_4, respectively. This may result in a different forecasting accuracy, since different seasonal filters are applied. It is therefore important to investigate the effect of the different controllable factors. Consequently, the same controllable factors of the forecasting model (Table 3) were deployed in the inner orthogonal array (shown in Table 9). Regarding noise immunity, the outer orthogonal array was not deployed. Instead, to avoid the influence of the different SARIMA parameter sets that different researchers might choose when solving practical problems, each experimental trial of the inner array was replicated three times with different parameter sets of the SARIMA (p, d, q)(P, D, Q)_s model. The adequacy of every model was checked by the Ljung–Box test of residual autocorrelation.

The ANOM approach was used to establish the optimal level setting for each factor in a manner similar to that described above. The optimal level settings of the forecasting approach in this application were Factor A: Level 2, Factor B: Level 1, and Factor C: Level 3. Experiment number 4 tested these conditions and generated the best result. A forecasting model performance index of %MAE = 2.923 was obtained using the optimal level settings. This value represents an improvement over the performance index of %MAE = 4.420 (simulation test) obtained prior to the optimization process.

As shown in the above analysis, the proposed methodology provides a way to consider the main factors and their interactions simultaneously, and the controllable factors can be ranked and screened using the Taguchi method. Although each application has its own inherent characteristics, and it is difficult to determine a single forecasting model that suits every application, the proposed approach offers a robust design for a forecasting model with noise immunity.
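The S/N values and the ANOM level selection for the SARIMA experiment can be recomputed from the replicated results in Table 9. The sketch below is illustrative only. It assumes that Y1 to Y3 are %MAE values (the mean of experiment 4 equals the reported %MAE = 2.923) and that S/N is the smaller-the-better ratio, -10 log10 of the mean squared result, with %MAE expressed as a fraction. That formula is an assumption here, adopted because it reproduces the S/N column of Table 9 to within rounding. Function and variable names are hypothetical.

```python
import math

# Inner L9(3^4) array from Table 9; columns are A, B, AxB, C (levels 1..3).
L9 = [
    (1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3),
    (2, 1, 2, 3), (2, 2, 3, 1), (2, 3, 1, 2),
    (3, 1, 3, 2), (3, 2, 1, 3), (3, 3, 2, 1),
]

# Replicated results (Y1, Y2, Y3) for experiments 1..9, from Table 9.
Y = [
    (4.479, 4.824, 5.616), (4.801, 4.611, 4.411), (6.298, 5.109, 5.010),
    (2.604, 2.605, 3.559), (3.333, 3.021, 4.026), (2.618, 2.853, 4.780),
    (4.526, 4.488, 4.481), (4.322, 4.338, 4.368), (6.121, 6.076, 6.076),
]


def sn_smaller_the_better(results_percent):
    """Smaller-the-better S/N; inputs in percent are converted to fractions."""
    fractions = [value / 100.0 for value in results_percent]
    mean_square = sum(f * f for f in fractions) / len(fractions)
    return -10.0 * math.log10(mean_square)


sn = [sn_smaller_the_better(row) for row in Y]


def level_means(column):
    """ANOM: average S/N of the experiments run at each level of a column."""
    means = {}
    for level in (1, 2, 3):
        picked = [s for run, s in zip(L9, sn) if run[column] == level]
        means[level] = sum(picked) / len(picked)
    return means


anom = {name: level_means(i) for i, name in enumerate(["A", "B", "AxB", "C"])}

# The optimal level of each column is the one with the highest mean S/N.
best = {name: max(means, key=means.get) for name, means in anom.items()}

for name in ("A", "B", "C"):
    print(name, {lvl: round(m, 3) for lvl, m in anom[name].items()}, "->", best[name])
```

Under these assumptions the selection comes out as Factor A: Level 2, Factor B: Level 1, and Factor C: Level 3, the combination run as experiment 4.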

Table 9 L9(3^4) inner orthogonal array for forecasting model (SARIMA)

Experiment   A (col. 1)   B (col. 2)   AB (col. 3)   C (col. 4)   Y1      Y2      Y3      Mean Y   S/N_i
1            1            1            1             1            4.479   4.824   5.616   4.973    26.028
2            1            2            2             2            4.801   4.611   4.411   4.608    26.725
3            1            3            3             3            6.298   5.109   5.010   5.472    25.187
4            2            1            2             3            2.604   2.605   3.559   2.923    30.583
5            2            2            3             1            3.333   3.021   4.026   3.460    29.155
6            2            3            1             2            2.618   2.853   4.780   3.417    28.992
7            3            1            3             2            4.526   4.488   4.481   4.498    26.939
8            3            2            1             3            4.322   4.338   4.368   4.343    27.245
9            3            3            2             1            6.121   6.076   6.076   6.091    24.306

A: Data period used; B: horizon length; C: number of observations required. Y1 to Y3: replicated results.

4. Conclusion

This study employs the Taguchi method to calibrate a forecasting model comprising multiple controllable factors. By considering the main and interaction effects of the controllable factors simultaneously, the method permits efficient data collection (a prototype for data collecting) for forecasting purposes. Furthermore, by configuring the parameters of the adopted forecasting method as noise factors in an outer array, the proposed model yields a more objective forecasting result, enabling decision-makers to generate more competitive strategies.

The Taguchi method in the proposed methodology detects and delineates the interaction effects between controllable factors and expands the feasible search space of the forecasting model. It has also been shown that orthogonal arrays facilitate the ranking and screening of controllable factors and allow decision-makers to strike a compromise between experimental costs and forecasting performance. The results of the illustrated example suggest that the proposed method offers acceptable forecasting accuracy even when extensive data are unavailable for the less significant controllable factors. Treating the parameters of the adopted forecasting method as noise factors in the outer orthogonal array keeps them from interfering with forecasting performance. Meanwhile, across the various types of applications, the simultaneous deployment of inner and outer orthogonal arrays permits decision-makers to conduct a more effective and objective data collection activity.

Because each application has its own inherent characteristics, it is difficult to determine in advance which controllable factors should be calibrated and compared with typical forecasting performance. The proposed approach nevertheless offers an alternative way of building a forecasting model around data collection. We recommend that future studies extend the current controllable factors to alternative forecasting applications and extend the proposed approach to other production applications.

Acknowledgment

This research is partially supported by the National Science Council of Taiwan, ROC, under Contract No. NSC 93-2416-H-006-014.

References

Abhijit, D., Hojjat, A., 2003. Neural network model for rapid forecasting of freeway link travel time. Engineering Applications of Artificial Intelligence 16, 607–613.
Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco.
Casab, J., Orsolya, D., Anna, L., Eya, A., Lstyan, N., 2003. Taguchi optimization of ELISA procedures. Journal of Immunological Methods 223 (2), 37–146.
Chen, L.H., 1997. Designing robust products with multiple quality characteristics. Computers and Operations Research 24 (10), 937–944.
Delurgio, S.A., 1998. Forecasting Principles and Applications. McGraw-Hill, Boston.
Kalaitzakis, K., Stavrakakis, G.S., Anagnostakis, E.M., 2002. Short-term load forecasting based on artificial neural networks parallel implementation. Electric Power Systems Research 63, 185–196.
Khoei, A.R., Masters, I., Gethin, D.T., 2002. Design optimization of aluminum recycling processes using Taguchi technique. Journal of Materials Processing Technology 127 (1), 96–106.
Ko, D.C., Kim, D.H., Kim, B.M., 1999. Application of artificial neural network and Taguchi method to perform design in metal forming considering workability. International Journal of Machine Tools and Manufacture 39, 771–785.
Kyong, J.O., Ingoo, H., 2000. Using change-point detection to support artificial neural networks for interest rates forecasting. Expert Systems with Applications 19, 105–115.
Kumar, K., Jain, V.K., 1999. Autoregressive integrated moving averages (ARIMA) modeling of a traffic noise time series. Applied Acoustics 58, 283–294.
Lehmann, E.L., 1975. Nonparametrics: Statistical Methods Based on Ranks. Holden-Day, San Francisco.
Lim, C., McAleer, M., 1999. A seasonal analysis of Malaysian tourist arrivals to Australia. Mathematics and Computers in Simulation 48, 573–583.
Lin, T.Y., Tseng, C.H., 2000. Optimum design of artificial neural networks: An example in a bicycle derailleur system. Artificial Intelligence 13, 3–14.
Ljung, G.M., Box, G.E.P., 1978. On a measure of lack of fit in time series models. Biometrika 65 (2), 297–303.
Makridakis, S., Andersen, A., Carbone, R., Fildes, R., Hibon, R., Lewandowski, R., Newton, J., Parzen, E., Winkler, R., 1982. The accuracy of extrapolation (time series) method: Results of a forecasting competition. Journal of Forecasting 1, 111–153.
McMillan, A.R., Jones, I.A., Rudd, C.D., Middleton, V., 1998. Statistical study of environmental degradation in resin-transfer moulded structural composites. Science and Manufacturing (Incorporating Composites and Composites Manufacturing) 29 (7), 855–865.
Nelson, P.R., Dudewicz, E.J., 2002. Exact analysis of means with unequal variances. Technometrics 44 (2), 152–160.
Niska, H., Hiltunen, T., Karppinen, A., Ruuskanen, J., Kolehmainen, M., 2004. Evolving the neural network model for forecasting air pollution time series. Engineering Applications of Artificial Intelligence 17, 159–167.
Pitcher, T.J., Buchary, E.A., Hutton, T., 2002. Forecasting the benefits of no-take human-made reefs using spatial ecosystem simulation. Journal of Marine Science 59, S17–S26.
Prybutok, V.R., Yi, J., Mitchell, D., 2000. Comparison of neural network models with ARIMA and regression models for prediction of Houston's daily maximum ozone concentrations. European Journal of Operational Research 122, 31–40.
Radaev, N.N., 1998. Estimate of the number of observations required to check the adequacy of dose-effect models. Atomic Energy 85 (1), 60–65.
Reh, L., Ye, H., 2000. Neural networks for on-line prediction and optimization of circulating fluidized bed process steps. Powder Technology 111, 123–131.
Roy, R., 1990. A Primer on the Taguchi Method. Van Nostrand Reinhold, New York.
Russell, N.T., Bakker, H.H.C., Chaplin, R.I., 2000. Modular neural network modeling for long-range prediction of an evaporator. Control Engineering Practice 8, 49–59.
Saab, S., Badr, E., Nasr, G., 2001. Univariate modeling and forecasting of energy consumption: The case of electricity in Lebanon. Energy 26, 1–14.
Seong, J.K., Kwang, S.K., Ho, J., 2003. Optimization of manufacturing parameters for a brake lining using Taguchi method. Journal of Materials Processing Technology 136, 202–208.
Syrcos, G.P., 2003. Die casting process optimization using Taguchi methods. Journal of Material Processing Technology 135, 68–74.
Taguchi, G., 1986. Introduction to Quality Engineering. Asian Productivity Organization, Tokyo.
Tan, K.K., Tang, K.Z., 2001. Vehicle dispatching system based on Taguchi-tuned fuzzy rules. European Journal of Operational Research 128, 545–557.