IEEE TRANSACTIONS ON ENERGY CONVERSION, VOL. 25, NO. 1, MARCH 2010


Virtual Models for Prediction of Wind Turbine Parameters Andrew Kusiak, Member, IEEE, and Wenyan Li

Abstract—In this paper, a data-driven methodology for the development of virtual models of a wind turbine is presented. To demonstrate the proposed methodology, two parameters of the wind turbine have been selected for modeling, namely, power output and rotor speed. A virtual model for each of the two parameters is developed and tested with data collected at a wind farm. Both models consider controllable and noncontrollable parameters of the wind turbine, as well as the delay effect of wind speed and other parameters. To mitigate data bias of each virtual model and ensure its robustness, a training set is assembled from ten randomly selected turbines. The performance of a virtual model is largely determined by the input parameters selected and the data mining algorithms used to extract the model. Several data mining algorithms for parameter selection and model extraction are analyzed. The research presented in the paper is illustrated with computational results.

Index Terms—Data mining, parameter selection, power prediction, virtual model, wind turbine.

I. INTRODUCTION

The large-scale wind energy industry is relatively new and is rapidly expanding [1]. The ability of a wind turbine to extract power from the wind is a function of three main factors: the measured wind speed, the power curve of the turbine, and the ability of the turbine to handle wind fluctuations [2]. Different methodologies for studying wind farms have been published in the literature [3], [4], most often physics-based and statistical approaches [5]–[9]. Due to the dynamic nature of wind, prediction of wind power is a major challenge calling for new and accurate prediction models [10]. Data mining offers algorithms for modeling wind farm performance. Numerous successful applications of data mining in manufacturing, marketing, medical informatics, and the energy industry have been reported in the literature [11]–[14]. The literature on data mining in wind energy has primarily focused on estimating and optimizing the power output. A review of the literature on forecasting wind speed and generated power using both physical models and data mining methods is presented in [15]. Models for long-term and short-term predictions of power with data mining algorithms are reported in [10]. An approach to optimizing power by controlling generator torque is discussed in [2]. Optimization of power output and operational performance is reported in [16] and [17].

Manuscript received December 19, 2008; revised May 26, 2009 and August 4, 2009. First published November 24, 2009; current version published February 17, 2010. This work was supported by the Iowa Energy Center under Grant 07-01. Paper no. TEC-00506-2008. The authors are with the Intelligent Systems Laboratory, The University of Iowa, Iowa City, IA 52242-1527 USA (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TEC.2009.2033042

Models can be built with a variety of machine learning algorithms, e.g., neural networks [18]–[21]. A neural network was applied in [22] to estimate power output as a function of the time delay of wind speed and the power itself. However, none of the published papers has focused on virtual and generalized models to predict turbine parameters of interest, e.g., the power output and the rotor speed.

Data mining algorithms construct models using large datasets. They require time-consuming data preprocessing; e.g., 80% of the analysis time may be spent on data sampling, parameter selection, and other data analysis tasks [23]. Methods such as simple random sampling and stratified sampling [24] can be used. Parameter selection is regarded as an important task in data mining, and some algorithms have been proven to be effective [25]–[28] in determining relevant parameters.

Considering the fact that wind speed and wind turbine performance vary across different turbine locations of a wind farm, the question arises as to whether a generalized model (called in this paper a virtual model) of a wind turbine could be developed. Such a virtual model is developed in this paper based on supervisory control and data acquisition (SCADA) data collected at wind turbines. As a wind turbine is a complex system, two models of a wind turbine are developed in this paper, namely, the power output model and the rotor speed model. Though the physics-based relationship between the two parameters is known, the data-derived models are not. The methodology presented here can be applied to modeling many aspects of a wind turbine. Predicting the power output demonstrates the capability of the virtual model to improve performance of a wind turbine, while predicting the rotor speed points to the utility of the virtual model to improve the lifetime of turbine components, e.g., the gearbox.
The methodology for developing virtual models (Section II) involves data preprocessing, model extraction, and model validation. Section III discusses the collection and analysis of data from 30 wind turbines. Selection of parameters and temporal analysis of input parameters for data mining are presented in Section IV. Data sampling is illustrated in Section V. Performance of models extracted with six data mining algorithms is discussed in Section VI. A neural network is selected to extract models to predict power output and rotor speed. Section VII includes computational results, with the models selected in Section VI utilizing datasets with various characteristics. Section VIII concludes the paper.

II. METHODOLOGY FOR DEVELOPING VIRTUAL MODELS

Wind turbines are equipped with sensors providing various measurements of parameters such as wind speed, power output,

0885-8969/$26.00 © 2009 IEEE Authorized licensed use limited to: The University of Iowa. Downloaded on February 21,2010 at 20:55:23 EST from IEEE Xplore. Restrictions apply.

Fig. 1. Methodology for developing virtual models.

Fig. 2. Comparison of wind speeds of four different wind turbines.

generator torque, and so on. Some of these parameters are used to monitor and control the performance of wind turbines. A data-driven approach for predicting parameters of interest based on 10 s data (i.e., high-frequency data averaged over 10 s intervals) is proposed. The methodology for developing virtual models to predict selected parameters (aspects) of wind turbines includes three phases and five steps (see Fig. 1):

Phase 1: Data preprocessing includes three steps: data collection and analysis, parameter selection, and data sampling.
Phase 2: Model extraction includes comparison of different algorithms for model construction.
Phase 3: Model validation involves analysis of computational results over a number of datasets.

The steps of the methodology outlined in Fig. 1 are described next.

Step 1 (Data collection and analysis): The turbine data, which are captured by the SCADA system, are explored. In particular, data formats and frequency are preprocessed for uniformity. Any data that are incomplete, erroneous, or missing need to be dealt with.

Step 2 (Parameter selection): Parameter selection is considered from two perspectives: domain knowledge and data mining algorithms. In terms of the domain knowledge, all the parameters of a wind turbine system are classified into three categories: 1) controllable parameters, e.g., blade pitch angle and generator torque; 2) noncontrollable parameters, e.g., wind speed; 3) turbine performance parameters, e.g., power output and rotor speed. Controllable and noncontrollable parameters constitute inputs to the virtual models, while the performance parameters are the outputs predicted by the models. The impact of input parameters on the output (performance) parameter varies. Some insignificant parameters are easy to eliminate based on the domain knowledge, while other parameters require algorithmic selection. Thus, the impact of input parameters on the performance parameter is ranked by the data mining algorithms.

Step 3 (Data sampling): Data sampling is commonly used for selecting a subset of data from a large volume of data. In this paper, data sampling is performed according to the range of wind speed, which is the only noncontrollable parameter available in the dataset. This sampling strategy leads to a data sample that is representative across different wind speed ranges.

Step 4 (Model extraction): Different data mining algorithms are used to build models. The model that performs best is selected for an in-depth study.

Step 5 (Computational analysis): In this paper, three types of datasets with different characteristics are used to evaluate the performance of the model extracted in Step 4.

III. DATA COLLECTION AND ANALYSIS

The measurements of parameters collected at turbines have relatively high frequency (e.g., 20 Hz). These measurements are then averaged over longer time intervals, such as the 10 s, 30 s, or 10 min intervals required by different applications. In this paper, 10 s data from 30 turbines collected at two time periods are used. One time period provides data for high wind speed (i.e., the average wind speed in this period was high), and the other provides data for low wind speed. The analysis performed in this research shows that the data collected on the same parameters across different turbines of the same wind farm exhibit different characteristics. To illustrate the data variability, four random turbines of the same type have been selected. Fig. 2 shows the wind speed recorded by the SCADA system of these turbines. It is clear that the range of wind speed for turbine 2 is significantly different from that of the other three turbines. The wind speed of turbine 1, turbine 3, and turbine 4 is higher than the cut-in speed of 3.5 m/s for this turbine type. The wind speed of turbine 2 is below the cut-in speed, which indicates that this turbine could not produce power, as opposed to the other three turbines. Fig. 3 shows the power curves generated using 10 min data for the four turbines, which not only look different from the ideal power curves but also show distinct characteristics. The negative power output of turbine 2 indicates that this turbine is consuming (e.g., power electronics) rather than


KUSIAK AND LI: VIRTUAL MODELS FOR PREDICTION OF WIND TURBINE PARAMETERS


TABLE I IMPORTANCE OF PARAMETERS IN PREDICTING THE POWER OUTPUT

Fig. 3. POs of four turbines based on 10 min data.

TABLE II IMPORTANCE OF PARAMETERS IN PREDICTING THE ROTOR SPEED

Fig. 4. POs of four turbines based on 10 s data.

producing energy. The power of the remaining three turbines differs in range and shape. Fig. 4 illustrates the power output generated from 10 s data. Note that the data used in Fig. 3 were obtained by averaging the data used in Fig. 4 over 10 min intervals. Although the power ranges of turbine 1, turbine 3, and turbine 4 differ, they have similar shapes. To ensure that the behavior of the turbines is adequately reflected in the model, the 10 s data are used for analysis. Of the 30 turbines considered in this research, the data from ten randomly selected turbines constitute a training set, and the data from all 30 turbines are used to test the proposed methodology.

IV. PARAMETER SELECTION

A. Impact of Various Parameters

The 10 s SCADA data provided for this research include ten parameters: the power output (PO), generator torque (GT), generator speed (GS), wind speed (WS), generator bearing A temperature (GBAT), generator bearing B temperature (GBBT), drive train acceleration (DTA), blade pitch angle (BPA), nacelle position (NP), and the rotor speed (RS). To illustrate the methodology outlined in Section II (see Fig. 1), the following two performance parameters have been selected: power output and rotor speed. Though the relationship between the power, rotor speed, and torque is known from physics, it has not been derived from the data. Therefore, the virtual models are built for the power output and the rotor speed.

After deletion of all low-quality data of turbine 1, e.g., negative power outputs, two algorithms (boosting tree and neural network) were used to rank the parameters that could potentially be used to predict power output and rotor speed (see Tables I and II). Note that both tables include parameters that can be controlled (e.g., blade pitch angle) and those that cannot be controlled (e.g., wind speed). Based on the importance values reported in Tables I and II and the control logic of a wind turbine, two controllable parameters are selected: the generator torque and the blade pitch angle. The drive train acceleration is not selected, as it is not directly controlled and is determined by the generator torque and blade pitch angle. Thus, five parameters are used for building a virtual model: WS, BPA, GT, PO, and RS, among which WS is a noncontrollable parameter, BPA and GT are controllable parameters, and PO and RS are system performance parameters.

The data provided by the wind turbine manufacturer show that the maximum generator speed is 1600 r/min, the maximum rotor speed is 23 r/min, the maximum power output produced by the double-fed induction generator is 1600 kW, the generator torque is limited to 10,090 N·m, and the maximum generator torque change (ramp) rate is 4500 N·m/s.
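The screening of low-quality records and the averaging of 10 s data into 10 min data mentioned above can be illustrated with a minimal sketch. It assumes that each record is a dictionary keyed by the abbreviations defined in this section. The function names, the `tolerance` argument, and the use of the manufacturer limits as rejection thresholds are illustrative assumptions; the paper states only that low-quality records such as negative power outputs were deleted.

```python
from statistics import fmean
from typing import Dict, List, Optional, Sequence

# Upper limits quoted in the text (manufacturer data).
LIMITS: Dict[str, float] = {
    "GS": 1600.0,   # generator speed, r/min
    "RS": 23.0,     # rotor speed, r/min
    "PO": 1600.0,   # power output, kW
    "GT": 10090.0,  # generator torque, N·m
}

# 10 min of data corresponds to 60 consecutive 10 s averages.
SAMPLES_PER_10_MIN = 60


def is_plausible(record: Dict[str, Optional[float]], tolerance: float = 0.0) -> bool:
    """Return True if a record passes the assumed screening rules.

    A record is rejected when a limited parameter is missing, when it exceeds
    its limit by more than `tolerance` (a hypothetical fractional margin), or
    when the power output is negative.
    """
    for name, upper in LIMITS.items():
        value = record.get(name)
        if value is None or value > upper * (1.0 + tolerance):
            return False
    return record["PO"] >= 0.0


def block_average(values: Sequence[float], block_size: int) -> List[float]:
    """Average consecutive, non-overlapping blocks; a trailing partial block is dropped."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        fmean(values[start:start + block_size])
        for start in range(0, len(values) - block_size + 1, block_size)
    ]
```

Calling `block_average(power_10s, SAMPLES_PER_10_MIN)` on a gap-free list of 10 s power values (here a hypothetical variable) yields the corresponding 10 min averages.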


Fig. 5. Parameters selected for virtual models (see Section IV-A for definitions of the abbreviated parameters).

B. Impact of the Past States

Due to the dynamic nature of the wind energy conversion process, it is necessary to consider the time-based values of input parameters discussed next.

1) Impact of the Past Values of Noncontrollable Parameters: The only noncontrollable parameter considered in this research is wind speed. The boosting tree and the neural network algorithms are used to determine the importance of different past states of the wind speed v, i.e., v at time t, t − 1, t − 2, until t − 9, in predicting the power output and rotor speed. When predicting the wind turbine power, the importance of the wind speed at the previous states is arranged in time sequence. Namely, for both algorithms, the sequence of predictors is identical: v(t) ≥ v(t − 1) ≥ · · · ≥ v(t − 9). However, when predicting the rotor speed, the order of importance deviates from this time sequence toward the end of the sequence. Therefore, the values v(t), v(t − 1), v(t − 2), and v(t − 6) are selected for modeling.

2) Impact of the Past Values of Controllable and Performance Parameters: The impact of the input parameters measured at past intervals on the future state of the turbine was shown in [12]. The model governing the relationship between the past and future parameters is not known. In this paper, the values of controllable parameters at time intervals t, t − 1, and t − 2 and the performance parameters at two past intervals, t − 1 and t − 2, are used to predict the performance parameter at time t (see Fig. 5).

Fig. 5 shows the final parameter selection for modeling power output and rotor speed. Three types of input parameters and their past state data are considered:
1) Wind speed is the only noncontrollable parameter considered in this paper, with v(t), v(t − 1), v(t − 2), and v(t − 6) used in virtual models.
2) Two controllable parameters, BPA and GT, and their two immediate past states.
3) The two immediate past states of performance parameters, PO and RS.

V. DATA SAMPLING

In this section, the statistical properties of the datasets used in this research are summarized. The cumulative distributions of the wind speed, the power output, and the rotor speed are

Fig. 6. Comparison of wind speed distributions.

Fig. 7. Comparison of PO distributions.

Fig. 8. Comparison of the rotor speed distributions.

presented in Figs. 6–8. For low wind speed, 96.75% of the wind speed values are less than 12.5 m/s, and 88.5% of the power outputs are less than 1000 kW. For high wind speed, almost 18% of the wind speeds are larger than 12.5 m/s, and nearly half of the power outputs are higher than 1000 kW. The rotor speed for high wind speed values is higher than 10 r/min, while 16% of the rotor speeds for low wind speeds are less than 10 r/min. As the wind speed in the interval [3.5–13] m/s is studied, 1500 data points were randomly selected in each category of


TABLE III PERFORMANCE COMPARISON FOR MODELS EXTRACTED BY SIX DIFFERENT ALGORITHMS


TABLE IV POWER OUTPUT PREDICTION

TABLE V MAXIMUM ABSOLUTE AND RELATIVE PREDICTION ERRORS FOR THE POWER OF THE SIX SELECTED TURBINES

the wind speed data to form training data. This way the training dataset is balanced across all categories. The data from turbines 1 to 10 were used to assemble the training dataset. In total, 19 (categories) × 1500 (data instances), i.e., 28,500 data instances are used in the training dataset.

VI. MODEL EXTRACTION

The models for predicting the power output and the rotor speed are expressed in (1) and (2), respectively

$$y_1(t) = f_1\big(y_1(t-1), y_1(t-2), y_2(t-1), y_2(t-2), x_1(t), x_1(t-1), x_1(t-2), x_2(t), x_2(t-1), x_2(t-2), v(t), v(t-1), v(t-2), v(t-6)\big) \tag{1}$$

$$y_2(t) = f_2\big(y_1(t-1), y_1(t-2), y_2(t-1), y_2(t-2), x_1(t), x_1(t-1), x_1(t-2), x_2(t), x_2(t-1), x_2(t-2), v(t), v(t-1), v(t-2), v(t-6)\big). \tag{2}$$
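The assembly of the input vectors of (1) and (2) and the balanced sampling by wind speed category can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the 19 categories are assumed to be equal-width bins over [3.5, 13] m/s (i.e., 0.5 m/s each), instances are assumed to be binned by v(t), and the series are assumed to be consecutive 10 s values without gaps.

```python
import random
from typing import List, Sequence, Tuple

LAGS_Y = (1, 2)        # past states of performance parameters y1, y2
LAGS_X = (0, 1, 2)     # current and past states of controllable parameters x1, x2
LAGS_V = (0, 1, 2, 6)  # wind speed states v(t), v(t-1), v(t-2), v(t-6)
V_T_INDEX = 10         # position of v(t) in the input vector built below

Instance = Tuple[List[float], Tuple[float, float]]


def build_instances(y1: Sequence[float], y2: Sequence[float],
                    x1: Sequence[float], x2: Sequence[float],
                    v: Sequence[float]) -> List[Instance]:
    """Return (inputs, (y1(t), y2(t))) pairs with inputs ordered as in (1) and (2)."""
    n = len(v)
    if any(len(series) != n for series in (y1, y2, x1, x2)):
        raise ValueError("all series must have the same length")
    max_lag = max(LAGS_Y + LAGS_X + LAGS_V)
    instances: List[Instance] = []
    for t in range(max_lag, n):
        inputs = (
            [y1[t - k] for k in LAGS_Y]
            + [y2[t - k] for k in LAGS_Y]
            + [x1[t - k] for k in LAGS_X]
            + [x2[t - k] for k in LAGS_X]
            + [v[t - k] for k in LAGS_V]
        )
        instances.append((inputs, (y1[t], y2[t])))
    return instances


def balanced_sample(instances: Sequence[Instance], per_category: int,
                    v_min: float = 3.5, v_max: float = 13.0,
                    n_categories: int = 19, seed: int = 0) -> List[Instance]:
    """Draw up to `per_category` random instances from each wind speed bin."""
    width = (v_max - v_min) / n_categories
    bins: List[List[Instance]] = [[] for _ in range(n_categories)]
    for instance in instances:
        v_t = instance[0][V_T_INDEX]
        if v_min <= v_t <= v_max:
            # v_t == v_max is assigned to the last bin.
            index = min(int((v_t - v_min) / width), n_categories - 1)
            bins[index].append(instance)
    rng = random.Random(seed)
    sample: List[Instance] = []
    for members in bins:
        # A bin with fewer members than requested contributes all of them.
        sample.extend(rng.sample(members, min(per_category, len(members))))
    return sample
```

With `per_category=1500` and every bin sufficiently populated, the sample contains 19 × 1500 = 28,500 instances, matching the training set size stated above.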

The performance of the models (1) and (2) built by six different algorithms, specifically, random forest, neural network, boosting tree, support vector machine, generalized additive model, and k-nearest neighbors, is reported in Table III. The absolute error (AE) and relative error (RE) used in Table VIII and all other tables are defined in (3) and (4)

$$\text{Absolute error} = \left|\hat{y}(t) - y(t)\right| \tag{3}$$

$$\text{Relative error} = \left|\frac{\hat{y}(t) - y(t)}{y(t)}\right| \times 100\%. \tag{4}$$
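As a small illustration of (3) and (4), the following sketch computes both errors for a single prediction. The function names are hypothetical, and returning `None` for a zero observation is an assumption made here, since (4) is undefined in that case.

```python
from typing import Optional


def absolute_error(predicted: float, observed: float) -> float:
    """Absolute error as defined in (3)."""
    return abs(predicted - observed)


def relative_error(predicted: float, observed: float) -> Optional[float]:
    """Relative error in percent as defined in (4); None when the observation is zero."""
    if observed == 0:
        return None
    return abs((predicted - observed) / observed) * 100.0
```

The same absolute error can correspond to very different relative errors: a 5 kW miss is 5% of a 100 kW observation but 500% of a 1 kW observation.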

Based on the results in Table III, the neural network performed best among the six algorithms tested. The neural network algorithm is therefore used to train a virtual model to predict power output and rotor speed. Here, 30 different neural networks were trained, and the best performing one was selected to form a virtual model. Training took about 5 h.

VII. ANALYSIS OF COMPUTATIONAL RESULTS

In this section, the virtual model is evaluated using three types of industrial datasets. The nature of each dataset of the first type is similar to the training dataset. In fact, the training dataset is a subset of the combined set of data from 30 turbines.

Therefore, the values of the noncontrollable parameter (wind speed), controllable parameters (blade pitch angle and generator torque), and performance parameters (power output and rotor speed) share the same characteristics. The nature of each dataset of the second type differs from that of the training dataset because each dataset is collected for different wind speed values. The training dataset itself includes data at low wind speeds, while each test dataset corresponds to high wind speed. Thus, the values of noncontrollable parameters do not share the characteristics of the training dataset. In the third dataset, the values of controllable and noncontrollable parameters have been randomly selected for a turbine and differ considerably from those in the training dataset.

A. Power Output Prediction Results

The datasets collected from the 30 turbines varied in quality. The data of turbine 6 and turbine 21 were removed from the test dataset due to their low quality. Table IV presents statistics for the six selected turbines and averages over 28 turbines. The data in Table IV indicate that the smallest mean AE is 5.57 kW (for turbine 28), and the smallest RE is 2.32% (for turbine 2). The largest mean AE is 10.41 kW (for turbine 16), and the largest RE is 11.41% (for turbine 10). Thus, these four turbines are selected for further analysis. To provide a broader context, the results for turbine 1 and turbine 26 are also included in the results discussed next.

1) Minimum AE and RE: The minimum AE and RE obtained are 0.00 kW and 0.00%, respectively, for the six selected wind turbines.

2) Maximum AE: The observed power, the predicted power, the RE, and the maximum AE statistics for the six selected turbines are shown in Table V.

Fig. 9. AE distribution for turbine 16 and turbine 28.

For the turbines in Table V, which represent the worst case prediction outcomes of the 28 turbines tested, some errors are not acceptable. For example, for turbine 1, the maximum absolute error is 81.28 kW, yet the relative one is only 8.18%. The maximum AE for turbines 2 and 28 is similar in magnitude to that of turbine 1. The prediction results for turbine 26, however, are not accurate. Therefore, it is necessary to analyze the error distribution over all data points for the turbines of Table V. The results of error distribution indicate that nearly 80% of the absolute errors for each turbine are smaller than 10 kW, and nearly 99% of the absolute errors are smaller than 50 kW. This implies that most of the time, power output is accurately predicted. Fig. 9 shows the AE distribution for turbine 16 and turbine 28. Turbine 28 shows the best results, and turbine 16 shows the worst results, among the six selected turbines. The results for the remaining four turbines are between those of turbine 28 and turbine 16.

3) Maximum RE: The maximum RE for wind turbines usually corresponds to low power outputs. For example, the maximum RE for turbine 26 is 36,908.10%, which is the largest value among the six turbines. The smallest value for the six turbines occurs at turbine 16, and it equals 638.84%. The distribution of mean REs indicates that almost 90% of the relative errors are less than 5% and 97% of the relative errors are not greater than 10%. This confirms that at operational (relatively high) wind speed ranges, the model accurately predicts power output for the vast majority of turbines. Fig. 10 shows the RE distribution for turbine 16 and turbine 28. Turbine 28 shows the best results, and turbine 16 shows the worst results. The prediction results for the other four turbines fall between those of turbine 28 and turbine 16.

B. Results for the Rotor Speed Prediction

The average observed and predicted values of the rotor speed for the four selected turbines and the average values over the 28 turbines are shown in Table VI. The data in Table VI illustrate that the mean AEs are between 0.1 and 0.4 r/min. Turbine 28 shows the smallest mean AE, and turbine 18 shows the largest. Thus, turbines 1, 18, 26, and 28 are selected for further analysis. As a significant percentage of

Fig. 10. RE distribution for turbine 16 and turbine 28.

TABLE VI AVERAGE RESULTS FOR THE RS PREDICTION

TABLE VII STATISTICS FOR THE FOUR TURBINES REPRESENTING THE MAXIMUM AE

rotor speed values are zero or close to zero, the corresponding RE is large and not meaningful; therefore, these results are not presented here.

1) Minimum AE: The minimum AE for each of the four selected turbines is 0.00.

2) Maximum AE: Table VII shows the observed and predicted rotor speed data for the four turbines corresponding to the maximum absolute prediction error. For maximum AE, some predictions are obviously wrong; however, some are acceptable, and turbine 28 is such an example. If the prediction error is small, then the prediction model is acceptable.

3) Distribution of AE: The analysis of AE of rotor speed prediction indicates that most of the AEs are less than 1 r/min. Turbine 1 represents the worst case scenario, where the AE is less than 1 r/min for 62.54% of the instances tested. For the remaining three turbines, the AE is less than 1 r/min for 97% of the instances.
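Statements such as "the AE is less than 1 r/min for 97% of the instances" summarize the empirical distribution of the errors. A minimal sketch of that computation is given below; the function names are hypothetical, and the numbers in the usage note are synthetic rather than taken from the wind farm data.

```python
from typing import Dict, Sequence


def fraction_below(errors: Sequence[float], threshold: float) -> float:
    """Fraction of errors strictly below `threshold`."""
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(1 for e in errors if e < threshold) / len(errors)


def error_profile(errors: Sequence[float],
                  thresholds: Sequence[float]) -> Dict[float, float]:
    """Map each threshold to the fraction of errors below it."""
    return {th: fraction_below(errors, th) for th in thresholds}
```

For example, `error_profile([0.1, 0.5, 1.2, 0.9], [1.0, 2.0])` reports that three of the four synthetic errors fall below 1.0 and all four fall below 2.0.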


TABLE VIII POWER PREDICTION RESULTS FOR THE THREE SELECTED TURBINES AT HIGH WIND SPEED

TABLE IX SPEED PREDICTION RESULTS FOR THE THREE SELECTED TURBINES AT HIGH WIND SPEED

C. Prediction of Power Output and Rotor Speed for High Wind Speed

For the same turbines, datasets for high wind speed are also provided. To show that the model is also suitable in the high wind speed situation, turbines 1, 16, and 28 have been randomly selected to test the accuracy of the prediction models. The statistics for the prediction of the power output and the rotor speed for the three selected turbines at high wind speed are shown in Tables VIII and IX, respectively. As illustrated in Tables VIII and IX, the prediction accuracies of power output and rotor speed are acceptable. The models are robust even though the training and test periods have different characteristics, as shown in Figs. 6–8. The error analysis indicates that over 99% of the AEs are less than 50 kW. As this dataset represents high wind speeds, the average power output is high, and the prediction accuracy is much higher than for the low wind speed scenario. Over 97% of REs are less than 10%, and nearly 99% of the REs are less than 20%. The distribution of AEs for predicting rotor speed shows that over 99.5% of the errors are less than 2 r/min.

D. Prediction of Power Output and Rotor Speed Using Independent Datasets

In this section, another 10 s dataset randomly selected from the same wind farm is used to test the virtual models. After denoising, 32,725 data points are considered. The prediction results of the power output indicate that the mean AE is 5.59 kW and the mean RE is 2.59% based on the average observed power output of 418.62 kW. The prediction results of rotor speed illustrate that the mean AE is 0.14 r/min for the average observed rotor speed of 14.87 r/min. The results provide evidence that the power output and rotor speed can be accurately predicted for an operational range of wind speeds. Note that the test data represent measurements taken at turbines different from the ones from which the training dataset was derived.
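The reported mean RE (2.59%) is larger than the ratio of the mean AE to the average observed power (5.59/418.62, about 1.34%). The two quantities generally differ when the RE of (4) is averaged per data point, because instances with low observed power contribute large relative errors. The sketch below illustrates this effect with synthetic values and hypothetical function names; it is not a reconstruction of the authors' computation, and skipping zero observations is an assumption.

```python
from statistics import fmean
from typing import Sequence


def mean_relative_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean of per-point REs (percent); zero observations are skipped (assumption)."""
    return fmean(
        abs((p - o) / o) * 100.0 for p, o in zip(predicted, observed) if o != 0
    )


def ratio_of_means(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean AE divided by the mean observed value, in percent."""
    mean_ae = fmean(abs(p - o) for p, o in zip(predicted, observed))
    return mean_ae / fmean(observed) * 100.0


# Ratio implied by the figures reported in the text (about 1.34%).
reported_ratio = 5.59 / 418.62 * 100.0

# Synthetic example, not wind farm data: every prediction is off by 5 kW.
observed = [50.0, 400.0, 800.0]
predicted = [55.0, 405.0, 805.0]
```

For the synthetic values, `ratio_of_means` is smaller than `mean_relative_error` because the 5 kW miss at 50 kW alone contributes a 10% relative error.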


VIII. CONCLUSION

In this paper, a methodology for building virtual models for the prediction of the parameters of a wind turbine was presented. The proposed methodology involved three phases: data preprocessing, model extraction, and model validation. In the first phase, after analyzing the raw data, the controllable parameters and noncontrollable parameters, as well as their past states, were considered for parameter selection. In order to eliminate data bias, stratified sampling was performed based on the wind speed. Two parameters were selected to test the proposed methodology: power output and rotor speed. The models were extracted by six different algorithms: random forest, neural network, boosting tree, support vector machine, generalized additive model, and the k-nearest neighbor algorithm. The neural network showed the best performance and was selected for extraction of the models for parameter prediction.

The models developed in this paper were validated by three datasets of different characteristics, including the wind speed range, the time period, and the source. The first dataset included data corresponding to low wind speeds, the second dataset was generated at high wind speeds, and the final dataset was randomly selected from a turbine at the same wind farm. Although the test datasets had different characteristics, the parameters predicted by the virtual models were accurate. This implies that the virtual models can be used to predict the power output and rotor speed for a turbine of interest using the data collected at other turbines.

REFERENCES

[1] G. V. Kuik, B. Ummels, and R. Hendriks, Sustainable Energy Technologies. Amsterdam, The Netherlands: Springer-Verlag, 2007.
[2] B. Boukhezzar, H. Siguerdidjane, and M. M. Hand, “Nonlinear control of variable-speed wind turbines for generator torque limiting and power optimization,” ASME Trans.: J. Solar Energy Eng., vol. 128, no. 4, pp. 516–531, 2006.
[3] C. Alexandre, C. Antonio, N. Jorge, L. Gil, M. Henrik, and F. Everaldo, “A review on the young history of the wind power short-term prediction,” Renew. Sustain. Energy Rev., vol. 12, no. 6, pp. 1725–1744, 2008.
[4] A. Kusiak, H. Zheng, and Z. Song, “Models for monitoring wind farm power,” Renew. Energy, vol. 34, no. 6, pp. 1487–1493, 2009.
[5] L. Landberg, “Short-term prediction of the power production from wind farms,” J. Wind Eng. Ind. Aerodyn., vol. 80, no. 1–2, pp. 207–220, 1999.
[6] M. Alexiadis, P. Dokopoulos, H. Sahsamanoglou, and I. Manousaridis, “Short-term forecasting of wind speed and related electrical power,” Solar Energy, vol. 63, no. 1, pp. 61–68, 1998.
[7] M. Negnevitsky and C. W. Potter, “Innovative short-term wind generation prediction techniques,” in Proc. Power Syst. Conf., 2006, pp. 60–65.
[8] I. G. Damousis, M. C. Alexiadis, J. B. Theocharis, and P. S. Dokopoulos, “A fuzzy model for wind speed prediction and power generation in wind parks using spatial correlation,” Energy Convers., vol. 19, no. 2, pp. 352–361, 2004.
[9] U. Focken, M. Lange, K. Monnich, H. P. Waldl, H. G. Beyer, and A. Luig, “Short-term prediction of the aggregated power output of wind farms—a statistical analysis of the reduction of the prediction error by spatial smoothing effects,” J. Wind Eng. Ind. Aerodyn., vol. 90, no. 3, pp. 231–246, 2002.
[10] A. Kusiak, H. Zheng, and Z. Song, “Short-term prediction of wind farm power: A data-mining approach,” IEEE Trans. Energy Convers., vol. 24, no. 1, pp. 125–136, Mar. 2009.
[11] A. Kusiak and Z. Song, “Combustion efficiency optimization and virtual testing: A data-mining approach,” IEEE Trans. Ind. Informat., vol. 2, no. 3, pp. 176–184, Aug. 2006.
[12] J. A. Harding, M. Shahbaz, S. Srinivas, and A. Kusiak, “Data mining in manufacturing: A review,” ASME Trans.: J. Manuf. Sci. Eng., vol. 128, no. 4, pp. 969–976, 2006.
[13] M. Berry and G. S. Linoff, Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, 2nd ed. New York: Wiley, 2004.
[14] P. Backus, M. Janakiram, S. Mowzoon, G. C. Runger, and A. Bhargava, “Factory cycle-time prediction with data-mining approach,” IEEE Trans. Semicond. Manuf., vol. 19, no. 2, pp. 252–258, May 2006.
[15] L. Ma, S. Luan, C. Jiang, H. Liu, and Y. Zhang, “A review on the forecasting of wind speed and generated power,” Renew. Sustain. Energy Rev., vol. 13, no. 4, pp. 915–920, 2009.
[16] R. Datta and V. T. Ranganathan, “A method of tracking the peak power points for a variable speed wind energy conversion system,” IEEE Trans. Energy Convers., vol. 18, no. 1, pp. 163–168, Mar. 2003.
[17] T. Ustuntas and A. D. Sahin, “Wind turbine power curve estimation based on cluster center fuzzy logic modeling,” J. Wind Eng. Ind. Aerodyn., vol. 96, no. 5, pp. 611–621, 2008.
[18] S. Haykin, Neural Networks: A Comprehensive Foundation. New York: Macmillan, 1994.
[19] M. Mabel and E. Fernandez, “Analysis of wind power generation and prediction using ANN: A case study,” Renew. Energy, vol. 33, no. 5, pp. 986–992, 2008.
[20] Y. Xiao, W. Wang, and X. Huo, “Study on the time-series wind speed forecasting of the wind farm based on neural networks,” Energy Conserv. Technol., vol. 25, no. 2, pp. 106–109, 2007.
[21] S. Li, “Wind power prediction using recurrent multilayer perceptron neural networks,” in Proc. 2003 IEEE Power Eng. Soc. Gen. Meet., vol. 4, pp. 2325–2330.
[22] S. Kelouwani and K. Agbossou, “Nonlinear model identification of wind turbine with a neural network,” IEEE Trans. Energy Convers., vol. 19, no. 3, pp. 607–612, Sep. 2004.
[23] S. Piramuthu, “Evaluating feature selection methods for learning in data mining applications,” in Proc. Thirty-First Hawaii Int. Conf. Syst. Sci., Kohala Coast, HI, 1998, vol. 5, pp. 294–302.
[24] P. N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining. New York: Addison-Wesley, 2006.
[25] J. Hua, W. D. Tembe, and E. R. Dougherty, “Performance of feature-selection methods in the classification of high-dimension data,” Pattern Recognit., vol. 42, no. 3, pp. 409–424, 2009.
[26] C. Tsai, “Feature selection in bankruptcy prediction,” Knowl.-Based Syst., vol. 22, no. 2, pp. 120–127, 2009.
[27] J. H. Friedman, “Stochastic gradient boosting,” Comput. Statist. Data Anal., vol. 38, no. 4, pp. 367–378, 2002.
[28] J. H. Friedman, “Greedy function approximation: A gradient boosting machine,” Ann. Statist., vol. 29, no. 5, pp. 1189–1232, 2001.

Andrew Kusiak (M’90) received the B.S. and M.S. degrees in engineering from the Warsaw University of Technology, Warsaw, Poland, in 1972 and 1974, respectively, and the Ph.D. degree in operations research from the Polish Academy of Sciences, Warsaw, in 1979. He is currently a Professor in the Department of Mechanical and Industrial Engineering, University of Iowa, Iowa City. He speaks frequently at international meetings, conducts professional seminars, and consults for industrial corporations. He has served on the editorial boards of more than 40 journals. He is the author or coauthor of numerous books and technical papers in journals sponsored by professional societies, such as the Association for the Advancement of Artificial Intelligence and the American Society of Mechanical Engineers. His current research interests include applications of computational intelligence in automation, wind and combustion energy, manufacturing, product development, and healthcare. Prof. Kusiak is a Fellow of the Institute of Industrial Engineers and the Editor-in-Chief of the Journal of Intelligent Manufacturing.

Wenyan Li received the B.S. and M.S. degrees from Beihang University, Beijing, China, in 2005 and 2008, respectively. She was with Lenovo Group, Beijing. Thereafter, she joined the Graduate Program at the University of Iowa, Iowa City, where she is currently with the Intelligent Systems Laboratory. Her research interests include data mining, computational intelligence, and process optimization applied to wind power and manufacturing industry.
