Electricity Load and Price Forecasting using Data Analytics Techniques
(Technical Report for MSSE Course: Research Methodology in Information Technology RMIT CUI Fall 2018 13)

Mah Noor Asmat, Nadeem Javaid*
Dept. of Computer Science, CUI, Islamabad 44000, Pakistan
*Corresponding author: www.njavaid.com

Abstract—This paper proposes load and price forecasting¹ using New York Independent System Operator (NYISO) load and price data from January to May 2014. A two-stage feature selection method is implemented using Classification and Regression Analysis (CRA) and Recursive Feature Elimination (RFE); both techniques are used to remove feature redundancy. Linear Discriminant Analysis (LDA) is applied to reduce dimensionality. Multi-Layer Perceptron (MLP), LDA and Logistic Regression (LR) are then used for classification and prediction. Simulations are performed for load and price prediction. The load and price forecasting performance is measured, and LDA outperforms the other techniques by 1.1%.

Index Terms—Load forecasting, Price forecasting, Feature Selection, Feature Extraction, Classification, Smart Grids, Data Analytics

I. INTRODUCTION
Traditional grids are difficult to operate under the increasing demand for electricity [1]. They are unsafe for electricity generation and distribution [2]. They are sited far from the power consumption zones, and electric power is transmitted over long lines. Traditional grids run on fossil fuel, which is harmful to health and the environment, and they are inefficient compared to smart grids. The smart grid is therefore the preferred practice [3]. A smart grid is safe and secure to use, since it can automatically detect a system fault and resolve it, thus offering consistent protection from hazards and cyber-attacks.

In the traditional market, electricity use follows a regular curve with pronounced demand peaks, and demand management is needed to address them. A smart grid uses information about consumer and supplier activities and attempts to make predictions that improve electricity production and distribution. It allows two-way communication between the supplier and the consumer. In this way, the supplier receives consumption feedback, and consumers can monitor their consumption behavior to minimize their electricity usage. The main idea behind demand management is to design a pricing scheme that sets the hourly price so as to persuade consumers to modify their electricity consumption pattern [4].

Data analysis is the logical and statistical evaluation of data taken from smart grids. In data analysis, classifiers are used to evaluate and predict the consumption of electricity [5]. Price forecasting is important for decision making in energy trading and utilities [6]. It is crucial for consumers to know the electricity price, because only then can they keep track of their electricity usage and decrease consumption if needed. Price forecasting is an important part of the smart grid because it makes smart grids cost efficient [7].

Both load and price are predicted in this article. Load forecasting is crucial for market management, since it is essential to keep the balance between generation and consumption. Efficient generation and consumption is a known problem for the energy sector, and utility maximization is the main goal of both user and utility [8]. Load forecasting is as crucial as price forecasting: suppliers can keep track of their electricity load and inform consumers to decrease load at peak hours.

Electricity load and price are predicted with the help of classifiers using data analysis, and logical reasoning is provided for the predictions. Error values are calculated to measure prediction accuracy: MAPE, MAE, RMSE and MSE are used to evaluate the error of load and price prediction. Different classifiers can be used for electricity prediction. Linear Discriminant Analysis (LDA), Logistic Regression (LR) and Multi-Layer Perceptron (MLP) are used in this paper to achieve the highest prediction accuracy for electricity price and load. Abbreviations are listed in Table I.

¹ Prediction and forecasting are used interchangeably in this paper.

A. Motivation
In [9], LDA is used for person re-identification, and in [10] it achieves high accuracy in prediction. That is why LDA is implemented in this article for load and price forecasting.

TABLE I
LIST OF ABBREVIATIONS

Abbreviation   Full Form
CRA            Classification and Regression Analysis
DTC            Decision Tree Classifier
DTR            Decision Tree Regressor
LDA            Linear Discriminant Analysis
LR             Logistic Regression
MAE            Mean Absolute Error
MAPE           Mean Absolute Percentage Error
MLP            Multi-Layer Perceptron
MSE            Mean Square Error
NYISO          New York Independent System Operator
RFE            Recursive Feature Elimination
RMSE           Root Mean Square Error

B. Problem Statement
LR is useful in circumstances where the presence or absence of a characteristic or outcome must be forecast from the values of a set of predictor variables. LR is a method for forecasting a "dichotomous dependent variable" [11]. However, LR is not more accurate than LDA. For this reason, LDA [9], [10] and MLP [14] are applied to achieve higher accuracy. On this particular dataset, LDA achieves higher accuracy for load and price prediction than the other two applied classifiers.

TABLE II
SUMMARY OF RELATED WORK

Technique                        Objective                                       Limitation
LDA [9]                          Person re-identification                        Not used in electricity forecasting
LDA [10]                         Dimensionality reduction                        Cannot deal with nonlinearity
LR [11]                          Description of LR                               It is not research on LR
LR [13]                          Discussed asymptotic covariance matrices        Statistical study on LR
Multilayer neural network [14]   Prediction and verification problem addressed   Statistical study on MLP
LSTM [15]                        Finding feature importance                      No description of price or load prediction
SVM [16]                         Load prediction                                 Overfitting for SNN

C. Contributions
Three classifiers are applied to NYISO electricity data from January to May 2014 to forecast the next month's load and price of electricity. MLP is implemented after applying LDA and LR for classification. LDA achieved higher accuracy on this daily load and price dataset.

II. RELATED WORK
The papers [9], [10], [11], [13], [14] use the same techniques as this paper. In [9], person re-identification is performed; LDA is used there for a pattern recognition scenario. In this article, LDA is used for dimensionality reduction and for electricity load and price forecasting. It is a classification technique that can also be used for prediction; in this scenario, LDA predicts load and price. Other techniques [14], [15] are also used for forecasting. MLP is a neural network and therefore consists of layers: visible layers and hidden layers. The visible layer receives the data and passes it to the hidden layers, which act as activation layers and transform the input. The result is sent back as output to the visible layer after it is compared with the expected result. This section is summarized in Table II.

III. PROPOSED SYSTEM MODEL
The system model is described in Fig. 1. The input is the NYISO market dataset from January to May 2014, in the form of load data. Price is calculated by multiplying the load by 3 to get a per-unit price, since the electricity price per unit is PRs. 3 in Pakistan. Although the data used for simulation is not from Pakistan, a per-unit price of PRs. 3 is assumed to derive the price. However, price is expressed in $ in this paper. The data is preprocessed, features are selected using CRA and RFE, and feature extraction is performed using LDA. After preprocessing, the classifiers LDA, LR and MLP are implemented. Testing data of May 2014 is used for load and price forecasting.

Fig. 1. Flow of System Model [blocks: Input (Load and Price data), Splitting data, Features, Feature Selection (CRA and RFE) with Selected and Rejected Features, Feature Extraction (LDA), Classification (LDA, LR and MLP), Predicted Load and Price, Output (Performance Evaluation)]
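The price derivation described in the system model can be illustrated with a short sketch. It assumes only what the text states, namely that price is load multiplied by a fixed per-unit price of 3. The function name, the constant name and the sample load values are hypothetical and are not taken from the NYISO data.

```python
# Assumed per-unit price, taken from the value of 3 stated in the text.
PRICE_PER_UNIT = 3.0


def derive_price(load_values, price_per_unit=PRICE_PER_UNIT):
    """Return one price per load value by scaling with a fixed per-unit price."""
    return [load * price_per_unit for load in load_values]


# Made-up daily load values (MW), used only to show the calculation.
daily_load = [1500.0, 1620.5, 1710.0]
daily_price = derive_price(daily_load)
print(daily_price)  # prints [4500.0, 4861.5, 5130.0]
```

Because the price is a constant multiple of the load, the price series carries no information beyond the load series; the two differ only in scale.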

IV. SIMULATIONS AND RESULTS
Simulations are performed using Python in Spyder on a Windows 10 machine with a Core i7 processor, 8 GB RAM and 500 GB storage.

A. Dataset Description
Hourly electricity load data of the NYISO market [12] from January to May 2014 is used. It is processed, and price is calculated from the load data by assuming a per-unit price. There are 8 features in total in the selected dataset. The data is reduced from an hourly to a daily basis so that processing time can be reduced.

B. Preprocessing Data
Historical load and price datasets are gathered from NYISO for 2014. The data is split into 75% for training and 25% for testing, as shown in Table III.

TABLE III
SPLITTING OF DATA

Training   113
Testing    38

Fig. 3. Feature Importance of Price Data [panel title: Price: Classification and Regression Analysis; x-axis: Variable (humidity, pressure, temperature, wind_direction, wind_speed); y-axis: Feature Importance]
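The reduction from hourly to daily data and the 75%/25% split can be sketched as follows. The paper does not say how hourly values are combined or whether the split is chronological, so daily averaging and a time-ordered split are assumptions here. The hourly values are synthetic placeholders, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_to_daily(records):
    """Average hourly (timestamp, load) pairs into one (date, mean load) pair per day."""
    buckets = defaultdict(list)
    for timestamp, load in records:
        buckets[timestamp.date()].append(load)
    return [(day, sum(vals) / len(vals)) for day, vals in sorted(buckets.items())]


def chronological_split(rows, train_fraction=0.75):
    """Split rows in time order; the first train_fraction of rows is the training set."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Synthetic hourly load for 1 January to 31 May 2014 (placeholder values, not NYISO data).
hourly = []
t = datetime(2014, 1, 1)
while t < datetime(2014, 6, 1):
    hourly.append((t, 1500.0 + 10.0 * (t.hour % 12)))
    t += timedelta(hours=1)

daily = hourly_to_daily(hourly)
train, test = chronological_split(daily)
print(len(daily), len(train), len(test))  # prints 151 113 38
```

January to May 2014 contains 151 days, so a 75% cut yields 113 training days and 38 testing days, which agrees with the counts in Table III.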

C. Feature Selection
Feature selection is first performed using CRA. This analysis is implemented using a Decision Tree Classifier (DTC) and a Decision Tree Regressor (DTR), and the mean of the importance values from both techniques is used as the feature importance. This process is performed for both load and price data. Fig. 2 shows the feature importance of load data and Fig. 3 depicts the feature importance of price data. RFE is then implemented to eliminate feature redundancy. During this process, four features are selected, as shown in Table IV.
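The two-stage selection described above might look roughly like the following scikit-learn sketch. Several points are assumptions rather than details from the paper: the data is synthetic, the load is binned into quartiles so that a classifier can be trained on it, and a decision tree regressor is used as the estimator inside RFE. The feature names follow the axis labels of Fig. 2 and Fig. 3.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

feature_names = ["humidity", "pressure", "temperature", "wind_direction", "wind_speed"]

# Synthetic stand-in data (not NYISO): load depends mainly on temperature and humidity.
rng = np.random.default_rng(0)
X = rng.normal(size=(151, len(feature_names)))
load = 1600 + 80 * X[:, 2] + 30 * X[:, 0] + rng.normal(scale=5, size=151)

# Assumption: the classifier needs discrete labels, so load is binned into quartiles.
edges = np.quantile(load, [0.25, 0.5, 0.75])
load_class = np.digitize(load, edges)

# CRA step: average the importances of a decision tree classifier and regressor.
dtc = DecisionTreeClassifier(random_state=0).fit(X, load_class)
dtr = DecisionTreeRegressor(random_state=0).fit(X, load)
importance = (dtc.feature_importances_ + dtr.feature_importances_) / 2

# RFE step: recursively drop the weakest feature until four remain.
rfe = RFE(DecisionTreeRegressor(random_state=0), n_features_to_select=4).fit(X, load)
selected = [name for name, keep in zip(feature_names, rfe.support_) if keep]
rejected = [name for name, keep in zip(feature_names, rfe.support_) if not keep]

for name, value in zip(feature_names, importance):
    print(f"{name}: {value:.3f}")
print("selected:", selected, "rejected:", rejected)
```

On synthetic data the selected and rejected features will not necessarily match Table IV; the sketch only shows how averaged tree importances and RFE fit together.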

TABLE IV
FEATURE SELECTION USING RFE

Selected         Rejected
wind speed       pressure
humidity         -
temperature      -
wind direction   -

Fig. 2. Feature Importance of Load Data [panel title: Load: Classification and Regression Analysis; x-axis: Variable (humidity, pressure, temperature, wind_direction, wind_speed); y-axis: Feature Importance]

D. Feature Extraction
LDA is applied to reduce dimensionality and obtain improved features. LDA is usually used for prediction of data.

E. Classification and Forecasting using LDA, LR and MLP
LDA, LR and MLP are the techniques used as classifiers and predictors. Load and price forecasting results are presented in the following subsections. The algorithm for LDA is described below.

Algorithm 1 LDA
1: Train the model on training data
2: Apply the fitted model to testing data
3: Prediction of unseen data
4: Performance Evaluation
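The feature extraction step and the steps of Algorithm 1 can be sketched with scikit-learn as shown below. This is an illustration under stated assumptions, not the authors' implementation: the data is synthetic, the load is binned into quartile classes so that classifiers apply, each predicted class is mapped back to the mean training load of that class, and the MLP layer size is arbitrary.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for 151 daily samples with four selected features (not NYISO data).
rng = np.random.default_rng(0)
X = rng.normal(size=(151, 4))
load = 1600 + 80 * X[:, 0] + 30 * X[:, 1] + rng.normal(scale=5, size=151)
X_train, X_test = X[:113], X[113:]
load_train, load_test = load[:113], load[113:]

# Assumption: load is binned into quartile classes so that classifiers can be used.
edges = np.quantile(load_train, [0.25, 0.5, 0.75])
y_train = np.digitize(load_train, edges)
class_means = np.array([load_train[y_train == c].mean() for c in range(4)])

# Feature extraction: project onto two discriminant axes learned from training data.
extractor = LinearDiscriminantAnalysis(n_components=2).fit(X_train, y_train)
Z_train, Z_test = extractor.transform(X_train), extractor.transform(X_test)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "LR": LogisticRegression(max_iter=1000),
    "MLP": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
}

forecasts = {}
for name, model in models.items():
    model.fit(Z_train, y_train)                          # Step 1: train on training data
    predicted_class = model.predict(Z_test)              # Steps 2-3: predict unseen data
    forecasts[name] = class_means[predicted_class]       # map each class to a load level
    mae = np.mean(np.abs(forecasts[name] - load_test))   # Step 4: evaluate
    print(f"{name}: MAE = {mae:.1f}")
```

Both the extractor and the classifiers are fitted on the training portion only; the test portion is used solely for prediction and evaluation.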

1) Load Forecasting: Load is forecast by applying LDA and LR, and MLP is applied to achieve greater accuracy in forecasting. The prediction for 35 days is shown in Fig. 4. Load data of the last 5 days of April 2014 is used for the forecast shown in Fig. 5. May 2014 data is used to obtain a one-month forecast, as shown in Fig. 6.

Fig. 4. Load Prediction for 35 Days [curves: Actual, LR, LDA, MLP; x-axis: Days; y-axis: Load (MW)]

Fig. 5. Load Prediction of 5 Days [curves: Actual, LR, LDA, MLP; x-axis: Days; y-axis: Load (MW)]

Fig. 6. Load Prediction of 30 Days [curves: Actual, LR, LDA, MLP; x-axis: Days; y-axis: Load (MW)]

2) Price Forecasting: Price is forecast by applying LDA and LR. Furthermore, MLP is applied to achieve greater accuracy in forecasting. The prediction for all 35 days together is shown in Fig. 7. Price data of the last 5 days of April 2014 is used for the forecast shown in Fig. 8. May 2014 data is used to obtain a one-month forecast, as shown in Fig. 9.

Fig. 7. Price Prediction for 35 Days [curves: Actual, LR, LDA, MLP; x-axis: Days; y-axis: Price $]

Fig. 8. Price Prediction for 5 Days [curves: Actual, LR, LDA, MLP; x-axis: Days; y-axis: Price $]

Fig. 9. Price Prediction for 30 Days [curves: Actual, LR, LDA, MLP; x-axis: Days; y-axis: Price $]

3) Evaluating Performance: Our goal is to achieve higher prediction accuracy. The performance of LDA, LR and MLP is evaluated using Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE) and Mean Square Error (MSE). Table V reports the load and price performance of LDA, Table VI that of LR and Table VII that of MLP. The performance of the three techniques is compared in Fig. 10 and Fig. 11.
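The four error measures named above follow their standard definitions and can be computed as in the sketch below. The function name and the sample values are hypothetical; the price series is obtained by multiplying the load by 3, as assumed in the system model.

```python
import math


def error_metrics(actual, predicted):
    """Return RMSE, MAPE (in percent), MSE and MAE for two equal-length sequences."""
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "RMSE": math.sqrt(mse),
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "MSE": mse,
        "MAE": sum(abs(e) for e in errors) / n,
    }


# Made-up load values (MW), used only to illustrate the calculation.
actual_load = [1500.0, 1600.0, 1700.0, 1650.0]
predicted_load = [1520.0, 1580.0, 1690.0, 1680.0]

load_scores = error_metrics(actual_load, predicted_load)
price_scores = error_metrics([3 * a for a in actual_load], [3 * p for p in predicted_load])
print(load_scores)
print(price_scores)
```

When price is a constant multiple of load, MAPE is the same for both series, RMSE and MAE scale by that constant, and MSE scales by its square.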

Load 0.669 3.205 0.640 0.504

Performance RMSE: MAPE: MSE: MAE:

Load 0.844 4.384 1.424 0.689

Price 2.532 4.384 9.157 2.067

TABLE VII MLP PERFORMANCE COMPARISON

TABLE V LDA PERFORMANCE COMPARISON Performance RMSE: MAPE: MSE: MAE:

TABLE VI LR PERFORMANCE COMPARISON

Performance RMSE: MAPE: MSE: MAE:

price 2.007 3.205 5.757 1.511

Load 0.317 2.823 0.805 0.442

Price 0.951 2.823 7.242 1.327

Fig. 10. Performance Comparison for Load [bars: LDA, LR, MLP; x-axis: RMSE, MAPE, MSE, MAE; y-axis: Load Error Value]

Fig. 11. Performance Comparison for Price [bars: LDA, LR, MLP; x-axis: RMSE, MAPE, MSE, MAE; y-axis: Price Error Value]

CONCLUSION
LDA, LR and MLP are classification techniques implemented to achieve greater accuracy in the prediction of load and price. Simulations are performed in Python. Features are selected using CRA and RFE and extracted using LDA. LDA outperforms the other techniques by 1.1%.

REFERENCES

[1] H. J. Bhatti and M. Danilovic, "Business Model Innovation Approach for Commercializing Smart Grid Systems," American Journal of Industrial and Business Management, vol. 8, pp. 2007-2051, 30 Sept. 2018.
[2] K. K. Zame, et al., "Smart Grid and Energy Storage: Policy Recommendations," Renewable and Sustainable Energy Reviews, vol. 82, pp. 1646-1654, 2017.
[3] M. R. Hossain, A. M. Oo and A. S. Ali, "Smart Grid," in Smart Grids, Springer, Berlin, pp. 23-44, 2013.
[4] B. Neupane, W. Woon and Z. Aung, "Ensemble Prediction Model with Expert Selection for Electricity Price Forecasting," Energies, vol. 10, no. 1, p. 77, 2017.
[5] A. Agresti, "An Introduction to Categorical Data Analysis," 3rd ed., Wiley, ISBN: 978-1-119-40526-9, Nov. 2018.
[6] R. Angamuthu Chinnathambi, A. Mukherjee, M. Campion, H. Salehfar, T. Hansen, J. Lin and P. Ranganathan, "A Multi-Stage Price Forecasting Model for Day-Ahead Electricity Markets," Forecasting, vol. 1, no. 1, p. 3, Jul. 2018.
[7] N. Ayub, N. Javaid and A. Abbas, "Big Data Analytics for Electricity Load Forecasting in Smart Grids," [unpublished].
[8] K. Wang, C. Xu, Y. Zhang, S. Guo and A. Zomaya, "Robust Big Data Analytics for Electricity Price Forecasting in the Smart Grid," IEEE Transactions on Big Data, 05 Jul. 2017, doi: 10.1109/TBDATA.2017.2723563.
[9] L. Wu, C. Shen and A. van den Hengel, "Deep linear discriminant analysis on fisher networks: A hybrid architecture for person re-identification," Pattern Recognition, vol. 65, pp. 238-250, May 2017.
[10] S. Yuan, X. Mao and L. Chen, "Multilinear Spatial Discriminant Analysis for Dimensionality Reduction," IEEE Transactions on Image Processing, vol. 26, no. 6, pp. 2669-2681, June 2017, doi: 10.1109/TIP.2017.2685343.
[11] D. W. Hosmer and S. Lemeshow, "Applied Logistic Regression," John Wiley & Sons, New York, 2000.
[12] NYISO Market load dataset 2014. [Accessed: 9 Oct. 2018]
[13] K. F. Cheng and H. M. Hsueh, "Estimation of a Logistic Regression Model with Mismeasured Observations," Statistica Sinica, 2010.
[14] W. Xiang, H. Tran and T. T. Johnson, "Output Reachable Set Estimation and Verification for Multilayer Neural Networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 11, pp. 5777-5783, Nov. 2018.
[15] H. Zheng, J. Yuan and L. Chen, "Short-term load forecasting using EMD-LSTM neural networks with a Xgboost algorithm for feature importance evaluation," Energies, vol. 10, no. 8, p. 1168, 2017.
[16] J.-P. Liu and C.-L. Li, "The short-term power load forecasting based on sperm whale algorithm and wavelet least square support vector machine with DWT-IR for feature selection," Sustainability, vol. 9, no. 7, p. 1188, 2017.