Approximate Dynamic Programming Coordinated Control in Multi-infeed HVDC Power System Chao Lu, Member, IEEE, Jennie Si, Senior Member, IEEE, Xiaochen Wu, Peng Li

Abstract -- This paper is concerned with the large-scale, interconnected AC/DC transmission network of a power system. As the grid becomes larger and power demand grows, low frequency oscillation has become the primary concern for power system stability. High voltage direct current (HVDC) modulation control is a traditional means of damping low frequency swings. However, in a multi-infeed system, the interactions and correlations among multiple HVDC links make the stabilizing controller design more challenging. The current approach to controller design, which takes only a single transmission link into consideration, is not sufficient to cope with the complex network topology and the stringent stabilization requirements of a system operating close to its limits. In this paper, a coordinated control framework using an approximate dynamic programming (ADP) technique is proposed based on a wide-area measurement system (WAMS). Since real-time system responses are available through WAMS, the parameters of different controllers can be adjusted online to optimize one common objective function that reflects the stability of the system as a whole instead of a single transmission link. The paper includes a design procedure showing how ADP can be applied. The performance of the proposed coordinated control approach is validated on a real system, namely the China Southern Grid.

Index Terms -- Power system dynamic stability, Dynamic Programming, Neurocontrollers, Multi-infeed HVDC system

I. INTRODUCTION

In power systems, all units must operate within reasonable ranges, specified by, for example, the relative rotor angles between different generators, the system frequency and the bus voltages. With the expansion and increased connectivity of regional grids, power system stability control is becoming a challenging task. When the transmission distance exceeds 600-800 km, a DC line is attractive because of its large transfer capacity, reduced land use and low cost. A receiving power system with several DC links is often called a multi-infeed system. Given that the control of HVDC is becoming more complicated, more issues require special attention, including system nonlinearity, uncertainty, and coordination among network components.

Some special characteristics of nonlinear systems, bifurcation and chaos for example, have been observed in power systems [1]. Even more challenging is the fact that many power system nonlinearities are difficult to model mathematically. It is not straightforward for modern control theories to handle power system models even after simplifications in modeling.

As regional grids continue merging into larger networks, the challenges due to system uncertainty become ever greater: (1) the models and parameters deviate from the truth because of simplifications made in modeling and because of their variation over time; (2) the power system operating conditions vary constantly, but the controller design is based only on typical operating scenarios; (3) there are also considerable disturbances on the controllers caused by noise and time delays during transmission.

In a modern power grid, many devices are available for maintaining system stability, e.g., the exciters and governors of generators, HVDC, and flexible AC transmission systems (FACTS). For some global stability problems, such as low frequency oscillation, the control of the various devices must be coordinated.

Some traditional techniques can readily be used to meet individual control design needs: for example, exact feedback linearization (EFL) methods [2] have been demonstrated for nonlinear system control, robust control can guarantee the stability of a linear time-invariant (LTI) system in the presence of uncertainty [3], and multi-input multi-output (MIMO) system designs including different control loops have also been implemented [4]. However, how to address these problems simultaneously, i.e., how to design coordinated stability controllers for complicated, large-scale, nonlinear and uncertain systems, has not been well studied. This is the focus of this paper.

Most classic control designs are based on mathematical models, which makes the controller analysis and design exact within their framework; however, the limitations of the models also seriously affect the practical control performance. Several approximate dynamic programming (ADP) methods [5] are based on real system responses, which reflect the most important system dynamics, so the constraints imposed by nonlinearities and uncertainties can potentially be avoided. For example, generator control using dual heuristic programming (DHP) was carried out through digital simulations and physical experiments [6]. In that approach, a system model implemented by a neural network was pretrained to predict system responses at the next time step, and the training quality of the model constrains how much the controller can do. In this paper we consider a model-independent ADP approach [7], which can be viewed as a model-free action-dependent heuristic dynamic programming method in adaptive critic designs, and which will be referred to as direct HDP in this study. In [8], direct HDP was employed to control the flight of an Apache helicopter, a complex, continuous state/control, MIMO nonlinear system with uncertainty. The learning performance of direct HDP in a multi-machine power system is studied in [9]. Together, these results point to the potential of adaptive critic designs for scalable complex system control applications. In this paper, the direct HDP method is employed to solve the low frequency oscillation problem in a multi-infeed HVDC system: the China Southern Grid.

The rest of this paper is organized as follows. In Section II the general direct HDP framework is elaborated. The implementation with a new structure in power systems is presented in Section III to demonstrate how to apply direct HDP. The description of the China Southern Grid and simulation results are presented in Section IV. Section V summarizes this paper.

The research was supported by National Natural Science Foundation of China under project numbers 50595413 and 50407001. The second author's research was supported by NSF under grant ECS-0002098.
C. Lu is with the Department of Electrical Engineering, Tsinghua University, Beijing, 100084, P. R. China.
J. Si is with the Department of Electrical Engineering, Arizona State University, Tempe, AZ 85287 USA.
X. Wu and P. Li are with The Research Center of China Southern Power Grid Co., LTD.
1-4244-0178-X/06/$20.00 ©2006 IEEE. PSCE 2006.

II. INTRODUCTION OF DIRECT HDP CONTROL

Direct HDP belongs to the ADP family. Its basic control framework is shown in Fig. 1, where u is the control signal computed by an ADP controller and applied to the environment, and X is the state vector, i.e., the response of the environment to the input u. At the same time, the effect of the control u is evaluated using a cost functional, and this evaluation is used to update the control policy.

Fig. 1. Schematic diagram of approximate dynamic programming control

Since the control environment/plant is usually represented by a set of differential equations, a time-domain computer simulation or the real dynamic system itself, strong nonlinearities can be easily embedded. The iterations of the controller parameters are based on real time system responses, which reflect the practical system conditions, so uncertainties are implicitly taken into account. The controller performance is determined by the cost function, which can be chosen to reflect global system dynamics and coordination among the designed controllers.

The direct HDP control is composed of two main parts: an action network and a critic network. The former produces control signals according to the learned policy, and the latter approximates the function J of the Bellman equation in dynamic programming. These two parts can be structured as look-up tables, neural networks or decision trees. In most cases, neural networks are used because of their universal approximation capability and the associated simple learning algorithm based on gradient descent.

In power systems, the real time system dynamics fed back to the direct HDP controller are provided by a wide area measurement system (WAMS), which is based on synchronized phasor measurement techniques and modern digital communication networks. Through this system, some key system variables that were unavailable in the past (e.g., internal voltage angles of generators in different areas) can be measured directly and used as remote feedback to improve system performance [10].

Fig. 2 is the schematic diagram of the direct HDP control scheme [7]. The reinforcement signal r(t) comes from the external environment and, typically, is either a “0” or a “-1” corresponding to “success” or “failure,” respectively.

Fig. 2. Schematic diagram for implementation of direct HDP. The solid lines represent signal flows, and the dashed lines are the paths for parameter tuning.

During on-line learning, the controller is “naive” when it starts to control; namely, the weights of both the action and critic neural networks are randomly initialized. Once a system state is observed, an action is produced based on the parameters of the action network. A “better” control value under the specific system state will make the equations of the principle of optimality more balanced. This set of system operations will be reinforced through memory or other association between states and control output in the action network. Otherwise, the control value will be adjusted by tuning the weights in the action network. The output of the critic network (the J function) approximates the discounted total reward-to-go. Specifically, it approximates R(t) given by

$$R(t) = \sum_{k=1}^{\infty} \alpha^{k-1} r(t+k) \tag{1}$$

where R(t) is the future accumulative reward-to-go value at time t and $\alpha$ is a discount factor for the infinite-horizon problem ($0 < \alpha < 1$). The detailed update algorithms of the action and critic networks, realized using multi-layer feed-forward neural networks, can be found in [7].
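As an illustration of the reward-to-go in (1), the following minimal Python sketch evaluates the discounted sum for a finite list of future reinforcement signals. The function name `reward_to_go`, the truncation of the infinite sum to a finite horizon, and the sample reward values are assumptions made for this example; they are not taken from the paper or from [7].

```python
def reward_to_go(future_rewards, alpha):
    """Discounted reward-to-go R(t) = sum_{k>=1} alpha**(k-1) * r(t+k).

    future_rewards[0] is r(t+1), future_rewards[1] is r(t+2), and so on.
    The infinite sum in (1) is truncated at len(future_rewards) terms
    (an assumption so that the example runs on finite data).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    total = 0.0
    # Backward pass: R = r(t+1) + alpha * (r(t+2) + alpha * (...))
    for r in reversed(future_rewards):
        total = r + alpha * total
    return total


# Hypothetical reinforcement signals: two neutral steps, then a penalty of -1.
rewards = [0.0, 0.0, -1.0]
print(reward_to_go(rewards, alpha=0.5))  # -0.25
```

The backward pass uses the recursion R(t) = r(t+1) + αR(t+1), which follows directly from (1); a critic network is trained to approximate this quantity without ever summing the full horizon.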

III. COORDINATED CONTROL USING DIRECT HDP METHOD IN POWER SYSTEMS

A. Framework of Online Coordinated Control

Online adaptation of direct HDP control requires real time system responses, which can be provided by WAMS. This process is illustrated in Fig. 3.

Fig. 3. Framework of online coordinated control based on direct HDP method and WAMS

Signals reflecting system dynamics are collected by phasor measurement units (PMUs), tagged with exact time stamps from the global positioning system (GPS), and then transmitted to the WAMS center through a wide area communication network. After data pre-processing, the information can be used as input to local controllers or to the online direct HDP design program. During every time step, the new control signals or parameters for the different controllers are produced through the interaction between direct HDP and the environment (the power system). Because the cost functions for all controllers are identical, the resulting designs are coordinated.

B. New Direct HDP Controller Structure

The application of neural networks in power system stability control has been investigated widely over the last several decades, but almost no such controllers operate in a real grid. The primary reason is the concern that neural networks have too many degrees of freedom in their parameters to maintain power system reliability. In addition, it is preferable for the controller parameters to be directly related to system dynamics. Most current power system stability controllers are composed of simple basic blocks, such as integral, differential, inertial and lead-lag blocks. The universal approximation capability of neural networks makes it possible to replace the traditional blocks and, at the same time, the neural network itself can be simplified. As an example, the lead-lag block expressed in a neural network structure (phase-shift neural network, PSNN) can be derived as follows. In the frequency domain, the lead-lag block can be described as

$$Y(s) = \frac{1 + T_1 s}{1 + T_2 s} X(s) \tag{2}$$

In the time domain, this can be transformed into a difference equation whose solution is obtained through the following iteration:

$$y_{(n+1)} = x_{(n+1)} + K_1 \left( y_{(n)} - x_{(n+1)} \right) + K_2 \left( x_{(n)} - x_{(n+1)} \right) \tag{3}$$

where

$$\begin{cases} K_1 = \dfrac{T_2 - \Delta t / 2}{T_2 + \Delta t / 2} \\[2ex] K_2 = \dfrac{\Delta t / 2 - T_1}{T_2 + \Delta t / 2} \end{cases} \tag{4}$$

and $\Delta t$ is the time step. Equation (3) can be computed by a neural network in which K1 and K2 are the weights. The diagram is shown in Fig. 4. This is a dynamic neural network because it employs a TDL (tapped delay line) block, but the simple backpropagation algorithm is still feasible.

Fig. 4. The structure of PSNN

Through these equations, the normal lead-lag block is completely equivalent to a neural network, and the direct HDP controller can therefore be embedded into traditional controllers, e.g., the DC power modulation controller, as shown in Fig. 5. The gain K can also be included in the above structure, in which case the fixed weight “1” is freed.
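The lead-lag recursion in (3) with the weights in (4) can be sketched in a few lines of Python. This is only an illustrative forward pass under stated assumptions: the function names `psnn_weights` and `lead_lag_response`, the zero initial conditions, and the sample time constants are hypothetical, and no learning rule is included.

```python
def psnn_weights(t1, t2, dt):
    """Weights K1 and K2 of equation (4) for time constants t1, t2 and step dt."""
    denom = t2 + dt / 2.0
    k1 = (t2 - dt / 2.0) / denom
    k2 = (dt / 2.0 - t1) / denom
    return k1, k2


def lead_lag_response(x, t1, t2, dt, y0=0.0, x0=0.0):
    """Iterate equation (3) over the input samples x.

    y0 and x0 are the assumed output and input values before the first sample.
    """
    k1, k2 = psnn_weights(t1, t2, dt)
    y_prev, x_prev = y0, x0
    out = []
    for x_next in x:
        # y(n+1) = x(n+1) + K1*(y(n) - x(n+1)) + K2*(x(n) - x(n+1))
        y_next = x_next + k1 * (y_prev - x_next) + k2 * (x_prev - x_next)
        out.append(y_next)
        x_prev, y_prev = x_next, y_next
    return out


# Hypothetical lead block (T1 > T2) driven by a unit step.
step = [1.0] * 2000
y = lead_lag_response(step, t1=0.5, t2=0.1, dt=0.01)
print(y[0], y[-1])
```

For a unit step the first output sample equals 1 - K1 - K2 = (T1 + Δt/2)/(T2 + Δt/2), and the response then settles toward 1, consistent with the unity DC gain of (2). When T1 = T2 the weights satisfy K2 = -K1 and the block passes its input through unchanged.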

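[Fig. 5 diagram. Recoverable block labels: input y; washout block T_w s/(1 + T_w s); lead-lag blocks (1 + T_1 s)/(1 + T_2 s) and (1 + T_3 s)/(1 + T_4 s); output P_modu.]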

Fig.5. DC power modulation control with direct HDP controller embedded. The action network is PSNN.

Parameters other than those in the direct HDP controller are designed using traditional methods or based on practical experience. For the parameters in direct HDP, since the values of T1 and T2 lie in a bounded region for a real system, K1 and K2 are also bounded. In addition, the weights of the PSNN can be constrained further to guarantee stability during online control updates. Under this condition, the search space for the neural network parameters is also reduced, and thus training efficiency is enhanced, which is important for real-time applications.
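One way to picture the bounded search space is to map admissible ranges of T1 and T2 through (4) and project any updated weight back into the resulting interval. The sketch below is an assumption-laden illustration: the helper names `weight_bounds` and `project`, the numerical ranges, and the use of simple clipping are hypothetical and are not described in the paper.

```python
from itertools import product


def weight_bounds(t1_range, t2_range, dt):
    """Bounds on K1 and K2 from equation (4) over a box of time constants.

    t1_range and t2_range are (min, max) pairs with positive entries.
    K1 and K2 are monotone in each time constant separately, so their
    extremes over the box are attained at its corners.
    """
    h = dt / 2.0
    k1_vals, k2_vals = [], []
    for t1, t2 in product(t1_range, t2_range):
        k1_vals.append((t2 - h) / (t2 + h))
        k2_vals.append((h - t1) / (t2 + h))
    return (min(k1_vals), max(k1_vals)), (min(k2_vals), max(k2_vals))


def project(value, bounds):
    """Clip a weight to its admissible interval after a learning update."""
    lo, hi = bounds
    return min(max(value, lo), hi)


# Hypothetical ranges: T1 in [0.1, 1.0] s, T2 in [0.05, 0.5] s, dt = 0.01 s.
k1_bounds, k2_bounds = weight_bounds((0.1, 1.0), (0.05, 0.5), 0.01)
print(k1_bounds, k2_bounds)
print(project(1.5, k1_bounds))  # an out-of-range K1 is pulled back to its upper bound
```

Because the homogeneous part of (3) is y(n+1) = K1 y(n), keeping |K1| < 1 keeps the recursion itself stable; with T2 > 0 and Δt > 0 this holds automatically for every K1 produced by (4), so the clipping step only narrows the search further.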

IV. SIMULATION RESULTS

A. China Southern Grid

China Southern Grid is an AC/DC hybrid power system mainly composed of four provincial grids. The transmission distance from west to east is over 1000 km. In 2005, large-capacity power transmission took place through 5 AC lines and 2 DC links (Guiguang and Tianguang) in parallel. The grid structure is shown in Fig. 6. Since the interconnection of the regional grids was completed, low frequency oscillation has become a prominent problem and a potential threat to system stability. Although approximately 30 power system stabilizers (PSSs) have been installed on important generators, the system is still poorly damped once the load demand at the receiving end increases. Therefore, power modulation control of HVDC is an attractive alternative. The structure of the modulation controller is shown in Fig. 5, and the input signals are usually the active power on the parallel AC lines.

Fig. 6. Diagram of South China Grid structure (provincial grids shown: Yunnan, Guizhou, Guangxi, Guangdong)

B. Independent Design of HVDC Modulation Controllers

Since the terminals of the two DC links are not far from each other in terms of electrical distance, and two more HVDC lines will be commissioned as the system develops, their controllers must be coordinated. In Section III, the learning ability of the direct HDP controller is validated, and in this section, its coordinated design considering practical engineering constraints is demonstrated. If only one DC power modulation controller is used, its parameters can be tuned to achieve the desired control performance. However, if these independently designed controllers work together, the interaction between any two DC links deteriorates the system damping capability, as shown in Fig. 7. In a multi-infeed DC system, the supplementary DC control must be designed on a coordinated basis to avoid unexpected excitation of new poorly damped modes [11].

Fig. 7. Rotor angle between Generator Qianxi in Guizhou and SJCG in Guangdong. Dotted line: without power modulation control; dashed line: Guiguang modulation control; dash-dot line: Tianguang modulation control; solid line: Guiguang and Tianguang modulation controls.

For the direct HDP controller, the control law update is guided by the reward function. It is defined as the weighted sum of squares of the relative rotor speed differences among three approximate inertia centers: Guangdong, Guizhou and Yunnan. This function reflects the stability of the entire system: if only one oscillation mode is suppressed, the reward is not minimal, and the controller parameters will continue to be adjusted.

C. Online Coordinated Control using Direct HDP

Starting from the independently designed Guiguang and Tianguang DC power modulation controller parameters, the new direct HDP controller obtains a set of coordinated values. Fig. 8 shows the learning process; the variation is acceptable for real devices.

Fig. 8. Learning process (K1 and K2) of Guiguang modulation control.

After coordination, the oscillations at several frequencies are well damped, as shown in Fig. 9.
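The reward function described in Section IV-B, a weighted sum of squared relative rotor speed differences among the inertia centers, can be sketched as follows. Everything specific here is an assumption for illustration: the names `coi_speed` and `coordination_cost`, the inertia-weighted averaging used to form a center-of-inertia speed, the unit weights, and the per-unit speed values are hypothetical and do not come from the paper.

```python
from itertools import combinations


def coi_speed(speeds, inertias):
    """Inertia-weighted mean rotor speed of one area (center-of-inertia speed)."""
    total_inertia = sum(inertias)
    return sum(h * w for h, w in zip(inertias, speeds)) / total_inertia


def coordination_cost(center_speeds, weights=None):
    """Weighted sum of squared speed differences between every pair of areas.

    center_speeds maps area name -> center-of-inertia speed.
    weights maps a sorted (area_a, area_b) pair -> weight; unit weights if None.
    """
    cost = 0.0
    for a, b in combinations(sorted(center_speeds), 2):
        w = 1.0 if weights is None else weights[(a, b)]
        diff = center_speeds[a] - center_speeds[b]
        cost += w * diff ** 2
    return cost


# Hypothetical per-unit speeds of the three inertia centers at one time step.
speeds = {"Guangdong": 1.0, "Guizhou": 1.002, "Yunnan": 0.999}
print(coordination_cost(speeds))
```

The cost is zero only when all three centers rotate at the same speed, so damping a single inter-area mode while another persists still leaves a nonzero value, which matches the remark that the parameters keep adapting in that case.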

Fig. 9. Rotor angle between Generator Qianxi in Guizhou and SJCG in Guangdong. Dashed line: independent design; solid line: coordinated design.

V. CONCLUSIONS AND FUTURE WORK

In a multi-infeed power system, the interaction between parallel AC and DC lines, on top of the nonlinearities and uncertainties, makes the DC controller design more challenging. In this paper, a framework using the direct HDP control method based on WAMS is proposed. Multiple controllers can be coordinated online according to the same cost function, and the real time system responses are available through WAMS. Further, the traditional lead-lag block is transformed into the structure of an ordinary feed-forward neural network, and this part can be embedded into the HVDC modulation controller. The above control approach is validated on the China Southern Grid, which is one of the most complex AC/DC hybrid systems in the world. Simulation results have demonstrated superior performance of the proposed coordinated design over independently designed controllers.

For future research, we will explore guarantees of system stability during controller parameter updates. With the adoption of the wide area communication network, time delays are inevitable. The impact of those time delays on field measurements and the corresponding compensation methods will also be studied.

VI. REFERENCES

[1] T. Van Cutsem, C. Vournas, Voltage Stability of Electric Power Systems, Boston: Kluwer Academic Publishers, 1998.
[2] Y. N. Yu, K. Vongsuriya, L. N. Wedman, Application of an Optimal Control Theory to a Power System, IEEE Trans. Power Apparatus and Systems, Vol. 89, No. 1, pp. 52-62, 1970.
[3] Q. Lu, Y. Sun, S. Mei, Nonlinear Control Systems and Power System Dynamics, Boston: Kluwer Academic Publishers, 2001.
[4] J. G. Webster, Encyclopedia of Electrical and Electronics Engineering, New York: John Wiley, 1999.
[5] J. Si, A. Barto, W. Powell, D. Wunsch, Learning and Approximate Dynamic Programming: Scaling up to the Real World, New York: John Wiley & Sons, 2004.
[6] G. K. Venayagamoorthy, R. G. Harley, D. C. Wunsch, Dual Heuristic Programming Excitation Neurocontrol for Generators in a Multimachine Power System, IEEE Trans. Industry Applications, Vol. 39, No. 2, pp. 382-394, Mar./Apr. 2003.
[7] J. Si, Y. T. Wang, Online Learning Control by Association and Reinforcement, IEEE Trans. Neural Networks, Vol. 12, No. 2, pp. 264-276, Mar. 2001.
[8] R. Enns, J. Si, Helicopter Trimming and Tracking Control Using Direct Neural Dynamic Programming, IEEE Trans. Neural Networks, Vol. 14, No. 4, pp. 929-939, Jul. 2003.
[9] C. Lu, J. Si, X. Xie, et al., SVC Supplementary Damping Control Using Direct Neural Dynamic Programming, Proceedings of the 2004 IEEE International Symposium on Intelligent Control, pp. 270-274, 2004.
[10] B. Chaudhuri, R. Majumder, B. C. Pal, Wide-Area Measurement-Based Stabilizing Control of Power System Considering Signal Transmission Delay, IEEE Trans. Power Systems, Vol. 19, No. 4, pp. 1971-1979, Nov. 2004.
[11] L. A. S. Pilotto, M. Szechtman, A. Wey, et al., Synchronizing and Damping Torque Modulation Controllers for Multi-infeed HVDC Systems, IEEE Trans. Power Delivery, Vol. 10, No. 3, pp. 1505-1513, 1995.

VII. BIOGRAPHIES

Chao Lu (M’05) was born in Hebei province in China. He received the B.E. and Ph.D. degrees from Tsinghua University in 1999 and 2005, respectively. He is now an assistant professor in the Department of Electrical Engineering at Tsinghua University. His research interests include power system stability and control.

Jennie Si received the B.S. and M.S. degrees from Tsinghua University, Beijing, China, in 1985 and 1988, respectively, and the Ph.D. degree from University of Notre Dame, Notre Dame, IN. Since 1991, she has been on the faculty at Arizona State University, Tempe, where she is now Professor in the Department of Electrical Engineering. Her current research interests include the theory and application of artificial neural learning systems. Major application areas of her research are learning controllers, semiconductor manufacturing process optimization, and neural cortical information processing. Dr. Si is a recipient of the 1995 NSF/White House Presidential Faculty Fellow Award. She was Associate Editor of the IEEE TRANSACTIONS ON AUTOMATIC CONTROL in 1998 and 1999 and of the IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING from 1998 to 2002, and has been Associate Editor of the IEEE TRANSACTIONS ON NEURAL NETWORKS since 2000.

Xiaochen Wu received the M.S. degree in electrical engineering from Huazhong University of Science and Technology (HUST), Wuhan, in 1996. Currently, he is a Senior Engineer with the China Southern Power Grid Co., Ltd, Guangzhou, where he leads a power grid research team. His research activities include stability analysis and control, wide-area monitoring, and HVDC control.

Peng Li received the B.S., M.S. and Ph.D. degrees in electrical engineering from Tianjin University, Tianjin, in 1999, 2001, and 2004, respectively. His research interests include power system dynamic security assessment, power system optimization, and HVDC control and operation.