Distributed Generator Maintenance Scheduling
Adrian Petcu and Boi Faltings
{adrian.petcu, boi.faltings}@epfl.ch
Artificial Intelligence Laboratory, http://liawww.epfl.ch/
Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland

Abstract— In recent years the electricity sector has undergone a number of changes that all point in the direction of liberalization and decentralization of control. However, a number of technological challenges have to be addressed before the desired degree of decentralization is obtained. Currently, most of the decisions about the operation of a power system are still made in control centers in a centralized fashion. We present in this paper a distributed optimization method that allows several power plant operators to schedule preventive maintenance on their generation units in a distributed fashion. This method does not require the centralization of private data of the power plant operators (like available capacities, internal maintenance schedules, maintenance and operation costs, etc). Furthermore, the method guarantees globally optimal schedules while observing the power generation demand at all times. The problem is modeled as a constraint optimization problem, which is a powerful paradigm for solving numerous tasks in distributed AI, like planning, scheduling, and resource allocation. The algorithm that we use is a complete method for distributed constraint optimization, based on dynamic programming. It requires a linear number of messages, whose maximal size depends on a parameter of the constraint graph, called induced width. This makes our algorithm well suited for large but loose problems. We present a number of interesting extensions of the basic method that show promise as applications in power systems well beyond this scheduling scenario.

Index Terms— Maintenance Scheduling, Distributed Combinatorial Optimization, Multiagent Systems

I. INTRODUCTION

Nowadays, the transition of the electric power industry from a regulated monopoly to a deregulated industry is in full swing. The regulatory bodies that oversee the functioning of the power sector have widely begun to believe that new orientations, towards market-based systems with many smaller players, are more desirable, and adapt their policies accordingly. Consequently, many aspects of the industry are changing, including its infrastructure and operation. Important shifts in the number and ownership of power production facilities, and in the volume of power generation and capacity, have taken place in the past decade. The fundamental structure of the industry has been based in the past on the vertical integration of utilities, i.e., their involvement in the three functions of power supply: generation, transmission, and distribution of electricity. Generation is defined as the production of electric energy from other energy sources. Transmission is the delivery of electric energy over high-voltage lines from the power plants to the distribution areas. Distribution includes the local system of lower voltage lines, substations, and transformers which are used to deliver the electricity to end-use consumers.

Today, all three segments of the power sector see profound changes, all pointing in the direction of liberalization and decentralization of control. A compelling example that sees increasing interest is the distributed generation of electric power. Small rooftop photovoltaic arrays or fuel cells ranging from several to a few hundred kilowatts can be installed at or near the customer's site. Thus, the customers can also supply power back to the grid, possibly decreasing bottlenecks and costs, and increasing fault tolerance and overall efficiency. However, a number of technological challenges have to be addressed before the desired degree of decentralization is obtained. Currently, most of the decisions about the operation of a power system are still made in control centers in a centralized fashion. This will arguably no longer be the case in tomorrow's power grids. The diversification and multiplication of the actors in the power generation/transmission/distribution cycle require new tools and methodologies that are able to deal with these problems in a distributed fashion. We believe that many of the coordination techniques developed by the distributed AI community have matured enough to see large-scale deployment for a wide range of tasks in the operation of power systems. Most notably, the distributed constraint satisfaction/optimization framework (DCOP) is an efficient and competitive tool that has the potential to find wide applicability in power-systems-related coordination tasks. DisCSP techniques have already been successfully applied to various distributed problems like meeting scheduling for a group of agents ([1], [2], [3]), resource allocation in sensor networks ([4], [1], [3], [5]), large-scale coordination/scheduling for a large organization [6], or distributed timetabling problems ([7]). We believe that such methods are applicable to a wide range of problems which are distributed by nature.
Power systems, especially given the current trend towards liberalization and decentralization, are a very good candidate to benefit from them. In this paper we present as an example a maintenance scheduling scenario in which this method can be applied to find, in a distributed fashion, the optimal maintenance schedule for power generating units across several independent power plants. The rest of this paper is structured as follows: Section II introduces the distributed constraint optimization framework. Section III presents the maintenance scheduling problem, and expresses it in the distributed optimization framework. Section IV introduces DPOP, a distributed optimization algorithm. Section V presents a number of extensions of the basic DPOP optimization algorithm that can be very useful in power


systems related tasks. Section VI compares our approach with existing work. We conclude in Section VII.

II. THE DISTRIBUTED CONSTRAINT OPTIMIZATION FRAMEWORK

Constraint satisfaction/optimization is a powerful paradigm for solving numerous tasks in distributed AI, like planning, scheduling, resource allocation, etc. Traditionally, such problems were gathered into a single place, and a centralized algorithm was applied in order to find a solution. However, many real problems are naturally distributed between a set of agents, each one holding its own subproblem. A very good example is the power grid, with many actors involved in a host of complex interactions. Each actor can be modeled as an intelligent agent that seeks to optimize its operations. Globally, the designer of such a system would like the functioning of the system as a whole to be optimized with respect to some desirable criteria, like the overall degree of satisfaction of the users, minimal expected downtime, minimal maintenance costs, etc. The whole system can then be seen as a multiagent system that has as a goal finding the optimal solution to this large optimization problem. In such a setting, the practical advantages of solving these problems in a decentralized fashion become apparent. Centralized solving can pose difficult data integration problems. Centralization entails a single entity gaining knowledge of all private data of the agents involved (like available capacities, internal maintenance schedules, maintenance and operation costs, etc). In dynamic systems, by the time we manage to centralize the problem, it has already changed. To overcome these problems, distributed protocols that seek these optimal solutions without data centralization can be designed. The agents then have to communicate with each other through message exchange to implement the distributed optimization protocol.

A. Formal Definition

Definition 1: Distributed Constraint Satisfaction (DisCSP) is a framework that models these problems, formalized in the early 90's by Yokoo [8]. In distributed optimization, each variable and constraint is owned by an agent, and an agent can own multiple variables and constraints. We formally define a discrete distributed constraint optimization problem (DCOP) as a tuple <A, X, D, R> such that:
• A = {A1, ..., Ak} is a set of real entities in the system (e.g., power plant operators);
• X = {X1, ..., Xn} is the set of variables; each variable Xi is controlled by a virtual agent Xi;
• D = {d1, ..., dn} is a set of domains of the variables, each given as a finite set of possible values;
• R = {r1, ..., rm} is a set of relations, where a relation ri is any function di1 × ... × dik → R, which denotes how much utility is assigned to each possible combination of values of the involved variables. When these amounts are negative, their semantics is that of costs.
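As a minimal sketch (variable names and utilities are hypothetical, chosen only for illustration), the tuple of Definition 1 can be represented with relations as explicit utility tables:

```python
# A minimal sketch of the DCOP tuple <A, X, D, R> from Definition 1,
# with relations represented as tables mapping value combinations of
# the involved variables to utilities.
agents = ["PlantOperator1", "PlantOperator2"]             # A: real entities
variables = ["WK1/U1,1", "WK1/U1,2"]                      # X: one per virtual agent
domains = {v: ("ON", "OFF", "MAINT") for v in variables}  # D

# R: a binary relation as an explicit utility table; every value
# combination is allowed (a soft constraint), and negative utilities
# are read as costs. The -100 entry illustrates how a hard constraint
# (here: "never maintain both units in the same week") can be
# simulated by a large negative valuation.
r1 = {
    ("ON", "ON"): 10, ("ON", "OFF"): 4, ("ON", "MAINT"): 2,
    ("OFF", "ON"): 4, ("OFF", "OFF"): 0, ("OFF", "MAINT"): -1,
    ("MAINT", "ON"): 2, ("MAINT", "OFF"): -1, ("MAINT", "MAINT"): -100,
}

def total_utility(assignment, relations):
    """Sum the utilities of all relations under a complete assignment."""
    return sum(table[tuple(assignment[v] for v in scope)]
               for scope, table in relations)

relations = [((variables[0], variables[1]), r1)]
print(total_utility({"WK1/U1,1": "ON", "WK1/U1,2": "MAINT"}, relations))  # → 2
```

The goal is then the assignment maximizing this sum over all relations.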

This is a multiagent instance of the valued CSP framework of Schiex et al. [9]. In this formalization, the constraints are represented as tables of values and their associated utility/cost. They can express a wide range of functions, not restricted to the linear constraints used by the vast majority of operations research techniques that have been applied so far to various optimization tasks in power networks. The downside is that they generally do not have compact representations, as linear constraints do. In a DCOP, any value combination is allowed (these are the so-called soft constraints); the goal is to find an assignment X* for the variables Xi that maximizes the sum of utilities. Hard constraints (which strictly forbid/enforce certain value combinations) can be simulated by soft constraints by assigning large negative valuations to disallowed tuples, and 0 to allowed tuples. A utility-maximizing algorithm will thus avoid assigning such value combinations to variables. We also define, for a node Xk, Rk(Xj) as the (set of) relation(s) between Xk and its neighbor Xj.

B. Simplifying Assumptions in the Distributed Optimization Framework

For simplicity, we assume that each variable Xi is controlled by a corresponding virtual agent Xi, and use the terms interchangeably. A real agent Aj in the system can control several virtual agents and their associated variables. For example, a power plant operator has an associated real agent Aj that controls a set of virtual agents Xi ∈ X(Aj) assigned to each one of its generating units. We emphasize that the knowledge of variables/domains belongs only to the agents that own the respective variables, and the knowledge of the constraints is shared only between the agents involved in the respective constraints. An agent is able to communicate through message exchange with its neighbors in the constraint graph. It is desirable that any agent only communicates with its neighbors.
Communication is error-free, and messages are received in the order they were sent.

C. Existing Work in Distributed COPs

DCOP is a relatively young and very promising research area, which has boomed over the past few years, with many new and interesting results. The existing algorithms that address distributed optimization fall into two main categories: complete methods (which guarantee the optimality of the solutions they find), and incomplete methods (which offer no guarantees on the optimality of the solutions). When comparing DisCSP algorithms from the same class, one usually considers as a performance measure the amount of message exchange generated during the execution of the algorithm: the less communication, the better. This is normally appropriate, since communication is usually orders of magnitude more expensive than local computation. Distributed local search methods ([10]) belong to the class of incomplete methods. These methods start with a random solution, and then gradually improve it. Sometimes they produce good results with a small amount of effort. However,


they offer no guarantees on the quality of the final solution, which can be arbitrarily far from the optimum. The second type are complete methods, which guarantee the optimality of the solution they find. Some of these methods ([2]) have the downside that they may require a very large number of messages, thus producing significant communication overhead. Other complete methods ([11]) centralize parts of the problem; it is unknown a priori how much needs to be centralized where, and privacy ([12], [13], [14]) is an issue. These are shortcomings that have so far prevented the widespread use of distributed optimization methods. We will present in Section IV DPOP [3] (Distributed Pseudotree OPtimization), a recently introduced optimization algorithm. DPOP is a complete method (it always finds the optimal solution), and is based on dynamic programming. Compared to other algorithms, DPOP has the advantage that it generates a number of messages which is linear in the problem size. This algorithm thus eliminates the overhead of sending many small messages, but has the drawback that some of them may be large. In fact, the maximal message size depends on a parameter of the constraint graph, called the induced width. The induced width of a graph is related to its connectivity and clustering, so our algorithm works well for large but sparse problems. We believe that typical distributed optimization problems tend to be indeed loosely connected, so this optimization method based on dynamic programming can find a wide range of practical applications. Section V will then discuss a number of extensions of the basic DPOP algorithm, providing different compromises between solution quality and computational/communication load. We believe that these hybrids are suitable for large, distributed problems, where optimal solutions may be too costly to compute.
We will also briefly present an anytime extension, which provides increasingly accurate solutions while the execution is still in progress. We will also discuss dynamic problems (problems that change over time) and a self-stabilizing version of DPOP that can be applied in such settings. This technique always stabilizes in a state corresponding to the optimal solution of the optimization problem, even upon node failures or dynamic changes to the problem. A continuous-time optimization extension also takes into account the costs incurred by revising decisions that were previously taken, in the light of the evolution of the problem. We will also briefly outline some of our recent work touching on the topic of problem solving in systems with self-interested agents. We have developed algorithms based on a Vickrey-Clarke-Groves taxation scheme that induce incentive-compatibility, i.e., they make it in the agents' best interest to behave truthfully. We will also present a scheme which redistributes tax payments between the agents in the system, thus achieving budget balance with high probability.

III. MAINTENANCE SCHEDULING

We start out with Frost and Dechter's simplified model from [15], which closely follows that of Yellen et al. [16]. In brief, this model assumes a power plant with a given number of power generating units. There is a limited number

of maintenance crews, so a limited number of units can be in maintenance simultaneously. Each generating unit has an associated maintenance cost, a required maintenance duration, an operating cost, and a maximal power output. The schedule to be generated spans a certain number of weeks. A power demand forecast is available, based on past experience and predictions for the future. A valid maintenance schedule must take this demand into account, and make sure that enough units are online at all times to satisfy it. The goal is a valid maintenance schedule that minimizes maintenance costs and maximizes revenue. We extend this model to a multi-power-plant setting, in the sense that we want to jointly optimize the maintenance schedules of several different power plants, each with a number of generating units, its own constraints, operating costs, etc. In this case, the demand constraint is no longer imposed on a single power plant, but on subsets of power plants, according to geographic location, high-voltage interconnections between power plants, etc. These constraints are imposed by central authorities, like the current control centers. However, we emphasize that private data belonging to each power plant (like available capacities, internal maintenance schedules, maintenance and operation costs, etc.) is not transmitted to any central authority.

A. Maintenance Scheduling as Distributed Optimization

The modeling as a constraint problem is similar to Frost and Dechter [15]. We present in Figure 1 an example scheduling problem for 2 power plants, each with 3 power generating units, over a 2-week period. We will refer to this example for the rest of this section and in the following section. Each power plant has a matrix of variables specifying the status of each power generating unit in each week of the schedule. Each variable, e.g. WK2/U3,1 (the status of unit 3 of power plant 1 in period 2), has a domain of 3 values: ON, OFF, MAINT.
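As a sketch of the model developed in this section (all names and numeric values are hypothetical, chosen only for illustration), the status variables, the unary cost/revenue relations, and the output/demand constraints could be represented as:

```python
# Status variables: one per generating unit per week, domain {ON, OFF, MAINT},
# for 2 plants with 3 units each over a 2-week horizon.
DOMAIN = ("ON", "OFF", "MAINT")
n_plants, n_units, n_weeks = 2, 3, 2
variables = {
    f"WK{w}/U{u},{p}": DOMAIN
    for p in range(1, n_plants + 1)
    for u in range(1, n_units + 1)
    for w in range(1, n_weeks + 1)
}

# Unary relation: valuation of one unit's status in one week.
def unary_relation(maint_cost, run_cost, revenue):
    return {"MAINT": -maint_cost, "OFF": 0, "ON": revenue - run_cost}

# E.g. if maintenance costs 100, running costs 50 and yields revenue 200:
r_u11_wk1 = unary_relation(maint_cost=100, run_cost=50, revenue=200)
# → {"MAINT": -100, "OFF": 0, "ON": 150}

# Sum constraint: a plant's total output is the summed output of the
# units that are ON; the demand constraint requires the plants' total
# outputs to cover the weekly forecast.
def total_output(unit_powers, statuses):
    return sum(p for p, s in zip(unit_powers, statuses) if s == "ON")

wk1_out1 = total_output([300, 200, 100], ["ON", "MAINT", "ON"])  # plant 1
wk1_out2 = total_output([400, 250, 150], ["ON", "ON", "OFF"])    # plant 2
demand_met = (wk1_out1 + wk1_out2) > 900   # hypothetical forecast Demand1
print(len(variables), r_u11_wk1["ON"], demand_met)
```

The dictionary keys mirror the WKw/Uu,p naming used in Figure 1; only the structure matters here, not the particular numbers.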
Variables corresponding to the same unit across periods are connected through the solid horizontal constraints that ensure an adequate period for maintenance: for example, each unit must be maintained at least once, hydro-electric units require less time to maintain than nuclear-powered ones, etc. Maintenance and running costs per period per unit, as well as the utility gained from running a unit in a given week, are captured by unary relations on the individual state variables. These unary relations model the maintenance costs as negative valuations assigned to periods when the units are in maintenance. Conversely, the revenue generated by an operational unit is modeled as a positive valuation for the periods when the unit is ON. For example, if in week 1 having maintenance performed on U1,1 would cost 100, having it offline would cost nothing and would return nothing, and having it running would cost 50 and would yield 200, then the unary relation on WK1/U1,1 assigns -100 to WK1/U1,1's MAINT value, 0 to its OFF value, and 150 to its ON value. In addition to [15], we have for each power plant a total output variable for each schedule period: WK1/Out2 is the output of power plant 2 in period 1. These variables are


Fig. 1. An example scheduling problem (a), and (b) part of its pseudotree generated by a depth-first traversal of the problem graph (DFS).

connected to all the variables corresponding to the generating units of the respective power plant by a sum constraint (the vertical, hashed constraints): WK1/Out2 = Σ_i Power_i(U_i,2) × WK1/U_i,2. This constraint forces the total output variable to equal the sum of the outputs generated by the online generating units in the power plant. All corresponding total output variables from each power plant for each period are connected through the demand constraints. These constraints ensure that the sum of the outputs of all power plants is greater than the forecast. For example, the projected demand for WK1 must be met: WK1/Out = Σ_i WK1/Out_i > Demand_1. The total output variables WK_i/Out have a double role. On one hand, they reduce the arity of the demand constraints from the total number of generating units to the number of independent power plants. On the other hand, they help preserve the privacy of each power plant, because private information like the capacity and status of the internal generating units is obscured outside the owning power plant: only the total output is visible from the outside.

IV. DPOP: DISTRIBUTED PSEUDOTREE OPTIMIZATION

Once we have modeled the maintenance problem as a DCOP, we can apply a general-purpose distributed optimization algorithm like DPOP [3] to solve it. DPOP is a distributed optimization algorithm which works on a pseudotree arrangement of the problem graph. It has 3 phases. In the first phase (see Section IV-A), the pseudotree structure is established through a depth-first traversal of the graph, using a custom distributed token-passing algorithm [3]. The result of this phase is that all nodes consistently label each other as parent/child or pseudoparent/pseudochild. The second phase (see Section IV-B) is a bottom-up utility propagation, and the third phase (see Section IV-C) is a top-down value assignment propagation.

Algorithm DPOP requires a linear number of messages, the largest one being space-exponential in the induced width of the pseudotree. For a formal description of DPOP, correctness and complexity proofs, and performance evaluations, see [3].

A. Pseudotrees: definition and distributed generation

Definition 2 (Pseudotree): A pseudotree arrangement of a graph G is a rooted tree with the same vertices as G and the property that adjacent vertices from the original graph fall in the same branch of the tree (e.g. WK1/U3,1 and WK2/U3,1 in Fig. 1).
Pseudotrees have already been investigated as a means to boost search ([17], [18], [19], [20]). The main idea behind their use in search is that, due to the relative independence of nodes lying in different branches of the pseudotree, it is possible to perform search in parallel on these independent branches. It is known that any DFS traversal of a graph is a pseudotree arrangement, although the inverse is not necessarily true. Therefore, we will use as a pseudotree ordering a DFS arrangement of the problem graph. This is obtained using a custom token-passing mechanism, described in the following subsection.
Figure 1 shows a scheduling problem for 2 power plants, each with 3 power generating units, over a 2-week period. The graph on the left is DFS-traversed from node WK1/Out, and a part of the resulting pseudotree is depicted on the right side. The pseudotree consists of tree edges, shown as solid lines, and back edges, shown as dashed lines, that are not part of the DFS tree. We call a tree-path a path made entirely of tree edges.
Definition 3: P(X)/C(X) are the parent/children of a node X: these are the obvious definitions (e.g. C(WK1/Out1) = {WK1/U1,1}). PP(X) are the pseudoparents of a node X: the set of nodes higher in the pseudotree that are connected to X directly through back edges (PP(WK2/U2,1) =


{WK1/U2,1}). PC(X) are the pseudochildren of a node X: the set of nodes lower in the pseudotree that are connected to X directly through back edges (e.g. PC(WK1/U2,1) = {WK2/U2,1}).
1) Token Passing Mechanism for Distributed DFS Generation: We used a custom implementation that generates in total 2 × |R| linear-size messages (two per edge in the graph). Non-binary constraints like the demand constraints are treated as cliques of the involved variables. We use the example from Figure 2 to show the functioning of this mechanism. Figure 2(a) shows a generic constraint optimization problem with 14 variables, and Figure 2(b) shows a possible depth-first traversal of the corresponding problem graph that starts from X0. Briefly, the process is initiated by the root, which sends a TOPO message to one of its neighbors (e.g. to X1). This TOPO message contains as context the id of the root: TOPO[0]. Subsequently, a node which receives a TOPO message for the first time marks the sender as its parent (P(X1) = X0). X1 adds its own id to the context of the received TOPO message, and then sends it to an unvisited neighbor (e.g. to X4). X4 receives TOPO[0,1] from X1 and marks X1 as its parent. Now, since X0 is a neighbor of X4, and X0 is also present in the context of the message that X4 received from X1, X4 marks X0 as its pseudoparent, and sends the message TOPO[0,1,4] to X0. Thus, X0 can also mark X4 as its pseudochild. X4 continues by sending back to X1 a TOPO[0,1] message, which informs X1 that the discovery of the subtree hanging from X4 is finished. X1 can then continue with the exploration of its other subtree, and sends its TOPO[0,1] message to X3. X3 sends TOPO[0,1,3] to X8, which marks X1 as its pseudoparent and sends X1 TOPO[0,1,3,8], which means that X1 can also mark X8 as its pseudochild, and so on. The process finishes for a node when it has received a TOPO message from each of its neighbors. The whole process finishes when the root has finished. Note that the UTIL propagation can, and in fact does, begin before the whole DFS generation process is over. As soon as a node finishes the DFS generation, it starts the UTIL phase (see Section IV-B). It is easy to see that there is exactly one TOPO message going in each direction through each edge, thus the total number of messages is 2 × |R|. Their size is linear, the largest one having a number of ids in the context that equals the height of the DFS tree.
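The labeling produced by this token-passing process can be simulated centrally; the following sketch (illustrative code, not the distributed implementation of [3]) reproduces the parent/pseudoparent relations on the fragment of Figure 2 walked through above, with the set of ancestors playing the role of the TOPO message context:

```python
# Centralized simulation (for illustration only) of DFS pseudotree
# construction: 'ancestors' plays the role of the TOPO context.
def build_pseudotree(neighbors, root):
    parent = {root: None}
    children = {v: [] for v in neighbors}
    pparents = {v: set() for v in neighbors}   # pseudoparents PP(X)
    pchildren = {v: set() for v in neighbors}  # pseudochildren PC(X)
    visited = set()

    def visit(node, ancestors):
        visited.add(node)
        for nb in neighbors[node]:
            if nb == parent.get(node):
                continue                       # tree edge we arrived on
            if nb in ancestors:                # back edge to an ancestor
                pparents[node].add(nb)
                pchildren[nb].add(node)
            elif nb not in visited:            # unvisited: becomes a child
                parent[nb] = node
                children[node].append(nb)
                visit(nb, ancestors | {node})
            # a visited non-ancestor is a descendant reached earlier;
            # that back edge was already recorded from its side

    visit(root, set())
    return parent, children, pparents, pchildren

# The fragment of Figure 2 discussed above, by node id only:
# edges X0-X1, X1-X4, X0-X4, X1-X3, X3-X8, X1-X8.
g = {0: [1, 4], 1: [0, 4, 3, 8], 3: [1, 8], 4: [1, 0], 8: [3, 1]}
parent, children, pparents, pchildren = build_pseudotree(g, root=0)
print(parent[4], pparents[4], pparents[8])  # → 1 {0} {1}
```

As in the text, X4's parent is X1 with pseudoparent X0, and X8's parent is X3 with pseudoparent X1.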

B. UTIL propagation

The UTIL propagation starts bottom-up from the leaves and propagates upwards only through tree edges. The agents send UTIL messages to their parents. Intuitively, such a message informs a parent node Xj how much utility u*_Xi(v_j^k) each one of its values v_j^k yields in the optimal solution of the whole subtree rooted at the sending child, Xi. If there is no back edge connecting a node from Xi's subtree to a node above Xj, then these valuations depend only on Xj's values, and the message from Xi to Xj is a vector with |dom(Xj)| values. Otherwise, these back edges have to be taken into account, and their handlers are present as dimensions in the message from Xi to Xj.
Definition 4: UTIL_i^j is the UTIL message sent by agent Xi to agent Xj; this is a hypercube with one dimension for each variable present in the context. dim(UTIL_i^j) is the whole set of dimensions (variables) of the message.
To compute this message, a node Xi joins all the messages it received from its children with the relations it has with its parent and pseudoparents. Afterwards, it considers all combinations of values of its parent/pseudoparents, and for each one it finds its optimal value and the associated utility. The leaf nodes initiate the process. Then each node Xi relays these messages according to the following process:
• Wait for UTIL messages from all children. Since all the respective subtrees are disjoint, joining the messages from all children gives Xi exact information about how much utility each of its values yields for the whole subtree rooted at itself. In order to assemble a similar message for its parent Xj, Xi has to take into account Rij and any back-edge relation it may have with nodes above Xj. Performing the join with these relations and projecting itself out of the result gives a matrix with the optimal utilities that can be achieved for each possible combination of values of Xj and the possible context variables. Thus, Xi can send Xj its UTIL_i^j message.
• If Xi is the root node, it receives all its UTIL messages as vectors with a single dimension, itself. It can then compute the optimal overall utility corresponding to each one of its values (by joining all the incoming UTIL messages) and pick the optimal value for itself (projecting itself out).

C. VALUE propagation

The VALUE phase is a top-down propagation phase, initiated by the root after receiving all UTIL messages. Based on these UTIL messages, the root assigns itself the optimal value, i.e. the one that maximizes the sum of the utilities of all its subtrees (the overall utility). It then announces its decision to its children and pseudochildren by sending them a VALUE message (VALUE(Xi ← v_i*)). Upon receipt of the VALUE message from its parent, each node picks the optimal value for itself in a similar fashion, and in its turn sends its own VALUE messages. When the VALUE propagation reaches the leaves, all variables in the problem are instantiated to their optimal values, and the algorithm terminates.

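For intuition, the UTIL/VALUE propagation can be illustrated on a tiny chain X0 - X1 - X2 with no back edges (all utilities below are hypothetical); each UTIL message is then just a vector over the parent's domain:

```python
# Minimal, centralized illustration of DPOP's UTIL/VALUE phases on a
# chain X0 - X1 - X2 with no back edges (hypothetical utilities).
domains = {"X0": [0, 1], "X1": [0, 1], "X2": [0, 1]}
# relations keyed by (parent value, child value) -> utility
r01 = {(0, 0): 5, (0, 1): 1, (1, 0): 2, (1, 1): 4}
r12 = {(0, 0): 3, (0, 1): 0, (1, 0): 1, (1, 1): 6}

# UTIL phase (bottom-up): X2 -> X1, then X1 -> X0. Each entry is the best
# achievable utility of the sender's subtree for one parent value.
util_2_to_1 = {v1: max(r12[(v1, v2)] for v2 in domains["X2"])
               for v1 in domains["X1"]}
util_1_to_0 = {v0: max(r01[(v0, v1)] + util_2_to_1[v1] for v1 in domains["X1"])
               for v0 in domains["X0"]}

# VALUE phase (top-down): the root picks its best value, then each node
# picks its best value given its parent's announced choice.
best0 = max(domains["X0"], key=lambda v0: util_1_to_0[v0])
best1 = max(domains["X1"], key=lambda v1: r01[(best0, v1)] + util_2_to_1[v1])
best2 = max(domains["X2"], key=lambda v2: r12[(best1, v2)])
print(best0, best1, best2, util_1_to_0[best0])  # → 1 1 1 10
```

With back edges, the same computation operates on hypercubes instead of vectors, with one extra dimension per back-edge handler in the context.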
D. Complexity Analysis of DPOP

It has been proved in [3] that DPOP produces a linear number of messages for general distributed optimization problems. Its complexity lies in the size of the UTIL messages (the VALUE messages have linear size). This is also true for its instantiation to maintenance scheduling problems. Let us denote by w the width of the problem graph induced by the DFS ordering used by the process in Section IV-A. The induced width of a graph is a parameter that captures the


density of the graph [19]. DPOP's complexity is completely characterized by the parameter w:
Theorem 1: The maximal amount of computation on any node in DPOP is O(exp(w + 1)). The largest UTIL message has exp(w) values.
Sketch of Proof: Petcu and Faltings [3] present a complexity analysis of DPOP and a detailed proof of this claim. The key observation is that both the maximal dimensionality of any message and the induced width of the graph are equal to the same number: the maximal number of overlaps of tree-paths associated with back edges with distinct handlers. □

V. DPOP EXTENSIONS

We present in the following a number of useful extensions of the basic DPOP optimization algorithm that deal with different aspects of problem solving in a distributed environment. The basic mechanics of the algorithm remain the same: DFS arrangement, UTIL propagation and VALUE propagation. The changes are mostly made at the level of the UTIL propagation.

A. Dealing with difficult problems

As shown in Section IV-D, DPOP is time- and space-exponential in the induced width of the problem. Its complexity lies in the size of the UTIL messages. Therefore, in case the problem has high induced width, the messages generated in the high-width areas of the problem become large, and DPOP may be infeasible. If we consider the example from Figure 2, we see that the UTIL message from X11 to X5 normally has to have 3 dimensions: X0, X2, X5. In case the domains of X0, X2 and X5 have 100 values each, the message UTIL_11^5 has 1 million values, which may be too expensive to send. We present in the following two orthogonal methods that deal with such difficult problems by renouncing exactness and settling for good, but suboptimal, solutions.
1) Approximations and anytime optimization: In [21] we present an approximate version of DPOP, which allows the desired tradeoff between solution quality and computational complexity.
The scheme can be tuned via two parameters: maxDims represents the maximal number of dimensions that any message in the system can carry, and maxδ represents the maximal allowed distance from the optimum. The algorithm works as normal DPOP, except in dense parts of the problem, where an agent needs to send high-dimensionality messages (more dimensions than maxDims). In these cases, dimensions in excess of maxDims are forcibly removed using maximal/minimal projections that retain the best/worst valuations from the original message. Thus, the outgoing messages consist of two parts (an upper- and a lower-bound part) and obey the size limit imposed by maxDims. In the example from Figure 2, this happens with UTIL_11^5 if we impose maxDims = 2. Then, X11 drops X0 from UTIL_11^5, and creates 2 messages (upper/lower bounds) with only 2 dimensions: X5 and X2.
Wherever such maximal/minimal projections are performed by a node Xi, one can compute a maximal distance from the optimum δi that is guaranteed to be observed during the VALUE assignment phase. maxDims is an accurate measure of the computational/communication effort, as it bounds the exponent of the message size. The new complexity is O(d^maxDims), as opposed to O(d^w), with maxDims < w. maxδ is a measure of the desired solution quality. The two parameters maxDims and maxδ are obviously conflicting. In case one cannot satisfy both of them, one needs to settle for the classical trade-off: accuracy vs. complexity. If optimality is the main concern, then one can specify e.g. maxδ = 10% and no maxDims. The effect is that as many dimensions as needed are used in order to guarantee that the obtained solution is within 10% of the optimum. Notice that this does not necessarily mean that the maximal number of dimensions will actually be used; depending on the valuation structure of the problem, one or two dimensions could very well be enough. Conversely, if computation/communication load is the main concern, then one can specify e.g. maxDims = 2 and no maxδ. In this case, the largest message has 2 dimensions, and we obtain the best solution available for this much computation, together with an upper bound on its distance from the true optimum. If this distance is good enough, then the algorithm returns this solution. Otherwise, we can re-run the algorithm with an increased maxDims. Notice that in this case, we can reuse much of the previous work: one needs to re-run the propagation only in those areas of the problem where the maximal dimension bound was exceeded.
2) Local search hybrids: An alternative approach for difficult problems is given by local search methods. These methods start with some assignment (in our case an existing schedule), and then gradually improve it by applying incremental changes. Their advantage is that they require linear memory, and in many cases provide good solutions with a small amount of effort.
Fig. 2. A generic constraint optimization problem (a), and (b) a possible DFS arrangement and the corresponding DPOP message flow: UTIL messages go bottom-up, and VALUE messages go top-down.

However, the agents involved often take myopic decisions, in the sense that they take into account only local information, thus getting stuck in local optima rather easily. Large neighborhood search tries to overcome this problem by exploring a much larger set of neighboring states before moving to the next one. We proposed in [22] a distributed algorithm that combines the advantages of both these approaches. This method is controlled by a parameter maxDims which specifies the maximal allowable amount of inference; the maximal space requirements are exponential in this parameter. In the dense parts of the problem, where the required amount of inference exceeds this limit, the algorithm executes a local search procedure guided by as much inference as maxDims allows. In the example from Figure 2, this happens with UTIL_11^5 if we impose maxDims = 2. Then, X11 notifies X0 and X2 that they have to start a local search procedure. X0 and X2 then start with a random assignment of their values, and make improving changes, guided by inference on X11 and X5 at each local search step. If maxDims is equal to or larger than the induced width of the graph, then the algorithm performs full inference and is therefore complete. Larger values of maxDims are proven to produce better results in terms of solution quality, at the cost of increased computational/communication effort. Experimental results on meeting scheduling problems show the effectiveness of the scheme, and the fact that very good solution quality can be obtained even for small values of maxDims (i.e., with little effort).

B. Anytime optimization

We present in [21] an anytime version of DPOP, which provides increasingly accurate solutions while the propagation is still in progress. This makes it suitable for very large, distributed problems, where the propagation may take too long to complete, and good (but not necessarily optimal) solutions are needed fast. The method works by having all agents compute online upper/lower bounds on the valuations they could receive from the neighbors that have not yet sent their UTIL messages. These bounds are then joined with the actual messages that the agents did receive from their other neighbors. The result is that each agent can compute better and better upper/lower bounds on the utility given by each one of its values to the rest of the problem, and is therefore able to pick its value with increasing quality guarantees.

C. Self-stabilization and fault containment

Self-stabilization in distributed systems [23] is the ability of a system to always reach an optimal state even when faults occur or the environment changes dynamically. This makes such systems particularly interesting for power systems, which can be both faulty and dynamic. We have proposed in [24] a self-stabilizing extension of DPOP which, given enough time, is guaranteed to always stabilize in the optimal solution of the optimization problem, even upon faults or dynamic changes to the problem. We envisage as applications dynamic optimal power flow, or dynamic unit commitment with the ability to deal with unforeseen unit failures.
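The bound computation of the anytime scheme in Section B above can be sketched for a single agent as follows; the per-value utilities and the pending subtree's a-priori range are illustrative assumptions, not the exact algorithm from [21]:

```python
# `received` holds UTIL messages already in: each maps a value of this
# agent to the utility of the corresponding subtree. `pending_ranges`
# holds assumed [lo, hi] utility ranges for subtrees whose UTIL messages
# have not arrived yet.
def anytime_bounds(values, received, pending_ranges):
    bounds = {}
    for v in values:
        base = sum(msg[v] for msg in received)
        bounds[v] = (base + sum(lo for lo, _ in pending_ranges),
                     base + sum(hi for _, hi in pending_ranges))
    return bounds

received = [{0: 5, 1: 3}, {0: 1, 1: 4}]   # two subtrees have reported
pending = [(0, 2)]                         # one subtree is still pending
bounds = anytime_bounds([0, 1], received, pending)
# pick the value with the best guaranteed (lower-bound) utility so far
best = max(bounds, key=lambda v: bounds[v][0])
```

As more UTIL messages arrive, `pending` shrinks and the two bounds for each value converge, which is what yields the increasing quality guarantees.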

1) Super-stabilization: A super-stabilizing extension of this technique ([24]) can be used as a safety-preserving mechanism in power system control, where we want to ensure that the controls we apply to the system are consistent with each other, despite being applied by independent operators. This extension works by ensuring that the system maintains the previous optimal state during a transitory phase, until the new optimal solution is found. Then, the system as a whole executes a transition from the "last-known-good" state to the new optimal state in a synchronized, atomic switch.

2) Fault containment: A general scheme for fault containment ([24]) upon low-impact failures/changes is an effective method that limits the areas where a change in the optimization problem has an effect. The scheme works by analyzing the updated messages that have to be retransmitted in case of a change in the problem, and confining their propagation to only those areas where they can possibly change the current system state. Multiple, simultaneous, isolated failures/changes are handled effectively, without solving the problem from scratch.

3) Fast response time upon isolated faults: In highly dynamic systems, optimal decisions have to be made as quickly as possible. In some cases, we want to respond to a perturbation by immediately assigning the new optimal value to the "touched" variable, and then gradually re-assigning the neighboring ones to their new optimal values, until the whole system is stabilized again. For example, when a generator breaks down, we want to cancel maintenance on another one as quickly and as cost-effectively as possible to compensate for its load, and then gradually recompute the new optimal maintenance schedule. We also want to deal effectively with multiple simultaneous faults which are unrelated (their effects are localized in different parts of the problem). This is possible with a uniform UTIL propagation ([24]) that gives each node global utility information. Thus, each node can immediately assess locally the global effect of a local perturbation, and switch immediately to its new optimal value.
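The containment idea behind point 2) can be sketched as follows; the tree fragment, the message values, and the recompute function are hypothetical, not the exact scheme from [24]. Propagation up the DFS tree stops as soon as a recomputed UTIL message is identical to the one previously sent:

```python
# Walk up the DFS tree from the changed node, retransmitting UTIL
# messages only while they actually differ from the previously sent
# ones; the first unchanged message absorbs the fault (containment).
def affected_nodes(parent, recompute_msg, old_msgs, source):
    affected = []
    node = source
    while node is not None:
        new_msg = recompute_msg(node)
        if new_msg == old_msgs.get(node):
            break                      # change absorbed here: containment
        affected.append(node)
        old_msgs[node] = new_msg
        node = parent.get(node)
    return affected

parent = {"X11": "X5", "X5": "X0", "X0": None}   # fragment of a DFS tree
old = {"X11": 7, "X5": 4, "X0": 9}               # previously sent messages
new = {"X11": 8, "X5": 4, "X0": 9}               # only X11's message changed
affected = affected_nodes(parent, new.get, dict(old), "X11")
```

Here the perturbation at X11 never propagates past X5, so the rest of the problem is untouched; multiple such isolated faults can be processed in parallel.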


D. Continuous-time optimization

In a dynamic environment, optimizing maintenance schedules continuously and never committing to a solution is obviously not useful. Commitments need to be made and acted upon, and, occasionally, revised as new events unfold. In dynamic systems it is in general desirable to maintain a current optimal state as much as possible, even in the face of changes in the environment. In [25] we define the distributed, continuous-time optimization problem, which provides a way to model dynamic environments and to reason about the optimality of sequences of adjustments to existing solutions of such dynamic problems. We identify two kinds of commitments: soft commitments and hard commitments. Soft commitments model reversible decisions (typically contracts with associated penalties if broken): a maintenance crew is assigned and deployed, maintenance materials are purchased, etc. These can be revised if the benefit extracted from the change outweighs its cost. Hard commitments model irreversible processes, and are impossible to undo. For example, some generating units, once shut down, are impossible to restart immediately.

In dynamic constraint reasoning systems, solution stability [26] is defined as a metric on the distance between solutions to successive instances of a dynamically changing problem. We proposed in [25] a general, semantically well-defined notion of solution stability in such systems, based on the cost of change from an already implemented solution to the new one. The approach allows maximum flexibility in specifying these costs through the use of stability constraints. Users have full flexibility in specifying both when they want to commit their variables to their current optimal values (assigning crews to tasks, taking generators down, etc.), and how much it would cost to revise a decision once it is taken (relocation costs for a crew, contract re-negotiation with suppliers, etc.).
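A minimal sketch of a stability constraint in this spirit, with a hypothetical utility function and change costs (the variable and cost names are illustrative, not from [25]): a soft commitment is revised only if the utility gain outweighs its cost of change.

```python
# Revise a committed (old) assignment to a new one only if the utility
# gain exceeds the total cost of changing the committed variables.
def should_revise(old_assignment, new_assignment, utility, change_cost):
    gain = utility(new_assignment) - utility(old_assignment)
    cost = sum(change_cost.get(var, 0)
               for var in new_assignment
               if new_assignment[var] != old_assignment.get(var))
    return gain > cost

# e.g. relocating a maintenance crew: worthwhile at cost 3, not at cost 15
utility = lambda a: 10 * a.get("crew_on_unit2", 0)
old, new = {"crew_on_unit2": 0}, {"crew_on_unit2": 1}
cheap = should_revise(old, new, utility, {"crew_on_unit2": 3})
costly = should_revise(old, new, utility, {"crew_on_unit2": 15})
```

A hard commitment corresponds to an infinite change cost, which makes the revision test fail for any finite gain.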
We presented in [25] the first mechanism for combinatorial optimization that guarantees optimal solution stability in dynamic environments, based on this notion of solution stability. Optimal decisions are continuously made, and even revised upon changes in the environment, if the benefit extracted from the changes outweighs their cost and better overall solutions can be found. For example, a generator may experience a failure and need to be taken down for unscheduled maintenance. The power plant operator may then decide to reassign a maintenance crew from another (previously scheduled!) task, if this means saved costs from penalties for not observing contractual duties to customers. On an accurate model, the dynamic algorithm from [25] ensures that optimal decisions are always made from the current state, given the set of events that have happened.

E. Incentive compatible optimization

In a setting where the participating agents are self-interested, it is possible that they would try to manipulate the optimization process such that the outcome is more favorable to themselves than the overall optimal solution. Such manipulations steer the final outcome away from a global optimum which is otherwise achievable. The Vickrey-Clarke-Groves (VCG) tax mechanism is a way to ensure that the agents in the system are always better off declaring their true preferences, thus allowing the optimal outcome to be chosen. The VCG mechanism is truthful, meaning that each agent can always maximize its own utility by reporting its true utility function, whatever the reports of the other agents. This dominant-strategy equilibrium is useful because it frees an agent from modeling the behavior of other agents. The VCG mechanism works by charging the participating agents taxes which are proportional to the damage they cause to others. Thus, the agents have no incentive to understate their valuations, because the chosen outcome would not be the best for them, and no incentive to overstate, because that would induce a high tax to pay for hurting the others. There is a long tradition of leveraging the VCG mechanism within distributed AI, going back to Ephrati and Rosenschein [27], who considered the use of VCG mechanisms to achieve consensus among agents. In [28], [29] we propose two incentive compatible mechanisms that allow a set of self-interested agents to express preferences over a set of decisions they want to take jointly, and to reach a globally optimal solution. Both mechanisms work by computing and collecting VCG taxes in a distributed fashion. A common problem with VCG taxes is that the Groves payments cannot, in general, be redistributed to the agents without affecting the incentive properties, and have to be burned or donated outside the system. This translates into a net loss in utility for the agents, due to the fact that taxes are levied and never returned. However, tax payments made by other agents can be refunded to an agent as long as that agent has no influence on the computation of those payments.
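A toy, centralized illustration of the Clarke tax just described, with made-up valuations over two candidate decisions; the distributed mechanisms in [28], [29] compute the same taxes without centralizing the reports:

```python
# Clarke (VCG) tax over reported valuations: each agent pays the utility
# loss its presence inflicts on the other agents (numbers illustrative).
def vcg(valuations, decisions):
    def best(agents):
        return max(decisions, key=lambda d: sum(v[d] for v in agents))
    d_star = best(valuations)                 # efficient decision
    taxes = []
    for i in range(len(valuations)):
        others = valuations[:i] + valuations[i + 1:]
        d_without = best(others)              # what the others would pick alone
        taxes.append(sum(o[d_without] for o in others)
                     - sum(o[d_star] for o in others))
    return d_star, taxes

reports = [{"A": 5, "B": 0}, {"A": 0, "B": 3}, {"A": 2, "B": 2}]
decision, taxes = vcg(reports, ["A", "B"])
```

In this instance only the first agent is pivotal (decision "A" would not be chosen without it), so only that agent pays a tax; the other two pay nothing, which is exactly the "damage to others" intuition above.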
[30], [28], [31] present methods that, under some conditions, are able to redistribute VCG payments back to the agents, thus achieving better net utility for the agents involved.

VI. RELATED WORK

This work is closely related to [15]. Compared to that work, we extend the model to a multi-power-plant setting. We directly optimize the schedules, as opposed to solving a sequence of satisfaction problems with decreasing upper bounds on cost. Furthermore, we solve this task in a distributed fashion, preserving the private information of the agents involved. To the best of our knowledge, this is the first attempt to deploy a complete, distributed optimization method to solve a power-systems-related task. We are aware of other multiagent systems for power systems, like [32], but these approaches are either negotiation-based (meaning they cannot guarantee globally optimal solutions), or they include some "data integration" techniques (i.e., they centralize information). Linear/quadratic programming methods from the operations research community seem to define the current state of the art in optimization in power systems applications. However, up to now these methods do not seem to be good candidates for distributed algorithms, and there are few attempts in this direction.

VII. CONCLUSIONS AND FUTURE WORK

We have shown how the maintenance scheduling problem can be solved in a decentralized fashion using a distributed constraint optimization algorithm. The need for centralizing all data is eliminated, thus offering the advantage of increased privacy for the parties involved, and possibly efficiency gains as well, since the data integration step is skipped. Our optimization method is a utility propagation algorithm based on dynamic programming that requires a linear number of messages. We believe that these constraint optimization methods based on dynamic programming are among the best suited for distributed environments. We also presented a number of interesting extensions of the basic method that deal with different aspects of distributed problem solving, like complexity, dynamic environments and self-interest. We believe that the applicability of this array of techniques to power systems extends well beyond this simple maintenance scheduling scenario, and we are currently considering other tasks like determining the optimal power flow, dynamic stability, unit commitment, power system restoration, and expansion planning.

REFERENCES

[1] Rajiv T. Maheswaran, Milind Tambe, Emma Bowring, Jonathan P. Pearce, and Pradeep Varakantham, "Taking DCOP to the real world: Efficient complete solutions for distributed multi-event scheduling", in AAMAS-04, 2004.
[2] Pragnesh Jay Modi, Wei-Min Shen, Milind Tambe, and Makoto Yokoo, "ADOPT: Asynchronous distributed constraint optimization with quality guarantees", AI Journal, vol. 161, pp. 149–180, 2005.
[3] Adrian Petcu and Boi Faltings, "A scalable method for multiagent constraint optimization", in Proceedings of the 19th International Joint Conference on Artificial Intelligence, IJCAI-05, Edinburgh, Scotland, Aug 2005.
[4] P. Modi, W. Shen, M. Tambe, and M. Yokoo, "An asynchronous complete method for distributed constraint optimization", in Proceedings of AAMAS-03, Melbourne, Australia, July 2003.
[5] Ramon Bejar, Cesar Fernandez, Magda Valls, Carmel Domshlak, Carla Gomes, Bart Selman, and Bhaskar Krishnamachari, "Sensor networks and distributed CSP: Communication, computation and complexity", Artificial Intelligence, vol. 161, no. 1-2, pp. 117–147, 2005.
[6] Carlos Eisenberg, Distributed Constraint Satisfaction For Coordinating And Integrating A Large-Scale, Heterogeneous Enterprise, PhD thesis no. 2817, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland, September 2003.
[7] Amnon Meisels and Eliezer Kaplansky, "Scheduling agents - distributed timetabling problems (DisTTP)", in The Fourth International Conference on the Practice and Theory of Automated Timetabling (PATAT 2002), Gent, Belgium, 2002, pp. 166–180.
[8] Makoto Yokoo, Edmund H. Durfee, Toru Ishida, and Kazuhiro Kuwabara, "Distributed constraint satisfaction for formalizing distributed problem solving", in International Conference on Distributed Computing Systems, 1992, pp. 614–621.
[9] Thomas Schiex, Hélène Fargier, and Gerard Verfaillie, "Valued constraint satisfaction problems: Hard and easy problems", in Proceedings of the 14th International Joint Conference on Artificial Intelligence, IJCAI-95, Montreal, Canada, 1995.
[10] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing", Science, vol. 220, no. 4598, pp. 671–680, May 1983.
[11] Roger Mailler and Victor Lesser, "Solving distributed constraint optimization problems using cooperative mediation", in Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), 2004.

[12] Marius Silaghi and Makoto Yokoo, "Nogood-based asynchronous distributed optimization (ADOPT-ng)", in Proceedings of the International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS-06), Hakodate, Japan, May 2006.
[13] Marius-Calin Silaghi, D. Sam-Haroud, and B. Faltings, "Distributed asynchronous search with private constraints", in Proc. of AA2000, Barcelona, June 2000, pp. 177–178.
[14] Makoto Yokoo, Koutarou Suzuki, and Katsutoshi Hirayama, "Secure distributed constraint satisfaction: Reaching agreement without revealing private information", in Proceedings of the Distributed Constraint Reasoning workshop at AAMAS 2002, Bologna, 2002.
[15] Daniel Frost and Rina Dechter, "Optimizing with constraints: A case study in scheduling maintenance of electric power units", in Fifth International Symposium on Artificial Intelligence and Mathematics, 1998.
[16] J. Yellen, T. M. Al-Khamis, S. Vemuri, and L. Lemonidis, "A decomposition approach to unit maintenance scheduling", IEEE Transactions on Power Systems, vol. 7, no. 2, pp. 726–733, May 1992.
[17] Eugene C. Freuder, "A sufficient condition for backtrack-bounded search", Journal of the ACM, vol. 32, no. 4, pp. 755–761, 1985.
[18] Eugene C. Freuder and Michael J. Quinn, "Taking advantage of stable sets of variables in constraint satisfaction problems", in Proceedings of the 9th International Joint Conference on Artificial Intelligence, IJCAI-85, Los Angeles, CA, 1985, pp. 1076–1078.
[19] Rina Dechter, Constraint Processing, Morgan Kaufmann, 2003.
[20] Thomas Schiex, "A note on CSP graph parameters", Tech. Rep. 03, INRA, July 1999.
[21] Adrian Petcu and Boi Faltings, "Approximations in distributed optimization", in CP 2005 workshop on Distributed and Speculative Constraint Processing (DSCP), October 2005.
[22] Adrian Petcu and Boi Faltings, "A propagation/local search hybrid for distributed optimization", in CP 2005 - LSCS'05: Second International Workshop on Local Search Techniques in Constraint Satisfaction, Sitges, Spain, October 2005.
[23] Edsger W. Dijkstra, "Self-stabilizing systems in spite of distributed control", Communications of the ACM, vol. 17, no. 11, pp. 643–644, 1974.
[24] Adrian Petcu and Boi Faltings, "Superstabilizing, fault-containing multiagent combinatorial optimization", in Proceedings of the National Conference on Artificial Intelligence, AAAI-05, Pittsburgh, USA, July 2005.
[25] Adrian Petcu and Boi Faltings, "Optimal solution stability in continuous time optimization", in IJCAI-05 Distributed Constraint Reasoning workshop, DCR-05, August 2005.
[26] G. Verfaillie and T. Schiex, "Solution reuse in dynamic constraint satisfaction problems", in Proceedings of the National Conference on Artificial Intelligence, AAAI-94, Seattle, WA, 1994, pp. 307–312.
[27] E. Ephrati and J. S. Rosenschein, "The Clarke tax as a consensus mechanism among automated agents", in Proceedings of the National Conference on Artificial Intelligence, AAAI-91, Anaheim, CA, July 1991, pp. 173–178.
[28] Adrian Petcu and Boi Faltings, "Incentive compatible multiagent constraint optimization", in WINE'05: Workshop on Internet and Network Economics, Hong Kong, Dec 2005.
[29] Adrian Petcu and Boi Faltings, "MDPOP: Faithful distributed implementation of efficient social choice problems", Tech. Rep., EPFL/IC/LIA, Lausanne, Switzerland, Oct. 2005.
[30] Boi Faltings, "A budget-balanced, incentive-compatible scheme for social choice", in Workshop on Agent-mediated E-commerce (AMEC) VI, Springer Lecture Notes in Computer Science, 2004.
[31] Adrian Petcu, Boi Faltings, and David Parkes, "MDPOP: Faithful distributed implementation of efficient social choice problems", in Proceedings of the International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS-06), Hakodate, Japan, May 2006.
[32] Christian Rehtanz, Ed., Autonomous Systems and Intelligent Agents in Power System Control and Operation, Springer Verlag, December 2003, ISBN 3540402020.