Finding maximum-cost minimum spanning trees

Ahmed Belal Dept. of Computer Engineering and Systems Alexandria University, Egypt [email protected]

Abstract

Consider the scenario in which a start-up communication company charges its network users by the cost of the minimum spanning tree (MST) they use in their protocols. Wanting to increase its profits, the company aims at maximizing the cost of the MST of its network. We consider two different cases. In the first case, the company has a set of links with fixed cost vector $W$ and wants to configure the network so that the MST of the network has the maximum possible cost. In the second case, the network topology is fixed, but the costs on the links assume $d$ different values $W_1, W_2, \ldots, W_d$ over the day. The company wants to fix the link costs to a value $W = \sum_i p_i W_i$, for some weights $p_1, p_2, \ldots, p_d$, where $0 \le p_i \le 1$ and $\sum_i p_i = 1$, so that the resulting network has a maximum-cost MST.

Amr Elmasry Dept. of Computer Engineering and Systems Alexandria University, Egypt [email protected]

1 Introduction

Given an $n$-node $m$-edge connected undirected graph $G(n, m)$, each of whose edges has a real-valued cost $w_e$, its minimum spanning tree (MST) is a spanning tree of $G$ whose total edge cost is minimum. MSTs are extensively used in communication networks. Many routers direct their packets along the edges of an MST. Bridges in extended LANs are configured using an MST protocol. Computing the MST of a graph is a cornerstone of network optimization. The first algorithm to find the MST was due to Borůvka [2]. The two classical algorithms of Kruskal [17] and Prim [20] (the latter first discovered by Jarník [12]) are the most popular. Despite the long history of the problem, its algorithmic complexity is not resolved yet. The best known bound for a deterministic MST algorithm, due to Chazelle [3], is $O(m\,\alpha(n, m))$, where $\alpha$ is the inverse Ackermann function. The MST of a planar graph can be found in linear time [4]. Given a spanning tree of a graph, verifying whether this tree is minimum can be done in linear time: Komlós showed how to verify spanning trees with a linear number of comparisons [15]; this was followed by linear-time implementable algorithms by Dixon et al. [6] and King [14]. Karger et al. discovered a randomized algorithm with linear expected complexity [13]; their algorithm uses linear-time verification as a subroutine. If the edge costs are integers and the model allows bucketing and bit manipulation, the problem can be solved deterministically in linear time, as was shown by Fredman and Willard [8]. If the edges of the graph are presented one at a time in arbitrary order, we can build an MST on-line in $O(m \log n)$ time using the link-cut trees in [22].

Two related problems can be solved in $O(m\,\alpha(n, m))$ time. Given an MST, we can determine by how much the cost of each edge can be increased or decreased while the given tree remains an MST [23]. Given an MST, we can find for each edge $e$ a minimum-cost substitute edge $e'$ such that, if $e$ is deleted from the graph, replacing it in the old MST by $e'$ produces an MST of the new graph [24].

2 Maximizing the MST

Sometimes, we are able to control the network parameters and hence affect the cost of the resulting MST. There are several possibilities, such as controlling the topology, controlling the link costs, removing some links, or removing some vertices. Increasing the cost of the MST would be the goal of some network designers in some situations. Alternatively, there are applications for which it is important to determine the maximum degradation in the performance of the broadcasting network protocol that can be expected as a result of traffic fluctuations and link failures [21]. Also, there are several combinatorial optimization problems whose solutions require finding an MST of a graph, and for which a change in the cost of the MST therefore means a change in the value of the solution [10].

Some of the cases mentioned above are hard, while others are not. For example, finding the $k$ most vital edges of $G$, whose removal causes the greatest cost increase in the MST of the remaining graph, was shown to be NP-hard [18]. When $k = 1$, finding the most vital edge can be done in $O(m\,\alpha(n, m))$ time [11]. On the other hand, the continuous version of this discrete problem is comparatively easier. Namely, the problem is to find the maximum possible increase in the cost of the MST that can be produced by edge cost increases of total cost $B$, assuming that the increase in the cost of an edge has an associated value proportional to the magnitude of the change. Computing all the possible MSTs for all cost increases can be done in $O(n^3 m^2 \log(n^2/m))$ time [9].

The dual problem, minimizing the MST, was also considered. Several problems arising in areas such as communication networks and VLSI design can be expressed in the following general form: enhance the performance of a given network by upgrading a suitable subset of nodes. In communication networks, upgrading a node corresponds to installing faster communication equipment at that node. Such an upgrade reduces the communication delay along each edge emanating from the node. In signal flow networks used in VLSI design, upgrading a node corresponds to replacing a circuit module at the node by a functionally equivalent module containing suitable drivers. Such an upgrade reduces the signal transmission delay along the wires connected to the module. Usually, there is a cost associated with upgrading a node, and this motivates the study of problems of the following type: find an upgrading set of bounded total cost so that the resulting network satisfies certain performance requirements, such as having the minimum possible MST. This general problem was shown to be NP-hard, and approximation algorithms were given [16].

In this paper, we consider the problem of optimizing the MST by controlling different parameters of the underlying network.

2.1 Controlling the network topology

Given the cost vector $W$ of the $m$ links that constitute the network, together with the number of nodes $n$, the question is how to reconnect the links of $G$ such that the cost of its MST is maximized. We start by sorting $W$ by increasing cost in $O(m \log n)$ time. Let $j \le n$ be the largest integer satisfying $\binom{j}{2} \le m - n + j$. This inequality is equivalent to $j^2 - 3j - 2(m - n) \le 0$, so

$$\binom{j}{2} \le m - n + j \implies j = \left\lfloor \frac{3 + \sqrt{9 + 8(m - n)}}{2} \right\rfloor .$$

For all $i = 2, 3, \ldots, j$, we connect the $i$-th vertex with all the preceding $i - 1$ vertices using the $i - 1$ edges ranked from $(i - 1)(i - 2)/2 + 1$ up to $i(i - 1)/2$ within the sorted sequence of $W$, in arbitrary order. We call this subgraph $G_1$. Note that $G_1$ is a $K_j$ (the complete graph with $j$ vertices and $j(j - 1)/2$ edges). If $j < n$, we connect the remaining $n - j$ vertices with the $n - j - 1$ edges having the largest costs. We call this subgraph $G_2$. Note that $G_2$ is a tree. Finally, we connect an arbitrary vertex of $G_2$ with some arbitrary vertices of $G_1$ using the remaining edges.

Lemma 1 The graph $G$ is a realizable connected graph.

Proof. By construction, both $G_1$ and $G_2$ are realizable and connected. Consider the number of edges connecting $G_1$ to $G_2$, and call it $r$. The choice of $j$ ensures that the number of edges of $G_1$ plus those of $G_2$ is more than $m - j$ and less than $m$, since $m - j < j(j - 1)/2 + n - j - 1 < m$. Hence, $r \ge 1$ and the graph $G$ is connected. Also, $r < j$ permits connecting one vertex of $G_2$ with a subset of the vertices of $G_1$, making $G$ realizable. □

Lemma 2 The MST of $G$ consists of the following edges, identified by their ranks in the sorted sequence of $W$: the edges whose ranks are $i(i - 1)/2 + 1$, for all $i = 1, 2, \ldots, j - 1$; and, if $j < n$, the edge whose rank is $j(j - 1)/2 + 1$ and the $n - j - 1$ edges with the largest costs.

Proof. Assume that the vertices of $G_1$ are numbered from 1 to $j$ according to the order in which $G_1$ is constructed by the above algorithm. Consider the cuts induced by the first $i$ vertices and the rest of $G$, for all $i = 1, 2, \ldots, j - 1$. By the cut property of MSTs, the edges whose ranks are $i(i - 1)/2 + 1$ must be in the MST. Similarly, if $j < n$, the edge whose rank is $j(j - 1)/2 + 1$ must be in the MST. The $n - j - 1$ edges of $G_2$ must be in the MST as well. □

Lemma 3 The MST of $G(n, m)$ is the maximum possible among all graphs $G'(n, m)$ with the same edge costs.

Proof. Consider the following algorithm to construct an MST, which is a modification of that of Borůvka [2]. We start with the $n$ vertices forming $n$ connected components. In each iterative step, the minimum-cost edge that connects two components is added to the MST edges, and the two components are combined into one component. Consider the moment at which the $i$-th edge is to be added to the MST by this algorithm. At this moment, the number of candidate edges connecting two components is at least $m - i(i - 1)/2$. As a result, the $i$-th edge added to the MST has a cost at most that of the edge with rank $i(i - 1)/2 + 1$ in the sorted sequence of $W$, for $i = 1, 2, \ldots, j - 1$. If $j < n$, the same argument holds for the $j$-th edge added to the MST, which has a cost at most that of the edge with rank $j(j - 1)/2 + 1$ in the sorted sequence of $W$. The remaining $n - j - 1$ edges of the MST of $G$ are those with the largest possible costs. It follows that the MST of $G$ is the maximum possible among all such graphs. □
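The construction of Section 2.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `build_max_mst_graph`, `predicted_max_mst_cost` and `kruskal_cost` are hypothetical, vertices are numbered from 0, $G_2$ is realized as a path (any tree would do), and the attachment vertex of $G_2$ is taken to be its first vertex. The sketch builds $G$ from a cost list, then compares the cost of its MST (computed with Kruskal's algorithm) against the ranks listed in Lemma 2.

```python
import math


def kruskal_cost(n, edges):
    """Total cost of an MST of a connected graph given as (u, v, cost) triples."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total = 0
    for u, v, c in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += c
    return total


def build_max_mst_graph(costs, n):
    """Return (j, edges) for the graph G of Section 2.1 (0-based vertices)."""
    w = sorted(costs)
    m = len(w)
    if n < 2 or not (n - 1 <= m <= n * (n - 1) // 2):
        raise ValueError("need n >= 2 and n - 1 <= m <= n(n - 1)/2")
    # Largest j <= n with j(j - 1)/2 <= m - n + j, via the closed form.
    j = min(n, (3 + math.isqrt(9 + 8 * (m - n))) // 2)
    edges, k = [], 0  # k indexes the next unused (cheapest) cost
    # G1: vertex i is joined to all earlier vertices with the next i costs.
    for i in range(1, j):
        for u in range(i):
            edges.append((u, i, w[k]))
            k += 1
    if j < n:
        tail = n - j - 1  # number of edges in G2
        # G2: a path on vertices j..n-1 using the `tail` largest costs.
        for t in range(tail):
            edges.append((j + t, j + t + 1, w[m - tail + t]))
        # Remaining r edges join vertex j (in G2) to r distinct vertices of G1.
        for u in range(m - tail - k):
            edges.append((u, j, w[k + u]))
    return j, edges


def predicted_max_mst_cost(costs, n):
    """MST cost of G according to the ranks listed in Lemma 2."""
    w = sorted(costs)
    j, _ = build_max_mst_graph(costs, n)
    total = sum(w[i * (i - 1) // 2] for i in range(1, j))
    if j < n:
        total += w[j * (j - 1) // 2] + sum(w[len(w) - (n - j - 1):])
    return total
```

For example, with costs `[5, 3, 1, 4, 2]` and `n = 4` the sketch picks `j = 3`, builds a triangle from the three cheapest links and attaches the fourth vertex with the two remaining links; both `kruskal_cost` and `predicted_max_mst_cost` then give 1 + 2 + 4 = 7.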

2.2 Controlling the weights

Consider a graph $G$ with a fixed topology, where the costs on the links assume $d$ different vectors $W_1, W_2, \ldots, W_d$ over the day. Let $T$ be the MST corresponding to $G$ with the weighted-average cost vector $W = \sum_i p_i W_i$, where $0 \le p_i \le 1$ and $\sum_i p_i = 1$. The question is how to select the weights $p_1, p_2, \ldots, p_d$ such that $\mathrm{cost}(T)$ is maximized. Let $T^*$ be this MST with the maximum possible cost. Let $T_i$ be the MST corresponding to $G$ with the cost vector $W_i$. Let $T_{i_{\max}}$ be a tree with the maximum cost, and $T_{i_{\min}}$ be a tree with the minimum cost, among $T_1, T_2, \ldots, T_d$. The following properties relate these costs:

1. $\mathrm{cost}(T^*) \ge \mathrm{cost}(T_{i_{\max}})$. Proof. By setting $p_{i_{\max}} = 1$, we get $\mathrm{cost}(T) = \mathrm{cost}(T_{i_{\max}})$. □



2. If $T_1, T_2, \ldots, T_d$ have the same topology, $\mathrm{cost}(T^*) = \mathrm{cost}(T_{i_{\max}})$. Proof. It follows that $T$ has the same topology as well. Hence, $\mathrm{cost}(T) = \sum_i p_i \cdot \mathrm{cost}(T_i)$, which is maximized when $p_{i_{\max}} = 1$. □

3. For any weights $p_1, p_2, \ldots, p_d$, $\mathrm{cost}(T) \ge \mathrm{cost}(T_{i_{\min}})$. Proof. Let $T|W$ be the tree $T$ defined on $G$ with the cost vector $W$. As in the proof of property 2, the cost of $T$ is linear in the weights, and $T_i$ is an MST under $W_i$, so $\mathrm{cost}(T) = \sum_i p_i \cdot \mathrm{cost}(T|W_i) \ge \sum_i p_i \cdot \mathrm{cost}(T_i) \ge \mathrm{cost}(T_{i_{\min}})$. □

4. There exist graphs where some edges are in $T_1, T_2, \ldots, T_d$ but not in $T$. Proof. We prove the property by giving an example of such a graph. Consider $G(3, 3)$, where $W_1 = (3, 1, 4)$, $W_2 = (3, 4, 1)$, $p_1 = p_2 = 1/2$. Writing each tree as an edge-incidence vector, it follows that $T_1 = (1, 1, 0)$, $T_2 = (1, 0, 1)$, while $T = (0, 1, 1)$. □

5. There exist graphs where $\mathrm{cost}(T^*) > \mathrm{cost}(T_{i_{\max}})$. Proof. For the same graph of property 4, $\mathrm{cost}(T_1) = \mathrm{cost}(T_2) = 4$, while $\mathrm{cost}(T) = 5$. Note that for this graph $\mathrm{cost}(T^*) = 5$ when $p_1 = p_2 = 1/2$, or $p_1 = 1/3, p_2 = 2/3$, or $p_1 = 2/3, p_2 = 1/3$. □

The case $d = 2$ is interesting. The problem in this case is known in the literature as the parametric minimum spanning tree problem, where each edge of the graph has a linear cost function $w_e(\lambda) = a_e + \lambda b_e$ and $Z(\lambda)$ denotes the cost of the MST in terms of $\lambda$. The problem is then to construct $Z(\lambda)$ or, alternatively, to apply the closely related parametric search problem to find $\lambda^*$ that maximizes $Z(\lambda^*)$. It can be shown that $Z(\lambda)$ is a piecewise-linear concave function of $\lambda$; the points at which the slope of $Z$ changes are called breakpoints. Let $b(n, m)$ be the number of such breakpoints. It is known that $b(n, m) = O(m n^{1/3})$ [5] and $b(n, m) = \Omega(n\,\alpha(n))$ [7]. Fernández-Baca et al. gave an $O(mn \log n)$ algorithm to construct $Z(\lambda)$ and an √ 2 O( mn log2 nm ) algorithm to find $Z(\lambda^*)$. They use sparsification and Megiddo's [19] parametric search techniques to achieve these bounds. An $O(b(n, m)\, n^{2/3} \log^{4/3} n)$ algorithm to construct $Z(\lambda)$ was later introduced by Agarwal et al. [1].

We give a simpler algorithm that generalizes to any dimension $d$. We first illustrate our algorithm for the case $d = 2$. Assume without loss of generality that no two pairs of edge costs become equal simultaneously (this can be achieved by perturbing the cost functions infinitesimally). The key observation is that at every breakpoint of $Z(\lambda)$, two edges must have equal costs. We maintain the current spanning tree edges throughout the algorithm and update them at the breakpoints as $\lambda$ increases. At any moment, we maintain a sorted sequence of the edges of the graph with respect to their costs at the current value of $\lambda$. The next breakpoint must occur when two consecutive edges in this sequence have equal costs. We consider all candidate breakpoints and select the one with the smallest $\lambda$; this $\lambda$ is identified in $O(\log n)$ time by maintaining a heap of the candidate breakpoints. The potential breakpoint corresponds to an actual breakpoint only when the edge that is heavier before the breakpoint is not already in the MST and induces a cycle containing the other edge in the current MST; this can be tested in $O(\log n)$ time per potential breakpoint using the link-cut trees in [22]. Whether or not this breakpoint produces a change in the MST, we update the sorted sequence of the edges by swapping the two edges corresponding to the breakpoint.
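The key observation above (every breakpoint of $Z(\lambda)$ occurs where two edge costs coincide) can be checked with a brute-force sketch. This is deliberately not the heap and link-cut-tree sweep described in the text: it simply evaluates an MST with Kruskal's algorithm at $\lambda = 0$, $\lambda = 1$ and at every pairwise crossing inside $[0, 1]$, which suffices because a concave piecewise-linear function attains its maximum at an endpoint or a breakpoint. The names `mst_cost` and `best_lambda` are hypothetical, and the parameterization $w_e(\lambda) = (1 - \lambda) W_1[e] + \lambda W_2[e]$ (so $\lambda = p_2$) is an assumption chosen to match the weights $p_1, p_2$ of this section.

```python
from fractions import Fraction
from itertools import combinations


def mst_cost(n, edges, costs):
    """Kruskal's algorithm: MST cost for edges[i] = (u, v) with cost costs[i]."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total = 0
    for idx in sorted(range(len(edges)), key=lambda i: costs[i]):
        u, v = edges[idx]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += costs[idx]
    return total


def best_lambda(n, edges, w1, w2):
    """Maximise Z(lam) on [0, 1] for costs (1 - lam) * w1[e] + lam * w2[e]."""
    candidates = {Fraction(0), Fraction(1)}
    for e, f in combinations(range(len(edges)), 2):
        # Edges e and f cost the same when (1 - lam) * a + lam * b = 0.
        a = Fraction(w1[e] - w1[f])
        b = Fraction(w2[e] - w2[f])
        if a != b:
            lam = a / (a - b)
            if 0 <= lam <= 1:
                candidates.add(lam)

    def z(lam):
        costs = [(1 - lam) * x + lam * y for x, y in zip(w1, w2)]
        return mst_cost(n, edges, costs)

    best = max(sorted(candidates), key=z)  # smallest maximiser
    return best, z(best)
```

On the triangle of properties 4 and 5, `best_lambda(3, [(0, 1), (1, 2), (0, 2)], (3, 1, 4), (3, 4, 1))` examines $\lambda \in \{0, 1/3, 1/2, 2/3, 1\}$ and reports a maximum MST cost of 5, first reached at $\lambda = 1/3$. The sketch performs $O(m^2)$ MST computations, so it is a checking aid rather than a substitute for the sweep.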
The algorithm proceeds, producing $Z(\lambda)$ until its value starts decreasing, which identifies $\lambda^*$ and the maximum MST we are looking for. Since the number of candidate breakpoints is $O(m^2)$, the above algorithm has $O(m^2 \log n)$ time complexity and $O(m)$ storage complexity.

For general values of $d$ (assuming $d$ is a constant in the $O()$ notation), $Z(p_1, p_2, \ldots, p_d)$ is a concave chain that results from the intersections of $d$-dimensional hyperplanes. The vertices of this chain are the breakpoints, where $d$ of the edges must have equal costs. One way to find $Z(p_1^*, p_2^*, \ldots, p_d^*)$ is to calculate all possible $O(m^d)$ weight vectors where $d$ of the edges have equal costs, and find the MST for each of these weight vectors. The solution with the maximum-cost MST among these trees is what we are looking for. This algorithm has $O(m^d \cdot t_{MST}(n, m))$ time complexity, where $t_{MST}$ is the time required to find the MST.

To generalize the algorithm of the 2-dimensional case, we walk on the vertices of the concave chain, each time with the smallest feasible step, as long as the value of the corresponding MST is increasing. The details follow. We start with an arbitrary basis and keep track of the current MST corresponding to the current weight vector. The next potential breakpoint results when one of the $d$ edges with equal costs leaves the basis and is replaced by another edge. There are at most $O(m)$ such neighbors to be checked. We select the candidate breakpoint for which the difference between the current weights and the new weights has the smallest norm. The potential breakpoint is checked to see whether it produces an actual change in the edges of the MST; otherwise another candidate breakpoint is selected. This step requires $O(\log n)$ time per potential breakpoint using the link-cut trees in [22]. Note that once a candidate breakpoint is rejected it is never checked again. The algorithm proceeds until the value of $Z$ cannot be increased any more, which identifies the maximum MST we are looking for. Since the number of candidate breakpoints is $O(m^d)$, the above algorithm has $O(m^{d+1})$ time complexity and $O(m)$ storage complexity. Still, in analogy with the Simplex method, we expect the practical average running time to be far below this worst-case bound.
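For general $d$, the quantity that the walk tries to increase, $Z(p_1, \ldots, p_d)$, can be evaluated at any weight vector with a single MST computation. The sketch below only does that evaluation; it does not implement the walk over breakpoints or the link-cut trees. The names `mst_cost`, `z_value` and `corner_costs` are hypothetical, and exact `Fraction` weights are an assumption made so that the check $\sum_i p_i = 1$ is not disturbed by rounding. Evaluating $Z$ at the corners of the simplex gives $\mathrm{cost}(T_1), \ldots, \mathrm{cost}(T_d)$, which makes properties 1 and 3 of this section easy to check on small instances.

```python
from fractions import Fraction


def mst_cost(n, edges, costs):
    """Kruskal's algorithm: MST cost for edges[i] = (u, v) with cost costs[i]."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total = 0
    for idx in sorted(range(len(edges)), key=lambda i: costs[i]):
        u, v = edges[idx]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += costs[idx]
    return total


def z_value(n, edges, vectors, p):
    """Z(p): MST cost under W = sum_i p[i] * vectors[i], p on the simplex."""
    if len(p) != len(vectors) or sum(p) != 1 or any(x < 0 for x in p):
        raise ValueError("p must be non-negative, sum to 1, and match vectors")
    costs = [sum(pi * w[e] for pi, w in zip(p, vectors))
             for e in range(len(edges))]
    return mst_cost(n, edges, costs)


def corner_costs(n, edges, vectors):
    """cost(T_1), ..., cost(T_d): Z evaluated at the corners of the simplex."""
    d = len(vectors)
    return [z_value(n, edges, vectors,
                    [Fraction(int(i == k)) for i in range(d)])
            for k in range(d)]
```

As an illustration with $d = 3$ (the third cost vector is made up for this example and does not appear in the paper), take the triangle `[(0, 1), (1, 2), (0, 2)]` with cost vectors `(3, 1, 4)`, `(3, 4, 1)` and `(1, 3, 3)`. Each corner gives an MST of cost 4, while the uniform weights `[Fraction(1, 3)] * 3` give averaged costs $(7/3, 8/3, 8/3)$ and an MST of cost 5, so the maximum over the simplex strictly exceeds every $\mathrm{cost}(T_i)$, as in property 5.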

References

[1] P. Agarwal, D. Eppstein, L. Guibas and M. Henzinger. Parametric and kinetic minimum spanning trees. 39th IEEE Found. of Computer Science (1998), 596-605.
[2] O. Borůvka. O jistém problému minimálním. Práce Moravské Přírodovědecké Společnosti 3 (1926), 37-58.
[3] B. Chazelle. A minimum spanning tree algorithm with inverse Ackermann type complexity. J. ACM 47(6) (2000), 1028-1047.
[4] D. Cheriton and R. Tarjan. Finding minimum spanning trees. SIAM J. Comput. 5 (1976), 309-312.
[5] T. Dey. Improved bounds on planar k-sets and k-levels. Discrete and Computational Geometry 19 (1998), 373-382.
[6] B. Dixon, M. Rauch and R. Tarjan. Verification and sensitivity analysis of minimum spanning trees in linear time. SIAM J. Comput. 21(6) (1992), 1184-1192.
[7] D. Eppstein. Geometric lower bounds for parametric matroid optimization. 27th ACM Symp. on Theory of Computing (1995).
[8] M. Fredman and D. Willard. Trans-dichotomous algorithms for minimum spanning trees and shortest paths. J. Comput. Syst. Sci. 48 (1993), 424-436.
[9] G. Frederickson and R. Solis-Oba. Increasing the weight of minimum spanning trees. 7th ACM-SIAM Symp. on Discrete Algorithms, SODA (1996), 539-546.
[10] R. Graham and P. Hell. On the history of the minimum spanning tree problem. Annals of the History of Comput. 7 (1985), 43-57.
[11] K. Iwano and N. Katoh. Efficient algorithms for finding the most vital edge of a minimum spanning tree. Inf. Process. Lett. 48 (1993), 211-213.
[12] V. Jarník. O jistém problému minimálním. Práce Moravské Přírodovědecké Společnosti 6 (1930), 57-63. (In Czech.)
[13] D. Karger, P. Klein and R. Tarjan. A randomized linear-time algorithm to find minimum spanning trees. J. ACM 42 (1995), 321-328.
[14] V. King. A simpler minimum spanning tree verification algorithm. Algorithmica 18 (1997), 263-270.
[15] J. Komlós. Linear verification for spanning trees. Combinatorica 5(1) (1985), 57-65.
[16] S. Krumke, M. Marathe, H. Noltemeier, R. Ravi, S. Ravi, R. Sundaram and H. Wirth. Improving minimum cost spanning trees by upgrading nodes. J. Alg. 33(1) (1999), 92-111.
[17] J. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. Proc. Amer. Math. Soc. 7 (1956), 48-50.
[18] K. Lin and M. Chern. The most vital edges in the minimum spanning tree problem. Inf. Process. Lett. 45 (1993), 25-31.
[19] N. Megiddo. Applying parallel computation algorithms in the design of serial algorithms. J. ACM 30(4) (1983), 852-865.
[20] R. Prim. Shortest connection networks and some generalizations. Bell System Tech. J. 36 (1957), 1389-1401.
[21] M. Schwartz and T. Stern. Routing techniques used in computer communication networks. IEEE Trans. on Commun. 28 (1980), 539-552.
[22] D. Sleator and R. Tarjan. A data structure for dynamic trees. J. Comput. Syst. Sci. 26 (1983), 362-391.
[23] R. Tarjan. Sensitivity analysis of minimum spanning trees and shortest path trees. Inf. Process. Lett. 14 (1982), 30-33.
[24] R. Tarjan. Applications of path compression on balanced trees. J. ACM 26 (1979), 690-715.