RETROSPECTIVE: Coloring Heuristics for Register Allocation

Preston Briggs
Cray Research, Inc., 411 First Avenue South, Suite 600, Seattle, WA 98104

Keith D. Cooper
Department of Computer Science, Rice University, Houston, TX 77251-1982

Ken Kennedy
Department of Computer Science, Rice University, Houston, TX 77251-1982

Linda Torczon
Department of Computer Science, Rice University, Houston, TX 77251-1982

20 Years of the ACM/SIGPLAN Conference on Programming Language Design and Implementation (1979-1999): A Selection, 2003. Copyright 2003 ACM 1-58113-623-4 ...$5.00.

ACM SIGPLAN

Best of PLDI 1979-1999

1. INTRODUCTION

From the earliest compilers, register allocation was recognized as an important optimization. Indeed, the original Fortran compiler spent two of its six passes on the problem [1]. (That compiler used an approach similar to the linear-scan algorithms being proposed today for just-in-time compilers.) In the early 1960s, the Soviet mathematician Lavrov made the intellectual connection between allocation problems and graph coloring [17]. Unfortunately, Lavrov gave no algorithm; instead, he suggested that it was possible to enumerate all the colorings of the graph and take the best one. Chaitin et al. took Lavrov’s fundamental ideas, developed them, and built the first graph-coloring register allocator [9, 8]. In essence, Chaitin’s allocator builds an interference graph where each node represents a value, computes an ordering on the nodes of that graph, and assigns colors to the nodes in that order.

Our paper introduced an improved coloring strategy that produced better allocations for many graphs on which Chaitin’s method fails. The key difference between our algorithm and Chaitin’s algorithm lies in the timing of spill decisions. While computing the order for coloring, the allocator can reach a point where it cannot find a node in the interference graph that it can provably color. At this point, Chaitin’s allocator picks a node heuristically, spills the value associated with it, and excludes the node from the coloring order. Our method also chooses a spill candidate at this point. Instead of spilling it, the allocator inserts it into the coloring order. When colors are assigned, the spill candidate either receives a color or it does not, in which case the allocator spills it. Experimental evidence in the paper confirms the effectiveness of this deferred spilling approach. This technique has been adopted in many commercial and research compilers.

2. BACKGROUND

We implemented a graph-coloring register allocator in our compiler for the IBM RT-PC, an early RISC workstation with 16 general-purpose registers and 8 floating-point registers. We were, in general, pleased with the results. In detailed examination of the code for some inner loops, however, we noticed that the allocator overspilled, leaving some registers unused while spilling critical values. This shortcoming led us to re-examine the register allocator and its algorithms. In particular, we began to investigate live range splitting [11, 16, 13].

During this time, the authors attended a colloquium talk that Jorge Moré gave at Rice. Moré’s talk included Matula’s smallest-last coloring algorithm [18]. Our study of Chaitin’s algorithm had produced a simple four-node counterexample, the diamond graph.

      a
     / \
    b   d
     \ /
      c

We quickly realized that Matula’s algorithm would two-color this graph. This shifted our attention away from live-range splitting and back onto the fundamentals of Chaitin’s method.

At a high level, Chaitin’s algorithm builds an interference graph, uses the graph to order the nodes for color assignment, and then assigns colors in the specified order. The critical step, for our purposes, is when the algorithm picks the next node to add to the ordering. The algorithm selects any node n with fewer than k neighbors, where k is the number of registers available to the allocator. If n has fewer than k neighbors, any assignment of colors to n’s neighbors leaves at least one color for n. Thus, n must receive a color. If no such node remains in the graph, the algorithm picks a node, using some heuristic, and spills the corresponding value.

This approach fails to find a two-coloring for the diamond graph because every node has two neighbors, so with k = 2 no node has fewer than k neighbors. The first time it tries to pick a node, it must spill one, say a. This lowers the degree of its two neighbors, b and d, to one. It can now construct a coloring order for b, c, and d. (Any order beginning with b or d works.) Clearly, this is pessimistic because a can always use the same color as c.

Smallest-last coloring constructs an ordering; it can pick any node first, since they all have the same degree. Since any order will produce a two-coloring, it succeeds where Chaitin’s algorithm did not.

Smallest-last coloring is not, in itself, the answer. For example, it provides no help in spilling when the graph cannot, in fact, be k-colored. The paper presents the insights and algorithms that we eventually derived. The resulting algorithm fits a stronger coloring heuristic into the basic structure of the original algorithm. We refer to this heuristic as optimistic coloring.

3. INFLUENCE

This paper appeared at PLDI 89, in a session with two other papers on improvements to graph-coloring allocators [3, 15]. The papers were notable because they all improved on Chaitin’s algorithm and each showed apples-to-apples comparisons, that is, the same allocator running in the same compiler with a single algorithmic change. (Most prior work on register allocation came in the form of experience papers, rather than experimental comparisons.) That session marked a resurgence of research interest in Chaitin-style register allocators.

Nickerson noted that optimistic coloring improved the behavior of the allocator for values that required multiple registers [19]. Koblenz and Callahan built a hierarchical register allocator that relied, at its heart, on optimistic coloring [7]. Norris and Pollock took another approach to hierarchical allocation with an allocator that operated on program-dependence graphs [20]. We introduced a number of improvements, including rematerialization, conservative coalescing, and biased coloring [5]. Appel and George invented iterated coalescing [14]. Park and Moon struck a middle ground between Chaitin’s “aggressive” coalescing and the less aggressive conservative and iterated schemes [21]. Bergner et al. invented interference-region spilling [2].

Of equal importance, a second thread in the literature has dealt with issues that arise in the implementation of graph-coloring register allocators. Gupta et al. showed that the compiler can use clique separators to reduce the memory requirements for allocation [15]. Choi et al. introduced sparse evaluation graphs, reducing the time required to compute LIVE information [10]. Briggs et al. explained how to encode multiple-register requirements into the interference graph [4] and how to implement some of the data structures [6]. Cooper et al. described fast techniques for constructing the interference graph [12]. These papers have made it easier to implement a fast and effective allocator.

The algorithm, along with various improvements, has been implemented in many compilers, both research systems and commercial products. Indeed, Hopkins reported (in conversation) that IBM’s own implementation of our algorithm in the Tobey compiler yielded almost exactly the improvements reported in our paper.
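The contrast drawn earlier between Chaitin’s spill rule and optimistic coloring can be illustrated on the diamond graph. The sketch below is a minimal illustration and not the implementation from the paper. The function names (simplify, select, allocate), the dictionary-of-sets graph representation, and the spill heuristic (highest remaining degree, ties broken by name) are all assumptions made for this example; a real allocator would weigh spill costs.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[str, Set[str]]  # node -> set of interfering nodes (symmetric)


def simplify(graph: Graph, k: int, optimistic: bool) -> Tuple[List[str], List[str]]:
    """Build the coloring order; return (stack, values spilled immediately)."""
    work = {n: set(adj) for n, adj in graph.items()}  # working copy
    stack: List[str] = []
    spilled: List[str] = []
    while work:
        # A node with fewer than k neighbors is guaranteed to get a color.
        trivial = [n for n in sorted(work) if len(work[n]) < k]
        if trivial:
            node = trivial[0]
            stack.append(node)
        else:
            # Assumed spill heuristic: highest remaining degree, ties by name.
            node = max(sorted(work), key=lambda n: len(work[n]))
            if optimistic:
                stack.append(node)    # defer the decision to select()
            else:
                spilled.append(node)  # Chaitin-style: spill right away
        # Remove the node, lowering the degree of its neighbors.
        for neighbor in work.pop(node):
            work[neighbor].discard(node)
    return stack, spilled


def select(graph: Graph, k: int, stack: List[str],
           spilled: List[str]) -> Tuple[Dict[str, int], List[str]]:
    """Assign colors in reverse removal order; spill nodes left without one."""
    colors: Dict[str, int] = {}
    spilled = list(spilled)
    for node in reversed(stack):
        used = {colors[m] for m in graph[node] if m in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[node] = free[0]
        else:
            spilled.append(node)  # the deferred spill actually happens here
    return colors, spilled


def allocate(graph: Graph, k: int, optimistic: bool) -> Tuple[Dict[str, int], List[str]]:
    stack, spilled = simplify(graph, k, optimistic)
    return select(graph, k, stack, spilled)


# The diamond graph: a four-cycle a-b-c-d-a.
DIAMOND: Graph = {"a": {"b", "d"}, "b": {"a", "c"},
                  "c": {"b", "d"}, "d": {"a", "c"}}

if __name__ == "__main__":
    print("Chaitin-style:", allocate(DIAMOND, 2, optimistic=False))
    print("Optimistic:   ", allocate(DIAMOND, 2, optimistic=True))
```

With k = 2, the Chaitin-style run spills a, while the optimistic run spills nothing and gives a the same color as c. On a graph that cannot be two-colored, such as a triangle, the optimistic run still spills a node, but only after select fails to find it a color.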

4. ACKNOWLEDGMENTS

IBM Corporation, through Dr. Horace Flatt of the Palo Alto Scientific Center, provided the support that let us explore these ideas.

REFERENCES

[1] J. W. Backus, R. J. Beeber, S. Best, R. Goldberg, L. M. Haibt, H. L. Herrick, R. A. Nelson, D. Sayre, P. B. Sheridan, H. Stern, I. Ziller, R. A. Hughes, and R. Nutt. The FORTRAN automatic coding system. In Proceedings of the Western Joint Computer Conference, pages 188–198. Institute of Radio Engineers, NY, NY, USA, Feb. 1957.

[2] P. Bergner, P. Dahl, D. Engebretsen, and M. O’Keefe. Spill code minimization via interference region spilling. SIGPLAN Notices, 32(6):287–295, June 1997. Proceedings of the ACM SIGPLAN ’97 Conference on Programming Language Design and Implementation.

[3] D. Bernstein, D. Q. Goldin, M. C. Golumbic, H. Krawczyk, Y. Mansour, I. Nahshon, and R. Y. Pinter. Spill code minimization techniques for optimizing compilers. SIGPLAN Notices, 24(7):258–263, July 1989. Proceedings of the ACM SIGPLAN ’89 Conference on Programming Language Design and Implementation.

[4] P. Briggs, K. D. Cooper, and L. Torczon. Coloring register pairs. ACM Letters on Programming Languages and Systems, 1(1):3–13, Mar. 1992.

[5] P. Briggs, K. D. Cooper, and L. Torczon. Rematerialization. SIGPLAN Notices, 27(7):311–321, July 1992. Proceedings of the ACM SIGPLAN ’92 Conference on Programming Language Design and Implementation.

[6] P. Briggs and L. Torczon. An efficient representation for sparse sets. ACM Letters on Programming Languages and Systems, 2(1–4):45–58, March–December 1993.

[7] D. Callahan and B. Koblenz. Register allocation via hierarchical graph coloring. SIGPLAN Notices, 26(6):192–203, June 1991. Proceedings of the ACM SIGPLAN ’91 Conference on Programming Language Design and Implementation.

[8] G. J. Chaitin. Register allocation and spilling via graph coloring. SIGPLAN Notices, 17(6):98–105, June 1982. Proceedings of the ACM SIGPLAN ’82 Symposium on Compiler Construction.

[9] G. J. Chaitin, M. A. Auslander, A. K. Chandra, J. Cocke, M. E. Hopkins, and P. W. Markstein. Register allocation via graph coloring. Computer Languages, 6(1):47–57, Jan. 1981.

[10] J.-D. Choi, R. Cytron, and J. Ferrante. Automatic construction of sparse data flow evaluation graphs. In Conference Record of the Eighteenth Annual ACM Symposium on Principles of Programming Languages, pages 55–66, Orlando, FL, USA, Jan. 1991.

[11] F. C. Chow and J. L. Hennessy. Register allocation by priority-based coloring. SIGPLAN Notices, 19(6):222–232, June 1984. Proceedings of the ACM SIGPLAN ’84 Symposium on Compiler Construction.

[12] K. D. Cooper, T. J. Harvey, and L. Torczon. How to build an interference graph. Software: Practice and Experience, 28(4):425–444, Apr. 1998.

[13] K. D. Cooper and L. T. Simpson. Live range splitting in a graph coloring register allocator. In Proceedings of the Seventh International Compiler Construction Conference, CC ’98, Lecture Notes in Computer Science 1383, pages 174–187, 1998.

[14] L. George and A. W. Appel. Iterated register coalescing. ACM Transactions on Programming Languages and Systems, 18(3):300–324, May 1996.

[15] R. Gupta, M. L. Soffa, and T. Steele. Register allocation via clique separators. SIGPLAN Notices, 24(7):264–274, July 1989. Proceedings of the ACM SIGPLAN ’89 Conference on Programming Language Design and Implementation.

[16] J. R. Larus and P. N. Hilfinger. Register allocation in the SPUR Lisp compiler. SIGPLAN Notices, 21(7):255–263, July 1986. Proceedings of the ACM SIGPLAN ’86 Symposium on Compiler Construction.

[17] S. S. Lavrov. Store economy in closed operator schemes. Journal of Computational Mathematics and Mathematical Physics, 1(4):687–701, 1961. English translation in U.S.S.R. Computational Mathematics and Mathematical Physics 3:810–828, 1962.

[18] D. Matula and L. Beck. Smallest-last ordering and clustering and graph coloring algorithms. Technical Report CSE-8104, Department of Computer Science and Engineering, Southern Methodist University, July 1981.

[19] B. R. Nickerson. Graph coloring register allocation for processors with multi-register operands. SIGPLAN Notices, 25(6):40–52, June 1990. Proceedings of the ACM SIGPLAN ’90 Conference on Programming Language Design and Implementation.

[20] C. Norris and L. L. Pollock. Register allocation over the program dependence graph. SIGPLAN Notices, 29(6):266–277, June 1994. Proceedings of the ACM SIGPLAN ’94 Conference on Programming Language Design and Implementation.

[21] J. Park and S.-M. Moon. Optimistic register coalescing. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 196–204. IEEE, 1998.
