Efficiently Computing Static Single Assignment Form and the Control Dependence Graph RON CYTRON, MARK JEANNE FERRANTE, BARRY K. ROSEN, and N. WEGMAN IBM Research Division and F. KENNETH Brown ZADECK University In optimizing compilers, data structure choices directly influence the power and efficiency of practical program optimization. A poor choice of data structure can inhibit optimization or slow compilation to the point that advanced optimization features become undesirable. Recently, static single assignment form and the control dependence graph have been proposed to represent data flow and control flow propertiee of programs. Each of these previously unrelated techniques lends efficiency and power to a useful class of program optimization. Although both of these structures are attractive, the difficulty of their construction and their potential size have discouraged their use. We present new algorithms that efficiently compute these data structures for arbitrary control flow graphs. The algorithms use dominance frontiers, a new concept that may have other applications. We also give analytical and experimental evidence that all of these data structures are usually linear in the size of the original program. This paper thus presents strong evidence that these structures can be of practical use in optimization. Categories and Subject Descriptors: D .3.3 [Programming Languages]: Language Constructs—control structures; data types and structures; procedures, functions and subroutines; Languages]: Processors—compilers; optimization; I. 1.2 [Algebraic D.3.4 [Programming Intelligence]: Automatic Manipulation]: Algorithms—analysis of algorithms; 1.2.2 [Artificial Programming-program transformation A preliminary version of this paper, “An Efficient Method of Computing Static Single Assignment Form, ” appeared in the conference Record of the 16th ACM Symposium on principles of Programming Languages (Jan. 1989). F. K. Zadeck’s work on this paper was partially supported by the IBM Corporation, the office of Naval Research, and the Defense Advanced Research Projects Agency under contract NOCIO14-83K-0146 and ARPA order 6320, Amendment 1. Authors’ ac!drwses: R. i3yiru:I, J. Ferrante, and M. N. Wegman, Computer Sciences Department, IBM Research Division, P. O. Box 704, Yorktown Heights, NY 10598; B. K. Rosen, Mathematical Sciences Department, IBM Research Division, P. O. Box 218, Yorktown Heights, NY 10598; F. K. Zadeck, Computer Science Department, P.O. Box 1910, Brown University, Providence, RI 02912. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. @ 1991 ACM 0164-0925/91/1000-0451 $01.50 ACM Transact~ons on Programmmg Languages and Systems, VO1 13, NO 4, October, le91, Pages 451.490 452 Ron Cytron et al. s General Terms: Algorithms, Additional dominator, Languages Key Words and Phrases: optimizing compilers Control dependence, control flow graph, clef-use chain, 1. INTRODUCTION In optimizing compilers, and efficiency structure can of practical program optimization. A inhibit optimization or slow compilation advanced optimization data structure features choices become directly undesirable. influence the power poor choice of data to the point where Recently, static single assignment (SSA) form [5, 431 and the control dependence graph [241 have been proposed to represent data flow and control flow properties of programs. Each of these previously unrelated techniques lends efficiency and power to a useful class of program optimizations. Although both of these structures are attractive, the difficulty of their construction and their potential size have discouraged their use [4]. We present new algorithms that efficiently compute these data structures for arbitrary control flow graphs. The algorithms use dominance the dominance This a new frontiers, also give analytical paper dependence Figure concept and experimental frontiers thus is usually presents 1 illustrates the role may evidence linear strong can be of practical that have that other applications. in the size of the original evidence that We the sum of the sizes of all SSA form program. and control use in optimization. of SSA form in a compiler. The intermediate code is put into SSA form, optimized in various ways, and then translated back out of SSA form. Optimizations that can benefit from using SSA form include code motion [221 and elimination of partial redundancies [431, as well as the constant propagation discussed later in this section. Variants of SSA form have been used for detecting program equivalence [5, 521 and for increasing parallelism in imperative programs [211. The representation of simple data flow information (clef-use chains) may be made more compact through SSA form. If a variable has D definitions and U uses, then there can be D x U clef-use chains. When similar information is encoded in SSA form, there can be at most E def-use chains, where E is the number of edges in the control flow graph [40]. Moreover, the clef-use information encoded in SSA form can be updated easily when optimizations are applied. This is important for a constant propagation algorithm that deletes branches to code proven at compile time to be unexecutable [50]. Specifically, the def-use program information text that is just use that a list, for each variable, of the places in the variable. Every use of V is indeed a use of the value provided by the (unique) assignment to V. To see the intuition behind SSA form, it is helpful to begin with straight-line code. Each assignment to a variable is given a unique name (shown as a subscript in Figure 2), and all of the uses reached by that assignment are renamed to match the assignment’s new name. Most programs, however, have branch and join nodes. At the join nodes, we In Figure 3, the add a special form of assignment called a @-function. ACM TransactIons on Programming Languages and Systems, Vol. 13, No, 4, October 1991 Static Single Assignment Form and the Control Dependence Graph Orlgmal Intermediate Opt Imlzed Code Intermediate J Code T P()+P~+P~-+ Fig, 1. Vertical arrows represent arrows represent optimizations, ... translation to/from Pn static single assignment V+4 VI-4 +-V+5 V+-6 V2*6 + Fig. 2. Straight-line code and its single assignment P if V - 4 then VI + 4 else V +- 6 else V2 + 6 IJse V several operands point. times. to the by new /* */ if-then-else @-function Subsequent replaced anditssingle indicate variables one assignment to Vi. assignment which uses of V become there and Constant Figure tines. */ version. assignments is only V2) V~ several uses of V3. The Vl, Vz, V3, ..., Indeed, Use each to V reach old variable use of Vi thle join V is thus is reached one assignment entire program. This simplifies the record keeping An especially clear example is constant propagation the next subsection sketches this application. 1.1 +5 P then Fig.3. just vi version. v~ e- q5(vf, /* form, Horizontal +-- V2 +7 --V+7 if 453 . by to Vi in the for several optirnizations. based on SSA form [501; Propagation 4 is a more elaborate test of Q, and propagating version of Figure the constant true 3. The else branch to this includes use of Q tells a a compiler that the else branch never falls through to the join point. If Q is the constant true, then all uses of V after the join point are really uses of the constant 4. Such possibilities processing is either can be taken costly into [48] or difficult the algorithm is fast and simple. Initially, the algorithm assumes that account without to understand SSA form, [49]. With each edge is unexecutable but the SSA form, (i. e., never followed at run time) and that each variable is constant with an as-yet unknown value (denoted T). Worklists are initialized appropriately, and the assumptions are corrected until they stabilize. Suppose the algorithm finds that variable P is not constant (denoted _L) and, hence, that either branch of ACM Transactions on Programming Languages and Systems, Vol. 13. No. 4, October 1991. 454 Ron Cytron et al . if if P then do P then 4 V +- do else do V + if else 6 Q then do V2 if return +- 4 +- 6 Q then return end end V3 + 4(V1, V2) . . . +- V3+S . . . +V+5 Fig the outer conditional executable, and processed, Vi end end the 4, Example ofconstant may betaken. the statements assumption propagation The outedges they about VI reach are is changed of the test ofP processed. from are marked When T to 4, and VI +4 all is uses of VI are notified that they are uses of4. (Each use ofVl is indeed a use of this value, thanks to the single assignment property.) In particular, the @function combines 4with Vz. The second operand of the @function, however, isassociated through with the inedge the else branch. algorithm uses T for So long the of thejoin as this second point that corresponds edge is considered operand, no matter what to falling unexecutable, the is currently as- sumed about Vz. Combining 4 with T yields 4, so uses of V~ are still tentatively assumed to be uses of 4. Eventually, the assumption about Q may change from T to a known would lead to the followed, and then constant or to ~ . A change discovery that the second the 6 at Vz combines with to either false or L inedge of the join point can be the 4 at VI to yield L at V~. A change to true would have no effect on the assumption at V3. Traditional constant propagation algorithms, on the other hand, would see that assignments of two different constants to V seem to “reach” all uses after the join point. They would thus decide The algorithm just sketched that V is not constant at these uses. is linear in the size of the SSA program it sees [501, but this size might be nonlinear in the size of the original program. In particular, it would be safe but inefficient to place a +-function for every variable at every join point. This paper shows how to obtain SSA form efficiently. When ~-functions are placed possible but is unlikely in practice. The linear in essentially 1.2 Where At first the size of the original linear in the SSA size. to Place glance, carefully, nonlinear behavior is still size of the SSA program is typically program; the time to do the work is @-Functions careful placement might seem to require the enumeration of pairs of assignment statements for each variable. Checking whether there to V that reach a common point might seem to be are two assignments intrinsically nonlinear. In fact, however, it is enough to look at the domiof each node in the control flow graph. Leaving the technicalinance fi-ontier ties to later sections, we sketch the method here. ACM Transactions on Programming Languages and Systems, Vol 13, No 4, October 1991 Static Single Assignment Form and the Control Dependence Suppose so that that any a variable V has just one assignment be either a use of the use of V will program or a use of the value assignment to V. Let X be the will determine basic block the value Y. When . in the original value V. program, at entry along control flows along any edge X + Y, the code in Y will to reach a node node strictly dominated X it may be. Eventually, Z not strictly X + 1{ to a dominated by to V may appear in the and be X (in a [ways by X will always see Vl, no however, control may be able X. Suppose node on a path, so that Z sees VI along one inedge another inedge. Then Z is said to be in the dominance clearly in need of a @-function for V. In general, assignments to the see VI unaffected by V.. If Y # X, but all paths to Y must still go through dominate Y), then the code in Y will which case X is said to strictly see VI. Indeed, any matter how far from 455 VI from the most recent execution of the basic block of code that assigns to V, so X of V when entered Graph original Z is the but first such see V. along frontier of X and is no matter how many program may and no matter how complex the control flow may be, we can place ~-functions for V by finding the dominance frontier of every node that assigns to V, then the dominance frontier of every node where a +-function has already been placed, and so on. The same concept also be used conditions dependent frontiers used for computing dependence [20, Outline and program analysis the to deal with and formalizes SSA form (Section Section 7 explains experiments representation a statement is control definitely causes that cause the statement to be of parallelism [2], program of control flow by a directed to construct it. This here. Our algorithm graph, section can be 5) and the control dependence graph (Section 6) efficiently. how to translate out of SSA form. Section 8 shows that our are linear in the size of programs restricted to certain control We also give evidence of general linear behavior by reporting on Section 9 summarizes the algowith FORTRAN programs. 2. CONTROL basic can those these variants. Section 4 reviews the dominator tree dominance frontiers. Then we show how to compute rithms and time bounds, compares presents some conclusions. The SSA form identify of the Rest of the Paper 2 reviews algorithms structures. which [28]. Section 3 explains SSA form and sketches how also considers variants of SSA form as defined adjusted concept 24], Informally, the branch to execute while another edge can Such information is vital for detection optimization, Section control affecting statement execution. on a branch if one edge from statement skipped. 1.3 of dominance to compute statements blocks, FLOW with other techniques, and GRAPHS of a program where our technique program are flow organized enters into a basic (not block necessarily maximal) at its first statement and leaves the basic block at its last statement [1, 36]. Basic blocks are indicated by the column of numbers in parentheses in Figure 5. A control is a directed graph whose nodes are the basic blocks of a program jZow graph and two additional nodes, Entry and Exit. There is an edge from Entry to ACM Transactions on Programming Languages and Systems, Vol. 13, No, 4, October 1991 456 . Ron Cytron et al. (1) 1+1 J+ I (1) (1) K+l L+-1 (1) repeat (2) (2) if (P) then (3) (3) (3) (4) do J- I if (Q) then L t 2 else L - 3 K-K+ (5) (6) (6) (7) I end else print KtK+2 (I, J,K, (8) L) 10 (9) repeat if (9) (10) (R) then until L +- L + 4 (S) 11 (11) @IEl--& (12) (12) 1-1+6 until 4 9 (T) Fig 5 Simple pro~amand itscontrol flow graph any basic block at which the program can be entered, Exit from any basic block that can exit the program. the representation of control also an edge from transfers of control node dependence and explained Entry to Exit. The other (jumps) between the basic is on a path from of X is any successor Entry node and there is an edge to For reasons related to in Section and on a path to Exit. For Y with an edge X + Y in SZLCC(X) is the set of all successors of X; with more than one successor is a branch 6, there edges of the graph blocks. We assume is represent that each each node the graph, X, a and similarly for predecessors. A node node; a node with more than one node. Finally, each variable is considered to have an predecessor is a join assignment in Entry to represent whatever value the variable may have when the program is entered. This assignment is treated just like the ones the that appear explicitly in the code. Throughout this paper, CFG denotes control flow graph of the program under discussion. of length J in CFG consists of a For any nonnegative integer J, a path Xo, ., XJ) and a sequence of J edges sequence of J + 1 nodes (denoted (denoted e,, . . . . eJ) 1< j < J. (We write such that e, runs from XJ_, to X, for all j with sequences, one item eJ: X~_l ~ X~. ) As is usual with (node occur or edge) ACM Transactions may on Programmmg several times. The Languages and Systems, Vol null path, with 13, No 4, October J = O, is 1991 Static Single Assignment Form and the Control Dependence allowed. We write p is known X.: p: XJ for an unrestricted path Graph p, but . 457 X. “ XJ p: if to be nonnull. Nonnull paths p: X. ~ XJ and q: YO ~ YK are said to converge at a node Z if X. # Yo; (1) XJ=Z=YK; (XJ=Y,) Intuitively, the node-disjoint, paths but p they SINGLE This section intermediate and Jork=K). q start at the possibility ASSIGNMENT (3) different at the One of the paths need to consider 3. STATIC -(j= come together and in (3) is deliberate. we still (2) end. nodes The may happen and are almost use of or rather to be a cycle than Z ~ Z, but of convergence. FORM initially assumes that programs have form in which each statement evaluates been expanded to an some expressions and uses the results either to determine a branch or to assign to some variables. (Other program constructs are considered in Section 3.1.) An assignment statement A has the form LHS( A) - RHS( A), where the left-hand side LHS( side tuple. A) is a tuple RHS( A) Each of distinct is a tuple target target variable in from RHS( there is no need to distinguish LHS( A). In the examples use of longer tuples variables of expressions, A) already the same is assigned discussed, between in expanding (U, V, . ..) with V and other the as the LHS corresponding all tuples (V). program and the right-hand length value are l-tuples, and 3.1 sketches the Section constructs (such as proce- dure calls) so as to make explicit the variables used and/or changed by the statements in the source program. Such explicitness is a prerequisite for most optimization anyway. The only novelty in our use of tuples is the fact that they into the intermediate provide a simple text. and uniform Practical way concerns ate text are addressed in Section 3.1. Translating a program into SSA form step, join some nodes trivial o-functions in program’s the to fold about is a two-step V ~ @(V, V, . . . ) are control flow the graph. results of analysis the size of the inte rmedi process. inserted In the In the at some second step, first of the new variables V, (for i = O, 1,2, . . . ) are generated. Each mention of a variable V in the program is replaced by a mention of one of the new variables V,, where a mention may be in a branch statement, Throughout an ordinary assignment expression or on either this paper, an assignment or a @function. Figure side of an assignment statement 6 displays may the be either result of translating a simple program to SSA form. A +-function at entrance to a node X has the form V ~ +( R, S, . . . ), where V, R, S, . . . are variables. The number of operands R, S, . is the number of control flow predecessors of X. The predecessors of X are listed in some arbitrary fixed order, and the jth operand of + is associated with the jth ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991. 458 Ron Cytron et al. 0 1-1 J+ Kti L+-1 11-1 Jl +-1 Kl +-l L1tl I repeat repeat if (P) then 12 - #(13,11) J2 K2 + @(J4, JI) l#J(K~,K~) L2 +- 4(L9, L1) if (P) do J+ then I if (Q) then L3 t 2 L +--- 3 else L4 +- 3 L5 K3 I (I, J,K, else L) - K4 +- K2+2 #( J3, J2) K5 - @(K3, K4) L6 + ~(L2,L~) print(12, J4, K5, L6) repeat (R) then L7 +- if (R) L+-L+4 until (S) 1+1+6 until Fig.6 is just uses only + L8 -L7+4 @(L8, L7) (S) 13-12+6 (T) predecessor. Ifcontrol supportl remembers @(L~,L~) then L9 @(R, S,...) qifunction @(L3, L4) +K2+1 end K+--K+2 repeat until 12 L +-- 2 J4 until (Q) else end else if - if then K-K+ print do J3 Simple program reaches Xfromits J“ while executing the value one ofthe (T) anditsSSAform. jth predecessor, then the run-time the o-functions in X. The value of of the jth operand. Each execution of a operands, but which one depends on the flow of control just before entering X. Any d-functions in X are executed before the ordinary statements in X. Some variants of d-functions as defined here are useful for special purposes. For example, each @function can be tagged lWhen SSA form is used only as an intermediate form (Figure 1), there is no need actually to provide this support. For us, the semantics of @ are only important when assessing the correctness of intermediate steps in a sequence of program transformations beginning and ending with code that has no +-functions, Others [6, 11, 521 have found it useful to give @ another parameter that incidentally encodes j, under various restrictions on the control flow. ACM Transactions on Programming Languages and Systems, Vol. 13, No 4, October 1991 Static Single Assignment Form and the Control Dependence with the node X where it appears [5]. When the control Graph flow . of a language suitably restricted, each @function can be tagged with information conditionals or loops [5, 6, 11, 52]. The algorithms in this paper apply variants is about to such as well. SSA relation form 459 form may be considered between two programs. if each variable as a property of a single program OY as a A single program is defined to be in SSA is a target of exactly one assignment program text. Translation to SSA form replaces new program with the same control flow graph. V, the following conditions are required statement in the the original program by a For every original variable of the new program: (1) If two nonnull paths X ~ Z and Y~ Z converge at a node Z, and nodes X and Y contain assignments to V (in the original program), then a trivial +-function V + q5(V, . . . . V) has been inserted at Z (in the new program). (2) Each mention of V in the original program or in an inserted +fulnction has been replaced by a mention of a new variable V,, leaving the new program in SSA form. (3) Along any control original program) Then V and Translation to flow path, consider and the corresponding Vi have any use of a variable V (in the use of V, (in the new program). the same value. SSA minimal form is translation to SSA form with the proviso that the number of @-functions inserted is as small as possible, subject to Condition (1) above. The optimizations that depend on SSA form are still valid if there are some extraneous +-functions beyond those that would cause appear in minimal information the optimization where they forego placing SSA to be lost, process itself. required. are extraneous always variant +-functions add unnecessary it is important of SSA at a convergence more uses for V in or after any risk of losing Condition and it is sometimes However, as Figure However, they Thus, One a @-function form. and to place form point [52] can overhead +-functions would Z, so long to only sometimes as there nre no Z. The @function could then be omitted without (3). This has been called pruned SSA form [16], preferable to our placement 16 in Section 7 illustrates, at all our convergence points. form is sometimes preferable to pruned form. When desired, pruned SSA form can be obtained by a simple adjustment of our algorithm [16, Section 5.1]. For any variable V, the nodes at which we should insert @-functions in the original program of SSA form. two paths assignments observe that can be defined recursively A node Z needs a +-function that to originate at V or needing a node Z needs by Condition two different @-functions a @-function (1) in the definition for V if Z is a convergence nodes, both nodes for V. Nonrecursively, for V because point for two nonnull paths X ~ Z and Y ~ Z that already containing assignments to V. If Z did point for containing we may Z is a convergence start at nodes X and Y not already contain an assignment to V, then the @function inserted at Z adds Z to the set of nodes that contain assignments to V. With more nodes to consider as origins of paths, we may observe more nodes appearing as convergence points of ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991. 460 Ron Cytron et al. . nonnull paths needing insertion in much 3.1 If originating @functions cycle. 2 The less time. Other the Program source at nodes could thus algorithm with be assignments found presented by here to V. The iterating obtains an the set of nodes observation/ same end results Constructs program computes expressions using constants and scalar variables to assign values to scalar variables, then it is straightforward derive an intermediate text that is ready to be put into SSA form otherwise analyzed and transformed by an optimizing compiler). to (or However, source programs commonly use other constructs as well: Some variables are not scalars, and some computations do not explicitly indicate which variables they use constructs or change. This to an explicit section sketches intermediate text how suitable to map some important for use in a compiler. 3.1.1 Arrays. It will suffice to consider one-dimensional arrays here. Arrays of higher dimension involve more notation but no more concerns. If A and B are array variables, then array assignment statements like A + B or A +- O, if allowed assignments program, taken however, to awkward, value rather to scalar be both an by the variables. source Many language, of the will be mentions of A(i) integer variable. Treating because an assignment for to A(i) can be treated mentions some index A(i) may as just of A in the i, which a variable or may not like source may would change be be the of A(j) and because the value of A(i) could be changed by assigning to i than to A(i). An easier approach is illustrated in Figure 7. The entire array is treated like a single scalar variable, which may be one of the operands of Access or Update. 3 The expression Access(A, i) evaluates to the ith component of A; the expression Update(A, j, V) evaluates to an array value that is of the same size as A and has the same component values, except for V as the value of the jth component. Assigning a scalar value V to A(j) is equivalent to assigning an array value to the entire array A, where the new array value depends on the old one, as well as on the index j and on the new scalar value V. The translation to SSA form is unconcerned with whether the values of variables As with scalars, translation are large objects or what the operators mean. of array references to SSA form removes some anti- and output-dependences [32]. In the program in Figure 7a, dependence analysis may prohibit reordering the use of A(i) by the first statement and the definition of A(j) by the second statement. After translation to SSA form, the two references to A have been renamed, and reordering is then possible. For example, the two statements can execute concurrently. Where optimization does not reorder the two references, Section 7.2 describes how transla - 2In typical cases, paths originating at @functions add nothing beyond the contributions from paths originating at ordinary assignments. The straightforward iteration is still too slow, even in typical cases 3This notation is similar to Select and Update [24], which, in turn, is similar to notation for referencing aggregate structures in a data flow language [23]. ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991 Static Single Assignment Form and the Control Dependence Graph + A(i) A(j) + V - A(k) + 2 A t Llpdate(A, T - Access Access (A, i) j,V) (A, k) . 461 +- ACCeSS(A8, 17) A9 + Update(A8, j~, V~) T1 + Access(A9,k~~) +T1+2 -T+2 (b) (a) (c) Fig. 7. Source code with array component references (a) is equivalent to code with explicit Access and Update operators that treat the array A just like a scalar (b). Transformationto SSA form proceeds as usual (c). tion out of SSA form reclaims maintain distinct variables. Consider array A. indirectly) terms the loop shown storage in Figure would 8a, which of its effect assigns on A, the whole loop be necessary to every component to of is like accesses of A require assignment each values produced execution of the statement can us, or for anyone problem for scalars, is to communicate optimizations of the form form, the Update to the loop. operator in response to the crudeness of the Update operator is to calculations and other genuine scalar calculations can still be optimized extensively. Another response sis [3, 10, 32, 51], which can sometimes case for an assignment ). Any assignment to a component of A an initialization loop is dead code that be eliminated. With or without SSA 8b makes A appear to be live at entry One reasonable accept it. Address the otherwise The value assigned to each component A(i) does not depend (even on any of the values previously assigned to components of A. In A + (...), where A is not used in (... that is not used before entering such should Figure that (like dead then by any other assignment be viewed who assignment to A(i) as an uses Update of the results elimination) that some code is to perform dependence analydetermine that no subsequent in to A. Such is Figure initialization to make arrays of dependence are usually 8a. The of A. The loolk like analysis formulated to in terms of “scalar” variables. A simple solution to this formal problem is shown in Figure 8c, where the HiddenUpdate operator does not mention the assigned array operand. The actual code generated for an assignment from a HiddenUpdate expression is exactly the same as for an assignment from the corresponding the target Update expression, where the hidden operand 3.1.2 Structures. A structure can be generally regarded where references to structure fields are treated as references the array. Thus, an assignment Update of the structure, Access of the structure. references, this treatment constants. is supplied by of the assignment. Dependence to a structure field as an array, to elements of is translated iIItO an and a use of a structure field is translated into an In the prevalent case of simple structure field results in arrays whose elements are indexed by analysis can often determine independence i31111011g such accesses, so that optimization may move an assignment to one field far from an assignment to another field. If analysis or language semantics reveals a structure whose n fields are always accessed disjointly, then the ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, Octob,w 1991, 462 . integer Ron Cytron et al. A(1:1OO) integer AO(I:lOO) integer AO(I:lOO) integer A1(l:lOO) integer A1(I:lOO) integer A2(I:IOO) integer A2(1:100) i+l il repeat repeat A(i) t-- t ~(ii,i~) i2 +-- #(i~, Al + f#(fio, A2) Al +- ~(A0,A2) i3 i repeat i2 A2 i i+l+l until il+--l +--l > 100 until + update (Al,lz, iZ) -iz+i i3 > 100 (a) A2 +- 13 +-12+1 until i3) HiddenUpdate(12, i3 > (b) i2) 100 (c) Fig. 8, Source loop with array assignment (a) is equivalent to code with an Update operator that treats the array A just hke a scalar (b), As Section 72 explains, the eventual translation out of SSA form will leave just one array here, Using HiddenUpdate (c) is a purely formal way tosummarize some results of dependency analysis, if available. structure structures can be decomposed are united in the and the the program’s expression of the actual use into n distinct source program structure’s of the variables. The elements only for organizational decomposition structure in more SSA apparent of such reasons, form to makes subsequent optimization. 3.1.3 Implicit References requires knowing addition to those modify those to variables The Variables. variables modified explicitly and construction used referenced, by of SSA form a statement. a statement may In use or implicit variables not mentioned by the statement itself. Examples of such references are global variables modified or used by a procedure call, aliased variables, and dereferenced pointer variables. To obtain SSA form, we must account for implicit as well as explicit references, either by conservative assumptions or by analysis. Heap storage can be conservatively modeled by representing the entire heap as a single variable that is both modified and used by any statement that may change the heap. More refined modeling is also possible [15], but the conservative approach is already strong enough to support optimization of code that does not interspersed with heap-dependent code. For any statement S, three types of references involve affect the heap translation but into is SSA form: (1) MustMod(S) is the set of variables that must be modified by execution of that may be modified by execution of s. (2) MayMod(S) is the set of variables s. (3) May Use(S) is the set of variables may be used by S. We represent an assignment ACM Transactions whose values implicit references for any statement statement A, where all the variables on Programmmg Languages and Systems, Vol, prior to execution of S S by transformation to in MayMod(S) appear 13, No, 4, October 1991 Static Single Assignment Form and the Control Dependence Graph in LHS( A) and MustMod(S)) An optimizing procedures all appear called the in variables RHS( in May Use(S) will but not change. compiler may (directly or indirectly) or may it is reasonable not have limit that are available to extract To determine parameter aliasing effects and a sophisticated side Small tuples often unavailable. analysis effects technique and the can be represented Many compilers structure plus of the tuples that a direct includes a flag the usual LHS and Sophisticated representation be compact. (set to indicate of the few local because of parameter transmission. The cal analyses of optimization techniques 3.2 Overview Translation is that are analysis, that small. however, analysis all globals variables intuitive (with conveniently formulated in terms of explicit can still be used in the implementations. result RHS) been The vari- see [14, 15, is at all. analyzed. compilers The representation that can information: on global do no interprocedural can still If no to the caller aliasing, Consider a call to an external procedure that has not tuples for the call must contain all global variables. representation local detailed pointer (both of all effects. transformations more is applied, tuples directly. that of procedures ables, see [7-9], [19], [37], and [42]; to determine 27, 29, 33, 34], and [44]. bodies of their variables Techniques are few – to make conservative asor any parameter passed the extent be performed. When ( ikfcz~illod(s) access to the or to summaries to assume Such assumptions there 463 A). specific information is available, it is customary sumptions. A call might change any global variable by reference, u . Both own caln be a are in the tuple) are in the tuple explanations and theoretior without SSA form) are tuples; compact representations of the SSA Algorithm to minimal (1) The dominance graph (Section SSA form frontier 4.2). is done in three mapping (2) Using the dominance frontiers, variable in the original program steps: is constructed from the contrcd the locations of the @-functions are determined (Section 5.1). flow for each (3) The variables are renamed (Section 5.2) by replacing each mention of an original variable V by an appropriate mention of a new variable V,. 4. DOMINANCE Section 4.1 reviews the dominance flow graph and how to summarize 4.2 introduces the dominance relation [47] between nodes in the control this relation in a dominator tree. Section frontier mapping and gives an algorithm for its computation. 4.1 Dominator Trees Let X and Y be nodes in the control flow graph CFG of a program. If -X appears on every path from Entry to Y, then X dominates Y. Domination is both reflexive and transitive. ACM Transactions If X dominates on Programming Languages Y and and Systems, X # Y, then Vol. X strictly 13, No. 4, October 1991 464 Ron Cytron et al . Entry [- I -] ‘(-) (-i 1 [- I Exit] Exit [- I -] I (Exit 2[-1 (-l~xit2 3[-18] P 5[;]6] (-) 4[616] ) Exit,2] \ 7[8181 8[-1 Exi’t ,2] 6&\] (Exit,2) 9[-1 Exit,2,9] ,_,~2 ,, 10[111 11] li[91Exit:2,9] I (’ E=’k (Exit ,2) 12[Exit,21Exit,2] w Fig. 9 Control flow graph anddominator tree of the simple program Thesets ofnodes listed in ()and[] brackets summarize thedominance frontier calculation from Section 4.2, Each node X is annotated with two sets [DFIOCaj (X) I DF’(X)] and a third set (DFUP(X)). dominates for Y. In formulas, domination. immediate If X dominator does we write X>> strictly not of Y (denoted Y for strict dominate idom( Y)) domination and Y, we write X ~ is the closest strict X ~ Y Y. The dominator of Y on any path from Entry to Y. In a dominator tree, the children of a node X are all immediately dominated by X. The root of a dominator tree is Entry, and any node Y other than Entry has idom( as its parent Y) in the tree. The dominator tree for CFG from Figure 5 is shown in Figure 9. Let N and E be the numbers of nodes and edges in CFG. The dominator tree can be constructed in 0( Ea( E, N)) time [35] or (by a more difficult algorithm) in 0(E) time [26]. For all practical purposes, a( E, N) is a small constant,4 so this paper will consider the dominator tree to have been found in linear time. The dominator tree of CFG has exactly the same set of nodes as CFG but a very different always always set of edges. Here, the words refer to CFG. The words refer to the dominator tree. parent, predecessor, child, successor, ancestor, and and path descendant 4Under the definition of a ueed in analyzing the dominator tree algorithm [35, p. 123], N s E implies that a(17, N) = 1 when logz N < 16 and a(ll, N) = 2 when 16 < logz N < 216. ACM TransactIons on Programming Languages and Systems, Vol. 13, No 4, October 1991 Static Single Assignment Form and the Control Dependence 4.2 Dominance The dominance Y such that Graph . 465 Frontiers frontier DF( X) X dominates of a CFG a predecessor node X is the set of all of Y but CFG does not strictly nodes dominate Y: DF(X) = {Y[(~Pe Pred(Y))(X> Computing D1’( X) directly much of the dominator-tree. X would the be quadratic, dominance the mapping, such that the sets themselves mapping we define in time two intermediate the following equation DF(X) X? Y)}. from the definition would require searching The total time to compute DF( X) for all nodes even when frontier Pand linear are small. in the size sets DFIOC~l and Ex To compute I DF( X) I of DFUP for eaclh node holds: = DFIOC~l(X) U (J (4) DFUP(Z). ZeChildren(X) Given This any local node X, some contribution of the successors DFIOC.l( X) of X may is defined contribute to DIF( X). by Given any node Z that is not the root Entry of the dominator tree, some of the nodes in DF(Z) may contribute to DF( idom(Z)). The contribution DFUP(Z) that Z passes up to idom(Z) is defined by DFUP(Z)~f{Ye LEMMA The dominance 1. PROOF. dominance still show Because and strictly dominate dominance let equation ~ Y}. (4) is correct. is reflexive, U ~ Y be an edge Y. U # X, on the other The frontier I idom(Z) DFIOC.l( X) G DF( X). Because is transitive, each child Z of X has DFUP(Z) G DF( X). We must that everything in DF( X) has been accounted for. Suppose Y e DF( X), cannot implies DF(Z) If U = X, hand, then strictly dominate that Ye DFUP(Z). Y intermediate can sets such then there because that X dominates Y e DFIOC.Z( X), is a child X U but and we Z of X that does not strictly are dominates dominate dcles not done. If U but Y. This ❑ be computed = {Y= SUCC(X) with simple equality tests as follows: LEMMA 2. For any node X, DFIOC~t(X) PRooF. We assume that (X> Y ~ Succ( X) Y) on Programming # X}. and show that # (idom(Y) The = part is true because the immediate dominator. For the = part, suppose that ACM Transactions I idom(Y) =X). dominator X strictly is defined dominates to be a strict Y and, hence, Languages and Systems, Vol. 13, No. 4, October 1991. 466 . Ron Cytron et al, for each ~ in DF(X) each for /*local./ a bottom-up + if traversal of the dominator tree do 0 Y E Succ(X) id.m(Y) # do X then llF(X) +- IIF’(X) U {Y} end for each for Z E each /*up*/ if Children(x) Y E do LIF’(Z) idorn(Y) # do X then DF’(X) - llF’(X) U {Y} end end end Fig. 10. that some Entry V to child dominates idom( Y) V of Y that X X or X and V = Y. But = x. V For any node X and We assume The - part is true immediate dominance. Y. Then then V appears follows the cannot on any edge dominate path X, so V = Y any child = {Ye DF(Z) that Y G DF(Z) (X> Y)+ Z of X in the dominator V = Y. If I idom(Y) and show (idom(Y) # X}. that =X). because strict dominance For the + part, suppose V = Y, then is the transitive closure that X strictly dominates idom( Y) = idom( V ) = X, and we are done. of Y U o!i to Y U or Suppose V + Y (and, hence, that V dominates U) and derive a contradiction. one child of X can dominate U, so V = Z and Z dominates Y. This contradicts These dominance the hypothesis results imply frontiers that the given putes DFIOC.Z( X) on the storage to it. The /*up*/ dominator tree bottom-up, children. To illustrate the dominator tree in Figure THEOREM PROOF. correctness in Figure Direct from of the 10. The algorithm /*local*/ for line computing effectively TransactIons and uses it in (4) without needing to devote line is similar for DFUP( Z). We traverse the visiting each node X only after visiting each of its working of this algorithm, we have annotated the 9 with the information [1 and () brackets. in Figure the preceding on Programming the com- fly The algorithm 1. ❑ Ye DF(Z). 10 is correct. lemmas. ❑ Let CFG have N nodes and E edges. The loop over Succ( X) in Figure examines each edge just once, so all executions of the /*local*/ line ACM and tree, and, hence, that some child V of X dominates Y. Choose a predecessor Y such that Z dominates U. Then V appears on any path from Entry V dominates that goes to U and then follows the edge U + Y, so either that Only from X + Y, so either n DFUP(Z) PROOF. of DF( X) for each CFG node X. dominates goes to = idom(v) LEMMA 3. Calculation Languages and Systems, Vol. 13, No. 4, October 1991 10 are Static Single Assignment Form and the Control Dependence Graph complete in time 0(E). complete in time O(size(DF)). which amounts shows that, have 4.3 Relating We start by stating The complexity algorithm of the time O(E and have observed that are Section is usually linear. it is faster compiler line + size(DF)), of 0( E + N2 ). However, DF 467 /*up*/ is thus mapping in the PTRAN Frontiers more should executions overall size of the computations Dominance the @-functions of join nonnull the this data-flow all The to a worst-case in practice, implemented standard Similarly, . 8 We than the [21. to Joins formally the nonrecursive be located. Given characterization of where a set Y of CFG nodes, the set J(Y) nodes is defined to be the set of all nodes Z such that there are two CFG paths that start at two distinct nodes in Y and converge at Z. iterated join J+ ( 5“ ) is the limit of the increasing sequence of sets of nodes J1= J(Y); J,+l In particular, if Y happens = J(YU to be the set of assignment V, then J+(Y) is the set of @function The join and iterated join operations extend natural the dominance way: frontier As with join, increasing the sequence iterated nodes for a variable nodes for V. map sets of nodes to sets of nodes. mapping DF( J,). S“ ) = from ~QJDF( dominance nodes to sets of nodes We in the X) . frontier DF+ ( Y ) is the limit of the of sets of nodes DF1=DF(Y); = DF(W DF,+l The actual algorithm iterated computation of DF,) . DF+ ( Y ) is performed in Section 5.1; the formulation dominance frontiers to iterated assignment nodes for a variable V, then J+(Y) by the efficient here is convenient joins. If the set Y we will worklist for relating is the set of show that =DF+(Y) (this equation depends on the fact that Entry is in Y) and, hence, that the location of the @functions for V can be computed by the worklist algorithm for computing DF+ ( Y ) that is given in Section 5.1. The following lemmas do most of the work by relating dominance frontiers to joins: LEMMA 4. For any nonnull path p: X ~ Z X’ e {X} U DF+ ({ X}) on p that dominates every node on p, the node X’ can be chosen PRooF. If X dominates get all the claimed properties ACM Transactions every node of X’. on Programming on p, then We may Languages in CFG, Z. Moreover, in DFf ({ X}). we just assume and Systems, there unless that Vol. is a node X dominates choose X’ = X to some of the nodes 13, No. 4, October 1991. 468 on . p Ron are not Cytron et al dominated by X. Let the sequence of nodes on p be X = i such that X does not dominate X,, the Xo, . . ., XJ = Z. For the smallest predecessor X,_ ~ is dominated by X and so puts X, into DF’( X). Thus, there are choices of ~“ with XJ e DF+ ({ X}). Consider X’ = XJ for the largest j with XJ ~ DF+ ({ X}). We will show that X’ >> Z. Suppose not. Then j < J, and there is a first k with j < k s J such that X’ does not dominate Xh. The predecessor Xh _ ~ is dominated by X’ and so puts X~ into DF( X’). Thus, Xh e DF( DF+ ({ X} )) = DF+ ({ X} ), contradicting 5. LEMMA p: X$ DF+({ Z the choice Let X + Y be two nodes in CFG, and q: Y$ Z in CFG and suppose converge at Z. of j. that Then D nonnull Z ~DF+({ paths X}) U Y}). PROOF. We consider three cases that are obviously exhaustive. In the first two cases, we prove that Z e DF+ ({ X} ) U DF+ ({ Y} ). Then we show that the first two cases are (unobviously) exhaustive because the third case leads to a contradiction. Let nodes X= XO, ..., X’ be from Lemma 4 for the path p, with the sequence of XJ = Z. Let Y’ be from Lemma 4 for the path q, with the sequence Y = YO, . . ., YK = Z. of nodes X’ is on q and show that Z e DF+({ X}). By the Case 1. We suppose that definition of convergence (specifically, (3) in Section 2), X’ = Z. We may now assume already of Z but that Z = X and X dominates every node on p. (Otherwise, Lemma 4 asserts Z e DF+ ({ X}).) Because X dominates the predecessor XJ_ ~ does not strictly dominate Case 2. We suppose that reasoning just as in Case 1. Z, we have Y is We derive a contradiction Case 3. q and Y is not on p. Because X’ on p, from ~ Z c DF( X) and show that the suppositions Z but X’ G DF+ ({ X}). is not Z e DF’ that ({ Y}), X’ is not on on q, X’ >> YK = Z and, therefore, dominates all predecessors of YK. In particular, X’ >> YK_ ~. But X’ # YK_~, SO X’ >> Y&~, and we continue inductively to show that X’ >> Y~ for all k. In particular, X’ >> Y’. On the other hand, by similar reasoning from the supposition that Y >> Z but Y is not on p, we can show that Y >> X’. Two nodes cannot strictly dominate each other, so Case 3 is ❑ impossible. LEMMA PROOF. LEMMA For any set Y’ of CFG 6. We apply 7. For Lemma 5. any set Y nodes, J(7) G DF’(J7 ). ❑ of CFG nodes such that Entry e Y, DF( Y ) G J(Y). PROOF. Consider any Xc .Y and any Ye DF( X). There is a path from X to Y where all nodes before Y are dominated by X. There is also a path from Entry to Y where none of the nodes are dominated by X. The paths ❑ therefore converge at Y. THEOREM 2. The set of nodes that need ~-functions for any variable V is the iterated dominance frontier DF+ ( Y ), where Y is the set of nodes with assignments to V. ACM Transactions on Programmmg Languages and Systems, Vol. 13, No. 4, October 1991 Static Single Assignment Form and the Control Dependence Graph By Lemma PROOF. 6 and induction on i in the definition . of J+, 469 we can show that J+(Y) The induction (5) GDF+(Y). step is as follows: Ji+l = J(YUJ,) GJ(SWDF+(Y)) GDF+(!YUDF+(.Y)) The node Entry is in Y, so Lemma =DF+(Y). 7 and another DF+(Y)G The induction yield J+(Y). (6) step is as follows: DF,+l = DF(YUDF,) GJ(YU SDF(J’W J+(Y)) The set of nodes that need @-functions prove ❑ the theorem. 5. CONSTRUCTION 5.1 induction OF MINIMAL J+(Y)) =J+(Y). for V is precisely so (5) and (6) J+(Y), SSA FORM Using Dominance Frontiers to Find Where @-Functions Are Needed The algorithm algorithm structures in Figure is performed are used: — W is the worklist algorithm, ments to DF( X) @-functions. of CFG nodes being a @function. Each The outer V in the program. processed. loop of this Several In each iteration data of this of nodes that contain assignensures that each node Y in iteration terminates when the w orklist empty. — Work(*) is indicates an array whether of the outer indicates of flags, one flag for each node, X has ever been added to W during where Work(X) the current iteration loop. –HasAlready(*) The flags trivial W is initialized to the set &(V) V. Each node X in the worklist receives becomes 11 inserts once for each variable is an array whether Work(X) because the needing a d-function of flags, a @-function and property for one for each node, where V has already Ha.sAlready( of assigning for V. The flags X) to are independent. V the values true and false, but this would to reset any true flags between iterations, have at X) X. We need two flags is independent could HasAlready( been inserted of the been implemented property with of just require additional record keeping without the expense of looping over all the nodes. It is simpler to devote an integer to each flag and to test flags by comparing them with the current iteration count. assignments to variables,, where Let each node X have AO,,~ (X) original each ordinary assignment statement LHS + RHS contributes the length of the tuple LHS to A o,,~( x). counting assignments to variables is one of ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, Octc,ber 1991. 470 Ron Cytron et al. . IterCount for each + O node X do HasAlready(X) Work(X) + - O 0 end W’+o for each variable IterCount for V +- each X --I 1 d(V) ~ Work(X) do IterCount +- do IterCount W+wu {x} end while W # take for @ do X from each Y if W & DF’(.T) HasAlready(Y) then do < IterCount do place (F’ - @(V,..., HasAlready(Y) if Work(Y) then V)) + < at Y IterCount IterCount do Work(Y) Jv +- +w u IterCount {Y} end end end end end Fig, 11. Placement of ~-functions. several measures of program size. 5 By this measure, the program from size AO,,~ = XX AO,,~(X) to size A,O, = XX AtO,( X), where expands each @- = A o,,g(x) function placed at X contributes 1 to A,.,(X) + A+(X). a similar expansion in the number of mentions of variables, from There is Morlg = at X Xx M~,,g( contributes X) to 1 plus Mto, = Xx where each @-function placed X), of X to MtOt( X) = MOrLg( X) + M+(X). Mto,( the indegree Placing a ~-function at Y in Figure 11 has cost linear in the indegree of Y, so there is an O(EX M*( X)) contribution to the running time from the work done when HasAlready( Y) < IterCount. Replacing mentions of variables will contribute at least 0( M~Ot) to the running time of any SSA translation algorithm, so the cost of placement can be ignored in analyzing the contribution of Figure 11 to the 0(. . ) bound for the whole process. The 0(N) cost of 5The various measures relevant summarized in Section 9 1. ACM TransactIons on Programming here are reviewed Languages when the whole SSA translation and Systems, Vol. 13, No, 4, October 1991 process is Static Single Assignment Form and the Control Dependence initialization can be similarly the dominance managing the performed A,O,( X) because running Sizing frontier worklist ignored because calculation. What W. The statement times, and each the output in the natural time way as 0( A ~Otx aurgDF), aurgDF~f (~ it is subsumed incurs as where (A,O,(X) x 471 by the cost of cannot be ignored is the cost of take X from W in Figure 11 is a cost linear all Ye DF( X) are polled. The contribution time of the whole process is therefore O(ZX( contribution . Graph we can A tot, the weighted I DF(X) in of Figure A ~Ot(X) x also I DF( X) I 11 to the I DF( X) I )). describe the average 1))/(~ (7) Aw(X)) emphasizes the dominance frontiers of nodes with many assignments. Section 8 shows, the dominance frontiers are small in practice, and Figure is effectively 0( A tOt). This, in turn, is effectively 0( AO,i~). As 11 5.2 Renaming The algorithm denoted in Figure V,, where 12 begins a top-down the root traversal node Entry. with the have been node 12 renames all mentions i is an integer, in The visit sequential inserted. The of variables. are generated of the dominator tree by calling to a node processes order, starting processing New for each variable with the SEAFtCH statements any of a statement at associated @-functions requires variables V. Figure that work for may only those variables actually mentioned in the statement. In contrast with Figure 11, we need a loop over all variables only when we initialize two arrays among the following data structures: –S(*) hold is an array integers. variable — C(*) Vi that of stacks, The should is an array C’(V) tells – WhichPred( X. The jth of Y from —Each integer one stack replace of integers, how many for i at the top each variable of S(V) V. The is used stacks to construct can the a use of V. for One assignments each variable to V have V. The countelc value been processed. X, Y) is an integer telling which predecessor of Y in CFG is operand of a +-function in Y corresponds to the jth predecessor the listing assignment of the inedges statement of Y. A has the form LHS(A) - RHS(A) where the right-hand side RHS( A) is a tuple left-hand side LHS( A) is a tuple of distinct target change as mentions is still remembered of variables are renamed, as oldLILS( A). of expressions and the variables. These tuples but the original To minimize notation, target tuple a conditional branch is treated as an ordinary assignment to a special variable whose value indicates which edge the branch should follow. A real implementation would recognize that a conditional branch involves a little less work The LHS part of the processing can be than a genuine assignment: omitted. ACM Transactions on Programmmg Languages and Systems, Vol. 13, No. 4, October 1991 472 Ron Cytron et al. . for each variable V c(v) + o S(V) +- Empty do Stack end call SEARCH (Entry) SEARCH(X) : for each statement if A is .4 in ordinary an X do assignment then for each variable replace V use of used 1’ by In use RHS(A) of do V, where i = Top(S(V)) end for each V in LHS(A) do C(v) itreplace push P’ by i onto C’(V) - new tj in LHS(A) S(V) i 1 + end end /* for each ~ for of first Y + c loop +/ SUCC(X) do tt7hichPred(Y)X) each ~-function replace the F’ in j-th Y operand do t’ in RHS(F) by V, where i = Top(S(V)) end end for each Y c Children(X) do SEARCH(Y) call end for each assignment for each pop V in A in X do oldLHS(A) do s(v) end end end SEARCH Fig. 12 Renaming mentions ofvariables. V by V, ina RHS Thelistofusesof V, grows with each replacement of The processing ofan assignment statement A considers the mentions of variables in A. The simplest case is that of a target variable V inthetuple LHS(A). Weneed anew variable V,. By keeping acount C(V) ofthe number have already been processed, we can of assignments to V that appropriate new variable by using i = C(V) and then incrementing To facilitate renaming uses of V in the future, identifies V,) onto a stack S(V) of (integers that V. replacing ACM Transactions on Pro~amming Languages and Systems, Vol we also identify) 13, N0 push new 4, October 1991 find an C(V). i (which variables Static Single Assignment Form and the Control Dependence A subtler any computation variable is needed V used expressions in RHS( in the tuple. for the right-hand A); that We want of the assignment that There are two subcases is, side RHS( V appears to replace Graph V by A). Consider in at least Vi, where one of the V, is the target produces the value for V actually used in RHS( A). because A maybe either an ordinary assignment or a ~-function. Both subcases get V, from the top of the stack S(V), inspect S(V) at different times in Figure 12. Lemma 10, introduced this section, shows that The correctness proof V, is correctly for renaming chosen depends LEMMA Each 8. new variable but they later in in both subcases. on results from Section on three more lemmas. We start by showing that “the” assignment to a variable in the transformed exactly 473 . it makes program. 4 and sense to speak Vi in the transformed program C(V) after of is a target of one assignment. PRooF. Because the counter is incremented processing each assignment to V, there can be at most one assignment to V,. To show that there is at least one assignment to V,, we consider the two ways V, can be mentioned. If V, is mentioned on the LHS on an assignment, then there is nothing more to show. If the i = Z’op(S( V )) was true value at the time of V to a use of Vi. At the earlier pushed to vi. because an assignment of Vi when time is used, on the the algorithm other hancl, renamed then the old use when i was pushed onto to V had just been changed to an assignment S(V), it was ❑ A node X may original program contain assignments or were to the variable introduced to receive V that the value appeared in the of a o-function V. Let TopA/7er( V, X) denote the new variable in effect for V after statements in node X have been considered by the first loop in Figure Specifically, and define we consider TopAfter If there that the ( V, X) are no assignments TopAf3er( assigns V, X) is top of each ~f to V in X, then from at the end of this 9. For any variable have a rp-function for V, TopAfter(V, PRooF. o-function idom( loop i = Top(S(V)). the top-down the closest traversal dominator ensures of V and We may for X) assume = TopAfter( that V, if a node any CFG Z has edge X ~ Y such V, idom( X that X # idom( Y). Ye DF(Z), then variable X), derived from idom( idom( X)), Transactions Let . . . that TopAfter ACM V. ( V, X) on Programming U be the assigns Because Y does not Z does not assign Languages first Y does (8) node in to such a variable. = TopAfter that Y)). variable derived from V. We use this fact twice below. By Lemma 2, Ye DFIOC=l( X) G DF( X), and, thus, X does not new all 12. to V. LEMMA not S(V) where V, inherited stack for have assign the to a sequence Then (9) ( V, U). and Systems, a to a new Vol. 13, No. 4, October 1991. 474 Ron Cytron et al . Because DF(U). any U assigns But to a new U dominates Z with variable U >> Z >> idom( 7’opAfter( from The preceding form, but first V, U) = TopAfler TopAfter ( V, idom( lemma helps establish extend Condition TopAfter and just after this i = Top( S ( V ) ) before ( V, A) ~f V, where i = TOP( S( V ) ) after B, then TopA/ler( PRooF. V, A) = TopBefore( We use induction Just along the There is one top of each processing processing the path. A. V, B). path in the transformed Consider any variable before A; program V and any the original program, program is the same executing A in the transformed We consider the kth statement along the path and assume that, for all j < k, the jth statement T not from the original programG or has each variable V agreeing with TopBefore( V, T) just before program and the original before A. Case V, A) A. statement in block X, then A is followed by other hand, occurrence of a statement A along the path. If A is from then the value of Vjust before executing A in the original of TopBefore( of SSA new variable to define where 10. Consider any control flow the same path in the original program. executed is either which V, as the value program. is Z does not (3) in the definition each statement A. We consider iteration LEMMA and that ~f statement Y $ Y. For Y)), so as to specify to be the last In particular, if A happens TopAfter( V, A) = TopAfler( V, X). If, on the another that Z >> Y and there implies V, A) TopBefore( follows dominates ❑ (9). we must before V, it Z >> X because of U, U>> Z >> X from V. Therefore, corresponds to V just before and just after iteration of the first loop in Figure 12 for stack just from of Y, so U strictly Y), we get an edge X * Y. By the choice assign to a new variable derived and (8) follows derived a predecessor 1. Suppose that T. We assume show that is not A that the V agrees the first kth statement with A is from TopBefore( original statement V, A) in its just basic block Y. hypothesis Let T be the statement just before A in Y. By the induction and the semantics of assignment, V agrees with TopAfter(V, T) just T. after Case 2. Let X + Thus, Suppose Y be the V agrees that edge with A is the followed TopBefore( first by the original control V, A) just statement path. We before in its claim A. basic that block Y, V agrees with TopAfler( V, X) when control flows along this edge. If X = Entry, then TopAfter( V, X) is VO and does hold the entry value of V. If X # Entry, let T be the last statement in X. By the induction hypothesis and the semantics of assignment, V agrees with TopAfter( V, T) ‘It could be a nominal ACM Transactions Entry assignment on Programming or a @function, Languages and Systems, Vol 13, No, 4, October 1991 Static Single Assignment Form and the Control Dependence Graph ... ... + ? ~ + T . ... ... No ~ for V 475 t @(..., TopAfter(V,X), . ..) /* */ Y Y V4K V 2 Z’opAfter(V, idom(Y)) ‘1 Fig. 13. just after x+ Y. We In the proof of Lemma T and, must TopAfter( still bridge V, X) along TopBefore( may or program. Case hence, V, A) just may not 2.1. 10, the node Y mayor with TopAfter( the gap X + before and doing that V for has when knowing control that knowing A in be a @-function Suppose V, X) between Y may not have a +-function that Y. As Figure V ahead of a ~-function for V. flows V along agrees V with agrees with 13 illustrates, A in Vi+- 4(. the ..) there transformed in Y. The @ operand corresponding to X + Y is TopAfter ( V, X), thanks to the loop over successors of X in Figure 12. This is the operand whose value is assigned to Vi, which before is TopBefore( doing A in Case 2.2. V agrees just before V agrees V does not have A in ❑ the dominance and Figure that V, X) Any 3. Thus, TopAfter( doing THEOREM ing Suppose with V, A). with TopBefore( Y. program frontiers can be put and a ~-function = TopAfter( then 12. Let A ~Ot be the total V, idom( into minimal applying number in Y)) the whole average process O PROOF. dominance original Y. By Lemma SSA form the algorithms of assignments (7) of dominance E + ~ x Figure frontier I DF(X) 11 places the I + (A,o, @functions DF+ ( Y ), where By 7Each ordinary assignment each @-function contributes ACM frontier by computin Figure to variables sizes. Then 9, V, A) 11 in the of variables in Let avr.gDF be the running time of is ( program. just = TopBefore( of mentions resulting program. ~ Let MtOt be the total number the resulting program. Let E be the number of edges in CFG. the weighted V, A) Y. Transactions Theorem 2, Y for + M,O, ) . V at the nodes in the iterated is the set of assignments DF+ ( Y ) is the LHS + RHS contributes 1 to A ~Ot. on Programming avrgDF) x Languages the length set of the tuple and Systems, to V in the of nodes Vol. LHS that to A,.,, 13, No. 4, October need and 1991. 476 Ron Cytron et al. . @-functions for translation show renaming definition Let N follows be the and subsumed frontiers, Section so we to SSA form that edges, V, have with obtained the is done Condition fewest possible correctly by Figure 12 runs in OF CONTROL DEPENDENCE show we frontiers Y be nodes in that in the control reverse dependence graph postdominator is [241 are flow essentially graph. Let the X and X is If X postdominates Y but X # Y, then X strictly immediate postdominator of Y is the closest strict of Y on any path from Y to Exit. In a postdominator the tree, postdominated by X. on a CFG node X if both of the hold: (2) The node other and We associate control flow there there with edge is and here only postdominates postdominate some is also edge some from path can easily Let X and 11. if every Suppose PROOF. p: X ~ Y such that Y postdominates every the node X. X definitely from that X that Y be CFG (iff ) node that path q from U to Exit. q that reaches the first be shown there after nodes. is a nonnull p: Y. label on the definition of to the original Y postdominates path Y executing to be equivalent Then node causes avoids this control dependence from X to Y the from X that causes Y to execute. Our control dependence definition [241. LEMMA path Y does not strictly words, to execute, if term the dominance at the end of CFG. If X appears on every path from Y to Exit, then Like the dominator relation, the postdominator relation (1) There is a nonnull after X on p. of X N + E ❑ of the control children of a node X are all immediately dependent A CFG node Y is control In the Y.a reflexive and transitive. postdominates Y. The following still (2) in Lemma 10. tree has O(N) The x czurgDF). 0( AfOt In section time. of We must Condition cost of computing 0( M,O,). As explained 6. CONSTRUCTION postdominates 12. X ) I) 11 contributes definition @functions. Figure 0( N + E + Mtot) by the 0( E + x ~ I DF( so Figure 12 contributes dominance the from Lemma 8. Condition (3) follows from number of nodes in CFG. The dominator 5.1, Figure this (1) in X ~ a successor Y such that Y X on p. Y postdominates a successor U of X. Choose any Then Y appears on q. Let r be the initial segment of appearance of Y on q. For any node V on r, we can get from U to Exit by following to Exit. Because Y postdominates r to V and then by taking any path from V U but does not appear before the end of r, Y must Let postdominate X + U and every then node after V as well. proceeds along r. p be the path Then p: that starts X ~ Y and with the edge Y postdominates X on p. 8The postdominance relation in [24] is irreflexive, whereas the definition we use here is reflexive. The two relations are identical on pairs of distinct elements, We choose the reflexive defimtlon here to make postdominance the dual of the dominance relation, ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991, Static Single Assignment Form and the Control Dependence Graph build RCFG build dominator apply the tree dominance in each node X do for each node Y do each X CD(X) E - 477 RCFG Figure frontier for for for algorithm . 10 to mapping CD(X) t-- RDF(Y) find RDF the for RCFG 0 end do CD(X) U { Y } end end Fig. 14. Algorithm Node for computing the set CD(X) of nodes that are contro 1dependent on X CD(Node) Entry 1,2,8,9,11,12 1 2 3,6,7 3 4,5 4 5 Fig. 15. 6 Control dependences of theprogram in Figure5. 7 8 10 9 10 11 9,11 12 [ 2,8,9,11,12 Conversely, after given X on p. Then The reverse control a path p with these U is a successor flow graph properties, of X and RCFG let U be the first Y postdominates has the same nodes U. node [] as the control flow graph CFG, but has an edge Y + X for each edge X + Y in CFG. roles of Entry and Exit are also reversed. The postdominator relation CFG is the dominator COROLLARY on X in CFG l?ROOF. control 1. relation Let X and iff X~ DF( Y) The on on RCFG. Y be nodes in CFG. Then Y is control dependent in RCFG. Using Lemma 11 to simplify the first condition in the definition of dependence, we find that Y is control dependent on X iff Y postdomi - nates a successor of X but does not strictly postdominate X. In RCFG this means that Y dominates a predecessor of X but does not strictly dominate X; that Figure ❑ is, X~DF(Y). 14 applies this result to the computation of control After building the dominator tree for RCFG by the in time 0( Ea( E, N)), we spend O(size( RDF)) finding and then inverting them. The total time is thus depenclences. standard method [351 dominance frontiers 0( E + size( RDF)) ACM TransactIons on Programming Languages and Systems,Vol. 13, No for 4, October all 1991 478 practical graph the Ron Cytron et al. ● purposes. in Figure edge from Entry ence relation, was as a graph, FROM powerful the algorithm the control to Exit viewed 7. TRANSLATING Many By applying 5, we obtain in Figure dependence added be rooted analysis in existing and transformation target machines. out of SSA form by replacing ments. Naive translation could can be generated elimination if two This each yield already and storage control that depend- at Entry. techniques section @-function inefficient useful allocation can be applied to a program must be executed. they are generally not repredescribes with object optimization how to translate some ordinary assigncode, but efficient code are applied: dead code by coloring. Naively, a k-input @-function at entrance to a node ordinary assignments, one at the end of each control This is always correct, but these ordinary assignments good deal elimination the flow 15. Note SSA FORM programs in SSA form. Eventually, however, The @-functions have precise semantics, but sented so that to CFG would 14 to the control in Figure X can be replaced by k flow predecessor of X. sometimes perform a of useless work. If the naive replacement is preceded by dead code and then followed by coloring, however, the resulting code is efficient. 7.1 Dead Code The original Elimination source on any program as procedure live may program output). integration) become dead may have dead code Some of the intermediate may in the also introduce course (i.e., code that dead code. Code that of optimization. With sources for dead code, it is natural to perform (or repeat) tion late in the optimization process, rather than burden steps with concerns about has no effect steps in compilation so many (such once was possible dead code eliminamany intermediate dead code. Translation to SSA form is one of the compilation steps that may introduce dead code. Suppose that V is assigned and then used along each branch of an if . ..then... but that V is never used after the join point. The else.... original assignments to V are live, but the added assignment by the @ function is dead. Often, such dead +-functions are useful, as in the equivalencing and redundancy elimination algorithms that are based on SSA form [5, 431. One such use is shown in Figure 16. Although others have avoided placement of dead @functions in translating to SSA form [16, 52], we prefer to include the dead O-functions to increase optimization opportunities. There are many different definitions of dead code in the literature. Dead code is sometimes defined to be unreachable code and sometimes defined (as it is here) to be ineffectual code. In both cases, it is desirable to use the broadest code really possible definition, can be safely subject removed. ‘The definition used here is broader variables [25, p, 489]. ACM TransactIons on Programming to the correctness g A procedural condition version of the than the usual one [1, p 595] and similar Languages and Systems, Vol. 13, No. 4, October that “dead” definition to that of “faint” 1991 is Static Single Assignment Form and the Control Dependence if Pi if then Graph then do Yltl use of Y1 use end of Y1 end else do Y2 else + use of do xl Y2 Y2 use of end + xl Y2 end 4’(Y1, Y2) Y3+ . . . if 479 Pi do Yi+-1 Y3 - . 4( YI, Y2) . . . Pi then Z1 t 1 else Z2 ~ Xl Z3 + 4( ZI, Z2) use of Z3 use of Y3 Fig. 16. Ontheleft isanunoptimized program containing a dead @function that assigns toY3. The value numbering technique in [5] can determine that Y3 and Z3 have the same value, Thus, Z3 and many of the computations that produce it can be eliminated. The dead ~-function is brought tolifebyusing Y3 inplaceof Z3. more intuitive Initially, however, Marking natural than all need A statement such are to be marked version, so we prefer tentatively live marked because the dead. of the procedural Some conditions style. statements, listed below. these statements live may cause others to be marked live. When the worklist eventually empties, any statements that are still marked dead are truly (1) The a recursive statements dead and can be safely is marked statement live is one that as an 1/0 statement, call to a routine that (2) The statement already marked may removed iff at least should be assumed an assignment have the code. to affect algorithm, given conditional branches. in algorithms conditional Figure output, parameter, or a there are statements and there are statements already on this conditional branch, eliminate dead code in the narrower sense branch to be marked live [30, 31, 381.1° Our 17, goes (An unpublished one step algorithm ‘“The more readily accessible [31] contains a typographical should be consulted for the correct version. TransactIons program to a reference is an assignment statement, and live that use some of its outputs. Several published that requires every holds: side effects. (3) The statement is a conditional branch, marked live that are control dependent ACM from one of the following on Programmmg Languages further in eliminating by R. Paige error; the earlier and Systems, Vol. dead does this technical al so.) report [301 13, No. 4, October 1991. 480 Ron Cytron et al . for each if statement S do S G Pre Live then Live(S) - true else Live(S) + false end iVorkList while t- Pre Live (WorkList take # S from for each D c then do Definers(S) Live(D) if 0) 14To~kList do = false do Live(l)) t- PVorkList true - \VorkList U { ~ } end end for each If block B in CD-l Liz~e(Last(B)) then (B/ock(S)) do = false do Live(Last(B)) WorkList t true WorkList U { Last(B) } end end end for each If statement Live(S) then S do = false delete S from Block(S) end Fig. 17 The following data —~ive(S) indicates —PreLiue is the structures that is a list is the —Definers(S) ment —Last( statement ACM S is live. whose execution is initially Statements that result in 1/0 scope are typically included. of statements set whose of statements Iiveness that has provide assumed or side effects been values recently used B) is the statement is the basic l?) TransactIons is the basic that block containing block on Programming terminates that Languages basic block statement immediately by disstate- and Systems, B. S. postdominates Vol 13, No. 4, October block 1991 to outside S. –Block(S) — ipdom( are used: set of statements affect program output. the current procedure — WorkList covered. Dead code ehmination 1?. Static Single Assignment Form and the Control Dependence Graph – CD-1(B) is the set of nodes that node corresponding After the live to block algorithm are deleted. empty blocks, remove them. discovers 11 Other are control B. This the live marked live useful it might (such all those as code still 14. not marked motion) may leave optimization to can be removed dead until marked live is crucial (transitively) on themselves are by some other live statement. CD- l(Block(S)). A basic block statements is itself live. Condition whose termina- simply ~-functions. to map However, all occurrences of Vi back to V the new variables introduced by to SSA form cannot always be eliminated, because optimization capitalized on the storage independence of the new variables. The persistence of the new variables 18a) assigns can optimized to V twice by a program 18c). The leave a region introduced by the code motion (Figure yielding required seem possible all of the can be illustrated be of the B) in Figure by Coloring and to delete translation may have unless by the loop over a block with live Allocation At first, predecessors so there is already ample reason for a late The empty blocks left by dead code elimination (3) is handled tion controls 7.2 481 same as RDF( statements, optimizations along with any other empty blocks. The fact that statements are considered for condition (2). Statements that depend never dependence is the . dead moving with in the the invariant variables to V3 will program by translation in Figure and uses it twice. separate assignment example where The SSA form assignment for out separate be eliminated. VI and to SSA form 18. The source Vz are 18b) of loop, purposes These code (Figure the (Figure optimization simultaneously live. Thus, both variables are required: The original variable V cannot substitute for both renamed variables. Any graph coloring algorithm [12, 13, 17, 18, 211 can be used to reduce the number of variables needed and thereby can remove most of the associated assignment statements. The choice of coloring technique should be guided by the eventual then just use of the output. it is desirable to consider the SSA variables the SSA variables coloring changes derived should most If the goal is to produce each original from readable at once. In both assignments that were source V separately, V. If the goal is machine be considered of the variable code, coloring code, then all of cases, the process inserted to model of the @functions into identity assignments, that is, assignments of the form V - V. These identity assignments can all be deleted. Storage savings are especially noticeable for arrays. If optimization does not perturb the order of the first two statements in Figure 7, then ~arrays A ~ and A ~ can be assigned the same color and, hence, can share the same storage. The array A ~ is then assigned an Update from an identically llA colored conditional of its prior array. branch Such operations can be implemented can be deleted by transforming it to an unconditional inexpensively branch by to any one targets. ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991. 482 Ron Cytron et al. . V2G6 while (. ..) do while read V Wtv+w (. ..) W3 - V3 - read while ~(w~, W-J lj(v~, v~) end V1 WI assigning +V1+W3 (b) to just (c) really uses two instances for a variable SSA form; (c) result of code motion, one component operation V2+W1 end (a) Program that (b)unoptimized do W2 4- end Fig 18. program; (. ..) W3 + ~(w~, w~) V3 + $+(VO, V2) read VI WI +vi+w3 V2t6 W2 +V2+W1 V+6 W+v+w actual do performed if the array s share by HiddenUpdate after code motion. storage. is always (a) Source Inparticular, of this the form. 8. ANALYSIS AND MEASUREMENTS The number of nodes that contain d-functions for a variable V is a function of the program control flow structure and the assignments to V. Program structure alone determines dominance frontiers and the number of control dependence. It is possible that dominance frontiers may be larger than necessary for computing @function locations for some programs, since the actual assignments are not taken the size of the dominance frontiers control loops. flow branching (We assume branching.) ure Such linear 4. Figure For programs and Consider 19. a control results constructs predicates by the that and perform grammar suggest while-do no internal given that in Fig- the behavior is programs. while-do constructs, most two nodes. PROOF. and can be described experimental and in to if-then-else expressions programs for actual THEOREM is restricted that 19. We also give into account. In this section we prove that is linear in the size of the program when dominance a top-down Initially, flow comprised the we graph have CFG parse straight-line two nodes The initial dominance frontiers are DF(Entry) changes production, we consider the associated frontiers of nodes. When a production expands CFG using (program) if-then-else, code, of any of a program a single with of frontier node the node contains grammar in and one edge: the shown parse Entry at - tree Exit. = @ = DF(Exit). For each CFG and to the dominance a nonterminal parse tree node to S, a new subgraph is inserted into CFG in place of S. In this new subgraph, each node corresponds to a symbol on the right-hand side of the production. We will show that applying any of the productions preserves the following invariants: CFG node S corresponding to an unexpanded –Each has at most one node in its dominance frontier. ACM TransactIons on Programmmg Languages and Systems, Vol. (statement) 13, No. 4, October 1991 symbol Static Single Assignment Form and the Control Dependence Graph :: = (1) ::= (2) ::= if (3) then ::= while ::= CFG node We consider Grammar the productions this production SI and Sz. Edges leave Sz. A single flow graph tortree: additionally, (3) When s to a terminal node is applied, Sand aCF’G consider Sland symbol has edges Entry node at most +S+Exit, Sisreplaced is applied, Edges production all nodes that Sz. Thus, Te.~i~. T,~ and leave how this Szdominate production Sel.e, and enter (5) structures two yield- by twon odes previously entering and leaving S now enter edge is inserted from SI to Sz. Although the Sldominates this then? for control inturn. has changed, Nodes frontier. This production adds aCFG ing DF(S) = {Exit}. (2) When - T corresponding nodes in its dominance (4) Fig. 19. —Each do 483 else (l) . wehave a CFG DF(Sl) node previously T,~~z~. Edges affects were are inserted S is replaced from thedornina - dominateclby = DF(S) entering SI and control and S; = DIF(SJ. by nodes leaving Ti~ to both Ti~, S now S~~,.m and s~l~~; edges are also inserted from St~~~ and S~l~~ to Te~~i~. In the dominator tree, Ti ~ and T,~~i ~ both dominate all nodes that were dominated made by S. Additionally, Ti ~ dominates S~~~~ and S.l,.. IBy the argument for production (2), we have DF( T,~) = DF( S) = DF(T,~~i ~]). NOW consider nodes fr(mtier, (4) When we this Sthen and production From Sel~e. DF(S,~e.) obtain the = DF(S.l.,) is applied, definition of a a CFG node T whzle and SdO. All edges previously associated associated with node TUkil.. Edges are inserted S is replaced with from node TW~,l, from S& to TU~,l,. Node TWh,l, dominates all nodes that were by node S. Additionally, TU~,l, dominates SdO. Thus, we have U { TWhi~=) and DF(SdO) = { T~h,~,}. = DF(S) (5) After phic application of this production, ❑ to the old graph. COROLLARY and while-do For programs 2. constructs, the new control comprised every node dominance = { T.n~,fl}. flow of straight-line is control by nodes S are now to Sd. and dominated DF( TWhil.) graph is isomor- code, if-then-else, dependent on at most two nodes. PROOF. associated Consider control ACM a program flow Transactions graph P composed CFG. on Programmmg of the allowed The reverse Languages control and Systems, constructs flow Vol. graph and its R(CFG 13, No. 4~October is 1991. 484 . Ron Cytron et al. Table 1. Summary of Our Experiment Statements per procedure Statements in all procedures Package name Statistics Min Median Description Max EISPACK FL052 SPICE 7,034 2,054 14,093 22 9 8 89 54 43 327 351 753 Dense matrix eigenvectors Flow past an airfoil Circuit simulation Totals 23,181 8 55 753 221 FORTRAN itself a structured RCFG, is then Unfortunately, tures. In Figure these particular, 5. For includes leads control flow graph linearity results consider each loop, the the variable mapping needs at most is not actually computation mapping 0(n) used of dominance We implemented placing ~-functions data flow might For Y in 1, Y P’. for all program loops of the total all 4. By Corollary in to that loop entrance For n nested loops, this yet each size is 0( nz), Most of the dominance @functions, so it seems take excessive time struc- illustrated with frontier that the respect to @functions. We therefore wish to measure the nodes as a function of program size over a our algorithms in the PTRAN from hold loops. whose d-functions. in placing frontiers and control procedures frontier procedures program of repeat-until to surrounding frontier the resulting number of actual number of dominance frontier diverse set of programs. local some do not nest dominance each of the entrances to a dominance library for DF( Y) contains at most two nodes by Theorem ❑ control dependent on at most two nodes. and values flow for constructing dominance frontiers and system, which already offered the required analysis EISPACK [2]. We ran these [46] and 160 procedures algorithms from on 61 two “Perfect” [391 benchmarks. Some summary statistics of these procedures are shown in Table I. These FORTRAN programs were chosen because they contain irreducible intervals and other unstructured constructs. As the plot in Figure 20 shows, the size of the dominance frontier mapping appears to vary linearly with program size. The ratio of these sizes ranged from 0.6 (the Entry node has an empty dominance frontier) to 2.1. For the programs ~-functions is also we tested, linear the plot in the in Figure size of the 21 shows that original the number program. The ratio of of these sizes ranged from 0.5 to 5.2. The largest ratio occurred for a procedure of only 12 statements, and 95 percent of the procedures had a ratio under 2.3. All but one of the remaining procedures contained fewer than 60 statements. Finally, the plot in Figure 22 shows that graph is linear in the size of the original ranged from O.6 to 2.4, which is very dominance frontiers. The placing ACM the size of the control dependence program. The ratio of these sizes close to the range of ratios for ratio avrgDF (defined by (7) in Section 5.1) measures the cost of ~-functions relative to the number of assignments in the resulting TransactIons on Programmmg Languages and Systems, Vol. 13, No. 4, October 1991 Static Single Assignment Form and the Control Dependence Graph . 485 * 16001500140k 13001200- * 11oo- * 1000-900 — 800 — * * 700 — * * 600 — * *** 500 — ** 400 — ***** * * 300 — *:*@*?:*:: * 200 — ‘** * loo— 0 I 100 o Fig. 20. I I I I 200 300 400 500 Size of dominance frontier mapping 6~o versus number ~~oo of program statements. 1300 * 1200 110 100 900 * * 800 * 700 * * 600! * * * 500 — ** 300 — * ‘e ** ****:* *** 400 — * * *** * * * * * * loo— 0 I 100 0 Fig, 21. SSA form program. no correlation with I I I 200 I I 300 400 500 600 Number of rj-functions This ratio varied program size. versus number from of program 1 to 2, with --&-Too statements median 1.3. There was We also measured the expansion A,., / A.... in the number of assignments when translating to SSA form. This ratio varied from 1.3 to 3.8. Fina”lly, we measured the expansion MtOt / MO,,~ in the number of mentions (assignments ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991. 486 Ron Cytron et al. . 160 * 150 1400 130 120 110 100 * 900 * 800 ** 700 ! 600 ** * 400 ?@%’” ‘**** 300 1 200 — ** * * ******* 500 * * ** **2** * $* loo— 0 100 0 Fig. 22. or uses) to 9. 6.2. Size of control of variables For both 1 I 200 400 translating ratios, there I 500 dependence graph versus number when of these I 300 to SSA was form. no 600 700 of program This correlation 800 statements. ratio varied with from program 1.6 size. DISCUSSION 9.1 Summary of Algorithms The conversion (1) The graph to SSA form is done in three steps: dominance fi-ontier mapping is constructed from CFG (Section 4.2). Let CFG have N nodes and be the mapping from compute the dominator O(E and Time Bounds + Xx nodes to their tree and then the control flow E edges. Let DF dominance frontiers. The time the dominance frontiers in CFG to is I DF(X)I). (2) Using the locations of the @functions for each are determined (Section 5.1). Let A ~Ot be to variables in the resulting program, statement LHS * RHS contributes the len~h of the tuple LffS to A ~ot, and each ~-function contributes I to Placing @functions contributes 0( A,O, x aurgDF) to the overall where aurgDF is the weighted average (7) of the sizes I DF( X) 1. the dominance frontiers, variable in the original program the total number of assignments where each ordinary assignment A ,.,. time, (3) The variables mentions are renamed (Section 5.2). Let M,O, be the total number of of variables in the resulting program. Renaming contributes 0( M, O,) to the overall R To state the time bounds of the original program ACM TransactIons on Programming time. in terms be the of fewer maximum Languages and Systems, parameters, let the overall size of the relevant numbers: N Vol 13, No 4, October 1991 Static Single Assignment Form and the Control Dependence nodes, E edges, mentions ordinary AOT,~ original assignments In the worst case, of variables. assignments A tOt = !J(R2) can at worst. require In the 0( I&) worst . 487 to variables, and MO,,~ original avrgDF = ~ ( IV) = 0(R), and k insertions case, Graph of a c$function qLfunctions. has Thus, !J( 1?) operands. Thus, M,Ot = 0( R3) at worst. The one-parameter worst-case time bounds are thus 0(122) for finding dominance frontiers and 0( R3) for translation to SSA form. However, the data in Section 8 suggest that the entire translation to SSA form will is small, avrgDF be linear as is the is constant, lation process Control reverse RDF in practice. The dominance frontier of each node in CFG number of ~-functions added for each variable. In effect, A ~Ot= 0( Ao,,~), O(R). and M,O, = O(MO,,~). The entire dependence graph RCFG is the size are read (Section of the off from 6) in time output of the the dominance frontiers 0( E + size( RDF)). control Since dependence 9.2 that the control dependence in calculation is effectively of Shapiro and the the size of calculation, algorithm is linear in the size of the output. The only quadratic caused by the output being fl( R2) in the worst case. The data suggest trans- is effectively this behavior in Section is 8 0(R). Related Work Minimal SSA form is a refinement Saint’s [45] notion of a pseudoassignment. The pseudoassignment nodes for V are exactly the nodes that need o-functions for V. A closer precursor [221 of SSA form associated new names from one new name for V with pseudoassignment to another. Without nodes explicit and inserted @functions, assignments however, it was difficult to manage the new names or reason about the flow of values. Suppose the control flow graph CFG has N nodes and. E edges for a program with Q variables. One algorithm [411 requires 0( Ea( E, N)) bit vector operations (where pseudoassignments. each A simpler vector algorithm is of [43] length Q) for reducible to find all of the programs com- putes SSA form in time 0( E x Q). With lengths of bit vectors taken into account, both of these algorithms are essentially 0( R2 ) on programs of size R, and the simpler algorithm sometimes inserts extraneous o-functions. The method presented here is 0( R3) at worst, but Section 8 gives evidence that it is 0(R) in practice. The earlier 0( R2 ) algorithms have no provision for running faster in typical cases; they appear to be intrinsically quadratic. For CFG with N nodes and E edges, previous general control dependence algorithms [24] can take quadratic the worst-case 0(N) depth of the time in (N + E). This analysis is based on (post) dominator tree [24, p. 326]. Section 6 shows that control dependence can be determined by computing dominance frontiers in the reverse graph RCFG. In general, our approach can also take quadratic time, but the only quadratic behavior is caused by the output being fl( N2) in the worst case. In particular, suppose a program is comprised only of straight-line code, if-then-else, and while-do constructs. By Corollary 2, our algorithm computes control dependence in linear time. We obtain a better time bound for such programs because our algorithm is based on dominance frontiers, whose sizes are not necessarily related to the depth of ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991. 488 . Ron Cytron et al. the dominator tree. For languages that offer only these constructs, control dependence can also be computed from the parse tree [28] in linear time, but our algorithm is more robust. typical cases in linear time. It handles all cases in quadratic time and 9.3 Conclusions Previous work powerful code optimization has shown that SSA form algorithms and control that dependence are highly can support efficient in terms of time and space bounds based on the size of the program after translation to the forms. We have shown that this translation can be performed efficiently, that it leads to only a moderate increase in program size, and that applying the early steps in the SSA translation to the reverse graph is an efficient way to compute control control dependence dependence. This is strong evidence form a practical basis for optimization. that SSA form and ACKNOWLEDGMENTS We would Trina like Avery, Jin-Fan to thank Julian Shaw referees, Fran Padget, Allen for early Bob Paige, encouragement Tom for various helpful comments. thorough reports led to many whose and Fran Reps, Randy Allen, Scarborough, We are especially improvements grateful in this and to the paper. REFERENCES A. V., SETHI, R., AND ULLMAN, J. D. Compzlers: Principles, Techniques, and Tools. Addison-Wesley, Reading, Mass , 1986. ALLEN, F, E., BURKE, M,, CHARLES, P,, CYTRON, R., AND FERRAiYTE, J An overview of the Distrib. Comput. 5, (Oct. 1988), PTRAN analysis system for multiprocessing. J. Parallel 617-640. ALLEN, J. R Dependence analysis for subscripted variables and its application to program transformations. Ph.D. thesis, Department of Computer Science, Rice Univ., Houston, Tex,, Apr. 1983. ALLEN, J R , AND JOHNSON, S Compiling C for vectorization, parallelization and inline of the SIGPLAN ’88 Symposium on Compder Construction. expansion. In Proceedings SIGPLAN Not. (ACM) 23, 7 (June 1988), 241-249 ALPERN, B , WEGMAN, M, N., AND ZADECK, F. K. Detecting equality of values m programs on Principles of Programming Languages In Conference Record of the 15th ACM Symposium (Jan. 1988), ACM, New York, pp. 1-11, BALLANCE, R. A., MACCABE, A. B., AND OTTENSTEIN, K. J. The program dependence web: A representation supporting control-, data-, and demand driven interpretation of languages, In 1. AHO, 2, 3. 4 5. 6. Proceedings of (ACM) (June 1990), 257-271. 25, 6 the SIGPLAN 90 Symposkum on Compiler Construction. SIGPLAN Not. 7. BANNING, J. B, An efficient way to find the side effects of procedure calls and the aliases of Record of the 6th ACM Syrnpoa~um on Principles of Programming variables. In Conference Languages (Jan. 1979) ACM, New York, pp. 29-41 8. BARTH, J. M. An interprocedural data flow analysis algorithm. In Conference Record of the 4th ACM York: pp. Symposium 9. BURKE, M. flow on Principles An interval-based analysis. ACM Trans. approach Program 10. BURKE, M,, AND CYTRON, R. Proceedings (ACM) ACM of Programmmg Languages (Jan. 1977) ACM, New 119-131. of the 21, 7 (June Transactions SIGPLAN to exhaustive Lang. Interprocedural 86 Symposium Syst. and incremental 12, 3 (July dependence on Compiler interprocedural 1990), 341-395. analysis and parallelization. Construction SIGPLAN 1986), 162-175. on Programming Languages and Systems, Vol. 13, No. 4, October 1991 data In Not Static Single Assignment Form and the Control Dependence 11. CARTWRIGHT, R., AND FELLEISEN, M. of the SIGPLAN 89 Symposium 1989), 13-27. 12. CHAITIN, G. J. SIGPLAN 82 Register allocation Symposium on The semantics on Compiler and spilling Compiler of program Construction. . dependence. SIGPLAN via graph Construction. Graph Not. coloring. SIGPLAN 489 In Proceedings 24, 7 (July (ACM) In Proceedings Not. (ACM) 17, of the 6 (June 1982), 98-105. 13. CHAITIN, G. J., AUSLANDER, M. A., CHANDRA, A. K., COCICE, J., HOPKINS, M. IE., AND MARKSTEIN, P. W. Register allocation via coloring. Comput. Lang. 6 (1981), 47-57. 14. CHASE, D. R. Safety considerations for storage allocation optimizations. In Proceedings of the SIGPLAN 88 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 23, 7 (June 1988), 1-10. 15. CHASE, D. R., WEGMAN, M. AND ZADECK, F. K. Analysis of pointers and structures. In Proceedings of the SIGPLAN 90 Symposium on Compiler Construction. SIGPLAN Not. 20. 1990), 296-310. CHOI, J., CYTRON, R., AND FERRANTE, J. Automatic construction of sparse data flow evaluaRecord of the 18th ACM Symposium on Principles of Programtion graphs. In Conference ming Languages (Jan, 1991). ACM, New Yorkj pp. 55-66. CHOW, F. C. A portable machine-independent global optimizer—Design and measurements. Ph. D. thesis and Tech. Rep. 83-254, Computer Systems Laboratory, Stanford Univ., Stanford, Calif., Dec. 1983. CHOW, F. C., AND HENNESSY, J. L. The priority-based coloring approach to register allocaLang. Syst. 12, 4 (Oct. 1990), 501-536. tion. ACM Trans. Program. COOPER, K. D. Interprocedural data flow analysis in a programming environment. Ph.D. thesis, Dept. of Mathematical Sciences, Rice Univ., Houston, Tex., 1983. CYTRON, R., AND FERRANTE, J. An improved control dependence algorithm. Tech. Rep. RC 21. CYTRON, R., AND FERRANTE, J. (ACM) 16, 17. 18. 19. 25, (June 13291, IBM Corp., Armonk, N. Y., 1987. What’s in a name? In Proceedings of the 1987 International (Aug. 1987), pp. 19-27. CYTRON, R., LOWRY, A., AND ZADECK, F. K. Code motion of control structures in high-level on Principles of Programming languages. In Conference Record of the 13th ACM Symposium Languages (Jan. 1986). ACM, New York, pp. 70-85. DENNIS, J. B. First version of a data flow procedure language. Tech. Rep. Comput. Strut. Group Memo 93 (MAC Tech. Memo 61), MIT, Cambridge, Mass., May 1975. FERRANTE, J., OTTENSTEIN, K. J., AND WARREN, J. D. The program dependence gmph and ACM Trans. Program. Lang. Syst. 9, 3 (July 1987), 319– 349. its use in optimization. ACM GIEGERICH, R. A formal framework for the derivation of machine-specific optimizers. Trans. Program. Lang. Syst. 5, 3 (July 1983), 478-498. HAREL, D. A linear time algorithm for finding dominators in flow graphs and related of the 17th ACM Symposium on Theory of Computing (May 1985). problems. In Proceedings Conference 22, 23. 24. 25. 26. on Parallel Processing ACM, New York, pp. 185-194. 27. HORWITZ, S., PFEIFFER, P., AND REPS, T. Proceedings of the SIGPLAN 89 Dependence Symposium on analysis Compiler for pointer Construction. variables. SIGPLf~N In Not. (ACM) 24, 7 (June 1989). 28. HORWITZ, S., PRINS, J., AND REPS, T. Integrating non-interfering versions of programs. 1989), 345-387. JONES, N. D., AND MUCHNICK, S. S. Flow analysis and optimization of lLISP-like structures. Flow Analysis, S. S. Muchnick and N, D. Jones, Eds. Prentice-Hall, Englewood In Program Cliffs, N. J., 1981, chap. 4, pp. 102-131. KENNEDY, K. W. Global dead computation elimination. Tech. Rep. SETL Newsl. 111, Courant Institute of Mathematical Sciences, New York Univ., New York, N. Y., Aug. 1973. In Program Flow Analysis, KENNEDY, K. W. A survey of data flow analysis techniques. S. S. Muchnick and N. D. Jones, Eds. Prentice-Hall, Englewood Cliffs, N. J., 1981. of Computers and Computations. Wiley, New York, 1978. KUCK, D. J. The Structure LARUS, J. R. Restructuring symbolic programs for concurrent execution on multiprocessors. at Berkeley, Tech. Rep. UCB/CSD 89/502, Computer Science Dept., Univ. of California Berkeley, Calif., May 1989. ACM 29. 30. 31. 32. 33. Trans. Program. Lang. ACM Transactions Syst. 11, 3 (July on Proq-amming Languages and Systems, Vol. 13, No. 4, October 1991. 490 34 . LARUS, J Ron Cytron et al R., AND HILFINGER, P. N Proceedings of the ACM SIGPLAN Detecting 88 Sympostum conflicts between on Compder structure Construction. accesses SIGPLAN In Not. 23, 7 (July 1988), 21-34, 35. LENGAUER, T., AND TARJAN, R. E. A fast algorithm for finding dominators m a flowgraph. 1 (July 1979), 121-141 Prentice-Hall, EngleProgram Flow Analysis. MUCHNICK, S. S., AND JONES, N. D , EDS wood Cliffs, NJ., 1981 Record of the MYEKS, E. W. A precise interprocedural data flow algorithm, In Conference 8th ACM Symposium on Principles of Programming Languages (Jan. 1981). ACM, New York, pp. 219-230. OTTENSTEIN, K. J. Data-flow graphs as an intermediate form. Ph.D. thesis, Dept. of Computer Science, Purdue Univ., W. Lafayette, Ind., Aug. 1978. POINTER, L. Perfect report: 1. Tech. Rep CSRD 896, Center for Supercomputing Research and Development, Univ. of Illinois at Urbana-Champaign, Urbana, 111,, July 1989 REIF, J. H., AND LEWIS, H. R. Efficient symbolic analysis of programs. J. Comput. Syst, SCZ. 32, 3 (June 1986), 280-313. REIF, J. H., AND TARJAN, R. E Symbolic program analysis in almost linear time. SIAM J. Comput. 11, 1 (Feb. 1982), 81-93 ROSEN, B. K. Data flow analysis for procedural languages J. ACM 26, 2 (Apr. 1979), 322-344. ROSEN, B. K , WEGMAN, M. N., AND ZADECK, F. K. Global value numbers and redundant Record of the 15th ACM Symposui m on Przn ciples of Programcomputations. In Conference ming Languages, (Jan. 1988). ACM, New York, pp. 12-27, RUGGIERI, C., AND MURTAGH, T. P. Lifetime analysis of dynamically allocated objects. In ACM 36. 37 38. 39. 40 41. 42. 43. 44. Trans. Program. Conference Record Lang. Syst. of the 15th 1, ACM Symposzum on Principles of Programming Languages (Jan. 1988). ACM, New York, pp. 285-293. The representation of algorithms. Tech. Rep, CA-7002-1432, 45. SHAPIRO, R. M , AND SAINT, H Massachusetts Computer Associates, Feb. 1970. 46. SMITH, B. T,, BOYLE, J. M., DONGARRA, J. J,, GARBOW, B, S., IIIEBE, Y., KLEMA, V. C,, AND MOLER, C B. Matrzx Eigensystem Routines – Eispack Guide. Springer-Verlag, New York, 1976. 3, 1 (1974), 62-89, 47 TARJAN, R. E. Finding dominators in directed graphs. SIAM J. Comput Property extraction in well-founded property sets. IEEE Trans. Softw. Eng. 48. WEGBREIT, B SE-1, 3 (Sept. 1975), 270-285. 49. WEGMAN, M. N , AND ZADECK, F, K. Constant propagation with conditional branches. In Conference Record of the 12th ACM Symposium on Prlnczples of Programm mg Languages (Jan.). ACM, New York, pp. 291-299. 50. WEGMAN, M. N., AND ZADECK, F. K. Constant propagation with conditional branches. ACM Trans. Program. Lang. Syst. To be published. 51 WOLFE, M. J. Optimizing supercompilers for supercomputers. Ph.D, thesis, Dept of Computer Science, Univ. of Illinois at Urbana-Champaign, Urbana 111., 1982 52. YANG, W., HORWITZ, S., AND REPS, T. Detecting program components with equivalent behaviors. Tech Rep. 840, Dept. of Computer Science, Univ. of Wisconsin at Madison, Madison, Apr. 1989 Received July ACM 1989; revised TransactIons March on Programming 1991; accepted March Languages and Systems, 1991 Vol 13, No 4, October 1991