Reading Material + Announcements

Walter Melton
5 years ago
1 Reading Material + Announcements
Reminder: HW 1
» Before asking questions: 1) read all threads on Piazza, 2) think a bit
  • Then, post your question
  • Talk to Animesh if you are stuck
Today's class
» Wrap up Control Flow Analysis
» Introduction to Dataflow Analysis
» Compilers: Principles, Techniques, and Tools (2nd edition), A. Aho, M. Lam, R. Sethi, and J. Ullman, Addison-Wesley. (Section 9.2)

2 Class Problems
Find the traces. Assume a threshold probability of 60%.
[Figure: control flow graph with profile weights on its edges]

3 Class Problems
Create the superblocks; the trace threshold is 60%.
[Figure: control flow graph with profile weights on its edges]

4 Class Problem Solution: Superblock Formation
Create the superblocks; the trace threshold is 60%.
Each color represents a trace.
[Figure: the control flow graph with each trace shown in its own color]

5 Class Problem Solution: Superblock Formation
Create the superblocks; the trace threshold is 60%.
Each color represents a trace.
To convert a trace into a superblock, BB4 is duplicated and the edge weights are adjusted.
[Figure: control flow graphs showing the traces and the resulting superblocks]

6 Class Problem From Last Time - Answer
if (a > 0) {
    r = t + s
    if (b > 0 || c > 0)
        u = v + 1
    else if (d > 0)
        x = y + 1
    else
        z = z + 1
}
a. Draw the CFG
b. Compute CD
c. If-convert the code
[Figure: CFG with blocks BB1 to BB8, edges labeled a <= 0, a > 0, b <= 0, b > 0, c <= 0, c > 0, d <= 0, d > 0, and a table of control dependences (CD) per block]
If-converted code:
    p3 = 0
    p1 = CMPP.UN (a > 0) if T
    r = t + s if p1
    p2,p3 = CMPP.UC.ON (b > 0) if p1
    p4,p3 = CMPP.UC.ON (c > 0) if p2
    u = v + 1 if p3
    p5,p6 = CMPP.UC.UN (d > 0) if p4
    x = y + 1 if p6
    z = z + 1 if p5
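
A small Python sketch can check that the if-converted code in this answer executes the same assignments as the original branches. The `cmpp` helper models the UN, UC, and ON predicate types with their usual HPL-PD-style meaning, which is an assumption here because the slide does not define them. The function names and the set-of-names return value are made up for the illustration.

```python
from itertools import product

def cmpp(kind, cond, guard, old):
    """One destination of a compare-to-predicate op (assumed semantics)."""
    if kind == "UN":   # unconditional, normal: guard AND cond
        return guard and cond
    if kind == "UC":   # unconditional, complement: guard AND NOT cond
        return guard and not cond
    if kind == "ON":   # OR-type, normal: set if guard AND cond, else keep old value
        return True if (guard and cond) else old
    raise ValueError(kind)

def branching(a, b, c, d):
    """Original code; returns the names of the variables that get assigned."""
    done = set()
    if a > 0:
        done.add("r")
        if b > 0 or c > 0:
            done.add("u")
        elif d > 0:
            done.add("x")
        else:
            done.add("z")
    return done

def predicated(a, b, c, d):
    """If-converted code; each Python 'if p' stands for an op guarded by p."""
    done = set()
    p3 = False                              # p3 = 0
    p1 = cmpp("UN", a > 0, True, None)      # p1 = CMPP.UN (a > 0) if T
    if p1: done.add("r")                    # r = t + s if p1
    p2 = cmpp("UC", b > 0, p1, None)        # p2,p3 = CMPP.UC.ON (b > 0) if p1
    p3 = cmpp("ON", b > 0, p1, p3)
    p4 = cmpp("UC", c > 0, p2, None)        # p4,p3 = CMPP.UC.ON (c > 0) if p2
    p3 = cmpp("ON", c > 0, p2, p3)
    if p3: done.add("u")                    # u = v + 1 if p3
    p5 = cmpp("UC", d > 0, p4, None)        # p5,p6 = CMPP.UC.UN (d > 0) if p4
    p6 = cmpp("UN", d > 0, p4, None)
    if p6: done.add("x")                    # x = y + 1 if p6
    if p5: done.add("z")                    # z = z + 1 if p5
    return done

# Compare both versions on every sign combination of a, b, c, d.
all_match = all(branching(*v) == predicated(*v)
                for v in product((-1, 1), repeat=4))
```

Checking all 16 sign combinations is enough here because every condition only tests whether a value is positive.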

7 If-conversion
Positives
» Removes the branch
  • No disruption to sequential fetch
  • No prediction or mispredict
» Increases potential for operation overlap: bigger BBs
» Enables more aggressive compiler xforms: software pipelining
Negatives
» Instruction execution is additive for all BBs that are if-converted, thus requires more processor resources
» Executing or waiting for useless operations
[Figure: example control flow graph with edge weights]

8 Predicated Execution
Processors:
» Intel Itanium
» Intel X86
» ARM
Reading: "Effective Compiler Support for Predicated Execution Using the Hyperblock"

9 Control Flow Analysis Summary
» Basic blocks
» Control Flow Graphs
» Dominator/immediate dominator/post dominator/dom tree
» Identify natural loops: header, back edges
» Regions (beyond BBs):
  • Traces
  • Superblocks
  • Profiling over these regions
» Alternatives to branches: predicated execution

10 Next Topic: Dataflow Analysis + Optimization

11 Looking Inside the Basic Blocks: Dataflow Analysis + Optimization
Control flow analysis
» Treat BB as black box
» Just care about branches
Now
» Start looking at ops in BBs
» What's computed and where
Classical optimizations
» Want to make the computation more efficient
Ex: Common Subexpression Elimination (CSE)
» Is r2 + r3 redundant?
» Is r4 - r5 redundant?
» What if there were 1000 BBs?
» Dataflow analysis!!
[Figure: CFG of three blocks: (r1 = r2 + r3; r6 = r4 - r5), (r6 = r2 + r3; r7 = r4 - r5), (r4 = 4; r6 = 8)]

12 Dataflow Analysis Introduction
Dataflow analysis: collection of information that summarizes the creation/destruction of values in a program. Used to identify legal optimization opportunities.
Pick an arbitrary point in the program
» Which VRs contain useful data values? (liveness)
» Which definitions may reach this point? (reaching defns)
» Which definitions are guaranteed to reach this point? (available defns)
[Figure: CFG of three blocks: (r1 = r2 + r3; r6 = r4 - r5), (r4 = 4; r6 = 8), (r6 = r2 + r3; r7 = r4 - r5)]

13 Live Variable (Liveness) Analysis
Defn: For each point p in a program and each variable y, determine whether y can be used before being redefined starting at p
» In other words, there is a use of the variable y along some path from the point p to the exit, with no redefinition of y before that use.
» Example: x = ...; ... = x; x = ...
» Useful for dead code elimination
» Example: a = b + c b = c d = a
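
As a rough illustration of how liveness drives dead code elimination, the sketch below scans one straight-line block backward and reports assignments whose result is never used before being redefined. The `(dest, [srcs])` tuple format, the function name, and the sample ops are all made up for this example, and it assumes ops have no side effects.

```python
def dead_assignments(ops, live_out):
    """ops: list of (dest, [srcs]) in program order.
    live_out: variables live at the end of the block.
    Returns the indices of ops whose destination is dead."""
    live = set(live_out)
    dead = []
    for i in range(len(ops) - 1, -1, -1):   # walk from the last op to the first
        dest, srcs = ops[i]
        if dest not in live:
            dead.append(i)      # value is never used before being redefined
            continue            # a removed op's sources do not become live
        live.discard(dest)      # the def ends the live range (going backward)
        live.update(srcs)       # each use starts one
    return sorted(dead)

# Hypothetical block following the "x = ...; ... = x; x = ..." pattern.
ops = [
    ("x", ["a"]),   # 0: x = a
    ("t", ["x"]),   # 1: t = x
    ("x", ["b"]),   # 2: x = b   (x is not used afterwards)
    ("y", ["t"]),   # 3: y = t
]
dead = dead_assignments(ops, live_out={"y"})
```

Here only op 2 is reported, since the first definition of x is consumed by op 1 while the second is never read.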

14 Algorithm sketch
» Backward dataflow analysis, as propagation occurs from uses upwards to defs
» For each BB, y is live at entry if it is used before being defined in the BB, or if it is live leaving the block and not defined in the BB
4 sets
» IN = set of variables that are live at the entry point of a BB
» OUT = set of variables that are live at the exit point of a BB
» GEN = set of external variables consumed in the BB
» KILL = set of external variable uses killed by the BB
  • Equivalent to the set of variables defined by the BB
Transfer function and meet function
» Transfer: IN = GEN + (OUT - KILL)
» Meet: OUT = Union(IN(succs))

15 Computing GEN/KILL Sets For Each BB
for each basic block in the procedure, X, do
    GEN(X) = 0
    KILL(X) = 0
    for each operation in reverse sequential order in X, op, do
        for each destination operand of op, dest, do
            GEN(X) -= dest
            KILL(X) += dest
        endfor
        for each source operand of op, src, do
            GEN(X) += src
            KILL(X) -= src
        endfor
    endfor
endfor
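
The pseudocode on this slide translates almost line for line into Python sets. The sketch below is one possible rendering; the `(dest, [srcs])` tuple format and the function name are assumptions made for the example. It is applied to the three ops of BB1 from the liveness example (r1 = MEM[r2+0], r2 = MEM[r1 + 1], r8 = r1 * r2).

```python
def liveness_gen_kill(ops):
    """ops: list of (dest, [srcs]) in program order for one basic block.
    Returns (GEN, KILL) as sets of variable names."""
    gen, kill = set(), set()
    for dest, srcs in reversed(ops):    # reverse sequential order
        gen.discard(dest)               # GEN(X) -= dest
        kill.add(dest)                  # KILL(X) += dest
        for src in srcs:
            gen.add(src)                # GEN(X) += src
            kill.discard(src)           # KILL(X) -= src
    return gen, kill

bb1 = [
    ("r1", ["r2"]),         # 1. r1 = MEM[r2+0]
    ("r2", ["r1"]),         # 2. r2 = MEM[r1 + 1]
    ("r8", ["r1", "r2"]),   # 3. r8 = r1 * r2
]
gen, kill = liveness_gen_kill(bb1)
```

For BB1 this gives GEN = {r2} and KILL = {r1, r8}: r2 is read before the block writes it, so it is consumed from outside and also drops out of KILL.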

16 Compute IN/OUT Sets for all BBs
initialize IN(X) to 0 for all basic blocks X
change = 1
while (change) do
    change = 0
    for each basic block in procedure, X, do
        old_in = IN(X)
        OUT(X) = Union(IN(Y)) for all successors Y of X
        IN(X) = GEN(X) + (OUT(X) - KILL(X))
        if (old_in != IN(X)) then
            change = 1
        endif
    endfor
endwhile
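
A minimal Python version of this iteration is sketched below. The diamond-shaped CFG (B1 branches to B2 and B3, which both fall into B4) and its GEN/KILL sets are hypothetical, chosen only to keep the example small; the function name and dictionary layout are likewise assumptions.

```python
def liveness(succs, gen, kill):
    """succs: {block: [successor blocks]}; gen/kill: {block: set of variables}.
    Returns (IN, OUT) dictionaries of live-variable sets."""
    live_in = {b: set() for b in succs}      # initialize IN(X) to empty
    live_out = {b: set() for b in succs}
    changed = True
    while changed:
        changed = False
        for b in succs:
            old_in = live_in[b]
            # OUT(X) = union of IN(Y) over all successors Y of X
            live_out[b] = set().union(*(live_in[s] for s in succs[b]))
            # IN(X) = GEN(X) + (OUT(X) - KILL(X))
            live_in[b] = gen[b] | (live_out[b] - kill[b])
            if live_in[b] != old_in:
                changed = True
    return live_in, live_out

# Hypothetical CFG: B1: a = 1; b = 2   B2: c = a + b   B3: c = a   B4: use c
succs = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
gen = {"B1": set(), "B2": {"a", "b"}, "B3": {"a"}, "B4": {"c"}}
kill = {"B1": {"a", "b"}, "B2": {"c"}, "B3": {"c"}, "B4": set()}
live_in, live_out = liveness(succs, gen, kill)
```

Because liveness is a backward problem, visiting blocks in reverse order would usually converge in fewer passes; the forward order is kept here only to mirror the slide.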

17 Example Liveness Computation
BB1
    1. r1 = MEM[r2+0]
    2. r2 = MEM[r1 + 1]
    3. r8 = r1 * r2
BB2
    4. r1 = r…
    5. r3 = r5 - r1
    6. r7 = r3 * 2
BB3
    7. r2 = 0
    8. r7 = r1 + r2
    9. r3 = 4
BB4
    10. r3 = r3 + r7
    11. r1 = r2 - r8
    12. r3 = r1 * …
Rules
    GEN -= dest, GEN += src
    KILL += dest, KILL -= src
    OUT = Union(IN(succs))
    IN = GEN + (OUT - KILL)

18 Class Problem
Compute liveness
» Calculate GEN/KILL for each BB
» Calculate IN/OUT for each BB
    1. r1 = 3
    2. r2 = r3
    3. r3 = r4
    4. r1 = r…
    5. r7 = r1 * r2
    6. r4 = r…
    7. r4 = r3 + r2
    8. r8 = 8
    9. r9 = r7 + r8
[Figure: CFG grouping these operations into basic blocks]

19 Reaching Definition Analysis (rdefs)
A definition of a variable x is an operation that assigns, or may assign, a value to x.
A definition d reaches a point p if there is a path from the point immediately following d to p such that d is not killed along that path.
A definition of a variable is killed between 2 points when there is another definition of that variable along the path
» r1 = r2 + r3 kills previous definitions of r1
Liveness vs Reaching defs
» Liveness → variables (e.g., virtual registers), don't care about specific users
» Reaching defs → operations, each def is different
» Forward dataflow analysis, as propagation occurs from defs downwards (liveness was backward analysis)

20 Compute Rdef GEN/KILL Sets for each BB
GEN = set of definitions created by an operation
KILL = set of definitions destroyed by an operation
- Assume each operation only has 1 destination for simplicity, so just keep track of ops
for each basic block in the procedure, X, do
    GEN(X) = 0
    KILL(X) = 0
    for each operation in sequential order in X, op, do
        for each destination operand of op, dest, do
            G = op
            K = {all ops which define dest} - op
            GEN(X) = G + (GEN(X) - K)
            KILL(X) = K + (KILL(X) - G)
        endfor
    endfor
endfor
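
The same computation can be sketched in Python as follows. Only the op numbers and destination registers matter for GEN/KILL, so the input below lists just those, taken from the four-block rdef example (ops 1 to 12). The `(op_id, dest)` tuple format and the function name are assumptions for this illustration.

```python
def rdef_gen_kill(blocks):
    """blocks: {name: [(op_id, dest), ...]} with ops in sequential order.
    Returns (GEN, KILL) dictionaries of op-id sets, one entry per block."""
    # Collect every op in the procedure that defines each variable.
    defs_of = {}
    for ops in blocks.values():
        for op_id, dest in ops:
            defs_of.setdefault(dest, set()).add(op_id)

    gen, kill = {}, {}
    for name, ops in blocks.items():
        g_x, k_x = set(), set()
        for op_id, dest in ops:
            g = {op_id}                 # G = op
            k = defs_of[dest] - g       # K = {all ops which define dest} - op
            g_x = g | (g_x - k)         # GEN(X) = G + (GEN(X) - K)
            k_x = k | (k_x - g)         # KILL(X) = K + (KILL(X) - G)
        gen[name], kill[name] = g_x, k_x
    return gen, kill

blocks = {
    "BB1": [(1, "r1"), (2, "r2"), (3, "r8")],
    "BB2": [(4, "r1"), (5, "r3"), (6, "r7")],
    "BB3": [(7, "r2"), (8, "r7"), (9, "r3")],
    "BB4": [(10, "r3"), (11, "r1"), (12, "r3")],
}
gen, kill = rdef_gen_kill(blocks)
```

BB4 shows why GEN is updated with K: op 12 redefines r3, so op 10 is removed from GEN(BB4) and added to KILL(BB4).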

21 Compute Rdef IN/OUT Sets for all BBs
IN = set of definitions reaching the entry of BB
OUT = set of definitions leaving BB
initialize IN(X) = 0 for all basic blocks X
initialize OUT(X) = GEN(X) for all basic blocks X
change = 1
while (change) do
    change = 0
    for each basic block in procedure, X, do
        old_out = OUT(X)
        IN(X) = Union(OUT(Y)) for all predecessors Y of X
        OUT(X) = GEN(X) + (IN(X) - KILL(X))
        if (old_out != OUT(X)) then
            change = 1
        endif
    endfor
endwhile
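
Below is a small Python sketch of this forward iteration. The CFG is hypothetical: B1 enters a loop header B2, the loop body B3 jumps back to B2, and B4 is the exit. Definitions 1 (x) and 2 (y) sit in B1 and definition 3 (x) sits in B3; these ids, the GEN/KILL sets, and the function name are assumptions made for the example.

```python
def reaching_defs(preds, gen, kill):
    """preds: {block: [predecessor blocks]}; gen/kill: {block: set of def ids}.
    Returns (IN, OUT) dictionaries of reaching-definition sets."""
    rd_in = {b: set() for b in preds}            # IN(X) = empty
    rd_out = {b: set(gen[b]) for b in preds}     # OUT(X) = GEN(X)
    changed = True
    while changed:
        changed = False
        for b in preds:
            old_out = rd_out[b]
            # IN(X) = union of OUT(Y) over all predecessors Y of X
            rd_in[b] = set().union(*(rd_out[p] for p in preds[b]))
            # OUT(X) = GEN(X) + (IN(X) - KILL(X))
            rd_out[b] = gen[b] | (rd_in[b] - kill[b])
            if rd_out[b] != old_out:
                changed = True
    return rd_in, rd_out

preds = {"B1": [], "B2": ["B1", "B3"], "B3": ["B2"], "B4": ["B2"]}
gen = {"B1": {1, 2}, "B2": set(), "B3": {3}, "B4": set()}
kill = {"B1": {3}, "B2": set(), "B3": {1}, "B4": set()}
rd_in, rd_out = reaching_defs(preds, gen, kill)
```

Both definitions of x (1 and 3) reach the loop header, because one arrives from B1 and the other along the back edge from B3.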

22 Example Rdef Calculation
BB1
    1. r1 = MEM[r2+0]
    2. r2 = MEM[r1 + 1]
    3. r8 = r1 * r2
BB2
    4. r1 = r…
    5. r3 = r5 - r1
    6. r7 = r3 * 2
BB3
    7. r2 = 0
    8. r7 = r1 + r2
    9. r3 = 4
BB4
    10. r3 = r3 + r7
    11. r1 = r2 - r8
    12. r3 = r1 * …
Rules
    G = op
    K = {all ops which define dest} - op
    GEN(X) = G + (GEN(X) - K)
    KILL(X) = K + (KILL(X) - G)
    IN = Union(OUT(preds))
    OUT = GEN + (IN - KILL)

23 Class Problem
Compute reaching defs
» Calculate GEN/KILL for each BB
» Calculate IN/OUT for each BB
    1. r1 = 3
    2. r2 = r3
    3. r3 = r4
    4. r1 = r…
    5. r7 = r1 * r2
    6. r4 = r…
    7. r4 = r3 + r2
    8. r8 = 8
    9. r9 = r7 + r8
[Figure: CFG grouping these operations into basic blocks]

24 DU/UD Chains
Convenient way to access/use reaching defs info
Def-Use chains
» Given a def, what are all the possible consumers of the operand produced
» Maybe consumer
Use-Def chains
» Given a use, what are all the possible producers of the operand consumed
» Maybe producer
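
One way to build both chains is to run reaching definitions and then, at each use, record the reaching defs of the variable being read. The sketch below does this for a tiny hypothetical CFG (B1 flows to B2 and B3, B2 flows to B3). The `(op_id, dest, [srcs])` format and all names are assumptions for the illustration, and each op is assumed to have exactly one destination.

```python
from collections import defaultdict

def du_ud_chains(blocks, preds):
    """blocks: {name: [(op_id, dest, [srcs]), ...]}; preds: {name: [pred names]}.
    Returns (du, ud): def op -> consumer ops, (use op, variable) -> producer ops."""
    defs_of = defaultdict(set)
    for ops in blocks.values():
        for op_id, dest, _ in ops:
            defs_of[dest].add(op_id)
    var_of = {op_id: dest for ops in blocks.values() for op_id, dest, _ in ops}

    def flow(reaching, ops, visit=None):
        """Push a set of reaching defs through a block, op by op."""
        reaching = set(reaching)
        for op_id, dest, srcs in ops:
            if visit:
                visit(op_id, srcs, reaching)    # defs reaching this op's uses
            reaching = (reaching - defs_of[dest]) | {op_id}
        return reaching

    # Iterate reaching definitions to a fixed point.
    rd_in = {b: set() for b in blocks}
    rd_out = {b: flow(set(), blocks[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            rd_in[b] = set().union(*(rd_out[p] for p in preds[b]))
            new_out = flow(rd_in[b], blocks[b])
            if new_out != rd_out[b]:
                rd_out[b], changed = new_out, True

    du, ud = defaultdict(set), defaultdict(set)
    def record(op_id, srcs, reaching):
        for s in srcs:
            for d in reaching:
                if var_of[d] == s:      # d is a possible producer of s
                    ud[(op_id, s)].add(d)
                    du[d].add(op_id)
    for b in blocks:
        flow(rd_in[b], blocks[b], record)
    return dict(du), dict(ud)

blocks = {
    "B1": [(1, "x", []), (2, "y", ["x"])],   # 1. x = 5   2. y = x
    "B2": [(3, "x", ["y"])],                 # 3. x = y
    "B3": [(4, "z", ["x", "y"])],            # 4. z = x + y
}
preds = {"B1": [], "B2": ["B1"], "B3": ["B1", "B2"]}
du, ud = du_ud_chains(blocks, preds)
```

Op 4 reads x, which may come from op 1 (path B1 to B3) or op 3 (path through B2), so its UD chain for x holds both "maybe producers".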

25 Example DU/UD Chains
    1. r1 = 3
    2. r2 = r3
    3. r3 = r4
    4. r1 = r…
    5. r7 = r1 * r2
    6. r4 = r…
    7. r4 = r3
    8. r8 = 8
    9. r9 = r7 + r8
[Figure: CFG grouping these operations into basic blocks]

26 To Be Continued

EECS 583 Class 7: Classic Code Optimization cont'd
University of Michigan, October 2, 2016
Global Constant Propagation
Consider 2 ops, X and Y in different BBs
» 1. X is a move
» 2. src1(X) is a literal
» 3.