SU-CS143 MAY202025

Its optimization time!!

Program Optimization

“maximum benefit for minimal cost”

When should optimization happen?

AST?

pro: machine independent
con: no notion registers, not language dependent

Assembly?

pro: many optimization opportunities
con: machine dependent

And so, we really should be doing this with some IR.

Three-address intermediate code

Each instruction is of the form:

x := y op z
x := op z

where x, z are registers, constants, etc.

basic block

A basic block is a sequence with no labels (except in the first point) an no jumps (except in the last point). So when we are inside a basic block, we can optimize the code inside sans worry about control flow.

control-flow graph

a control-flow graph is a directed graph with….

basic blocks as nodes
as edge from \(A\) to \(B\) if an execution can pass from the last instruction in \(A\) to first instruction in \(B\)

Three Levels of Optimizations

local optimizations: apply to a basic block in isolation
global optimizations: apply to a control-flow graph
inter-procedural optimizations: optimizations between graphs, such as inclining

some stuff you should run

basic block
dataflow
loop
instruction optimization / peephole
register

some special oop stuff

function inclining
method call targets
class unboxing

some functional stuff

tail recursion
deforestation

basic block optimizations

algebraic simplification

x := x + 0  => x := x
x := x * 1 => x := x
y := y ** 2 => y := y*y
x := x * 8 => x := x << 3

on --ffast-math

-- only works on ints (since nan * 0 = nan)
x := x * 0 => x := 0

constant folding

x := y op z

with constant y and z, we can just fold it. so

x := 2+2 => x := 4

static single-assignment form

rewrite intermediate code in SSA form <= never reassign variables. Meaning, if two assignments end up with the same rhs, they compute the same value. Namely:

So in SSA,

x := y + z
...
w := y + z

we know that x and w can’t be redefined in SSA; this means that we cloud write:

x := y + z
...
w := x

copy propagation

so we can keep writing using static single-assignment form replacements. In particular note that once we are just assigning a variable to another, the first variable can just be replaced with the second everywhere.

b := z + y
a := b
x := 2 * a

we can stick b into a

b := z + y
a := b
x := 2 * b

and then we see that `a := b` is deadcode.

b := z + y
x := 2 * b

dead code elimination

within a basic block, only one type of dead code

Suppose “w := rhs” appears in a basic block, but “w” doesn’t appear anywhere else in the program. We can say “w := rhs” is dead and can be eliminated.

peephole optimization

a “peephole” is a sequence of contiguous instructions; the optimizer replaces the sequence with another equivalent one, but faster.

global optimizations

How do we apply local optimizations to a global scope?

Generally, to prove postulate P at any point requires knowing the entire function; so we either know X is definitely true, or we say “we don’t know.”

example

Consider the branching order:

(X := 3, B > 0) => {(Y := Z + W), (Y := 0)} => (A := 2*x)

in this control flow graph, we never touch x in the middle, therefore we can just propagate x forward by replace it.

program point

every statement is associated with 2 program points:

right before a statement
right after a statement

global constant propagation

Let’s consider three states x can be:

value	interpretation
\(\top\)	not constant
\(c\)	constant, on every path
\(\bot\)	unreachable code

so by the time you hit a program point, if \(x=c\), then its constant. To perform this labeling, we must…

transfer function
for “in”(to a statement) and “out” (of a statement), we have, slides 21 and more https://web.stanford.edu/class/cs143/lectures/lecture15.pdf.

an algorithm
For every entry \(s_{0}\) to the program, we set:
\begin{equation} C\qty(s_{0},x,\text{in}) = \top \end{equation}
and set:
\begin{equation} C\qty(s \neq s_{0},x,\text{in}) = C\qty(s \neq s_{0},x,\text{out}) = \bot \end{equation}
everywhere else. An then we just apply the rules above.
- convergence
  This system converges because the rules have it such that \(\bot < c < \top\) (once something is top, it can’t go back.) So, \(\top\) is the “greatest” value, and \(\bot\) the “least”.
  We can then lub the rules 1-4 from above into on rule:
  \begin{equation} C\qty(s,x,\text{in}) = \text{lub} \qty {C\qty(p,x,\text{out}) | p\text{ predecessor } s} \end{equation}
  So you can just see the lub of predecessor hierarchy of each statement.

dead code elimination

a variable \(x\) is “live” at statement \(s\) if…

there’s a statement \(s’\) that uses \(x\)
there’s a path from \(s\) to \(s’\)
that path has no intervening assignment to \(x\)