Houjun Liu

compiler

A compiler compiles code!

parts of compilers

lexical analysis

make stuff tokens — “identifying words”

an example!

if x == y then z = 1; else z = 2;
  • if: keyword
  • " ": whitespace
  • x: identifier
  • ==: relation

… and so on
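A minimal sketch of a lexer for the line above, using regular expressions. The token categories beyond the four listed (number, assign, semicolon) and the names `TOKEN_SPEC` and `tokenize` are assumptions made for illustration, not part of any particular compiler.

```python
import re

# Order matters: keywords before identifiers, "==" before "=".
TOKEN_SPEC = [
    ("keyword",    r"\b(?:if|then|else)\b"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("number",     r"\d+"),
    ("relation",   r"=="),
    ("assign",     r"="),
    ("semicolon",  r";"),
    ("whitespace", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(source):
    """Split source text into (kind, text) pairs, left to right."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        # lastgroup is the name of the alternative that matched.
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("if x == y then z = 1; else z = 2;")
for kind, text in tokens:
    print(f"{text!r}: {kind}")
```

A real lexer would usually drop the whitespace tokens before handing the stream to the parser; they are kept here only to mirror the list above.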

parsing

abstract syntax treeifying the tokens — “identifying sentences”.

See parser.
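As a rough illustration of "identifying sentences", here is a small recursive-descent sketch that turns the tokens of the earlier example into a tree. The grammar (one `if ... then ... else ...` with a single `==` condition and integer assignments) and the class names `If`, `Equals`, `Assign`, and `Parser` are assumptions made up for this example.

```python
from dataclasses import dataclass

@dataclass
class Assign:
    target: str
    value: int

@dataclass
class Equals:
    left: str
    right: str

@dataclass
class If:
    cond: Equals
    then_branch: Assign
    else_branch: Assign

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def expect(self, text=None):
        """Consume the next token, optionally checking that it equals `text`."""
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        tok = self.tokens[self.pos]
        if text is not None and tok != text:
            raise SyntaxError(f"expected {text!r}, got {tok!r}")
        self.pos += 1
        return tok

    def parse_assign(self):
        # assign := IDENT "=" NUMBER ";"
        target = self.expect()
        self.expect("=")
        value = int(self.expect())
        self.expect(";")
        return Assign(target, value)

    def parse_cond(self):
        # cond := IDENT "==" IDENT
        left = self.expect()
        self.expect("==")
        right = self.expect()
        return Equals(left, right)

    def parse_if(self):
        # stmt := "if" cond "then" assign "else" assign
        self.expect("if")
        cond = self.parse_cond()
        self.expect("then")
        then_branch = self.parse_assign()
        self.expect("else")
        else_branch = self.parse_assign()
        return If(cond, then_branch, else_branch)

# Whitespace tokens are assumed to have been dropped already.
tokens = ["if", "x", "==", "y", "then", "z", "=", "1", ";", "else", "z", "=", "2", ";"]
tree = Parser(tokens).parse_if()
print(tree)
```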

semantic analysis

See semantic analysis.

optimization

make the IR faster — “editing”.

goals

  • run faster
  • use less memory
  • generally, conserve resources

tricky tricky!

Can this be optimized to x=0?

x = y * 0

It's tricky. y may not be numeric; or y may be a float, in which case nan * 0 = nan.
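A quick check of why the rewrite is unsafe, using Python's semantics purely as an illustration (other languages have their own rules for `*`). The helper name `times_zero` and the sample values are assumptions chosen for this sketch.

```python
import math

def times_zero(y):
    """What `x = y * 0` actually computes under Python's rules."""
    return y * 0

samples = [3, 2.5, math.nan, math.inf, "ab"]
for y in samples:
    x = times_zero(y)
    # Rewriting to x = 0 is only safe when the result really is the number 0.
    safe = isinstance(x, (int, float)) and x == 0
    print(f"y = {y!r:>5}  ->  y * 0 = {x!r:>5}  safe to rewrite as x = 0? {safe}")
```

So an optimizer can only apply this rewrite when it can prove something about y, for example that y is an integer.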

code generation

generate machine code — “translation.” Consider: register layout, etc.

other stuff

intermediate representation

compilers typically translate between multiple intermediate languages.

  • all but the first and last representations are called intermediate representations
  • IRs are generally ordered in descending levels of abstraction
digraph {
rankdir=LR;
graph [bgcolor=transparent];
node [fontcolor=white, color=white, shape=none];
edge [fontcolor=white, color=white];

ir1 [label="ir"]

source -> ir -> "..." -> ir1 -> {assembly, ir1}
}
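To make "descending levels of abstraction" concrete, here is a hedged sketch of one lowering step: a nested expression tree is flattened into three-address-style instructions, a common lower-level IR shape. The tuple encoding of the tree, the `t1`, `t2` temporary names, and the function `lower` are assumptions for illustration only.

```python
from itertools import count

def lower(expr, code, temps):
    """Lower a nested-tuple expression into three-address instructions.

    Appends instructions to `code` and returns the name holding the result.
    """
    if isinstance(expr, (int, str)):
        # Leaves (constants and variable names) need no instruction.
        return str(expr)
    op, left, right = expr
    left_name = lower(left, code, temps)
    right_name = lower(right, code, temps)
    dest = f"t{next(temps)}"
    code.append(f"{dest} = {left_name} {op} {right_name}")
    return dest

# Higher-level IR: a tree for a + b * 2
tree_ir = ("+", "a", ("*", "b", 2))

# Lower-level IR: a flat list of instructions with one operator each
code = []
result = lower(tree_ir, code, count(1))
for line in code:
    print(line)
print("result in", result)
```

Each further stage would do something similar, trading structure for detail, until the last representation is close enough to the machine to emit assembly.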

issues

many pitfalls:

  • your compiler may be slow
  • it may not be able to handle erroneous inputs
  • language design is important – determines what is ambiguous (hence what is easy / hard to compile)
  • tradeoffs! in language design