A compiler compiles code!
parts of compilers
lexical analysis
turn the input characters into tokens — “identifying words”
an example!
if x == y then z = 1; else z = 2;
- if: keyword
- " “: white space
- x: identifier
- ==: relation
… and so on
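here’s a toy lexer sketch for that line (token kinds and struct names are made up for illustration, not from any real compiler):

#include <cctype>
#include <iostream>
#include <string>
#include <vector>

struct Token {
    std::string kind;  // "keyword", "identifier", "relation", ...
    std::string text;
};

// tokenize a tiny language: keywords, identifiers, numbers, ==, and punctuation
std::vector<Token> lex(const std::string& src) {
    std::vector<Token> out;
    for (size_t i = 0; i < src.size();) {
        unsigned char c = src[i];
        if (std::isspace(c)) { ++i; continue; }              // skip whitespace
        if (std::isalpha(c)) {                               // keyword or identifier
            size_t j = i;
            while (j < src.size() && std::isalnum((unsigned char)src[j])) ++j;
            std::string word = src.substr(i, j - i);
            bool kw = (word == "if" || word == "then" || word == "else");
            out.push_back({kw ? "keyword" : "identifier", word});
            i = j;
        } else if (std::isdigit(c)) {                        // number literal
            size_t j = i;
            while (j < src.size() && std::isdigit((unsigned char)src[j])) ++j;
            out.push_back({"number", src.substr(i, j - i)});
            i = j;
        } else if (src.compare(i, 2, "==") == 0) {           // two-char relation
            out.push_back({"relation", "=="});
            i += 2;
        } else {                                             // single char: = ;
            out.push_back({"punct", std::string(1, src[i])});
            ++i;
        }
    }
    return out;
}

int main() {
    for (const Token& t : lex("if x == y then z = 1; else z = 2;"))
        std::cout << t.kind << ": " << t.text << "\n";
}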
parsing
ASTify the tokens — “identifying sentences”.
digraph {
graph [bgcolor=transparent];
node [fontcolor=white, color=white, shape=none];
edge [fontcolor=white, color=white];
assign1 [label="assign"];
assign2 [label="assign"];
z1 [label="z", shape=none];
z2 [label="z", shape=none];
x [shape=none];
y [shape=none];
eq [shape=none,label="=="];
2 [shape=none,label="2"];
1 [shape=none,label="1"];
{x, eq, y} -> "relation" -> "predicate";
{z1, 1} -> assign1 -> "then-stmt";
{z2, 2} -> assign2 -> "else-stmt";
{"predicate", "then-stmt", "else-stmt"} -> "if-the-else";
}
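one way to hold that tree in code: a deliberately tiny AST sketch (node shapes here are illustrative):

#include <iostream>
#include <string>
#include <vector>

// just enough structure to hold the if-then-else above
struct Node {
    std::string label;
    std::vector<Node> kids;  // children by value: fine for a small demo tree
};

void print(const Node& n, int depth = 0) {
    std::cout << std::string(2 * depth, ' ') << n.label << "\n";
    for (const Node& k : n.kids) print(k, depth + 1);
}

int main() {
    // if x == y then z = 1; else z = 2;
    Node ast = {"if-then-else", {
        {"predicate", {{"relation", {{"x", {}}, {"==", {}}, {"y", {}}}}}},
        {"then-stmt", {{"assign", {{"z", {}}, {"1", {}}}}}},
        {"else-stmt", {{"assign", {{"z", {}}, {"2", {}}}}}},
    }};
    print(ast);
}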
semantic analysis
check the program’s meaning and lower it into an IR (LLVM IR, HLO, etc.) — “analyzing the sentences”
- type checking
- catching inconsistencies
- scoping rules, etc.
example
Look! Shadowing!!!
#include <iostream>

int main() {
    int i = 3;          // outer i
    {
        int i = 4;      // inner i shadows the outer one
        std::cout << i; // prints 4
    }
}
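under the hood, a semantic analyzer can track this with a stack of scopes. a sketch (a real one would hang types and other attributes off each entry):

#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// a scope stack: each block pushes a map; lookup walks from innermost out
struct Scopes {
    std::vector<std::unordered_map<std::string, int>> stack;

    void push() { stack.emplace_back(); }
    void pop()  { stack.pop_back(); }

    void declare(const std::string& name, int value) {
        stack.back()[name] = value;  // same-scope redeclaration would be an error in a real checker
    }

    // innermost declaration wins: this is shadowing
    const int* lookup(const std::string& name) const {
        for (auto it = stack.rbegin(); it != stack.rend(); ++it) {
            auto f = it->find(name);
            if (f != it->end()) return &f->second;
        }
        return nullptr;  // undeclared
    }
};

int main() {
    Scopes s;
    s.push(); s.declare("i", 3);           // outer { int i = 3;
    s.push(); s.declare("i", 4);           //   inner { int i = 4;
    std::cout << *s.lookup("i") << "\n";   //   prints 4: inner i shadows outer
    s.pop();                               //   }
    std::cout << *s.lookup("i") << "\n";   // prints 3: outer i visible again
    s.pop();                               // }
}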
optimization
rewrite the IR to make the program faster — “editing”.
goals
- run faster
- use less memory
- generally, conserve resources
tricky tricky!
Can
x = y * 0
be optimized to x = 0?
It’s tricky: y may not be numeric at all, and even if it is, y may be a float, in which case NaN * 0 = NaN, not 0.
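a sketch of the guard a folding pass needs (types and names are made up): only rewrite y * 0 to 0 when y’s type rules out NaN and friends.

#include <cmath>
#include <iostream>

enum class Type { Int, Float };

// x = y * 0  =>  x = 0 is only sound when y's type has no NaN / inf / -0.0
bool can_fold_mul_by_zero(Type y_type) {
    switch (y_type) {
        case Type::Int:   return true;   // integers: y * 0 == 0, always
        case Type::Float: return false;  // NaN * 0 = NaN, inf * 0 = NaN, -1.0 * 0 = -0.0
    }
    return false;
}

int main() {
    double y = std::nan("");
    std::cout << y * 0.0 << "\n";  // prints nan, not 0
    std::cout << can_fold_mul_by_zero(Type::Int)
              << can_fold_mul_by_zero(Type::Float) << "\n";  // 10
}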
code generation
generate machine code — “translation.” Consider: register allocation, etc.
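for flavor, a toy emitter for z = x + y that pretends registers are unlimited (real register allocation, where they aren’t, is the hard part; all names here are invented):

#include <iostream>
#include <string>

int next_reg = 0;  // a trivial "allocator": hand out fresh virtual registers forever

int load(const std::string& var) {
    int r = next_reg++;
    std::cout << "  load  r" << r << ", [" << var << "]\n";
    return r;
}

int main() {
    // z = x + y
    int rx = load("x");
    int ry = load("y");
    int rz = next_reg++;
    std::cout << "  add   r" << rz << ", r" << rx << ", r" << ry << "\n";
    std::cout << "  store [z], r" << rz << "\n";
}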
other stuff
intermediate representation
compilers typically translate between multiple intermediate languages.
- all but the first and last representations are called intermediate representations
- IRs are generally ordered in descending levels of abstraction
digraph {
rankdir=LR;
graph [bgcolor=transparent];
node [fontcolor=white, color=white, shape=none];
edge [fontcolor=white, color=white];
ir1 [label="ir"];
source -> ir -> "..." -> ir1 -> assembly;
}
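for a feel of what one of those middle languages looks like, here is z = x + y * 2 lowered by hand into a made-up three-address code (everything here is illustrative):

#include <iostream>
#include <string>
#include <vector>

// a made-up three-address-code IR: one op, two inputs, one output
struct Instr { std::string op, dst, a, b; };

int main() {
    // z = x + y * 2, flattened so each instruction does one thing
    std::vector<Instr> ir = {
        {"mul", "t0", "y", "2"},   // t0 = y * 2
        {"add", "z",  "x", "t0"},  // z  = x + t0
    };
    for (const Instr& i : ir)
        std::cout << i.dst << " = " << i.op << " " << i.a << ", " << i.b << "\n";
}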
issues
many pitfalls:
- your compiler may be slow
- it may not handle erroneous inputs gracefully
- language design is important – it determines what is ambiguous (hence what is easy / hard to compile)
- tradeoffs! in language design