Houjun Liu

SU-CS143 APR152025

Another review of the compiler

Compiler Front End

Lexer => Parser => Semantic Analysis; usually language specific.

Compiler Mid End

Optimizer; usually shared across many languages.

Compiler Back End

Codegen

parser

parser does parsing

parser: takes sequence of tokens from the lexer and produce AST of the program. General goal is to distinguish between valid and invalid strings of tokens.

Importantly, we don’t really emit errors from parsing; instead we just do it at the layer below and above.

NOTE: FORM of the CFGs is very important.

We use Context-Free Grammars

example

Consider:

if x = y then 1 else 2 fi

parser gets

IF ID = ID THEN INT ELSE INT FI

parser makes

Chomsky Hierarchy

Levels of languages, in increasing amounts of difficulty.

  1. regular languages
  2. CFGs (pushdown automatas)—they have a stack
  3. CSGs (context sensitive grammars)
  4. Recursively Enumerable languages (undecidable languages)