SU-CS143 APR102025
Last edited: August 8, 2025Its DFA time! Deterministic Finite Automata + regular expression (complexity). See programmatically compiling RegEx to DFA.
SU-CS143 APR152025
Last edited: August 8, 2025Another review of the compiler
Compiler Front End
Lexer => Parser => Semantic Analysis; usually language specific.
Compiler Mid End
Optimizer; usually shared across many languages.
Compiler Back End
Codegen
parser
parser: takes sequence of tokens from the lexer and produce AST of the program. General goal is to distinguish between valid and invalid strings of tokens.
Importantly, we don’t really emit errors from parsing; instead we just do it at the layer below and above.
SU-CS143 APR172025
Last edited: August 8, 2025and also abstract syntax tree and top-down parsing
error handling
Recall! Our compiler should detect non-valid programs and to translate valid ones.
Error Kind | Example | Detected by… |
---|---|---|
Lexical | $ <- doesn’t exist | lexer |
Syntax | x * % <- doesnt’ make sense | parser |
Semantic | int x; x(3) <- doesn’t check | type checker |
Correctness | you! |
Good error handler should report errors accurately and clearly; recovering from an error quickly, and not slow down compilation of valid code.
SU-CS143 APR242025
Last edited: August 8, 2025Bottom-Up Parsing
Bottom-Up Parsing is more general than top-down parsing (i.e., it doesn’t require left factorization, etc.)
This is generally the preferred method. Key insight: we reduce strings into increasingly higher level symbols
Sketch
Because of the Properties of Bottom-Up Parsing, we can….
initialization
- split the string into two pieces
- right substring has terminals, not yet examined
- left substring has terminals and non-terminals
We begin by splitting to the left since we begin with no non-terminals.