System Programming - Compilers/VMs/Language Designs
rust zig system compiler language kernelJournal and references for exploring language design, interpreters, and compilers
===========================================
Resources
- The Kernel Book A Journey in Creating an Operating System Kernel
- Crafting Interpreters Ever wanted to make your own programming language or wondered how they are designed and built?
- Tales of the M1 GPU
- How I wrote my own “proper” programming language
- Rich programmer food
- Machines and Managed Runtimes
- Post-Modern Compiler Design
- Tail call optimization - From Erlang
============================================
Rusty Log - 05-01-2023
A grammar is a specification of the valid sentences in a language. Note that grammars need not be executable — indeed, they may be specified formally or informally, explicitly or implicitly. A parser is an implementation of a grammar, which checks whether an input is valid with respect to that grammar or not. There are various ways of implementing parsers, but there are generally only two which we really need to consider: LR parsing and recursive descent parsing.
Nearly, all hand-written parsers are recursive descent parsers since it’s more easier and more flexible. There is an additional, easily over-looked, factor that comes from relying on recursive descent parsers: learners cannot, in any sensible time, understand from the recursive descent parser the language’s syntax. Many languages ignore this problem at first, and hope that learners can guess the language’s syntax from examples. This is inevitably unsatisfactory, and it is then common over time for languages to try deriving a CFG from the recursive descent parser, to help learners and tool authors. However, these CFGs rarely fully match the recursive descent parser, leaving learners confused, and tool authors frustrated. if you believe static type checking allows you to write programs that are more reliable than if you used dynamic type checking then, logically, you must also believe that LR parsing is more reliable than recursive descent parsing. Similarly, if you believe that explicit static types help readers better understand code, then you must also believe that explicit LR grammars better convey to readers the grammar of a language than the equivalent recursive descent parser.
Grammer for rusty (work in progress)
Reference: https://craftinginterpreters.com/representing-code.html#top
BNF notation simplified
pipe | : options one or the other. e.g. bread -> "toast" | "biscuits" | "muffin"
parenthesis ( ) : grouping. e.g. protein → ( "scrambled" | "poached" | "fried" ) "eggs"
recursion * : repeat zero or more times. e.g. crispiness → "really" "really"*
postfix + : appear at least once. e.g. crispiness → "really"+
postfix ? : appear zero or once
expression → equality ;
equality → comparison ( ( "!=" | "==" ) comparison )* ;
comparison → term ( ( ">" | ">=" | "<" | "<=" ) term )* ;
term → factor ( ( "-" | "+" ) factor )* ;
factor → unary ( ( "/" | "*" ) unary )* ;
unary → ( "!" | "-" ) unary
| primary ;
primary → NUMBER | STRING | "true" | "false" | "nil"
| "(" expression ")" ;
Precedence and Associativity
Name Operators Associates Precedence Grammer
Equality == != Left Lower Top
Comparison > >= < <= Left
Term - + Left
Factor / * Left
Unary ! - Right Highest Bottom
Recursive Descent Parser
A recursive descent parser is a literal translation of the grammar’s rules straight into imperative code. Each rule becomes a function. The body of the rule translates to code roughly like:
Grammar notation Code representation
Terminal Code to match and consume a token
Nonterminal Call to that rule’s function
| if or switch statement
* or + while or for loop
? if statement
=================================================================