db-12 — References

Primary sources

  • Crafting Interpreters, Robert Nystrom — chapters 4 ("Scanning") and 6 ("Parsing Expressions") map almost 1:1 onto what we built. The hand- rolled recursive-descent style and the "one function per non-terminal" discipline are taken straight from this book. https://craftinginterpreters.com/
  • Modern Compiler Implementation in ML (or in C / Java), Andrew Appel — chapter 3 ("Parsing"). The clearest exposition of why recursive descent works for LL(1) grammars and what changes when you need lookahead or precedence climbing.
  • The Dragon Book (Aho, Lam, Sethi, Ullman) — chapters 3 and 4. The textbook source for lexical analysis (DFA construction, regular expressions to scanners) and predictive parsing. Read for theory; the practice is in Crafting Interpreters.

How real databases parse SQL

Recursive descent specifically

  • Rob Pike, Lexical Scanning in Go, GopherCon 2011 — the talk that popularized the "scanner emits tokens to a channel; parser reads from the channel" style. We don't use channels (we just return a Vec), but the state-machine framing of the scanner loop is the same. https://www.youtube.com/watch?v=HxaD_trXwRE
  • Doug Crockford, Top Down Operator Precedence, 2007 — the cleanest explanation of Pratt parsing, which is what you reach for next once recursive descent runs into expression-precedence pain. We don't need it here (no expression grammar) but it's the natural follow-up. https://crockford.com/javascript/tdop/tdop.html

Determinism / wire formats

Cross-lab dependencies

None. db-12 is intentionally self-contained: there is nothing in earlier labs (storage, WAL, LSM, B-tree) that the parser needs, and the AST serializer uses no upstream wire format. The C++ build does not add_subdirectory(../db-NN). That isolation is a feature, not an oversight — it keeps the lab small enough to be reasoned about as a self-contained compiler-front exercise.

db-12 feeds db-13 (execution + transactions), where the AST will finally be walked by a VM.