Execution
Tasks Performed
- Schema + Value + Row in all three languages.
Valueis a tagged union ofInt(i64)andText(Vec<u8>)with a frozen total order (Int < Textcross-kind; natural order within kind). - Secondary index as
BTreeMap-equivalent:std::collections::BTreeMapin Rust,std::mapin C++, a parallel sorted slice in Go (since Go's maps are randomised). - Planner with the cost model from
analysis.md. Single pass over predicates, lowest estimate wins. - Executor that scans, filters, projects in order.
- Wire format (
dump_plan,dump_result) using only little-endian primitives so the SHA-256 lines up across all three implementations. - Workload driver (
qplan workload ...) that printssha256_hex(concat(dump_plan(plan) ++ dump_result(rows)))over N ops. - Tests: 10 Rust + 11 Go + 11 C++ unit tests covering the eight planner behaviours, plus dump determinism and SHA-256 known-answer vectors.
- Scripts:
verify.sh(build + unit tests),cross_test.sh(build all three binaries, run scenarios A and B, assert sha256 identity and match the frozen golden hashes).
Order of Implementation
Rust first (the lab's reference language). Go next, debugged against the
Rust hashes. C++ last, debugged against both. Each language is
self-contained — no shared library, no FFI — so a divergence shows up
immediately as a different cross_test.sh hash.
Pitfalls Encountered
- Go map iteration. The first Go prototype iterated
map[Value][]uint32directly and produced a different hash on every run. Replaced with a sorted[]indexEntryslice and afindKeybinary search. - C++
std::map<Value, ...>. Works only ifValue::operator<is a strict weak order across kinds; the cross-kindInt < Textrule had to match Rust'sPartialOrdderivation. - Result magic length. The lab spec freezes the magic at 7 bytes
(
DSEQR01, no terminator). An early C++ port wrote 8 bytes (NUL included) and the cross-test hash diverged at byte 8 of every op. Discovered by diffing the first 16 bytes of each binary's output for op 0. u8 opbyte forPred. Rust'senum Op { Eq=1, ... }is#[repr(u8)]; Go and C++ mirror the constants explicitly. A missing#[repr(u8)]was the second source of byte divergence in the first iteration.
What's NOT Implemented
- Joins of any kind.
- Cost-based search over plan equivalence classes (Cascades).
- Histograms / cardinality estimation beyond uniform-distribution.
- Index updates on
DELETE(rows are append-only). - Index merge (combining two IndexScans on different columns).
- Persistence — see db-15 for the persistent counterpart.