Execution

Tasks Performed

  1. Schema + Value + Row in all three languages. Value is a tagged union of Int(i64) and Text(Vec<u8>) with a frozen total order (Int < Text cross-kind; natural order within kind).
  2. Secondary index as BTreeMap-equivalent: std::collections::BTreeMap in Rust, std::map in C++, a parallel sorted slice in Go (since Go's maps are randomised).
  3. Planner with the cost model from analysis.md. Single pass over predicates, lowest estimate wins.
  4. Executor that scans, filters, projects in order.
  5. Wire format (dump_plan, dump_result) using only little-endian primitives so the SHA-256 lines up across all three implementations.
  6. Workload driver (qplan workload ...) that prints sha256_hex(concat(dump_plan(plan) ++ dump_result(rows))) over N ops.
  7. Tests: 10 Rust + 11 Go + 11 C++ unit tests covering the eight planner behaviours, plus dump determinism and SHA-256 known-answer vectors.
  8. Scripts: verify.sh (build + unit tests), cross_test.sh (build all three binaries, run scenarios A and B, assert sha256 identity and match the frozen golden hashes).

Order of Implementation

Rust first (the lab's reference language). Go next, debugged against the Rust hashes. C++ last, debugged against both. Each language is self-contained — no shared library, no FFI — so a divergence shows up immediately as a different cross_test.sh hash.

Pitfalls Encountered

  • Go map iteration. The first Go prototype iterated map[Value][]uint32 directly and produced a different hash on every run. Replaced with a sorted []indexEntry slice and a findKey binary search.
  • C++ std::map<Value, ...>. Works only if Value::operator< is a strict weak order across kinds; the cross-kind Int < Text rule had to match Rust's PartialOrd derivation.
  • Result magic length. The lab spec freezes the magic at 7 bytes (DSEQR01, no terminator). An early C++ port wrote 8 bytes (NUL included) and the cross-test hash diverged at byte 8 of every op. Discovered by diffing the first 16 bytes of each binary's output for op 0.
  • u8 op byte for Pred. Rust's enum Op { Eq=1, ... } is #[repr(u8)]; Go and C++ mirror the constants explicitly. A missing #[repr(u8)] was the second source of byte divergence in the first iteration.

What's NOT Implemented

  • Joins of any kind.
  • Cost-based search over plan equivalence classes (Cascades).
  • Histograms / cardinality estimation beyond uniform-distribution.
  • Index updates on DELETE (rows are append-only).
  • Index merge (combining two IndexScans on different columns).
  • Persistence — see db-15 for the persistent counterpart.