step 03 — cross-language snapshot
Goal
Produce the canonical snapshot byte stream defined in ../CONCEPTS.md, run the deterministic workload in each language, and assert byte-identical SHA-256 across Rust, Go, and C++.
By the end of this step:
- dump_snapshot exists in every language and produces bytes that match the spec section-for-section.
- A run_workload(seed, ops, keys, scenario) function exists in every language and is bit-exact.
- The CLI prints the hex SHA-256 with no trailing newline.
- scripts/verify.sh ends with === OK ===. scripts/cross_test.sh ends with === ALL OK === and reports both golden hashes for scenarios A and B.
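The hashing goal can be sketched in Go, one of the three target languages, using only the standard library. This is a minimal illustration rather than the project's actual code: the name sha256Hex is an assumption, and the "abc" input merely stands in for the real snapshot byte stream. The two pinned digests are the standard SHA-256 test vectors for the empty string and for "abc".

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Hex returns the lowercase hex SHA-256 digest of data.
// The name is illustrative; the step only requires equivalent behaviour.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Standard SHA-256 test vectors, pinned to catch a broken digest early.
	vectors := []struct{ in, want string }{
		{"", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},
		{"abc", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"},
	}
	for _, v := range vectors {
		if got := sha256Hex([]byte(v.in)); got != v.want {
			panic(fmt.Sprintf("SHA256(%q) = %s, want %s", v.in, got, v.want))
		}
	}

	// fmt.Print (not fmt.Println) keeps the output free of a trailing newline.
	// "abc" is a placeholder for the snapshot bytes.
	fmt.Print(sha256Hex([]byte("abc")))
}
```

Using fmt.Print rather than fmt.Println matters because a stray newline would make string comparison of the CLI output fail even when the digest itself is correct.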
Tasks
- Implement dump_snapshot. Build it incrementally: write the magic + header first, get a single-row dump matching by hand, then add the secondary section.
- Implement splitmix64 and a stateful SplitMix64::next(). Pin a test for splitmix64(0) == 0x8b57dafca0cee644 to guard against constant typos.
- Implement run_workload per the rules in CONCEPTS.md. Pay special attention to: drawing all three rng words even for read ops; the kind decoding (r1 >> 60) & 0x7; the modulo casts to i64.
- Implement sha256_hex. In Rust use the sha2 crate. In Go use crypto/sha256 + encoding/hex. In C++ inline the reference implementation (FIPS 180-4) and keep it in the same translation unit as the engine to avoid a third-party dependency. Pin SHA256("") and SHA256("abc") in tests.
- Wire up the CLI: sqlitectl workload --seed N --ops N --keys N --scenario S. Print the hex with print! / fmt.Print / std::cout, with no trailing newline.
- Run scripts/verify.sh then scripts/cross_test.sh. Iterate until both end with their success markers.
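The run_workload pitfalls listed above can be illustrated with a small Go sketch. It is not the workload itself: counterRng is a hypothetical stand-in for the real generator, the op struct is invented for the example, and the authoritative rules (including the order of modulo and cast) remain those in CONCEPTS.md. The sketch only shows that the kind decoding ignores bit 63, that three words are drawn per op regardless of kind, and that the order of the modulo and the cast to a signed 64-bit integer changes the result.

```go
package main

import "fmt"

// decodeKind mirrors the expression (r1 >> 60) & 0x7: the result is always
// in 0..7 because the mask discards bit 63.
func decodeKind(r1 uint64) uint64 {
	return (r1 >> 60) & 0x7
}

// modThenCast reduces in unsigned arithmetic, then converts to int64.
func modThenCast(r, n uint64) int64 {
	return int64(r % n)
}

// castThenMod converts first; a word with the high bit set becomes negative
// and the remainder takes the sign of the dividend.
func castThenMod(r uint64, n int64) int64 {
	return int64(r) % n
}

// counterRng is a hypothetical stand-in for the real generator. It only
// counts draws so the stream position is easy to observe.
type counterRng struct{ draws uint64 }

func (c *counterRng) next() uint64 {
	c.draws++
	return c.draws << 60
}

// op is an illustrative container, not a type from the spec.
type op struct {
	kind   uint64
	r2, r3 uint64
}

// drawOp always consumes three words, even if a given kind would not use
// r2 or r3, so every language stays at the same stream position.
func drawOp(rng *counterRng) op {
	r1 := rng.next()
	r2 := rng.next()
	r3 := rng.next()
	return op{kind: decodeKind(r1), r2: r2, r3: r3}
}

func main() {
	r := uint64(0xFFFFFFFFFFFFFFFF)
	fmt.Println(decodeKind(0xF000000000000000))         // 7
	fmt.Println(modThenCast(r, 10), castThenMod(r, 10)) // 5 -1

	rng := &counterRng{}
	for i := 0; i < 4; i++ {
		drawOp(rng)
	}
	fmt.Println(rng.draws) // 12
}
```

Skipping the unused draws for one kind of op in a single language would shift every later word in that language's stream, which is why the draw count is worth asserting directly.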
Debugging a divergence
If cross_test.sh shows different hashes between languages, follow
the ladder in ../docs/verification.md:
shrink the op count, dump the raw snapshot bytes with xxd, diff,
and look for the first differing byte. It almost always points at a
section boundary that exposes either map-iteration order or a
wrong-width cast.
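The "first differing byte" step can also be done programmatically. The following Go helper is a minimal sketch, assuming the two dumps have already been loaded into byte slices; the function name firstDiff and the sample bytes are made up for illustration and do not reflect the real snapshot layout.

```go
package main

import "fmt"

// firstDiff returns the index of the first byte at which a and b differ.
// If one slice is a strict prefix of the other, it returns the length of
// the shorter slice. It returns -1 when the slices are identical.
func firstDiff(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	if len(a) != len(b) {
		return n
	}
	return -1
}

func main() {
	// Made-up dumps: identical 4-byte prefix, then two entries emitted in
	// a different order, as could happen with unsorted map iteration.
	dumpA := []byte{0x53, 0x4e, 0x01, 0x00, 0x0a, 0x0b}
	dumpB := []byte{0x53, 0x4e, 0x01, 0x00, 0x0b, 0x0a}
	fmt.Printf("first differing byte at offset %#x\n", firstDiff(dumpA, dumpB))
}
```

In practice the slices would come from the dumped files (for example via os.ReadFile), and the reported offset can be looked up directly in the xxd output.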
Acceptance
- All three unit suites pass under release optimisation.
- Both === OK === and === ALL OK === markers appear.
- Scenario A hash: e8ccacd39d8535c1ed101f0bc8b7a0799f56468a384da9284d4768cd8b3a3aab.
- Scenario B hash: dd1d6bb7fec1ffc9f71f01e75a58166b04517a669495af2aa2da432d4722db69.
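A captured CLI output can be checked against these golden hashes with a few lines of Go. This is a hedged sketch: checkDigest and the constant names are hypothetical, and how the output string is captured (for example from a subprocess) is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Golden hashes copied from the acceptance list above.
const (
	goldenA = "e8ccacd39d8535c1ed101f0bc8b7a0799f56468a384da9284d4768cd8b3a3aab"
	goldenB = "dd1d6bb7fec1ffc9f71f01e75a58166b04517a669495af2aa2da432d4722db69"
)

// checkDigest returns an error unless out is exactly the expected 64-character
// hex digest with no trailing newline.
func checkDigest(out, want string) error {
	if strings.HasSuffix(out, "\n") {
		return fmt.Errorf("output has a trailing newline")
	}
	if len(out) != 64 {
		return fmt.Errorf("expected 64 hex characters, got %d", len(out))
	}
	if out != want {
		return fmt.Errorf("hash mismatch: got %s, want %s", out, want)
	}
	return nil
}

func main() {
	// The literal strings stand in for captured CLI output.
	fmt.Println(checkDigest(goldenA, goldenA))      // <nil>
	fmt.Println(checkDigest(goldenB+"\n", goldenB)) // output has a trailing newline
}
```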