verification

The verification ladder

  1. Unit tests inside each language. 13 tests per implementation, covering insert/update/delete semantics, the no-op rule on missing keys, secondary-index maintenance across tag changes, the tombstone-then-reinsert path, the wire format byte layout, and the two frozen scenarios.
  2. scripts/verify.sh runs all three suites end to end.
  3. scripts/cross_test.sh builds all three CLI binaries and asserts byte-identical SHA-256 across Rust/Go/C++ for both scenarios and equality with the frozen goldens.
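
The check in step 3 can be sketched in Go as follows. This is only an illustration of the assertion, not the contents of scripts/cross_test.sh: the names digest and checkScenario are hypothetical, and the snapshots are passed in as byte slices rather than read from the CLI binaries. Comparing every implementation against the same golden also implies the implementations agree with each other.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// digest returns the lowercase hex SHA-256 of a snapshot.
func digest(snapshot []byte) string {
	sum := sha256.Sum256(snapshot)
	return hex.EncodeToString(sum[:])
}

// checkScenario (hypothetical helper) compares each implementation's
// snapshot digest with the frozen golden and reports the first mismatch.
func checkScenario(golden string, snapshots map[string][]byte) error {
	langs := make([]string, 0, len(snapshots))
	for lang := range snapshots {
		langs = append(langs, lang)
	}
	sort.Strings(langs) // report mismatches in a stable order

	for _, lang := range langs {
		if got := digest(snapshots[lang]); got != golden {
			return fmt.Errorf("%s: digest %s does not match golden %s", lang, got, golden)
		}
	}
	return nil
}

func main() {
	// Stand-in data: a real run would use the bytes emitted by each CLI.
	golden := digest([]byte("abc"))
	err := checkScenario(golden, map[string][]byte{
		"rust": []byte("abc"),
		"go":   []byte("abc"),
		"cpp":  []byte("abc"),
	})
	fmt.Println("all match:", err == nil)
}
```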

What each layer protects against

Layer                         Catches
Unit tests                    Wrong semantics within one language: e.g. UPDATE bumping txid when the row was missing.
Frozen golden in unit tests   Drift in one language only: e.g. someone "fixes" the splitmix64 constants.
Cross-language script         Cross-language drift: e.g. Go iterating a map without sorting.
Both goldens                  Drift that happens to leave one scenario unchanged. Hitting two seeds at very different op counts is a cheap insurance policy.
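
As an illustration of the "Go iterating a map without sorting" failure, here is a minimal sketch of the safe pattern. Go randomizes map iteration order, so any serializer that ranges over a map directly can emit different bytes from run to run. The function name encodeRows and the row layout (8-byte big-endian key followed by 8-byte big-endian value) are assumptions made for the example, not the project's actual wire format.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// encodeRows serializes rows in ascending key order so the output does not
// depend on Go's randomized map iteration. The layout is hypothetical:
// 8-byte big-endian key, then 8-byte big-endian value.
func encodeRows(rows map[uint64]uint64) []byte {
	keys := make([]uint64, 0, len(rows))
	for k := range rows {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })

	out := make([]byte, 0, 16*len(rows))
	var buf [8]byte
	for _, k := range keys {
		binary.BigEndian.PutUint64(buf[:], k)
		out = append(out, buf[:]...)
		binary.BigEndian.PutUint64(buf[:], rows[k])
		out = append(out, buf[:]...)
	}
	return out
}

func main() {
	rows := map[uint64]uint64{3: 30, 1: 10, 2: 20}
	// Repeated calls produce the same bytes because keys are sorted first.
	fmt.Printf("% x\n", encodeRows(rows))
}
```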

Test vectors we pinned

In every language:

  • SHA256("") = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
  • SHA256("abc") = ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
  • splitmix64(0) = 0x8b57dafca0cee644

If any of those fail, every higher-level test is meaningless, so they run first.
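
A minimal Go sketch of the SHA-256 half of this check, using the two vectors listed above. The function name checkSHA256Vectors is hypothetical. The splitmix64 vector is left out because it depends on the project's own splitmix64 implementation, which is not reproduced here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// sha256Vectors holds the pinned inputs and their expected hex digests.
var sha256Vectors = []struct{ input, want string }{
	{"", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},
	{"abc", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"},
}

// checkSHA256Vectors returns an error for the first vector that fails.
func checkSHA256Vectors() error {
	for _, v := range sha256Vectors {
		sum := sha256.Sum256([]byte(v.input))
		if got := hex.EncodeToString(sum[:]); got != v.want {
			return fmt.Errorf("SHA256(%q) = %s, want %s", v.input, got, v.want)
		}
	}
	return nil
}

func main() {
	// Fail fast: nothing built on top of the hash is worth running otherwise.
	if err := checkSHA256Vectors(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("SHA-256 vectors ok")
}
```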

How to debug a cross-language mismatch

If cross_test.sh reports a mismatch:

  1. Re-run with a smaller --ops (say 10) to find the shortest run that still diverges. A SHA-256 comparison is all-or-nothing: the digests either match or they do not, and a mismatch says nothing about where the inputs differ, so you need to dump the actual bytes.
  2. Add a temporary print of the snapshot's hex before the SHA, in both the suspect language and the reference. Run xxd or od -An -tx1 on the two outputs and diff the results.
  3. The first byte that differs almost always points at a section boundary. Decode the next_txid and primary row count by hand.
  4. The two most common causes by a wide margin are (a) unsorted map iteration and (b) a missing ULL on a C++ constant. Check those first.
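
Locating the first differing byte can also be done programmatically. The sketch below is one possible helper, assuming both snapshots have already been captured as byte slices; firstDiff and window are hypothetical names, and the sample bytes in main are made up for the demonstration.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// firstDiff returns the offset of the first byte at which a and b differ,
// or -1 if they are identical. If one slice is a strict prefix of the
// other, it returns the length of the shorter slice.
func firstDiff(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	if len(a) != len(b) {
		return n
	}
	return -1
}

// window hex-encodes the bytes of buf in [off-radius, off+radius),
// clamped to the bounds of buf.
func window(buf []byte, off, radius int) string {
	lo, hi := off-radius, off+radius
	if lo < 0 {
		lo = 0
	}
	if hi > len(buf) {
		hi = len(buf)
	}
	if lo > hi {
		lo = hi
	}
	return hex.EncodeToString(buf[lo:hi])
}

func main() {
	reference := []byte{0x01, 0x02, 0x03, 0x04}
	suspect := []byte{0x01, 0x02, 0xff, 0x04}

	off := firstDiff(reference, suspect)
	fmt.Printf("first difference at offset %d\n", off)
	fmt.Println("reference:", window(reference, off, 2))
	fmt.Println("suspect:  ", window(suspect, off, 2))
}
```

With the offset in hand, map it back onto the wire format to see which section it falls in before decoding fields by hand.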

Coverage gaps we accept

We do not run a property-based test (no proptest in Rust, no testing/quick in Go). The two seeded scenarios are dense enough that we have not seen a real bug that a property-based test would have caught and the scenarios missed, and adding one would make the test loop slower and flakier.