verification

The verification ladder

Unit tests inside each language. 13 tests per implementation, covering insert/update/delete semantics, the no-op rule on missing keys, secondary-index maintenance across tag changes, the tombstone-then-reinsert path, the wire format byte layout, and the two frozen scenarios.
scripts/verify.sh runs all three suites end to end.
scripts/cross_test.sh builds all three CLI binaries and asserts that, for both scenarios, the SHA-256 digests from Rust, Go, and C++ are identical to one another and to the frozen goldens.
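
The top rung boils down to hashing each implementation's snapshot and comparing every digest against a frozen golden. The Go sketch below illustrates that check only; it is not the contents of cross_test.sh. The names digest and checkAgreement, and the in-memory map of snapshots, are hypothetical, and the "golden" is a stand-in computed on the spot.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// digest returns the lowercase hex SHA-256 of a snapshot.
func digest(snapshot []byte) string {
	sum := sha256.Sum256(snapshot)
	return hex.EncodeToString(sum[:])
}

// checkAgreement returns the names of the implementations whose snapshot
// digest differs from the golden. An empty result means every
// implementation agrees with the golden, and therefore with each other.
// (Hypothetical helper for illustration.)
func checkAgreement(snapshots map[string][]byte, golden string) []string {
	var bad []string
	for name, snap := range snapshots {
		if digest(snap) != golden {
			bad = append(bad, name)
		}
	}
	sort.Strings(bad) // map iteration order is random; sort for a stable report
	return bad
}

func main() {
	golden := digest([]byte("abc")) // stand-in for a frozen golden
	snapshots := map[string][]byte{
		"rust": []byte("abc"),
		"go":   []byte("abc"),
		"cpp":  []byte("abd"), // deliberately different
	}
	fmt.Println(checkAgreement(snapshots, golden)) // prints [cpp]
}
```

Comparing every implementation against the golden, rather than only against each other, also catches the case where all three drift in the same direction.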

What each layer protects against

Layer	Catches
Unit tests	Wrong semantics within one language: e.g. UPDATE bumping txid when the row was missing.
Frozen golden in unit tests	Drift in one language only: e.g. someone "fixes" the splitmix64 constants.
Cross-language script	Cross-language drift: e.g. Go iterating a map without sorting.
Both goldens	Drift that happens to leave one scenario unchanged. Hitting two seeds at very different op counts is a cheap insurance policy.
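
The "Go iterating a map without sorting" failure is worth seeing concretely. Go randomizes map iteration order, so any serializer that ranges over a map directly can emit different bytes from run to run. A minimal sketch of the fix, collecting and sorting the keys first, follows. The function name serializeSorted and the row layout (8-byte little-endian key, then 8-byte little-endian value) are invented for this example and are not the project's wire format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
)

// serializeSorted writes rows in ascending key order so the output bytes
// do not depend on Go's randomized map iteration order.
// The layout here is an assumption made up for this sketch.
func serializeSorted(rows map[uint64]uint64) []byte {
	keys := make([]uint64, 0, len(rows))
	for k := range rows {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })

	var buf bytes.Buffer
	var scratch [8]byte
	for _, k := range keys {
		binary.LittleEndian.PutUint64(scratch[:], k)
		buf.Write(scratch[:])
		binary.LittleEndian.PutUint64(scratch[:], rows[k])
		buf.Write(scratch[:])
	}
	return buf.Bytes()
}

func main() {
	rows := map[uint64]uint64{3: 30, 1: 10, 2: 20}
	first := serializeSorted(rows)
	// Repeating the call must always give the same bytes.
	for i := 0; i < 100; i++ {
		if !bytes.Equal(first, serializeSorted(rows)) {
			panic("serialization is not deterministic")
		}
	}
	fmt.Printf("%x\n", first[:8]) // key 1 comes first: 0100000000000000
}
```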

Test vectors we pinned

In every language:

SHA256("") = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
SHA256("abc") = ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
splitmix64(0) = 0x8b57dafca0cee644

If any of those fail, every higher-level test is meaningless, so they run first.
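
As an illustration, the two SHA-256 vectors can be checked in Go with the standard library alone. This is a sketch, not the project's test code: sha256Hex and checkVectors are hypothetical names. The splitmix64 vector is left out because it depends on the project's own splitmix64 implementation, which is not reproduced here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Hex returns the lowercase hex SHA-256 digest of s.
func sha256Hex(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// checkVectors verifies the pinned SHA-256 vectors and returns the first
// failure, if any. (Hypothetical helper for illustration.)
func checkVectors() error {
	vectors := []struct{ input, want string }{
		{"", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},
		{"abc", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"},
	}
	for _, v := range vectors {
		if got := sha256Hex(v.input); got != v.want {
			return fmt.Errorf("SHA256(%q) = %s, want %s", v.input, got, v.want)
		}
	}
	return nil
}

func main() {
	if err := checkVectors(); err != nil {
		panic(err)
	}
	fmt.Println("pinned SHA-256 vectors OK")
}
```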

How to debug a cross-language mismatch

If cross_test.sh reports a mismatch:

Re-run with a small --ops value (say 10), adjusting it until you have the shortest run that still diverges. A SHA-256 comparison is all-or-nothing: the digests either match or they do not, and a mismatch says nothing about where the bytes differ, so you need to dump the actual bytes.
Add a temporary print of the snapshot's hex before the SHA, in both the suspect language and the reference. Run xxd or od -An -tx1 on the two outputs and diff the results.
The first byte that differs almost always points at a section boundary. Decode the next_txid and primary row count by hand.
The two most common causes by a wide margin are (a) unsorted map iteration and (b) a missing ULL on a C++ constant. Check those first.
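
If the two snapshots are already in memory, the "find the first differing byte" step can also be done in code rather than by diffing hex dumps. The sketch below is one way to do it in Go. The helpers firstDiff and window are hypothetical, and the byte slices in main are made-up sample data, not real snapshots.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// firstDiff returns the offset of the first byte at which a and b differ.
// If one slice is a strict prefix of the other it returns the shorter
// length, and it returns -1 when the slices are identical.
func firstDiff(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	if len(a) != len(b) {
		return n
	}
	return -1
}

// window returns the hex of up to radius bytes on either side of off,
// clamped to the bounds of buf.
func window(buf []byte, off, radius int) string {
	lo := off - radius
	if lo < 0 {
		lo = 0
	}
	hi := off + radius
	if hi > len(buf) {
		hi = len(buf)
	}
	if lo > hi {
		lo = hi
	}
	return hex.EncodeToString(buf[lo:hi])
}

func main() {
	// Made-up sample data standing in for two snapshots.
	reference := []byte{0x01, 0x00, 0x00, 0x00, 0x2a, 0x00}
	suspect := []byte{0x01, 0x00, 0x00, 0x00, 0x2b, 0x00}

	off := firstDiff(reference, suspect)
	fmt.Printf("first difference at byte %d\n", off)
	fmt.Printf("reference: %s\n", window(reference, off, 2))
	fmt.Printf("suspect:   %s\n", window(suspect, off, 2))
}
```

Knowing the offset makes it easier to work out which section of the layout the divergence falls in before decoding fields by hand.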

Coverage gaps we accept

We do not run property-based tests (no proptest in Rust, no testing/quick in Go). The two seeded scenarios are dense enough that we have not yet seen a real bug that they missed and a property-based test would have caught, and adding one would make the test loop slower and flakier.

Distributed Systems Engineer — Build Databases & Consensus From Scratch
