Verification — db-22
What "verified" means here
For a perf-and-bench lab, "verified" means three things at once:
- All three implementations pass their own unit tests (Rust 9, Go 9, C++ 9).
- All three implementations produce byte-identical snapshot hashes for both frozen scenarios.
- The frozen hashes match the golden values committed in source.
Anything less and the bench numbers are meaningless. You can't claim "Rust does X ops/sec on this workload" if it is not doing the same work as the Go and C++ versions.
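The second and third conditions reduce to one comparison: every implementation's hash must equal the committed golden value, which also makes the implementations equal to each other. A minimal sketch of that check follows. The helper name `check_against_golden` and the idea of passing hashes as hex strings are assumptions for illustration, not the lab's actual script logic.

```rust
/// Hypothetical helper (not taken from the lab's source): checks that every
/// implementation reported the golden hash for one scenario.
/// `results` pairs an implementation name with the hex digest it printed.
fn check_against_golden(golden: &str, results: &[(&str, &str)]) -> Result<(), String> {
    for &(name, got) in results {
        if got != golden {
            // Mirrors the "got X, want Y" style of message used by the tests.
            return Err(format!("{}: got {}, want {}", name, got, golden));
        }
    }
    // All digests equal the golden value, so they also equal each other.
    Ok(())
}
```

Comparing each digest against the golden value, rather than pairwise, means a failure names the implementation that drifted.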
How to verify
```bash
bash scripts/verify.sh
bash scripts/cross_test.sh
```
Each script exits non-zero on failure. On success, verify.sh prints === OK === and cross_test.sh prints === ALL OK ===.
Expected last lines:
```
$ bash scripts/verify.sh
…
=== OK ===
$ bash scripts/cross_test.sh
…
=== ALL OK ===
```
What each unit test pins
| Test | Pins |
|---|---|
| sha256_vectors | SHA-256 against known empty and "abc" vectors |
| splitmix64_known | splitmix64(0) == 0x8b57dafca0cee644 |
| incr_accumulates | incr adds to existing entries, creates new ones, bumps total_ops |
| decr_saturates_and_removes | decrement past zero removes the entry |
| decr_on_missing_is_visible_op | decr on a missing key bumps total_ops but does not create the entry |
| snapshot_layout_two_keys | exact wire bytes of a 2-key snapshot |
| workload_determinism | same seed twice → same snapshot bytes |
| scenario_a_frozen / scenario_b_frozen | frozen golden hashes per scenario |
The frozen-scenario tests are the highest-value tests in the lab. Any silent change to the wire format, the workload, or SplitMix64 breaks both of them with a clear "got X, want Y" message in the failing language's test output.
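The incr and decr rows of the table describe behavior that is easy to get subtly different across three languages. The sketch below shows one way those semantics could look in Rust. The `Store` type, its field names, and the choice of `BTreeMap` are assumptions for illustration; the sketch also assumes an entry that reaches exactly zero is removed, and it does not address overflow on incr.

```rust
use std::collections::BTreeMap;

/// Hypothetical counter store; the lab's real type may differ.
#[derive(Default)]
struct Store {
    entries: BTreeMap<String, u64>,
    total_ops: u64,
}

impl Store {
    /// Adds `by` to an existing entry or creates it, and counts the op.
    fn incr(&mut self, key: &str, by: u64) {
        self.total_ops += 1;
        *self.entries.entry(key.to_string()).or_insert(0) += by;
    }

    /// Subtracts `by`, saturating at zero; an entry that hits zero is removed.
    /// The op is counted even when the key is missing.
    fn decr(&mut self, key: &str, by: u64) {
        self.total_ops += 1;
        let remaining = match self.entries.get(key) {
            Some(&v) => v.saturating_sub(by),
            None => return, // missing key: visible in total_ops, no entry created
        };
        if remaining == 0 {
            self.entries.remove(key);
        } else {
            self.entries.insert(key.to_string(), remaining);
        }
    }
}
```

Counting the op before looking up the key is what makes a decr on a missing key visible in the snapshot without creating an entry.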
Manual sanity checks
```bash
# bytes of the smallest meaningful snapshot
./target/release/benchctl hash workload --seed 0 --ops 0 --keys 1 --scenario default
# expected: sha256 of MAGIC || 0_u64 || 0_u32 = the empty-store hash

# determinism
./target/release/benchctl hash workload --seed 42 --ops 500 --keys 32 --scenario default
./target/release/benchctl hash workload --seed 42 --ops 500 --keys 32 --scenario default
# should print the same hex twice
```
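For the first check, the bytes being hashed are MAGIC || 0_u64 || 0_u32. The sketch below builds that buffer so the layout is concrete. The `MAGIC` value here is a placeholder, since the real constant is not shown in this document, and reading the two zero fields as an op counter and an entry count is an assumption. Hashing is left out because the Rust standard library has no SHA-256.

```rust
/// Placeholder only: the lab's real MAGIC bytes are not shown in this document.
const MAGIC: &[u8] = b"XXXX";

/// Builds the bytes of an empty-store snapshot: MAGIC || 0_u64 || 0_u32.
fn empty_snapshot() -> Vec<u8> {
    let mut buf = Vec::with_capacity(MAGIC.len() + 12);
    buf.extend_from_slice(MAGIC);
    buf.extend_from_slice(&0u64.to_le_bytes()); // 0_u64 (assumed: total_ops)
    buf.extend_from_slice(&0u32.to_le_bytes()); // 0_u32 (assumed: entry count)
    buf
}
```

Feeding these bytes to any SHA-256 implementation, with the real MAGIC substituted, should reproduce the empty-store hash printed by the command above.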
What is not verified by these tests
- That bench reports the correct throughput. It is impossible to verify a wall-clock number from a test. The bench harness has a distinct= field as a structural sanity check, but the numeric throughput is left to the operator to inspect.
- That the implementations are equally fast; we only check that they are equally correct. The whole point of the lab is to make speed comparisons honest by first making correctness identical.
- That the implementations would still match on a 32-bit or big-endian platform. The wire format pins little-endian; on a hypothetical big-endian build we'd need a byte-swap in put_u64_le etc.
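On the endianness point, a writer built on explicit little-endian conversion avoids the host-order dependency. The sketch below shows the idea in Rust, where `u64::to_le_bytes` returns little-endian bytes on any host. The function bodies are an assumption about how a helper named put_u64_le might be written, and `put_u32_le` is a hypothetical sibling; an implementation that copies raw integer memory would still need the byte-swap mentioned above.

```rust
/// Appends `v` to `buf` in little-endian order, independent of host endianness.
/// Sketch only: the lab's real put_u64_le may be implemented differently.
fn put_u64_le(buf: &mut Vec<u8>, v: u64) {
    buf.extend_from_slice(&v.to_le_bytes());
}

/// Hypothetical 32-bit sibling, following the same pattern.
fn put_u32_le(buf: &mut Vec<u8>, v: u32) {
    buf.extend_from_slice(&v.to_le_bytes());
}
```

For example, writing 0x0102030405060708 with `put_u64_le` appends the bytes 08 07 06 05 04 03 02 01, least significant byte first.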