Verification — db-21

1. What "Verified" Means Here

Two distinct claims:

(1) Per-language correctness: the unit tests in each language pass.
(2) Cross-language byte equivalence: three independent implementations produce identical canonical wire dumps for three fixed workloads, checked by comparing SHA-256 digests of the dumps.

Both must hold. (1) without (2) lets each port drift independently into a "self-consistent but wrong" state.
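
To make "canonical wire dump" concrete, here is a minimal sketch of the kind of encoding rules byte equivalence depends on: fixed-width little-endian length prefixes, floats written as their IEEE 754 bit pattern, and a deterministic iteration order. The layout (magic, ratio, entry count, sorted pairs) and the names `put_bytes`, `put_f64`, and `dump` are assumptions for illustration, not the actual db-21 format.

```rust
use std::collections::BTreeMap;

/// Append a length-prefixed byte string: u32 little-endian length, then the bytes.
fn put_bytes(out: &mut Vec<u8>, data: &[u8]) {
    let len = data.len() as u32;
    assert_eq!(len as usize, data.len(), "field does not fit a u32 length prefix");
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(data);
}

/// Append an f64 as its IEEE 754 bit pattern, little-endian (never as text).
fn put_f64(out: &mut Vec<u8>, x: f64) {
    out.extend_from_slice(&x.to_bits().to_le_bytes());
}

/// Hypothetical dump layout: magic, ratio, entry count, then key/value pairs.
fn dump(ratio: f64, entries: &BTreeMap<Vec<u8>, Vec<u8>>) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(b"DSEADV21");
    put_f64(&mut out, ratio);
    let n = entries.len() as u32;
    assert_eq!(n as usize, entries.len(), "entry count does not fit a u32");
    out.extend_from_slice(&n.to_le_bytes());
    // BTreeMap iterates in ascending key order, so the output is deterministic.
    for (k, v) in entries {
        put_bytes(&mut out, k);
        put_bytes(&mut out, v);
    }
    out
}

fn main() {
    let mut m = BTreeMap::new();
    m.insert(b"b".to_vec(), b"2".to_vec());
    m.insert(b"a".to_vec(), b"1".to_vec());
    let d = dump(0.5, &m);
    assert_eq!(&d[..8], b"DSEADV21");
    println!("{} bytes", d.len());
}
```

Insertion order does not affect the result here because the map is ordered; a hash map in the same position would make the dump depend on iteration order.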

2. Per-Language Unit Tests

Ten tests, mirrored across all three ports:

#	Name	Asserts
1	`bloom_hit_miss`	Bloom positive case + a definite negative
2	`bounds_short_circuit`	`Get` skips SST when key outside `[smallest, largest]`
3	`range_tomb_hides_older_put`	Newer range tomb shadows older Put
4	`range_tomb_respects_newer_put`	Older range tomb does not shadow newer Put
5	`tiered_picks_prefix`	`compact_size_tiered` picks a prefix of length ≥2
6	`universal_picks_run`	`compact_universal` picks a contiguous run of length ≥3
7	`noop_compaction`	Returns `false` when no eligible group
8	`dump_determinism`	Two dumps of the same state are equal; magic is `DSEADV21`
9	`workload_all_scenarios`	All four scenarios produce non-empty dumps with correct magic
10	`dedup_keeps_last`	`build_sst` keeps the last Put per key
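
Tests 3 and 4 above pin down one visibility rule from both sides. The sketch below shows that rule in isolation, assuming each write carries a monotonically increasing sequence number and that a range tombstone covers the half-open interval `[start, end)`. The `RangeTomb` type and `is_hidden` function are hypothetical names, not taken from the ports.

```rust
/// Hypothetical range tombstone covering keys in [start, end) at sequence `seq`.
struct RangeTomb {
    start: Vec<u8>,
    end: Vec<u8>,
    seq: u64,
}

/// A Put is hidden only when some tombstone covers its key AND is strictly newer.
fn is_hidden(key: &[u8], put_seq: u64, tombs: &[RangeTomb]) -> bool {
    tombs.iter().any(|t| {
        t.start.as_slice() <= key && key < t.end.as_slice() && t.seq > put_seq
    })
}

fn main() {
    let tombs = vec![RangeTomb { start: b"b".to_vec(), end: b"d".to_vec(), seq: 5 }];
    // Older Put inside the range: shadowed (range_tomb_hides_older_put).
    assert!(is_hidden(b"c", 3, &tombs));
    // Newer Put inside the range: still visible (range_tomb_respects_newer_put).
    assert!(!is_hidden(b"c", 7, &tombs));
    // Key at the exclusive end: not covered.
    assert!(!is_hidden(b"d", 3, &tombs));
    println!("ok");
}
```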

./scripts/verify.sh
# == Rust ==
# 10 passed; 0 failed
# == Go ==
# ok      github.com/10xdev/dse/db21
# == C++ ==
# 1/1 Test #1: test_adv .........................   Passed
# === OK ===

3. Cross-Language Byte Equivalence

./scripts/cross_test.sh
# == build Rust ==
# == build Go ==
# == build C++ ==
# ok   fixture=A impl=rust fc2fe88978eb2d419a73a7a16fa9ec0695ad9a56cb3a31b0bf85c0a28d7c97d6
# ok   fixture=A impl=go   fc2fe88978eb2d419a73a7a16fa9ec0695ad9a56cb3a31b0bf85c0a28d7c97d6
# ok   fixture=A impl=cpp  fc2fe88978eb2d419a73a7a16fa9ec0695ad9a56cb3a31b0bf85c0a28d7c97d6
# ok   fixture=B impl=rust 05b07426e0da8ec2f1f8c81573dc275cd61cab9c19c93dc17c854456e441e7bb
# ok   fixture=B impl=go   05b07426e0da8ec2f1f8c81573dc275cd61cab9c19c93dc17c854456e441e7bb
# ok   fixture=B impl=cpp  05b07426e0da8ec2f1f8c81573dc275cd61cab9c19c93dc17c854456e441e7bb
# ok   fixture=C impl=rust 4ad255755dbfbaa40a842766656d0c0dbd6713b6a527ffea5a24fa35964d73e4
# ok   fixture=C impl=go   4ad255755dbfbaa40a842766656d0c0dbd6713b6a527ffea5a24fa35964d73e4
# ok   fixture=C impl=cpp  4ad255755dbfbaa40a842766656d0c0dbd6713b6a527ffea5a24fa35964d73e4
# === ALL OK ===
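
A digest comparison only says whether two dumps differ, not where. When a hash does not match, a small helper like the following can locate the first divergent offset between two dumps. This is an illustrative sketch; `first_diff` is a hypothetical name and is not part of `scripts/cross_test.sh`.

```rust
/// Returns None when the two dumps are byte-identical,
/// otherwise the offset of the first byte at which they differ.
fn first_diff(a: &[u8], b: &[u8]) -> Option<usize> {
    if let Some(i) = a.iter().zip(b.iter()).position(|(x, y)| x != y) {
        return Some(i);
    }
    // Common prefix is equal; a length mismatch means one dump is truncated.
    if a.len() != b.len() {
        Some(a.len().min(b.len()))
    } else {
        None
    }
}

fn main() {
    let rust_dump = b"DSEADV21\x02\x00\x00\x00";
    let go_dump = b"DSEADV21\x02\x00\x00\x00";
    match first_diff(rust_dump, go_dump) {
        None => println!("dumps are identical"),
        Some(off) => println!("dumps diverge at byte {}", off),
    }
}
```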

4. What Would Falsify The Claim

A non-exhaustive list of bugs the cross test would catch but a per-language test wouldn't:

Forgetting to encode the bloom bitmap as little-endian on a big-endian port.
Using the host integer width for length prefixes instead of `u32`.
Iterating a hash map at any point in `merge_run` (iteration order is non-deterministic across languages and across runs).
Encoding the ratio as the text "0.5" instead of its IEEE 754 bit pattern.
Compacting the longest run found so far as soon as it satisfies the threshold, instead of evaluating all runs and picking the globally longest one.
An off-by-one in `b = a + 1 + (r3 mod (keys-a))` for the range tombstone end key.
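
The last two items are logic bugs rather than encoding bugs, and they can be sketched in a few lines. The first function evaluates every run before choosing, with eligibility reduced to a per-element boolean flag; that is a simplification, since the real threshold predicate is not shown in this document, and the earliest-run tie-break is also an assumption. The second function computes the end index from the formula above. The names `longest_run` and `tomb_end` are hypothetical.

```rust
/// Half-open index range [start, end) of the globally longest run of `true`
/// flags with length >= min_len. The earliest run wins ties (an assumption).
/// Returns None when no run qualifies.
fn longest_run(eligible: &[bool], min_len: usize) -> Option<(usize, usize)> {
    let mut best: Option<(usize, usize)> = None;
    let mut i = 0;
    while i < eligible.len() {
        if !eligible[i] {
            i += 1;
            continue;
        }
        let start = i;
        while i < eligible.len() && eligible[i] {
            i += 1;
        }
        let len = i - start;
        let best_len = best.map_or(0, |(s, e)| e - s);
        // Only replace the candidate after the whole run has been measured.
        if len >= min_len && len > best_len {
            best = Some((start, i));
        }
    }
    best
}

/// End key index for a range tombstone starting at index `a`, following
/// b = a + 1 + (r3 mod (keys - a)). Requires a < keys; the result lies in
/// [a + 1, keys].
fn tomb_end(a: u64, r3: u64, keys: u64) -> u64 {
    assert!(a < keys, "start index must be below the key count");
    a + 1 + (r3 % (keys - a))
}

fn main() {
    let flags = [true, true, false, true, true, true, true];
    assert_eq!(longest_run(&flags, 3), Some((3, 7)));
    assert_eq!(tomb_end(3, 6, 10), 10);
    println!("ok");
}
```

Dropping the `+ 1` in `tomb_end` would allow `b == a`, and using `keys - a + 1` as the modulus would allow `b == keys + 1`. Either variant can still pass tests within one language, yet both change the generated workload and therefore the dump bytes.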

5. Reproducibility Bar

Platform and toolchain: macOS arm64, AppleClang 16, Go 1.22, Rust stable (rustc 1.7x).
No external dependencies (no sha2 crate, no golang.org/x/..., no OpenSSL): every implementation is self-contained, so the verification step is reproducible offline.
All three hashes are pinned in scripts/cross_test.sh and reproduced in this document for paper-trail purposes.