db-07 Verification

Ten properties, three implementations each.

V1 — Empty inputs

compact([], drop=false) → exactly 36 bytes, identical to SstWriter::new().finish() from db-06. Same for drop=true.

V2 — Single input passthrough

compact([A], drop=false) reproduces A's logical contents (same entries in same order). The bytes are not necessarily identical to A (block boundaries may differ if A had unusual block-target settings), but the output's entries() matches A's entries() exactly.

V3 — Newest wins on overlap

Inputs (newest → oldest): A = [("k", V, "new")], B = [("k", V, "old")]. The output contains only ("k", V, "new"), so its entry count is 1.

V4 — Tombstones win over older values

Inputs (newest → oldest): A = [("k", T)], B = [("k", V, "v")]. With drop=false, the output contains only ("k", T). With drop=true, the output is empty.

V5 — Disjoint keys interleave correctly

Inputs A = [("b", V, "x"), ("d", V, "x")], B = [("a", V, "y"), ("c", V, "y")]. Output: ("a", V, "y"), ("b", V, "x"), ("c", V, "y"), ("d", V, "x"). The result is sorted and has no duplicates, and each entry comes from the only input that contains its key.

V6 — Three-way merge handles transitive dedupe

Inputs (newest → oldest):

A: [("k", V, "v1")]
B: [("k", V, "v2"), ("z", V, "Z")]
C: [("a", V, "A"), ("k", V, "v3")]

Output: [("a", V, "A"), ("k", V, "v1"), ("z", V, "Z")]. "k" resolves to A's value ("v1"). Both B and C must advance past their "k" entries even though neither wins.
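The merge rules in V3 through V6 can be sketched as a small in-memory model. This is not the db-07 implementation: the names compact_model, Entry, val and tomb are hypothetical, entries live in plain vectors rather than SSTable bytes, and the sketch assumes inputs are passed newest → oldest with each input sorted by key and free of duplicate keys.

```rust
/// Hypothetical in-memory stand-in for an SSTable entry payload.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Entry {
    Value(String),
    Tombstone,
}

/// Convenience constructors for (key, entry) pairs.
fn val(key: &str, value: &str) -> (String, Entry) {
    (key.to_string(), Entry::Value(value.to_string()))
}

fn tomb(key: &str) -> (String, Entry) {
    (key.to_string(), Entry::Tombstone)
}

/// Model of compaction. `inputs` is ordered newest -> oldest (assumption),
/// and each input is sorted by key with no duplicate keys (assumption).
fn compact_model(inputs: &[Vec<(String, Entry)>], drop_tombstones: bool) -> Vec<(String, Entry)> {
    let mut cursors = vec![0usize; inputs.len()];
    let mut out = Vec::new();
    loop {
        // Smallest key currently under any cursor.
        let min_key = inputs
            .iter()
            .zip(&cursors)
            .filter_map(|(table, &c)| table.get(c).map(|(k, _)| k))
            .min()
            .cloned();
        let Some(key) = min_key else { break };

        // The first (newest) input holding `key` wins. Every input holding
        // `key` advances its cursor, including the ones that lose.
        let mut winner: Option<Entry> = None;
        for (table, cursor) in inputs.iter().zip(cursors.iter_mut()) {
            if let Some((k, e)) = table.get(*cursor) {
                if *k == key {
                    if winner.is_none() {
                        winner = Some(e.clone());
                    }
                    *cursor += 1;
                }
            }
        }

        match winner {
            // A winning tombstone is dropped only when requested.
            Some(Entry::Tombstone) if drop_tombstones => {}
            Some(e) => out.push((key, e)),
            None => unreachable!("min_key came from one of the cursors"),
        }
    }
    out
}
```

Calling compact_model with the V6 inputs in the order A, B, C yields the three entries listed above, and the V4 inputs with drop_tombstones set to true yield an empty vector. Advancing every cursor that matches the current key is the step that keeps duplicate keys from ever reaching the writer.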

V7 — Canonical scenario byte-identity

Build newer.sst and older.sst as described in observation.md. Compact in each language with drop=false. Assert sha256 equality across all three languages.

V8 — SstWriter rejects an internally broken merge

If the merger forgets to drain duplicate cursors and calls SstWriter::add with the same key twice, the writer returns Error::Unsorted. The test constructs two inputs with overlapping keys and verifies that compaction succeeds, i.e., Error::Unsorted never surfaces during a valid compaction.
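The guard that V8 relies on can be illustrated with a toy writer. This is a sketch only: ModelWriter and WriterError are hypothetical stand-ins for the db-06 SstWriter and Error types, and it assumes the real writer requires strictly increasing keys, so that a repeated key is rejected the same way as an out-of-order one.

```rust
/// Hypothetical stand-in for the db-06 error type.
#[derive(Debug, PartialEq, Eq)]
enum WriterError {
    Unsorted,
}

/// Toy writer that only accepts strictly increasing keys (assumption about
/// the real SstWriter, based on the duplicate-key behaviour described above).
struct ModelWriter {
    last_key: Option<Vec<u8>>,
    entries: Vec<(Vec<u8>, Vec<u8>)>,
}

impl ModelWriter {
    fn new() -> Self {
        ModelWriter { last_key: None, entries: Vec::new() }
    }

    /// Appends an entry, rejecting any key that is not strictly greater
    /// than the previously accepted key.
    fn add(&mut self, key: &[u8], value: &[u8]) -> Result<(), WriterError> {
        if let Some(last) = &self.last_key {
            if key <= last.as_slice() {
                return Err(WriterError::Unsorted);
            }
        }
        self.last_key = Some(key.to_vec());
        self.entries.push((key.to_vec(), value.to_vec()));
        Ok(())
    }

    /// Number of entries accepted so far.
    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With this model, adding "a" and then "a" again fails on the second call, which is the failure a merger would hit if it emitted the same key from two inputs without advancing both cursors.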

V9 — Output is a valid db-06 SSTable

The bytes returned by compact open cleanly via SstReader::open, and get(key) returns the merged version of each key. This is the round-trip test.

V10 — Idempotent re-compaction

compact([compact([A, B])]) is byte-identical to compact([A, B]) when both calls use the same drop setting. Compacting an already-compacted file is a no-op modulo metadata.
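The shape of this check can be written independently of the real compactor. The sketch below is illustrative: recompaction_is_identity and toy_compact are hypothetical names, and toy_compact (concatenate, sort, dedupe bytes) merely stands in for a compaction function that maps input files to output bytes.

```rust
/// Returns true if compacting the output of `compact` a second time
/// reproduces it byte for byte.
fn recompaction_is_identity<F>(compact: F, inputs: &[Vec<u8>]) -> bool
where
    F: Fn(&[Vec<u8>]) -> Vec<u8>,
{
    let once = compact(inputs);
    // Feed the single compacted output back in as the only input.
    let twice = compact(std::slice::from_ref(&once));
    once == twice
}

/// Toy stand-in for compaction: concatenate all inputs, sort, dedupe.
/// It is idempotent, which is the only property the check needs here.
fn toy_compact(inputs: &[Vec<u8>]) -> Vec<u8> {
    let mut all: Vec<u8> = inputs.iter().flatten().copied().collect();
    all.sort_unstable();
    all.dedup();
    all
}
```

A compactor that stamped a timestamp or a fresh sequence number into its output would fail this check, which is why the property is worth asserting on raw bytes rather than on entries().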

Cross-test (scripts/cross_test.sh)

Goes beyond V7 by also running a 3×3 reader/writer matrix on the merged file. Byte-identity already implies the matrix passes, but running it confirms the output is portable:

  1. Build newer.sst and older.sst via db-05 → db-06 (Rust binaries; db-06 already proved byte-identity).
  2. Each language runs compact OUT.sst newer.sst older.sst.
  3. Assert sha256 match across the three OUT files.
  4. Each language reads each OUT file with sstable iter (from db-06) and asserts the iter output equals a reference (the Rust read of its own OUT).
  5. Spot-check sstable get OUT.sst key10 → value: 4e45572d3130 ("NEW-10") in all three.
  6. Spot-check sstable get OUT.sst key5 → tombstone.
  7. Spot-check sstable get OUT.sst key50 → value: 4f4c442d3530 ("OLD-50").
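The hex strings in the spot-checks are the lowercase hex encoding of the ASCII values. A minimal sketch for checking them by hand follows; hex_encode and hex_decode are hypothetical helpers, not part of the sstable tool.

```rust
/// Encodes bytes as lowercase hex, two digits per byte.
fn hex_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Decodes a hex string into bytes. Returns None for odd-length input
/// or for any character that is not an ASCII hex digit.
fn hex_decode(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| s.get(i..i + 2).and_then(|pair| u8::from_str_radix(pair, 16).ok()))
        .collect()
}
```

Encoding the bytes of "NEW-10" gives 4e45572d3130, and decoding 4f4c442d3530 gives the bytes of "OLD-50", matching the expected values in steps 5 and 7.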