db-07 Verification
Ten properties, three implementations each.
V1 — Empty inputs
compact([], drop=false) → exactly 36 bytes, identical to
SstWriter::new().finish() from db-06. Same for drop=true.
V2 — Single input passthrough
compact([A], drop=false) reproduces A's logical contents (same entries in
same order). The bytes are not necessarily identical to A (block boundaries
may differ if A had unusual block-target settings), but the output's
entries() matches A's entries() exactly.
V3 — Newest wins on overlap
Inputs A = [("k", V, "new")], B = [("k", V, "old")]. Output contains
("k", V, "new") only. Output entry count = 1.
V4 — Tombstones win over older values
Inputs A = [("k", T)], B = [("k", V, "v")]. With drop=false, output
contains ("k", T). With drop=true, output is empty.
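The newest-wins and tombstone rules in V3 and V4 can be captured in a small reference model that a test might compare real output against. This is a minimal sketch, not the project's implementation: `Version`, `model_compact`, and `drop_tombstones` are hypothetical names, and it assumes `drop=true` means tombstones are omitted from the output, as V4 describes.

```rust
use std::collections::BTreeMap;

/// One logical version of a key: a value or a tombstone (V / T above).
/// Hypothetical type for illustration only.
#[derive(Clone, Debug, PartialEq)]
enum Version {
    Value(Vec<u8>),
    Tombstone,
}

/// Reference model of the compaction semantics, not the real merger.
/// `inputs` is ordered newest first, as in the properties above.
fn model_compact(
    inputs: &[Vec<(String, Version)>],
    drop_tombstones: bool,
) -> Vec<(String, Version)> {
    let mut merged: BTreeMap<String, Version> = BTreeMap::new();
    // Walk oldest to newest so that newer versions overwrite older ones.
    for input in inputs.iter().rev() {
        for (key, version) in input {
            merged.insert(key.clone(), version.clone());
        }
    }
    // BTreeMap iterates in key order, so the output is sorted.
    merged
        .into_iter()
        .filter(|(_, v)| !(drop_tombstones && *v == Version::Tombstone))
        .collect()
}
```

Because the model holds everything in memory and ignores block layout, it only checks logical contents; byte-level properties such as V7 need the real writer.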
V5 — Disjoint keys interleave correctly
Inputs A = [("b", V, "x"), ("d", V, "x")], B = [("a", V, "y"), ("c", V, "y")].
Output: ("a", V, "y"), ("b", V, "x"), ("c", V, "y"), ("d", V, "x") — sorted,
no duplicates, every entry from its sole source.
V6 — Three-way merge handles transitive dedupe
Inputs (newest → oldest):
A: [("k", V, "v1")]
B: [("k", V, "v2"), ("z", V, "Z")]
C: [("a", V, "A"), ("k", V, "v3")]
Output: [("a", V, "A"), ("k", V, "v1"), ("z", V, "Z")]. "k" resolves to A's value ("v1").
Both B and C must advance past their "k" entries even though neither wins.
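The cursor behavior that V5 and V6 exercise can be sketched as a toy k-way merge. It assumes every input is sorted by key and that inputs are passed newest first. `merge_newest_wins` is a hypothetical name, and values are plain strings with no tombstones to keep the sketch short.

```rust
/// Toy k-way merge over sorted inputs, newest input first.
/// Illustrative only; the real merger works on SSTable iterators.
fn merge_newest_wins(inputs: &[Vec<(&str, &str)>]) -> Vec<(String, String)> {
    let mut cursors = vec![0usize; inputs.len()];
    let mut out = Vec::new();
    loop {
        // Smallest key currently under any cursor that is not exhausted.
        let min_key = inputs
            .iter()
            .zip(&cursors)
            .filter_map(|(input, &pos)| input.get(pos).map(|(k, _)| *k))
            .min();
        let Some(key) = min_key else { break };

        let mut winner: Option<&str> = None;
        for (input, pos) in inputs.iter().zip(cursors.iter_mut()) {
            if let Some((k, v)) = input.get(*pos) {
                if *k == key {
                    // The first (newest) input holding the key supplies the value.
                    if winner.is_none() {
                        winner = Some(*v);
                    }
                    // Every input holding the key advances, winner or not.
                    *pos += 1;
                }
            }
        }
        let value = winner.expect("min key came from at least one input");
        out.push((key.to_string(), value.to_string()));
    }
    out
}
```

If the `*pos += 1` ran only for the winning input, the losing inputs would offer the same key again on the next round, which is the failure mode V8 guards against.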
V7 — Canonical scenario byte-identity
Build newer.sst and older.sst as described in observation.md. Compact in
each language with drop=false. Assert sha256 equality across all three
languages.
V8 — SstWriter rejects an internally broken merge
If the merger forgets to drain duplicate cursors and calls
SstWriter::add with the same key twice, the writer returns
Error::Unsorted. The test constructs two inputs with overlapping keys
and verifies that a correct compaction succeeds, i.e. that this error
never surfaces during a valid compaction.
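The writer-side guard that V8 relies on can be illustrated with a stand-in that tracks only key order. `ToyWriter` and `WriteError` are hypothetical; the real `SstWriter::add` signature from db-06 is not shown here. The sketch assumes keys must be strictly increasing, which is why a repeated key is rejected.

```rust
#[derive(Debug, PartialEq)]
enum WriteError {
    Unsorted,
}

/// Hypothetical stand-in for the db-06 writer: it records keys and
/// enforces ordering, nothing else.
#[derive(Default)]
struct ToyWriter {
    keys: Vec<String>,
}

impl ToyWriter {
    fn add(&mut self, key: &str) -> Result<(), WriteError> {
        // Keys must be strictly increasing, so a duplicate is rejected too.
        if let Some(last) = self.keys.last() {
            if key <= last.as_str() {
                return Err(WriteError::Unsorted);
            }
        }
        self.keys.push(key.to_string());
        Ok(())
    }
}
```

A merger that emits each key exactly once, in ascending order, never triggers the error, which is what the V8 test asserts indirectly.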
V9 — Output is a valid db-06 SSTable
The bytes returned by compact open cleanly via SstReader::open and
get(key) returns the merged version. This is the round-trip test.
V10 — Idempotent re-compaction
compact([compact([A, B])]) is byte-identical to compact([A, B]).
Compaction of an already-compacted file is a no-op modulo metadata.
Cross-test (scripts/cross_test.sh)
Goes beyond V7 to also run a 3×3 reader/writer matrix on the merged file (byte-identity already implies this, but it confirms the output is portable):
- Build newer.sst and older.sst via db-05 → db-06 (Rust binaries; db-06 already proved byte-identity).
- Each language runs compact OUT.sst newer.sst older.sst.
- Assert sha256 match across the three OUT files.
- Each language reads each OUT file with sstable iter (from db-06) and asserts the iter output equals a reference (the Rust read of its own OUT).
- Spot-check sstable get OUT.sst key10 → value: 4e45572d3130 ("NEW-10") in all three.
- Spot-check sstable get OUT.sst key5 → tombstone.
- Spot-check sstable get OUT.sst key50 → value: 4f4c442d3530 ("OLD-50").
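The hex strings in the spot-checks are the ASCII bytes of the expected values. A small helper makes that easy to confirm when reading test output. `decode_hex` is a hypothetical name and is not part of the sstable CLI.

```rust
/// Decode a hex string into bytes; returns None for odd length or
/// any character that is not a hex digit.
fn decode_hex(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect()
}
```

With this helper, `decode_hex("4e45572d3130")` yields the bytes of "NEW-10" and `decode_hex("4f4c442d3530")` yields the bytes of "OLD-50".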