Verification — db-05 LSM MemTable
Unit tests (per language)
| ID | Test name | What it asserts |
|---|---|---|
| V1 | empty_encode_decode | MemTable::new().encode() → 8 bytes MMT1\x00\x00\x00\x00; decode round-trips to an empty table. |
| V2 | put_then_get | After put("k","v"), get("k") returns Value("v"). |
| V3 | overwrite_replaces | Two puts on the same key keep only the latest value; len() stays at 1. |
| V4 | delete_writes_tombstone | After put("k","v") then del("k"), get("k") returns Tombstone (not None). |
| V5 | iter_byte_lex_order | Insert keys in random order; iteration yields them sorted byte-lex ("" first, \x00 next, etc.). |
| V6 | encode_decode_round_trip | Build a 50-entry table with a mix of values and tombstones; encode → decode → every entry matches and len() is preserved. |
| V7 | size_bytes_matches_encode | For any table, size_bytes() == encode().len(). |
| V8 | decoder_rejects_bad_magic | decode(b"XXX1...") returns Err. |
| V9 | decoder_rejects_truncation | Truncate a valid dump at every byte boundary; decode must fail cleanly (no panic). |
| V10 | decoder_rejects_unsorted_keys | Hand-craft a dump where keys go ["b","a"]; decoder rejects. |
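
The in-memory behaviour asserted by V2 through V5 can be sketched in Rust as below. This is a minimal illustration rather than the project's implementation: it assumes the table is backed by a `BTreeMap` keyed on raw bytes, and the `Entry` type, the `Option<&Entry>` return shape, and the method signatures are hypothetical.

```rust
use std::collections::BTreeMap;

/// What a stored key maps to: a live value or a deletion marker.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Entry {
    Value(Vec<u8>),
    Tombstone,
}

/// Hypothetical in-memory table. BTreeMap<Vec<u8>, _> iterates in
/// byte-lexicographic key order, which is the order V5 expects.
#[derive(Debug, Default)]
pub struct MemTable {
    map: BTreeMap<Vec<u8>, Entry>,
}

impl MemTable {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert or overwrite; overwriting an existing key leaves len() unchanged.
    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), Entry::Value(value.to_vec()));
    }

    /// Record a tombstone instead of removing the key.
    pub fn del(&mut self, key: &[u8]) {
        self.map.insert(key.to_vec(), Entry::Tombstone);
    }

    /// None means the key was never written; Some(Tombstone) means it was deleted.
    pub fn get(&self, key: &[u8]) -> Option<&Entry> {
        self.map.get(key)
    }

    pub fn len(&self) -> usize {
        self.map.len()
    }

    /// Yield (key, entry) pairs in byte-lexicographic key order.
    pub fn iter<'a>(&'a self) -> impl Iterator<Item = (&'a [u8], &'a Entry)> + 'a {
        self.map.iter().map(|(k, v)| (k.as_slice(), v))
    }
}
```

Because deletion inserts a `Tombstone` under the same key, a later `get` can distinguish "deleted" from "never written", which is the distinction V4 checks.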
Cross-language interop (scripts/cross_test.sh)
The same scripted scenario runs in each language:
new → bulk 100 → put "key50" "REPLACED"
→ del "key10"
→ put "" "empty-key-value"
→ del "key99"
→ save
This produces dumps rust.bin, go.bin, cpp.bin. The script then:
- SHA-256s all three dumps. All must match — this is the byte-identical gate.
- 3×3 reader matrix. Every reader (rust/go/cpp) runs iter on every writer's dump. The lines must be identical across all 9 combinations.
- get spot-check. Each reader queries key50, key10, key99, "", and an absent key nonexistent; results must be value: 5245504c41434544 (REPLACED), tombstone, tombstone, value: 656d7074792d6b65792d76616c7565, and absent, respectively, across all readers.
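
The spot-check output format can be sketched as follows. The three line shapes (`value: <hex>`, `tombstone`, `absent`) come from the expected results listed for the spot-check; the `Lookup` type and `render` function are hypothetical names used only for illustration.

```rust
/// Outcome of a lookup as the spot-check reports it (hypothetical type).
pub enum Lookup<'a> {
    Value(&'a [u8]),
    Tombstone,
    Absent,
}

/// Render one lookup as a single line, hex-encoding value bytes in lowercase.
pub fn render(lookup: &Lookup<'_>) -> String {
    match lookup {
        Lookup::Value(bytes) => {
            let hex: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
            format!("value: {}", hex)
        }
        Lookup::Tombstone => "tombstone".to_string(),
        Lookup::Absent => "absent".to_string(),
    }
}
```

Hex-encoding the value keeps the output comparable across readers even when values contain bytes that are not valid UTF-8.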
End-to-end verification (scripts/verify.sh)
bash scripts/verify.sh
Builds and tests all three languages, then runs the cross-test. The final line of output must be ALL GREEN.
Manual sanity checks
- memtable new /tmp/m && wc -c /tmp/m → exactly 8 bytes.
- memtable bulk /tmp/m 1000 && memtable size /tmp/m → matches 8 plus the sum over N = 0..999 of (9 + len("keyN") + len("valN")).
- Hexdump the first 16 bytes of any dump and confirm magic + count.
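
The expected size for the bulk check can be computed directly from the formula above. The sketch below assumes that bulk writes keys `key0`..`key{n-1}` and values `val0`..`val{n-1}` with no zero padding; `expected_bulk_size` is a hypothetical helper, not part of the CLI.

```rust
/// Expected dump size for a bulk load of `n` entries: 8 header bytes plus,
/// per entry, 9 bytes of overhead and the raw key and value bytes.
/// Assumes keys "key{i}" and values "val{i}" without zero padding.
pub fn expected_bulk_size(n: usize) -> usize {
    let header = 8;
    let entries: usize = (0..n)
        .map(|i| 9 + format!("key{}", i).len() + format!("val{}", i).len())
        .sum();
    header + entries
}
```

Under that naming assumption, `expected_bulk_size(1000)` evaluates to 20788: 8 header bytes, 9000 bytes of per-entry overhead, and 5890 bytes each for keys and values.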
What broken looks like
| Symptom | Diagnostic |
|---|---|
| decode accepts b"\x00\x00\x00\x00" (no magic check) | Add magic test V8. |
| Two readers print different iter output for the same dump | Either type-byte misplaced, or one language is comparing by string instead of bytes (UTF-8 vs raw). |
| len() differs across langs after the same script | Go's map+sort path lost a duplicate; check overwrite path. |
| Dump grows monotonically after del | Tombstone path is creating a new entry under a different key; check key equality. |
| Random crash in C++ on decode of truncated input | Missing length check before memcpy; bounds-check every read. |
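
Several of these failures (missing magic check, unchecked reads, unsorted keys) are decoder problems, so a bounds-checked decoder is sketched below. The wire layout is an assumption, not taken from the spec: the `MMT1` magic, a little-endian u32 count, then per entry one type byte (0 = value, 1 = tombstone), a u32 key length, the key, a u32 value length, and the value. That layout gives the 8-byte header and 9 bytes of per-entry overhead used in the size check, but the byte order, field order, and type-byte values here are guesses. `DecodeError` and `Reader` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    BadMagic,
    Truncated,
    BadType(u8),
    UnsortedKeys,
    TrailingBytes,
}

/// Cursor that refuses to read past the end of the input.
struct Reader<'a> {
    buf: &'a [u8],
}

impl<'a> Reader<'a> {
    /// Return the next `n` bytes, or Truncated if fewer remain.
    fn take(&mut self, n: usize) -> Result<&'a [u8], DecodeError> {
        if self.buf.len() < n {
            return Err(DecodeError::Truncated);
        }
        let (head, tail) = self.buf.split_at(n);
        self.buf = tail;
        Ok(head)
    }

    /// Read a u32 (little-endian is an assumption of this sketch).
    fn u32_le(&mut self) -> Result<u32, DecodeError> {
        let b = self.take(4)?;
        Ok(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    }
}

/// Decode a dump into (key, Some(value)) pairs, with None marking a tombstone.
pub fn decode(input: &[u8]) -> Result<Vec<(Vec<u8>, Option<Vec<u8>>)>, DecodeError> {
    let mut r = Reader { buf: input };
    if r.take(4)? != &b"MMT1"[..] {
        return Err(DecodeError::BadMagic);
    }
    let count = r.u32_le()?;
    // Do not pre-allocate from `count`: it is untrusted input.
    let mut out: Vec<(Vec<u8>, Option<Vec<u8>>)> = Vec::new();
    for _ in 0..count {
        let kind = r.take(1)?[0];
        let key_len = r.u32_le()? as usize;
        let key = r.take(key_len)?;
        let val_len = r.u32_le()? as usize;
        let val = r.take(val_len)?;
        // Keys must be strictly increasing in byte order.
        if let Some((prev, _)) = out.last() {
            if prev.as_slice() >= key {
                return Err(DecodeError::UnsortedKeys);
            }
        }
        let value = match kind {
            0 => Some(val.to_vec()),
            1 => None,
            other => return Err(DecodeError::BadType(other)),
        };
        out.push((key.to_vec(), value));
    }
    if !r.buf.is_empty() {
        return Err(DecodeError::TrailingBytes);
    }
    Ok(out)
}
```

Routing every read through `take` means a truncated dump surfaces as an error at the first short read, which is the property V9 exercises by cutting a valid dump at every byte boundary.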