db-09 step 01 — The write path
Goal
Implement Db::open(dir), put(k,v), delete(k), and Write(WriteBatch)
such that every successful return has been durably persisted to the WAL.
Tasks
- Pick the on-disk layout (
MANIFEST,wal.log,sst-NNNNNN.sst). - Define the WriteBatch wire format. Use a single encoder/decoder so the in-memory batch representation and the WAL record payload are identical bytes.
- On
open(dir):mkdir -pthe directory.- Read
MANIFEST(if it exists) one line at a time; collect SST ids newest-first. - Open each SSTable in order; track
max_id. - Replay
wal.logwithWalIter: decode each record as a batch and apply to a fresh memtable. - Open the WAL for writes; set
next_id = max_id + 1.
- Implement
Write(batch):- Reject the empty batch as a no-op (don't write an empty WAL record).
bytes = batch.encode(); wal.append(bytes); wal.sync();- Apply the batch to the memtable.
putanddeleteare thin wrappers that build a one-op batch.
Acceptance
Inline unit tests:
batch_roundtrip— encode → decode round-trip preserves three representative ops (Put, Delete, Put-with-empty-key).batch_rejects_trailing— decoding rejects a one-byte-suffix-corrupted payload.put_get_memtable—put("a","1")followed byget("a")returnsSome("1");get("missing")returnsNone.delete_shadows—putthendeletemakesgetreturnNone.
All four green in Rust, Go, and C++.
Discussion prompts
- Why must
wal.sync()happen before applying to the memtable, not after? - What invariant would break if we let
Writeproceed for an empty batch? - How would a group-commit optimization preserve the same durability
guarantee while batching multiple
Writecalls behind a singlefsync?