Broader Ideas — db-21

A short scrapbook of "what would I add next if this were a real engine?"

1. Tombstone Garbage Collection

Right now a range tombstone lives forever — it survives every compaction and is copied verbatim into the merged output. A production engine drops a tomb when it's certain no shadowed Puts remain below it. The standard test: the tomb's end_key is < smallest_key of every SST below it. Implementing this would require tracking the "sequence number" or generation of each record, which we deliberately omitted.

2. Coalescing Overlapping Tombstones

Two tombs [k0, k5) and [k3, k7) are equivalent to [k0, k7). Merging them at compaction time shrinks the per-Get cost (the active vector stays smaller). We didn't do it because the canonical-bytes test would then need to specify a normalisation policy (sort by start? coalesce overlaps? coalesce adjacencies?). Each choice is fine, but the choice itself becomes part of the wire format.

3. Multi-Level Layout

The lab keeps SSTs as a flat list. RocksDB has L0 (overlapping ranges allowed, newest writes land here) plus L1..Ln (each level non-overlapping, ratio'd in size). Universal compaction roughly corresponds to a degenerate "L0 only" mode, while leveled compaction is its own beast (each compaction picks one L_i SST and the L_{i+1} SSTs that overlap it). A natural follow-up would implement leveled compaction and re-run the same three fixtures with new hashes.

4. Bloom Quality

A 64-bit single-hash bloom is intentionally bad — it exists to make the test for "bloom misses still must walk older SSTs for range tombstones" trigger reliably on tiny inputs. Real engines use per-SST blooms sized to ~10 bits per key with k≈7 hash functions, giving a false-positive rate ~1%. The change is purely numeric; the wire format would absorb a longer bitmap as a length-prefixed byte string.

5. Snapshot Reads / MVCC

If each entry carried a seq: u64, Get(key, at_seq) would walk SSTs the same way but only consider entries with entry.seq ≤ at_seq. Range tombstones would gain a seq too. The merge step would need to keep older versions until they're below the oldest live snapshot.

6. Why Not Implement These Now?

Each one would multiply the size of the wire format and the surface area of the cross-language tests. The lab's claim is that two ideas (range tombstones, two compaction policies) are enough to stress-test cross- language byte equivalence to the point of being convincing. Adding a third without first writing it down somewhere else would dilute the lesson.

Distributed Systems Engineer — Build Databases & Consensus From Scratch