Step 02 — Tiered and Universal Compaction

Goal

Implement two compaction policies as deterministic functions of the current SST sequence and the configured ratio, so that the same input list of SSTs always produces the same output list.

Size-Tiered

Pick the longest prefix L ∈ [2, n-1] of ssts such that Σ size(ssts[0..L]) ≤ ratio · size(ssts[L]).

chosen     = 0
prefix_sum = 0
for L in 1..=n-1:
    prefix_sum += size(ssts[L-1])
    if L >= 2 and prefix_sum <= ratio * size(ssts[L]):
        chosen = L
if chosen < 2: return false
merged = merge_run(ssts[0..chosen])
ssts   = [merged] ++ ssts[chosen..]
return true

The merged SST goes at the newest position, because that's where the newly-merged data conceptually lives.

Universal

Pick the longest contiguous run [i, i+L) with L ≥ 3 such that Σ size(run) ≤ ratio · size(ssts[i+L]) (i.e. the run must be followed by something at least 1/ratio times its total size). Ties broken by smaller i.

best_i, best_L = -1, 0
for i in 0..n:
    if i + 3 >= n: break
    run_sum = 0
    for L in 1..=n-1-i:
        run_sum += size(ssts[i+L-1])
        follow = i + L
        if follow >= n: break
        if L >= 3 and run_sum <= ratio * size(ssts[follow]):
            if L > best_L: best_i, best_L = i, L
if best_L == 0: return false
merged = merge_run(ssts[best_i..best_i+best_L])
ssts   = ssts[..best_i] ++ [merged] ++ ssts[best_i+best_L..]
return true

Merge Semantics (shared by both)

Walk the run newest → oldest:

  1. Append the SST's range tombs to out_tombs.
  2. For each entry:
    • skip if seen[key] (newer-wins),
    • skip if covered by any tomb in active,
    • otherwise emit; mark seen.
  3. Append the SST's range tombs to active (so they apply to older SSTs in the run).

After the walk, sort out_entries by key (for determinism across hash-map iteration orders) and recompute smallest, largest, bloom.

Why the Minimum Lengths

  • Tiered's L ≥ 2 keeps it from being "merge one SST with nothing", which would just rewrite the SST.
  • Universal's L ≥ 3 is RocksDB's actual choice; smaller runs are too frequent to amortise the I/O.

Done When

  • tiered_picks_prefix passes (size-tiered selects the 3-small-SST prefix in front of a big SST and produces a 2-SST result).
  • universal_picks_run passes (universal selects the 3-small run between two big SSTs).
  • noop_compaction passes (both policies return false on a 2-SST tree).
  • All three canonical fixtures still match after this step.