Step 02 — Tiered and Universal Compaction
Goal
Implement two compaction policies as deterministic functions of the
current SST sequence and the configured ratio, so that the same input
list of SSTs always produces the same output list.
Size-Tiered
Pick the longest prefix L ∈ [2, n-1] of ssts such that
Σ size(ssts[0..L]) ≤ ratio · size(ssts[L]).
chosen = 0
prefix_sum = 0
for L in 1..=n-1:
prefix_sum += size(ssts[L-1])
if L >= 2 and prefix_sum <= ratio * size(ssts[L]):
chosen = L
if chosen < 2: return false
merged = merge_run(ssts[0..chosen])
ssts = [merged] ++ ssts[chosen..]
return true
The merged SST goes at the newest position, because that's where the newly-merged data conceptually lives.
Universal
Pick the longest contiguous run [i, i+L) with L ≥ 3 such that
Σ size(run) ≤ ratio · size(ssts[i+L]) (i.e. the run must be followed by
something at least 1/ratio times its total size). Ties broken by smaller
i.
best_i, best_L = -1, 0
for i in 0..n:
if i + 3 >= n: break
run_sum = 0
for L in 1..=n-1-i:
run_sum += size(ssts[i+L-1])
follow = i + L
if follow >= n: break
if L >= 3 and run_sum <= ratio * size(ssts[follow]):
if L > best_L: best_i, best_L = i, L
if best_L == 0: return false
merged = merge_run(ssts[best_i..best_i+best_L])
ssts = ssts[..best_i] ++ [merged] ++ ssts[best_i+best_L..]
return true
Merge Semantics (shared by both)
Walk the run newest → oldest:
- Append the SST's range tombs to
out_tombs. - For each entry:
- skip if
seen[key](newer-wins), - skip if covered by any tomb in
active, - otherwise emit; mark
seen.
- skip if
- Append the SST's range tombs to
active(so they apply to older SSTs in the run).
After the walk, sort out_entries by key (for determinism across hash-map
iteration orders) and recompute smallest, largest, bloom.
Why the Minimum Lengths
- Tiered's
L ≥ 2keeps it from being "merge one SST with nothing", which would just rewrite the SST. - Universal's
L ≥ 3is RocksDB's actual choice; smaller runs are too frequent to amortise the I/O.
Done When
tiered_picks_prefixpasses (size-tiered selects the 3-small-SST prefix in front of a big SST and produces a 2-SST result).universal_picks_runpasses (universal selects the 3-small run between two big SSTs).noop_compactionpasses (both policies returnfalseon a 2-SST tree).- All three canonical fixtures still match after this step.