db-10 — Analysis
What had to be decided before any code was written, and why each decision shapes the next 5 labs.
Required invariants
- Search-tree order. For every internal node with keys
k0 < k1 < … < kn-1and childrenc0, c1, …, cn, every key inc_iis< k_iand every key inc_{i+1}is> k_i. - Bounded fanout. Non-root nodes hold between
T - 1and2T - 1keys (1..3withT = 2). The root may hold fewer keys, only when the tree is otherwise empty or being collapsed. - Uniform depth. All leaves sit at the same depth from the
root. This is what makes the worst-case lookup guaranteed to
be
O(log_T n), not merely expected. - Proactive split / rebalance. The descent on insert never needs to back up to fix an overflow; the descent on delete never needs to back up to fix an underflow. Each mutating operation touches each level on the path exactly once.
- Canonical serialization. Two B-trees with the same shape must serialize to the same bytes regardless of insertion order; two B-trees with different shapes must serialize to different bytes even if they hold the same key-value set.
Design decisions
Why T = 2 (smallest non-trivial degree)
Larger T means flatter trees and more keys per page — what real
B-trees use to amortize disk I/O. But the algorithms are
identical at every T ≥ 2, and T = 2 makes splits and merges
frequent, which makes them easy to spot, easy to unit-test, and
easy to render in a hex dump. db-11 will bump T to something
realistic (e.g. matching a 4 KiB page) once nodes are pages.
Why B-tree, not B+-tree
A B+-tree puts values only in the leaves and threads the leaves
into a doubly-linked list for range scans. That's the right call
once nodes are disk pages — internal nodes shrink because they
don't carry values, so fanout (and therefore depth) wins.
In-memory, with T = 2, the values-in-every-node B-tree is
simpler and the savings would be invisible. db-11 swaps to a
B+-tree when the disk-page trade-off applies.
Why the wire format encodes structure, not just contents
Two trees with the same {(k, v)} set can have different shapes if
they were built by different insertion orders. A serializer that
only emits the in-order key list (essentially scan()) would let a
serious bug — say, swapping the left and right halves of a split —
hide forever, because the bug would manifest only as different
tree shapes, never different scan results.
By emitting the full preorder shape, byte-equality across languages is byte-equality of the trees' physical state. db-11 will reuse this property: the page-byte serialization of a B+-tree should be exactly reproducible across implementations.
Why the workload generator reads two PRNG outputs unconditionally
Each run_workload iteration consumes exactly r1 and r2,
regardless of whether the chosen scenario insertions, deletes, or
no-ops. If a scenario consumed a variable number of PRNG draws, the
sequence of keys would diverge across scenarios for the same
seed, making the cross-scenario hashes incomparable and the bug
hunt much harder.
The cost: a small amount of wasted entropy on no-op iterations. The
gain: scenarios inserts, deletes, and mixed all visit the
same key-space in the same order for the same seed, so any
divergence is the operations' fault, not the keys'.
Why scenarios live in the library, not in the CLI
run_workload(...) is a library function that returns a BTree.
The btreectl binary is a one-liner around it. This means the
inline unit tests can call run_workload("mixed", 42, 500)
directly and assert determinism with no shell-out, no file I/O,
and no path-dependent flakiness. The same property lets
cross_test.sh trivially compare three independent CLI binaries.
Why three languages
- Forces the API to be small and explicit. The Rust
Box<Node>recursion translates to Go's struct pointer recursion and C++'sstd::unique_ptr<Node>recursion; if the algorithm needs language-specific cleverness, you've over-fit to one runtime. - Pins integer arithmetic.
SplitMix64uses wrapping unsigned multiplication; expressing it identically in three languages is a forcing function for the cross-language hash to match. - Provides a deterministic conformance suite for the whole B-tree track. When db-11's pager produces a tree whose in-memory shape disagrees with the pure in-memory baseline, db-10's serializer is the comparison witness.
Tradeoffs worth flagging
- The serializer recurses on the call stack. For pathologically
deep trees this could overflow. With
T = 2and 64-bit keys drawn from a 200-key space, the worst-case height is roughlylog_2 200 ≈ 8and the stack is never the bottleneck. db-11's paged variant will be even shallower and is fine to keep recursive. - Keys and values are stored as owned
Vec<u8>/[]byte/std::vector<uint8_t>. This is the simplest correct choice and it dominates allocation cost. db-22 (perf) will revisit whether to intern, slice, or arena-allocate. deletereturnsbool(was-present) rather than the removed value. Sufficient for testing; some real engines need the payload (e.g., to free its backing buffer). Easy to extend.