db-23 — Capstone: distributed replicated KV database
This is the final lab. It synthesizes everything from db-01 through db-22 into a single tiny but real distributed key/value database whose state is byte-identical across Rust, Go, and C++ for two frozen scenarios.
What this lab builds
A 3-node replicated KV cluster:
| Node | Role |
|---|---|
| 0 | Leader. The only node that originates writes. |
| 1 | Follower. Can be taken down mid-run. |
| 2 | Follower. Always up. |
Each write Op (Put or Del) is:
- Drawn deterministically from a SplitMix64 stream (see db-04, db-22).
- Appended to the leader's log at index
log.len() + 1. - Replicated synchronously to every live follower.
- Counted as ack'd by every live node whose log already contains that index (plus the leader itself).
- Committed on every reachable node when the ack count reaches a majority of 3 (= 2).
- Applied: each newly-committed entry mutates the local
BTreeMap<i64, i64>state machine in commit-index order.
A catch_up operation lets a recovering follower copy any missing log
entries from the leader and advance its commit/apply watermark.
Two scenarios — frozen hashes
The cluster snapshot is the canonical encoding of all three nodes concatenated. We hash it with SHA-256.
| Scenario | seed | ops | keys | fault? | SHA-256 |
|---|---|---|---|---|---|
| normal | 42 | 200 | 16 | no | 5976b45b9f40f440e8249da27fe4fe752e005f606efc3596bdb25ca4e4f99296 |
| fault | 7 | 2000 | 128 | follower 1 down on [ops/2, 3·ops/4) | d67c36725af65242e985a308db5152af2a3e2525fab33d11ed6e826a252ff792 |
Both hashes are frozen as constants in src/rust/src/lib.rs, src/go/db23_test.go, and src/cpp/src/db23.h, and cross-checked by scripts/cross_test.sh.
Deterministic workload
For op i the RNG draws three u64s regardless of branch outcome, so
the stream is identical no matter which kind of op gets generated:
r1, r2, r3 = rng.next(), rng.next(), rng.next()
kind = (r1 >> 62) & 0x3 // 0,1,2 -> Put, 3 -> Del
k = i64(r2 % keys)
v = i64(r3 % 1000)
The fault schedule is purely a function of ops:
down_start = ops / 2
down_end = (ops * 3) / 4
At i == down_start follower 1 is marked down; at i == down_end it
comes back up and we immediately catch_up. If the loop happens to end
while follower 1 is still down, we catch it up once more at the end so
all three nodes always converge.
Per-node canonical encoding
magic : 8 bytes = "DSEDIST2"
node_id : u8
term : u64 LE
commit_index : u64 LE
log_len : u32 LE
log[log_len] of:
term : u64 LE
index : u64 LE
op_kind : u8 (0 = Put, 1 = Del)
key : i64 LE
value : i64 LE (0 for Del)
kv_len : u32 LE
kv[kv_len] of (ascending by key):
key : i64 LE
value : i64 LE
The cluster snapshot is just node0.encode() || node1.encode() || node2.encode(). last_applied is not serialized — after a write
loop completes (with terminal catch-up) it always equals
commit_index, so it carries no extra information.
Sources of cross-language divergence — avoided
| Risk | How we eliminate it |
|---|---|
| Map iteration order | Sort i64 keys ascending in Go (sort.Slice); BTreeMap/std::map already ordered in Rust/C++. |
| Endianness | All multi-byte ints written little-endian by hand. |
| RNG branch-skew | Always draw 3 words per op regardless of kind. |
32/64-bit int | All wire types are u8/u32/u64/i64; sizes are explicit. |
| Apply order under fault | Apply is gated by a single monotonic commit-index counter, and catch_up is called at well-defined points. |
0 value for Del | C++/Go fill v=0; Rust matches with explicit value() returning 0 for Del. |
What this synthesizes from prior labs
| Earlier lab(s) | Used here as |
|---|---|
| db-01 storage primitives | Manual byte-level LE encoding. |
| db-02 data structures | Sorted map state machine. |
| db-03 write-ahead log | The per-node log is the WAL. |
| db-04 hashing | SHA-256 + SplitMix64 PRNG. |
| db-05/06/07/08 LSM stages | Replaced here by a simpler in-memory state machine, but the apply-log-then-mutate pattern is the same. |
| db-13 transactions | Atomic apply per committed entry (no partial state). |
| db-16 distributed fundamentals | Replication, majority quorum, follower catch-up. |
| db-17 raft | Leader-only writes, log indexing, commit watermark. |
| db-22 perf & bench | Deterministic workload + canonical snapshot pattern. |
How to verify
bash scripts/verify.sh # runs all 9 tests in all 3 languages
bash scripts/cross_test.sh # confirms cross-lang + golden equality
Both must end with === OK === and === ALL OK === respectively.