db-20 — Analysis

What is the question?

Given Raft-shaped consensus semantics, can we build a replicated state machine that produces a byte-identical snapshot across three language ecosystems? "Byte-identical" is the strongest possible test of cross-language conformance — strings, integers, map iteration order, and op semantics all have to line up.

Why is this an interesting study?

Raft on its own (db-17) tells you nothing about how a real key/value store is layered on top of it. Production systems (etcd, TiKV, CockroachDB) all answer the same questions:

1. What does the leader send to followers? (log entries)
2. When does a follower apply an entry? (when its commit_index advances)
3. How does a partitioned follower catch up? (next-index probe / install snapshot)
4. What invariants does the state machine maintain across replicas?

db-20 strips out the network and timer noise so we can focus on questions 2–4 alone. The simplification turns out to be the whole point: once you stop worrying about elections, the integration story fits in ~400 lines of Rust.
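The "apply when commit_index advances" rule from the list above can be sketched roughly as follows. This is a minimal illustration, not the lab's actual code: the `Op` enum, the `Replica` struct, and the `append` / `advance_commit` names are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical op set; the lab's real ops may differ.
#[derive(Clone, Debug, PartialEq)]
pub enum Op {
    Put { key: String, value: Vec<u8> },
    Delete { key: String },
}

/// Hypothetical follower: a log, two cursors, and the key/value state.
#[derive(Default)]
pub struct Replica {
    pub log: Vec<Op>,        // entries received from the leader
    pub commit_index: usize, // number of entries known to be committed
    pub last_applied: usize, // number of entries already applied to `state`
    pub state: BTreeMap<String, Vec<u8>>,
}

impl Replica {
    /// Store entries sent by the leader. Nothing is applied yet.
    pub fn append(&mut self, entries: &[Op]) {
        self.log.extend_from_slice(entries);
    }

    /// Move commit_index forward (never backwards, never past the end of
    /// the local log), then apply every newly committed entry in log order.
    pub fn advance_commit(&mut self, leader_commit: usize) {
        let target = leader_commit.min(self.log.len());
        if target > self.commit_index {
            self.commit_index = target;
        }
        while self.last_applied < self.commit_index {
            match &self.log[self.last_applied] {
                Op::Put { key, value } => {
                    self.state.insert(key.clone(), value.clone());
                }
                Op::Delete { key } => {
                    self.state.remove(key);
                }
            }
            self.last_applied += 1;
        }
    }
}
```

In this sketch an appended entry is invisible to readers of `state` until `advance_commit` covers it, which is the separation between replication and application that question 2 is about.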

Design choices and trade-offs

Snapshot push instead of next-index walk

Raft's real conflict resolution is "decrement next_index, retry". For our purposes that produces the same final state as a one-shot snapshot push, but it forces us to model RPC round-trips. We pick the snapshot push because:

it converges in a single step (deterministic), and
it makes heal() trivial to write — just truncate and replay.

The cost: we cannot study log-divergence scenarios where two leaders both append. That's fine: this lab is single-leader by construction.
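A one-shot heal along these lines might look like the sketch below. It assumes a single fixed leader whose whole log is committed. The `Node` struct, the `Op` enum, and this `heal` signature are hypothetical stand-ins, not the lab's real API.

```rust
use std::collections::BTreeMap;

/// Hypothetical op set; the lab's real ops may differ.
#[derive(Clone, Debug, PartialEq)]
pub enum Op {
    Put { key: String, value: Vec<u8> },
    Delete { key: String },
}

/// Hypothetical node: a log plus the state derived from it.
#[derive(Default)]
pub struct Node {
    pub log: Vec<Op>,
    pub state: BTreeMap<String, Vec<u8>>,
}

impl Node {
    /// Apply one op to a state map.
    pub fn apply(state: &mut BTreeMap<String, Vec<u8>>, op: &Op) {
        match op {
            Op::Put { key, value } => {
                state.insert(key.clone(), value.clone());
            }
            Op::Delete { key } => {
                state.remove(key);
            }
        }
    }

    /// Apply an op locally and record it in the log.
    pub fn append_and_apply(&mut self, op: Op) {
        Self::apply(&mut self.state, &op);
        self.log.push(op);
    }
}

/// Snapshot push: throw away whatever the follower has (truncate),
/// then copy the leader's log and rebuild state from it (replay).
/// One call, no round-trips, same result every time.
pub fn heal(leader: &Node, follower: &mut Node) {
    follower.log.clear();
    follower.state.clear();
    for op in &leader.log {
        Node::apply(&mut follower.state, op);
        follower.log.push(op.clone());
    }
}
```

Because the follower's previous contents are discarded unconditionally, this version never needs to find the point where the two logs diverge. That is the shortcut the single-leader assumption buys.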

State machine is `BTreeMap<String, Vec<u8>>`

A sorted map gives deterministic iteration for free in Rust and C++. Go's map has randomised iteration order, so the Go implementation explicitly sorts keys before serialising. This is the single most common source of non-determinism in cross-language ports: every wire-format-aware function in the Go code calls sort.Slice or sort.Strings.
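To make the determinism argument concrete, here is a minimal sketch of a snapshot encoder over the sorted map. The byte layout (big-endian u32 counts and lengths) and the `snapshot_bytes` name are assumptions for illustration only. The lab's real wire format is not shown in this document.

```rust
use std::collections::BTreeMap;

/// Encode the state as:
///   u32 entry count,
///   then per entry: u32 key length, key bytes, u32 value length, value bytes.
/// All integers are big-endian. This layout is hypothetical.
pub fn snapshot_bytes(state: &BTreeMap<String, Vec<u8>>) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(state.len() as u32).to_be_bytes());
    // BTreeMap iterates in ascending key order, so the output does not
    // depend on the order in which keys were inserted.
    for (key, value) in state {
        out.extend_from_slice(&(key.len() as u32).to_be_bytes());
        out.extend_from_slice(key.as_bytes());
        out.extend_from_slice(&(value.len() as u32).to_be_bytes());
        out.extend_from_slice(value);
    }
    out
}
```

Rust orders `String` keys bytewise over their UTF-8 encoding, so an implementation in another language would need to sort keys the same way before writing them out to produce matching bytes.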

Failure	db-17 covers?	db-20 covers?
Single follower crash + catchup	yes	yes (heal)
Network partition isolating minority	yes	yes
Leader crash + new election	yes	no (fixed leader)
Split-brain after partition heal	yes	no (no elections)
Log compaction / snapshot install	scratched the surface	no
Disk-loss / log truncation	no	no
Byzantine behaviour	no	no

Where to take this next

broader-ideas.md lists the explicit extensions: linearizable reads, log compaction, multi-region replicas, learner replicas, snapshot install over the wire, gossip-style cluster membership.
The exam in cross_test.sh doubles as a regression net for any of those extensions — break the snapshot bytes, you break the build.
