db-17 — References

Primary sources

  • Diego Ongaro and John Ousterhout, In Search of an Understandable Consensus Algorithm (Extended Version), USENIX ATC 2014. The Raft paper. Figure 2 is the spec this lab implements; Figure 8 is the motivation for the "commit only entries of the current term" rule. https://raft.github.io/raft.pdf
  • Diego Ongaro, Consensus: Bridging Theory and Practice, Stanford PhD dissertation, 2014. The book-length treatment. Chapters 3–4 cover what's in this lab; chapters 5–6 cover snapshots, log compaction, and membership changes (deferred to db-21 / db-23). https://github.com/ongardie/dissertation
  • raft-tla — the TLA+ specification of the algorithm, also by Ongaro. Useful when you want a second, machine-checked statement of the same rules implemented here. https://github.com/ongardie/raft.tla

Implementations to read alongside

  • etcd/raft (Go) — the most-studied production Raft. Same Figure 2 structure; adds pre-vote, leader leases, learner replicas, ReadIndex, joint consensus. https://github.com/etcd-io/raft
  • hashicorp/raft (Go) — Consul's engine. Easier to read than etcd's because it carries less production scar tissue. https://github.com/hashicorp/raft
  • tikv/raft-rs (Rust) — TiKV's port of etcd's algorithm. Useful as a counterpoint to this lab's stdlib-only Rust version. https://github.com/tikv/raft-rs

Determinism and simulation

  • db-16's references on FoundationDB simulation testing and TigerBeetle apply verbatim here.
  • Hermitian (CockroachDB) and Antithesis are commercial deterministic simulators for distributed databases; the spirit is the same as cross_test.sh.

Background reading worth doing

  • Heidi Howard et al., Flexible Paxos: Quorum intersection revisited, OPODIS 2016. Helps see Raft as a specialization of Paxos with a fixed quorum intersection rule.
  • Lamport's Paxos Made Simple — for the db-18 transition.
  • Junqueira et al., ZooKeeper's Atomic Broadcast Protocol: Theory and Practice — for the db-19 transition.

Cross-lab dependencies

  • Upstream: db-16 distributed-fundamentals (Lamport/VC and the deterministic simulator harness whose discipline this lab inherits wholesale).
  • Downstream:
    • db-18 Paxos — reuses the heap-and-tick simulator; different RPC structure; weaker leader assumption.
    • db-19 ZAB — leader-based atomic broadcast; same election + log-replication skeleton.
    • db-20 Distributed KV — wraps a chosen consensus engine (probably this one) around a key-value state machine.
    • db-23 Capstone — joint membership changes and snapshots get added on top of this code.