db-18 step 01 — Single-decree Paxos
Goal
Build the two-phase Paxos protocol for one slot. A proposer must be able to drive a value to a decision in the presence of competing proposers, and an acceptor's recorded state must be exactly enough for the next proposer to recover any value that might have already been chosen. The byte layout of acceptor state must be identical across Rust, Go, and C++.
Tasks
-
Ballot. Define
Ballot { round: u32, proposer_id: u32 }with lexicographic ordering (round first, then proposer_id as tie-break). Provide aBallot::ZEROconstant equal to(0, 0). Every comparison in the rest of the protocol uses this order. -
PaxosNode skeleton. Each node carries:
id: u32,n: u32(cluster size),quorum = n/2 + 1.role: Role(Follower / Candidate / Leader).promised_ballot: Ballot— highest ballot ever promised (one value, shared across all slots in this Multi-Paxos style).my_ballot: Ballot— this proposer's current attempt.accepts: BTreeMap<Slot, (Ballot, Vec<u8>)>— for each slot, the highest-ballot accept observed.learned: BTreeMap<Slot, Vec<u8>>— decided values, in slot order.
-
on_prepare(ballot). Ifballot >= promised_ballot, setpromised_ballot = ballotand replyPromise { accept_ok = true, accepted = [(slot, ab, value) for every entry in accepts] }. Otherwise replyPromise { accept_ok = false, accepted = [] }. The full walk overacceptsis what makes Phase 1 the recovery step. -
on_promise. A proposer collects promises until it has a quorum. For each slot mentioned in any promise, it adopts the value of the highest-ballot accept (Paxos safety property P2c). For slots with no prior accept, the proposer is free to use its own pending value. The proposer then transitions to Leader and broadcastsAcceptfor every slot in its working set. -
on_accept(ballot, slot, value). Ifballot >= promised_ballot, updatepromised_ballot = ballot, overwriteaccepts[slot] = (ballot, value), replyAccepted { accept_ok = true }. Otherwise replyAccepted { accept_ok = false }. Note that an accepted value is never unaccepted — only superseded by a higher-ballot accept on the same slot. -
on_accepted. A proposer that collects a quorum ofaccept_ok = truefor the same(slot, ballot)learns the value and broadcastsDecided { slot, value }. Learners (every node) recordlearned[slot] = valueonDecided.
Acceptance
Inline unit tests in each language. Names below are the Rust form;
Go uses TestSha256KnownVectors style, C++ uses
test_sha256_known_vectors:
sha256_known_vectors— empty,"abc", and the lazy-dog vector all hash to the well-known constants. Locks the SHA-256 implementation to RFC 6234.dueling_proposers_higher_ballot_wins— acceptor promises(1,1), then(1,2)arrives and is promised; a staleAcceptat(1,1)is nacked. Verifies promised_ballot monotonicity.promise_carries_prior_accept_for_recovery— acceptor with a prior accept at ballotb1on slot 0 receives a Prepare at higher ballotb2; the Promise must include the(0, b1, value)tuple so the new leader can re-propose the value. This is P2c.majority_required_to_decide— proposer in a 5-node cluster with only 2 of 5 accepts must not call the slot decided; the third accept tips it over the threshold.ballot_ordering_is_lexicographic—(1,9) < (2,0),(1,0) < (1,1),ZERO < (0,1). Locks the comparator.
All five green in Rust, Go, and C++.
Discussion prompts
- Quorum intersection. Why must any two quorums share at least one acceptor? Walk through what breaks if a 4-node cluster used quorum size 2 instead of 3.
- Why P2c. Suppose Phase 1 returned just
accept_okwithout the list of prior accepts. Construct a 3-node run where a valuevis chosen, then a higher-ballot proposer chooses a different valuew. Why does carrying prior accepts forward in the Promise prevent this? - Ballots vs terms. Raft's
termis a singleu64. Paxos's ballot is(round, proposer_id). What does the proposer_id tie-break buy you that a single counter would not, and why does Raft not need it?