db-18 step 01 — Single-decree Paxos

Goal

Build the two-phase Paxos protocol for one slot. A proposer must be able to drive a value to a decision in the presence of competing proposers, and an acceptor's recorded state must be exactly enough for the next proposer to recover any value that might have already been chosen. The byte layout of acceptor state must be identical across Rust, Go, and C++.

Tasks

  1. Ballot. Define Ballot { round: u32, proposer_id: u32 } with lexicographic ordering (round first, then proposer_id as tie-break). Provide a Ballot::ZERO constant equal to (0, 0). Every comparison in the rest of the protocol uses this order.

  2. PaxosNode skeleton. Each node carries:

    • id: u32, n: u32 (cluster size), quorum = n/2 + 1.
    • role: Role (Follower / Candidate / Leader).
    • promised_ballot: Ballot — highest ballot ever promised (one value, shared across all slots in this Multi-Paxos style).
    • my_ballot: Ballot — this proposer's current attempt.
    • accepts: BTreeMap<Slot, (Ballot, Vec<u8>)> — for each slot, the highest-ballot accept observed.
    • learned: BTreeMap<Slot, Vec<u8>> — decided values, in slot order.
  3. on_prepare(ballot). If ballot >= promised_ballot, set promised_ballot = ballot and reply Promise { accept_ok = true, accepted = [(slot, ab, value) for every entry in accepts] }. Otherwise reply Promise { accept_ok = false, accepted = [] }. The full walk over accepts is what makes Phase 1 the recovery step.

  4. on_promise. A proposer collects promises until it has a quorum. For each slot mentioned in any promise, it adopts the value of the highest-ballot accept (Paxos safety property P2c). For slots with no prior accept, the proposer is free to use its own pending value. The proposer then transitions to Leader and broadcasts Accept for every slot in its working set.

  5. on_accept(ballot, slot, value). If ballot >= promised_ballot, update promised_ballot = ballot, overwrite accepts[slot] = (ballot, value), reply Accepted { accept_ok = true }. Otherwise reply Accepted { accept_ok = false }. Note that an accepted value is never unaccepted — only superseded by a higher-ballot accept on the same slot.

  6. on_accepted. A proposer that collects a quorum of accept_ok = true for the same (slot, ballot) learns the value and broadcasts Decided { slot, value }. Learners (every node) record learned[slot] = value on Decided.

Acceptance

Inline unit tests in each language. Names below are the Rust form; Go uses TestSha256KnownVectors style, C++ uses test_sha256_known_vectors:

  • sha256_known_vectors — empty, "abc", and the lazy-dog vector all hash to the well-known constants. Locks the SHA-256 implementation to RFC 6234.
  • dueling_proposers_higher_ballot_wins — acceptor promises (1,1), then (1,2) arrives and is promised; a stale Accept at (1,1) is nacked. Verifies promised_ballot monotonicity.
  • promise_carries_prior_accept_for_recovery — acceptor with a prior accept at ballot b1 on slot 0 receives a Prepare at higher ballot b2; the Promise must include the (0, b1, value) tuple so the new leader can re-propose the value. This is P2c.
  • majority_required_to_decide — proposer in a 5-node cluster with only 2 of 5 accepts must not call the slot decided; the third accept tips it over the threshold.
  • ballot_ordering_is_lexicographic(1,9) < (2,0), (1,0) < (1,1), ZERO < (0,1). Locks the comparator.

All five green in Rust, Go, and C++.

Discussion prompts

  • Quorum intersection. Why must any two quorums share at least one acceptor? Walk through what breaks if a 4-node cluster used quorum size 2 instead of 3.
  • Why P2c. Suppose Phase 1 returned just accept_ok without the list of prior accepts. Construct a 3-node run where a value v is chosen, then a higher-ballot proposer chooses a different value w. Why does carrying prior accepts forward in the Promise prevent this?
  • Ballots vs terms. Raft's term is a single u64. Paxos's ballot is (round, proposer_id). What does the proposer_id tie-break buy you that a single counter would not, and why does Raft not need it?