db-17 step 03 — Cross-test and partition

Goal

A Cluster that drives an n-node simulation forward by integer ticks, a --partition CLI flag that drops messages in named directions, and a cross-language scripts/cross_test.sh proving the canonical dump's sha256 is byte-identical across Rust, Go, and C++ for six seeded scenarios including partitions.

Tasks

  1. Cluster::new(seed, nodes). Holds:

    • nodes: Vec<RaftNode> (ids 0..nodes);
    • drop: BTreeSet<(u32, u32)> (directional message-drop set);
    • heap: BinaryHeap<InFlight> ordered by (delivery_time, sender, seq); InFlight implements Ord such that BinaryHeap behaves as a min-heap;
    • seq: u64 (global monotonic);
    • pending_proposals: VecDeque<Vec<u8>>.
  2. Cluster::run(rounds, n_proposals). For each tick t in 0..rounds:

    1. Enqueue scheduled proposals. schedule[i] = (i+1) * rounds / (n_proposals + 1); if t == schedule[i], push payload "p<i:02>" onto pending_proposals.
    2. Inject pending into current leader. Among nodes with role == Leader, pick the one with the highest current_term, breaking ties by lowest id; while pending_proposals is non-empty and a leader exists, drain one payload and call leader.propose(payload). propose pushes RPCs onto the heap with delivery times computed from splitmix64(seed ^ src ^ dst ^ t) % 3 + 1.
    3. Deliver. Pop every InFlight whose delivery_time == t. For each, if (sender, dest) is in drop, discard. Otherwise call nodes[dest].handle(rpc, t) and enqueue any reply RPCs the handler produces.
    4. Tick. Iterate nodes in ascending id; call node.on_tick(t) on each; enqueue any RPCs produced.
  3. canonical_dump(&cluster) -> Vec<u8>. As specified in CONCEPTS.md: magic "DSERAFT1" (8 bytes), u32_le(node_count), then for each node in id order: id, current_term, voted_for (i64 LE, -1 for None), role (u8), commit_index, log_len, and each entry's (term, cmd_len, cmd_bytes).

  4. raftctl CLI. Parses --seed, --nodes, --rounds, --proposals, --partition s,d,s,d,.... Calls Cluster::new, inserts every (s, d) pair into cluster.drop, runs, dumps, sha256s, prints lowercase hex with no trailing newline.

  5. scripts/cross_test.sh. For each of the six scenarios (A–F in docs/observation.md), invoke all three binaries with the same args, compare raw dumps with cmp -s, then compare hex hashes. Print the scenario label and OK on success, or the diverging offset and the three hashes on failure. End with === ALL OK ===.
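
For task 1, one way to make Rust's BinaryHeap (a max-heap) pop the smallest (delivery_time, sender, seq) first is to reverse the comparison inside Ord. This is a minimal sketch, not the reference implementation: the field widths are assumptions, the destination and RPC payload are left out, and drain_keys is a hypothetical helper that exists only to show the pop order.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Assumed shape: only the three ordering keys come from the task text.
// A real InFlight would also carry the destination and the RPC itself.
#[derive(Debug, PartialEq, Eq)]
struct InFlight {
    delivery_time: u64,
    sender: u32,
    seq: u64,
}

impl Ord for InFlight {
    fn cmp(&self, other: &Self) -> Ordering {
        // BinaryHeap pops the greatest element, so compare `other` against
        // `self`: the smallest (delivery_time, sender, seq) becomes "greatest".
        (other.delivery_time, other.sender, other.seq)
            .cmp(&(self.delivery_time, self.sender, self.seq))
    }
}

impl PartialOrd for InFlight {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Hypothetical helper: pops everything and returns the keys in pop order.
fn drain_keys(mut heap: BinaryHeap<InFlight>) -> Vec<(u64, u32, u64)> {
    let mut out = Vec::new();
    while let Some(m) = heap.pop() {
        out.push((m.delivery_time, m.sender, m.seq));
    }
    out
}
```

Because seq is globally monotonic, no two messages share a full key, so the pop order is total and does not depend on insertion order.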
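
For task 2, steps 1 and 2, the schedule formula and the leader rule can be sketched as below. NodeView is a hypothetical stand-in for the three RaftNode fields the rule reads, the integer widths are assumptions, and the function names are illustrative only.

```rust
use std::cmp::Reverse;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Role {
    Follower,
    Candidate,
    Leader,
}

// Hypothetical view of a node: just the fields leader selection needs.
struct NodeView {
    id: u32,
    current_term: u64,
    role: Role,
}

/// schedule[i] = (i+1) * rounds / (n_proposals + 1), using integer division.
fn schedule(rounds: u64, n_proposals: u64) -> Vec<u64> {
    (0..n_proposals)
        .map(|i| (i + 1) * rounds / (n_proposals + 1))
        .collect()
}

/// Payload "p<i:02>": the letter p followed by i zero-padded to two digits.
fn payload(i: u64) -> Vec<u8> {
    format!("p{:02}", i).into_bytes()
}

/// Among nodes with role == Leader, take the highest term; ties go to the
/// lowest id. Reverse(id) turns "max" into "min" for the second key.
fn pick_leader(nodes: &[NodeView]) -> Option<u32> {
    nodes
        .iter()
        .filter(|n| n.role == Role::Leader)
        .max_by_key(|n| (n.current_term, Reverse(n.id)))
        .map(|n| n.id)
}
```

With rounds = 100 and n_proposals = 3, schedule yields ticks 25, 50, and 75. The result of pick_leader depends only on node contents, not on the order in which nodes are visited.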
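
The delivery-delay formula in task 2 can be sketched as follows. This assumes splitmix64 means the widely used SplitMix64 step (add the golden-ratio constant, then apply the two multiply-xorshift rounds) applied once to its argument; the project's own definition may differ, so treat the constants and the delivery_delay name as assumptions to check against the reference implementation.

```rust
/// One SplitMix64 step applied to `x` (assumed definition, see lead-in).
/// Wrapping arithmetic keeps the result identical across build profiles.
fn splitmix64(x: u64) -> u64 {
    let mut z = x.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Delay in ticks for a message from `src` to `dst` sent at tick `t`.
/// The result is always 1, 2, or 3.
fn delivery_delay(seed: u64, src: u32, dst: u32, t: u64) -> u64 {
    splitmix64(seed ^ u64::from(src) ^ u64::from(dst) ^ t) % 3 + 1
}
```

Since XOR is commutative, the formula gives the same delay for src → dst and dst → src at a given tick; only the drop set makes a link directional.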
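
For task 4, parsing --partition s,d,s,d,... into the directional drop set might look like this. The function name and the error handling (rejecting an odd number of ids, treating an empty argument as no partition) are assumptions; the task text only fixes the s,d pair format.

```rust
use std::collections::BTreeSet;

/// Parses "s,d,s,d,..." into a set of (sender, dest) pairs to drop.
fn parse_partition(arg: &str) -> Result<BTreeSet<(u32, u32)>, String> {
    // Assumption: an empty argument means "no partition".
    if arg.trim().is_empty() {
        return Ok(BTreeSet::new());
    }
    let ids: Vec<u32> = arg
        .split(',')
        .map(|s| {
            s.trim()
                .parse::<u32>()
                .map_err(|e| format!("bad node id {s:?}: {e}"))
        })
        .collect::<Result<_, _>>()?;
    // Assumption: a dangling id without a partner is an error.
    if ids.len() % 2 != 0 {
        return Err(format!(
            "--partition needs an even number of ids, got {}",
            ids.len()
        ));
    }
    // Consecutive ids form one directional pair: (sender, dest).
    Ok(ids.chunks(2).map(|p| (p[0], p[1])).collect())
}
```

For the scenario E argument 0,1,0,2,1,0,2,0 this produces the four pairs (0,1), (0,2), (1,0), and (2,0).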

Acceptance

  • cargo test --release ⇒ ~10 tests pass.
  • go test ./... ⇒ ~12 tests pass.
  • ctest --test-dir build ⇒ 100% tests passed.
  • ./scripts/verify.sh ⇒ === OK ===.
  • ./scripts/cross_test.sh ⇒ all six scenarios OK, final === ALL OK ===.
  • The exact sha256s match docs/observation.md's table. Specifically scenario A is a2299ff06a2ed5ced5842d100bb7867b3ae50f6e7d7da93f835385565f1ed9e9.

Discussion prompts

  • The proposal-injection step picks the leader by (max term, min id). Why not "first leader found in iteration order"? (Hint: Go's map iteration is randomized; (max term, min id) is content-defined.)
  • Scenario E (--partition 0,1,0,2,1,0,2,0) drops every message into or out of node 0. What is the only way the resulting log can contain committed entries? Trace which two-node sub-cluster achieves quorum.
  • Scenario F is an asymmetric partition (0 → 1 only). Why doesn't this cause permanent leadership churn? (Hint: node 1 can still reach node 0 via AppendEntriesReply.)
  • If you swap BTreeSet for HashSet in Cluster::drop (Rust), the hashes still match — why? But if you swap BTreeMap for HashMap in RaftNode::next_index, they don't. Articulate the rule.