db-17 — Observation

What the cross-language test produces and how to read it by hand.

Expected sha256s

scripts/cross_test.sh runs six scenarios and asserts the three binaries (Rust, Go, C++) all print the same hex digest. The current canonical hashes are:

label  args                                                                         sha256
A      --seed 42 --nodes 3 --rounds 1000 --proposals 5                              a2299ff06a2ed5ced5842d100bb7867b3ae50f6e7d7da93f835385565f1ed9e9
B      --seed 7 --nodes 5 --rounds 2000 --proposals 20                              b6dc06aee72e595f51bd5045ea7c92ffcbe7f6fda3198985f9ded1eca2671c4b
C      --seed 99 --nodes 3 --rounds 500 --proposals 0                               f9db9ea7e6c1ca2b3a911b42b2431e964a4ee7c5e40e27efd29b41e747958838
D      --seed 1 --nodes 1 --rounds 200 --proposals 5                                ce8b8e05d6ad0b4a243753a934b2f052c2363e97beca0c175586677d1a489408
E      --seed 42 --nodes 3 --rounds 1000 --proposals 3 --partition 0,1,0,2,1,0,2,0  b1689eb48b209187b7cd82a24b1a6a2d19b0be4b481ac1a5b4f1ac9e23a6ae05
F      --seed 3 --nodes 5 --rounds 1500 --proposals 10 --partition 0,1              fcc70ecabe37509133bb27155f5bd7d74981c3f98e79719e2b47077acca6a31f

If any of these change, cross_test.sh will fail; either you have a bug, or you have intentionally changed the spec (timer constants, schedule formula, dump layout) and you must update this table in the same commit.

What the canonical dump looks like (scenario D — single node)

--seed 1 --nodes 1 --rounds 200 --proposals 5. Five proposals go into a single-node cluster; the leader is itself the majority, so every proposal commits immediately.

offset 0x00 :  44 53 45 52 41 46 54 31    "DSERAFT1"        magic
offset 0x08 :  01 00 00 00                 1                 node_count
offset 0x0c :  00 00 00 00                 0                 node id
offset 0x10 :  ?? ?? ?? ?? ?? ?? ?? ??     current_term      (~1, the first self-election)
offset 0x18 :  00 00 00 00 00 00 00 00     voted_for = 0     (voted for self in term 1)
offset 0x20 :  02                          role = Leader (2)
offset 0x21 :  05 00 00 00 00 00 00 00     commit_index = 5
offset 0x29 :  05 00 00 00                 log_len = 5
offset 0x2d :  XX XX XX XX XX XX XX XX     log[0].term       (== current_term)
offset 0x35 :  03 00 00 00                 log[0].cmd_len    (3 bytes: "p00")
offset 0x39 :  70 30 30                    "p00"             payload
...

Every log entry in this scenario is 8 + 4 + 3 = 15 bytes (term + cmd_len + "pNN"), and the first entry starts at offset 0x2d (45). The total dump for D is therefore 0x2d + 5 * 15 = 45 + 75 = 120 bytes (0x78). The size is fixed by the layout; only the term values (shown as ?? and XX above) depend on how many election cycles --seed 1 produces before the first self-vote.
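
The layout above can be reproduced mechanically. The following Rust sketch encodes a dump using the field widths from the listing, assuming little-endian integers throughout. NodeState, encode_dump and scenario_d are illustrative names rather than the project's API, and the payloads p01..p04 and the term value are assumptions extrapolated from the listing.

```rust
// Sketch only: field widths are read off the listing above; names are hypothetical.
struct NodeState {
    id: u32,
    current_term: u64,
    voted_for: i64, // -1 would encode "no vote"
    role: u8,       // 0 = Follower, 1 = Candidate, 2 = Leader
    commit_index: u64,
    log: Vec<(u64, Vec<u8>)>, // (entry term, command payload)
}

fn encode_dump(nodes: &[NodeState]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(b"DSERAFT1");                         // magic
    out.extend_from_slice(&(nodes.len() as u32).to_le_bytes()); // node_count
    for n in nodes {
        out.extend_from_slice(&n.id.to_le_bytes());
        out.extend_from_slice(&n.current_term.to_le_bytes());
        out.extend_from_slice(&n.voted_for.to_le_bytes());
        out.push(n.role);
        out.extend_from_slice(&n.commit_index.to_le_bytes());
        out.extend_from_slice(&(n.log.len() as u32).to_le_bytes()); // log_len
        for (term, cmd) in &n.log {
            out.extend_from_slice(&term.to_le_bytes());
            out.extend_from_slice(&(cmd.len() as u32).to_le_bytes()); // cmd_len
            out.extend_from_slice(cmd);
        }
    }
    out
}

/// Scenario D as described above; `term` is an assumed value for current_term.
fn scenario_d(term: u64) -> Vec<u8> {
    let log: Vec<(u64, Vec<u8>)> = (0..5)
        .map(|i| (term, format!("p{:02}", i).into_bytes()))
        .collect();
    encode_dump(&[NodeState {
        id: 0,
        current_term: term,
        voted_for: 0,
        role: 2,
        commit_index: 5,
        log,
    }])
}
```

Under these assumptions scenario_d(1).len() is 120 (0x78), and the length is the same for any term value because the term field has a fixed width.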

A multi-node dump (scenario C — quiet cluster)

--seed 99 --nodes 3 --rounds 500 --proposals 0. No proposals; the cluster elects a leader, sends heartbeats, and that is it. Every node's log is empty:

44 53 45 52 41 46 54 31         magic
03 00 00 00                     node_count = 3

00 00 00 00                     node id 0
XX XX XX XX XX XX XX XX         current_term       (1 if 0 elected itself, otherwise higher)
XX XX XX XX XX XX XX XX         voted_for           (0 for the leader, otherwise the leader id)
XX                              role                (Leader or Follower; never Candidate at quiescence)
00 00 00 00 00 00 00 00         commit_index = 0
00 00 00 00                     log_len = 0

01 00 00 00                     node id 1
... same shape ...

02 00 00 00                     node id 2
... same shape ...

Total dump: 8 + 4 + 3 * (4 + 8 + 8 + 1 + 8 + 4) = 111 bytes.
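
Reading a dump by hand amounts to walking these fixed-width fields in order. A minimal Rust sketch of that walk follows; it assumes the little-endian layout shown in the listings, and NodeSummary, take and parse_dump are hypothetical names introduced here for illustration.

```rust
// Sketch only: layout inferred from the dumps shown above; names are hypothetical.
#[derive(Debug, PartialEq)]
struct NodeSummary {
    id: u32,
    current_term: u64,
    voted_for: i64,
    role: u8,
    commit_index: u64,
    log_len: u32,
}

/// Take `n` bytes at `*pos` and advance the cursor, or report where the dump ran out.
fn take<'a>(buf: &'a [u8], pos: &mut usize, n: usize) -> Result<&'a [u8], String> {
    let end = (*pos)
        .checked_add(n)
        .filter(|&e| e <= buf.len())
        .ok_or_else(|| format!("truncated at offset {:#x}", *pos))?;
    let s = &buf[*pos..end];
    *pos = end;
    Ok(s)
}

fn u32_le(b: &[u8]) -> u32 {
    let mut a = [0u8; 4];
    a.copy_from_slice(b);
    u32::from_le_bytes(a)
}

fn u64_le(b: &[u8]) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(b);
    u64::from_le_bytes(a)
}

fn parse_dump(buf: &[u8]) -> Result<Vec<NodeSummary>, String> {
    let mut pos = 0;
    if take(buf, &mut pos, 8)? != &b"DSERAFT1"[..] {
        return Err("bad magic".to_string());
    }
    let node_count = u32_le(take(buf, &mut pos, 4)?);
    let mut nodes = Vec::new();
    for _ in 0..node_count {
        let id = u32_le(take(buf, &mut pos, 4)?);
        let current_term = u64_le(take(buf, &mut pos, 8)?);
        let voted_for = u64_le(take(buf, &mut pos, 8)?) as i64; // -1 means no vote
        let role = take(buf, &mut pos, 1)?[0];
        let commit_index = u64_le(take(buf, &mut pos, 8)?);
        let log_len = u32_le(take(buf, &mut pos, 4)?);
        for _ in 0..log_len {
            take(buf, &mut pos, 8)?;                       // entry term
            let cmd_len = u32_le(take(buf, &mut pos, 4)?); // payload length
            take(buf, &mut pos, cmd_len as usize)?;        // payload bytes
        }
        nodes.push(NodeSummary { id, current_term, voted_for, role, commit_index, log_len });
    }
    if pos != buf.len() {
        return Err(format!("{} trailing bytes", buf.len() - pos));
    }
    Ok(nodes)
}
```

Feeding it the bytes of a captured file (for example via std::fs::read) should yield one summary per node; a quiet three-node dump of 111 bytes parses to three entries with log_len = 0.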

How to debug a divergence

If cross_test.sh fails, the script captures the raw dump from each language into /tmp/raft_<label>_<lang>.bin and prints which two languages diverged. Then:

cmp -l /tmp/raft_A_rust.bin /tmp/raft_A_go.bin | head
xxd /tmp/raft_A_rust.bin | sed -n '<line>,+2p'
xxd /tmp/raft_A_go.bin   | sed -n '<line>,+2p'
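
Two conventions are easy to mix up here: cmp -l prints 1-based decimal byte numbers (with the differing bytes in octal), while default xxd output shows 16 bytes per line, labelled with a 0-based hex offset. The small Rust helper below sketches the conversion; xxd_line_for is a hypothetical name, not part of the repository.

```rust
// Sketch only: converts a byte number as printed by `cmp -l` into the xxd line
// to pass to sed, plus the hex offset label that line carries.
fn xxd_line_for(cmp_byte_number: u64) -> (u64, String) {
    // cmp counts bytes from 1; xxd offsets start at 0.
    let offset = cmp_byte_number.saturating_sub(1);
    // Default xxd prints 16 bytes per line; sed line numbers start at 1.
    let line = offset / 16 + 1;
    (line, format!("{:08x}", offset / 16 * 16))
}
```

For example, a first difference reported by cmp at byte 46 is offset 0x2d, which sits on xxd line 3 (the line labelled 00000020); the first element of the tuple is the value to substitute for <line>.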

The first divergence offset tells you what to look at:

offset range                          likely culprit
0x00–0x07                             magic (typo: DSERAFT1 not DESRAFT1)
0x08–0x0b                             node_count (impossible if all three accept --nodes correctly)
inside a node block, on current_term  election timer or heap-order bug
inside a node block, on voted_for     None encoding (must be i64 LE -1)
inside a node block, on role          enum mapping (Follower=0, Candidate=1, Leader=2)
inside a node block, on commit_index  propose() not calling advance_commit(), or quorum count wrong
inside a log entry                    AppendEntries truncate-on-conflict bug, or peer iteration order
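
To apply the table without counting bytes by hand, one option is to walk one of the captured dumps and name the field that contains the first differing offset. The Rust sketch below assumes the little-endian layout shown in the earlier listings and takes a 0-based offset; field_at and u32_at are hypothetical helpers, not functions from the repository.

```rust
// Sketch only: maps a 0-based byte offset to the field that contains it.
fn u32_at(buf: &[u8], pos: usize) -> Option<u32> {
    let s = buf.get(pos..pos + 4)?;
    Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
}

fn field_at(buf: &[u8], offset: usize) -> Option<String> {
    if offset < 8 {
        return Some("magic".to_string());
    }
    if offset < 12 {
        return Some("node_count".to_string());
    }
    let node_count = u32_at(buf, 8)?;
    let mut pos = 12;
    for _ in 0..node_count {
        let id = u32_at(buf, pos)?;
        // Fixed-width header fields of one node block, in dump order.
        let fields = [
            ("node id", 4),
            ("current_term", 8),
            ("voted_for", 8),
            ("role", 1),
            ("commit_index", 8),
            ("log_len", 4),
        ];
        let mut log_len = 0;
        for &(name, width) in fields.iter() {
            if offset < pos + width {
                return Some(format!("node {}: {}", id, name));
            }
            if name == "log_len" {
                log_len = u32_at(buf, pos)?;
            }
            pos += width;
        }
        // Variable-width log entries: term (8) + cmd_len (4) + payload.
        for i in 0..log_len {
            let cmd_len = u32_at(buf, pos + 8)? as usize;
            let entry_end = pos + 12 + cmd_len;
            if offset < entry_end {
                return Some(format!("node {}: log[{}]", id, i));
            }
            pos = entry_end;
        }
    }
    None // offset lies past the end of the dump (or the dump is truncated)
}
```

The returned label can then be looked up in the left-hand column of the table. Note that cmp -l reports 1-based byte numbers, so subtract 1 before calling the helper.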

In all six existing scenarios these checks pass; the table above is the runbook for the day someone changes the algorithm and forgets to update one of the three implementations.

Tick-level scope (Rust REPL trick)

To watch a scenario from the inside, add this temporary print in Cluster::run before the simulator loop:

// Temporary trace, enabled only when RAFT_TRACE is set in the environment.
// eprintln! writes to stderr.
if std::env::var("RAFT_TRACE").is_ok() {
    eprintln!("t={} leader={:?} terms={:?}", t,
        self.nodes.iter().find(|n| n.role == Role::Leader).map(|n| n.id),
        self.nodes.iter().map(|n| n.current_term).collect::<Vec<_>>());
}

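then run RAFT_TRACE=1 raftctl --seed 42 --nodes 3 --rounds 1000 ... 2>&1 | head -50. The trace is written with eprintln!, which goes to stderr, so the 2>&1 redirect is needed for head to see it. The output is not part of the canonical dump and does not affect the sha256. Remove the print before committing.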