db-16 step 03 — CLI and cross-language byte-identity
Goal
Build a simctl CLI in all three languages, then prove via sha256 that
all three produce byte-identical event logs for the same
(seed, nodes, rounds) triple — for at least two distinct scenarios.
CLI contract
simctl --seed N --nodes K --rounds R
Writes the canonical wire-format bytes (no trailing newline) to stdout.
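A minimal Go sketch of this contract, under stated assumptions: `simulate` is a hypothetical stand-in for the real simulator (its stub bytes are not the wire format), the seed is assumed to be an unsigned 64-bit integer, and rejecting non-positive `--nodes`/`--rounds` is an illustrative choice the spec does not state. The point it shows is that the log goes to stdout through a raw `Write`, so no trailing newline is appended.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// simulate is a hypothetical placeholder for the real simulator.
// The bytes it returns are NOT the wire format; they only make the sketch runnable.
func simulate(seed uint64, nodes, rounds int) []byte {
	return []byte("stub")
}

// render parses the three flags and returns the bytes that belong on stdout.
func render(args []string) ([]byte, error) {
	fs := flag.NewFlagSet("simctl", flag.ContinueOnError)
	seed := fs.Uint64("seed", 0, "PRNG seed (assumed to be unsigned 64-bit)")
	nodes := fs.Int("nodes", 0, "number of nodes")
	rounds := fs.Int("rounds", 0, "number of rounds")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	// Assumption: non-positive sizes are rejected; the spec does not say.
	if *nodes <= 0 || *rounds <= 0 {
		return nil, errors.New("--nodes and --rounds must be positive")
	}
	return simulate(*seed, *nodes, *rounds), nil
}

func main() {
	out, err := render(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "simctl:", err)
		os.Exit(2)
	}
	// Raw Write rather than Println: stdout gets exactly the log bytes.
	if _, err := os.Stdout.Write(out); err != nil {
		fmt.Fprintln(os.Stderr, "simctl:", err)
		os.Exit(1)
	}
}
```

Go's flag package treats `-seed` and `--seed` as equivalent, so the double-dash form in the contract works unchanged.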
Tasks
- Build simctl in Rust (src/rust/src/bin/simctl.rs), Go (src/go/cmd/simctl/main.go), and C++ (src/cpp/src/simctl.cc).
- Write scripts/verify.sh that runs unit tests in all three langs.
- Write scripts/cross_test.sh that:
  - Builds all three binaries.
  - Scenario A: simctl --seed 42 --nodes 3 --rounds 20 → sha256 all three outputs → assert all three match.
  - Scenario B: simctl --seed 7 --nodes 5 --rounds 50 → sha256 all three → assert all three match.
  - Spot-check the first 8 bytes of scenario A's output equal the magic "DSE6" plus the u32 LE count 120.
  - Print === ALL OK ===.
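The header spot-check from the task list could look roughly like the Go sketch below. The function name `checkHeader` is made up for illustration. The only layout facts it relies on are the ones stated above: a 4-byte magic "DSE6" followed by a little-endian u32 count.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// checkHeader verifies the first 8 bytes of an event log:
// the 4-byte magic "DSE6" followed by the event count as a u32 in little-endian order.
func checkHeader(log []byte, wantCount uint32) error {
	if len(log) < 8 {
		return fmt.Errorf("log too short: %d bytes", len(log))
	}
	if !bytes.Equal(log[:4], []byte("DSE6")) {
		return fmt.Errorf("bad magic %q", log[:4])
	}
	if got := binary.LittleEndian.Uint32(log[4:8]); got != wantCount {
		return fmt.Errorf("count = %d, want %d", got, wantCount)
	}
	return nil
}

func main() {
	// Scenario A's expected header: "DSE6" then 120 as u32 LE (78 00 00 00 in hex).
	header := []byte{'D', 'S', 'E', '6', 0x78, 0x00, 0x00, 0x00}
	fmt.Println(checkHeader(header, 120)) // prints <nil>
}
```

Checking the header separately from the hash gives a more readable failure: a wrong magic or count points at the wire-format encoder before anyone has to diff two opaque digests.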
Acceptance
$ scripts/verify.sh
=== rust === ... ok
=== go === ... ok
=== cpp === ... ok
=== OK ===
$ scripts/cross_test.sh
...
match(A): 0d7e753cdc891e3a481977da372a4d97a6a0e0ab00b74f5a074dbc25791dc797
match(B): 321221187709684afd59c55202f8d373dad33c8026e933b36740aeed23c8c2d4
=== ALL OK ===
Byte-identical hashes across three independent implementations are strong evidence that the PRNG, scheduler, clock-update rules, and wire format all follow the spec, since three separately written codebases are unlikely to share the same bug. Any divergence, even in a single byte, changes the hash and surfaces here.
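As a rough illustration of the comparison step, the Go sketch below hashes each implementation's output and reports the first one that disagrees. The names `matchHash` and `hashHex` are hypothetical, and the placeholder inputs in `main` stand in for the captured stdout of the three binaries.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// hashHex returns the lowercase hex sha256 digest of b.
func hashHex(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// matchHash returns the shared digest when every output hashes identically,
// or an error naming the first implementation that diverges from names[0].
func matchHash(names []string, outputs [][]byte) (string, error) {
	if len(outputs) == 0 || len(names) != len(outputs) {
		return "", errors.New("need one name per output")
	}
	first := hashHex(outputs[0])
	for i := 1; i < len(outputs); i++ {
		if h := hashHex(outputs[i]); h != first {
			return "", fmt.Errorf("%s (%s) differs from %s (%s)", names[i], h, names[0], first)
		}
	}
	return first, nil
}

func main() {
	names := []string{"rust", "go", "cpp"}
	// Placeholder bytes; a real script would capture each binary's stdout.
	same := [][]byte{[]byte("log"), []byte("log"), []byte("log")}
	if h, err := matchHash(names, same); err == nil {
		fmt.Println("match:", h)
	}
	diverged := [][]byte{[]byte("log"), []byte("log"), []byte("log\n")}
	if _, err := matchHash(names, diverged); err != nil {
		fmt.Println("mismatch:", err)
	}
}
```

The second call shows why the CLI contract forbids a trailing newline: one extra byte from a single implementation is enough to change its digest.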
Discussion prompts
- Why two scenarios instead of one? What property would slip through with a single scenario that two catch?
- If the scenario-A hash matches but scenario B does not, where in the codebase would you start looking?
- The sha256 hashes are baked into the script as constants. What's the benefit, and what's the maintenance cost when the wire format legitimately evolves (e.g., adding a new event kind)?