db-16 step 03 — CLI and cross-language byte-identity

Goal

Build a simctl CLI in all three languages (Rust, Go, and C++), then use sha256 to show that all three produce byte-identical event logs for the same (seed, nodes, rounds) triple, for at least two distinct scenarios.

CLI contract

simctl --seed N --nodes K --rounds R

Writes the canonical wire-format bytes (no trailing newline) to stdout.
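
A minimal Go sketch of this contract follows. It assumes the seed is an unsigned 64-bit integer and that non-positive --nodes or --rounds should be rejected; neither is stated above. The simulate function is a hypothetical placeholder for the simulator built in earlier steps, and the exit codes are illustrative. The point it shows is that the bytes go to stdout through a raw Write, so no newline is appended.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// config holds the three parameters of the CLI contract.
// Assumption: the seed is a u64; the step text only calls it N.
type config struct {
	Seed   uint64
	Nodes  int
	Rounds int
}

// parseArgs parses --seed, --nodes and --rounds.
// Go's flag package accepts both the -seed and --seed spellings.
func parseArgs(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("simctl", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text off stdout and stderr
	fs.Uint64Var(&c.Seed, "seed", 0, "PRNG seed")
	fs.IntVar(&c.Nodes, "nodes", 0, "number of nodes")
	fs.IntVar(&c.Rounds, "rounds", 0, "number of rounds")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	// Assumption: zero or negative counts are treated as usage errors.
	if c.Nodes <= 0 || c.Rounds <= 0 {
		return c, fmt.Errorf("--nodes and --rounds must be positive")
	}
	return c, nil
}

// simulate is a HYPOTHETICAL placeholder for the real simulator, which
// would return the canonical wire-format bytes for the given config.
func simulate(c config) []byte {
	return nil
}

func main() {
	c, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "simctl:", err)
		os.Exit(2)
	}
	// Write the bytes exactly as produced. fmt.Println would append a
	// newline and change the hash.
	if _, err := os.Stdout.Write(simulate(c)); err != nil {
		fmt.Fprintln(os.Stderr, "simctl:", err)
		os.Exit(1)
	}
}
```

Diagnostics go to stderr only, so stdout carries nothing but the event log and can be piped straight into a sha256 tool.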

Tasks

  1. Build simctl in Rust (src/rust/src/bin/simctl.rs), Go (src/go/cmd/simctl/main.go), and C++ (src/cpp/src/simctl.cc).
  2. Write scripts/verify.sh that runs the unit tests in all three languages.
  3. Write scripts/cross_test.sh that:
    1. Builds all three binaries.
    2. Scenario A: simctl --seed 42 --nodes 3 --rounds 20 → sha256 all three outputs → assert all three match.
    3. Scenario B: simctl --seed 7 --nodes 5 --rounds 50 → sha256 all three → assert all three match.
    4. Spot-check that the first 8 bytes of scenario A's output equal the magic "DSE6" followed by the u32 LE count 120 (hex: 44 53 45 36 78 00 00 00).
    5. Print === ALL OK ===.
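
The comparison and spot-check steps above can be sketched in Go as below. This is an illustration of the logic only, not the required scripts/cross_test.sh: the output type and the function names digest, allMatch, and checkHeader are hypothetical, and the sample bytes in main are a stand-in 8-byte header rather than a real event log.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"os"
)

// output pairs an implementation name with the bytes its simctl wrote.
type output struct {
	Lang string
	Log  []byte
}

// digest returns the lowercase hex sha256 of an event log.
func digest(log []byte) string {
	sum := sha256.Sum256(log)
	return hex.EncodeToString(sum[:])
}

// allMatch returns the shared digest when every output hashes the same,
// or an error naming the first implementation that differs.
func allMatch(outs []output) (string, error) {
	if len(outs) == 0 {
		return "", fmt.Errorf("no outputs to compare")
	}
	want := digest(outs[0].Log)
	for _, o := range outs[1:] {
		if got := digest(o.Log); got != want {
			return "", fmt.Errorf("%s digest %s differs from %s digest %s",
				o.Lang, got, outs[0].Lang, want)
		}
	}
	return want, nil
}

// checkHeader verifies the 4-byte magic "DSE6" and the u32 LE event count.
func checkHeader(log []byte, wantCount uint32) error {
	if len(log) < 8 {
		return fmt.Errorf("log too short: %d bytes", len(log))
	}
	if !bytes.Equal(log[:4], []byte("DSE6")) {
		return fmt.Errorf("bad magic %q", log[:4])
	}
	if got := binary.LittleEndian.Uint32(log[4:8]); got != wantCount {
		return fmt.Errorf("event count %d, want %d", got, wantCount)
	}
	return nil
}

func main() {
	// Stand-in data: only this 8-byte header comes from the step text.
	// Real logs come from running the three simctl binaries.
	log := []byte{'D', 'S', 'E', '6', 120, 0, 0, 0}
	outs := []output{
		{Lang: "rust", Log: log},
		{Lang: "go", Log: log},
		{Lang: "cpp", Log: log},
	}
	if _, err := allMatch(outs); err != nil {
		fmt.Fprintln(os.Stderr, "FAIL:", err)
		os.Exit(1)
	}
	if err := checkHeader(log, 120); err != nil {
		fmt.Fprintln(os.Stderr, "FAIL:", err)
		os.Exit(1)
	}
	fmt.Println("=== ALL OK ===")
}
```

Decoding the count with an explicit little-endian read keeps the check independent of the host's byte order.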

Acceptance

$ scripts/verify.sh
=== rust === ... ok
=== go   === ... ok
=== cpp  === ... ok
=== OK ===

$ scripts/cross_test.sh
...
  match(A): 0d7e753cdc891e3a481977da372a4d97a6a0e0ab00b74f5a074dbc25791dc797
  match(B): 321221187709684afd59c55202f8d373dad33c8026e933b36740aeed23c8c2d4
=== ALL OK ===

A byte-identical hash across three independent implementations is strong evidence that the PRNG, scheduler, clock-update rules, and wire format all follow the spec, at least for the inputs tested. Any divergence in those outputs, even a single byte, changes the hash and surfaces here.
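
A hash mismatch says that two logs differ but not where. As a debugging aid, the first differing byte offset can be located with a helper like the one below. This is a sketch and not part of the required scripts; firstDiff is a hypothetical name, and the two sample logs are made up for illustration.

```go
package main

import "fmt"

// firstDiff returns the index of the first byte where a and b differ,
// or -1 when they are identical. If one log is a strict prefix of the
// other, the difference is reported at the shorter length.
func firstDiff(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	if len(a) != len(b) {
		return n
	}
	return -1
}

func main() {
	// Two made-up 8-byte headers that differ only in the last byte.
	a := []byte("DSE6\x78\x00\x00\x00")
	b := []byte("DSE6\x78\x00\x00\x01")
	fmt.Println(firstDiff(a, b)) // prints 7
}
```

An offset inside the 8-byte header points at the magic or the event count, while a later offset points into the encoded events.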

Discussion prompts

  • Why two scenarios instead of one? What property would slip through with a single scenario that two catch?
  • If the scenario-A hash matches but scenario B does not, where in the codebase would you start looking?
  • The sha256 hashes are baked into the script as constants. What's the benefit, and what's the maintenance cost when the wire format legitimately evolves (e.g., adding a new event kind)?