db-19 — Verification

Prerequisites

  • Rust ≥ 1.74 with cargo on PATH.
  • Go ≥ 1.22 (module declares go 1.22).
  • CMake ≥ 3.20 and a C++17 compiler (Apple clang ≥ 14, gcc ≥ 11).
  • A POSIX sha256sum is not required — each binary computes its own sha256 in-process.

One command

cd db-19-zab
bash scripts/verify.sh && bash scripts/cross_test.sh

Green is === db-19 :: ALL UNIT TESTS GREEN === followed by === ALL OK ===. Anything else is a regression.

What verify.sh does

  1. Rustcargo test --release --quiet over src/rust/. Builds db19 lib + zabctl binary; runs the inline tests in src/rust/src/lib.rs. Expected: test result: ok.
  2. Gogo test ./... over src/go/. Builds the db19 package + cmd/zabctl; runs src/go/zab_test.go. Expected: PASS and ok github.com/10xdev/dse/db19.
  3. C++cmake -DCMAKE_BUILD_TYPE=Release -B build, cmake --build build --target test_db19 zabctl, then ctest --test-dir build --output-on-failure. The test binary ends with Test #1: test_db19 ........ Passed.

If any of these three blocks fails, the script exits non-zero and the rest does not run.

What cross_test.sh does

For each of the six canonical scenarios (A–F) it invokes the three release zabctl binaries with identical flags, captures the lowercase-hex sha256 of the canonical cluster dump, and asserts rust == go == cpp byte-for-byte. The scenarios are:

labelargswhat it exercises
A--seed 42 --nodes 3 --rounds 1000 --proposals 5basic 3-node, 5 proposals, clean network
B--seed 7 --nodes 5 --rounds 2000 --proposals 20bigger cluster, longer horizon
C--seed 99 --nodes 3 --rounds 500 --proposals 0election convergence only
D--seed 1 --nodes 1 --rounds 200 --proposals 5degenerate single node (instant leader)
E--seed 42 --nodes 3 --rounds 1000 --proposals 3 --partition 0,1,0,2,1,0,2,03-node with churn
F--seed 3 --nodes 5 --rounds 1500 --proposals 10 --partition 0,15-node, asymmetric one-way drop

Canonical hashes are listed in docs/observation.md. The script asserts consistency among the three ports; it is docs/observation.md that pins them to the historical fingerprint.

What green guarantees

  1. Determinism. Same flags ⇒ same canonical dump bytes across languages and runs (modulo endianness — all targets are little-endian). The simulator advances in integer ticks; all map/set iteration is over BTreeMap / sorted Go slices / std::map so the dump order is fixed.
  2. Safety in the modeled environment. No two nodes commit different histories. For every scenario in the suite, after the final tick:
    • last_committed.epoch is monotonic per node.
    • Where two nodes' history overlap by zxid, the bytes match.
    • No follower has committed past last_committed reported by the leader of its current epoch.
  3. Liveness in the modeled environment. Scenarios A, B, D, F include proposals and run long enough to elect a leader and commit them. Scenarios C and E confirm we don't commit what we shouldn't (C has zero proposals; E partitions away the would-be leader so the alternative path must take over).

What green does not guarantee

  • Behavior outside the canonical scenarios. ZAB's state space is large; six fingerprints are an acceptance test, not a model checker. Real validation needs TLA+ (see references.md).
  • Performance. No latency or throughput is measured. Tick count is simulation cost, not wall-clock SLA.
  • Snapshotting / log compaction. Histories grow unboundedly; ZooKeeper truncates via snapshots, which is out of scope here.
  • Production safety primitives — fsync barriers, on-disk checksums, recovery from torn writes, byzantine actors. All deliberately deferred.
  • Real network. Partitions are modeled as a BTreeSet of one-way drops applied at delivery; reordering happens through the simulator's priority queue, not a Lossy/OOO network. There is no actual socket.

Invariant assertions in code

The implementations carry inline assertions where they are nearly free. The load-bearing ones:

WhereAssertionWhat it catches
Leader on_ackrefuse acks for zxids not in our outstanding setduplicate / replayed acks inflating quorum
update_vote (election)only adopt votes with greater (last_zxid, id)non-monotone vote drift
handle_new_epochfollowers must reply only if new_epoch > accepted_epochaccepting a stale epoch from a deposed leader
handle_new_leaderfollowers replace history only if new_epoch > current_epochlosing already-committed entries
canonical_dump writer (all 3 langs)nodes in ascending id, per-node history in ascending zxiddump-writer drift between languages

The unit tests assert each of these on at least one path.