db-16 — Analysis

Required invariants

  1. Lamport monotonicity. For any node n, the sequence of Lamport values produced by its successive tick/send/recv calls is strictly monotonic.
  2. Lamport consistency with happens-before. If a → b (happens- before), then ts(a) < ts(b). The converse does not hold; that is the cost of compressing a partial order into a single integer.
  3. Vector-clock characterization. With vector clocks the biconditional holds: a → b iff vc(a) < vc(b) componentwise (and vc(a) ≠ vc(b)).
  4. Send-precedes-receive. Every Recv event in the simulator is paired with exactly one Send event from (peer → node) whose sim_time is strictly less than the Recv's sim_time and whose vector clock is strictly less than the Recv's.
  5. Byte determinism. For every (seed, nodes, rounds), the three binaries produce identical bytes on stdout. This is the single property scripts/cross_test.sh checks; if it ever drifts, all downstream labs lose reproducibility.

Design decisions

  • Two-phase tick (deliver-then-send). Each integer tick first drains all in-flight messages whose delivery_time has arrived, then runs every node's send. Doing deliver first means a single tick can witness a message being received and a response being sent — capturing causal flow without needing finer time resolution.

  • Heap ordered by (delivery_time, sender, seq). The third field (seq, a global monotonic counter) gives an unconditional tie-break even when two nodes send to the same destination in the same tick with the same chosen delay.

  • splitmix64 seeded per (seed, t, s). A single splitmix64 call produces all three random fields (dest, delay, payload) for one (t, s) decision. This avoids the question "whose RNG state advances first" — there is no shared RNG state at all.

  • Vector-clock entries sorted on the wire. BTreeMap in Rust, sorted-key iteration in Go, std::map in C++ all produce ascending order naturally. If you ever switch the Rust side to HashMap you will get byte diffs.

  • Lib + thin CLI. All three implementations expose the same trio of primitives (Lamport, VectorClock, simulate/Simulate) as a library. The CLI is ten lines that calls serialize(simulate(...)) and writes to stdout. Downstream labs will link the library, not shell out to the CLI.

Why three languages

  • Forces the spec to be unambiguous. A Rust BTreeMap and a C++ std::map both happen to iterate in key order; the moment you reach for Go's map you discover the language does not and you must sort explicitly. That kind of discovery only happens with multiple implementations.
  • Pins endianness, integer overflow semantics (wrapping), and signed-vs- unsigned modulo. Splitmix64 in particular depends on unsigned wrapping multiplication; expressing it identically in three languages is a forcing function.
  • Future-proofs the track. db-17 onwards will pick one host language per experiment; having a reference implementation in three independent languages means a port bug in db-17's Raft simulator can be cross- checked against the db-16 baseline.

Tradeoffs worth flagging

  • Sim time is integer ticks, not floating-point seconds. This trades realism for determinism. Real networks have continuous-time jitter; capturing that would require an event-priority structure keyed by a rational/decimal time, which is not worth the complexity for a study lab.
  • All sends are unicast and always succeed. We do not model drops, reorderings beyond delay-based interleaving, or partitions. db-17 will add a partition primitive on top of this simulator; doing it here would mean adding --drop-rate to the CLI and changing the wire format, which would lock in a poor abstraction.
  • Each node sends exactly one message per tick during the active window. That is a fixed-load workload. Variable-load (silent nodes, bursty senders) would be a strict extension; it is intentionally omitted to keep the spec small enough to verify by hand.