Step 03 — Bench Harness
Goal
Add a bench subcommand to benchctl in each language that runs the
same workload as the hash subcommand and reports a throughput number.
The harness should be small enough to read end-to-end but disciplined
enough not to lie.
What to build
A bench workload --seed N --ops N --keys N --scenario S subcommand
that:
- Runs a warm-up pass of ops/10 + 1 operations and discards the result.
- Captures a high-resolution start timestamp.
- Runs the full ops workload and keeps the resulting CounterStore so we can read distinct from it.
- Captures a high-resolution end timestamp.
- Writes one line to stderr in this format:
ops=<N> keys=<N> elapsed_us=<N> ops_per_sec=<N> distinct=<N>
- Writes nothing to stdout.
The CLI's hash subcommand must remain unchanged: stdout-only, no
trailing newline, no diagnostic noise.
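The bench steps listed above could be sketched in Rust roughly as follows. This is a minimal sketch under assumptions, not the project's implementation: the CounterStore here is a hypothetical stand-in built on a HashMap, run_workload is a placeholder generator that ignores --scenario, and bench_line is an illustrative name. Integer ops_per_sec is also an assumption based on the `<N>` placeholder in the format.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Hypothetical stand-in for the project's CounterStore.
#[derive(Default)]
pub struct CounterStore {
    counts: HashMap<u64, u64>,
}

impl CounterStore {
    pub fn incr(&mut self, key: u64) {
        *self.counts.entry(key).or_insert(0) += 1;
    }

    pub fn distinct(&self) -> usize {
        self.counts.len()
    }
}

/// Placeholder workload (assumption): `ops` increments over keys in [0, keys),
/// driven by a simple LCG. Requires `keys > 0`. The real scenario logic
/// lives in the hash subcommand and is not reproduced here.
pub fn run_workload(seed: u64, ops: u64, keys: u64) -> CounterStore {
    let mut store = CounterStore::default();
    let mut state = seed;
    for _ in 0..ops {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        store.incr((state >> 33) % keys);
    }
    store
}

/// Runs warm-up, then the timed pass, and returns the report line.
pub fn bench_line(seed: u64, ops: u64, keys: u64) -> String {
    // Warm-up: ops/10 + 1 operations; the store is dropped afterwards.
    let warm = run_workload(seed, ops / 10 + 1, keys);
    std::hint::black_box(warm.distinct());
    drop(warm);

    // Timed region: only the workload sits between the two timestamps.
    let start = Instant::now();
    let store = run_workload(seed, ops, keys);
    let elapsed = start.elapsed();

    // Arithmetic and formatting happen after the end timestamp.
    let elapsed_us = elapsed.as_micros().max(1); // clamp to avoid dividing by zero
    let ops_per_sec = u128::from(ops) * 1_000_000 / elapsed_us;
    format!(
        "ops={} keys={} elapsed_us={} ops_per_sec={} distinct={}",
        ops,
        keys,
        elapsed_us,
        ops_per_sec,
        store.distinct()
    )
}
```

A caller would emit the line with `eprintln!("{}", bench_line(seed, ops, keys))`, which writes to stderr and leaves stdout empty.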
Timing primitives by language
- Rust: std::time::Instant.
- Go: time.Now() / time.Since().
- C++: std::chrono::steady_clock.
steady_clock and Instant are the right choice: they are monotonic and
not subject to wall-clock adjustments mid-run. In Go, time.Now() also
records a monotonic reading, and time.Since() uses it.
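As an illustration of the Rust primitive, the sketch below wraps a single call in two Instant readings and turns the resulting Duration into an integer throughput. The helper names time_once and ops_per_sec are hypothetical, and clamping the elapsed time to one microsecond is an assumption made to avoid a division by zero on very small runs.

```rust
use std::time::{Duration, Instant};

/// Times one call of `f` with the monotonic clock.
/// Returns the closure's result together with the elapsed time.
pub fn time_once<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Integer operations per second, computed from whole microseconds.
/// Elapsed times below 1 microsecond are clamped to 1 (assumption).
pub fn ops_per_sec(ops: u64, elapsed: Duration) -> u128 {
    let us = elapsed.as_micros().max(1);
    u128::from(ops) * 1_000_000 / us
}
```

For example, 100000 ops in 2500 microseconds works out to 100000 * 1000000 / 2500 = 40000000 ops per second.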
Tests this step should pass
There are no automated tests for bench (timing values can't be
asserted), but the structural sanity check is:
./target/release/benchctl bench workload --seed 1 --ops 100000 --keys 1024 --scenario default
# expect on stderr:
# ops=100000 keys=1024 elapsed_us=<some number> ops_per_sec=<some number> distinct=1024
# expect on stdout: nothing
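Although the timing values cannot be asserted, the shape of the line can. The following is a hedged sketch of a structural check, assuming all five fields are unsigned integers separated by single spaces; is_valid_bench_line is a hypothetical helper, not part of benchctl.

```rust
/// Returns true if `line` holds exactly the five expected fields, in order,
/// each of the form `key=<value>` where the value parses as a u64.
pub fn is_valid_bench_line(line: &str) -> bool {
    const KEYS: [&str; 5] = ["ops", "keys", "elapsed_us", "ops_per_sec", "distinct"];

    // Ignore a trailing newline, then split on single spaces.
    let fields: Vec<&str> = line.trim_end().split(' ').collect();
    if fields.len() != KEYS.len() {
        return false;
    }

    fields
        .iter()
        .zip(KEYS.iter())
        .all(|(field, key)| match field.split_once('=') {
            Some((k, v)) => k == *key && v.parse::<u64>().is_ok(),
            None => false,
        })
}
```

A check like this could be fed the captured stderr of the command above, while a separate check confirms that stdout is empty.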
Things to watch for
- Don't put printf inside the timed region. Allocating a string is ~hundreds of nanoseconds and will dominate small workloads.
- Don't take a timestamp per op. The cost of Now() is comparable to the cost of one workload op.
- Don't forget the warm-up. The first pass is dominated by cold-cache effects and first-touch allocator behavior.
- Don't claim numbers across machines without describing the machine.
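To see why a per-op timestamp is a problem on a given machine, one option is to estimate the cost of the clock call itself. The sketch below is one rough way to do that in Rust; avg_now_cost_ns is a hypothetical helper, and the figure it returns includes loop overhead and will vary by platform and clock source.

```rust
use std::time::Instant;

/// Rough average cost, in nanoseconds, of one `Instant::now()` call,
/// measured over `samples` back-to-back calls. Requires `samples > 0`.
pub fn avg_now_cost_ns(samples: u32) -> f64 {
    let start = Instant::now();
    for _ in 0..samples {
        // black_box keeps the compiler from discarding the unused reading.
        std::hint::black_box(Instant::now());
    }
    start.elapsed().as_nanos() as f64 / f64::from(samples)
}
```

Comparing this figure against elapsed time divided by ops from a bench run gives a sense of how much a per-op timestamp would distort the measurement.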
Acceptance
Running bench against a 100k-op, 1024-key workload produces a
throughput line on stderr and an empty stdout. verify.sh and
cross_test.sh continue to pass.