References — db-22

Primary sources on benchmarking

  • Brendan Gregg. Systems Performance: Enterprise and the Cloud, 2nd ed., Addison-Wesley, 2020. The canonical modern reference. Chapter 12 ("Benchmarking") is required reading; the "active benchmarking" methodology and the catalog of common mistakes (cold-cache effects, the wrong saturation point, the wrong unit) frame the entire lab.

  • Brendan Gregg. BPF Performance Tools, Addison-Wesley, 2019. Less directly relevant here but the right book if you want to observe what your benchmark is actually doing on a Linux box.

  • Gil Tene. "How NOT to Measure Latency." Strange Loop 2015. The "coordinated omission" talk. Even on an in-memory benchmark like this one, the principle generalizes: the metric you report has to match the question the user is asking. We intentionally report ops_per_sec, not p99 latency, because a single-threaded synchronous loop does not have an interesting tail.

  • Bryant & O'Hallaron. Computer Systems: A Programmer's Perspective, 3rd ed., Pearson, 2015. Chapter 5 ("Optimizing Program Performance") and Chapter 9 ("Virtual Memory") supply the "always measure one level deeper" instinct used throughout the docs.

Determinism, RNGs, and reproducible benchmarks

Microbenchmarking pitfalls (per-language)

  • Andrey Akinshin. Pro .NET Benchmarking, Apress, 2019. Despite the .NET framing, chapters 1–4 are language-agnostic gold: warm-up, steady state, the dead-code-elimination trap, JIT vs AOT timing.

  • Aleksey Shipilëv. "JMH samples" and his "Nanotrusting the Nanotime" blog post. Java-specific but the lessons are universal — particularly the discussion of System.nanoTime resolution traps, which apply equally to std::chrono::steady_clock and Go's time.Now().

  • Rust: criterion documentation, especially the section on outlier detection.

  • Go: the testing package's Benchmark docs and Dave Cheney's "Five things that make Go fast".

  • C++: Google Benchmark and Chandler Carruth's CppCon talk "Tuning C++".

Cross-language byte-equality engineering

  • The Cap'n Proto encoding spec. A worked example of a wire format designed for cross-language stability. We do not use Cap'n Proto here, but its constraints (fixed-width little-endian, no sentinel ordering ambiguity, no implicit string normalization) are the same constraints we impose on dump_snapshot.

  • Go issue #7986map iteration is intentionally randomized. Read the issue and the surrounding discussion; this is the canonical worked example of why a portable wire format may never iterate a hash map without an explicit sort.

Background reading on what "fast" means