Verification — What to Test and How

Per-language property tests

#   Test                        Pass if
V1  fnv1a64_known_vectors       "" → 0xcbf29ce484222325; "a" → 0xaf63dc4c8601ec8c; "foobar" → 0x85944171f73967e8
V2  splitmix64_known_vectors    splitmix64(0) = 0xe220a8397b1dcdaf; splitmix64(0xdeadbeef) = 0x4adfb90f68c9eb9b
V3  no_false_negatives          Insert N=10 000 random keys (seeded); contains returns true for every one
V4  fpr_within_2x               Build for n=10 000 at fpr=0.01; query 100 000 random absent keys; observed FPR ≤ 2× theoretical
V5  optimal_k_formula           with_fpr(1000, 0.01) returns k=7 and 9 580 ≤ m ≤ 9 620 (allow ±0.5%)
V6  encode_decode_roundtrip     encode → decode → query the same keys: identical results
V7  header_layout               First 4 bytes = k (LE); next 8 bytes = m (LE); payload length = ⌈m/8⌉ bytes
V8  empty_filter_rejects_all    New filter with m=64, k=3; contains returns false for 1000 random keys
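
As an illustration of V1, V2 and V5, the following Rust sketch implements the two hashes and the standard Bloom filter sizing formulas, then checks them against the vectors in the table. It assumes that splitmix64(x) means one SplitMix64 step (add the increment, then mix), which is consistent with the splitmix64(0) vector above. The name optimal_params is hypothetical and stands in for whatever with_fpr computes internally; the project's actual rounding of m may differ.

```rust
/// FNV-1a, 64-bit: XOR each byte into the state, then multiply by the FNV prime.
pub fn fnv1a64(data: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    data.iter()
        .fold(OFFSET_BASIS, |h, &b| (h ^ u64::from(b)).wrapping_mul(PRIME))
}

/// One SplitMix64 step (assumed convention: add the increment first, then mix).
pub fn splitmix64(x: u64) -> u64 {
    let mut z = x.wrapping_add(0x9e3779b97f4a7c15);
    z = (z ^ (z >> 30)).wrapping_mul(0xbf58476d1ce4e5b9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94d049bb133111eb);
    z ^ (z >> 31)
}

/// Hypothetical helper: textbook sizing, m = ceil(-n ln p / (ln 2)^2) and
/// k = round((m / n) ln 2). Requires n > 0 and 0 < fpr < 1.
pub fn optimal_params(n: u64, fpr: f64) -> (u64, u32) {
    let ln2 = std::f64::consts::LN_2;
    let m = (-(n as f64) * fpr.ln() / (ln2 * ln2)).ceil();
    let k = (m / n as f64 * ln2).round().max(1.0);
    (m as u64, k as u32)
}

#[cfg(test)]
mod known_vector_tests {
    use super::*;

    #[test]
    fn fnv1a64_known_vectors() {
        assert_eq!(fnv1a64(b""), 0xcbf29ce484222325);
        assert_eq!(fnv1a64(b"a"), 0xaf63dc4c8601ec8c);
        assert_eq!(fnv1a64(b"foobar"), 0x85944171f73967e8);
    }

    #[test]
    fn splitmix64_first_vector() {
        assert_eq!(splitmix64(0), 0xe220a8397b1dcdaf);
    }

    #[test]
    fn optimal_k_formula() {
        let (m, k) = optimal_params(1000, 0.01);
        assert_eq!(k, 7);
        assert!((9580..=9620).contains(&m));
    }
}
```

For n=1000 and fpr=0.01 the unrounded formula gives m ≈ 9 585.06, so the V5 window also accepts an implementation that rounds m up to the next multiple of 64 (9 600).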

Cross-language test

scripts/cross_test.sh performs the writer × reader matrix for {go, rust, cpp}²:

  1. Each writer builds a filter for the same fixed-seed key set (1 000 keys).
  2. The three filter files must be byte-identical (checked with md5sum over each file).
  3. Each reader opens each writer's filter and runs:
    • 1 000 known-present queries → must all return present
    • 1 000 known-absent queries (different seed) → results must match across readers
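
Byte-identical output in step 2 depends on every writer serialising the header the same way. The sketch below shows one way to produce and parse the layout from V7 in Rust. The function names encode and decode mirror the V6 test name, but their signatures here are assumptions, as is treating k as a u32 and m as a u64 (inferred from the 4-byte and 8-byte field widths). Bit order inside the payload is not covered.

```rust
/// Serialise a filter as: k (4 bytes, LE), m (8 bytes, LE), then ceil(m/8) payload bytes.
/// Assumed signature, for illustration only.
pub fn encode(k: u32, m: u64, bits: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(12 + bits.len());
    out.extend_from_slice(&k.to_le_bytes());
    out.extend_from_slice(&m.to_le_bytes());
    out.extend_from_slice(bits);
    out
}

/// Parse the header and return (k, m, payload), or None if the buffer is
/// shorter than the header or the payload length is not ceil(m/8).
pub fn decode(buf: &[u8]) -> Option<(u32, u64, &[u8])> {
    if buf.len() < 12 {
        return None;
    }
    let mut kb = [0u8; 4];
    kb.copy_from_slice(&buf[0..4]);
    let mut mb = [0u8; 8];
    mb.copy_from_slice(&buf[4..12]);
    let k = u32::from_le_bytes(kb);
    let m = u64::from_le_bytes(mb);
    let payload = &buf[12..];
    // ceil(m / 8) without risking overflow on m + 7.
    let expected = m / 8 + u64::from(m % 8 != 0);
    if payload.len() as u64 != expected {
        return None;
    }
    Some((k, m, payload))
}
```

For the V8 filter (m=64, k=3) this yields a 20-byte file: 03 00 00 00, then 40 followed by seven zero bytes, then 8 payload bytes.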

This catches:

  • Endian or bit-order bugs in the header / bit array.
  • Hash mismatches (fnv1a64 or splitmix64 output differs between languages).
  • Disagreement in the mod m reduction (an implementation using Lemire's u128 trick and one using % should yield identical indices).
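
The last point can be checked in isolation. The Rust sketch below assumes that "Lemire's u128 trick" refers to the fastmod-style exact remainder, which precomputes a 128-bit constant for the divisor and then obtains x % m with multiplications only. It also includes the multiply-shift range reduction for contrast, because that variant maps into [0, m) but is not equal to x % m and so cannot be mixed with %. All function names are illustrative, not the project's.

```rust
/// Precompute M = floor((2^128 - 1) / d) + 1, wrapping to 0 for d = 1.
/// Requires d > 0.
pub fn compute_m(d: u64) -> u128 {
    (u128::MAX / u128::from(d)).wrapping_add(1)
}

/// Exact a % d: take the low 128 bits of M * a, multiply by d, keep the bits above 2^128.
pub fn fastmod_u64(a: u64, m_const: u128, d: u64) -> u64 {
    let lowbits = m_const.wrapping_mul(u128::from(a));
    let d = u128::from(d);
    // 128 x 64 -> top 64 bits, done in two halves so nothing overflows u128.
    let bottom = ((lowbits & u128::from(u64::MAX)) * d) >> 64;
    let top = (lowbits >> 64) * d;
    ((bottom + top) >> 64) as u64
}

/// Multiply-shift range reduction: a valid index in [0, m), but NOT x % m.
pub fn mulshift_reduce(x: u64, m: u64) -> u64 {
    ((u128::from(x) * u128::from(m)) >> 64) as u64
}

#[cfg(test)]
mod reduction_tests {
    use super::*;

    #[test]
    fn fastmod_matches_percent() {
        for &d in &[1u64, 7, 64, 9586, 1 << 63, u64::MAX] {
            let mc = compute_m(d);
            for &a in &[0u64, 1, 10, 0xdeadbeef, u64::MAX] {
                assert_eq!(fastmod_u64(a, mc, d), a % d);
            }
        }
    }

    #[test]
    fn mulshift_is_a_different_mapping() {
        assert_ne!(mulshift_reduce(10, 7), 10 % 7);
    }
}
```

If any implementation uses the multiply-shift form, every writer and reader has to use it, otherwise the known-present queries in the cross-language test fail.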

What "passing" means

  • All 8 property tests green in all three languages.
  • cross_test.sh exits 0 with 3 byte-identical filter files (one per writer) and 9 passing writer × reader runs.
  • Manual smoke test: the hexdump of a 4-key filter matches the structure described in docs/observation.md.