Observation — Looking Inside the Structures

What you should see

This lab is at its best when you stop trusting the numbers and start looking at the memory layout.

1. Histogram of skip-list node heights

The skip list with p=0.5 should produce a geometric distribution: ~half the nodes at level 0 only, ~quarter reaching level 1, etc.

./target/release/dsbench skiplist insert 100000

Sample output:

level   count    %
    0  50032   50.0   ████████████████████████████████████████████████
    1  25021   25.0   █████████████████████████
    2  12508   12.5   █████████████
    3   6234    6.2   ██████
    4   3098    3.1   ███
    5   1581    1.6   ██
    6    778    0.8   █
    ...
   max level used = 16

If your distribution is skewed (e.g., level 0 is 25% instead of 50%) your RNG or sampling code is wrong. The most common bug is rng() & 1 evaluated once and reused.

2. Hash-table probe-distance histogram

./target/release/dsbench hashtable insert 1000000

Sample output (Robin Hood, load 0.477):

probe distance   count       %
            0   633412     63.3
            1   235108     23.5
            2    87412      8.7
            3    32104      3.2
            4    10001      1.0
            5     1652      0.2
            6      298      0.0
            7       12      0.0
            8        1      0.0
   mean = 0.55   max = 8   capacity = 2097152   load = 0.477

With pure linear probing (no Robin Hood) the tail extends much further.

3. Memory accounting

./target/release/dsbench bench mem 1000000

Sample:

skip list  : 1,000,000 entries, ~80 B/entry
hash table : 1,000,000 entries, ~25 B/entry  (cap=2097152, load=0.477)

What "working" looks like

Skip list: max level grows like log₂ n + 5 (slight overshoot from variance).
Hash table: mean probe < 1, max probe < 10 at load ≤ 0.85.
Bench: hash table 5–20× faster than skip list on point ops.
Memory: hash table is 2–4× smaller per entry.

What "broken" looks like

Mean probe distance climbs above 2 → poor hash function or table not actually power-of-two-sized.
Max skip-list level stuck at 1–2 with 1M entries → RNG broken; bit-test always falls through.
Same level distribution from one run to the next → seed not random.
Hash table size doesn't grow after 85% load → resize trigger not firing.
max_probe 3–4× above the theoretical bound → almost always the hash function. We hit this with raw FNV-1a 64-bit (max_probe ≈ 200 at N=100k vs expected ≤ 66). Adding a SplitMix64 finalizer fixed it. The pure-FNV variant typoed its prime constant too — see step 02 for the canonical value 0x00000100000001b3 and the three pinned vectors ("", "a", "foobar").

What `scripts/cross_test.sh` proves

Runs dsbench skiplist iter N seed in Go, Rust, and C++ and diffs all three outputs. If they aren't byte-identical, one of the three has drifted on hash function, RNG, or ordering — usually the easiest single signal for catching a port regression.

Distributed Systems Engineer — Build Databases & Consensus From Scratch