Analysis — Design Decisions

Every choice here is reversible in code but irreversible in performance: change one, all the others bend with it.

D1. Skip list over balanced BST (red-black, AVL)

	Skip list	Red-black tree
Insert / lookup	`O(log n)` expected	`O(log n)` worst case
Implementation LOC	~120 (Rust)	~400+
Concurrent reads	Lock-free with seqlock or single-CAS publish	Requires hand-over-hand locking
Cache locality	Poor (pointer chasing)	Poor (pointer chasing)
Worst-case bound	w.h.p., not absolute	Absolute

Choice: skip list. The simplicity (no rotations, no rebalancing, no parent pointers) is the value proposition. Worst-case absolute bound is irrelevant when we control the hash that feeds keys in.

When you'd flip: real-time systems where a probabilistic O(n) blowup is unacceptable. Database MemTables don't qualify.

D2. Hash table: open addressing + Robin Hood, not chaining

	Open addressing (linear)	Chaining
Memory per entry	(1/loadfactor) × sizeof(slot)	sizeof(entry) + sizeof(ptr) + malloc overhead
Cache misses per lookup	0–1 typically	1–3 typically
Pointer stability across resize	No	Yes
Deletion	Backward-shift or tombstone	Free a list node
Tail latency at high load	Spikes near 1.0	Degrades gracefully

Choice: open addressing, linear probing, Robin Hood. The cache-miss saving is decisive at small/medium values, which is the database use case.

When you'd flip: large values (≥1 KiB) where you want pointer stability so external references survive resize.

D3. `p = 0.5` for skip list, not `p = 0.25`

The expected number of comparisons per search is (1/p) · log_{1/p}(n). Minimizing this gives p = 1/e ≈ 0.37. In practice:

`p`	Expected comparisons (n=1M)	Levels	Memory per node
0.25	~16	~10	1.33 forward ptrs avg
0.5	~20	~20	2 forward ptrs avg
1/e	~14	~13	1.58 forward ptrs avg

We pick p = 0.5 because (a) bit-shift sampling (count trailing zeros of a random u64) is one instruction, and (b) the code stays trivial. The 30% theoretical improvement from p=1/e is not worth the table-lookup or log math.

D4. Max level = 32

A skip list with p=0.5 and n entries has expected max level log₂ n. At max level 32 we support n = 2^32 ≈ 4 G entries before quality degrades. Going higher costs a forward-pointer slot per node forever (8 bytes per extra level). Going lower caps the structure size.

D5. Hash function: FNV-1a 64-bit

We need a hash that is:

High quality enough for non-adversarial keys (passes basic distribution tests)
Fast for small keys (~10 cycles per 8 bytes)
Identical in all three languages so cross-language tests can compare counts

FNV-1a is 6 lines of code, deterministic, and produces nearly the same probe-distance distribution as xxHash3 for keys ≤ 32 bytes. We use it because we control the input keys in this lab; in production you'd switch to xxHash3 or SipHash.

When you'd flip: keys controlled by adversaries → SipHash with a per-instance random key.

D6. Load factor 0.85, grow ×2

Robin Hood handles load factor up to ~0.9 well; beyond that the variance of probe distance blows up. We resize at 0.85 to leave headroom and double the capacity (and rehash). Cost of resize is amortized O(1) per insert.

D7. Backward-shift deletion, no tombstones

Tombstones simplify code but bloat the table over time — a "delete-then-insert" workload fills the table with markers and forces a resize. Backward-shift deletion costs O(PSL) per delete (typically <5 slots) and keeps the table compact. The implementation walks forward from the deleted slot, moves each entry one slot left until reaching an empty slot or an entry whose PSL is 0.

D8. Seed: deterministic per-instance, random across instances

The skip list RNG must not be deterministically the same across runs. But within one process run we want reproducible behavior for debugging. The CLI accepts an optional seed; tests pass a fixed value.

Cost-model cheat sheet

Operation	Latency
L1 cache hit	~1 ns
L2 cache hit	~4 ns
L3 cache hit	~15 ns
Main memory	~100 ns
4 KiB page from SSD (cold)	~100 µs
4 KiB page from HDD	~10 ms

Every random pointer dereference is a potential L3 miss → 100 ns. A skip list with 20 levels traverses 20 such pointers in the worst case = 2 µs. A linear-probing hash table touches 1–2 cache lines = ~10 ns. Hash table is ~100× faster for point lookups in RAM, and the gap grows as the working set leaves L3.

What breaks at scale

Symptom	Cause	Mitigation
Skip-list lookup tail latency 10× worse than median	Bad RNG sequence; node height variance	Use a higher-quality PRNG; bound max height
Hash table tail latency spikes near 90% load	Robin Hood variance explodes	Resize earlier (load factor 0.75)
Skip-list memory 2× of equivalent BST	Forward-pointer array overhead	Use per-level arena allocators; pack pointer arrays
Hash table grows but never shrinks	Resize is unidirectional in our impl	Shrink when load < 0.25 (we don't — extension point)
Iterator skips entries	Mutation during iteration	Snapshot at iterator-creation time (db-05 covers this)

Distributed Systems Engineer — Build Databases & Consensus From Scratch