Broader Ideas — Where to Go Next
Extensions you can build on top of this lab. Each is a 0.5–2 day exercise.
1. Concurrent skip list with lock-free reads
LevelDB's MemTable allows concurrent readers while a single writer inserts. The trick: nodes are made visible by a single atomic store of the bottom-level forward pointer — once that store lands, the node exists; before, it doesn't. Higher-level pointers can race because they're only used to speed up a search; if they point to a not-yet-visible node, the next compare won't match and the search retries.
Implement: an AtomicPtr per forward pointer, a single writer (enforced by external mutex or compare_exchange), no per-node lock. Test: spawn 8 readers + 1 writer, run for 10s, assert no reader observes a partial node.
2. Concurrent hash table: lock striping + extendible hashing
Lock striping: 64 stripe mutexes; key's stripe = hash & 63. A write locks its stripe; reads either lock-read or use seqlock counters.
Extendible hashing: instead of full-table resize, split one bucket at a time when it grows past a threshold.
3. ART (Adaptive Radix Tree)
A radix tree variant that uses 4 different internal node layouts (4, 16, 48, 256 children) and adapts based on density. Wins for variable-length keys with shared prefixes (URLs, paths). [Leis et al., ICDE 2013].
4. Cuckoo hashing
Two hash functions, two candidate slots per key. Lookups are guaranteed 2 reads. Used in Memcached extensions.
5. Hopscotch hashing
Each entry must live within H slots of its home (typically H=32). Bounded probe distance like Robin Hood with stronger guarantee.
6. B+-Tree (preview db-10)
Write an in-memory B+-Tree with fan-out 64, leaves chained, and compare to skip list on range scans. This sets up the "why on-disk B-Trees beat skip lists" intuition before you've touched disk.
7. Skip list with rank queries (Redis ZRANK)
Add a span field per forward pointer = "number of bottom-level nodes this pointer skips over." Now rank(key) is O(log n) instead of O(n). ~50 LOC of additions.
8. Bloom filter (preview db-04)
In front of the hash table: a 1-bit-per-position array sized for ~1% false-positive rate at expected N. Measure: at 50%-miss workloads the Bloom filter saves you cache misses; at 95%-hit workloads it's pure overhead.
9. xor / cuckoo / ribbon filters
Modern variants (xor [Graf & Lemire 2020], ribbon [Dillinger 2021]) get the same false-positive rate as Bloom with 25–35% less memory.
10. Cache-conscious skip list
Replace the per-node forward-pointer array with a contiguous tail allocation (RocksDB's InlineSkipList). Compare cache miss rates: same algorithm, half the misses.
11. Persistent / immutable variants
Build an immutable skip list where insert returns a new root, sharing 99% of nodes with the old one. Useful for snapshots, MVCC.
When you've explored a couple, you're ready for db-03 Write-Ahead Log, where the durability story begins.