Step 1 — Implement the Skip List

Goal

Build a sorted map with O(log n) expected insert/lookup/remove and O(n) ordered iteration.

API

SkipList::new(seed: u64) -> SkipList
SkipList::insert(key, value) -> bool   // false if replaced existing
SkipList::get(key) -> Option<value>
SkipList::remove(key) -> bool
SkipList::len() -> usize
SkipList::iter() -> iterator<(key, value)>
SkipList::max_level_used() -> usize
SkipList::level_histogram() -> [usize; MAX_LEVEL]

Constants

  • MAX_LEVEL = 32
  • P = 0.5 (sample via count_trailing_zeros(rng()) % MAX_LEVEL)

Data layout

SkipList {
    head:    Node*       // dummy sentinel at MAX_LEVEL
    level:   usize       // current max level used (1..=MAX_LEVEL)
    len:     usize
    rng:     u64         // xorshift64 state
}

Node {
    key:      Vec<u8>
    value:    Vec<u8>
    forward:  Vec<Option<Box<Node>>>   // length = height of this node
}

In Rust we use Box<Node> for ownership and raw pointers for siblings (or Option<NonNull<Node>> for safer raw pointers). In Go, *Node is the natural choice. In C++, std::unique_ptr<Node> for the sole owner of level-0 next, raw pointers for higher levels.

For simplicity we use a single ownership style: level-0 owns nodes; higher levels hold raw pointers. Drop walks the level-0 chain once.

Insert pseudocode

update[0..MAX_LEVEL] = HEAD
x = head
for i in (level-1)..=0:
    while x.forward[i] != null && x.forward[i].key < key:
        x = x.forward[i]
    update[i] = x

if x.forward[0] != null && x.forward[0].key == key:
    x.forward[0].value = value
    return false                       // replaced

new_level = random_level()
if new_level > level:
    for i in level..new_level: update[i] = HEAD
    level = new_level

node = new Node(key, value, new_level)
for i in 0..new_level:
    node.forward[i] = update[i].forward[i]
    update[i].forward[i] = node
len += 1
return true

Random level

fn random_level(rng: &mut u64) -> usize {
    *rng ^= *rng << 13;
    *rng ^= *rng >> 7;
    *rng ^= *rng << 17;
    let lvl = (rng.trailing_zeros() as usize) + 1;
    min(lvl, MAX_LEVEL)
}

Tests

#TestPass if
T1insert 1000 random keys, all get succeedevery value matches
T2insert sorted keys 0..999, iter yields 0..999strictly increasing
T3insert + remove all keys, len = 0empty
T4insert with same key twice, len unchangedreplacement worked
T5level distribution at N=100k is geometricsum of L≥k slots ≈ N · 2^-k