db-10 step 01 — Tree shape and get / scan
Goal
Define the in-memory B-tree's node representation and implement the
two read-only operations: point lookup get(k) and ordered
scan() -> Vec<(k,v)>. No mutation yet; this step is about pinning the
data structure.
Tasks
- Declare constants
T = 2,MIN_KEYS = T - 1,MAX_KEYS = 2T - 1. - Define a
Nodecontaining:is_leaf: boolkeys: Vec<(Vec<u8>, Vec<u8>)>— sorted by keychildren: Vec<Box<Node>>— empty for leaves; for internal nodes,children.len() == keys.len() + 1
- Define
BTree { root: Box<Node> }.new()produces an empty leaf root. - Implement
get(&self, key: &[u8]) -> Option<Vec<u8>>by descending from the root: at each node, find the first key>= target; if equal, return its value; if leaf, returnNone; else recurse intochildren[i]. - Implement
scan(&self) -> Vec<(Vec<u8>, Vec<u8>)>as the standard in-order traversal: for eachi in 0..keys.len(), recurse intochildren[i], pushkeys[i]; finally recurse into the last child. - Implement
len()andis_empty()as helpers.
Acceptance
Inline unit tests:
get_on_empty_returns_none—BTree::new().get(b"k") == None.manual_build_get_returns_value— manually construct a 3-key leaf,getreturns the right value for each key andNonefor misses.scan_is_sorted— manually construct an internal node with two leaf children;scan()returns the merged sorted sequence.
All three green in Rust, Go, and C++.
Discussion prompts
- Why does
getuse linear scan overkeysrather than binary search? ForT = 2the answer is obvious; forT = 256is it still? - Why is
is_leafstored on each node rather than inferred fromchildren.is_empty()? - What goes wrong if
scanrecurses into the last child before pushing the last key?