Step 02 — Rule-Based Planner + Executor
Goal
Turn a Query { projections, predicates } into a Plan { Pipeline[scan, *Filter, Project?] }, then execute it.
Why
Once a table has indexes, every predicate has at least two physical implementations: scan the whole heap and filter, or jump into the index. Picking the right one is what a planner does. Even a 10-line rule-based planner outperforms "always SeqScan" by orders of magnitude on selective queries, and it teaches the same vocabulary (selectivity, access path, predicate pushdown) you need for cost-based work later.
What to do
-
Estimate per predicate (return
Noneif not indexable):Op::Ne→ not indexable.- No index on the column → not indexable.
Op::Eq→rows / distinct_keys.Op::Lt / Le / Gt / Ge→(rows + 2) / 3.
-
Pick the lowest estimate (strict
<, so earlier predicate wins ties — determinism matters). -
Build the Plan:
- First child:
IndexScan{col, op, val}if a predicate was chosen, elseSeqScan{table_id: 0}. - Remaining predicates become
Filternodes in input order. - If
projectionsis non-empty, appendProject{cols}— sort and dedupe the column list so the wire format is canonical.
- First child:
-
Execute the Plan (Volcano-style, but materialise at each operator for simplicity):
- Scan emits all matching rows in (key, row-id) order.
- Filter retains rows where
eval_pred(row, predicate)is true. - Project rewrites each row to the requested column subset.
-
IndexScan for each op:
Eq→ fetch the single bucket (or empty).Lt→ iteraterange(..val).Le→ iteraterange(..=val).Gt→ iteraterange(val+ε..).Ge→ iteraterange(val..).- Within each bucket, emit row-ids in ascending order.
Acceptance
- Given an indexed column and an
Op::Eqpredicate, the planner emitsIndexScan, and the executor returns the matching rows in row-id order. - Given a non-indexable predicate (
Op::Ne, or no index), the planner emitsSeqScanand aFilter. - Given two predicates, the planner picks the one with the lower
estimate; the other becomes a
Filter. Projectionswith duplicates (e.g.[2, 0, 2]) end up as[0, 2].
Common pitfalls
- Forgetting to clone the predicate value when moving it into
IndexScan— both the chosen and discarded predicates need a copy. - Using
<=instead of<for tie-breaking — only<keeps the choice deterministic when two predicates tie. - Returning rows from
Le(<=) by stopping at the first key greater thanvalinstead of strictly greater — off-by-one on bounds. - Mutating the input
Query.projectionsinstead of cloning beforesort/dedup.