Analysis

Goal

Build the smallest end-to-end query engine that nonetheless exercises the three concepts a real planner must get right:

Access-path selection — choose between SeqScan and IndexScan.
Predicate ordering — apply the most selective predicate first.
Projection placement — only carry the columns the query asked for.

All three must be deterministic across Rust, Go, and C++, because the artifact under test is the SHA-256 of the serialised plan + result bytes.

Pure in-memory. No SQL parser (queries are constructed structurally). No persistence. No transactions. One table, fixed three-column schema (id INT, name TEXT, age INT). Three scenarios — scanonly, mixed (index on age), indexheavy (indexes on age and name).

Design Decisions

Index physical form

A BTreeMap<Value, Vec<row_id>> per indexed column. Three reasons:

Sorted iteration is required for range scans.
The total order on keys is also a stable iteration order for the cross- language test; map randomisation (Go's default) would break that.
Each bucket's Vec<row_id> is naturally ascending because rows are appended in row_id order, so no per-bucket sort is needed.

Planner cost model

Deliberately simple, frozen across languages:

Predicate	Estimate
`Eq` indexable	`rows / distinct_keys`
`Lt/Le/Gt/Ge` ix	`(rows + 2) / 3`
`Ne`	not indexable → SeqScan
No matching index	not indexable → SeqScan

The (rows+2)/3 is the standard "one-third selectivity" heuristic for inequalities used by SQLite when no histogram is available. rows/distinct for equality is the uniform-distribution maximum-likelihood estimator.

Tie-breaking

If two predicates produce the same estimate, the earlier one wins. This makes the choice deterministic without dragging in input-order-sensitive hashing.

Plan shape

A Plan is always a single Pipeline. Children are, in order: one scan, zero or more Filters (the non-chosen predicates), and an optional Project. No nested pipelines, no joins. Keeps the wire format flat.

Executor model

Volcano-style pull, but materialised at each operator. With at most a few thousand rows per query, the simplicity of materialisation is worth more than the cost of allocation, and it makes the row-emission order trivial to reason about. The "true" pull pipeline is the same code in a streaming shape — the lab doesn't need that subtlety.

Failure Modes Considered

Map randomisation breaks byte identity. Go's default map has a randomised iteration order. We use a parallel sorted slice; explicit sort.Slice is used everywhere a map could leak.
i64 / u32 endianness. Always little-endian, encoded with explicit byte slicing — never unsafe casts.
String collation. Text values are compared as raw bytes (std::memcmp / bytes.Compare / slice ==), never via locale-aware comparison.
Wrong magic length. The result-row magic is exactly 7 bytes DSEQR01 (no NUL terminator). C++ uses std::memcmp(..., 7), never strcmp.

Distributed Systems Engineer — Build Databases & Consensus From Scratch