Distributed Systems Engineer — Build Databases & Consensus From Scratch

"What I cannot create, I do not understand." — Richard Feynman

A lab-based curriculum for becoming a senior distributed systems engineer by building the systems you'll one day operate, debug, and replace: LevelDB (LSM-tree storage), SQLite (B-tree storage + SQL), and the three canonical consensus algorithms — Raft, Paxos, and ZAB — all implemented from scratch in Rust, Go, and C++.

Why This Repo Exists

Most engineers treat databases and consensus as black boxes. This curriculum makes them transparent. You will:

Write storage engines that flush, compact, recover, and serve concurrent reads.
Implement consensus protocols that survive node crashes, network partitions, and message reordering.
Reason about hardware trade-offs: SSD vs HDD access latency (HDD seeks vs flash random reads), write amplification, fsync cost, io_uring vs blocking I/O, cache-line locality, NUMA effects.
Compare algorithm families: LSM vs B-tree, leveled vs size-tiered compaction, Raft vs Multi-Paxos vs ZAB.
Build the same thing three times — once in each language — to internalize the design (not the syntax).
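
As a small illustration of what "survive node crashes" means in practice, here is a minimal sketch of majority-quorum arithmetic, assuming the standard majority-quorum form of Raft, Multi-Paxos, and ZAB. The function names `quorum` and `faultTolerance` are hypothetical and not taken from the lab code.

```go
package main

import "fmt"

// quorum returns the smallest majority of an n-node cluster.
// (Hypothetical helper, for illustration only.)
func quorum(n int) int { return n/2 + 1 }

// faultTolerance returns how many nodes may crash while a
// majority of the original n nodes is still reachable.
func faultTolerance(n int) int { return (n - 1) / 2 }

func main() {
	for _, n := range []int{3, 4, 5} {
		fmt.Printf("nodes=%d quorum=%d tolerated_failures=%d\n",
			n, quorum(n), faultTolerance(n))
	}
}
```

Note that going from 3 to 4 nodes raises the quorum size without raising the number of tolerated failures, which is one reason odd cluster sizes are the usual choice.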

Curriculum at a Glance

Phase	Theme	Labs
1	Storage Primitives & Foundations	`db-01` … `db-04`
2	LevelDB / LSM-Tree	`db-05` … `db-09`
3	SQLite / B-Tree	`db-10` … `db-15`
4	Consensus Algorithms	`db-16` … `db-20`
5	Advanced Storage & Capstone	`db-21` … `db-23`

See PHASES.md for the full breakdown with learning objectives per lab.

How To Use This Repo

Read TOOLS.md and install the required toolchains (Rust, Go, C++/CMake).

Start with db-01-storage-primitives/. Each lab is self-contained and has the same shape:

db-NN-<name>/
├── CONCEPTS.md       # The "why" — read this first
├── references.md     # Papers and source-code links to study
├── docs/
│   ├── analysis.md       # Design trade-offs (hardware, algorithmic)
│   ├── broader-ideas.md  # Extensions, alternatives, future work
│   ├── execution.md      # Toolchain versions, quick-start commands
│   ├── observation.md    # Debugging, profiling, monitoring
│   └── verification.md   # Pass/fail checks for your implementation
├── steps/            # Numbered, sequential implementation guides
│   ├── 01-*.md
│   └── 02-*.md
└── src/
    ├── rust/         # Cargo workspace
    ├── go/           # Go module
    └── cpp/          # CMake project

Work through steps/ in order. The reference code in src/ is a target — try to write your own first, then compare.
Run the checks in docs/verification.md before moving on.

What You Will Build

By the end of the curriculum you will have implemented each of the following in all three languages:

A crash-safe write-ahead log with CRC32 checksums and group commit.
A skip-list MemTable, an SSTable file format with block compression, and leveled compaction.
A page-oriented B+-tree with a pager, rollback journal, and WAL mode.
A hand-written SQL tokenizer, parser, AST, and bytecode virtual machine.
A transaction manager with MVCC snapshot reads and serializable writes.
A complete Raft implementation with snapshotting and membership changes.
Single-decree Paxos and Multi-Paxos with a stable leader.
A simplified ZAB broadcast layer with epoch transitions.
A 3-node distributed KV store combining Raft with your LevelDB clone.
A capstone mini distributed SQL database (the storage engine, the SQL frontend, and Raft replication — all your own code).
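
To give a flavor of the first item in the list above, here is a minimal sketch of a checksummed log record. The framing (4-byte CRC32, 4-byte length, payload, all little-endian) is an assumption made for illustration, not the format the labs or any particular engine prescribe, and `encodeRecord` and `decodeRecord` are hypothetical names. Group commit and fsync are deliberately left out.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// headerSize is 4 bytes of CRC32 plus 4 bytes of payload length.
// (Assumed layout, chosen for this sketch only.)
const headerSize = 8

var errCorrupt = errors.New("wal: corrupt or truncated record")

// encodeRecord frames payload as [crc32(payload) | len(payload) | payload].
func encodeRecord(payload []byte) []byte {
	buf := make([]byte, headerSize+len(payload))
	binary.LittleEndian.PutUint32(buf[0:4], crc32.ChecksumIEEE(payload))
	binary.LittleEndian.PutUint32(buf[4:8], uint32(len(payload)))
	copy(buf[headerSize:], payload)
	return buf
}

// decodeRecord validates the length and checksum and returns the payload.
// A recovery loop would stop replaying at the first record that fails here.
func decodeRecord(buf []byte) ([]byte, error) {
	if len(buf) < headerSize {
		return nil, errCorrupt
	}
	want := binary.LittleEndian.Uint32(buf[0:4])
	n := binary.LittleEndian.Uint32(buf[4:8])
	if uint64(n) > uint64(len(buf)-headerSize) {
		return nil, errCorrupt // length field points past the end of the buffer
	}
	payload := buf[headerSize : headerSize+int(n)]
	if crc32.ChecksumIEEE(payload) != want {
		return nil, errCorrupt
	}
	return payload, nil
}

func main() {
	rec := encodeRecord([]byte("put k1 v1"))
	payload, err := decodeRecord(rec)
	fmt.Printf("%q %v\n", payload, err)

	rec[len(rec)-1] ^= 0xFF // simulate a torn or bit-flipped write
	_, err = decodeRecord(rec)
	fmt.Println(err)
}
```

The checksum covers only the payload here; a fuller design would likely also protect the length field and handle records that span block boundaries.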

Prerequisites

Comfortable with C-family syntax in at least one systems language (you'll pick up the other two as you go).
Familiarity with binary trees, hash tables, and Big-O analysis.
Basic Linux command-line and git.
Not required: prior distributed systems knowledge, SQL internals knowledge, or database engine experience. We build it all from the ground up.

Pedagogical Style

The labs are modeled after cstack/db_tutorial (concept-first, incremental, runnable code at every step) and the ai-engineering/ lab repo (a consistent structure: an 8-part CONCEPTS.md plus docs/, steps/, and src/).

Every CONCEPTS.md follows the same 8-part framework:

What Is It — one-paragraph executive summary
Why It Matters — concrete benefits
How It Works — ASCII architecture diagram
Core Terminology — table of precise definitions
Mental Models — analogies for intuition
Common Misconceptions — myths corrected
Interview Talking Points — what to say in a senior systems interview
Connections to Other Labs — how this fits the bigger picture

Status

Phase	Status
Phase 1 — Storage Primitives	Lab 01 complete, 02–04 scaffolded
Phase 2 — LevelDB	Scaffolded
Phase 3 — SQLite	Scaffolded
Phase 4 — Consensus	Scaffolded
Phase 5 — Advanced & Capstone	Scaffolded

See PHASES.md for per-lab status.

License

MIT — see source headers in each implementation.