Distributed Systems Engineer — Build Databases & Consensus From Scratch
"What I cannot create, I do not understand." — Richard Feynman
A lab-based curriculum for becoming a senior distributed systems engineer by building the systems you'll one day operate, debug, and replace: LevelDB (LSM-tree storage), SQLite (B-tree storage + SQL), and the three canonical consensus algorithms — Raft, Paxos, and ZAB — all implemented from scratch in Rust, Go, and C++.
Why This Repo Exists
Most engineers treat databases and consensus as black boxes. This curriculum makes them transparent. You will:
- Write storage engines that flush, compact, recover, and serve concurrent reads.
- Implement consensus protocols that survive node crashes, network partitions, and message reordering.
- Reason about hardware trade-offs: SSD vs HDD seek latency, write amplification, `fsync` cost, `io_uring` vs blocking I/O, cache-line locality, NUMA effects.
- Compare algorithm families: LSM vs B-tree, level-based vs size-tiered compaction, Raft vs Multi-Paxos vs ZAB.
- Build the same thing three times — once in each language — to internalize the design (not the syntax).
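To make the `fsync` cost in the list above concrete, here is a minimal Go sketch that times appends to a temporary file while varying how often the file is synced. The function name `timedAppend`, the 128-byte record size, and the batch sizes are assumptions chosen for illustration; none of them come from the labs. Batching sequential writes before one sync is only a rough stand-in for group commit, which normally coalesces concurrent committers.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// timedAppend writes n fixed-size records to a temp file and calls
// Sync after every syncEvery writes (plus once after the last write).
// It returns the elapsed time and the number of Sync calls issued.
// Hypothetical helper for illustration only.
func timedAppend(n, syncEvery int) (time.Duration, int, error) {
	if syncEvery < 1 {
		syncEvery = 1
	}
	f, err := os.CreateTemp("", "fsync-sketch-*")
	if err != nil {
		return 0, 0, err
	}
	defer os.Remove(f.Name()) // runs after Close (defers are LIFO)
	defer f.Close()

	rec := make([]byte, 128) // assumed record size
	syncs := 0
	start := time.Now()
	for i := 1; i <= n; i++ {
		if _, err := f.Write(rec); err != nil {
			return 0, syncs, err
		}
		// Sync at each batch boundary, and once more for a trailing partial batch.
		if i%syncEvery == 0 || i == n {
			if err := f.Sync(); err != nil {
				return 0, syncs, err
			}
			syncs++
		}
	}
	return time.Since(start), syncs, nil
}

func main() {
	for _, every := range []int{1, 16, 256} {
		elapsed, syncs, err := timedAppend(1024, every)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("sync every %3d writes: %4d fsyncs, %v\n", every, syncs, elapsed)
	}
}
```

Timings depend heavily on the device and filesystem, so no numbers are quoted here. On most durable storage, the run that syncs after every write is expected to be the slowest.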
Curriculum at a Glance
| Phase | Theme | Labs |
|---|---|---|
| 1 | Storage Primitives & Foundations | db-01 … db-04 |
| 2 | LevelDB / LSM-Tree | db-05 … db-09 |
| 3 | SQLite / B-Tree | db-10 … db-15 |
| 4 | Consensus Algorithms | db-16 … db-20 |
| 5 | Advanced Storage & Capstone | db-21 … db-23 |
See PHASES.md for the full breakdown with learning objectives per lab.
How To Use This Repo
- Read TOOLS.md and install the required toolchains (Rust, Go, C++/CMake).
- Start with `db-01-storage-primitives/`. Each lab is self-contained and has the same shape:

      db-NN-<name>/
      ├── CONCEPTS.md          # The "why": read this first
      ├── references.md        # Papers and source-code links to study
      ├── docs/
      │   ├── analysis.md      # Design trade-offs (hardware, algorithmic)
      │   ├── broader-ideas.md # Extensions, alternatives, future work
      │   ├── execution.md     # Toolchain versions, quick-start commands
      │   ├── observation.md   # Debugging, profiling, monitoring
      │   └── verification.md  # Pass/fail checks for your implementation
      ├── steps/               # Numbered, sequential implementation guides
      │   ├── 01-*.md
      │   └── 02-*.md
      └── src/
          ├── rust/            # Cargo workspace
          ├── go/              # Go module
          └── cpp/             # CMake project

- Work through `steps/` in order. The reference code in `src/` is a target: try to write your own first, then compare.
- Run the checks in `docs/verification.md` before moving on.
What You Will Build
By the end of the curriculum you will have implemented (×3 languages):
- A crash-safe write-ahead log with CRC32 checksums and group commit.
- A skip-list MemTable, an SSTable file format with block compression, and level-based compaction.
- A page-oriented B+-tree with a pager, rollback journal, and WAL mode.
- A hand-written SQL tokenizer, parser, AST, and bytecode virtual machine.
- A transaction manager with MVCC snapshot reads and serializable writes.
- A complete Raft implementation with snapshotting and membership changes.
- Single-decree Paxos and Multi-Paxos with a stable leader.
- A simplified ZAB broadcast layer with epoch transitions.
- A 3-node distributed KV store combining Raft with your LevelDB clone.
- A capstone mini distributed SQL database (the storage engine, the SQL frontend, and Raft replication — all your own code).
Prerequisites
- Comfortable with C-family syntax in at least one systems language (you'll pick up the other two as you go).
- Familiarity with binary trees, hash tables, and Big-O analysis.
- Basic Linux command-line and `git`.
- Not required: prior distributed systems knowledge, SQL internals knowledge, or database engine experience. We build it all from the ground up.
Pedagogical Style
Modeled after cstack/db_tutorial (concept-first, incremental, runnable code at every step) and the ai-engineering/ lab repo (consistent 8-part CONCEPTS.md, docs/, steps/, src/ structure).
Every CONCEPTS.md follows the same 8-part framework:
- What Is It — one-paragraph executive summary
- Why It Matters — concrete benefits
- How It Works — ASCII architecture diagram
- Core Terminology — table of precise definitions
- Mental Models — analogies for intuition
- Common Misconceptions — myths corrected
- Interview Talking Points — what to say in a senior systems interview
- Connections to Other Labs — how this fits the bigger picture
Status
| Phase | Status |
|---|---|
| Phase 1 — Storage Primitives | Lab 01 complete, 02–04 scaffolded |
| Phase 2 — LevelDB | Scaffolded |
| Phase 3 — SQLite | Scaffolded |
| Phase 4 — Consensus | Scaffolded |
| Phase 5 — Advanced & Capstone | Scaffolded |
See PHASES.md for per-lab status.
License
MIT — see source headers in each implementation.