A unified glossary of terms used across all labs. Terms are grouped by domain.
| Term | Definition |
| Page | The unit of I/O between disk and memory. Usually 4 KiB (matches OS page size) but databases often use 4–32 KiB. |
| Block | An SSTable's I/O unit (LevelDB default 4 KiB). Distinct from a B-tree "page" — both are I/O units but for different engines. |
| mmap | Map a file into process address space. Reads happen via page faults; writes via dirty pages flushed by the kernel. |
| pread/pwrite | Positional read/write syscalls. Explicit offset, no shared file pointer. Predictable cost, no page-fault stalls. |
O_DIRECT | Open flag (Linux) that bypasses the page cache. Requires aligned buffers, aligned offsets, aligned sizes. |
fsync | Force file data + metadata to stable storage. Blocks until disk acknowledges. Often the slowest syscall in a database. |
fdatasync | Like fsync but skips non-essential metadata. Faster on most filesystems. |
| Write amplification (WA) | Bytes physically written / bytes logically written. SSDs have hardware WA; LSM-trees have algorithmic WA from compaction. |
| Read amplification (RA) | Bytes physically read / bytes logically read. LSM-trees suffer from RA due to checking multiple levels. |
| Space amplification | Bytes on disk / bytes of live data. LSMs have space amp from stale data awaiting compaction. |
| Endianness | Byte order. Little-endian (x86, ARM default): least-significant byte first. Big-endian: network byte order. |
| Alignment | Memory address being a multiple of N. Required for O_DIRECT (usually 512 B or 4 KiB) and SIMD ops. |
io_uring | Linux async I/O API (≥ 5.1). Two ring buffers (SQ/CQ) shared between kernel and user space. |
| DMA | Direct Memory Access — disk controller writes directly to RAM without CPU involvement. |
| Term | Definition |
| HDD seek time | ~5–10 ms for random reads (head movement + rotational latency). ~150 MB/s sequential. |
| SATA SSD | ~100 μs random read latency, ~500 MB/s sequential, ~80K IOPS. |
| NVMe SSD | ~50–100 μs random read latency, ~3–7 GB/s sequential, ~500K–1M IOPS. Multiple hardware queues. |
| Cache line | CPU cache unit, almost always 64 bytes. Data-structure layout for cache locality matters. |
| NUMA | Non-Uniform Memory Access — CPU sockets have local RAM; cross-socket access is slower. |
| Wear leveling | SSD firmware spreads writes across blocks to even out flash wear. Causes hardware write amplification. |
| Term | Definition |
| Skip list | Probabilistic balanced structure with O(log n) ops and lock-free-friendly properties. Used in LevelDB MemTable. |
| B-Tree | Self-balancing m-ary tree. Internal nodes store keys + values + child pointers. Used for indexes. |
| B+-Tree | B-Tree variant where all values live in leaf nodes; internal nodes are pure routing. Used for tables in SQLite. |
| LSM-Tree | Log-Structured Merge-Tree. In-memory MemTable + on-disk sorted runs (SSTables), merged via compaction. |
| Bloom filter | Probabilistic set membership; no false negatives, tunable false positive rate. Used to skip SSTable lookups. |
| ART | Adaptive Radix Tree — modern in-memory index alternative to B-Trees, used by HyPer, DuckDB. |
| Term | Definition |
| Quorum | Subset of nodes whose agreement is required. Typically ⌊N/2⌋ + 1 for majority quorum. |
| Term / Epoch | Monotonically increasing identifier for a leadership period (Raft term, ZAB epoch, Paxos ballot). |
| Log index | Position of an entry in the replicated log. Indices are monotonic and dense. |
| Commit index | The largest log index known to be safely replicated to a quorum. |
| Linearizability | Strongest consistency: operations appear to take effect atomically at some point between their invocation and response. |
| Sequential consistency | All processes agree on a single global order, but the order need not match real-time. |
| Eventual consistency | If updates stop, all replicas eventually agree. No real-time guarantees. |
| CAP theorem | Under a network partition, you must choose Consistency or Availability. Partition tolerance is non-negotiable. |
| FLP impossibility | No deterministic asynchronous consensus protocol can guarantee progress with even one crash failure. |
| Lamport timestamp | Scalar logical clock: L(a) < L(b) if a happened-before b. Cannot detect concurrency. |
| Vector clock | Per-node vector. VC(a) < VC(b) iff every component is ≤. Detects concurrent events. |
| HLC | Hybrid Logical Clock: combines physical time with a logical counter; bounded skew from real time. |
| Term | Definition |
| ACID | Atomicity, Consistency, Isolation, Durability — properties a transaction must satisfy. |
| Isolation level | READ UNCOMMITTED → READ COMMITTED → REPEATABLE READ → SERIALIZABLE. Each rules out more anomalies. |
| Dirty read | Reading data written by an uncommitted transaction. |
| Non-repeatable read | Reading the same row twice in one tx and getting different values. |
| Phantom read | A range query returns different rows when re-run within one tx. |
| MVCC | Multi-Version Concurrency Control — writes create new versions; readers see a snapshot. |
| 2PL | Two-Phase Locking — acquire locks in a growing phase, release in a shrinking phase. Guarantees serializability. |
| 2PC | Two-Phase Commit — distributed transaction protocol: prepare phase, then commit/abort. Blocking on coordinator failure. |
| Term | Definition |
| VDBE | Virtual Database Engine — SQLite's bytecode VM that executes compiled SQL. |
| Prepared statement | A parsed and compiled SQL statement, reusable with different parameters. |
| Cardinality estimation | Predicting how many rows a query operator will produce. Core to the query planner. |
| Selectivity | Fraction of rows that satisfy a predicate. Low selectivity ⇒ index scan preferred. |
| Covering index | An index that contains all columns needed by a query, so the table doesn't need to be touched. |
| Term | Definition |
| Snapshot | A consistent point-in-time view of data. Used for backups, MVCC reads, Raft log compaction. |
| Checkpoint | Operation that flushes in-memory state to disk so recovery has less log to replay. |
| Compaction | Background process that merges sorted files (LSM) or reclaims fragmented space (B-tree). |
| YCSB | Yahoo Cloud Serving Benchmark — standard KV workload suite (A–F). Used in db-22. |
| Jepsen | Test framework for distributed systems correctness; injects partitions/clock skew. Inspires our consensus tests. |