LMDB / mdb design — Howard Chu, MDB: A Memory-Mapped Database
and Backend for OpenLDAP. Describes a B+-tree pager whose write
path is copy-on-write rather than write-back. Useful counterpoint
to the LRU + dirty-bit approach we took.
https://www.symas.com/symas-embedded-database-lmdb
Goetz Graefe, Modern B-Tree Techniques, Foundations and
Trends in Databases 3(4), 2010. Chapter 2 covers buffer-pool
management and the page-eviction policies real engines use.
Andrew S. Tanenbaum, Modern Operating Systems, 4th ed.,
chapter on file systems and page caches. The OS's own page cache
is conceptually our cache; understanding pread/pwrite/fsync
at the kernel level explains why "writing" without fsync is
not durable.
fsync(2) man page — the canonical answer to "what does fsync
actually guarantee?" Read this before assuming a write reached
disk.
Eduardo Pinheiro et al., Failure Trends in a Large Disk Drive
Population, FAST 2007. Sobering reminder that the device under
the pager does fail; durability is a probabilistic claim.
Elizabeth O'Neil, Patrick O'Neil, Gerhard Weikum, The LRU-K
Page Replacement Algorithm For Database Disk Buffering, SIGMOD
1993. Why naive LRU thrashes on scan-heavy workloads, and the
fix everyone borrowed.
Theodore Johnson, Dennis Shasha, 2Q: A Low Overhead High
Performance Buffer Management Replacement Algorithm, VLDB 1994.
The 2Q policy used by Postgres and several others.
The db-11 implementation deliberately uses textbook LRU. db-22
(performance) will measure when this hurts and what 2Q / CLOCK /
ARC buy.
Upstream: none. The pager is a from-scratch component.
Downstream: db-12 (SQL frontend), db-13 (MVCC), db-14
(indexes), db-15 (SQLite-complete), db-20 (distributed KV) all
store state on top of a pager file in this format.