References — Storage Primitives

Canonical Papers & Specifications

  • POSIX pread/pwrite/fsync — https://pubs.opengroup.org/onlinepubs/9699919799/functions/pread.html
  • Linux open(2) (for O_DIRECT, O_DSYNC) — https://man7.org/linux/man-pages/man2/open.2.html
  • Linux fsync(2) — https://man7.org/linux/man-pages/man2/fsync.2.html
  • Linux io_uring design — https://kernel.dk/io_uring.pdf (Jens Axboe, 2019). Read for db-21.
  • macOS F_FULLFSYNC: documented in man fcntl on macOS; see also Apple Tech Note TN1150.
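
A minimal sketch of how the calls above fit together, assuming a fixed-size page file. The names PAGE_SIZE, durable_sync, write_page, and read_page are hypothetical and exist only for this illustration. The sketch uses F_FULLFSYNC where the platform defines it and plain fsync elsewhere, and it loops on short reads and writes because pread and pwrite may transfer fewer bytes than requested.

```python
import fcntl
import os

PAGE_SIZE = 4096  # hypothetical page size chosen for this sketch


def durable_sync(fd: int) -> None:
    """Flush fd to stable storage, preferring F_FULLFSYNC where it exists."""
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)  # defined on macOS only
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return
        except OSError:
            pass  # assumption: fall back to fsync if the filesystem rejects it
    os.fsync(fd)


def write_page(fd: int, page_no: int, data: bytes) -> None:
    """Write exactly one page at its offset, then sync."""
    if len(data) != PAGE_SIZE:
        raise ValueError("data must be exactly one page")
    offset = page_no * PAGE_SIZE
    written = 0
    while written < PAGE_SIZE:  # pwrite may write fewer bytes than asked
        written += os.pwrite(fd, data[written:], offset + written)
    durable_sync(fd)


def read_page(fd: int, page_no: int) -> bytes:
    """Read one page; returns fewer than PAGE_SIZE bytes only at end of file."""
    offset = page_no * PAGE_SIZE
    chunks = []
    got = 0
    while got < PAGE_SIZE:  # pread may return fewer bytes than asked
        chunk = os.pread(fd, PAGE_SIZE - got, offset + got)
        if not chunk:
            break  # end of file
        chunks.append(chunk)
        got += len(chunk)
    return b"".join(chunks)
```

Because pread and pwrite take an explicit offset, neither function touches the file's seek position, so several threads can share one descriptor without coordinating seeks.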

Hardware Numbers

  • "Latency Numbers Every Programmer Should Know" — Jeff Dean, 2012. https://gist.github.com/jboner/2841832
  • "What Every Programmer Should Know About Memory" — Ulrich Drepper, 2007. https://people.freebsd.org/~lstewart/articles/cpumemory.pdf (long but seminal)
  • NVMe specification — https://nvmexpress.org/specifications/ (skim §3 on queues, §4 on commands)

Battle Stories

  • "PostgreSQL's fsync surprise" — https://lwn.net/Articles/752063/. Why fsync semantics on Linux were subtler than database authors assumed. Read this.
  • "Files are Hard" — Dan Luu. https://danluu.com/file-consistency/. Survey of how filesystems can lose your data.
  • "Are You Sure You Want to Use MMAP in Your Database Management System?" — Andrew Crotty, Viktor Leis, and Andy Pavlo, CIDR 2022. https://db.cs.cmu.edu/mmap-cidr2022/. Compares mmap-based databases with read/write-based ones. Required reading if you ever consider mmap.
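
The practical lesson of the fsync article can be sketched in a few lines. On Linux, a failed fsync may leave the affected dirty pages marked clean, so a later fsync can report success even though the data never reached disk. The sketch below therefore calls sync exactly once and turns any failure into a fatal error instead of retrying. The names DurabilityLost and sync_once are hypothetical, and the sync parameter is injectable only so the failure path can be exercised in a test.

```python
import errno
import os


class DurabilityLost(Exception):
    """Raised when a sync fails and the on-disk state is no longer knowable."""


def sync_once(fd: int, sync=os.fsync) -> None:
    """Call sync exactly once; never retry after a failure.

    A retry could succeed without the lost writes ever being persisted,
    so the caller is expected to stop and recover from its log instead.
    """
    try:
        sync(fd)
    except OSError as exc:
        raise DurabilityLost(f"sync failed on fd {fd}: {exc}") from exc
```

A caller would typically let DurabilityLost propagate to the top level, exit, and rebuild state from the write-ahead log on restart.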

Implementation References

  • SQLite OS interface — https://www.sqlite.org/src/file/src/os_unix.c (search for unixSync to see real-world fsync handling, including the macOS F_FULLFSYNC workaround)
  • LevelDB env_posix.cc — https://github.com/google/leveldb/blob/main/util/env_posix.cc (look at PosixWritableFile::Sync)
  • LMDB — http://www.lmdb.tech/doc/ (the canonical mmap database; read for contrast)

Books

  • "Operating Systems: Three Easy Pieces" — Arpaci-Dusseau. Free at https://pages.cs.wisc.edu/~remzi/OSTEP/. Chapters 39–44 (persistence) are exactly this lab.
  • "Designing Data-Intensive Applications" — Martin Kleppmann, O'Reilly. Chapter 3 ("Storage and Retrieval") frames the LSM vs B-tree debate that drives Phases 2 and 3.