db-20 — References
Distributed-system foundations and the specific consensus / replication ideas that informed this lab.
Consensus
- Ongaro, D. & Ousterhout, J. In Search of an Understandable Consensus Algorithm (Extended Version). USENIX ATC 2014. https://raft.github.io/raft.pdf
- Lamport, L. Paxos Made Simple. ACM SIGACT News, 2001. https://lamport.azurewebsites.net/pubs/paxos-simple.pdf
- Howard, H. Distributed consensus revised. PhD thesis, Cambridge 2018. https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-935.pdf
CAP / consistency models
- Brewer, E. Towards Robust Distributed Systems (PODC 2000 keynote).
- Gilbert, S. & Lynch, N. Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services. ACM SIGACT News 33(2), 2002.
- Vogels, W. Eventually Consistent. CACM 2009.
Transactional storage
- Gray, J. & Reuter, A. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, 1993. Chapter 7 on replicated data.
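- Mohan, C. et al. ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging. ACM TODS 17(1), 1992. Background on why our log is append-only.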
State-machine replication
- Schneider, F. B. Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial. ACM Comp. Surveys 22(4), 1990.
Production systems for comparison
- etcd — https://etcd.io/docs/v3.5/learning/design-learner/
- TiKV — https://tikv.org/docs/dev/reference/architecture/raftstore/
- CockroachDB — https://www.cockroachlabs.com/docs/stable/architecture/replication-layer.html
Self-references in this repo
- db-16-distributed-fundamentals/ - failure models, CAP/FLP intuitions.
- db-17-raft/ - the underlying consensus algorithm.
- db-09-leveldb-complete/ - the storage-engine quality bar this lab matches.