db-11 step 01 — Page I/O and file layout
Goal
Build the bottom half of the pager: the file format and the
uncached read / write / allocate path. No cache, no LRU, no
eviction. Every read is a pread; every write is a pwrite;
flush is just fsync.
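
As a rough illustration of that mapping, the sketch below shows how the three primitives might look in Rust, assuming a Unix target where `std::os::unix::fs::FileExt` supplies positioned reads and writes. The helper names (`open_rw`, `write_page_at`, `read_page_at`, `sync`) are hypothetical and not part of the exercise.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::fs::FileExt; // Unix-only: read_exact_at / write_all_at wrap pread / pwrite
use std::path::Path;

/// Open (or create) a file for positioned reads and writes, without truncating it.
fn open_rw(path: &Path) -> io::Result<File> {
    OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(false)
        .open(path)
}

/// pwrite: store one page at its byte offset, independent of any file cursor.
fn write_page_at(file: &File, pid: u32, page: &[u8]) -> io::Result<()> {
    let offset = pid as u64 * page.len() as u64;
    file.write_all_at(page, offset)
}

/// pread: load one page from its byte offset.
fn read_page_at(file: &File, pid: u32, page_size: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; page_size];
    file.read_exact_at(&mut buf, pid as u64 * page_size as u64)?;
    Ok(buf)
}

/// fsync: ask the OS to push file data and metadata to stable storage.
fn sync(file: &File) -> io::Result<()> {
    file.sync_all()
}
```

Because neither call moves a shared file cursor, the offset is always computed from the page id. Writing past the current end of file extends the file, and any gap before the written range reads back as zeros.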
Tasks
- Define MAGIC = b"DSE-PAGER-v1\0\0\0\0" (16 bytes) and HEADER_LEN = 24.
- Implement Pager::open(path, page_size, capacity):
  - If file does not exist or is empty, create it; write a fresh
    header page (magic + page_size + num_pages=1, zero-padded to
    page_size); fsync.
  - If file exists, read bytes 0..24, validate magic, parse
    page_size and num_pages. The caller-supplied page_size argument must match the on-disk value (or be supplied as the authoritative size on creation).
- Implement Pager::allocate() -> u32:
  - return num_pages, then num_pages += 1. The on-disk file is not yet extended; the next flush() will rewrite page 0 and the new page will materialise then.
- Implement Pager::read(pid) -> Vec<u8> (no caching yet):
  - validate 1 <= pid < num_pages.
  - pread(page_size bytes at offset pid * page_size).
- Implement Pager::write(pid, bytes) (no caching yet):
  - validate bytes.len() == page_size.
  - validate 1 <= pid < num_pages.
  - pwrite(bytes at offset pid * page_size).
- Implement Pager::flush():
  - rewrite page 0 with current num_pages (handles allocate-only transactions).
  - fsync.
- Implement Pager::close(): flush() then drop the file handle.
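
The tasks above can be sketched roughly as follows. This is a minimal Rust illustration under stated assumptions, not a reference solution: it assumes a Unix target, assumes the 8 header bytes after the magic are two little-endian u32 fields (page_size, then num_pages), reports every failure as `io::Error`, and ignores `capacity` because there is no cache yet. It also assumes that flush() extends the file with `set_len`, which is one possible reading of "the new page will materialise then"; the task list itself only requires rewriting page 0 and calling fsync.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Error, ErrorKind};
use std::os::unix::fs::FileExt; // Unix-only: read_exact_at / write_all_at (pread / pwrite)
use std::path::Path;

const MAGIC: &[u8; 16] = b"DSE-PAGER-v1\0\0\0\0";
const HEADER_LEN: usize = 24;

pub struct Pager {
    file: File,
    page_size: u32,
    num_pages: u32,
}

// Hypothetical helper: every validation failure becomes an InvalidInput error.
fn invalid(msg: &str) -> Error {
    Error::new(ErrorKind::InvalidInput, msg)
}

impl Pager {
    /// `_capacity` is accepted for the later cache step and ignored here.
    pub fn open(path: &Path, page_size: u32, _capacity: usize) -> io::Result<Pager> {
        if (page_size as usize) < HEADER_LEN {
            return Err(invalid("page_size smaller than header"));
        }
        let file = OpenOptions::new()
            .read(true).write(true).create(true).truncate(false)
            .open(path)?;
        if file.metadata()?.len() == 0 {
            // Fresh file: page 0 holds the header, so num_pages starts at 1.
            let mut pager = Pager { file, page_size, num_pages: 1 };
            pager.flush()?;
            return Ok(pager);
        }
        let mut hdr = [0u8; HEADER_LEN];
        file.read_exact_at(&mut hdr, 0)?;
        if hdr[..16] != MAGIC[..] {
            return Err(invalid("bad magic"));
        }
        // Assumed encoding: two little-endian u32 fields after the magic.
        let on_disk = u32::from_le_bytes([hdr[16], hdr[17], hdr[18], hdr[19]]);
        let num_pages = u32::from_le_bytes([hdr[20], hdr[21], hdr[22], hdr[23]]);
        if on_disk != page_size {
            return Err(invalid("page_size mismatch"));
        }
        Ok(Pager { file, page_size, num_pages })
    }

    pub fn allocate(&mut self) -> u32 {
        let pid = self.num_pages;
        self.num_pages += 1; // in memory only until the next flush()
        pid
    }

    // Validate 1 <= pid < num_pages and return the page's byte offset.
    fn offset_of(&self, pid: u32) -> io::Result<u64> {
        if pid < 1 || pid >= self.num_pages {
            return Err(invalid("pid out of range"));
        }
        Ok(pid as u64 * self.page_size as u64)
    }

    pub fn read(&self, pid: u32) -> io::Result<Vec<u8>> {
        let offset = self.offset_of(pid)?;
        let mut buf = vec![0u8; self.page_size as usize];
        self.file.read_exact_at(&mut buf, offset)?;
        Ok(buf)
    }

    pub fn write(&mut self, pid: u32, bytes: &[u8]) -> io::Result<()> {
        if bytes.len() != self.page_size as usize {
            return Err(invalid("bytes.len() != page_size"));
        }
        let offset = self.offset_of(pid)?;
        self.file.write_all_at(bytes, offset)
    }

    pub fn flush(&mut self) -> io::Result<()> {
        let mut page0 = vec![0u8; self.page_size as usize];
        page0[..16].copy_from_slice(MAGIC);
        page0[16..20].copy_from_slice(&self.page_size.to_le_bytes());
        page0[20..24].copy_from_slice(&self.num_pages.to_le_bytes());
        self.file.write_all_at(&page0, 0)?;
        // Assumption, not required by the task list: make allocated but
        // unwritten pages exist on disk as zeros. This never shrinks the file,
        // because write() rejects any pid >= num_pages.
        self.file.set_len(self.num_pages as u64 * self.page_size as u64)?;
        self.file.sync_all()
    }

    pub fn close(mut self) -> io::Result<()> {
        self.flush() // the file handle is dropped when `self` goes out of scope
    }
}
```

In this sketch, a page that was allocated but never written fails to read with an unexpected-EOF error until the next flush(), and reads back as zeros afterwards. Without the `set_len` call, the file would grow only when the highest allocated page is actually written.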
Acceptance
Inline unit tests:
- header_round_trip: open new file, close, reopen, assert num_pages == 1 and the magic is intact.
- allocate_monotonic: three allocate() calls in a row return 1, 2, 3.
- write_then_read_same_pager: allocate, write a known byte pattern, read it back, assert equal.
- write_then_reopen_then_read: allocate, write, flush(), drop, reopen, read; bytes survived.
- flush_extends_file: after allocate + write + flush, file size equals (num_pages) * page_size.
All five tests green in all three languages: Rust, Go, and C++.
Discussion prompts
- Why is num_pages stored on page 0 rather than inferred from the file size? (Hint: what happens between allocate() and flush() if the OS crashes?)
- What goes wrong if open() is called concurrently from two processes on the same file?
- Why does flush() rewrite page 0 even if no data page changed?