Step 1 — Open a File and Write Bytes
Goal
Build the smallest possible thing that touches the disk: open a file, write some bytes at a known offset, close the file. You'll do this three times — once in Rust, Go, and C++ — so you can feel how each language exposes the same pread/pwrite/fsync primitives.
Prerequisites
- Toolchain installed per ../../TOOLS.md.
- An empty editor and a terminal in this lab's directory.
What You're Building
A function with this signature (conceptually):
write_page(path: string, page_no: u64, bytes: [u8]) -> Result
- Opens (or creates) path for read+write.
- Computes offset = page_no * PAGE_SIZE (with PAGE_SIZE = 4096).
- Zero-pads bytes to exactly PAGE_SIZE.
- pwrites the padded buffer at offset.
- Calls fdatasync (or fsync if fdatasync is unavailable).
- Closes the file.
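The steps above can be sketched in Rust as follows. This is a minimal illustration, not the lab's actual lib.rs: it writes the raw payload only (no page header or magic), and the error messages and the choice to reject oversized payloads are assumptions made for the example.

```rust
use std::fs::OpenOptions;
use std::io;
use std::os::unix::fs::FileExt;
use std::path::Path;

/// Page size used throughout this step.
pub const PAGE_SIZE: usize = 4096;

/// Sketch of write_page: zero-pad `bytes` to one page and write it at page `page_no`.
pub fn write_page(path: &Path, page_no: u64, bytes: &[u8]) -> io::Result<()> {
    // Assumption for this sketch: a payload larger than one page is an error.
    if bytes.len() > PAGE_SIZE {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload larger than one page",
        ));
    }

    // offset = page_no * PAGE_SIZE, guarded against u64 overflow.
    let offset = page_no
        .checked_mul(PAGE_SIZE as u64)
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "page offset overflows u64"))?;

    // Zero-pad to exactly PAGE_SIZE.
    let mut buf = [0u8; PAGE_SIZE];
    buf[..bytes.len()].copy_from_slice(bytes);

    // Open (or create) for read+write without truncating existing pages.
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(false)
        .open(path)?;

    file.write_all_at(&buf, offset)?; // positional write, no seek pointer involved
    file.sync_data()?; // fdatasync on Linux
    Ok(()) // `file` is closed when it goes out of scope
}
```

Calling write_page(path, 1, b"hello") on a new file produces an 8192-byte file: page 0 is a hole that reads back as zeros, and page 1 starts with the payload followed by zero-padding.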
Why pwrite, not write
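The classic POSIX write syscall writes at the file's current offset, the seek pointer that lseek moves. That makes it stateful: two threads calling write on the same fd share one offset and will race. pwrite takes an explicit offset and never touches the seek pointer, so concurrent calls on the same fd do not interfere with each other's positioning. Every database in this curriculum uses pwrite. No lseek in our code, ever.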
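To see why the explicit offset matters, here is a small sketch in which several threads share one File and each writes its own page with write_all_at. The function name fill_pages_concurrently and the fill pattern are hypothetical, chosen only for this illustration.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;
use std::thread;

const PAGE_SIZE: usize = 4096;

/// Fill pages 0..n of `file`, one thread per page.
/// Page i is filled with the byte value i.
pub fn fill_pages_concurrently(file: &File, n: u8) -> io::Result<()> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..n)
            .map(|i| {
                s.spawn(move || {
                    let buf = [i; PAGE_SIZE];
                    // Each thread names its own offset; no shared seek pointer is used.
                    file.write_all_at(&buf, u64::from(i) * PAGE_SIZE as u64)
                })
            })
            .collect();

        // Propagate the first I/O error, if any.
        for h in handles {
            h.join().expect("writer thread panicked")?;
        }
        Ok(())
    })
}
```

Because no thread depends on where another thread left the file offset, the pages land in the right place regardless of scheduling order. The same code written with seek followed by write would need a lock around each pair of calls.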
Why PAGE_SIZE = 4096
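4096 bytes is the page size on x86_64 and the default on most ARM64 Linux systems (Apple Silicon uses 16 KiB pages, and some ARM64 kernels are built with 16 KiB or 64 KiB pages). It is also a multiple of the 512-byte and 4 KiB sector sizes that storage devices commonly expose, so on those systems one page write lines up with whole page-cache pages and whole device sectors. Mismatched sizes cause read-modify-write at the kernel layer: a buffered write of 100 bytes to a page that is not already cached requires the kernel to first read the 4 KiB page containing those bytes, modify it, and write it back. By always writing a full, aligned page, you avoid that hidden cost.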
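The alignment argument comes down to a little integer arithmetic. The helpers below are hypothetical names introduced for this sketch; they show which pages a write touches and whether it covers only whole pages.

```rust
pub const PAGE_SIZE: u64 = 4096;

/// Byte offset of the start of page `page_no`.
pub fn page_offset(page_no: u64) -> u64 {
    page_no * PAGE_SIZE
}

/// True when a write of `len` bytes at `offset` covers only whole pages.
pub fn is_page_aligned(offset: u64, len: u64) -> bool {
    len > 0 && offset % PAGE_SIZE == 0 && len % PAGE_SIZE == 0
}

/// Inclusive range (first, last) of page numbers touched by a write,
/// or None for an empty write.
pub fn pages_touched(offset: u64, len: u64) -> Option<(u64, u64)> {
    if len == 0 {
        return None;
    }
    let first = offset / PAGE_SIZE;
    let last = (offset + len - 1) / PAGE_SIZE;
    Some((first, last))
}
```

For example, a 100-byte write at offset 4000 is unaligned and touches pages 0 and 1, so it can trigger read-modify-write on both. A 4096-byte write at page_offset(2) touches only page 2 and replaces it completely.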
Why fdatasync Over fsync
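fsync flushes data and all metadata (file size, modification time). fdatasync flushes the data plus only the metadata needed to read that data back, such as a changed file size. For a write that doesn't change the file size (the common case in a steady-state database), fdatasync can skip the metadata flush, which avoids an extra write on many filesystems; how much time that saves depends on the filesystem and the device. Use fdatasync when you can.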
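One way to apply this rule in Rust is sketched below, using sync_data (fdatasync on Linux) for in-place overwrites and sync_all (fsync) when the write extends the file. Choosing sync_all on growth is a conservative assumption of this sketch, not something the lab code is known to do, and write_page_durably is a hypothetical name.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;

const PAGE_SIZE: usize = 4096;

/// Write one full page and flush it.
/// Returns true if the write extended the file.
pub fn write_page_durably(
    file: &File,
    page_no: u64,
    page: &[u8; PAGE_SIZE],
) -> io::Result<bool> {
    let offset = page_no * PAGE_SIZE as u64;
    // The write grows the file if its end lies beyond the current length.
    let grew = offset + PAGE_SIZE as u64 > file.metadata()?.len();

    file.write_all_at(page, offset)?;

    if grew {
        file.sync_all()?; // data and all metadata
    } else {
        file.sync_data()?; // data only; the size did not change
    }
    Ok(grew)
}
```

In a steady-state database most writes overwrite pages that already exist, so the cheaper sync_data branch is the one taken almost every time.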
Rust Implementation
In ../src/rust/src/lib.rs we use write_all_at from the std::os::unix::fs::FileExt extension trait. It loops over write_at, which calls pwrite (pwrite64 on Linux), until the whole buffer has been written. Look at the function write_page.
Key idiom:
use std::os::unix::fs::FileExt;

file.write_all_at(&buf, offset)?; // positional write at an explicit offset
file.sync_data()?; // fdatasync on Linux
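sync_data is Rust's portable name for this operation: it calls fdatasync on Linux and fcntl(F_FULLFSYNC) on macOS. macOS also defines F_BARRIERFSYNC, a faster middle ground that only orders writes; check the standard library source for your toolchain before assuming sync_data uses it.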
Go Implementation
In ../src/go/pagealloc.go, the WriteAt method is pwrite, and f.Sync() is fsync. There is no first-class fdatasync in os, so we call unix.Fdatasync(fd) from golang.org/x/sys/unix.
if _, err := f.WriteAt(buf, offset); err != nil { return err }
return unix.Fdatasync(int(f.Fd()))
On macOS, golang.org/x/sys/unix does not export unix.Fdatasync, so we fall back to unix.FcntlInt(fd, unix.F_FULLFSYNC, 0). The wrapper in fsync_full.go handles the platform branch.
C++ Implementation
In ../src/cpp/src/pagealloc.cc:
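// pwrite may write fewer bytes than requested; a short write is treated as an error here.
ssize_t n = ::pwrite(fd, buf.data(), buf.size(), offset);
if (n != static_cast<ssize_t>(buf.size())) return std::errc::io_error;
// fdatasync can fail too (for example with EIO), so check its result.
if (::fdatasync(fd) != 0) return std::errc::io_error;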
On macOS we use ::fcntl(fd, F_FULLFSYNC). The dispatch is in fsync_full.cc.
Try It
cd src/rust && cargo build --release
./target/release/pagealloc write /tmp/step1.bin 0 "first page"
xxd -l 32 /tmp/step1.bin
Expected output:
00000000: 4547 4150 3145 5344 0000 0000 0000 0000 EGAP1ESD........
00000010: 6669 7273 7420 7061 6765 0000 0000 0000 first page......
The first 8 bytes are our little-endian page magic 0x44534531_50414745 (read as bytes left-to-right: 45 47 41 50 31 45 53 44).
Bytes 16+ contain your ASCII payload "first page" followed by zero-padding to 4 KiB.
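The byte order of the magic is easy to check in code. The sketch below encodes the constant from the text with to_le_bytes; the names PAGE_MAGIC, magic_bytes, and has_magic are hypothetical and not taken from the lab source.

```rust
/// Page magic from the text above.
pub const PAGE_MAGIC: u64 = 0x4453_4531_5041_4745;

/// The magic as it appears on disk: least significant byte first.
pub fn magic_bytes() -> [u8; 8] {
    PAGE_MAGIC.to_le_bytes()
}

/// True when `page` starts with the on-disk magic.
pub fn has_magic(page: &[u8]) -> bool {
    page.len() >= 8 && page[..8] == magic_bytes()
}
```

magic_bytes() yields the ASCII bytes "EGAP1ESD", matching the first eight bytes of the xxd output. Read most significant byte first, the same constant spells "DSE1PAGE", which is why the dump looks reversed.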
What Just Happened
1. You opened a file (open(2) with O_RDWR | O_CREAT).
2. You wrote exactly one page at exactly one offset (pwrite(2)).
3. You forced the data to stable storage (fdatasync(2) or F_FULLFSYNC on macOS).
4. You closed the fd, which does not flush: close(2) returns immediately.
On a power loss between step 3 and step 4, your write survives. Without step 3, it might not.
Next
In Step 2 you'll add the read path and a hexdump utility, and verify that all three implementations produce byte-identical files.