Step 1 — Open a File and Write Bytes

Goal

Build the smallest possible thing that touches the disk: open a file, write some bytes at a known offset, close the file. You'll do this three times (once each in Rust, Go, and C++) so you can feel how each language exposes the same pread/pwrite/fsync primitives.

Prerequisites

  • Toolchain installed per ../../TOOLS.md.
  • An empty editor and a terminal in this lab's directory.

What You're Building

A function with this signature (conceptually):

write_page(path: string, page_no: u64, bytes: [u8]) -> Result
  • Opens (or creates) path for read+write.
  • Computes offset = page_no * PAGE_SIZE (with PAGE_SIZE = 4096).
  • Zero-pads bytes to exactly PAGE_SIZE.
  • pwrites the padded buffer at offset.
  • Calls fdatasync (or fsync if fdatasync is unavailable).
  • Closes the file.
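The steps above can be sketched in Rust using only the standard library. This is a minimal illustration, not the lab's actual lib.rs: rejecting oversized payloads and checking the offset for overflow are assumptions about reasonable behavior, and the sketch writes the raw payload at the start of the page without the page magic that appears in the expected output later in this step.

```rust
use std::fs::OpenOptions;
use std::io;
use std::os::unix::fs::FileExt; // Unix-only: provides write_all_at (pwrite)
use std::path::Path;

/// Page size used throughout this step.
pub const PAGE_SIZE: usize = 4096;

/// Writes `bytes`, zero-padded to one full page, at page `page_no` of `path`.
pub fn write_page(path: &Path, page_no: u64, bytes: &[u8]) -> io::Result<()> {
    // Assumption: a payload larger than one page is rejected, not truncated.
    if bytes.len() > PAGE_SIZE {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload exceeds PAGE_SIZE",
        ));
    }

    // offset = page_no * PAGE_SIZE, guarding against u64 overflow.
    let offset = page_no.checked_mul(PAGE_SIZE as u64).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "page offset overflows u64")
    })?;

    // Zero-pad the payload to exactly one page.
    let mut buf = [0u8; PAGE_SIZE];
    buf[..bytes.len()].copy_from_slice(bytes);

    // Open (or create) for read+write; existing pages are never truncated.
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(false)
        .open(path)?;

    file.write_all_at(&buf, offset)?; // positional write, no seek pointer involved
    file.sync_data()?; // ask the OS to push the data to stable storage
    Ok(()) // `file` is closed when it goes out of scope
}
```

Calling write_page(Path::new("/tmp/demo.bin"), 1, b"hello") on a fresh file should leave an 8 KiB file whose second page starts with "hello"; page 0 is a hole that reads back as zeros.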

Why pwrite, not write

The classic POSIX write syscall uses the file's seek pointer (the one lseek moves). That makes it stateful: two threads writing to the same fd will race on the shared offset. pwrite takes an explicit offset and never reads or updates the seek pointer, so concurrent writers to different pages do not interfere. Every database in this curriculum uses pwrite. No lseek in our code, ever.
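To make the point concrete, here is a small sketch in which several threads share one File and each writes its own page with a positional write. The function name fill_pages_concurrently and the fill pattern are hypothetical, chosen only for illustration; no locking is needed because no thread depends on a shared seek position.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt; // Unix-only: provides write_all_at (pwrite)
use std::thread;

pub const PAGE_SIZE: usize = 4096;

/// Fills pages 0..pages of `file` from separate threads.
/// Page i is filled with the byte value i (hypothetical pattern).
pub fn fill_pages_concurrently(file: &File, pages: u8) -> io::Result<()> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..pages)
            .map(|i| {
                s.spawn(move || {
                    let buf = [i; PAGE_SIZE];
                    // Each thread names its own offset explicitly.
                    file.write_all_at(&buf, u64::from(i) * PAGE_SIZE as u64)
                })
            })
            .collect();

        // Propagate the first I/O error, if any.
        for handle in handles {
            handle.join().expect("writer thread panicked")?;
        }
        Ok(())
    })
}
```

The same loop written with seek followed by write would need a mutex around each seek+write pair, because another thread could move the offset between the two calls.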

Why PAGE_SIZE = 4096

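It matches the OS page size on x86_64 and the default on most ARM64 Linux systems (ARM64 kernels can also be configured with 16 KiB or 64 KiB pages, and Apple Silicon macOS uses 16 KiB). On those systems the kernel page cache unit and your write are the same size, and many modern drives use 4 KiB sectors as well. Mismatched sizes cause read-modify-write at the kernel layer: writing 100 bytes into a page that is not already cached requires the kernel to first read the 4 KiB page containing those bytes, modify it, and write it back. By always writing a full, aligned page, you avoid that hidden cost.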
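The arithmetic behind that argument is easy to check. The following sketch (helper names pages_touched and is_page_aligned are hypothetical) computes which pages a byte range touches and whether a write covers whole pages only, which is the condition under which no partial page has to be read first.

```rust
pub const PAGE_SIZE: u64 = 4096;

/// Returns the inclusive (first, last) page numbers touched by writing
/// `len` bytes at `offset`, or None for an empty or overflowing range.
pub fn pages_touched(offset: u64, len: u64) -> Option<(u64, u64)> {
    if len == 0 {
        return None;
    }
    let last_byte = offset.checked_add(len - 1)?;
    Some((offset / PAGE_SIZE, last_byte / PAGE_SIZE))
}

/// True when the write starts on a page boundary and covers whole pages only.
pub fn is_page_aligned(offset: u64, len: u64) -> bool {
    len > 0 && offset % PAGE_SIZE == 0 && len % PAGE_SIZE == 0
}
```

For example, a 100-byte write at offset 4000 spans bytes 4000 to 4099, so it touches pages 0 and 1 and fully covers neither, while a 4096-byte write at offset 8192 touches only page 2 and covers it completely.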

Why fdatasync Over fsync

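fsync flushes data and all metadata (file size, modification time). fdatasync flushes the data plus only the metadata needed to read it back, such as a changed file size. For a write that doesn't change the file size (the common case in a steady-state database), fdatasync can skip the metadata flush entirely. How much that saves depends on the file system and the device, so measure on your own hardware. Use fdatasync when you can.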
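In Rust's standard library the two calls appear as File::sync_data and File::sync_all. The sketch below wraps them behind a small enum so the choice is explicit at the call site; the Durability type and the flush function are hypothetical names used only for illustration.

```rust
use std::fs::File;
use std::io;

/// How much must reach stable storage before the call returns.
pub enum Durability {
    /// File contents plus the metadata needed to read them back (fdatasync-style).
    DataOnly,
    /// File contents and all metadata, such as timestamps (fsync-style).
    DataAndMetadata,
}

/// Flushes `file` at the requested durability level.
pub fn flush(file: &File, level: Durability) -> io::Result<()> {
    match level {
        Durability::DataOnly => file.sync_data(),
        Durability::DataAndMetadata => file.sync_all(),
    }
}
```

A page overwrite inside an already-allocated file would typically use Durability::DataOnly.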

Rust Implementation

In ../src/rust/src/lib.rs we use the std::os::unix::fs::FileExt::write_all_at extension. It loops over write_at, which maps to pwrite(2) (pwrite64 on Linux), until the whole buffer has been written. Look at the function write_page.

Key idiom:

use std::os::unix::fs::FileExt;    // brings write_all_at into scope
file.write_all_at(&buf, offset)?;  // pwrite until the whole buffer is written
file.sync_data()?;                 // == fdatasync

sync_data is Rust's portable name for this operation: it calls fdatasync on Linux and fcntl(F_FULLFSYNC) on macOS.

Go Implementation

In ../src/go/pagealloc.go, the WriteAt method is pwrite, and f.Sync() is fsync. There is no first-class fdatasync in os, so we call unix.Fdatasync(fd) from golang.org/x/sys/unix.

if _, err := f.WriteAt(buf, offset); err != nil { return err }
return unix.Fdatasync(int(f.Fd()))

On macOS, golang.org/x/sys/unix does not export Fdatasync. We fall back to unix.FcntlInt(fd, unix.F_FULLFSYNC, 0). The wrapper in fsync_full.go handles the platform branch.

C++ Implementation

In ../src/cpp/src/pagealloc.cc:

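ssize_t n = ::pwrite(fd, buf.data(), buf.size(), offset);
if (n != static_cast<ssize_t>(buf.size())) return std::errc::io_error;  // error or short write
if (::fdatasync(fd) != 0) return std::errc::io_error;  // a failed sync means the data may not be durable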

On macOS we use ::fcntl(fd, F_FULLFSYNC). The dispatch is in fsync_full.cc.

Try It

cd src/rust && cargo build --release
./target/release/pagealloc write /tmp/step1.bin 0 "first page"
xxd -l 32 /tmp/step1.bin

Expected output:

00000000: 4547 4150 3145 5344 0000 0000 0000 0000  EGAP1ESD........
00000010: 6669 7273 7420 7061 6765 0000 0000 0000  first page......

The first 8 bytes are our little-endian page magic 0x44534531_50414745 (read as bytes left-to-right: 45 47 41 50 31 45 53 44). Bytes 8 to 15 are zero in this dump. Bytes 16+ contain your ASCII payload "first page" followed by zero-padding to 4 KiB.
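The byte order is easy to verify with the standard integer conversion methods. In this sketch the constant name PAGE_MAGIC and the helper magic_bytes are hypothetical; only the value comes from the text above.

```rust
/// The page magic quoted above (constant name is hypothetical).
pub const PAGE_MAGIC: u64 = 0x4453_4531_5041_4745;

/// The magic as it is laid out on disk: least significant byte first.
pub fn magic_bytes() -> [u8; 8] {
    PAGE_MAGIC.to_le_bytes()
}
```

magic_bytes() yields the ASCII bytes "EGAP1ESD", which is what xxd prints. The same value in big-endian order would read "DSE1PAGE".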

What Just Happened

  1. You opened a file (open(2) with O_RDWR | O_CREAT).
  2. You wrote exactly one page at exactly one offset (pwrite(2)).
  3. You forced the data to stable storage (fdatasync(2) or F_FULLFSYNC on macOS).
  4. You closed the fd, which does not flush — close(2) returns immediately.

On a power loss between step 3 and step 4, your write survives. Without step 3, it might not.

Next

In Step 2 you'll add the read path and a hexdump utility, and verify that all three implementations produce byte-identical files.