Observation — Storage Primitives
How to look inside the page cache, watch syscalls, measure latency, and prove to yourself that your code is doing what you think.
Looking at the Page Cache
Linux
# What's in the page cache for our file? (Requires `pcstat` or `vmtouch`.)
go install github.com/tobert/pcstat/pcstat@latest
pcstat /tmp/lab01.bin
# Drop the page cache (requires root) — to test "cold" reads.
sync && sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
macOS
# `purge` drops the unified buffer cache (requires admin password).
sudo purge
# `fs_usage` is the macOS counterpart to strace for file activity. It attaches to
# running processes by name or PID, so start pagealloc in another terminal first.
sudo fs_usage -w -f filesys pagealloc
Watching Syscalls
# Linux
strace -e trace=openat,pread64,pwrite64,fsync,fdatasync,close \
./src/go/pagealloc-go write /tmp/lab01.bin 0 "hello"
# macOS (sudo required for dtrace; dtruss -t accepts a single syscall name, so filter with grep instead)
sudo dtruss -f ./src/cpp/build/pagealloc write /tmp/lab01.bin 0 "hello" 2>&1 | grep -E 'pread|pwrite|fsync'
You should see, in order (after any openat calls the dynamic loader makes for shared libraries):
openat(AT_FDCWD, "/tmp/lab01.bin", O_RDWR|O_CREAT, 0644) = 3
pwrite64(3, "hello\0\0...", 4096, 0) = 4096
fdatasync(3) = 0
close(3) = 0
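If you see read(3, ...) without an offset, your code is reading through the file cursor (often via a buffered reader) instead of using positional pread; that's wrong for this lab.
If you see no fsync/fdatasync, your durability is fake: the data may still be sitting only in the page cache when the process exits.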
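To make that trace concrete, here is a minimal Rust sketch of a page write that should produce the same syscall sequence on Linux. It is not the lab's actual implementation: `write_page`, `read_page`, `PAGE_SIZE`, and the temp-file name are hypothetical names chosen for illustration. Rust's standard library also opens files with `O_CLOEXEC` and mode 0666 (before umask), so the `openat` line will differ slightly from the one shown above.

```rust
use std::fs::OpenOptions;
use std::io;
use std::os::unix::fs::FileExt; // positional I/O: write_all_at -> pwrite, read_exact_at -> pread
use std::path::Path;

// Assumed page size, matching the 4096-byte pwrite64 in the trace above.
const PAGE_SIZE: usize = 4096;

/// Write `payload` into page `page_no`, zero-padded to a full page, then flush it.
fn write_page(path: &Path, page_no: u64, payload: &[u8]) -> io::Result<()> {
    if payload.len() > PAGE_SIZE {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload larger than a page",
        ));
    }
    let mut page = [0u8; PAGE_SIZE];
    page[..payload.len()].copy_from_slice(payload);

    // openat(..., O_RDWR|O_CREAT|O_CLOEXEC, 0666)
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(path)?;
    // pwrite64(fd, page, 4096, page_no * 4096): no seek, no shared file cursor.
    file.write_all_at(&page, page_no * PAGE_SIZE as u64)?;
    // fdatasync(fd) on Linux: without this the write may only live in the page cache.
    file.sync_data()?;
    Ok(()) // close(fd) happens when `file` is dropped
}

/// Read one full page back with a positional read (pread64).
fn read_page(path: &Path, page_no: u64) -> io::Result<[u8; PAGE_SIZE]> {
    let file = OpenOptions::new().read(true).open(path)?;
    let mut page = [0u8; PAGE_SIZE];
    file.read_exact_at(&mut page, page_no * PAGE_SIZE as u64)?;
    Ok(page)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch file, separate from the lab's /tmp/lab01.bin.
    let path = std::env::temp_dir().join("lab01_sketch.bin");
    write_page(&path, 0, b"hello")?;
    let page = read_page(&path, 0)?;
    assert_eq!(&page[..5], b"hello");
    assert!(page[5..].iter().all(|&b| b == 0));
    Ok(())
}
```

Running this binary under the strace command above is a quick way to check that positional writes and the flush show up where you expect them.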
Measuring Latency
The bench subcommand measures cold-cache and warm-cache pread latency:
# Preallocate a 100 MiB file (25600 pages × 4 KiB), then do 10000 random 4 KiB reads.
./src/rust/target/release/pagealloc bench /tmp/lab01.bin 25600 10000
Expected output:
preallocated: 25600 pages = 102400 KiB
warm-cache reads: p50=3.1 µs p99=8.4 µs throughput=315 MB/s
dropped page cache
cold-cache reads: p50=78 µs p99=210 µs throughput=51 MB/s
The exact numbers depend on your hardware. The shape matters:
- Warm p50 ≈ 1–5 µs: that's a memcpy from the page cache. No actual disk I/O.
- Cold p50 ≈ 50–200 µs on NVMe, 5–15 ms on a spinning disk.
- p99 well above p50 (about 2.7× in the sample above; it can exceed 10× on a busy machine): latency tails are real; this motivates io_uring and dedicated I/O threads.
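The p50 and p99 figures above are percentiles over per-read latencies. The sketch below shows one way such numbers can be computed, using the nearest-rank method. It is an assumption about how a benchmark like this might work, not the `bench` subcommand's actual code; `percentile` and `sample_latencies` are hypothetical helpers, and the timed closure is a stand-in for a real `pread`.

```rust
use std::time::{Duration, Instant};

/// Nearest-rank percentile over an already sorted slice; `p` is in (0, 100].
fn percentile(sorted: &[Duration], p: f64) -> Duration {
    assert!(!sorted.is_empty(), "need at least one sample");
    // Rank is 1-based: the smallest rank covering p percent of the samples.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Time `op` `n` times and return the per-call latencies, sorted ascending.
fn sample_latencies<F: FnMut()>(n: usize, mut op: F) -> Vec<Duration> {
    let mut samples = Vec::with_capacity(n);
    for _ in 0..n {
        let start = Instant::now();
        op();
        samples.push(start.elapsed());
    }
    samples.sort();
    samples
}

fn main() {
    // Stand-in workload; a real benchmark would issue one positional read here.
    let mut acc = 0u64;
    let samples = sample_latencies(10_000, || {
        acc = acc.wrapping_add(std::hint::black_box(1));
    });
    println!(
        "p50={:?} p99={:?}",
        percentile(&samples, 50.0),
        percentile(&samples, 99.0)
    );
}
```

Recording every sample and sorting keeps the tail visible; averaging alone would hide exactly the p99 behavior this section asks you to look at.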
Profiling Tools
Rust
cargo install cargo-flamegraph
cd src/rust
cargo flamegraph --release --bin pagealloc -- bench /tmp/lab01.bin 25600 100000
# open flamegraph.svg in your browser
Go
cd src/go
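# -cpuprofile works only when a single package is tested; if ./... matches several, name the package instead.
go test -bench=BenchmarkPread -cpuprofile=cpu.prof ./...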
go tool pprof -http=:8080 cpu.prof
C++
# Linux
perf record -F 999 -g ./src/cpp/build/pagealloc bench /tmp/lab01.bin 25600 100000
perf report
# macOS (use Instruments.app or sample)
sample pagealloc 5 -file /tmp/sample.txt
Watching Disk Throughput
# Linux (iostat from sysstat package)
iostat -dx 1 nvme0n1
# macOS
sudo fs_usage -w -f diskio
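While running pagealloc bench, watch r/s (reads per second), rkB/s, and r_await (average read latency in ms; older sysstat versions report a combined await). At QD=1 the read rate is bounded by 1/latency, so at roughly 80 µs per cold read on NVMe expect r/s to plateau around 12,000; you'd need io_uring (Lab 21) to push it into the hundreds of thousands.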
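The QD=1 ceiling is plain arithmetic: with one request in flight, the read rate cannot exceed one read per latency interval. A small sketch, using the sample cold-cache p50 of 78 µs from the bench output as the assumed per-read latency (`qd1_reads_per_sec` and `throughput_mb_per_sec` are hypothetical helper names):

```rust
/// Upper bound on reads per second when only one request is in flight (QD=1).
fn qd1_reads_per_sec(latency_us: f64) -> f64 {
    1_000_000.0 / latency_us
}

/// Throughput in MB/s (10^6 bytes) for fixed-size reads at a given rate.
fn throughput_mb_per_sec(reads_per_sec: f64, read_bytes: u64) -> f64 {
    reads_per_sec * read_bytes as f64 / 1_000_000.0
}

fn main() {
    // 78 µs is the sample cold-cache p50; substitute your own measurement.
    let rps = qd1_reads_per_sec(78.0);
    println!(
        "{:.0} r/s, {:.1} MB/s",
        rps,
        throughput_mb_per_sec(rps, 4096)
    );
}
```

For 78 µs this works out to about 12,800 reads per second and roughly 52.5 MB/s, which is close to the 51 MB/s in the sample cold-cache line. If iostat shows a much higher r/s than this bound for your measured latency, the reads are probably not all reaching the device.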
Verifying Endianness
# Create a fresh file; the integer 0x01020304 goes in next, and we'll inspect its bytes with xxd.
./src/rust/target/release/pagealloc write /tmp/endian.bin 0 ""
# In a separate REPL session, use whichever language you prefer to write a binary u32 to the file.
# Then xxd the file:
xxd /tmp/endian.bin | head -1
A little-endian system writes 04 03 02 01 for the value 0x01020304. If you see 01 02 03 04, either your machine is big-endian (x86 never is, and ARM almost never runs that way) or your code is using to_be_bytes, or another big-endian encoder, somewhere.
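For the "write a binary u32" step, a minimal Rust sketch might look like the following. `write_u32_le` and the scratch path are hypothetical, chosen so the example does not touch /tmp/endian.bin; note that it replaces the whole file with the four encoded bytes rather than writing into an existing page.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Overwrite `path` with the 4-byte little-endian encoding of `value`.
fn write_u32_le(path: &Path, value: u32) -> io::Result<()> {
    fs::write(path, value.to_le_bytes())
}

fn main() -> io::Result<()> {
    let value: u32 = 0x0102_0304;

    // Explicit encodings never depend on the host CPU.
    assert_eq!(value.to_le_bytes(), [0x04, 0x03, 0x02, 0x01]);
    assert_eq!(value.to_be_bytes(), [0x01, 0x02, 0x03, 0x04]);

    // Hypothetical scratch file; point this at your own file to xxd it.
    let path = std::env::temp_dir().join("endian_sketch.bin");
    write_u32_le(&path, value)?;

    let bytes = fs::read(&path)?;
    assert_eq!(bytes, [0x04, 0x03, 0x02, 0x01]);
    println!("{:02x?}", bytes);
    Ok(())
}
```

Using `to_le_bytes` rather than `to_ne_bytes` pins the on-disk format, so the file reads back the same on any host.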