Execution — db-05 LSM MemTable

Library API (Rust shape; mirrored in Go and C++)

#![allow(unused)]
fn main() {
pub enum Entry { Value(Vec<u8>), Tombstone }

pub struct MemTable { /* sorted map */ }

impl MemTable {
    pub fn new() -> Self;
    pub fn len(&self) -> usize;
    pub fn size_bytes(&self) -> usize;
    pub fn put(&mut self, key: &[u8], value: &[u8]);
    pub fn delete(&mut self, key: &[u8]);
    pub fn get(&self, key: &[u8]) -> Option<&Entry>;
    pub fn iter(&self) -> impl Iterator<Item = (&[u8], &Entry)>;
    pub fn encode(&self) -> Vec<u8>;
    pub fn decode(bytes: &[u8]) -> Result<Self, Error>;
}
}

Go: func New() *MemTable, func (*MemTable) Put / Delete / Get / Iter / Encode, func Decode([]byte) (*MemTable, error). Iter yields a slice of (key, entry) pairs in sorted order.

C++: class MemTable with the same names; Iter() returns a const reference to the underlying std::map.

CLI: memtable

memtable <subcommand> [args]

Subcommands:
  new PATH                       create an empty MemTable at PATH
  put PATH KEY VALUE             open PATH, put, save
  del PATH KEY                   open PATH, delete (writes tombstone), save
  get PATH KEY                   print 'value: <hex>' | 'tombstone' | 'absent'
  iter PATH                      print one line per entry: TYPE KEY VALUE  (hex)
  bulk PATH N                    open or create PATH, insert key0..key{N-1}
                                 with values val0..val{N-1}, save
  size PATH                      print 'entries=N size_bytes=B'

Iter output format (deterministic, used by cross-test):

V <hex-key> <hex-value>
T <hex-key>

Hex is lowercase, no 0x prefix, no separators.

Build & test

Per language:

# Rust
cd src/rust && cargo test --release && cargo build --release

# Go
cd src/go && go test ./... && go build -o bin/memtable ./cmd/memtable

# C++
cd src/cpp && cmake -S . -B build -DCMAKE_BUILD_TYPE=Release \
            && cmake --build build && ( cd build && ctest --output-on-failure )

Or run all at once:

bash scripts/verify.sh

Cross-language interop test

scripts/cross_test.sh:

  1. Build all three binaries.
  2. Drive each one through the same sequence of bulk 100, a handful of puts with overwrites, and a handful of dels.
  3. SHA-256 each dump; assert all three match.
  4. For each writer/reader pair, run iter and check the output is byte-identical.

If any pair differs, the script prints the failing combination and exits non-zero.

Manual playground

$ memtable new /tmp/m.bin
$ memtable put /tmp/m.bin alpha "first"
$ memtable put /tmp/m.bin beta  "second"
$ memtable put /tmp/m.bin alpha "first-updated"   # overwrite
$ memtable del /tmp/m.bin beta                    # tombstone
$ memtable iter /tmp/m.bin
V 616c706861 66697273742d75706461746564
T 62657461
$ memtable get /tmp/m.bin alpha
value: 66697273742d75706461746564
$ memtable get /tmp/m.bin beta
tombstone
$ memtable get /tmp/m.bin gamma
absent
$ memtable size /tmp/m.bin
entries=2 size_bytes=37

37 = 8 (header) + (9+5+13) + (9+4+0) = 8 + 27 + 13 — two entries with key "alpha"/value "first-updated" and tombstone for key "beta".

What broken looks like

SymptomLikely cause
Cross-test SHA mismatch on first byte setMagic disagreement (must be ASCII MMT1).
Cross-test SHA mismatch mid-fileEndianness or type byte placement differs.
iter order differs across langsGo's map iteration order; missed the sort.Strings.
get returns absent after delTombstone not stored, only erased.
Decoder accepts trailing garbageForgot the "consumed all bytes" check.