Execution — SSTable Format
Library API (uniform across Rust / Go / C++)
struct Entry { type: 0|1; value: bytes } // tombstone => value == empty
struct IndexEntry { key: bytes; offset: u64; size: u64 }
struct Footer { index_offset: u64; index_size: u64; num_blocks: u64; magic: "SST1\0\0\0\0" }
const BLOCK_TARGET: usize = 4096
const FOOTER_LEN: usize = 32
const MAGIC: &[u8; 8] = b"SST1\0\0\0\0"
// ---- writer ----
SstWriter::new(target_block_size = BLOCK_TARGET)
SstWriter::add(&mut self, key: &[u8], entry: Entry) // keys MUST be strictly ascending
SstWriter::finish(&mut self) -> Vec<u8> // returns full SSTable bytes
// ---- reader ----
SstReader::open(bytes: &[u8]) -> Result<Self, Error>
SstReader::len(&self) -> usize // num entries
SstReader::num_blocks(&self) -> usize
SstReader::get(&self, key: &[u8]) -> Option<Entry> // None only if absent; a tombstone is returned, not skipped
SstReader::iter(&self) -> impl Iterator<Item=(&[u8], Entry)> // full file scan
Error variants: Short, BadMagic, BadBlock, Unsorted,
BadTombstone, BadType, IndexOutOfRange.
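The footer layout above can be sketched as a pair of encode/decode helpers. This is a minimal illustration rather than the reference implementation. Little-endian integer encoding is an assumption (the spec fixes only the field order, the 32-byte length, and the magic). The helper names encode_footer and decode_footer are hypothetical, and so is the mapping of Short, BadMagic, and IndexOutOfRange to the specific checks shown.

```rust
// Sketch of the 32-byte footer: three u64 fields followed by the 8-byte magic.
// ASSUMPTION: integers are little-endian; the spec does not state endianness.
const FOOTER_LEN: usize = 32;
const MAGIC: &[u8; 8] = b"SST1\0\0\0\0";

#[derive(Debug, PartialEq)]
struct Footer {
    index_offset: u64,
    index_size: u64,
    num_blocks: u64,
}

// Subset of the documented error variants; which check raises which
// variant is an assumption made for this sketch.
#[derive(Debug, PartialEq)]
enum Error {
    Short,
    BadMagic,
    IndexOutOfRange,
}

// Hypothetical helper: serialize the footer into exactly FOOTER_LEN bytes.
fn encode_footer(f: &Footer) -> [u8; FOOTER_LEN] {
    let mut out = [0u8; FOOTER_LEN];
    out[0..8].copy_from_slice(&f.index_offset.to_le_bytes());
    out[8..16].copy_from_slice(&f.index_size.to_le_bytes());
    out[16..24].copy_from_slice(&f.num_blocks.to_le_bytes());
    out[24..32].copy_from_slice(MAGIC);
    out
}

// Hypothetical helper: read the footer from the last FOOTER_LEN bytes of a file.
fn decode_footer(file: &[u8]) -> Result<Footer, Error> {
    if file.len() < FOOTER_LEN {
        return Err(Error::Short);
    }
    let body_len = file.len() - FOOTER_LEN;
    let tail = &file[body_len..];
    if tail[24..32] != MAGIC[..] {
        return Err(Error::BadMagic);
    }
    let read_u64 = |at: usize| -> u64 {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&tail[at..at + 8]);
        u64::from_le_bytes(buf)
    };
    let footer = Footer {
        index_offset: read_u64(0),
        index_size: read_u64(8),
        num_blocks: read_u64(16),
    };
    // The index block must lie entirely before the footer.
    let end = footer
        .index_offset
        .checked_add(footer.index_size)
        .ok_or(Error::IndexOutOfRange)?;
    if end > body_len as u64 {
        return Err(Error::IndexOutOfRange);
    }
    Ok(footer)
}
```

Placing the magic in the final 8 bytes lets a reader reject a non-SSTable file after reading only the tail, before it trusts any offset.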
CLI
The binary is named sstable in every language and dispatches on the
first arg:
sstable build IN.mt OUT.sst # read MemTable dump, write SSTable
sstable footer FILE.sst # print: index_offset=... index_size=... num_blocks=... magic_ok=...
sstable get FILE.sst KEY # prints: value: <hex> | tombstone | absent
sstable iter FILE.sst # prints lines: V <hex-key> <hex-value> | T <hex-key>
sstable size FILE.sst # prints: file_bytes=B entries=N num_blocks=K
Output formats match db-05 deliberately so the same cross-test helpers
(hex iter, value:/tombstone/absent get) apply.
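The get and iter line shapes can be illustrated with a few formatting helpers. This is only a sketch: the function names to_hex, format_get, and format_iter_line are hypothetical, and Entry is modeled as a two-variant enum for brevity rather than the type/value struct from the API listing. Lowercase hex is assumed, consistent with the worked example's output.

```rust
// Simplified stand-in for the library's Entry (ASSUMPTION: enum instead of
// the { type, value } struct, to keep the sketch short).
enum Entry {
    Value(Vec<u8>),
    Tombstone,
}

// Lowercase, two-digits-per-byte hex encoding with no separators.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

// Line printed by `sstable get`: value: <hex> | tombstone | absent
fn format_get(found: Option<&Entry>) -> String {
    match found {
        Some(Entry::Value(v)) => format!("value: {}", to_hex(v)),
        Some(Entry::Tombstone) => "tombstone".to_string(),
        None => "absent".to_string(),
    }
}

// Line printed by `sstable iter`: V <hex-key> <hex-value> | T <hex-key>
fn format_iter_line(key: &[u8], entry: &Entry) -> String {
    match entry {
        Entry::Value(v) => format!("V {} {}", to_hex(key), to_hex(v)),
        Entry::Tombstone => format!("T {}", to_hex(key)),
    }
}
```

With these helpers, a lookup that finds the value REPLACED formats as value: 5245504c41434544, and a tombstone for key10 appears in the iter output as T 6b65793130.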
Worked example
Given memtable bulk M.mt 100 && memtable put M.mt key50 REPLACED && memtable del M.mt key10, calling sstable build M.mt OUT.sst does:
- Decode M.mt (MemTable format from db-05).
- Iterate in sorted order; for each entry, call writer.add.
- The writer accumulates entries into a 4096-byte data-block buffer. When the next entry would overflow, it flushes the buffer:
  - records IndexEntry { key = first_key_of_block, offset, size },
  - appends the encoded block to the output stream,
  - resets the buffer with the just-added entry.
- After the last add, finish flushes the final block, then writes the index block, then a 32-byte footer ending in SST1\0\0\0\0.
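The flush-on-overflow behavior described in the steps above can be sketched as follows. The per-entry encoding used here (one type byte, two little-endian u32 lengths, then key and value bytes) is purely an assumption for illustration, as is using 1 for the tombstone type; the section does not specify either. BlockBuilder is a hypothetical name. The sketch omits the ascending-key check, the index block, and the footer.

```rust
const BLOCK_TARGET: usize = 4096;

struct IndexEntry {
    key: Vec<u8>, // first key of the block
    offset: u64,  // byte offset of the block in the output
    size: u64,    // encoded size of the block
}

// Hypothetical writer core: only the block-accumulation logic.
struct BlockBuilder {
    target: usize,
    buf: Vec<u8>,               // encoded entries of the block in progress
    first_key: Option<Vec<u8>>, // first key of the block in progress
    out: Vec<u8>,               // data blocks flushed so far
    index: Vec<IndexEntry>,
}

impl BlockBuilder {
    fn new(target: usize) -> Self {
        BlockBuilder { target, buf: Vec::new(), first_key: None, out: Vec::new(), index: Vec::new() }
    }

    fn add(&mut self, key: &[u8], is_tombstone: bool, value: &[u8]) {
        // ASSUMED record layout: type(1) | key_len(4, LE) | val_len(4, LE) | key | value
        let mut rec = Vec::with_capacity(9 + key.len() + value.len());
        rec.push(is_tombstone as u8);
        rec.extend_from_slice(&(key.len() as u32).to_le_bytes());
        rec.extend_from_slice(&(value.len() as u32).to_le_bytes());
        rec.extend_from_slice(key);
        rec.extend_from_slice(value);

        // Flush first if this record would overflow a non-empty block;
        // the new record then starts the next block.
        if !self.buf.is_empty() && self.buf.len() + rec.len() > self.target {
            self.flush();
        }
        if self.first_key.is_none() {
            self.first_key = Some(key.to_vec());
        }
        self.buf.extend_from_slice(&rec);
    }

    // Record the index entry, append the block to the output, reset the buffer.
    fn flush(&mut self) {
        if let Some(key) = self.first_key.take() {
            self.index.push(IndexEntry {
                key,
                offset: self.out.len() as u64,
                size: self.buf.len() as u64,
            });
            self.out.append(&mut self.buf);
        }
    }
}
```

Under this assumed encoding, 100 entries with 5-byte keys and 100-byte values take 114 bytes each, so 35 fit in a 4096-byte block and the writer produces three blocks (35, 35, and 30 entries) once the final flush runs. A single record larger than the target still gets a block of its own, because the overflow check only fires when the buffer is non-empty.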
The output file is then self-validating: sstable footer OUT.sst
prints the footer values, sstable iter OUT.sst reproduces every
entry in sorted order, and sstable get OUT.sst key50 returns
value: 5245504c41434544.