Observation — SSTable Format
Smallest possible SSTable
Build from an empty MemTable: zero entries, zero data blocks, an empty index, and just the footer.
file size: 0 (data) + 4 (index block, count=0) + 32 (footer) = 36 bytes
Hex (annotated):
offset  bytes                      field
0000:   00 00 00 00                # index block: count = 0
0004:   00 00 00 00 00 00 00 00    # footer.index_offset = 0
000c:   04 00 00 00 00 00 00 00    # footer.index_size = 4
0014:   00 00 00 00 00 00 00 00    # footer.num_blocks = 0
001c:   53 53 54 31 00 00 00 00    # footer.magic = "SST1\0\0\0\0"
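The 36-byte file can be reproduced in a few lines. The following is a minimal sketch, assuming all integers are little-endian (as the dump suggests); `build_empty_sstable` and `FOOTER` are hypothetical names used only for illustration, not part of any existing API.

```python
import struct

MAGIC = b"SST1\x00\x00\x00\x00"      # 8-byte magic taken from the hex dump
# Assumed footer layout: index_offset u64, index_size u64, num_blocks u64, magic
FOOTER = struct.Struct("<QQQ8s")


def build_empty_sstable() -> bytes:
    """Hypothetical helper: serialize an SSTable with zero data blocks."""
    index_block = struct.pack("<I", 0)            # index count = 0
    footer = FOOTER.pack(0, len(index_block), 0, MAGIC)
    return index_block + footer                   # no data blocks precede the index


blob = build_empty_sstable()
print(len(blob))        # 36
print(blob.hex(" "))    # compare byte-for-byte with the annotated dump
```

Comparing the printed hex against the annotated dump is a quick sanity check for a new writer implementation.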
File-size formula
For a build with N entries spread across K data blocks where the
sum of key sizes is Σk and the sum of value sizes (only for
non-tombstone entries) is Σv:
data_bytes  = Σ_blocks ( 4 + Σ_entries_in_block (9 + k + v) )
            = 4·K + 9·N + Σk + Σv
index_bytes = 4 + Σ_blocks ( 4 + 8 + 8 + first_key_len )
            = 4 + 20·K + Σ_block_first_key_lens
file_bytes  = data_bytes + index_bytes + 32
(The 4-byte per-block header is the entry count. The 9-byte
per-entry overhead is klen u32 + vlen u32 + type u8. The 20-byte
per-index-entry overhead is klen u32 + offset u64 + size u64.)
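The formula translates directly into code. This is a sketch under the assumptions stated above (9 bytes of per-entry overhead, 20 bytes per index entry, a 32-byte footer); `sstable_file_size` and its input shape are hypothetical, chosen only to make the arithmetic checkable.

```python
def sstable_file_size(blocks):
    """Predict the file size from the formula.

    blocks: list of non-empty blocks; each block is a list of (key, value)
    pairs of bytes, with value None standing for a tombstone.
    """
    data_bytes = 0
    index_bytes = 4                              # index count header
    for entries in blocks:
        data_bytes += 4                          # per-block entry count
        for key, value in entries:
            vlen = 0 if value is None else len(value)
            data_bytes += 9 + len(key) + vlen    # klen + vlen + type + payload
        first_key = entries[0][0]
        index_bytes += 20 + len(first_key)       # klen + offset + size + key
    return data_bytes + index_bytes + 32         # 32-byte footer


print(sstable_file_size([]))                     # 36
print(sstable_file_size([[(b"k", b"v")]]))       # 72
```

A writer test can compare this prediction against the actual size on disk; a mismatch usually points at a different number of blocks than expected.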
Hex walkthrough of a 3-entry SSTable
Three small entries that all land in one block (their total size stays
under the block target), e.g. put a 1, put bb 22, del ccc:
00000000 03 00 00 00 # count=3
00000004 01 00 00 00 01 00 00 00 00 'a' '1' # entry 1: klen=1 vlen=1 type=0 "a" "1"
0000000f 02 00 00 00 02 00 00 00 00 'b' 'b' '2' '2' # entry 2: klen=2 vlen=2 type=0 "bb" "22"
0000001c 03 00 00 00 00 00 00 00 01 'c' 'c' 'c' # entry 3: klen=3 vlen=0 type=1 "ccc"
00000028 01 00 00 00 # index count=1
0000002c 01 00 00 00 00 00 00 00 00 00 00 00 28 00 00 00 00 00 00 00 'a' # klen=1 offset=0 size=0x28 "a"
00000041 28 00 00 00 00 00 00 00 # footer.index_offset = 0x28
00000049 19 00 00 00 00 00 00 00 # footer.index_size = 0x19
00000051 01 00 00 00 00 00 00 00 # footer.num_blocks = 1
00000059 53 53 54 31 00 00 00 00 # magic "SST1\0\0\0\0"
# (file size = 0x61 = 97 bytes: 40 data + 25 index + 32 footer)
Note that the first key of the single block is "a", so the index
entry copies that key.
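The walkthrough can be regenerated programmatically. The sketch below assumes little-endian fields, the entry layout klen u32 + vlen u32 + type u8 + key + value, and an index entry laid out as klen u32 + offset u64 + size u64 + key, as the dump shows. `encode_entry` and `build_single_block_sstable` are hypothetical helpers, not an existing API.

```python
import struct

MAGIC = b"SST1\x00\x00\x00\x00"


def encode_entry(key, value):
    """Encode one entry; value None marks a tombstone (type=1, vlen=0)."""
    vlen = 0 if value is None else len(value)
    etype = 1 if value is None else 0
    return struct.pack("<IIB", len(key), vlen, etype) + key + (value or b"")


def build_single_block_sstable(entries):
    """Serialize already-sorted entries into one data block plus index and footer."""
    block = struct.pack("<I", len(entries))
    block += b"".join(encode_entry(k, v) for k, v in entries)
    first_key = entries[0][0]                    # the index entry copies this key
    index = struct.pack("<I", 1)                 # one index entry
    index += struct.pack("<IQQ", len(first_key), 0, len(block)) + first_key
    footer = struct.pack("<QQQ8s", len(block), len(index), 1, MAGIC)
    return block + index + footer


blob = build_single_block_sstable([(b"a", b"1"), (b"bb", b"22"), (b"ccc", None)])
index_offset, index_size, num_blocks, magic = struct.unpack("<QQQ8s", blob[-32:])
print(hex(index_offset), hex(index_size), num_blocks)   # 0x28 0x19 1
```

Printing `blob.hex(" ")` and diffing it against the dump gives a byte-level check of every offset in the walkthrough.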
What broken looks like
| Symptom | Likely cause |
|---|---|
| BadMagic at open | file truncated, or footer overwritten by an interrupted writer. |
| BadBlock reading a block | block size in the index disagrees with the in-file count header, e.g. wrong endianness. |
| Two languages produce different file sizes for identical input | block-flush rule mismatch (> vs >=). |
| Unsorted from the writer | caller didn't iterate the MemTable in sorted order before add. |
| IndexOutOfRange at read | corrupted offset/size in the index; each pair is checked against file_len - 32 so the read fails loudly. |
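Two of these checks are easy to sketch. The code below is a minimal illustration, not the actual reader: the exception classes reuse the error names from the table, while `read_footer` and `check_range` are hypothetical helpers, and treating a file shorter than the footer as BadMagic is an assumption.

```python
import struct

FOOTER_SIZE = 32
MAGIC = b"SST1\x00\x00\x00\x00"


class BadMagic(Exception):
    pass


class IndexOutOfRange(Exception):
    pass


def check_range(offset, size, file_len):
    """Reject any (offset, size) pair that reaches into or past the footer."""
    limit = file_len - FOOTER_SIZE
    if offset > limit or size > limit - offset:
        raise IndexOutOfRange(f"offset={offset} size={size} limit={limit}")


def read_footer(blob):
    """Parse and validate the footer; returns (index_offset, index_size, num_blocks)."""
    if len(blob) < FOOTER_SIZE:
        raise BadMagic("file shorter than footer")    # assumption: reported as BadMagic
    index_offset, index_size, num_blocks, magic = struct.unpack(
        "<QQQ8s", blob[-FOOTER_SIZE:]
    )
    if magic != MAGIC:
        raise BadMagic(f"unexpected magic {magic!r}")
    check_range(index_offset, index_size, len(blob))
    return index_offset, index_size, num_blocks


empty = struct.pack("<I", 0) + struct.pack("<QQQ8s", 0, 4, 0, MAGIC)
print(read_footer(empty))          # (0, 4, 0)
```

The same `check_range` call can be applied to every block offset/size pair read from the index, so a corrupted entry is caught before any block is decoded.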