It is a format to store what is conceptually a Sorted String Table (SSTable). There is no reference definition of a Sorted String Table, and implementations vary depending on the context. It is often said to have been introduced in a paper from Google. Conceptually, it is a Key/Value map sorted by Key.
Format
The custom format starts with a header:
- Format version
- Number of entries in the index
followed by an index, which is a list of fixed-size entries sorted by SHA256:
- SHA256, offset, size
The content of the objects follows the index.
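The layout above can be sketched with Python's struct module. The field widths and byte order are assumptions, since the description does not specify them: this sketch uses big-endian 8-byte unsigned integers for the version, entry count, offset and size, and a raw 32-byte SHA256 digest. The names HEADER, INDEX_ENTRY, pack_header, pack_index_entry and content_start are hypothetical and only serve the illustration.

```python
import struct

# Assumed encoding (not specified by the format description):
# big-endian, no padding, 8-byte unsigned integers.
HEADER = struct.Struct(">QQ")          # format version, number of index entries
INDEX_ENTRY = struct.Struct(">32sQQ")  # SHA256 digest, object offset, object size


def pack_header(version: int, entry_count: int) -> bytes:
    """Serialize the header: format version, then number of index entries."""
    return HEADER.pack(version, entry_count)


def pack_index_entry(digest: bytes, offset: int, size: int) -> bytes:
    """Serialize one fixed-size index entry."""
    if len(digest) != 32:
        raise ValueError("a SHA256 digest is exactly 32 bytes")
    return INDEX_ENTRY.pack(digest, offset, size)


def content_start(entry_count: int) -> int:
    """Byte position where object content begins, right after the index."""
    return HEADER.size + entry_count * INDEX_ENTRY.size
```

With these assumed widths the header takes 16 bytes and each index entry 48 bytes, so the content of a file with two entries would start at byte 112. Because every entry has the same size, entry number i can be located directly at HEADER.size + i * INDEX_ENTRY.size without scanning.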
Writing
Writing is assumed to be done in batch, sequentially.
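A minimal sketch of such a batch write follows, under several assumptions that the description does not state: all objects are known up front as a mapping from SHA256 digest to content, integers are big-endian 8-byte unsigned values, offsets are counted from the start of the file, and content is written in the same order as the index. The function name write_shard and its parameters are hypothetical.

```python
import struct
from typing import BinaryIO, Dict

# Assumed encoding: big-endian, 8-byte unsigned integers, 32-byte digest.
HEADER = struct.Struct(">QQ")          # format version, number of index entries
INDEX_ENTRY = struct.Struct(">32sQQ")  # SHA256 digest, object offset, object size


def write_shard(out: BinaryIO, objects: Dict[bytes, bytes], version: int = 1) -> None:
    """Write header, sorted index and object content in one sequential pass."""
    keys = sorted(objects)  # the index must be sorted by SHA256
    for key in keys:
        if len(key) != 32:
            raise ValueError("keys must be 32-byte SHA256 digests")

    out.write(HEADER.pack(version, len(keys)))

    # The first object starts right after the header and the full index.
    offset = HEADER.size + len(keys) * INDEX_ENTRY.size
    for key in keys:
        size = len(objects[key])
        out.write(INDEX_ENTRY.pack(key, offset, size))
        offset += size

    # Content follows the index, in the same order as the index entries.
    for key in keys:
        out.write(objects[key])
```

Since the sizes of all objects are known before anything is written, every offset can be computed ahead of time and the output never needs to be rewound, which is what makes a purely sequential write possible.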
Reading
- Binary search for the SHA256 in the index
- Seek to the object content and stream it to the caller in chunks of a given size
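The two reading steps can be sketched as follows. This is an illustration under assumptions, not the actual implementation: integers are taken to be big-endian 8-byte unsigned values, offsets are taken to be absolute positions in the file, and the names find_entry and stream_object are hypothetical.

```python
import struct
from typing import BinaryIO, Iterator, Optional, Tuple

# Assumed encoding: big-endian, 8-byte unsigned integers, 32-byte digest.
HEADER = struct.Struct(">QQ")          # format version, number of index entries
INDEX_ENTRY = struct.Struct(">32sQQ")  # SHA256 digest, object offset, object size


def find_entry(shard: BinaryIO, digest: bytes) -> Optional[Tuple[int, int]]:
    """Binary search the on-disk index; return (offset, size) or None."""
    shard.seek(0)
    _version, count = HEADER.unpack(shard.read(HEADER.size))
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        # Fixed-size entries let us seek straight to entry number `mid`.
        shard.seek(HEADER.size + mid * INDEX_ENTRY.size)
        key, offset, size = INDEX_ENTRY.unpack(shard.read(INDEX_ENTRY.size))
        if key == digest:
            return offset, size
        if key < digest:
            lo = mid + 1
        else:
            hi = mid
    return None


def stream_object(shard: BinaryIO, digest: bytes, chunk_size: int = 65536) -> Iterator[bytes]:
    """Yield the object content in chunks of at most `chunk_size` bytes.

    As a generator, it raises KeyError on first iteration if the digest is absent.
    """
    entry = find_entry(shard, digest)
    if entry is None:
        raise KeyError(digest.hex())
    offset, size = entry
    shard.seek(offset)
    remaining = size
    while remaining > 0:
        chunk = shard.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("file ends before the object does")
        remaining -= len(chunk)
        yield chunk
```

The lookup reads at most about log2(n) index entries for n objects, and the content is never loaded into memory as a whole. The generator assumes nothing else moves the file position between chunks.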