
Using a custom Hash Table format
Started, Work in Progress, Normal, Public

Description

A possible optimization would be to generate a perfect hash function (https://en.wikipedia.org/wiki/Perfect_hash_function) for the index, so that lookups never collide.
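To illustrate the idea only, here is a minimal sketch that brute-forces a seed under which a fixed set of keys maps to distinct slots. The names `slot` and `find_seed`, the use of BLAKE2b as the seeded hash and the `max_seed` limit are all assumptions made for this example and are not part of the task. A seed search like this is only practical for very small key sets; real perfect hash constructions scale far better.

```python
import hashlib

def slot(key: bytes, seed: int, n: int) -> int:
    """Map `key` to a slot in [0, n) using a seeded hash (BLAKE2b, assumed)."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(key, salt=salt, digest_size=8).digest()
    return int.from_bytes(digest, "little") % n

def find_seed(keys, max_seed=100_000):
    """Return a seed for which every key lands in its own slot (brute force)."""
    n = len(keys)
    for seed in range(max_seed):
        if len({slot(key, seed, n) for key in keys}) == n:
            return seed
    raise ValueError("no collision-free seed found below max_seed")

# Four SHA256 digests stand in for the object identifiers.
keys = [hashlib.sha256(data).digest() for data in (b"a", b"b", b"c", b"d")]
seed = find_seed(keys)
slots = sorted(slot(key, seed, len(keys)) for key in keys)
```

With such a seed stored alongside the index, each key would resolve to exactly one slot and the table would need no empty slots or probing.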

Format

The custom format starts with a header:

  • Format version

followed by an index, which is a hash table whose entries have the form:

  • HASH(SHA256), offset, size

After the index comes the content of the objects.
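The layout can be sketched with fixed-size records. The field widths, the little-endian byte order and the slot count stored in the header are assumptions made for this example; the task itself only fixes the order header, index, content and the three entry fields.

```python
import hashlib
import struct

# Hypothetical layout; widths and byte order are not specified in the task.
HEADER = struct.Struct("<BQ")    # format version, number of index slots (assumed)
ENTRY = struct.Struct("<32sQQ")  # SHA256 digest, content offset, content size

def content_start(slots: int) -> int:
    """Byte position where object content begins, right after the index."""
    return HEADER.size + slots * ENTRY.size

data = b"hello"
key = hashlib.sha256(data).digest()
header = HEADER.pack(1, 4)
entry = ENTRY.pack(key, content_start(4), len(data))
```

Fixed-size entries make the position of slot `i` computable as `HEADER.size + i * ENTRY.size`, so a reader can probe the index without loading it entirely.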

Writing

Writing is assumed to be done in batches, sequentially.
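A minimal sketch of such a batch write follows. It assumes the whole batch fits in memory, a header holding the version and the slot count, an open-addressing table with linear probing, and an all-zero digest marking an unused slot. None of these choices come from the task, and `write_batch` is a hypothetical name.

```python
import hashlib
import io
import struct

HEADER = struct.Struct("<BQ")    # format version, number of index slots (assumed)
ENTRY = struct.Struct("<32sQQ")  # SHA256 digest, content offset, content size
EMPTY = bytes(32)                # all-zero digest marks an unused slot (assumed)

def write_batch(out, objects, version=1):
    """Write header, index and content to `out` in one sequential pass."""
    # Deduplicate by digest; dicts keep insertion order.
    unique = {hashlib.sha256(data).digest(): data for data in objects}
    slots = max(1, 2 * len(unique))  # keep the table at most half full
    table = [(EMPTY, 0, 0)] * slots
    offset = HEADER.size + slots * ENTRY.size  # first byte after the index
    for key, data in unique.items():
        i = int.from_bytes(key[:8], "little") % slots
        while table[i][0] != EMPTY:  # linear probing on collision
            i = (i + 1) % slots
        table[i] = (key, offset, len(data))
        offset += len(data)
    out.write(HEADER.pack(version, slots))
    for entry in table:
        out.write(ENTRY.pack(*entry))
    for data in unique.values():
        out.write(data)
    return slots

buf = io.BytesIO()
slots = write_batch(buf, [b"a", b"bb", b"a"])
raw = buf.getvalue()
```

Because every offset is known before the first byte is written, the output needs no seeking and can go to an append-only stream.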

Reading

  • Look up HASH(SHA256) in the index to get the offset and size
  • Seek to the object content to stream it to the caller in chunks of a given size
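The two steps above can be sketched as follows. The record layout, the slot count in the header, linear probing and the all-zero empty marker are assumptions made for this example, and `read_object` is a hypothetical name.

```python
import hashlib
import io
import struct

HEADER = struct.Struct("<BQ")    # format version, number of index slots (assumed)
ENTRY = struct.Struct("<32sQQ")  # SHA256 digest, content offset, content size
EMPTY = bytes(32)                # all-zero digest marks an unused slot (assumed)

def read_object(f, key, chunk_size=4096):
    """Yield the content stored under `key` in chunks of at most `chunk_size` bytes."""
    f.seek(0)
    _version, slots = HEADER.unpack(f.read(HEADER.size))
    i = int.from_bytes(key[:8], "little") % slots
    for _ in range(slots):
        f.seek(HEADER.size + i * ENTRY.size)
        found, offset, size = ENTRY.unpack(f.read(ENTRY.size))
        if found == key:
            break
        if found == EMPTY:
            raise KeyError(key.hex())
        i = (i + 1) % slots  # linear probing
    else:
        raise KeyError(key.hex())
    f.seek(offset)
    remaining = size
    while remaining > 0:
        chunk = f.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("file ends before the object does")
        remaining -= len(chunk)
        yield chunk

# Build a one-slot file by hand to exercise the reader.
data = b"hello world"
key = hashlib.sha256(data).digest()
start = HEADER.size + ENTRY.size
blob = HEADER.pack(1, 1) + ENTRY.pack(key, start, len(data)) + data
chunks = list(read_object(io.BytesIO(blob), key, chunk_size=4))
```

Since `read_object` is a generator, a missing key is reported only when the caller starts iterating, and the caller never holds more than one chunk at a time.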

Event Timeline

dachary changed the task status from Open to Work in Progress. Mar 8 2021, 10:08 PM
dachary triaged this task as Normal priority.
dachary created this task.
dachary created this object in space S1 Public.