compress Merkle DAG and origin nodes together
Closed, Migrated

Description

We've created a compressed representation of the archive's Merkle DAG, from snapshot nodes down to contents.
Strictly speaking, that representation is exhaustive, but for some use cases (e.g., listing all the distinct origins in which a given node has been found) it would be useful to also have nodes representing origins.
Now that we know compressing the full Merkle DAG is doable, we should try the extended version that also includes origin nodes. It shouldn't be much harder, given that the added nodes and edges are not that numerous.
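
To illustrate the use case mentioned above (listing the origins in which a given node has been found), here is a minimal sketch on a toy in-memory graph. It is not the compressed representation itself: the node names, the `EDGES` dictionary, and the `transpose` and `origins_of` helpers are all hypothetical and only show why origin nodes make the query a plain backward traversal.

```python
from collections import deque

# Toy graph (hypothetical data): edges point from origins down to contents.
# Prefixes mimic node types: ori(gin), snp (snapshot), rev(ision),
# dir(ectory), cnt (content).
EDGES = {
    "ori:A": ["snp:1"],
    "ori:B": ["snp:2"],
    "snp:1": ["rev:1"],
    "snp:2": ["rev:1", "rev:2"],
    "rev:1": ["dir:1"],
    "rev:2": ["dir:2"],
    "dir:1": ["cnt:1"],
    "dir:2": ["cnt:2"],
}


def transpose(edges):
    """Return the backward graph: for each node, the nodes pointing to it."""
    backward = {}
    for src, dsts in edges.items():
        for dst in dsts:
            backward.setdefault(dst, []).append(src)
    return backward


def origins_of(node, backward):
    """Breadth-first search on the backward graph, collecting origin nodes."""
    seen = {node}
    queue = deque([node])
    found = set()
    while queue:
        current = queue.popleft()
        if current.startswith("ori:"):
            found.add(current)
        for parent in backward.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(found)


backward = transpose(EDGES)
print(origins_of("cnt:1", backward))  # ['ori:A', 'ori:B']
```

Without origin nodes in the graph, the same traversal would stop at snapshots, and mapping them back to origins would need a separate lookup outside the graph.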

Event Timeline

zack triaged this task as Normal priority. Jun 30 2019, 1:56 PM
zack created this task.
zack renamed this task from compress the archive graph, including origin nodes to compress Merkle DAG and origin nodes together. Jul 7 2019, 1:51 PM
zack raised the priority of this task from Normal to High.
zack changed the task status from Open to Work in Progress. Jul 9 2019, 2:52 PM

This was started on sexus yesterday. ETA: next Monday-ish.

Due to several rounds of server maintenance, the process was restarted a few times, but it has now finished and the results are uploaded to the annex: https://annex.softwareheritage.org/public/dataset/graph/latest/compressed/all+ori/

haltode claimed this task.