
Read compression input from ORC instead of the edges file
Closed, Migrated

Description

The edges dataset can't contain arbitrary node properties. Instead of compressing the graph dataset from the edges files, we could compress it from the ORC files, which would allow us to read node properties along the way and write them to the appropriate on-disk representation.
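
As a rough illustration of the idea, the sketch below makes a single pass over table rows and produces both the edges and a per-node property map. It is not the actual swh-graph implementation: the function name `scan_revisions`, the column names (`id`, `directory`, `date`) and the in-memory output are assumptions made for the example, and the rows are plain dicts standing in for whatever an ORC reader would yield.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# A row as an ORC reader might yield it; the column names are hypothetical.
Row = Dict[str, object]
Edge = Tuple[str, str]


def scan_revisions(
    rows: Iterable[Row],
) -> Tuple[List[Edge], Dict[str, Dict[str, Optional[object]]]]:
    """Collect edges and node properties in one pass over revision rows."""
    edges: List[Edge] = []
    properties: Dict[str, Dict[str, Optional[object]]] = {}
    for row in rows:
        node = f"swh:1:rev:{row['id']}"
        # Edge from the revision to its root directory.
        edges.append((node, f"swh:1:dir:{row['directory']}"))
        # Node property that an edges-only file has no place for.
        properties[node] = {"date": row.get("date")}
    return edges, properties


# Hypothetical sample rows (shortened, made-up identifiers).
sample_rows = [
    {"id": "aa11", "directory": "bb22", "date": 1638788700},
    {"id": "cc33", "directory": "bb22", "date": None},
]
edges, properties = scan_revisions(sample_rows)
```

In a real pipeline the edges and properties would be streamed to their on-disk representations instead of being held in memory; the point here is only that both come out of the same scan.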

Event Timeline

seirl triaged this task as High priority. Dec 6 2021, 11:05 AM
seirl created this task.
zack changed the task status from Open to Work in Progress. Jan 4 2022, 1:35 PM
zack moved this task from Backlog to In progress on the Compressed graph service board.