Improve archiver behavior on big objects
Closed, Migrated

Description

The archiver reads object data more often than it needs to, which puts a lot of strain on the objstorage backends.

  • it calls objstorage.check(obj_id) before starting the copy
  • it then reads the whole object at once again, using objstorage.get(obj_id)
  • it then pushes the object for each new copy it's creating

It should probably be changed to:

  • do the integrity checking client-side
  • use the streaming methods to shuffle objects around
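
A rough sketch of what the two points above could look like together: read the object once as a stream, hash it client-side while forwarding each chunk to every destination, and only commit the copies if the digest matches. Everything here is illustrative. `MemoryStore` and its `get_stream` / `write_chunk` / `commit` / `abort` methods are hypothetical stand-ins, not the real objstorage API, and the sketch assumes the object id is the hex SHA-1 of the content.

```python
import hashlib
from typing import Dict, Iterator, List

CHUNK_SIZE = 4  # deliberately tiny so the demo actually streams several chunks


class MemoryStore:
    """Hypothetical in-memory stand-in for an objstorage backend."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self.pending: Dict[str, bytearray] = {}
        self.reads = 0  # number of times an object was streamed out

    def get_stream(self, obj_id: str, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
        """Yield the object's content in fixed-size chunks."""
        self.reads += 1
        data = self.objects[obj_id]
        for start in range(0, len(data), chunk_size):
            yield data[start:start + chunk_size]

    def write_chunk(self, obj_id: str, chunk: bytes) -> None:
        """Append a chunk to a not-yet-committed copy."""
        self.pending.setdefault(obj_id, bytearray()).extend(chunk)

    def commit(self, obj_id: str) -> None:
        """Make the pending copy visible."""
        self.objects[obj_id] = bytes(self.pending.pop(obj_id, bytearray()))

    def abort(self, obj_id: str) -> None:
        """Discard the pending copy."""
        self.pending.pop(obj_id, None)


def archive_object(source: MemoryStore, destinations: List[MemoryStore], obj_id: str) -> None:
    """Copy obj_id to all destinations with a single streamed read of the source."""
    hasher = hashlib.sha1()
    for chunk in source.get_stream(obj_id):
        hasher.update(chunk)  # client-side integrity check, computed incrementally
        for dest in destinations:
            dest.write_chunk(obj_id, chunk)

    # Assumption: obj_id is the hex SHA-1 of the content.
    if hasher.hexdigest() != obj_id:
        for dest in destinations:
            dest.abort(obj_id)
        raise ValueError(f"content of {obj_id} does not match its id")

    for dest in destinations:
        dest.commit(obj_id)


if __name__ == "__main__":
    content = b"some big object, streamed in small chunks"
    obj_id = hashlib.sha1(content).hexdigest()

    src = MemoryStore()
    src.objects[obj_id] = content
    copies = [MemoryStore(), MemoryStore()]

    archive_object(src, copies, obj_id)
    print(src.reads, all(c.objects[obj_id] == content for c in copies))
```

With this shape the source is read exactly once no matter how many copies are created, and a corrupted source object is detected before any destination commits it. Whether a real backend can hold an uncommitted copy like this is an open question the sketch glosses over.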

Event Timeline

olasd lowered the priority of this task from Normal to Low. Sep 29 2017, 3:17 PM
zack claimed this task.
zack added a subscriber: zack.

the archiver is gone, closing