(Re-)Compute data checksums before insertion
Open, Normal, Public

Description

Up until now, for git repositories, we have trusted the object identifiers provided by the external "source". For other kinds of repositories, the checksums are computed "locally" before the data is sent.

Before importing git repositories, we ran git-fsck on the repositories to make sure the checksums were valid.

As we're moving towards a more distributed development model, and to reduce the need for T75 in the future, we should make swh.storage recompute identifiers itself and reject the insertion of any data whose recomputed identifier doesn't match the one supplied.
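
To illustrate the idea, here is a minimal sketch of recomputing an identifier and rejecting a mismatch. It is not the swh.storage implementation: `git_object_id`, `checked_insert`, `ChecksumMismatch` and the dict-backed `store` are hypothetical names made up for this example, and it only covers git-style blob identifiers (SHA-1 over a `blob <length>\0` header followed by the content).

```python
import hashlib


class ChecksumMismatch(ValueError):
    """Hypothetical error: supplied identifier differs from the recomputed one."""


def git_object_id(object_type: str, data: bytes) -> str:
    # Git hashes "<type> <length>\0" followed by the raw object content.
    header = f"{object_type} {len(data)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def checked_insert(store: dict, claimed_id: str, data: bytes) -> None:
    # Recompute the identifier locally instead of trusting the sender.
    actual_id = git_object_id("blob", data)
    if actual_id != claimed_id:
        raise ChecksumMismatch(
            f"claimed {claimed_id}, recomputed {actual_id}"
        )
    # Assumption: a plain dict stands in for the real storage backend.
    store[actual_id] = data


store = {}
checked_insert(store, git_object_id("blob", b"hello world\n"), b"hello world\n")

try:
    checked_insert(store, "0" * 40, b"hello world\n")
except ChecksumMismatch as exc:
    print("rejected:", exc)
```

The check sits in the insertion path itself, so it applies regardless of which client sends the data; a real version would need to cover every object type and hash that the storage keeps.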

Event Timeline

olasd created this task. May 11 2016, 7:16 PM
olasd changed the visibility from "All Users" to "Public (No Login Required)". May 13 2016, 5:09 PM
zack renamed this task from "Compute checksums of data before insertion" to "(Re-)Compute data checksums before insertion". Feb 10 2017, 8:43 AM