
(Re-)Compute data checksums before insertion
Closed, Migrated

Description

Until now, for git repositories, we have trusted the object identifiers provided by the external "source". For other kinds of repositories, the checksum is computed "locally" before the data is sent.

Before importing git repositories, we ran git-fsck on the repositories to make sure the checksums were valid.

As we're moving towards a more distributed development model, and to reduce the need for T75 in the future, we should make swh.storage recompute identifiers and reject the insertion of any data whose recomputed identifier doesn't match the one supplied.
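
As a rough illustration of the check proposed here, the sketch below recomputes a git-style object identifier (SHA-1 over a `<type> <length>\0` header followed by the payload) and refuses data whose claimed identifier differs. The names `git_object_id`, `check_before_insert` and `ChecksumMismatch` are hypothetical and are not part of the swh.storage API; the actual implementation may cover more object types and hash algorithms.

```python
import hashlib


class ChecksumMismatch(ValueError):
    """Hypothetical error: the claimed identifier differs from the recomputed one."""


def git_object_id(object_type: str, payload: bytes) -> str:
    """Compute the git identifier of an object from its type and raw payload."""
    # git hashes "<type> <payload length>\0" followed by the payload itself.
    header = f"{object_type} {len(payload)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


def check_before_insert(object_type: str, payload: bytes, claimed_id: str) -> str:
    """Recompute the identifier and reject the data if it does not match."""
    computed = git_object_id(object_type, payload)
    if computed != claimed_id.lower():
        raise ChecksumMismatch(
            f"claimed {claimed_id}, recomputed {computed}"
        )
    # Only at this point would the (assumed) storage layer accept the object.
    return computed
```

With this in place, a caller sending the empty blob with its well-known identifier `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391` would pass the check, while any other claimed identifier for the same payload would raise `ChecksumMismatch`.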