
Prepare support of new hashing algorithms for browsing objects
Open, Normal, Public

Description

Currently, the following object types can be browsed from their sha1_git checksums only:

  • directory
  • revision
  • release

In the future, object identifiers will be computed using other hashing algorithms (sha256 for instance),
so we must prepare the webapp to handle those when browsing archived objects.

We could use the same solution as for browsing content objects. Indeed, each content object has four
checksums computed for it: sha1, sha1_git, sha256 and blake2s256.
The webapp then allows browsing a content object by any of its computed checksums using the following
URL: /browse/content/(algo):(hash)/
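
To make the scheme above concrete, here is a minimal sketch that computes the four checksums for a blob of bytes and builds the matching browse URLs. The function names are hypothetical and this is not the webapp's actual code. It assumes that sha1_git is the Git blob hash (sha1 over a "blob <length>\0" header followed by the data) and that blake2s256 is BLAKE2s with a 32-byte digest.

```python
import hashlib
from typing import Dict, List


def content_checksums(data: bytes) -> Dict[str, str]:
    """Return the four checksums named in the task description (hypothetical helper)."""
    # Assumption: sha1_git is the Git blob hash, i.e. sha1("blob <len>\0" + data).
    git_header = b"blob %d\0" % len(data)
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha1_git": hashlib.sha1(git_header + data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        # Assumption: blake2s256 means BLAKE2s with a 256-bit (32-byte) digest.
        "blake2s256": hashlib.blake2s(data, digest_size=32).hexdigest(),
    }


def content_browse_urls(data: bytes) -> List[str]:
    """Build one /browse/content/(algo):(hash)/ URL per computed checksum."""
    return [
        f"/browse/content/{algo}:{digest}/"
        for algo, digest in content_checksums(data).items()
    ]


if __name__ == "__main__":
    for url in content_browse_urls(b"hello\n"):
        print(url)
```

Extending this to directories, revisions and releases would mean one URL per object per supported algorithm, which is the proposal discussed in this task.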

Event Timeline

anlambert triaged this task as Normal priority. Jun 5 2020, 2:33 PM
anlambert created this task.
zack added a subscriber: zack. Jun 5 2020, 3:00 PM

The webapp then enables to browse a content from each computed checksum using the following URL: /browse/content/(algo):(hash)/

Instead of doing that, I think we should enforce that all objects are addressable via SWHID, and only via SWHID (other than for backward compatibility, if applicable).

The actual hashing algorithm is an implementation detail that ideally shouldn't be leaked in browsing URLs.
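
As a rough illustration of the SWHID-only approach, the sketch below parses a core SWHID of the form swh:1:<type>:<40 hex digits> into its object type and identifier. The helper and class names are hypothetical and are not an existing library API. It assumes scheme version 1 and ignores qualifiers. Note that the identifier carries a scheme version rather than an algorithm name, so the hashing algorithm does not appear in the URL.

```python
import re
from typing import NamedTuple

# Assumption: core SWHID, scheme version 1, with the five object type tags
# cnt (content), dir (directory), rev (revision), rel (release), snp (snapshot).
_CORE_SWHID_RE = re.compile(r"swh:1:(cnt|dir|rev|rel|snp):([0-9a-f]{40})")


class CoreSWHID(NamedTuple):
    object_type: str  # one of: cnt, dir, rev, rel, snp
    object_id: str    # 40 lowercase hex digits


def parse_core_swhid(swhid: str) -> CoreSWHID:
    """Split a core SWHID into (object_type, object_id); hypothetical helper."""
    match = _CORE_SWHID_RE.fullmatch(swhid)
    if match is None:
        raise ValueError(f"not a valid core SWHID: {swhid!r}")
    return CoreSWHID(object_type=match.group(1), object_id=match.group(2))


if __name__ == "__main__":
    # Placeholder identifier built from zeros, not a real archived object.
    print(parse_core_swhid("swh:1:dir:" + "0" * 40))
```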

zack added a comment. Jun 5 2020, 3:01 PM

FWIW we have discussed already a related aspect in T1805 ("Use SWH PIDs whenever possible"). There it was only for the Web API, but it seems wise to do the same for /browse/ URLs too.

In T2435#45105, @zack wrote:

FWIW we have discussed already a related aspect in T1805 ("Use SWH PIDs whenever possible"). There it was only for the Web API, but it seems wise to do the same for /browse/ URLs too.

Ack, that seems like a great solution indeed. FYI, you can find the context of this task's creation here.