
Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects
Open, High, Public

Description

It turns out that swhid+authority_id+discovery_date+fetcher_id alone is not a good uniqueness key for RawExtrinsicMetadata objects: we already need to include other fields in it (context keys, see T2668), and we might need to include more in the future.

Additionally, both the spec and the RPC interface currently leave it intentionally unspecified what happens if we write two different objects with the same key.
This is fine, but less than ideal.

Hashing the entire object solves both of these issues.

The only drawback is that the uniqueness key is no longer human-readable: finding out which SWHID an object is about requires an API request. But we already do that for most objects, and I don't think it matters much anyway.
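
To illustrate the proposal, here is a minimal Python sketch of deriving an identifier from the whole object rather than from a fixed tuple of fields. Everything in it is an assumption made for illustration: the class name `RawExtrinsicMetadataSketch`, its field names, the JSON-based canonical serialization, and the choice of SHA-256 are hypothetical and are not the actual data model or hashing scheme.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass(frozen=True)
class RawExtrinsicMetadataSketch:
    """Hypothetical stand-in for a RawExtrinsicMetadata object."""

    target: str  # SWHID the metadata is about
    authority: str
    discovery_date: str  # ISO 8601 string, to keep the sketch simple
    fetcher: str
    format: str
    metadata: bytes
    context: Dict[str, str] = field(default_factory=dict)  # context keys

    def intrinsic_id(self) -> str:
        """Hash a canonical serialization of *all* fields (assumed scheme)."""
        payload = asdict(self)
        # bytes are not JSON-serializable, so encode them as hex
        payload["metadata"] = self.metadata.hex()
        # sort_keys and fixed separators make the serialization deterministic
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


base = dict(
    target="swh:1:dir:" + "0" * 40,
    authority="https://example.org/authority",
    discovery_date="2020-10-14T14:01:00+00:00",
    fetcher="example-fetcher 1.0",
    format="json",
)
a = RawExtrinsicMetadataSketch(metadata=b'{"name": "foo"}', **base)
b = RawExtrinsicMetadataSketch(metadata=b'{"name": "bar"}', **base)
c = RawExtrinsicMetadataSketch(
    metadata=b'{"name": "foo"}', context={"path": "/src"}, **base
)
```

In this sketch, `a` and `b` share the same swhid+authority+discovery_date+fetcher tuple but get different identifiers, so writing both is no longer a key collision with unspecified behavior. Likewise, `c` differs from `a` only by a context key and still gets its own identifier, without the key definition having to change when fields are added.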

Event Timeline

vlorentz renamed this task from Use intrinsic identifiers for RawExtrinsicMetadata objects to Use intrinsic identifiers/hashes for RawExtrinsicMetadata objects. Wed, Oct 14, 2:01 PM
vlorentz triaged this task as High priority.
vlorentz created this task.
vlorentz edited projects, added Data Model; removed Package Loader.