Oct 13 2020
So they're metadata specific to files that we don't archive at all because they're not source? That doesn't sound very useful to keep at all. We don't keep the binary indexes from Debian repositories, for instance.
They are metadata on the file itself (file name, checksums, has signature, upload time, file-specific comment (often empty), yank status), so they have nothing in common
In practice, are there many meaningful differences between the wheel metadata and the sdist metadata? If not, then I think option 3 would be the most sensible.
Oct 12 2020
FTR, olasd, douardda and I discussed an inconsistency in the keys used in Kafka, and decided to use hashes for all origins/visits/visit statuses. Doing the same for extrinsic metadata in both Kafka and the DB solves the issue of defining uniqueness.
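To illustrate the idea of using a hash as the key, here is a minimal sketch. It assumes the key is a SHA-1 over a canonical JSON encoding of the object; the function name `message_key`, the field names, and the choice of JSON are all hypothetical and are not the serialization actually used in the archive.

```python
import hashlib
import json


def message_key(obj: dict) -> bytes:
    """Return a SHA-1 digest of a canonical JSON encoding of ``obj``.

    Assumption: sorted keys and compact separators make the encoding
    deterministic, so equal objects always map to the same key.
    """
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).digest()


# Hypothetical visit status; the field names are illustrative only.
visit_status = {
    "origin": "https://example.org/repo.git",
    "visit": 1,
    "date": "2020-10-12T00:00:00+00:00",
    "status": "full",
}
same_fields_reordered = dict(reversed(list(visit_status.items())))

key = message_key(visit_status)
# Field order does not matter, so duplicates collapse onto one key.
assert key == message_key(same_fields_reordered)
# Any change in content yields a different key.
assert key != message_key({**visit_status, "status": "partial"})
```

With a key derived from the content, uniqueness no longer needs a hand-picked tuple of columns: two messages or rows are the same object exactly when their keys match.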
@rdicosmo a full example of what?
The suggestion was to have extrinsic metadata on directories that come from a deposit of a bundle (e.g. a .tar.gz or .zip file coming from HAL), instead of on a synthetic revision as is currently the case, so they can be accessed knowing the hash of the directory (which is an intrinsic id).
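A rough sketch of what "accessed knowing the hash of the directory" could look like, assuming metadata is indexed by a directory SWHID of the form `swh:1:dir:<40 hex digits>`. The class `DirectoryMetadataIndex`, its methods, and the placeholder identifier are hypothetical and only illustrate the lookup, not the real storage API.

```python
import re
from collections import defaultdict

# Core directory SWHID: scheme, version 1, object type "dir", SHA-1 in hex.
SWHID_DIR_RE = re.compile(r"swh:1:dir:[0-9a-f]{40}")


class DirectoryMetadataIndex:
    """Hypothetical in-memory index of extrinsic metadata per directory."""

    def __init__(self) -> None:
        self._by_swhid = defaultdict(list)

    def add(self, swhid: str, metadata: bytes) -> None:
        # Reject anything that is not a directory SWHID.
        if not SWHID_DIR_RE.fullmatch(swhid):
            raise ValueError(f"not a directory SWHID: {swhid!r}")
        self._by_swhid[swhid].append(metadata)

    def get(self, swhid: str) -> list:
        # Return a copy so callers cannot mutate the index.
        return list(self._by_swhid.get(swhid, []))


index = DirectoryMetadataIndex()
deposited_dir = "swh:1:dir:" + "0123456789abcdef0123456789abcdef01234567"  # placeholder id
index.add(deposited_dir, b"<entry>metadata sent with the deposit</entry>")

assert index.get(deposited_dir) == [b"<entry>metadata sent with the deposit</entry>"]
assert index.get("swh:1:dir:" + "f" * 40) == []
```

Because the directory hash is intrinsic, anyone who unpacks the same bundle can recompute the identifier and look the metadata up, without knowing which synthetic revision was created for the deposit.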
Oct 8 2020
Alternatively, we could keep writing the metadata on revisions/releases, and use the provenance service (when it's ready) to find them from a directory SWHID. What do you think?