Currently, Nix and Guix are each loaded as a single origin with a huge snapshot in which each branch name is a URL. This is wrong.
We need to replace the Nixguix loader with a lister, which creates as many origins as are referenced by the Nix and Guix public manifests.
This would be closer to what we do with Debian/Ubuntu.
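
To illustrate the idea, here is a rough sketch (not the D8341 implementation) of a lister turning one manifest into many origins. The manifest layout (`sources` entries with a `urls` list), the `ListedOrigin` record, the `"content"`/`"directory"` labels, and the suffix-based detection are all assumptions made for the example:

```python
import json
from typing import Dict, Iterator, List, NamedTuple


class ListedOrigin(NamedTuple):
    """Hypothetical record for one origin emitted by the lister."""

    url: str
    visit_type: str  # assumed labels: "content" or "directory"


# Assumed tarball suffixes; anything else is treated as a single file.
TARBALL_SUFFIXES = (".tar.gz", ".tgz", ".tar.xz", ".tar.bz2", ".zip")


def list_origins(manifest: Dict) -> Iterator[ListedOrigin]:
    """Yield one origin per distinct source entry of a manifest (assumed layout)."""
    seen = set()
    for source in manifest.get("sources", []):
        urls: List[str] = source.get("urls", [])
        if not urls:
            continue  # nothing to reference for this entry
        url = urls[0]  # first URL taken as the origin URL (an assumption)
        if url in seen:
            continue  # the same artifact may be referenced several times
        seen.add(url)
        visit_type = "directory" if url.endswith(TARBALL_SUFFIXES) else "content"
        yield ListedOrigin(url=url, visit_type=visit_type)


# Made-up manifest, only for demonstration purposes.
manifest = json.loads("""
{"sources": [
  {"type": "url", "urls": ["https://example.org/foo-1.0.tar.gz"]},
  {"type": "url", "urls": ["https://example.org/patches/fix.patch"]},
  {"type": "url", "urls": ["https://example.org/foo-1.0.tar.gz"]}
]}
""")
origins = list(list_origins(manifest))
```

With this shape, each referenced artifact becomes its own origin instead of a branch in one huge snapshot, and the visit type hints at which loader would handle it.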
Define the following (see the HedgeDoc pad [1], which details a proposal):
- [x] target structure sketch of the data in the archive
- [x] What are the origin URLs?
- [x] What kind of extrinsic metadata and/or extids are we storing?
- [x] What kinds of snapshots are we generating?
Plan:
- [ ] D8341: Implement lister
~~- [ ] D8406, ...: Adapt archive loader to accept tarball from nixguix manifests~~ (cannot work)
- [ ] D8581: Implement ContentLoader (~~possibly as a package loader~~ [2]) to deal with content files with intrinsic metadata (coming from nixguix manifests)
- [ ] Implement DirectoryLoader (~~possibly as a package loader~~ [2]) to deal with tarballs with intrinsic metadata (coming from nixguix manifests)
- [ ] Run through docker
- [ ] Deploy in staging
- [ ] Call for public review
- [ ] Deploy in production when ok ^
[1] Draft pad: https://hedgedoc.softwareheritage.org/2AQFbVB0S-OrOtkJV2yNJw
[2] It cannot. We may not receive any version information, and package loaders currently rely on that data for their main ingestion algorithm.