While working on the maven lister, it was brought to our attention that some listed urls
are not the canonical urls of their origins. This results in duplicated origins for no
good reason.
So let's reuse the existing url canonicalization code (for github origins) in listers
wherever possible. That code should already exist in swh-web; it should be refactored out
into swh.core, then reused both in swh-web and in the listers (starting with the maven
one; the nixguix and packagist listers can be done later as well).
Plan:
- Compute canonical github urls in an exposed library function in swh.core
- Refactor GitHubSession request management out of swh.lister into swh.core
- Use GitHubSession for the canonical url computation to avoid being rate limited (and if
it is rate limited anyway, handle it like the rest of the code base does with GitHubSession)
- Refactor swh.lister to reuse the code moved into swh.core
- Refactor swh.web to execute the github origin canonicalization server-side (reusing swh.core code)
- Refactor maven lister to reuse swh.core code for canonical gh urls
Optional plan:
- Adapt nixguix lister
- Adapt packagist lister
- Adapt remaining listers if any
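
To make the first plan item more concrete, here is a minimal sketch of what such a library function could look like. It is only an illustration under stated assumptions: the function name, the regular expression and the `fetch_repo_metadata` callable are all hypothetical and not existing swh.core code. The callable stands in for whatever GitHubSession ends up exposing, and is expected to return the JSON body of GitHub's `GET /repos/{owner}/{repo}` endpoint, whose `html_url` field carries the canonical repository url.

```python
import re
from typing import Any, Callable, Dict, Optional

# Hypothetical pattern: matches repository root urls on github.com, with an
# optional scheme, "www." prefix, ".git" suffix and trailing slash.
_GITHUB_REPO_URL = re.compile(
    r"^(?:https?://|git://|git\+https://)?(?:www\.)?github\.com/"
    r"(?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$",
    re.IGNORECASE,
)

# Hypothetical type for the lookup callable: (owner, repo) -> repository
# metadata as a dict, or None when the repository cannot be found.
RepoFetcher = Callable[[str, str], Optional[Dict[str, Any]]]


def get_canonical_github_origin_url(
    url: str, fetch_repo_metadata: RepoFetcher
) -> Optional[str]:
    """Return the canonical url for a github repository url.

    Urls that do not look like a github repository root are returned
    unchanged. None is returned when the repository lookup fails.
    """
    match = _GITHUB_REPO_URL.match(url.strip())
    if match is None:
        # Not a github repository url: nothing to canonicalize.
        return url
    metadata = fetch_repo_metadata(match["owner"], match["repo"])
    if metadata is None:
        return None
    # "html_url" is the canonical url reported by the GitHub API.
    return metadata.get("html_url")
```

Passing the lookup in as a callable keeps the url parsing testable without network access, and leaves rate-limit handling entirely to the session object once it has moved to swh.core.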