2 sides of that coin:
#### Lister
algo:
- drop the R cran script
- parse the listing page instead (as in simple_lister, check lister cgit's way of doing it) [1]
- for each package found there, send the origin url [2] to the loader (as `recurring` task)
schema adaptations:
- make the tasks output by the lister `recurring` (currently `oneshot`)
- adapt the uid field to be the origin_url's value
migration plan:
- truncate cran_repo table
- trigger a full listing again
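
A rough sketch of the lister algo above, for discussion only. It assumes package links in the listing page [1] end in `/packages/<name>/index.html`, which needs checking against the real page. `ListingParser`, `list_origins` and the task dict layout are hypothetical names, not the actual lister or scheduler API.

```python
import re
from html.parser import HTMLParser

# Assumption: package links look like ".../packages/<name>/index.html".
PACKAGE_HREF = re.compile(r"/packages/([^/]+)/index\.html$")
ORIGIN_URL_TEMPLATE = "https://cran.r-project.org/package={name}"


class ListingParser(HTMLParser):
    """Collect package names from the anchors of the listing page."""

    def __init__(self):
        super().__init__()
        self.package_names = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = PACKAGE_HREF.search(href)
        if match:
            self.package_names.append(match.group(1))


def list_origins(listing_html):
    """Return one hypothetical `recurring` task per package found."""
    parser = ListingParser()
    parser.feed(listing_html)
    parser.close()

    seen = set()
    tasks = []
    for name in parser.package_names:
        if name in seen:  # a package may appear more than once
            continue
        seen.add(name)
        origin_url = ORIGIN_URL_TEMPLATE.format(name=name)
        # uid is the origin url, as proposed in the schema adaptations
        tasks.append(
            {"uid": origin_url, "origin_url": origin_url, "policy": "recurring"}
        )
    return tasks
```

Fetching the page is left out on purpose, so the parsing part can be tested on a saved copy of the listing.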
#### Loader
algo:
- Improve the loader so it scrapes that origin url [2] page.
- It then determines by itself which artifact urls it needs to ingest.
- In the [2] page, there is an archive link `Old source` which lists the previous artifact versions.
The good news is that this can be done independently and in any order (this task can then be split into 2 subtasks).
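
A possible shape for the scraping step of the loader, again only a sketch. It assumes the [2] page links the current source tarball under `/src/contrib/` and the old sources under `/src/contrib/Archive/`; both path patterns are assumptions to verify. `PackagePageParser` and `extract_artifact_links` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PackagePageParser(HTMLParser):
    """Sort the links of a package page into current and archive ones."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.current = []
        self.archive = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page url.
        url = urljoin(self.base_url, href)
        if "/src/contrib/Archive/" in url:
            # Assumption: this is the archive link for older versions.
            self.archive.append(url)
        elif "/src/contrib/" in url and url.endswith(".tar.gz"):
            # Assumption: this is the current source tarball.
            self.current.append(url)


def extract_artifact_links(page_html, base_url):
    """Return the current artifact urls and the archive link(s) of a page."""
    parser = PackagePageParser(base_url)
    parser.feed(page_html)
    parser.close()
    return {"current": parser.current, "archive": parser.archive}
```

Keeping the `current` and `archive` results separate matches the split into 2 subtasks: ingesting the current artifact does not depend on following the archive link.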
[1] https://cran.r-project.org/web/packages/available_packages_by_date.html
This can be subject to discussion with the cran community to ask for a better api endpoint (if it's not too much hassle for them to adapt and provide ;)
[2] https://cran.r-project.org/package=<package-name>