
semi-automated addition of new "forges"
Open, Normal, Public


Use case: extend archive coverage to a specific GitLab instance (specified by URL) as seamlessly as possible.

(The obvious generalization is replacing GitLab with any kind of supported listable source code origin out there, e.g., another Debian-like distro, another PyPI instance, etc.)

We can currently, with a single command, (1) add an entry to the list of "forges" being listed.
What is still missing to deliver the "as seamlessly as possible" part above is:

(2) immediately do the full listing, with high scheduling priority
(3) once (2) is done, immediately load all listed origins, with high scheduling priority
(4) bonus point: notify the user once (3) is done

As another bonus point, it would be great if all of the above could be done with a single CLI command.
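
To make the intended flow concrete, here is a minimal single-process sketch of steps (1) to (4) chained behind one entry point. Everything in it is hypothetical: `ToyScheduler`, `save_forge_now`, the task type names, and the `list_origins` / `load_origin` / `notify` callbacks are illustrative stand-ins and not the actual scheduler, lister, or loader APIs.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple

HIGH, NORMAL = 0, 1  # lower value is served first


@dataclass
class ToyScheduler:
    """Hypothetical in-memory stand-in for a scheduler with priorities."""

    _queue: List[Tuple[int, int, str, str]] = field(default_factory=list)
    _counter: itertools.count = field(default_factory=itertools.count)

    def schedule(self, task_type: str, arg: str, priority: int = NORMAL) -> None:
        # The counter keeps FIFO order among tasks of equal priority.
        heapq.heappush(self._queue, (priority, next(self._counter), task_type, arg))

    def pop(self) -> Tuple[str, str]:
        _, _, task_type, arg = heapq.heappop(self._queue)
        return task_type, arg


def save_forge_now(
    url: str,
    scheduler: ToyScheduler,
    list_origins: Callable[[str], Iterable[str]],
    load_origin: Callable[[str], None],
    notify: Callable[[str], None],
) -> List[str]:
    # (1) + (2): register the forge and list it ahead of routine work.
    scheduler.schedule("list-forge", url, priority=HIGH)
    pending = {url}  # tasks of this request that are not done yet
    loaded: List[str] = []
    while pending:
        task_type, arg = scheduler.pop()
        if task_type == "list-forge":
            # (3): every listed origin is loaded with high priority too.
            origins = list(list_origins(arg))
            for origin in origins:
                scheduler.schedule("load-origin", origin, priority=HIGH)
            pending.update(origins)
        else:
            load_origin(arg)
            loaded.append(arg)
        pending.discard(arg)
    # (4): tell the requester once all listed origins have been loaded.
    notify(f"{url}: {len(loaded)} origins loaded")
    return loaded
```

In this sketch, routine tasks already queued at normal priority stay in the queue until the new forge has been fully listed and loaded. A real implementation would run the tasks asynchronously on workers; the single CLI command would then only need to trigger the equivalent of `save_forge_now` and return.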

Once we have this, it will be the obvious building block of a "save forge now" user-visible functionality in the Web UI (which will be tracked in a separate task).

Event Timeline

zack triaged this task as Normal priority. Feb 21 2019, 8:16 PM
zack created this task.
zack added a project: Scheduling utilities.

As a test case, we've been asked to archive this small (for now) GitLab instance:
We can easily find tons of other small public instances for further testing.

ardumont added a subscriber: ardumont. Edited Jul 2 2019, 7:59 AM

D1504 is related: it proposes a way to remove lister setup steps (db model adaptation, adding a new lister task type in the scheduler db). This works towards the "as seamlessly as possible" part.