The internship topic on this is now available here: https://wiki.softwareheritage.org/wiki/Ingest_all_Debian_derivatives_(internship)
Oct 29 2018
Oct 22 2018
In T1262#23695, @zack wrote: That's a very good idea, which I'll be happy to draft as a proper internship proposal. Before doing so, however, can you confirm that, scheduling-wise, tracking something like ~100 additional derivatives wouldn't be a problem for us in terms of load?
Oct 18 2018
Ok, so I reworked the group_by_exception snippet to have a more sensible output:
Oct 17 2018
In any case, for now, as I said in [2], we will first reschedule
the 1409 origins in error.
Oct 16 2018
In T1262#23681, @olasd wrote: Automating the addition of distributions from the Debian derivatives census to Software Heritage would probably be a good topic for an internship, e.g. a Google Summer of Code/Outreachy project.
Here is the PyPI report about the loading errors.
Debian derivatives (that is, distributions that are forks of Debian, not Debian itself) are not being archived.
Oct 12 2018
My point was just that you didn't list here the entries that you think have to be updated, so it wasn't actionable.
It would be great if you could update the task description with all the entries that you think deserve an update (even if you have doubts about them).
I suggested this task instead of editing because I wasn't sure about item no. 3 (Debian).
I also didn't know whether entries should be dropped, or whether we want to keep all items in the list and check them off as we get to them.
Oct 11 2018
Can you clarify the scope of this task?
Oct 5 2018
The Kibana dashboard will help in that matter (P311 because it's noisy).
Sep 21 2018
Sep 20 2018
Now, it's scheduled. Just need to wait for the swh-scheduler-runner.service to finish its loop on task_types.
swhscheduler@saatchi:~$ python3 -m swh.scheduler.cli task list-pending -t swh-lister-pypi
Found 1 tasks
Schedule the lister-pypi:
Sep 19 2018
Sep 6 2018
Sep 4 2018
Aug 24 2018
A priori, at the current speed, there remain ~7.5 days until the end of the gitlab origins ingestion.
Aug 3 2018
The first pass was completed a while back.
Aug 1 2018
Jul 26 2018
Jul 25 2018
Jul 20 2018
Jul 19 2018
Jul 18 2018
Jul 17 2018
Jul 5 2018
Some forges @olasd mentioned to me that qualify as GitLab instances (in parentheses, their current size in number of repositories):
- https://0xacab.org/api/v4/projects/ (600)
- https://framagit.org/api/v4/projects/ (8619)
- https://salsa.debian.org/api/v4/projects/ (25155)
- https://gitlab.com/api/v4/projects/ (567086)
- https://gitlab.freedesktop.org/api/v4/projects/ (254)
- https://gitlab.gnome.org/api/v4/projects/ (3247)
- https://gitlab.inria.fr/api/v4/projects/ (837)
- ...
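
For reference, here is a minimal sketch of how the projects of one such instance could be enumerated through the `/api/v4/projects/` endpoints listed above. It is not the actual swh lister code. It relies on GitLab's offset pagination (the `page` and `per_page` query parameters and the `X-Next-Page` response header, which is empty on the last page). The `fetch` callable, `page_url` and `list_projects` are hypothetical names introduced for illustration; `fetch` is assumed to return the decoded JSON list together with the response headers.

```python
from typing import Callable, Dict, Iterator, List, Tuple
from urllib.parse import urlencode

# Hypothetical fetch result: (decoded JSON list of projects, response headers).
Response = Tuple[List[dict], Dict[str, str]]


def page_url(base_url: str, page: int, per_page: int = 100) -> str:
    """Build the URL of one page of a GitLab /api/v4/projects/ listing."""
    return f"{base_url}?{urlencode({'page': page, 'per_page': per_page})}"


def list_projects(
    base_url: str,
    fetch: Callable[[str], Response],
    per_page: int = 100,
) -> Iterator[dict]:
    """Yield every project of an instance, following X-Next-Page.

    `fetch` is an assumption of this sketch: any function that performs
    the HTTP GET and returns (projects, headers). Header lookup here is
    case-sensitive, so `fetch` should normalise the header names.
    """
    page = 1
    while True:
        projects, headers = fetch(page_url(base_url, page, per_page))
        yield from projects
        next_page = headers.get("X-Next-Page", "")
        if not next_page:
            # An empty or missing X-Next-Page marks the last page.
            break
        page = int(next_page)
```

Passing the HTTP call in as `fetch` keeps the pagination logic testable without network access; a real implementation would also have to deal with rate limiting and errors, which this sketch ignores.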