
Archive coverage
Active · Public

Details

Description

Tasks related to extending the coverage of the Software Heritage archive.

Recent Activity

Sat, Dec 8

ardumont added a comment to T1351: (periodically) ingest GNU package releases.

This should probably be split in 2 tasks:

Sat, Dec 8, 3:11 PM · Archive coverage

Mon, Dec 3

anlambert closed T1398: npm incremental lister, a subtask of T1378: Ingest npm into the Software Heritage archive (meta task), as Resolved.
Mon, Dec 3, 6:02 PM · Origin-npm, Archive coverage

Tue, Nov 27

anlambert triaged T1389: Implement a base loader for package managers as Wishlist priority.
Tue, Nov 27, 12:23 PM · Origin-npm, Origin-Pypi, Archive coverage

Mon, Nov 26

anlambert closed T1380: npm lister, a subtask of T1378: Ingest npm into the Software Heritage archive (meta task), as Resolved.
Mon, Nov 26, 11:05 AM · Origin-npm, Archive coverage

Thu, Nov 22

ardumont updated the task description for T1139: ingest major gitlab instances.
Thu, Nov 22, 4:37 PM · Archive coverage, Origin-GitLab
anlambert triaged T1379: npm loader as Normal priority.
Thu, Nov 22, 3:51 PM · Origin-npm
anlambert triaged T1378: Ingest npm into the Software Heritage archive (meta task) as Normal priority.
Thu, Nov 22, 3:43 PM · Origin-npm, Archive coverage

Fri, Nov 16

zack triaged T1352: ingest Guix (SD) packages as Normal priority.
Fri, Nov 16, 12:09 PM · Archive coverage
zack renamed T1351: (periodically) ingest GNU package releases from periodically ingest GNU package releases to (periodically) ingest GNU package releases.
Fri, Nov 16, 12:08 PM · Archive coverage
zack added a project to T1351: (periodically) ingest GNU package releases: Archive coverage.
Fri, Nov 16, 12:08 PM · Archive coverage

Oct 29 2018

olasd added a comment to T1139: ingest major gitlab instances.

I had added framagit and 0xacab on Friday, forgot to update the task.

Oct 29 2018, 10:49 AM · Archive coverage, Origin-GitLab
olasd updated the task description for T1139: ingest major gitlab instances.
Oct 29 2018, 10:48 AM · Archive coverage, Origin-GitLab
douardda updated the task description for T1139: ingest major gitlab instances.
Oct 29 2018, 10:01 AM · Archive coverage, Origin-GitLab
douardda updated the task description for T1139: ingest major gitlab instances.
Oct 29 2018, 10:00 AM · Archive coverage, Origin-GitLab

Oct 22 2018

zack added a comment to T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested.

the internship topic on this is now available here: https://wiki.softwareheritage.org/wiki/Ingest_all_Debian_derivatives_(internship)

Oct 22 2018, 7:59 PM · Archive coverage
olasd added a comment to T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested.
In T1262#23695, @zack wrote:

That's a very good idea, which I'll be happy to draft as a proper internship proposal. Before doing so, however, can you confirm that, scheduling-wise, tracking something like ~100 additional derivatives wouldn't be a problem for us in terms of load?

Oct 22 2018, 6:45 PM · Archive coverage
ardumont updated the task description for T1246: pypi loader: Analyze existing errors.
Oct 22 2018, 10:24 AM · Archive coverage, Origin-Pypi

Oct 18 2018

ardumont added a comment to T1246: pypi loader: Analyze existing errors.

Ok, so reworked the group_by_exception snippet to have a more sensible output:

Oct 18 2018, 11:27 AM · Archive coverage, Origin-Pypi
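
For illustration only: the snippet from T1246 is not reproduced on this page, so the following is a minimal sketch of what grouping loader errors by exception might look like, not the actual code. The record layout (dicts with `origin` and `exception` keys), the `group_by_exception` signature, and the `summarize` helper are all assumptions.

```python
from collections import defaultdict


def group_by_exception(errors):
    """Group error records by exception name.

    `errors` is an iterable of dicts with the hypothetical keys
    'origin' and 'exception'. Returns a dict mapping each exception
    name to the sorted list of distinct origins that raised it.
    """
    groups = defaultdict(set)
    for record in errors:
        groups[record["exception"]].add(record["origin"])
    return {exc: sorted(origins) for exc, origins in groups.items()}


def summarize(groups):
    """Return (exception, origin count) pairs, most frequent first.

    Ties are broken alphabetically by exception name.
    """
    counts = ((exc, len(origins)) for exc, origins in groups.items())
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))
```

Sorting the summary by count puts the most widespread failure first, which is usually the one worth investigating before rescheduling origins.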

Oct 17 2018

ardumont added a comment to T1246: pypi loader: Analyze existing errors.

In any case, for now, like I said in [2], we will first reschedule
those 1409 origins in error.

Oct 17 2018, 4:22 PM · Archive coverage, Origin-Pypi

Oct 16 2018

zack added a comment to T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested.
In T1262#23681, @olasd wrote:

Automating the addition of distributions from the Debian derivatives census to Software Heritage would probably be a good topic for an internship, e.g. a Google Summer of Code/Outreachy project.

Oct 16 2018, 5:41 PM · Archive coverage
ardumont added a comment to T1246: pypi loader: Analyze existing errors.

Here is the pypi report about the loading errors.

Oct 16 2018, 2:03 PM · Archive coverage, Origin-Pypi
olasd added a comment to T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested.

Debian derivatives (that is, distributions that are forks of Debian, not Debian itself) are not being archived.

Oct 16 2018, 12:19 PM · Archive coverage
moranegg updated the task description for T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested.
Oct 16 2018, 11:55 AM · Archive coverage
moranegg renamed T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested from wiki: Update suggestion box to wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested.
Oct 16 2018, 11:54 AM · Archive coverage

Oct 12 2018

zack added a comment to T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested.

My point was just that you didn't list here the entries that you think have to be updated, so it wasn't actionable.
It would be great if you can update the task description with all the entries that you think deserve an update (even if you've doubts about them).

Oct 12 2018, 11:13 AM · Archive coverage
moranegg added a comment to T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested.

I suggested this task instead of editing because I wasn't sure about item no. 3 (Debian).
And I didn't know whether entries should be dropped, or whether we want to keep all items in the list and check each one off when we get to it.

Oct 12 2018, 11:09 AM · Archive coverage

Oct 11 2018

zack added a comment to T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested.

Can you clarify the scope of this task?

Oct 11 2018, 8:22 PM · Archive coverage
moranegg triaged T1262: wiki: Update suggestion box if `all Debian derivatives` can be noted as ingested as Low priority.
Oct 11 2018, 2:41 PM · Archive coverage

Oct 5 2018

ardumont renamed T1246: pypi loader: Analyze existing errors from Analyze pypi errors to pypi loader: Analyze existing errors.
Oct 5 2018, 6:31 PM · Archive coverage, Origin-Pypi
ardumont added a comment to T1246: pypi loader: Analyze existing errors.

A Kibana dashboard will help in that matter (P311 because it's noisy).

Oct 5 2018, 6:30 PM · Archive coverage, Origin-Pypi
ardumont triaged T1246: pypi loader: Analyze existing errors as Normal priority.
Oct 5 2018, 6:28 PM · Archive coverage, Origin-Pypi
zack added a project to T1139: ingest major gitlab instances: Archive coverage.
Oct 5 2018, 4:34 PM · Archive coverage, Origin-GitLab

Sep 21 2018

ardumont closed T421: PyPI loader, a subtask of T419: ingest PyPI into the Software Heritage archive (meta task), as Resolved.
Sep 21 2018, 6:35 PM · Archive coverage, Origin-Pypi

Sep 20 2018

ardumont closed T1181: pypi: Schedule ingestion, a subtask of T419: ingest PyPI into the Software Heritage archive (meta task), as Resolved.
Sep 20 2018, 11:17 AM · Archive coverage, Origin-Pypi
ardumont closed T1181: pypi: Schedule ingestion as Resolved.
Sep 20 2018, 11:17 AM · Archive coverage, Origin-Pypi
ardumont added a comment to T1181: pypi: Schedule ingestion.

Now, it's scheduled. Just need to wait for the swh-scheduler-runner.service to finish its loop on task_types.

Sep 20 2018, 9:52 AM · Archive coverage, Origin-Pypi
ardumont added a comment to T1181: pypi: Schedule ingestion.
swhscheduler@saatchi:~$ python3 -m swh.scheduler.cli task list-pending -t swh-lister-pypi
Found 1 tasks
Sep 20 2018, 9:48 AM · Archive coverage, Origin-Pypi
ardumont updated the task description for T1181: pypi: Schedule ingestion.
Sep 20 2018, 9:47 AM · Archive coverage, Origin-Pypi
ardumont added a comment to T1181: pypi: Schedule ingestion.

Schedule the lister-pypi:

Sep 20 2018, 9:47 AM · Archive coverage, Origin-Pypi

Sep 19 2018

ardumont changed the status of T1181: pypi: Schedule ingestion, a subtask of T419: ingest PyPI into the Software Heritage archive (meta task), from Open to Work in Progress.
Sep 19 2018, 7:52 PM · Archive coverage, Origin-Pypi
ardumont changed the status of T1181: pypi: Schedule ingestion from Open to Work in Progress.
Sep 19 2018, 7:52 PM · Archive coverage, Origin-Pypi
ardumont closed T879: Reschedule googlecode svn origins from scratch, a subtask of T617: ingest Google Code Subversion repositories, as Resolved.
Sep 19 2018, 1:56 PM · Archive coverage, Origin-GoogleCode, SVN Loader

Sep 6 2018

ardumont updated the task description for T1181: pypi: Schedule ingestion.
Sep 6 2018, 5:38 PM · Archive coverage, Origin-Pypi
ardumont renamed T1181: pypi: Schedule ingestion from pypi: Trigger listing task to pypi: Schedule ingestion.
Sep 6 2018, 5:37 PM · Archive coverage, Origin-Pypi
ardumont triaged T1181: pypi: Schedule ingestion as Normal priority.
Sep 6 2018, 5:31 PM · Archive coverage, Origin-Pypi
ardumont closed T422: PyPI lister, a subtask of T419: ingest PyPI into the Software Heritage archive (meta task), as Resolved.
Sep 6 2018, 5:31 PM · Archive coverage, Origin-Pypi

Sep 4 2018

ardumont closed T1111: ingest GitLab.com (meta-task) as Resolved.
Sep 4 2018, 6:17 PM · Archive coverage, General, Origin-GitLab

Aug 24 2018

ardumont added a comment to T1111: ingest GitLab.com (meta-task).

A priori, at the current speed, there remain ~7.5 days until the end of the GitLab origins ingestion.

Aug 24 2018, 12:06 PM · Archive coverage, General, Origin-GitLab

Aug 3 2018

ardumont added a comment to T682: Inject Google Code Mercurial repositories.

The first pass was completed a while back.

Aug 3 2018, 3:05 PM · Archive coverage, Mercurial loader
ardumont added a subtask for T682: Inject Google Code Mercurial repositories: T1156: Fix release targets of already loaded mercurial type origins.
Aug 3 2018, 3:03 PM · Archive coverage, Mercurial loader