
Dec 10 2020

zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:07 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:04 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:03 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 11:01 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 10:58 AM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Dec 10 2020, 10:54 AM · Archive coverage, Documentation
zack closed T682: Ingest Google Code Mercurial repositories, a subtask of T367: ingest Google Code repositories, as Resolved.
Dec 10 2020, 10:52 AM · Archive coverage, Restricted Project
zack closed T682: Ingest Google Code Mercurial repositories as Resolved.
Dec 10 2020, 10:52 AM · Archive coverage, Mercurial loader

Dec 9 2020

ardumont updated the task description for T2793: add notable past events to the archive changelog.
Dec 9 2020, 4:26 PM · Archive coverage, Documentation
vsellier added a project to T2594: production: Running nixguix on guix sources: Archive coverage.
Dec 9 2020, 4:25 PM · Archive coverage, System administration
ardumont updated the task description for T2793: add notable past events to the archive changelog.
Dec 9 2020, 4:23 PM · Archive coverage, Documentation
vsellier added a project to T2608: Deploy launchpad and gitea listers on production: Archive coverage.
Dec 9 2020, 4:22 PM · Archive coverage, System administration
ardumont added a comment to T2793: add notable past events to the archive changelog.

Also [1] (period 20/01/2020 up to 01/04/2020) might come in handy...

Dec 9 2020, 4:20 PM · Archive coverage, Documentation
ardumont added a comment to T2793: add notable past events to the archive changelog.

What happened on 2020-03-01, explaining why the archive started growing much faster (related to GitHub listing/loading)

Dec 9 2020, 3:49 PM · Archive coverage, Documentation
ardumont updated the task description for T2793: add notable past events to the archive changelog.
Dec 9 2020, 3:17 PM · Archive coverage, Documentation
anlambert updated the task description for T2793: add notable past events to the archive changelog.
Dec 9 2020, 3:04 PM · Archive coverage, Documentation

Dec 1 2020

ardumont added a project to T2833: cpan.loader - archive Perl modules from CPAN: Archive coverage.
Dec 1 2020, 11:41 AM · CPAN lister, Archive coverage

Nov 25 2020

zack updated the task description for T2793: add notable past events to the archive changelog.
Nov 25 2020, 2:00 PM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Nov 25 2020, 1:59 PM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Nov 25 2020, 1:58 PM · Archive coverage, Documentation
zack updated the task description for T2793: add notable past events to the archive changelog.
Nov 25 2020, 1:56 PM · Archive coverage, Documentation
zack changed the status of T2793: add notable past events to the archive changelog from Open to Work in Progress.
Nov 25 2020, 1:55 PM · Archive coverage, Documentation
zack closed T617: ingest Google Code Subversion repositories as Resolved.
Nov 25 2020, 1:49 PM · Archive coverage, Origin-GoogleCode, SVN Loader
zack closed T617: ingest Google Code Subversion repositories, a subtask of T367: ingest Google Code repositories, as Resolved.
Nov 25 2020, 1:49 PM · Archive coverage, Restricted Project

Nov 18 2020

zack triaged T2793: add notable past events to the archive changelog as Normal priority.
Nov 18 2020, 10:20 AM · Archive coverage, Documentation

Nov 3 2020

ardumont moved T2024: Re-implement deposit loader with package loader mechanism from Backlog to Archived on the SWORD deposit board.
Nov 3 2020, 4:06 PM · SWORD deposit, Archive coverage

Oct 23 2020

douardda updated the task description for T2645: Add listing tasks for gitea instances.
Oct 23 2020, 10:20 AM · Origin-Gitea/Gogs, Archive coverage, Lister

Oct 21 2020

douardda updated the task description for T2645: Add listing tasks for gitea instances.
Oct 21 2020, 12:18 PM · Origin-Gitea/Gogs, Archive coverage, Lister

Sep 30 2020

douardda closed T2313: Archive git.fsfe.org (Gitea) as Resolved.

Listed (oneshot full + recurring incremental) and loaded (as far as I can tell).

Sep 30 2020, 9:37 AM · Archive coverage, Lister

Sep 29 2020

douardda added a comment to T2313: Archive git.fsfe.org (Gitea).

I've sent an email to the fsfe.

Sep 29 2020, 11:39 AM · Archive coverage, Lister
zack added a project to T2645: Add listing tasks for gitea instances: Archive coverage.
Sep 29 2020, 10:33 AM · Origin-Gitea/Gogs, Archive coverage, Lister

Sep 28 2020

ardumont added a comment to T2313: Archive git.fsfe.org (Gitea).

The lister is deployed, this forge is not listed though (codeberg.org is).

Sep 28 2020, 10:54 AM · Archive coverage, Lister
douardda added a comment to T2313: Archive git.fsfe.org (Gitea).

Can this be closed now? What's missing? Adding a listing task?

Sep 28 2020, 9:47 AM · Archive coverage, Lister

Sep 22 2020

ardumont removed a revision from T1352: ingest Guix (SD) packages: D4007: loader*: Migrate to swh.core.config.load_from_envvar.
Sep 22 2020, 2:55 PM · Archive coverage
ardumont added a revision to T1352: ingest Guix (SD) packages: D4007: loader*: Migrate to swh.core.config.load_from_envvar.
Sep 22 2020, 1:51 PM · Archive coverage

Sep 21 2020

ardumont closed T2608: Deploy launchpad and gitea listers on production, a subtask of T1734: Create a Lister for launchpad.net, as Resolved.
Sep 21 2020, 1:45 PM · Lister, Archive coverage
ardumont closed T2608: Deploy launchpad and gitea listers on production, a subtask of T2313: Archive git.fsfe.org (Gitea), as Resolved.
Sep 21 2020, 1:45 PM · Archive coverage, Lister
vsellier closed T2594: production: Running nixguix on guix sources, a subtask of T1352: ingest Guix (SD) packages, as Resolved.
Sep 21 2020, 12:09 PM · Archive coverage
vsellier closed T2594: production: Running nixguix on guix sources, a subtask of T2485: staging: Running nixguix on guix sources, as Resolved.
Sep 21 2020, 12:09 PM · Archive coverage

Sep 18 2020

moranegg moved T1681: Use project metadata as a "lister" from Backlog to Implementation on the Metadata workflow board.
Sep 18 2020, 2:19 PM · Archive coverage, Indexer, Metadata workflow

Sep 17 2020

vsellier closed T2577: Test gitea lister on staging environment, a subtask of T2313: Archive git.fsfe.org (Gitea), as Resolved.
Sep 17 2020, 11:41 AM · Archive coverage, Lister
vsellier added a subtask for T1734: Create a Lister for launchpad.net: T2608: Deploy launchpad and gitea listers on production.
Sep 17 2020, 10:40 AM · Lister, Archive coverage
vsellier added a subtask for T2313: Archive git.fsfe.org (Gitea): T2608: Deploy launchpad and gitea listers on production.
Sep 17 2020, 10:40 AM · Archive coverage, Lister

Sep 15 2020

vsellier added a subtask for T2485: staging: Running nixguix on guix sources: T2594: production: Running nixguix on guix sources.
Sep 15 2020, 11:25 AM · Archive coverage
vsellier added a subtask for T1352: ingest Guix (SD) packages: T2594: production: Running nixguix on guix sources.
Sep 15 2020, 11:22 AM · Archive coverage
ardumont closed T2485: staging: Running nixguix on guix sources, a subtask of T1352: ingest Guix (SD) packages, as Resolved.
Sep 15 2020, 11:19 AM · Archive coverage
ardumont closed T2485: staging: Running nixguix on guix sources as Resolved.
Sep 15 2020, 11:19 AM · Archive coverage

Sep 14 2020

vsellier added a comment to T2358: Deploy launchpad lister on staging.

An email was sent on the swh-devel mailing list to ask for reviews.
The deployment in production will be performed in the middle of week 38 if no problems are raised.

Sep 14 2020, 10:09 AM · System administration, Lister, Archive coverage

Sep 10 2020

vsellier reopened T2577: Test gitea lister on staging environment, a subtask of T2313: Archive git.fsfe.org (Gitea), as Work in Progress.
Sep 10 2020, 7:03 PM · Archive coverage, Lister
vsellier closed T2577: Test gitea lister on staging environment, a subtask of T2313: Archive git.fsfe.org (Gitea), as Resolved.
Sep 10 2020, 1:07 PM · Archive coverage, Lister

Sep 9 2020

vsellier added a comment to T2358: Deploy launchpad lister on staging.

The task ran in about 30 min (1887s):

Sep 08 13:45:34 worker1 python3[237586]: [2020-09-08 13:45:34,851: INFO/ForkPoolWorker-4] Task swh.lister.launchpad.tasks.FullLaunchpadLister[73e298be-aeda-4882-b52d-dfe5a2ec316c] succeeded in 1887.75128286588s: {'status': 'eventful'}
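
As a rough illustration of how the duration can be read off such a worker log line, here is a minimal Python sketch. The regular expression and the helper name `parse_task_duration` are assumptions made for this example; they are not part of swh.lister or of the worker tooling.

```python
import re

# Log line copied verbatim from the comment above.
LOG_LINE = (
    "Sep 08 13:45:34 worker1 python3[237586]: [2020-09-08 13:45:34,851: "
    "INFO/ForkPoolWorker-4] Task swh.lister.launchpad.tasks.FullLaunchpadLister"
    "[73e298be-aeda-4882-b52d-dfe5a2ec316c] succeeded in 1887.75128286588s: "
    "{'status': 'eventful'}"
)

# Hypothetical pattern: task name, task id in brackets, then the duration.
PATTERN = re.compile(
    r"Task (?P<task>[\w.]+)\[(?P<task_id>[0-9a-f-]+)\] "
    r"succeeded in (?P<seconds>[\d.]+)s"
)


def parse_task_duration(line):
    """Return (task name, duration in seconds), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match.group("task"), float(match.group("seconds"))


task, seconds = parse_task_duration(LOG_LINE)
# Convert the duration to minutes for a quick sanity check of the summary.
print(f"{task}: {seconds:.2f}s, i.e. {seconds / 60:.1f} min")
```
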
Sep 9 2020, 10:25 AM · System administration, Lister, Archive coverage

Sep 8 2020

vsellier changed the status of T2577: Test gitea lister on staging environment, a subtask of T2313: Archive git.fsfe.org (Gitea), from Open to Work in Progress.
Sep 8 2020, 4:35 PM · Archive coverage, Lister
vsellier added a subtask for T2313: Archive git.fsfe.org (Gitea): T2577: Test gitea lister on staging environment.
Sep 8 2020, 4:34 PM · Archive coverage, Lister
vsellier closed T2358: Deploy launchpad lister on staging, a subtask of T1734: Create a Lister for launchpad.net, as Resolved.
Sep 8 2020, 3:57 PM · Lister, Archive coverage
vsellier closed T2358: Deploy launchpad lister on staging as Resolved.

The launchpad lister (v0.1.2) is deployed and running on staging

Sep 8 2020, 3:57 PM · System administration, Lister, Archive coverage
vsellier added a revision to T2358: Deploy launchpad lister on staging: D3887: Launchpad: rename task name to match conventions.
Sep 8 2020, 2:25 PM · System administration, Lister, Archive coverage
ardumont added a revision to T2358: Deploy launchpad lister on staging: D3884: lister configuration: Add launchpad lister tasks.
Sep 8 2020, 10:10 AM · System administration, Lister, Archive coverage
ardumont updated the task description for T2358: Deploy launchpad lister on staging.
Sep 8 2020, 10:02 AM · System administration, Lister, Archive coverage
ardumont added a project to T2358: Deploy launchpad lister on staging: System administration.
Sep 8 2020, 9:55 AM · System administration, Lister, Archive coverage

Sep 4 2020

ardumont added a comment to T2358: Deploy launchpad lister on staging.

Thanks for the heads up.

Sep 4 2020, 5:08 PM · System administration, Lister, Archive coverage
douardda added a comment to T2358: Deploy launchpad lister on staging.

FTR, I've run the launchpad lister in a Docker container and it executed fine, with fine being "it created 19340 load-git tasks"

Sep 4 2020, 5:01 PM · System administration, Lister, Archive coverage

Aug 27 2020

douardda raised the priority of T2313: Archive git.fsfe.org (Gitea) from Wishlist to High.
Aug 27 2020, 4:29 PM · Archive coverage, Lister
douardda added a comment to T1924: Deploy packagist Lister.

I guess this also depends on a packagist loader, which we do not have at all for now...

Aug 27 2020, 11:16 AM · Lister, Archive coverage

Aug 26 2020

douardda added a comment to T2313: Archive git.fsfe.org (Gitea).

Also beware that the default pagination value in the gitea lister is 3 (https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gitea/lister.py$23) so it is very slow.

Aug 26 2020, 11:08 AM · Archive coverage, Lister
douardda closed T1734: Create a Lister for launchpad.net as Resolved.
Aug 26 2020, 11:01 AM · Lister, Archive coverage
douardda added a comment to T2313: Archive git.fsfe.org (Gitea).

Ok I was expecting something a bit smart in explore.sapk.fr, but not really:

Aug 26 2020, 10:48 AM · Archive coverage, Lister
douardda raised the priority of T2358: Deploy launchpad lister on staging from Normal to High.
Aug 26 2020, 10:29 AM · System administration, Lister, Archive coverage
douardda added a comment to T2313: Archive git.fsfe.org (Gitea).

Now that we have the gitea lister, we should (upgrade swh.lister on prod and) add a few listing tasks, like this fsfe instance, as well as other instances like https://codeberg.org.

Aug 26 2020, 10:23 AM · Archive coverage, Lister

Aug 24 2020

zack removed projects from T2523: Archive opensource.samsung.com: Data Model, Core Loader.
Aug 24 2020, 11:38 AM · Lister, Archive coverage

Aug 19 2020

vlorentz triaged T2523: Archive opensource.samsung.com as Normal priority.
Aug 19 2020, 7:40 PM · Lister, Archive coverage

Aug 8 2020

ardumont added a comment to T2485: staging: Running nixguix on guix sources.

fwiw, the nix sources benefit from this as well

Aug 8 2020, 1:24 PM · Archive coverage
ardumont added a comment to T2485: staging: Running nixguix on guix sources.

For one, the extensions to skip were not finely analyzed (off the top of my head, we could add ".el" for example).

Aug 8 2020, 1:23 PM · Archive coverage
ardumont added a comment to T2485: staging: Running nixguix on guix sources.

Seems to have reduced the cost (from ~4500s to ~1500s) but there might still be margin for improvements [1]
For one, the extensions to skip were not finely analyzed (off the top of my head, we could add ".el" extensions to filter out for example).

Aug 8 2020, 1:22 PM · Archive coverage

Aug 7 2020

ardumont added a comment to T2485: staging: Running nixguix on guix sources.

loader-core 0.9.0, which includes the T2510 improvement, got deployed on staging to see if that improves time/performance.
(both run for guix and nix sources)

Aug 7 2020, 11:54 PM · Archive coverage

Jul 9 2020

ardumont added a comment to T2485: staging: Running nixguix on guix sources.

Note: status uneventful with a different snapshot is kinda unexpected for me. Not something drastically problematic though. I'll dig in at some point.

Jul 9 2020, 2:27 PM · Archive coverage
zimoun added a comment to T2485: staging: Running nixguix on guix sources.

Note: status uneventful with a different snapshot is kinda unexpected for me. Not something drastically problematic though. I'll dig in at some point.

@ardumont: did you load the same sources.json? Because http://guix.gnu.org/sources.json is refreshed every X hours and some stats of the commits after 2018-12-05 (v0.16.0) say the mean is 21 and the median is 13, both per day. And since loading requires ~1h15min, you need some luck to read the same json file twice.

Jul 9 2020, 11:10 AM · Archive coverage
ardumont added a comment to T2485: staging: Running nixguix on guix sources.

@ardumont, https://archive.softwareheritage.org/api/1/snapshot/869153d018394df0b75789134d87992eb2353bd4/ says this particular snapshot could not be found. Am I missing something?

Jul 9 2020, 10:19 AM · Archive coverage
ardumont added a comment to T2485: staging: Running nixguix on guix sources.

Second run btw (forgot to hit enter a while back):

Jul 07 12:10:49 worker2 python3[475116]: [2020-07-07 12:10:49,714: INFO/ForkPoolWorker-1] Task swh.loader.package.nixguix.tasks.LoadNixguix[082dd536-6294-421a-881e-e0bf28e94e0b] succeeded in 4497.450984489056s: {'status': 'uneventful', 'snapshot_id': 'ae96e93d0e24fb4ec484d56109c669da0b267908'}
Jul 9 2020, 10:18 AM · Archive coverage
civodul added a comment to T2485: staging: Running nixguix on guix sources.

This is great news, thank you! :-)

Jul 9 2020, 9:52 AM · Archive coverage

Jul 7 2020

ardumont added a comment to T2485: staging: Running nixguix on guix sources.

Run completed.

Jul 7 2020, 9:49 AM · Archive coverage

Jul 6 2020

ardumont added a comment to T2485: staging: Running nixguix on guix sources.

Patched the nixguix loader worker on staging with the diffs above and triggered another run.
It seems to no longer complain.

Jul 6 2020, 5:45 PM · Archive coverage
ardumont added a revision to T2485: staging: Running nixguix on guix sources: D3437: nixguix/loader: Check further the source entry only if it's valid.
Jul 6 2020, 5:25 PM · Archive coverage
ardumont added a comment to T2485: staging: Running nixguix on guix sources.

Next issue [2]

Jul 6 2020, 5:09 PM · Archive coverage
ardumont added a revision to T2485: staging: Running nixguix on guix sources: D3436: nixguix/loader: Allow version both as string or integer.
Jul 6 2020, 4:51 PM · Archive coverage
ardumont added a comment to T2485: staging: Running nixguix on guix sources.

First issue, missing a top-level "sources" entry [1]

Jul 6 2020, 4:44 PM · Archive coverage
ardumont triaged T2485: staging: Running nixguix on guix sources as Normal priority.
Jul 6 2020, 4:06 PM · Archive coverage

Jun 17 2020

civodul added a comment to T1352: ingest Guix (SD) packages.
In T1352#45587, @lewo wrote:

are you suggesting that sources.json itself be an "origin"?

The sources.json URL is an "origin". Each snapshot associated to this origin has several branches. Each branch corresponds to a source of the sources.json file.
There is also a special branch named evaluation which points to the commit specified by the attribute revision of your sources.json file: this is to link a snapshot to a nixpkgs/guix commit.
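
The mapping described above (one branch per source, plus an evaluation branch pointing at the revision) can be sketched roughly as follows. This is only an illustration, not the nixguix loader's code: the sample values are made up, the function name `snapshot_branches` is hypothetical, and using the first URL as the branch name is an assumption. Only the keys `sources`, `version`, `revision`, `type` and `urls` come from the discussion in this thread.

```python
import json

# Hypothetical sources.json content; every value here is invented for the example.
SAMPLE = json.loads("""
{
  "version": 1,
  "revision": "0000000000000000000000000000000000000000",
  "sources": [
    {"type": "url", "urls": ["https://example.org/foo-1.0.tar.gz"]},
    {"type": "url", "urls": ["https://example.org/bar-2.3.tar.gz"]}
  ]
}
""")


def snapshot_branches(sources_json):
    """Map branch names to targets: one branch per source, plus "evaluation"."""
    branches = {}
    for source in sources_json["sources"]:
        urls = source.get("urls") or []
        if not urls:
            continue  # nothing to name the branch after
        # Assumption: the first URL of the source serves as the branch name.
        branches[urls[0]] = {"target_type": "source", "urls": urls}
    # Special branch linking the snapshot to a nixpkgs/guix commit.
    branches["evaluation"] = {
        "target_type": "commit",
        "target": sources_json["revision"],
    }
    return branches


for name, target in snapshot_branches(SAMPLE).items():
    print(name, "->", target["target_type"])
```
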

Jun 17 2020, 7:02 PM · Archive coverage
zack added a comment to T1352: ingest Guix (SD) packages.

@lewo it's used in our DB but also exposed in the swh-web UI in search results (and in the future it is also going to be a field for user searches, so that you can search, e.g., "emacs" only in the list of packages archived from a given origin type).

Jun 17 2020, 3:52 PM · Archive coverage
lewo added a comment to T1352: ingest Guix (SD) packages.

@zack

We need a name for this origin type, one of the hardest problems in CS :-)

Where is it used? Is it a new attribute?
We actually had to choose a name for the visit type, and with a lot of inspiration, we chose nixguix :-/

Jun 17 2020, 3:15 PM · Archive coverage
lewo added a comment to T1352: ingest Guix (SD) packages.

@zimoun

Do you mean filter the unsupported urls for the field "urls" in the "type": "url"?
Or do you mean only export "type": "url" and remove all the other types from 'sources.json', for instance "git"?

Jun 17 2020, 3:13 PM · Archive coverage
civodul added a comment to T1352: ingest Guix (SD) packages.
In T1352#45536, @zack wrote:
In T1352#45459, @lewo wrote:

So, we can now consider the sources.json file format as stable and you could make the required changes on your sources.json file. A new SWH origin should then be added.

We need a name for this origin type, one of the hardest problems in CS :-)

Can you suggest something that makes sense for Nix, Guix, and other players in the field? As an outsider I'm a bit at a loss to propose something…

Jun 17 2020, 2:41 PM · Archive coverage
zimoun added a comment to T1352: ingest Guix (SD) packages.

Thank you for the notification. I have tried to answer by email but I could have failed. Anyway.

Jun 17 2020, 2:00 AM · Archive coverage

Jun 16 2020

anadon added a comment to T1352: ingest Guix (SD) packages.

Repology.org went with "Gnu Guix".

Jun 16 2020, 8:26 PM · Archive coverage
zack added a comment to T1352: ingest Guix (SD) packages.
In T1352#45459, @lewo wrote:

So, we can now consider the sources.json file format as stable and you could make the required changes on your sources.json file. A new SWH origin should then be added.

Jun 16 2020, 6:34 PM · Archive coverage

Jun 15 2020

ardumont added a comment to T1352: ingest Guix (SD) packages.

What do you think @ardumont ?

Jun 15 2020, 7:20 PM · Archive coverage
lewo updated subscribers of T1352: ingest Guix (SD) packages.

The nixguix loader has been working well for 2 weeks on the nixpkgs sources.json file!
So, we can now consider the sources.json file format as stable and you could make the required changes on your sources.json file. A new SWH origin should then be added.

Jun 15 2020, 5:53 PM · Archive coverage

Jun 9 2020

olasd added a comment to T2345: Improve handling of recurrent loading tasks in scheduler.

This task describes in detail what kind of scheduling policy we should implement, but it doesn't help much figure out what the next steps should be.

Jun 9 2020, 3:04 PM · Sprint 2021 01, Archive coverage, Scheduling utilities

May 27 2020

ardumont added a comment to T2313: Archive git.fsfe.org (Gitea).

I've had multiple looks at the proposed gitea lister.
This looks fine to me, I've accepted it but not completely.
If some other team member could do a second pass, that'd be neat.

May 27 2020, 6:06 PM · Archive coverage, Lister

May 26 2020

ardumont added a comment to T1352: ingest Guix (SD) packages.

As a quick follow-up, here is the current structure of the sources.json that the
nixguix loader is able to ingest. It's not that much different from what @lewo
initially proposed in the lister diff.

May 26 2020, 3:37 PM · Archive coverage

May 19 2020

zack renamed T682: Ingest Google Code Mercurial repositories from Inject Google Code Mercurial repositories to Ingest Google Code Mercurial repositories.
May 19 2020, 9:56 AM · Archive coverage, Mercurial loader