Software Heritage
Feed All Stories

Dec 7 2022

vlorentz committed rDDATASETeceaf73f0fba: luigi.CreateAthena: Fix validation of DB name (authored by vlorentz).
luigi.CreateAthena: Fix validation of DB name
Dec 7 2022, 10:03 AM
vlorentz committed rDDATASETc717f60fe08e: luigi.RunExportAll: Default to exporting all formats (authored by vlorentz).
luigi.RunExportAll: Default to exporting all formats
Dec 7 2022, 10:03 AM
vlorentz closed D8924: exporters/orc: Fix crash on visit status with no type.
Dec 7 2022, 10:03 AM
vlorentz committed rDDATASET22f7ed11f688: exporters/orc: Fix crash on visit status with no type (authored by vlorentz).
exporters/orc: Fix crash on visit status with no type
Dec 7 2022, 10:02 AM
vlorentz added inline comments to D8908: Add ListOriginContributors.
Dec 7 2022, 9:45 AM
vlorentz closed T1345: Update metadata docs about using CodeMeta vocabulary as Resolved.
Dec 7 2022, 6:20 AM · Documentation
vlorentz closed T1345: Update metadata docs about using CodeMeta vocabulary, a subtask of T1649: Update documentation with compliance scenario changes, as Resolved.
Dec 7 2022, 6:20 AM · SWORD deposit
vlorentz added a comment to T1345: Update metadata docs about using CodeMeta vocabulary.

yes

Dec 7 2022, 6:20 AM · Documentation

Dec 6 2022

moranegg added a comment to T2719: Add entry of the FAIRsFAIR report in `publications`.

Gruenpeter, Morane, Di Cosmo, Roberto, Koers, Hylke, Herterich, Patricia, Hooft, Rob, Parland-von Essen, Jessica, Tana, Jonas, Aalto, Tero, & Jones, Sarah. (2020). M2.15 Assessment report on 'FAIRness of software' (1.1). Zenodo. https://doi.org/10.5281/zenodo.5472911

Dec 6 2022, 11:33 PM · Website
moranegg added a comment to T2719: Add entry of the FAIRsFAIR report in `publications`.

This still needs @rdicosmo's review.
At the moment I tend to say that EU project reports are not suited to the publications page, but where should this information go so that it persists?

Dec 6 2022, 11:30 PM · Website
moranegg triaged T4717: Create tutorial for using the webhook for a Save Code Now as Normal priority.
Dec 6 2022, 11:24 PM · Documentation
moranegg created T4717: Create tutorial for using the webhook for a Save Code Now.
Dec 6 2022, 11:24 PM · Documentation
moranegg added a comment to T4548: Add a public API endpoint and documentation to trigger Save Code Now from webhook.

This task seems rich with information. Is it suitable for a guide or tutorial?
I'm creating a task for a tutorial about the webhook, following Ambassador Pierre Poulain's suggestion.

Dec 6 2022, 11:22 PM · Web app
moranegg added a member for Community Building: sgranger.
Dec 6 2022, 11:00 PM
moranegg closed T1215: Crossminer: investigate CrossSim tool to tag software projects as Wontfix.
Dec 6 2022, 10:49 PM · Metadata workflow
moranegg added a comment to T2513: Copy metadata on revisions to the extrinsic metadata storage.

thanks @olasd for persevering, is there an ETA for the relaunch from October 21st?

Dec 6 2022, 10:48 PM · Metadata workflow, Roadmap 2020
moranegg removed projects from T1345: Update metadata docs about using CodeMeta vocabulary: SWORD deposit, Metadata workflow.
Dec 6 2022, 10:46 PM · Documentation
moranegg updated subscribers of T1345: Update metadata docs about using CodeMeta vocabulary.

@vlorentz: I believe this is currently the case in the deposit docs, can you confirm?

Dec 6 2022, 10:46 PM · Documentation
moranegg removed projects from T2079: Prepare collaborative document for how to use SWHID on WikiData for preservation and discovery: Documentation, Metadata workflow.
Dec 6 2022, 10:45 PM · Software Stories
moranegg added a comment to T2559: Modify redirection on https://softwareheritage.org/swhid.

@anlambert: are you able to modify a website redirection?
The redirection for this url: https://softwareheritage.org/swhid
should be: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html

Dec 6 2022, 10:44 PM · Website, SWORD deposit, Metadata workflow
moranegg removed a project from T2397: Review semantic gaps in the CodeMeta crosswalk table: Metadata workflow.
Dec 6 2022, 10:40 PM · Software Stories
moranegg moved T3483: Create charter for the ambassadors's program from In progress to Backlog on the Ambassadors board.
Dec 6 2022, 10:39 PM · Ambassadors
moranegg closed T1897: Specify dates schema when dealing with Legacy Software as Resolved.

SWHAP is dealing with this by having a branch named source code.
I'm considering this task as resolved.

Dec 6 2022, 10:34 PM · SWORD deposit, Metadata workflow, Scientific Community Building
moranegg closed T1897: Specify dates schema when dealing with Legacy Software, a subtask of T1752: Test drive legacy software curation process with Scilab, as Resolved.
Dec 6 2022, 10:34 PM · Acquisition Process (SWHAP), Metadata workflow, Scientific Community Building
rdicosmo committed rDGRPH0a8ae5de6f7b: Fix edges list in graph traversal (authored by rdicosmo).
Fix edges list in graph traversal
Dec 6 2022, 10:33 PM
moranegg added a comment to T3095: Add LIP6 gitlab instance to regular crawling list.

I believe this is done: https://archive.softwareheritage.org/browse/search/?q=gitlab.lip6.fr&visit_type=git&with_content=true&with_visit=true
@ardumont: can you confirm?

Dec 6 2022, 10:30 PM · Scientific Community Building, Archive coverage
moranegg added a project to T3118: Documentation for users and ambassadors: Documentation.
Dec 6 2022, 10:29 PM · Documentation, Scientific Community Building, Community Building, Roadmap 2021, meta-task
moranegg closed T1752: Test drive legacy software curation process with Scilab as Resolved.

This was done by Elisabetta: https://swh.stories.k2.services/inria/Q828742
Brava!

Dec 6 2022, 10:07 PM · Acquisition Process (SWHAP), Metadata workflow, Scientific Community Building
moranegg triaged T4716: Blog post on the new in-production SWHID scenario in HAL + tutorial videos as High priority.
Dec 6 2022, 10:03 PM · Unknown Object (Project)
moranegg added a comment to T1686: visual corporate identity (charte graphique).

Do you have the PDF for this? I was searching for a document I have seen in the past.

Dec 6 2022, 10:00 PM · Unknown Object (Project)
moranegg added a comment to T4264: Add photos to https://mybox.inria.fr/.

Hello Marla,

Dec 6 2022, 9:59 PM · Unknown Object (Project)
ardumont accepted D8927: setup.py: Ensure testing requirements include luigi.
Dec 6 2022, 6:53 PM
ardumont committed rDDOC7bfdb3d10249: argocd: Drop spurious = character in title (authored by ardumont).
argocd: Drop spurious = character in title
Dec 6 2022, 6:51 PM
KShivendu added a comment to D8907: feat: Add Hex.pm lister.

order sounds best. Do you want to do it?

Dec 6 2022, 6:21 PM
vlorentz added a comment to D8907: feat: Add Hex.pm lister.

order sounds best. Do you want to do it?

Dec 6 2022, 6:18 PM
ardumont added a comment to P1540 computer says no "whatever dude!".

Dropping --tablesample 1 helped...

Dec 6 2022, 6:16 PM
ardumont added a comment to P1540 computer says no "whatever dude!".

yeah, well, it does help to remove the echo in front of the instruction...

Dec 6 2022, 6:15 PM
KShivendu added a comment to D8907: feat: Add Hex.pm lister.

but if you change it, then pagination is unusable because it's offset-based.

Dec 6 2022, 6:14 PM
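
To illustrate why a changing order breaks offset-based pagination, here is a small self-contained sketch. It assumes that "it" in the comment above refers to the sort order of the listing, which this feed excerpt does not confirm. The item names, `fetch_page`, and the "most recently updated first" ordering are invented for the example and are not the Hex.pm API.

```python
from typing import List


def fetch_page(items: List[str], offset: int, limit: int) -> List[str]:
    """Return one page of an already-ordered listing (hypothetical API)."""
    return items[offset:offset + limit]


# Listing ordered by "most recently updated first" (illustrative data).
listing = ["a", "b", "c", "d"]

page1 = fetch_page(listing, offset=0, limit=2)

# Between the two requests, "d" is updated and moves to the front,
# shifting every other item one position down.
listing = ["d", "a", "b", "c"]

page2 = fetch_page(listing, offset=2, limit=2)

seen = page1 + page2
print(seen)  # ['a', 'b', 'b', 'c']: "b" is repeated and "d" is never seen
```

With a stable ordering (for example, by a key that never changes once assigned), the same offsets would return each item exactly once.
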
ardumont added a comment to D8928: grab_next_visits: Open lister name and instance name filtering.

num num num... i did not mean to push that one.
Please, tell me if i need to revert it...

Dec 6 2022, 6:12 PM
ardumont closed D8928: grab_next_visits: Open lister name and instance name filtering.
Dec 6 2022, 6:12 PM
ardumont committed rDSCHcd16fce903ad: grab_next_visits: Open lister name and instance name filtering (authored by ardumont).
grab_next_visits: Open lister name and instance name filtering
Dec 6 2022, 6:12 PM
ardumont closed D8922: send-to-celery: Adapt to schedule from lister name & instance_name.
Dec 6 2022, 6:12 PM
ardumont committed rDSCHa7769639df76: send-to-celery: Adapt to schedule from lister name & instance_name (authored by ardumont).
send-to-celery: Adapt to schedule from lister name & instance_name
Dec 6 2022, 6:12 PM
olasd accepted D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Great, thanks!

Dec 6 2022, 5:53 PM
ardumont requested review of D8928: grab_next_visits: Open lister name and instance name filtering.
Dec 6 2022, 5:08 PM
ardumont created P1540 computer says no "whatever dude!".
Dec 6 2022, 5:06 PM
ardumont added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Does it seem like we're going to use these arguments in another caller of grab_next_visits?

That's a good question. I think not.

Maybe we would, actually. For instance, it might make sense to use these options to give different scheduling weights for github/gitlab/... origins in the recurrent visit scheduler.

Dec 6 2022, 5:04 PM
swh-public-ci added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Build is green

Dec 6 2022, 4:58 PM
ardumont updated the diff for D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Fix tests

Dec 6 2022, 4:54 PM
anlambert requested review of D8927: setup.py: Ensure testing requirements include luigi.
Dec 6 2022, 4:29 PM
Harbormaster failed remote builds in B33100: Diff 32147 for D8922: send-to-celery: Adapt to schedule from lister name & instance_name!
Dec 6 2022, 4:24 PM
swh-public-ci added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Build has FAILED

Dec 6 2022, 4:24 PM
ardumont retitled D8922: send-to-celery: Adapt to schedule from lister name & instance_name from Adapt send-to-celery cli to allow scheduling from simpler lister info to send-to-celery: Adapt to schedule from lister name & instance_name.
Dec 6 2022, 4:20 PM
ardumont updated the diff for D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Drop --lister-uuid flag to the benefit of using --lister-name and --lister-instance-name

Dec 6 2022, 4:20 PM
olasd added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Either way, the uuid argument in the cli endpoint should go away!

Dec 6 2022, 4:13 PM
olasd added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Does it seem like we're going to use these arguments in another caller of grab_next_visits?

That's a good question. I think not.

Dec 6 2022, 4:10 PM
ardumont added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Does it seem like we're going to use these arguments in another caller of grab_next_visits?

Dec 6 2022, 4:09 PM
ardumont added inline comments to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.
Dec 6 2022, 4:08 PM
anlambert accepted D8910: Regenerate the test dataset to include a release with no author.
Dec 6 2022, 4:06 PM
olasd added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

So, instead of adding more stuff to the grab_next_visits signature, I would have suggested just calling lister_get in swh/scheduler/cli/origin.py to get the lister uuid, and replacing the --lister-uuid CLI argument with --lister-name and --lister-instance-name (so, only changing the CLI function).

Dec 6 2022, 4:05 PM
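
A minimal sketch of the approach olasd suggests above for D8922: resolve the lister from its name and instance name inside the CLI function, and error out when no lister matches (as proposed earlier in the same review). `FakeScheduler`, `Lister`, `resolve_lister_uuid`, and `parse_args` are hypothetical stand-ins written for this illustration. The real `lister_get` signature and the actual code in swh/scheduler/cli/origin.py are not shown in this feed and may differ.

```python
import argparse
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Lister:
    id: uuid.UUID
    name: str
    instance_name: str


class FakeScheduler:
    """Hypothetical in-memory stand-in for the scheduler backend."""

    def __init__(self) -> None:
        self._listers: Dict[Tuple[str, str], Lister] = {}

    def register(self, name: str, instance_name: str) -> Lister:
        lister = Lister(uuid.uuid4(), name, instance_name)
        self._listers[(name, instance_name)] = lister
        return lister

    def lister_get(self, name: str, instance_name: str) -> Optional[Lister]:
        # Assumed behaviour: return None when nothing matches.
        return self._listers.get((name, instance_name))


def resolve_lister_uuid(
    scheduler: FakeScheduler, name: str, instance_name: str
) -> uuid.UUID:
    """Do the name/instance -> uuid lookup in the CLI layer only."""
    lister = scheduler.lister_get(name, instance_name)
    if lister is None:
        raise ValueError(
            f"No lister named {name!r} with instance {instance_name!r}"
        )
    return lister.id


def parse_args(argv: List[str]) -> argparse.Namespace:
    # The two flags replace the former --lister-uuid argument.
    parser = argparse.ArgumentParser(prog="send-to-celery-sketch")
    parser.add_argument("--lister-name", required=True)
    parser.add_argument("--lister-instance-name", required=True)
    return parser.parse_args(argv)
```

In this sketch the function that selects visits keeps receiving a uuid, so only the CLI entry point changes.
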
anlambert accepted D8908: Add ListOriginContributors.

LGTM, added a couple of nitpicks as inline comments.

Dec 6 2022, 4:04 PM
ardumont added inline comments to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.
Dec 6 2022, 4:01 PM
vlorentz added a comment to T4394: Add support for running metadata fetchers without a VCS/package loaders.

We decided to add recurring fetches, so it will take care both of backfilling now, and visiting from time to time in the future. We're going to assume 3 months for now, as it seems reasonable to not exhaust rate limits.

Dec 6 2022, 3:54 PM · Extrinsic metadata
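
The recurring-fetch decision above can be sketched as a simple "is this origin due?" check. This is an illustration only: `is_fetch_due` and `FETCH_INTERVAL` are hypothetical names, and approximating "3 months" as 90 days is an assumption, not something stated in T4394.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: "3 months" approximated as 90 days.
FETCH_INTERVAL = timedelta(days=90)


def is_fetch_due(
    last_fetch: Optional[datetime],
    now: datetime,
    interval: timedelta = FETCH_INTERVAL,
) -> bool:
    """Return True if metadata should be fetched for an origin.

    An origin that was never fetched is always due, which covers
    backfilling; otherwise it is due once `interval` has elapsed.
    """
    if last_fetch is None:
        return True
    return now - last_fetch >= interval


now = datetime(2022, 12, 6, tzinfo=timezone.utc)
never_fetched = is_fetch_due(None, now)
recently_fetched = is_fetch_due(now - timedelta(days=10), now)
long_ago = is_fetch_due(now - timedelta(days=120), now)
```
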
ardumont accepted D8883: Add a script to generate a topological sort.

LGTM, been a while since I read java code, so verbose (especially for iterations).

Dec 6 2022, 3:54 PM
swh-public-ci added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Build is green

Dec 6 2022, 3:49 PM
ardumont updated the diff for D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Amend commit message and fix tests

Dec 6 2022, 3:45 PM
anlambert accepted D8883: Add a script to generate a topological sort.

LGTM, been a while since I read java code, so verbose (especially for iterations).

Dec 6 2022, 3:17 PM
Harbormaster failed remote builds in B33098: Diff 32145 for D8922: send-to-celery: Adapt to schedule from lister name & instance_name!
Dec 6 2022, 3:01 PM
swh-public-ci added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Build was aborted

Dec 6 2022, 3:01 PM
ardumont retitled D8922: send-to-celery: Adapt to schedule from lister name & instance_name from scheduler: Adapt send-to-celery to allow scheduling from lister name to Adapt send-to-celery cli to allow scheduling from simpler lister info.
Dec 6 2022, 2:56 PM
anlambert requested changes to D8919: Add CLI script to generate Luigi config and call it.

Could you add a test checking that the luigi parameters are correctly passed to the subprocess.run call?

Dec 6 2022, 2:54 PM
ardumont updated the test plan for D8922: send-to-celery: Adapt to schedule from lister name & instance_name.
Dec 6 2022, 2:50 PM
ardumont retitled D8922: send-to-celery: Adapt to schedule from lister name & instance_name from scheduler: Open `swh scheduler lister` command to retrieve lister id to scheduler: Adapt send-to-celery to allow scheduling from lister name.
Dec 6 2022, 2:50 PM
ardumont updated the diff for D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

Adapt according to suggestion (still trying to determine why the build fails)

Dec 6 2022, 2:49 PM
anlambert accepted D8926: luigi.RunExportAll: Default to exporting all formats.
Dec 6 2022, 2:47 PM
anlambert accepted D8917: Split swh/graph/luigi.py into modules.
Dec 6 2022, 2:46 PM
anlambert accepted D8925: luigi.CreateAthena: Fix validation of DB name.
Dec 6 2022, 2:43 PM
vlorentz added a revision to T2220: swh-graph in production: D8919: Add CLI script to generate Luigi config and call it.
Dec 6 2022, 2:37 PM · Roadmap 2022, meta-task, Roadmap 2021, Compressed graph service
vlorentz added a task to D8919: Add CLI script to generate Luigi config and call it: T2220: swh-graph in production.
Dec 6 2022, 2:37 PM
vlorentz added a task to D8919: Add CLI script to generate Luigi config and call it: T4676: Add Luigi workflow in swh-dataset.
Dec 6 2022, 2:37 PM
vlorentz added a task to D8924: exporters/orc: Fix crash on visit status with no type: T4676: Add Luigi workflow in swh-dataset.
Dec 6 2022, 2:37 PM
vlorentz added a task to D8925: luigi.CreateAthena: Fix validation of DB name: T4676: Add Luigi workflow in swh-dataset.
Dec 6 2022, 2:37 PM
vlorentz added a task to D8926: luigi.RunExportAll: Default to exporting all formats: T4676: Add Luigi workflow in swh-dataset.
Dec 6 2022, 2:37 PM
vlorentz added revisions to T4676: Add Luigi workflow in swh-dataset: D8919: Add CLI script to generate Luigi config and call it, D8924: exporters/orc: Fix crash on visit status with no type, D8925: luigi.CreateAthena: Fix validation of DB name, D8926: luigi.RunExportAll: Default to exporting all formats.
Dec 6 2022, 2:37 PM · Datasets, Compressed graph service
ardumont planned changes to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

To adapt according to suggestion (already on it).

Dec 6 2022, 2:15 PM
vlorentz requested review of D8926: luigi.RunExportAll: Default to exporting all formats.
Dec 6 2022, 2:07 PM
Harbormaster failed to build B33092: rDWAPPS36ce2b462f5d: archive_coverage: Add link to Archive Changelog in coverage widget for rDWAPPS36ce2b462f5d: archive_coverage: Add link to Archive Changelog in coverage widget!
Dec 6 2022, 2:06 PM
vlorentz requested review of D8925: luigi.CreateAthena: Fix validation of DB name.
Dec 6 2022, 2:05 PM
anlambert accepted D8924: exporters/orc: Fix crash on visit status with no type.
Dec 6 2022, 2:05 PM
vlorentz requested review of D8924: exporters/orc: Fix crash on visit status with no type.
Dec 6 2022, 2:04 PM
anlambert closed D8920: from_disk.Content: Add missing path info for symlink.
Dec 6 2022, 1:54 PM
anlambert committed rDMOD818ad826a4f4: from_disk.Content: Add missing path info for symlink (authored by anlambert).
from_disk.Content: Add missing path info for symlink
Dec 6 2022, 1:54 PM
anlambert closed D8923: archive_coverage: Add link to Archive Changelog in coverage widget.
Dec 6 2022, 1:53 PM
anlambert committed rDWAPPS36ce2b462f5d: archive_coverage: Add link to Archive Changelog in coverage widget (authored by anlambert).
archive_coverage: Add link to Archive Changelog in coverage widget
Dec 6 2022, 1:53 PM
anlambert requested changes to D8909: Login: Add an option to choose an authentication method (by username/password or token).

@anlambert Shouldn't this be replaced by swh auth generate-token?

@anlambert @vlorentz seems legit that anything related to auth for a cli command should be centralized in swh auth.

I can adapt to:

  • make swh scanner depend on swh auth
  • alias swh scanner login to one of the swh auth commands
  • add a set token command to swh auth

What do you think?

Dec 6 2022, 1:52 PM
vlorentz accepted D8923: archive_coverage: Add link to Archive Changelog in coverage widget.

nice

Dec 6 2022, 1:46 PM
anlambert requested review of D8923: archive_coverage: Add link to Archive Changelog in coverage widget.
Dec 6 2022, 1:44 PM
vlorentz accepted D8920: from_disk.Content: Add missing path info for symlink.

ah, so it doesn't matter for other loaders. Phew!

Dec 6 2022, 1:36 PM
anlambert added a comment to D8920: from_disk.Content: Add missing path info for symlink.

Does it mean we were silently dropping data until this? Which loaders use this?

Dec 6 2022, 1:14 PM
olasd added a comment to D8922: send-to-celery: Adapt to schedule from lister name & instance_name.

As I mentioned on IRC, I think we should do that "join" directly in swh scheduler origin send-to-celery (and have it error out if the lister name/instance provided don't match an existing lister); using the uuid was an "easy hack" to extend the API of grab_next_visits, but using lister name/instance in the CLI interface makes more sense.

Dec 6 2022, 12:37 PM