Feed All Stories

Sep 2 2021

anlambert updated the diff for D6170: origin_save: Filter out visit type with no scheduler load-* task type.

Handle @ardumont comments

Sep 2 2021, 2:23 PM
anlambert added a comment to D6170: origin_save: Filter out visit type with no scheduler load-* task type.

Thanks.

I like how you were able to scrub out the scheduler mocking part as well, nice work!
That had become a mess to maintain so kudos.

Sep 2 2021, 2:14 PM
ardumont accepted D6170: origin_save: Filter out visit type with no scheduler load-* task type.

I like how you were able to scrub out the scheduler mocking part as well, nice work!
That had become a mess to maintain so kudos.

Sep 2 2021, 2:07 PM
vlorentz accepted D6170: origin_save: Filter out visit type with no scheduler load-* task type.
Sep 2 2021, 2:04 PM
anlambert updated subscribers of D6170: origin_save: Filter out visit type with no scheduler load-* task type.

Wouldn't a config option be simpler?

Sep 2 2021, 1:47 PM
vlorentz added a comment to D6170: origin_save: Filter out visit type with no scheduler load-* task type.

Wouldn't a config option be simpler?

Sep 2 2021, 1:26 PM
anlambert requested review of D6170: origin_save: Filter out visit type with no scheduler load-* task type.
Sep 2 2021, 1:21 PM
ardumont moved T3518: Enable vault cookers to access swh-graph from code-review/await-feedback/pause to deployed/landed/monitoring on the System administration board.
Sep 2 2021, 12:35 PM · System administration, Vault
ardumont committed rSPSITEa2706d97972d: common: Activate graph options for production cookers (authored by ardumont).
common: Activate graph options for production cookers
Sep 2 2021, 12:30 PM
ardumont closed D6169: Enable production vault cookers to use swh.graph.
Sep 2 2021, 12:30 PM
ardumont committed rSPSITE9d72822b1fc6: Enable vault cookers to access swh-graph if configuration requires it (authored by ardumont).
Enable vault cookers to access swh-graph if configuration requires it
Sep 2 2021, 12:30 PM
ardumont triaged T3545: Update the journalbeat version package as Normal priority.
Sep 2 2021, 12:29 PM · Packagers, System administration
vlorentz added a comment to D6169: Enable production vault cookers to use swh.graph.

looks good

Sep 2 2021, 12:15 PM
vlorentz added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

I updated the task with a breakdown of the cost of getting each info.

Sep 2 2021, 12:12 PM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 2 2021, 12:04 PM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 2 2021, 11:56 AM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 2 2021, 11:54 AM · Origin-GitHub, Extrinsic metadata
anlambert added a comment to D5992: add support for the CVS loader to 'Save Code Now'.

What do you think?

yes, good idea. That'd simplify the setup both in docker and prod (as in nothing to do ;)

Sep 2 2021, 11:42 AM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCHfa9d3286818b: Updated backport on buster-swh from debian/0.18.0-1_swh1 (unstable-swh) (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Updated backport on buster-swh from debian/0.18.0-1_swh1 (unstable-swh)
Sep 2 2021, 11:37 AM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCHdb6bff1747ce: Merge tag 'debian/0.18.0-1_swh1' into debian/buster-swh (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Merge tag 'debian/0.18.0-1_swh1' into debian/buster-swh
Sep 2 2021, 11:37 AM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCHb364c7ea44df: pristine-tar data for swh-scheduler_0.18.0.orig.tar.gz (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
pristine-tar data for swh-scheduler_0.18.0.orig.tar.gz
Sep 2 2021, 11:35 AM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCH1a70fd612b29: Updated debian changelog for version 0.18.0 (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Updated debian changelog for version 0.18.0
Sep 2 2021, 11:35 AM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCH66bf492bbb13: New upstream version 0.18.0 (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
New upstream version 0.18.0
Sep 2 2021, 11:35 AM
Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org> committed rDSCH22baf3f0201c: Update upstream source from tag 'debian/upstream/0.18.0' (authored by Jenkins for Software Heritage <jenkins@jenkins-debian1.internal.softwareheritage.org>).
Update upstream source from tag 'debian/upstream/0.18.0'
Sep 2 2021, 11:35 AM
ardumont committed rDSCHecc14007aa25: runner: Improve help message on the task types flag. (authored by ardumont).
runner: Improve help message on the task types flag.
Sep 2 2021, 11:28 AM
ardumont closed D5818: send-to-celery: Add more options to allow scheduling of edge case origins.
Sep 2 2021, 11:09 AM
ardumont committed rDSCH63fdda00f5f9: send-to-celery: Add more options to allow scheduling of edge cases (authored by ardumont).
send-to-celery: Add more options to allow scheduling of edge cases
Sep 2 2021, 11:09 AM
swh-public-ci added a comment to D5818: send-to-celery: Add more options to allow scheduling of edge case origins.

Build is green

Sep 2 2021, 10:48 AM
ardumont added a comment to D5818: send-to-celery: Add more options to allow scheduling of edge case origins.

There's a weird set of changes that seems to be mixed in to the new send-to-celery options, I'm not sure that was intended?

Sep 2 2021, 10:47 AM
ardumont updated the diff for D5818: send-to-celery: Add more options to allow scheduling of edge case origins.

Only target the one commit for the diff

Sep 2 2021, 10:46 AM
vlorentz requested changes to D6158: maven jar-loader: Initalise files..
Sep 2 2021, 10:30 AM
vlorentz requested changes to D6133: maven-lister: initialise lister..

(marking this diff as "Changes Requested" so I get a notification when you update it)

Sep 2 2021, 10:29 AM

Sep 1 2021

zack added a comment to T3544: Deal with GitHub removing support for git:// URLs.
In T3544#69746, @olasd wrote:

I can see a few alternatives to using git:// over tcp:

  • Give our swh bot accounts SSH keys, and use that to clone from GitHub over ssh.
Sep 1 2021, 10:06 PM · Origin-GitHub, Git loader
olasd added a comment to T3544: Deal with GitHub removing support for git:// URLs.

The dulwich HTTP(s) support is implemented on top of urllib(3?).
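
As an illustration of falling back to HTTP(S) instead of git:// over TCP, here is a minimal sketch that rewrites GitHub git:// origin URLs to https://. The function name `git_to_https` is hypothetical, and rewriting URLs this way is an assumption for illustration, not something decided in this task.

```python
from urllib.parse import urlsplit, urlunsplit


def git_to_https(url):
    """Rewrite a git://github.com/... URL to https://; leave others untouched.

    Hypothetical helper: only the scheme is changed, the host and path are kept.
    """
    parts = urlsplit(url)
    if parts.scheme == "git" and parts.hostname == "github.com":
        return urlunsplit(
            ("https", parts.netloc, parts.path, parts.query, parts.fragment)
        )
    return url


print(git_to_https("git://github.com/user/repo.git"))
```

URLs on other hosts or with other schemes are returned unchanged, so the rewrite stays limited to the forge that is dropping git:// support.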

Sep 1 2021, 9:18 PM · Origin-GitHub, Git loader
vlorentz triaged T3544: Deal with GitHub removing support for git:// URLs as High priority.
Sep 1 2021, 9:11 PM · Origin-GitHub, Git loader
ardumont added a comment to T1524: save code now: also add new origins for unknown repos.

The scheduler is getting there.
We are now able to trigger a runner for that part:

Sep 1 2021, 6:02 PM · Save Code Now, Web app
vsellier added a comment to T3433: Deploy swh.search v0.10/v0.11.
  • package python3-swh.search upgraded to version 0.11.4-2, the problem is fixed
  • the new index is correctly created:
root@search0:/# curl -s http://search-esnode0:9200/_cat/indices\?v
health status index                       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   origin-v0.11                HljzsdD9SmKI7-8ekB_q3Q  80   0          0            0      4.2kb          4.2kb
green  close  origin                      HthJj42xT5uO7w3Aoxzppw  80   0                                                  
green  close  origin-v0.9.0               o7FiYJWnTkOViKiAdCXCuA  80   0                                                  
green  open   origin-v0.10.0              -fvf4hK9QDeN8qYTJBBlxQ  80   0    1981623       559384      2.3gb          2.3gb
green  close  origin-backup-20210209-1736 P1CKjXW0QiWM5zlzX46-fg  80   0                                                  
green  close  origin-v0.5.0               SGplSaqPR_O9cPYU4ZsmdQ  80   0
  • journal clients enabled and restarted
  • the journal clients' lag should recover in less than 12h
  • waiting some time to estimate the duration with only one journal client per type
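
To check the index state shown above without reading the table by eye, a small sketch can parse the `_cat/indices?v` text output. The function names and the shortened `SAMPLE` table are illustrative assumptions; the rows are copied from the listing above.

```python
def parse_cat_indices(text):
    """Parse the whitespace-separated table printed by `_cat/indices?v`."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Closed indices print no document counts or sizes, so their rows are
        # shorter than the header; zip() stops at the last column present.
        rows.append(dict(zip(header, line.split())))
    return rows


def open_indices(text):
    """Map each open index name to its document count."""
    return {
        row["index"]: int(row["docs.count"])
        for row in parse_cat_indices(text)
        if row["status"] == "open"
    }


SAMPLE = """\
health status index          uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   origin-v0.11   HljzsdD9SmKI7-8ekB_q3Q  80   0          0            0      4.2kb          4.2kb
green  close  origin-v0.9.0  o7FiYJWnTkOViKiAdCXCuA  80   0
green  open   origin-v0.10.0 -fvf4hK9QDeN8qYTJBBlxQ  80   0    1981623       559384      2.3gb          2.3gb
"""

print(open_indices(SAMPLE))
```

Comparing the document count of the new index against the previous one over time gives a rough view of how far the journal clients have caught up.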
Sep 1 2021, 5:46 PM · System administration, Archive search
olasd accepted D5818: send-to-celery: Add more options to allow scheduling of edge case origins.

The send-to-celery part LGTM, thanks.

Sep 1 2021, 5:45 PM
ardumont moved T3518: Enable vault cookers to access swh-graph from in-progress to code-review/await-feedback/pause on the System administration board.
Sep 1 2021, 5:32 PM · System administration, Vault
ardumont updated the test plan for D6169: Enable production vault cookers to use swh.graph.
Sep 1 2021, 5:26 PM
ardumont updated the test plan for D6169: Enable production vault cookers to use swh.graph.
Sep 1 2021, 5:26 PM
ardumont requested review of D6169: Enable production vault cookers to use swh.graph.
Sep 1 2021, 5:25 PM
ardumont added a revision to T3518: Enable vault cookers to access swh-graph: D6169: Enable production vault cookers to use swh.graph.
Sep 1 2021, 5:25 PM · System administration, Vault
vsellier added a comment to T3433: Deploy swh.search v0.10/v0.11.

The problem was fixed by rDSEA68347a5604c74150197f691593cbb05bdd34396f
thanks @olasd

Sep 1 2021, 5:22 PM · System administration, Archive search
vsellier added a comment to T3433: Deploy swh.search v0.10/v0.11.

Deployment of version v0.11.4 in staging:
On search0:

  • puppet stopped
  • stop and disable the journal clients and search backend
  • update the swh-search configuration to use an origin-v0.11 index
root@search0:/etc/softwareheritage/search# diff -U2 /tmp/server.yml server.yml 
--- /tmp/server.yml	2021-09-01 13:42:29.347951302 +0000
+++ server.yml	2021-09-01 13:42:35.739953523 +0000
@@ -7,5 +7,5 @@
   indexes:
     origin:
-      index: origin-v0.10.0
+      index: origin-v0.11
       read_alias: origin-read
       write_alias: origin-write
  • update the journal-clients to use a group id swh.search.journal_client.[indexed|object]-v0.11
root@search0:/etc/softwareheritage/search# diff -U3 /tmp/journal_client_objects.yml journal_client_objects.yml 
--- /tmp/journal_client_objects.yml	2021-09-01 13:44:49.843999978 +0000
+++ journal_client_objects.yml	2021-09-01 13:45:03.972004852 +0000
@@ -5,7 +5,7 @@
 journal:
   brokers:
   - journal0.internal.staging.swh.network
-  group_id: swh.search.journal_client-v0.10.0
+  group_id: swh.search.journal_client-v0.11
   prefix: swh.journal.objects
   object_types:
   - origin
root@search0:/etc/softwareheritage/search# diff -U3 /tmp/journal_client_indexed.yml journal_client_indexed.yml 
--- /tmp/journal_client_indexed.yml	2021-09-01 13:44:44.847998252 +0000
+++ journal_client_indexed.yml	2021-09-01 13:44:57.020002454 +0000
@@ -5,7 +5,7 @@
 journal:
   brokers:
   - journal0.internal.staging.swh.network
-  group_id: swh.search.journal_client.indexed-v0.10.0
+  group_id: swh.search.journal_client.indexed-v0.11
   prefix: swh.journal.indexed
   object_types:
   - origin_intrinsic_metadata
  • perform a system upgrade, a reboot was not required
  • enable and start swh-search backend
  • An error occurs after the restart:
Sep 01 14:19:12 search0 python3[4066688]: 2021-09-01 14:19:12 [4066688] root:ERROR command 'cc' failed with exit status 1
                                          Traceback (most recent call last):
                                            File "/usr/lib/python3.7/distutils/unixccompiler.py", line 118, in _compile
                                              extra_postargs)
                                            File "/usr/lib/python3.7/distutils/ccompiler.py", line 909, in spawn
                                              spawn(cmd, dry_run=self.dry_run)
                                            File "/usr/lib/python3.7/distutils/spawn.py", line 36, in spawn
                                              _spawn_posix(cmd, search_path, dry_run=dry_run)
                                            File "/usr/lib/python3.7/distutils/spawn.py", line 159, in _spawn_posix
                                              % (cmd, exit_status))
                                          distutils.errors.DistutilsExecError: command 'cc' failed with exit status 1
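
The configuration changes in the steps above all follow one naming pattern: the index name and both journal client group ids carry the same version suffix. A minimal sketch of that pattern follows; the function name `versioned_names` and the dictionary keys are hypothetical, and only the resulting strings come from the diffs above.

```python
def versioned_names(version):
    """Derive the versioned index name and consumer group ids.

    Hypothetical helper: the base names are taken from the diffs above,
    the function itself is not part of swh-search.
    """
    suffix = f"-v{version}"
    return {
        "index": f"origin{suffix}",
        "objects_group_id": f"swh.search.journal_client{suffix}",
        "indexed_group_id": f"swh.search.journal_client.indexed{suffix}",
    }


for key, value in versioned_names("0.11").items():
    print(f"{key}: {value}")
```

Keeping the three names in one place makes it harder to bump the index while forgetting one of the group ids.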
Sep 1 2021, 5:15 PM · System administration, Archive search
ardumont closed D6168: d/control: Activate the vault tests with the graph client.

Landed

Sep 1 2021, 4:59 PM
ardumont updated the summary of D6168: d/control: Activate the vault tests with the graph client.
Sep 1 2021, 4:58 PM
ardumont added a comment to D6168: d/control: Activate the vault tests with the graph client.

I think it'd be best to have those tested during packaging.

Testing is always a good idea, looks good to me.

Sep 1 2021, 4:58 PM
ardumont added a comment to D6138: package/utils: Handle downloads for urls with missing schema.

I don't think this is something that we should do by default, because the semantics of
a URI with no scheme are ambiguous in general. If the semantics of the nix
sources.json is that download URIs with no protocol have an implicit http scheme
added, then such url mangling should be implemented in the nix loader directly.
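
To make the suggestion concrete, here is a minimal sketch of what such URL mangling could look like if it lived in the nix loader. The function name `with_default_scheme` and its `default_scheme` parameter are hypothetical, and treating a scheme-less URL as http is exactly the assumption discussed above, not confirmed behaviour of the nix sources.json.

```python
import re

# A URL "has a scheme" here if it starts with scheme:// (scheme syntax per RFC 3986).
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def with_default_scheme(url, default_scheme="http"):
    """Return url unchanged if it has a scheme, else prepend default_scheme."""
    if _SCHEME_RE.match(url):
        return url
    if url.startswith("//"):
        # Protocol-relative form: only the scheme name is missing.
        return f"{default_scheme}:{url}"
    return f"{default_scheme}://{url}"


print(with_default_scheme("example.org/pkg-1.0.tar.gz"))
print(with_default_scheme("https://example.org/pkg-1.0.tar.gz"))
```

Because the default is applied only inside one loader, the generic download helper keeps rejecting ambiguous URLs from every other source.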

Sep 1 2021, 4:57 PM
anlambert accepted D6168: d/control: Activate the vault tests with the graph client.

I think it'd be best to have those tested during packaging.

Sep 1 2021, 4:49 PM
ardumont added a comment to T3518: Enable vault cookers to access swh-graph.

Dropping the subtask about the full packaging.

Sep 1 2021, 4:44 PM · System administration, Vault
ardumont updated the summary of D6168: d/control: Activate the vault tests with the graph client.
Sep 1 2021, 4:40 PM
ardumont updated the summary of D6168: d/control: Activate the vault tests with the graph client.
Sep 1 2021, 4:39 PM
ardumont requested review of D6168: d/control: Activate the vault tests with the graph client.
Sep 1 2021, 4:39 PM
ardumont created P1148 vault debian package build with graph client tests ok.
Sep 1 2021, 4:38 PM
ardumont added a revision to T3518: Enable vault cookers to access swh-graph: D6168: d/control: Activate the vault tests with the graph client.
Sep 1 2021, 4:37 PM · System administration, Vault
ardumont changed the status of T3518: Enable vault cookers to access swh-graph, a subtask of T3505: Make the git-bare cooker available to the staff and beta-testers in the production webapp, from Open to Work in Progress.
Sep 1 2021, 4:21 PM · Vault, Web app
ardumont changed the status of T3518: Enable vault cookers to access swh-graph from Open to Work in Progress.
Sep 1 2021, 4:21 PM · System administration, Vault
ardumont closed T3543: Debian package python3-swh.graph.client, a subtask of T3518: Enable vault cookers to access swh-graph, as Resolved.
Sep 1 2021, 4:21 PM · System administration, Vault
ardumont closed T3543: Debian package python3-swh.graph.client as Resolved.
Sep 1 2021, 4:21 PM · System administration, Vault
ardumont added a comment to T3543: Debian package python3-swh.graph.client.

Packaging ok:

Sep 1 2021, 4:20 PM · System administration, Vault
ardumont updated the task description for T3518: Enable vault cookers to access swh-graph.
Sep 1 2021, 4:17 PM · System administration, Vault
ardumont added a comment to T3543: Debian package python3-swh.graph.client.

Package python3-swh.graph.client built [1]

Sep 1 2021, 4:16 PM · System administration, Vault
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 1 2021, 4:05 PM · Origin-GitHub, Extrinsic metadata
vlorentz updated the task description for T3542: Decide what metadata we want to / can collect from GitHub.
Sep 1 2021, 4:03 PM · Origin-GitHub, Extrinsic metadata
vlorentz added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

no and yes, respectively

Sep 1 2021, 4:02 PM · Origin-GitHub, Extrinsic metadata
ardumont added a comment to P1147 debian package successful build.

buster:

swh-debian-build-stable
++ git branch
++ grep '*'
++ cut '-d ' -f2
+ CURRENT_BRANCH=debian/buster-swh
+ '[' debian/buster-swh = debian/buster-swh ']'
+ gbp buildpackage --git-builder=sbuild --nolog --batch --no-clean-source --no-run-lintian --arch-all --source --force-orig-source --build-dep-resolver=aptitude '--extra-repository=deb [trusted=yes] https://debian.softwareheritage.org/ buster-swh main' '--extra-repository=deb http://deb.debian.org/debian/ buster-backports main' --build-failed-commands %SBUILD_SHELL
gbp:info: Tarballs 'swh.graph_0.5.0.orig.tar.gz' not found at '/home/tony/debian/tarballs/'
gbp:info: Exporting 'HEAD' to '/home/tony/debian/build-area/swh.graph-tmp'
gbp:info: Moving '/home/tony/debian/build-area/swh.graph-tmp' to '/home/tony/debian/build-area/swh.graph-0.5.0'
gbp:info: Performing the build
dpkg-source: info: using source format '3.0 (quilt)'
dpkg-source: info: building swh.graph using existing ./swh.graph_0.5.0.orig.tar.gz
dpkg-source: warning: ignoring deletion of directory java/target
dpkg-source: warning: ignoring deletion of file java/target/swh-graph-0.5.0.jar, use --include-removal to override
dpkg-source: info: building swh.graph in swh.graph_0.5.0-1~swh1~bpo10+1.debian.tar.xz
dpkg-source: info: building swh.graph in swh.graph_0.5.0-1~swh1~bpo10+1.dsc
sbuild (Debian sbuild) 0.78.1 (09 February 2019) on localhost
Sep 1 2021, 3:33 PM
ardumont edited P1147 debian package successful build.
Sep 1 2021, 3:32 PM
ardumont added a comment to T3543: Debian package python3-swh.graph.client.
  • Add 'has debian packaging branch' tag to the repository
  • Install hook to cascade the debian packaging build [1]
  • Ensure the necessary CI jobs are installed to build the package (already there)
Sep 1 2021, 3:25 PM · System administration, Vault
vsellier closed T3484: Fix the release builds for swh-search, a subtask of T3433: Deploy swh.search v0.10/v0.11, as Resolved.
Sep 1 2021, 3:21 PM · System administration, Archive search
vsellier closed T3484: Fix the release builds for swh-search as Resolved.

The build is now fixed and the v0.11.4 version is ready to be deployed on the environments

Sep 1 2021, 3:21 PM · System administration, Archive search
ardumont added a comment to T3543: Debian package python3-swh.graph.client.

Local build successful [1]

Sep 1 2021, 3:18 PM · System administration, Vault
ardumont created P1147 debian package successful build.
Sep 1 2021, 3:18 PM
ardumont changed the status of T3543: Debian package python3-swh.graph.client from Open to Work in Progress.
Sep 1 2021, 3:16 PM · System administration, Vault
ardumont changed the status of T3543: Debian package python3-swh.graph.client, a subtask of T3518: Enable vault cookers to access swh-graph, from Open to Work in Progress.
Sep 1 2021, 3:16 PM · System administration, Vault
ardumont triaged T3543: Debian package python3-swh.graph.client as Normal priority.
Sep 1 2021, 3:16 PM · System administration, Vault
olasd requested changes to D6138: package/utils: Handle downloads for urls with missing schema.

I don't think this is something that we should do by default, because the semantics of a URI with no scheme are ambiguous in general.

Sep 1 2021, 2:05 PM
vlorentz added a comment to D6133: maven-lister: initialise lister..

Please use fewer single-letter variable names in lister.py

Sep 1 2021, 12:13 PM
douardda added a comment to T3542: Decide what metadata we want to / can collect from GitHub.

do we need the "list of forks" if we keep the "fork of what"? I mean these are the 2 ends of the fork relation, right?

Sep 1 2021, 12:06 PM · Origin-GitHub, Extrinsic metadata
vlorentz added a comment to D6158: maven jar-loader: Initalise files..

It means one of the members is different, but it's impossible to tell which.

Sep 1 2021, 11:59 AM
Harbormaster failed to build B23303: rDGQL3b7f93144be2: Initial commit. First version of the GraphQL API service for rDGQL3b7f93144be2: Initial commit. First version of the GraphQL API service!
Sep 1 2021, 11:57 AM
jayeshv committed rDGQL3b7f93144be2: Initial commit. First version of the GraphQL API service (authored by jayeshv).
Initial commit. First version of the GraphQL API service
Sep 1 2021, 11:57 AM
vsellier added a comment to D6139: cassandra: Add option to select (hopefully) more efficient batch insertion algos.

Test with 10 replayers and the 3 kinds of algorithm:

  • first interval: one-by-one
  • second interval: concurrent
  • third interval: batch:
Sep 1 2021, 11:37 AM
swh-public-ci added a comment to D6165: Add new RabbitMQ-based client/server API.

Build is green

Sep 1 2021, 11:36 AM
aeviso updated the diff for D6165: Add new RabbitMQ-based client/server API.
  • Remove old debug logging and improve other's messages
  • Rework ProvenanceStorageRabbitMQWorker to handle connection loss
  • Remove old client/server storage based on swh.core.api.RPCClient
Sep 1 2021, 11:34 AM
ardumont added a comment to D5992: add support for the CVS loader to 'Save Code Now'.

What do you think?

Sep 1 2021, 11:27 AM
ardumont added inline comments to D6138: package/utils: Handle downloads for urls with missing schema.
Sep 1 2021, 11:11 AM
ardumont added a comment to D6138: package/utils: Handle downloads for urls with missing schema.

I've re-triggered the build and now it's in a happy place.

Sep 1 2021, 11:10 AM
KShivendu requested review of D6138: package/utils: Handle downloads for urls with missing schema.
Sep 1 2021, 11:10 AM
jayeshv created GraphQL API.
Sep 1 2021, 11:01 AM
anlambert accepted D5992: add support for the CVS loader to 'Save Code Now'.

Also I just realized, as it's not completely ready afaik, don't land it yet though.

@anlambert What do you think of this?

Do we want, for example, a configuration (file) option to hide it, which we toggle later when it's ready?
That would avoid letting this diff hang.

TIA

Sep 1 2021, 11:01 AM
vsellier committed rDSNIP30b06ccd0294: grid5000/cassandra: fix statsd configuration of gunicorn services (authored by vsellier).
grid5000/cassandra: fix statsd configuration of gunicorn services
Sep 1 2021, 10:33 AM
ardumont closed T3502: Date overflow error in scheduler journal client as Resolved.

The monitoring icinga alerts have been deployed:

Sep 1 2021, 10:03 AM · System administration, Scheduling utilities
ardumont closed T3497: Allow systemd service status monitoring as Resolved.
Sep 1 2021, 10:01 AM · System administration
ardumont closed D6166: swh-scheduler-journal-client: Delay the restart of failing service.
Sep 1 2021, 9:55 AM
ardumont committed rSPSITEdb047bc1d886: swh-scheduler-journal-client: Delay the restart of failing service (authored by ardumont).
swh-scheduler-journal-client: Delay the restart of failing service
Sep 1 2021, 9:55 AM
ardumont closed D6163: Ensure icinga alerts are raised if scheduler journal client service is down.
Sep 1 2021, 9:55 AM
ardumont committed rSPSITE2434b83e98e9: Ensure icinga alerts are raised if scheduler journal client service is down (authored by ardumont).
Ensure icinga alerts are raised if scheduler journal client service is down
Sep 1 2021, 9:55 AM
vsellier accepted D6166: swh-scheduler-journal-client: Delay the restart of failing service.

LGTM

Sep 1 2021, 9:37 AM
vsellier accepted D6163: Ensure icinga alerts are raised if scheduler journal client service is down.

LGTM

Sep 1 2021, 9:33 AM