Jun 7 2022

ardumont updated the task description for T4305: Aligh swh backends for migration tools to work.
Jun 7 2022, 4:07 PM · Core & foundations
ardumont added a comment to T4305: Aligh swh backends for migration tools to work.

Conclusion: it's mostly [1] ok now. The backends that were not usable with the cli tool are now
ok.

Jun 7 2022, 4:03 PM · Core & foundations
ardumont triaged T4312: Refactor swh.vault factory so migration tool can work with it as Normal priority.
Jun 7 2022, 4:01 PM · Core & foundations
ardumont added a comment to T4305: Aligh swh backends for migration tools to work.
  • storage [1]
  • indexer [2]
  • scheduler [3]
  • scrubber [4]
  • vault: status quo, still not working ¯\_(ツ)_/¯ [5]
Jun 7 2022, 3:51 PM · Core & foundations
ardumont updated the task description for T4305: Aligh swh backends for migration tools to work.
Jun 7 2022, 3:35 PM · Core & foundations
ardumont added a comment to T4283: Load https://github.com/chromium/chromium with a higher packfile size limit.

Looks like either the loader didn't detect it is a fork, or github sent a large packfile anyway.

In swh/loader/git/loader.py at the end of the prepare function, could you print self.statsd.constant_tags and self.parent_origins, to see which it is?
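
A minimal sketch of that debug step, assuming nothing about the real loader beyond the two attribute names quoted above. `FakeGitLoader`, `describe_fork_state` and the tag values are hypothetical stand-ins for illustration, not the actual swh code:

```python
from types import SimpleNamespace


def describe_fork_state(loader) -> str:
    """Format the two attributes that show whether a fork was detected."""
    return (
        f"constant_tags={loader.statsd.constant_tags!r} "
        f"parent_origins={loader.parent_origins!r}"
    )


class FakeGitLoader:
    """Hypothetical stand-in for the git loader (not the real class)."""

    def __init__(self, constant_tags, parent_origins):
        # Only the attributes named in the comment above are modelled.
        self.statsd = SimpleNamespace(constant_tags=constant_tags)
        self.parent_origins = parent_origins

    def prepare(self):
        # ... the real preparation steps would run here ...
        # Temporary debug output, placed at the very end of prepare().
        print(describe_fork_state(self))


if __name__ == "__main__":
    # Illustrative values only: no parent origins were found for this run.
    FakeGitLoader({"example_tag": "value"}, None).prepare()
```

If `parent_origins` prints as empty or `None`, the loader did not see the origin as a fork; otherwise the large packfile came from the remote despite fork detection.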

Jun 7 2022, 3:34 PM · System administration, Git loader
ardumont accepted D7967: package/archive: Handle tarball artifact with null time.
Jun 7 2022, 3:24 PM
ardumont added a comment to T4283: Load https://github.com/chromium/chromium with a higher packfile size limit.

Loader crashed with memory issues. Probably too much loading in parallel.
Currently stopping the worker's other processes to let this one finish (I'll restart it).

Jun 7 2022, 3:12 PM · System administration, Git loader
ardumont closed D7966: tests: Fix current_version attribute change.
Jun 7 2022, 3:05 PM
ardumont committed rDVAUda9203fcf216: tests: Fix current_version attribute change (authored by ardumont).
tests: Fix current_version attribute change
Jun 7 2022, 3:05 PM
ardumont updated the test plan for D7965: postgres db: Create guest user at db initialization time.
Jun 7 2022, 3:03 PM
ardumont updated the diff for D7965: postgres db: Create guest user at db initialization time.

It works like @vsellier suggested as well.

Jun 7 2022, 3:01 PM
ardumont added a comment to D7965: postgres db: Create guest user at db initialization time.

Is it not possible to use a SQL script directly, along the lines of [1]?
I suppose it will make the script simpler as the postgresql image logic will take care of running the script only during the database initialization.

[1] https://forge.softwareheritage.org/source/swh-provenance/browse/master/docker-compose.yml$16

Jun 7 2022, 2:57 PM
ardumont requested review of D7966: tests: Fix current_version attribute change.
Jun 7 2022, 2:56 PM
ardumont added a revision to T4305: Aligh swh backends for migration tools to work: D7966: tests: Fix current_version attribute change.
Jun 7 2022, 2:53 PM · Core & foundations
ardumont accepted D7962: kafka: add more options to the user management script.

The argument parsing is starting to get uncomfortable to read.

Jun 7 2022, 2:43 PM
ardumont accepted D7964: assets/functions: Add missing await keywords in getCanonicalOriginURL.
Jun 7 2022, 2:36 PM
ardumont updated the summary of D7965: postgres db: Create guest user at db initialization time.
Jun 7 2022, 2:36 PM
ardumont updated the test plan for D7965: postgres db: Create guest user at db initialization time.
Jun 7 2022, 2:35 PM
ardumont added a comment to D7913: db: Grant read access to guest user on all tables of the schema.

One remark from @vsellier got my attention, it will break the docker setup.
Maybe @douardda hinted at it as well...

He is not wrong there, see for example [1].
So some adaptations will be needed there.

[1]

Jun 7 2022, 2:29 PM
ardumont added a revision to T4228: scrubber: Investigate the apparent lock (staging): D7965: postgres db: Create guest user at db initialization time.
Jun 7 2022, 2:28 PM · Archive integrity, System administration
ardumont requested review of D7965: postgres db: Create guest user at db initialization time.
Jun 7 2022, 2:28 PM
ardumont updated the task description for T3406: Archive the mathrice gitlab forge.
Jun 7 2022, 11:34 AM · Origin-GitLab, Lister
ardumont committed rSPSITEda77a0fc8867: worker17: Align with worker18 hardware specification 2/2 (authored by ardumont).
worker17: Align with worker18 hardware specification 2/2
Jun 7 2022, 11:27 AM
ardumont updated subscribers of D7913: db: Grant read access to guest user on all tables of the schema.

One remark from @vsellier got my attention, it will break the docker setup.
Maybe @douardda hinted at it as well...

Jun 7 2022, 11:07 AM
ardumont added a comment to T4283: Load https://github.com/chromium/chromium with a higher packfile size limit.

Triggered a run to ingest a fork (extra arguments needed with the cli) on production worker:

Jun 7 2022, 10:59 AM · System administration, Git loader
ardumont committed rSPSITE65f53a80c785: worker17: Align with worker18 hardware specification (authored by ardumont).
worker17: Align with worker18 hardware specification
Jun 7 2022, 10:02 AM
ardumont added a comment to T4283: Load https://github.com/chromium/chromium with a higher packfile size limit.

Success for production worker [1]. Staging worker is still working on it.

Jun 7 2022, 9:25 AM · System administration, Git loader

Jun 3 2022

ardumont updated the summary of D7913: db: Grant read access to guest user on all tables of the schema.
Jun 3 2022, 5:53 PM
ardumont updated the diff for D7913: db: Grant read access to guest user on all tables of the schema.

Adapt according to exchange

Jun 3 2022, 5:53 PM
ardumont updated the summary of D7913: db: Grant read access to guest user on all tables of the schema.
Jun 3 2022, 5:51 PM
ardumont added a comment to D7913: db: Grant read access to guest user on all tables of the schema.

@douardda Any news on how to modify a db template for the tests?

Jun 3 2022, 5:21 PM
ardumont closed D7961: jobs/swh-packages: Bump swh-storage build jobs timeout to 25 minutes.
Jun 3 2022, 5:13 PM
ardumont committed rCJSWH0cf1ebeeeb31: jobs/swh-packages: Bump swh-storage build jobs timeout to 25 minutes (authored by ardumont).
jobs/swh-packages: Bump swh-storage build jobs timeout to 25 minutes
Jun 3 2022, 5:13 PM
ardumont requested review of D7961: jobs/swh-packages: Bump swh-storage build jobs timeout to 25 minutes.
Jun 3 2022, 5:02 PM
ardumont accepted D7960: jobs/swh-packages: Bump swh-web build jobs timeout to 30 minutes.

@vlorentz some impact from stopping the concurrency build part ¯\_(ツ)_/¯

Jun 3 2022, 4:34 PM
ardumont created P1377 non maven typed maven origins listed by full maven lister.
Jun 3 2022, 4:12 PM
ardumont added a comment to T3874: staging: Analyze result of the maven listing and ingestion.

There remain git and other dvcs-typed origins [1] listed by maven, but not github ones [2].

Jun 3 2022, 4:11 PM · Maven loader, Maven lister, Archive coverage
ardumont updated the task description for T4305: Aligh swh backends for migration tools to work.
Jun 3 2022, 3:57 PM · Core & foundations
ardumont closed D7953: Set current_version attribute to postgresql datastore.
Jun 3 2022, 3:53 PM
ardumont committed rDSTOc19f53f194d0: Set current_version attribute to postgresql datastore (authored by ardumont).
Set current_version attribute to postgresql datastore
Jun 3 2022, 3:53 PM
ardumont added a comment to D7953: Set current_version attribute to postgresql datastore.

lgtm (probably needs tedious hand-managed-db-migration-in-docker test)

Jun 3 2022, 3:48 PM
ardumont added a comment to D7947: cypress: Upgrade to version 10.

/me closes his eyes and accepts the diff ¯\_(ツ)_/¯

The upgrade was basically migrating files in three clicks but at least I found a cypress regression (https://github.com/cypress-io/cypress/issues/22054) that was fixed really quickly.

Jun 3 2022, 3:46 PM
ardumont updated the diff for D7953: Set current_version attribute to postgresql datastore.

Fix current expected code version

Jun 3 2022, 3:41 PM
ardumont closed D7954: Set current_version attribute to postgresql datastore.
Jun 3 2022, 3:38 PM
ardumont committed rDCIDX0c5fdc7feba5: Set current_version attribute to postgresql datastore (authored by ardumont).
Set current_version attribute to postgresql datastore
Jun 3 2022, 3:38 PM
ardumont closed D7958: Remove unused get_current_version method.
Jun 3 2022, 3:33 PM
ardumont committed rDSCH0496c3977a56: Remove unused get_current_version method (authored by ardumont).
Remove unused get_current_version method
Jun 3 2022, 3:33 PM
ardumont accepted D7947: cypress: Upgrade to version 10.

/me closes his eyes and accepts the diff ¯\_(ツ)_/¯

Jun 3 2022, 3:29 PM
ardumont updated the test plan for D7953: Set current_version attribute to postgresql datastore.
Jun 3 2022, 3:24 PM
ardumont closed T4232: Listers: Canonicalize listed github origins, a subtask of T3874: staging: Analyze result of the maven listing and ingestion, as Resolved.
Jun 3 2022, 3:19 PM · Maven loader, Maven lister, Archive coverage
ardumont closed T4232: Listers: Canonicalize listed github origins as Resolved.
Jun 3 2022, 3:19 PM · Lister
ardumont closed T3874: staging: Analyze result of the maven listing and ingestion as Resolved.
Jun 3 2022, 3:18 PM · Maven loader, Maven lister, Archive coverage
ardumont closed T3874: staging: Analyze result of the maven listing and ingestion, a subtask of T3746: staging: Deploy maven indexer/lister/loader, as Resolved.
Jun 3 2022, 3:18 PM · Maven loader, Maven lister, System administration, Archive coverage
ardumont updated the task description for T3874: staging: Analyze result of the maven listing and ingestion.
Jun 3 2022, 3:18 PM · Maven loader, Maven lister, Archive coverage
ardumont added a subtask for T3874: staging: Analyze result of the maven listing and ingestion: T4232: Listers: Canonicalize listed github origins.
Jun 3 2022, 3:18 PM · Maven loader, Maven lister, Archive coverage
ardumont added a parent task for T4232: Listers: Canonicalize listed github origins: T3874: staging: Analyze result of the maven listing and ingestion.
Jun 3 2022, 3:18 PM · Lister
ardumont updated the task description for T4232: Listers: Canonicalize listed github origins.
Jun 3 2022, 3:17 PM · Lister
ardumont updated the diff for D7954: Set current_version attribute to postgresql datastore.

Rebase

Jun 3 2022, 3:11 PM
ardumont accepted D7937: Replace RevisionMetadataIndexer with DirectoryMetadataIndexer.
Jun 3 2022, 3:07 PM
ardumont requested review of D7958: Remove unused get_current_version method.
Jun 3 2022, 2:53 PM
ardumont requested review of D7953: Set current_version attribute to postgresql datastore.
Jun 3 2022, 2:51 PM
ardumont updated the task description for T4305: Aligh swh backends for migration tools to work.
Jun 3 2022, 2:50 PM · Core & foundations
ardumont added a revision to T4305: Aligh swh backends for migration tools to work: D7958: Remove unused get_current_version method.
Jun 3 2022, 2:43 PM · Core & foundations
ardumont updated the task description for T4305: Aligh swh backends for migration tools to work.
Jun 3 2022, 2:40 PM · Core & foundations
ardumont added a project to T4305: Aligh swh backends for migration tools to work: Core & foundations.
Jun 3 2022, 2:36 PM · Core & foundations
ardumont updated the diff for D7954: Set current_version attribute to postgresql datastore.

Update requirements

Jun 3 2022, 2:36 PM
ardumont updated the task description for T4305: Aligh swh backends for migration tools to work.
Jun 3 2022, 2:33 PM · Core & foundations
ardumont accepted D7956: Provenance: Docker for origin layer + logging improvement.

one remark inline.

Jun 3 2022, 2:22 PM
ardumont added a comment to T3874: staging: Analyze result of the maven listing and ingestion.

status: triggered 2 full-maven lister runs on maven central and jboss [1]
And no more exotic github urls are popping up [2].

Jun 3 2022, 2:18 PM · Maven loader, Maven lister, Archive coverage
ardumont requested review of D7954: Set current_version attribute to postgresql datastore.
Jun 3 2022, 11:47 AM
ardumont added a revision to T4305: Aligh swh backends for migration tools to work: D7954: Set current_version attribute to postgresql datastore.
Jun 3 2022, 11:40 AM · Core & foundations
ardumont added a revision to T4305: Aligh swh backends for migration tools to work: D7953: Set current_version attribute to postgresql datastore.
Jun 3 2022, 11:37 AM · Core & foundations
ardumont renamed T4305: Aligh swh backends for migration tools to work from Unstuck migration tools for all swh backend module to Aligh swh backends for migration tools to work.
Jun 3 2022, 11:35 AM · Core & foundations
ardumont changed the status of T4305: Aligh swh backends for migration tools to work from Open to Work in Progress.
Jun 3 2022, 11:33 AM · Core & foundations
ardumont triaged T4305: Aligh swh backends for migration tools to work as High priority.
Jun 3 2022, 11:33 AM · Core & foundations
ardumont added a comment to D7949: db.BaseDb: Propose default get_current_version method implementation.

Although some stuff is worth keeping, notably the tests on the missing coverage cli (I'll do it in another diff).

Jun 3 2022, 11:32 AM
ardumont abandoned D7949: db.BaseDb: Propose default get_current_version method implementation.

Not the proper fix

Jun 3 2022, 11:31 AM
ardumont abandoned D7943: Revert "cli.db: Use attribute current_version instead of undeclared getter".

Without this diff, storage and indexer are actually broken, since their datastores are missing the current_version attribute.
But it's simpler not to revert that code (hence abandoning this diff), as it somewhat clarifies what is being looked up.
The simple fix is to add the current_version attribute to those datastores (the one in db.py then becomes unneeded).
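
To illustrate the difference, here is a small sketch assuming the cli reads a plain `current_version` attribute on the datastore. The class names, the version number and the `code_version` helper are hypothetical, not the actual swh code:

```python
class ExampleDatastore:
    """Hypothetical datastore that declares its code version."""

    # Illustrative value; a real backend would set its own schema version.
    current_version = 3


class LegacyDatastore:
    """Hypothetical datastore that never declared the attribute."""


def code_version(datastore):
    """Return the declared version, or None when the attribute is missing."""
    return getattr(datastore, "current_version", None)


if __name__ == "__main__":
    for ds in (ExampleDatastore(), LegacyDatastore()):
        print(type(ds).__name__, code_version(ds))
```

A datastore without the attribute is the broken case described above; declaring the attribute on the class is enough for an attribute-based lookup to work.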

Jun 3 2022, 11:31 AM
ardumont created P1374 prod: loader frequency.
Jun 3 2022, 9:55 AM
ardumont updated the title for P1373 staging scheduling frequency from scheduling frequency to staging scheduling frequency.
Jun 3 2022, 9:54 AM
ardumont created P1373 staging scheduling frequency.
Jun 3 2022, 9:52 AM
ardumont added a comment to T3874: staging: Analyze result of the maven listing and ingestion.

Yesterday, I fixed, diffed, released and pushed the diff [1] fixing the
canonicalization of the remaining exotic urls, cleaned up the 'git' origins (out of a
maven listing) and triggered a new listing. Today, checking those origins again (staging
scheduler), there was still noise which should no longer have been there...

Jun 3 2022, 9:35 AM · Maven loader, Maven lister, Archive coverage

Jun 2 2022

ardumont updated the summary of D7951: Deploy new origin intrinsic metadata journal client indexer.
Jun 2 2022, 6:03 PM
ardumont changed the status of T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1 from Work in Progress to Open.
Jun 2 2022, 6:00 PM · System administration, Indexer, Metadata workflow
ardumont changed the status of T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1, a subtask of T4273: Rewrite indexers as journal clients when relevant, from Work in Progress to Open.
Jun 2 2022, 6:00 PM · Indexer, Metadata workflow
ardumont moved T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1 from deployed/landed/monitoring to Backlog on the System administration board.
Jun 2 2022, 6:00 PM · System administration, Indexer, Metadata workflow
ardumont updated the task description for T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1.
Jun 2 2022, 5:57 PM · System administration, Indexer, Metadata workflow
ardumont requested review of D7951: Deploy new origin intrinsic metadata journal client indexer.
Jun 2 2022, 5:57 PM
ardumont added a revision to T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1: D7951: Deploy new origin intrinsic metadata journal client indexer.
Jun 2 2022, 5:57 PM · System administration, Indexer, Metadata workflow
ardumont added a comment to T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1.

Reverting:

  • Stopping and disabling journal client services [1]
  • D7950: Revert puppet manifest changes
  • scheduler0.staging: deploy manifest changes [2]
  • workers.staging: Deploy manifest changes [3]
  • check everything is back to normal [4]
Jun 2 2022, 5:53 PM · System administration, Indexer, Metadata workflow
ardumont closed D7950: staging: Revert indexer journal client deployment.
Jun 2 2022, 5:47 PM
ardumont committed rSPSITEa5e36dd1cf54: staging: Revert indexer journal client deployment (authored by ardumont).
staging: Revert indexer journal client deployment
Jun 2 2022, 5:47 PM
ardumont requested review of D7950: staging: Revert indexer journal client deployment.
Jun 2 2022, 5:46 PM
ardumont added a revision to T4274: Resolve all known crashes in the metadata indexer: D7950: staging: Revert indexer journal client deployment.
Jun 2 2022, 5:46 PM · Indexer, Metadata workflow
ardumont added a parent task for T4274: Resolve all known crashes in the metadata indexer: T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1.
Jun 2 2022, 5:37 PM · Indexer, Metadata workflow
ardumont added a subtask for T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1: T4274: Resolve all known crashes in the metadata indexer.
Jun 2 2022, 5:37 PM · System administration, Indexer, Metadata workflow
ardumont updated the summary of D7943: Revert "cli.db: Use attribute current_version instead of undeclared getter".
Jun 2 2022, 5:31 PM
ardumont updated the summary of D7949: db.BaseDb: Propose default get_current_version method implementation.
Jun 2 2022, 5:28 PM
ardumont updated the summary of D7949: db.BaseDb: Propose default get_current_version method implementation.
Jun 2 2022, 5:23 PM