
Oct 19 2022

gitlab-migration changed the status of T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1 from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:07 PM · System administration, Indexer, Metadata workflow
gitlab-migration changed the status of T4282: Deploy new origin intrinsic metadata journal client indexer > v1.1, a subtask of T4273: Rewrite indexers as journal clients when relevant, from Resolved to Migrated.
Oct 19 2022, 6:07 PM · Indexer, Metadata workflow
gitlab-migration changed the status of T1488: Tune metadata indexer workers parallelism from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 5:55 PM · System administration, Indexer
gitlab-migration changed the status of T1113: Update streaming replication documentation from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 5:54 PM · Indexer, Web app, System administration
gitlab-migration changed the status of T1113: Update streaming replication documentation, a subtask of T1094: swh-indexer db replica on azure, from Resolved to Migrated.
Oct 19 2022, 5:54 PM · Indexer, Web app, System administration
gitlab-migration changed the status of T1095: indexer: Remove temporary table usage for read-only queries, a subtask of T1094: swh-indexer db replica on azure, from Resolved to Migrated.
Oct 19 2022, 5:54 PM · Indexer, Web app, System administration
gitlab-migration changed the status of T1095: indexer: Remove temporary table usage for read-only queries from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 5:53 PM · Indexer, Web app, System administration
gitlab-migration changed the status of T1094: swh-indexer db replica on azure from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 5:53 PM · Indexer, Web app, System administration
gitlab-migration changed the status of T2603: Configuration mismatch between swh.indexer.journal_client and the configuration declared in puppet from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 5:51 PM · Puppet recipes, Indexer
gitlab-migration changed the status of T872: Deploy and restart indexers, a subtask of T871: Migrate swh-storage api functions relative to indexers to swh-indexer, from Resolved to Migrated.
Oct 19 2022, 5:51 PM · SWORD deposit, Core Loader, Web app, Development environment, Storage manager, Indexer
gitlab-migration changed the status of T872: Deploy and restart indexers from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 5:51 PM · SWORD deposit, Core Loader, Storage manager, Web app, Puppet recipes, Indexer
gitlab-migration changed the status of T579: Puppetize subset of indexers to start indexing computations from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 5:51 PM · Puppet recipes, Indexer, General
gitlab-migration changed the status of T579: Puppetize subset of indexers to start indexing computations, a subtask of T580: Start indexing computations (mimetype, language), from Resolved to Migrated.
Oct 19 2022, 5:51 PM · Indexer, General
gitlab-migration changed the status of T579: Puppetize subset of indexers to start indexing computations, a subtask of T574: Pipeline copy content to azure and then compute multiple indexes independently (meta task), from Resolved to Migrated.
Oct 19 2022, 5:51 PM · Indexer, General

Oct 17 2022

ardumont moved T4606: Deploy swh-indexer v2.7.0 from Backlog to Weekly backlog on the System administration board.
Oct 17 2022, 9:50 AM · System administration, Indexer

Oct 11 2022

vlorentz added a parent task for T4606: Deploy swh-indexer v2.7.0: T4401: Index metadata from the deposit.
Oct 11 2022, 11:29 AM · System administration, Indexer
vlorentz added a subtask for T4401: Index metadata from the deposit: T4606: Deploy swh-indexer v2.7.0.
Oct 11 2022, 11:29 AM · SWORD deposit, Indexer, Metadata workflow
vlorentz closed T4450: Refactor swh-indexer to simplify non-trivial mapping operations as Resolved.
Oct 11 2022, 11:28 AM · Indexer
vlorentz added a parent task for T4606: Deploy swh-indexer v2.7.0: Unknown Object (Maniphest Task).
Oct 11 2022, 11:28 AM · System administration, Indexer

Oct 7 2022

vlorentz updated the task description for T4612: Most indexers are consuming journal topics slower than messages are produced.
Oct 7 2022, 12:08 PM · Indexer, System administration
vlorentz triaged T4612: Most indexers are consuming journal topics slower than messages are produced as Normal priority.
Oct 7 2022, 12:08 PM · Indexer, System administration

Oct 5 2022

vlorentz added a parent task for T4606: Deploy swh-indexer v2.7.0: T4605: Deploy swh-loader-metadata v0.0.3.
Oct 5 2022, 1:07 PM · System administration, Indexer
vlorentz added a subtask for T4605: Deploy swh-loader-metadata v0.0.3: T4606: Deploy swh-indexer v2.7.0.
Oct 5 2022, 1:07 PM · Metadata Loaders, System administration
vlorentz added a comment to T4605: Deploy swh-loader-metadata v0.0.3.

also, this should preferably be deployed after swh-indexer v2.7.0 so we don't need to reset indexer journal consumers to index metadata from Gitea, but not a requirement.

Oct 5 2022, 1:07 PM · Metadata Loaders, System administration
vlorentz added a subtask for T4457: Index metadata from Gitea/Gogs: T4606: Deploy swh-indexer v2.7.0.
Oct 5 2022, 1:06 PM · Origin-Gitea/Gogs, Extrinsic metadata, Indexer
vlorentz added a parent task for T4606: Deploy swh-indexer v2.7.0: T4457: Index metadata from Gitea/Gogs.
Oct 5 2022, 1:06 PM · System administration, Indexer
vlorentz triaged T4606: Deploy swh-indexer v2.7.0 as Normal priority.
Oct 5 2022, 1:05 PM · System administration, Indexer
vlorentz updated the task description for T4605: Deploy swh-loader-metadata v0.0.3.
Oct 5 2022, 1:05 PM · Metadata Loaders, System administration
vlorentz updated the task description for T4605: Deploy swh-loader-metadata v0.0.3.
Oct 5 2022, 1:05 PM · Metadata Loaders, System administration
vlorentz triaged T4605: Deploy swh-loader-metadata v0.0.3 as Normal priority.
Oct 5 2022, 1:04 PM · Metadata Loaders, System administration

Sep 28 2022

ardumont closed T4459: Deploy swh-indexer > v2.6 on staging then production, a subtask of T4392: Metadata Indexer for NuGet (.nuspec), as Resolved.
Sep 28 2022, 7:22 PM · Indexer
ardumont closed T4459: Deploy swh-indexer > v2.6 on staging then production, a subtask of T4401: Index metadata from the deposit, as Resolved.
Sep 28 2022, 7:22 PM · SWORD deposit, Indexer, Metadata workflow
ardumont closed T4459: Deploy swh-indexer > v2.6 on staging then production as Resolved.
Sep 28 2022, 7:22 PM · Indexer, System administration
vlorentz added revisions to T4401: Index metadata from the deposit: D8570: Index extrinsic metadata from the deposit, D8568: codemeta: Fix crash when translating PropertyValue objects from codemeta-in-SWORD.
Sep 28 2022, 7:02 PM · SWORD deposit, Indexer, Metadata workflow
vlorentz closed T4536: Document how swh-indexer uses Codemeta crosswalks as Resolved.
Sep 28 2022, 12:57 PM · Documentation, Indexer

Sep 27 2022

vlorentz added a revision to T4536: Document how swh-indexer uses Codemeta crosswalks: D8549: Make read_crosstable public and document it..
Sep 27 2022, 2:14 PM · Documentation, Indexer

Sep 26 2022

vlorentz closed T4392: Metadata Indexer for NuGet (.nuspec) as Resolved.
Sep 26 2022, 6:30 PM · Indexer

Sep 16 2022

ardumont moved T4459: Deploy swh-indexer > v2.6 on staging then production from code-review/await-feedback/pause to deployed/landed/monitoring on the System administration board.
Sep 16 2022, 6:09 PM · Indexer, System administration

Sep 15 2022

ardumont added a revision to T4459: Deploy swh-indexer > v2.6 on staging then production: D8493: indexer: Use public brokers in production, internal ones for staging.
Sep 15 2022, 6:29 PM · Indexer, System administration
ardumont added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

There's a few issues with the configuration of these indexer clients:
the traffic should not be going through the IPSec VPN. They need to use the public,
authenticated kafka endpoints. The IPSec load is making all azure communication
struggle.

ack, that should be "simple" enough to adapt [1]
[1] https://docs.softwareheritage.org/sysadm/mirror-operations/onboard.html?highlight=credential#how-to-create-the-journal-credentials

Sep 15 2022, 6:15 PM · Indexer, System administration
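
The switch to public, authenticated Kafka endpoints discussed above can be sketched as a small configuration builder. This is only an illustration, not the actual swh-indexer or puppet configuration: the function name, broker host names, and the choice of SASL_SSL with SCRAM-SHA-512 are assumptions; the dictionary keys are standard librdkafka property names.

```python
from typing import Dict, List, Optional


def journal_client_config(
    brokers: List[str],
    group_id: str,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> Dict[str, str]:
    """Build a librdkafka-style consumer configuration (hypothetical helper).

    Without credentials, the client targets internal, unauthenticated brokers.
    With credentials, SASL over TLS is enabled for the public endpoints.
    """
    config = {
        "bootstrap.servers": ",".join(brokers),
        "group.id": group_id,
    }
    if username is not None and password is not None:
        # Assumed authentication scheme; adjust to what the brokers accept.
        config.update(
            {
                "security.protocol": "SASL_SSL",
                "sasl.mechanism": "SCRAM-SHA-512",
                "sasl.username": username,
                "sasl.password": password,
            }
        )
    return config


# Hypothetical host names, for illustration only.
staging = journal_client_config(["journal-internal.example:9092"], "indexer")
production = journal_client_config(
    ["journal-public.example:9093"], "indexer", "indexer-user", "secret"
)
```

Keeping the authentication keys optional lets one code path serve both environments: staging omits credentials, production supplies them.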
ardumont added a revision to T4459: Deploy swh-indexer > v2.6 on staging then production: D8492: indexer: Allow journal client authentication configuration.
Sep 15 2022, 5:49 PM · Indexer, System administration
vlorentz triaged T4536: Document how swh-indexer uses Codemeta crosswalks as Normal priority.
Sep 15 2022, 12:04 PM · Documentation, Indexer

Sep 13 2022

ardumont moved T4459: Deploy swh-indexer > v2.6 on staging then production from deployed/landed/monitoring to code-review/await-feedback/pause on the System administration board.
Sep 13 2022, 3:42 PM · Indexer, System administration
vlorentz added a revision to T4457: Index metadata from Gitea/Gogs: D8460: Add Gitea metadata mapping.
Sep 13 2022, 1:31 PM · Origin-Gitea/Gogs, Extrinsic metadata, Indexer
ardumont added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

There's a few issues with the configuration of these indexer clients:

Sep 13 2022, 10:06 AM · Indexer, System administration

Sep 12 2022

vlorentz added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

I'm guessing that's the extrinsic metadata indexer; others need to do plenty of random access to the storage, but that one consumes very quickly from Kafka. On the bright side, it consumes the entire topic within hours so parallelism could be reduced, as a quick fix

Sep 12 2022, 9:04 PM · Indexer, System administration
vsellier added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

All the indexers were stopped at 20:00 FR because something was consuming all the bandwidth of the VPN between azure and our infra.

root@pergamon:/etc/clustershell# clush -b -w @indexer-workers "puppet agent --disable 'stop indexer to avoid bandwith consumption'"
root@pergamon:/etc/clustershell# clush -b -w @indexer-workers "systemctl stop swh-indexer-journal-client@*"
Sep 12 2022, 8:10 PM · Indexer, System administration
olasd updated subscribers of T4459: Deploy swh-indexer > v2.6 on staging then production.

There's a few issues with the configuration of these indexer clients:

Sep 12 2022, 8:10 PM · Indexer, System administration
ardumont moved T4459: Deploy swh-indexer > v2.6 on staging then production from in-progress to deployed/landed/monitoring on the System administration board.
Sep 12 2022, 6:01 PM · Indexer, System administration
ardumont moved T4459: Deploy swh-indexer > v2.6 on staging then production from Weekly backlog to in-progress on the System administration board.
Sep 12 2022, 6:01 PM · Indexer, System administration
ardumont updated the task description for T4459: Deploy swh-indexer > v2.6 on staging then production.
Sep 12 2022, 6:01 PM · Indexer, System administration
ardumont updated the task description for T4459: Deploy swh-indexer > v2.6 on staging then production.
Sep 12 2022, 5:48 PM · Indexer, System administration
ardumont updated the task description for T4459: Deploy swh-indexer > v2.6 on staging then production.
Sep 12 2022, 5:47 PM · Indexer, System administration
ardumont renamed T4459: Deploy swh-indexer > v2.6 on staging then production from Deploy swh-indexer > v2.5 on production and staging to Deploy swh-indexer > v2.6 on staging then production.
Sep 12 2022, 5:33 PM · Indexer, System administration

Sep 8 2022

ardumont moved T4459: Deploy swh-indexer > v2.6 on staging then production from in-progress to Weekly backlog on the System administration board.
Sep 8 2022, 11:16 AM · Indexer, System administration
ardumont renamed T4459: Deploy swh-indexer > v2.6 on staging then production from Deploy swh-indexer v2.4 on production and staging to Deploy swh-indexer > v2.5 on production and staging.
Sep 8 2022, 11:16 AM · Indexer, System administration

Sep 2 2022

vlorentz triaged T4490: gemspec mapping: Add support for optional parameter as Normal priority.
Sep 2 2022, 2:13 PM · Easy hack, Indexer

Sep 1 2022

vlorentz triaged T4480: NpmMapping: Add basic support of SPDX expressions as Normal priority.
Sep 1 2022, 4:59 PM · Indexer, Easy hack
ardumont updated the task description for T4459: Deploy swh-indexer > v2.6 on staging then production.
Sep 1 2022, 11:50 AM · Indexer, System administration

Aug 31 2022

vlorentz added revisions to T4459: Deploy swh-indexer > v2.6 on staging then production: D8372: base: Filter out empty URIs so PyLD does not crash, D8373: Filter out more invalid URIs that make PyLD crash.
Aug 31 2022, 9:14 PM · Indexer, System administration
ardumont closed T4477: staging origin intrinsic metadata indexer are stuck as Resolved.
Aug 31 2022, 6:49 PM · Indexer, System administration
ardumont closed T4477: staging origin intrinsic metadata indexer are stuck, a subtask of T4459: Deploy swh-indexer > v2.6 on staging then production, as Resolved.
Aug 31 2022, 6:49 PM · Indexer, System administration
ardumont moved T4477: staging origin intrinsic metadata indexer are stuck from in-progress to deployed/landed/monitoring on the System administration board.
Aug 31 2022, 6:49 PM · Indexer, System administration
ardumont changed the status of T4477: staging origin intrinsic metadata indexer are stuck, a subtask of T4459: Deploy swh-indexer > v2.6 on staging then production, from Open to Work in Progress.
Aug 31 2022, 6:49 PM · Indexer, System administration
ardumont changed the status of T4477: staging origin intrinsic metadata indexer are stuck from Open to Work in Progress.
Aug 31 2022, 6:49 PM · Indexer, System administration
vsellier added a revision to T4477: staging origin intrinsic metadata indexer are stuck: D8371: staging: Increase the number of workers for storage and indexer storage.
Aug 31 2022, 6:19 PM · Indexer, System administration
ardumont added a revision to T4477: staging origin intrinsic metadata indexer are stuck: D8370: staging intrinsic metadata indexer: Declare batch size to 100.
Aug 31 2022, 6:17 PM · Indexer, System administration
ardumont added a comment to T4477: staging origin intrinsic metadata indexer are stuck.

The lag is subsiding now, slowly because only 1 journal client:

Aug 31 2022, 6:09 PM · Indexer, System administration
ardumont updated subscribers of T4477: staging origin intrinsic metadata indexer are stuck.

After further investigation w/ @vsellier, it's also related to our storage and indexer storage having too few gunicorn workers serving the journal clients (among other things).
So decreasing the batch size to something like 100 and fixing that should fairly help ^.

Aug 31 2022, 6:06 PM · Indexer, System administration
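
The effect of the batch size mentioned above can be sketched as follows. This is a minimal illustration, not the swh-indexer journal client: `batched` and `process` are hypothetical helpers, and `handle_batch` stands in for whatever work the indexer does per batch (for example, storage lookups).

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(messages: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size messages."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(messages)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process(
    messages: Iterable[T],
    batch_size: int,
    handle_batch: Callable[[List[T]], None],
) -> int:
    """Hand messages to handle_batch in bounded chunks; return the batch count."""
    count = 0
    for batch in batched(messages, batch_size):
        # A smaller batch means each backend request stays short, so a few
        # server workers are not monopolized by one large request.
        handle_batch(batch)
        count += 1
    return count


sizes: List[int] = []
batch_count = process(range(250), 100, lambda batch: sizes.append(len(batch)))
```

With 250 messages and a batch size of 100, the handler is called three times, with batches of 100, 100, and 50 messages.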
ardumont added a revision to T4477: staging origin intrinsic metadata indexer are stuck: D8369: indexer.cli: Allow batch_size configuration on journal client.
Aug 31 2022, 5:20 PM · Indexer, System administration
ardumont added a comment to T4477: staging origin intrinsic metadata indexer are stuck.

Activating debug log [1]

Aug 31 2022, 5:18 PM · Indexer, System administration
ardumont updated the task description for T4477: staging origin intrinsic metadata indexer are stuck.
Aug 31 2022, 4:43 PM · Indexer, System administration
ardumont triaged T4477: staging origin intrinsic metadata indexer are stuck as Normal priority.
Aug 31 2022, 4:41 PM · Indexer, System administration
ardumont renamed T4459: Deploy swh-indexer > v2.6 on staging then production from Deploy swh-indexer v2.4.2 on production and staging to Deploy swh-indexer v2.4 on production and staging.
Aug 31 2022, 4:39 PM · Indexer, System administration
ardumont added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

@vlorentz fixed another error directly within the model to deal with old versioned objects out of the model.
This meant a new release for swh.model and swh.indexer.

Unfortunately, now the indexer debian build is broken due to the objstorage debian build being broken...

objstorage build unstuck [1]
Triggered back the build for indexer.

[1] https://jenkins.softwareheritage.org/view/swh-debian%20(draft)/job/debian/job/packages/job/DOBJS/job/gbp-buildpackage/

Aug 31 2022, 3:31 PM · Indexer, System administration
vlorentz added revisions to T4459: Deploy swh-indexer > v2.6 on staging then production: D8351: Revert "metadata: Drop unsupported key 'type'", D8350: Add support for old dicts in RawExtrinsicMetadata.from_dict.
Aug 31 2022, 3:00 PM · Indexer, System administration
ardumont added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

@vlorentz fixed another error directly within the model to deal with old versioned objects out of the model.
This meant a new release for swh.model and swh.indexer.

Unfortunately, now the indexer debian build is broken due to the objstorage debian build being broken...

Aug 31 2022, 2:54 PM · Indexer, System administration
ardumont added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

@vlorentz fixed another error directly within the model to deal with old versioned objects.
This meant a new release for swh.model and swh.indexer.

Aug 31 2022, 2:38 PM · Indexer, System administration

Aug 30 2022

ardumont added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

Dropped the debian constraint on python3-rdflib, triggered a rebuild, and upgraded the
package again. Added an unconditional dependency on python3-rdflib-jsonld (which is not
useful on the latest debian release, but is not a blocker either).

Aug 30 2022, 4:11 PM · Indexer, System administration
ardumont added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

Workers refuse to upgrade to the current 2.4.3 version [1]. I did not realize my previous
upgrade from yesterday stopped at v2.3.0.

Aug 30 2022, 2:38 PM · Indexer, System administration
ardumont raised the priority of T4459: Deploy swh-indexer > v2.6 on staging then production from Low to Normal.
Aug 30 2022, 11:36 AM · Indexer, System administration
ardumont added a revision to T4459: Deploy swh-indexer > v2.6 on staging then production: D8340: metadata: Drop unsupported key 'type'.
Aug 30 2022, 10:14 AM · Indexer, System administration
ardumont added a comment to T4459: Deploy swh-indexer > v2.6 on staging then production.

Consumer lag has been steadily increasing since yesterday [1]. I believe workers are hit by issue [2].
I've opened [3] to try and unstick it.

Aug 30 2022, 10:08 AM · Indexer, System administration
vlorentz closed T4277: Deal with null characters in the output of the metadata indexer, a subtask of T4274: Resolve all known crashes in the metadata indexer, as Resolved.
Aug 30 2022, 10:04 AM · Indexer, Metadata workflow
vlorentz closed T4277: Deal with null characters in the output of the metadata indexer as Resolved.
Aug 30 2022, 10:04 AM · Indexer, Metadata workflow
ardumont closed T4429: Deploy swh-indexer v2.3.0 on production and staging, a subtask of T4277: Deal with null characters in the output of the metadata indexer, as Resolved.
Aug 30 2022, 10:01 AM · Indexer, Metadata workflow
ardumont closed T4429: Deploy swh-indexer v2.3.0 on production and staging, a subtask of T4459: Deploy swh-indexer > v2.6 on staging then production, as Resolved.
Aug 30 2022, 10:01 AM · Indexer, System administration
ardumont closed T4429: Deploy swh-indexer v2.3.0 on production and staging as Resolved.

closing as T4459 will take care of this

Aug 30 2022, 10:01 AM · System administration, Indexer

Aug 29 2022

ardumont updated the task description for T4459: Deploy swh-indexer > v2.6 on staging then production.
Aug 29 2022, 5:27 PM · Indexer, System administration
ardumont changed the status of T4459: Deploy swh-indexer > v2.6 on staging then production, a subtask of T4392: Metadata Indexer for NuGet (.nuspec), from Open to Work in Progress.
Aug 29 2022, 4:35 PM · Indexer
ardumont changed the status of T4459: Deploy swh-indexer > v2.6 on staging then production, a subtask of T4401: Index metadata from the deposit, from Open to Work in Progress.
Aug 29 2022, 4:35 PM · SWORD deposit, Indexer, Metadata workflow
ardumont changed the status of T4459: Deploy swh-indexer > v2.6 on staging then production from Open to Work in Progress.
Aug 29 2022, 4:35 PM · Indexer, System administration
ardumont updated the task description for T4459: Deploy swh-indexer > v2.6 on staging then production.
Aug 29 2022, 4:35 PM · Indexer, System administration

Aug 25 2022

vlorentz added a parent task for T4459: Deploy swh-indexer > v2.6 on staging then production: T4392: Metadata Indexer for NuGet (.nuspec).
Aug 25 2022, 3:00 PM · Indexer, System administration
vlorentz added a subtask for T4392: Metadata Indexer for NuGet (.nuspec): T4459: Deploy swh-indexer > v2.6 on staging then production.
Aug 25 2022, 3:00 PM · Indexer
vlorentz added a parent task for T4459: Deploy swh-indexer > v2.6 on staging then production: T4401: Index metadata from the deposit.
Aug 25 2022, 3:00 PM · Indexer, System administration
vlorentz edited subtasks for T4401: Index metadata from the deposit, added: T4459: Deploy swh-indexer > v2.6 on staging then production; removed: Restricted Maniphest Task.
Aug 25 2022, 3:00 PM · SWORD deposit, Indexer, Metadata workflow
vlorentz updated the task description for T4459: Deploy swh-indexer > v2.6 on staging then production.
Aug 25 2022, 2:58 PM · Indexer, System administration
vlorentz added a parent task for T4429: Deploy swh-indexer v2.3.0 on production and staging: T4459: Deploy swh-indexer > v2.6 on staging then production.
Aug 25 2022, 2:57 PM · System administration, Indexer
vlorentz added a subtask for T4459: Deploy swh-indexer > v2.6 on staging then production: T4429: Deploy swh-indexer v2.3.0 on production and staging.
Aug 25 2022, 2:57 PM · Indexer, System administration