Apr 2 2021

vsellier added a comment to T3194: Upgrade opnsense firewalls from 20.7.4 to 21.1.4.

After solving the problem, the upgrade was pretty smooth. The firewall performs the following steps:

  • upgrade to the last current minor version of the current major branch
  • upgrade to the first minor version of the next major branch
  • upgrade to the last minor version of the current major branch
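
The three steps can be sketched as an ordered upgrade path. This is only an illustration: `upgrade_path` is a hypothetical helper, the third step is read as "last minor version of the new major branch" (21.1.4 in the task title), and the two intermediate version strings are placeholders, not the versions actually installed.

```python
def upgrade_path(current, last_minor_of_current, first_minor_of_next, target):
    """Return the ordered list of versions to install, skipping no-op steps."""
    path = []
    for version in (last_minor_of_current, first_minor_of_next, target):
        # Skip a step when the firewall is already at that version.
        if version != current and version not in path:
            path.append(version)
    return path


# 20.7.4 and 21.1.4 come from the task title; the two middle values are placeholders.
steps = upgrade_path("20.7.4", "20.7.LAST", "21.1.FIRST", "21.1.4")
print(" -> ".join(["20.7.4", *steps]))
```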
Apr 2 2021, 10:29 AM · System administration
vsellier accepted D5406: prod/webapp: Deploy new production_db configuration.

lgtm

Apr 2 2021, 10:17 AM
vsellier added a comment to T3194: Upgrade opnsense firewalls from 20.7.4 to 21.1.4.

Before starting the upgrade, we discovered 2 problems we had to fix:

  1. The backup had no access to the internet, which blocked the upgrade
  2. The master/backup switch was not working for 4 of the 8 VIPs
Apr 2 2021, 10:07 AM · System administration

Apr 1 2021

vsellier changed the status of T3194: Upgrade opnsense firewalls from 20.7.4 to 21.1.4 from Open to Work in Progress.
Apr 1 2021, 2:35 PM · System administration
vsellier closed T3190: counters: Error during directory topic ingestion, a subtask of T2912: Next generation archive counters, as Resolved.
Apr 1 2021, 2:18 PM · Roadmap 2021, System administration, Monitoring, Web app
vsellier closed T3190: counters: Error during directory topic ingestion as Resolved.
Apr 1 2021, 2:18 PM · System administration, Monitoring
vsellier closed D5399: counters: allow to consume big messages of the directory topic.
Apr 1 2021, 12:54 PM
vsellier committed rSPSITE2c2e7ed2403f: counters: allow to consume big messages of the directory topic (authored by vsellier).
counters: allow to consume big messages of the directory topic
Apr 1 2021, 12:54 PM
vsellier added a comment to D5398: postgresql/client: Fix redundant user entry setup.

Isn't the user used for the creation of the pgpass file?

Apr 1 2021, 12:49 PM
vsellier committed rSENV5bcf7c1cc9b3: Update octocatalog-diff facts (authored by vsellier).
Update octocatalog-diff facts
Apr 1 2021, 12:37 PM
vsellier requested review of D5399: counters: allow to consume big messages of the directory topic.
Apr 1 2021, 12:37 PM
vsellier added a revision to T3190: counters: Error during directory topic ingestion: D5399: counters: allow to consume big messages of the directory topic.
Apr 1 2021, 12:37 PM · System administration, Monitoring
vsellier added a comment to T3190: counters: Error during directory topic ingestion.

An improvement of the journal client is necessary to add the support of this configuration like for the producer:

Do you need such an improvement though? According to the code you linked, you could pass a
producer_config dict with that key and value.

Apr 1 2021, 12:31 PM · System administration, Monitoring
vsellier added a comment to T3191: journal-client: Add support of max message size configuration.

The journal client supports dynamic configuration via kwargs, so there is no need to improve it.
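
A minimal sketch of what "dynamic configuration via kwargs" can look like, assuming extra keyword arguments are simply merged into the Kafka consumer settings. `build_consumer_config`, the broker address and the group id are hypothetical and not taken from the journal client code; only the `message.max.bytes` value comes from T3190.

```python
def build_consumer_config(brokers, group_id, **kwargs):
    """Merge extra keyword arguments into a base Kafka consumer configuration."""
    config = {
        "bootstrap.servers": ",".join(brokers),
        "group.id": group_id,
    }
    # Extra settings override or extend the defaults.
    config.update(kwargs)
    return config


# Dotted property names are not valid identifiers, so they are passed by unpacking a dict.
extra = {"message.max.bytes": 500 * 1024 * 1024}
config = build_consumer_config(["kafka1:9092"], "example-group", **extra)
print(config["message.max.bytes"])  # 524288000
```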

Apr 1 2021, 12:11 PM · Journal
vsellier closed T3191: journal-client: Add support of max message size configuration as Invalid.
Apr 1 2021, 12:11 PM · Journal
vsellier changed the status of T3191: journal-client: Add support of max message size configuration from Open to Work in Progress.
Apr 1 2021, 11:52 AM · Journal
vsellier added a comment to T3190: counters: Error during directory topic ingestion.

It seems the problem is not present anymore with a higher max message size ('500 * 1024 * 1024').

Apr 1 2021, 11:35 AM · System administration, Monitoring
vsellier added a comment to T3190: counters: Error during directory topic ingestion.

For the record, increasing the property message.max.bytes to 100 * 1024 * 1024 in the consumer configuration does not solve the problem.

Apr 1 2021, 10:32 AM · System administration, Monitoring
vsellier added a comment to T3190: counters: Error during directory topic ingestion.

The same problem occurred during the POC; these messages were ignored by using this consumer configuration: "errors.tolerance": 'all' [1].
I will try to find out if there is a more elegant way to deal with this issue ;)

Apr 1 2021, 10:03 AM · System administration, Monitoring
vsellier updated the task description for T3190: counters: Error during directory topic ingestion.
Apr 1 2021, 9:46 AM · System administration, Monitoring
vsellier changed the status of T3190: counters: Error during directory topic ingestion from Open to Work in Progress.
Apr 1 2021, 9:38 AM · System administration, Monitoring

Mar 31 2021

vsellier updated subscribers of T3041: [production] Provision enough space for the search ES cluster to ingest all intrinsic metadata.

After talking with @rdicosmo, we finally chose to replace, on each server, the 4 x 2.4 TB HDDs with 6 x 1.9 TB SSDs to be sure we will have good performance and enough space for the future.
The quote will now be sent to the purchasing service according to the usual procedure [1].

Mar 31 2021, 3:21 PM · System administration, Archive search

Mar 30 2021

vsellier added a project to T3143: Migrate revision metadata to extid in the storage: System administration.
Mar 30 2021, 5:26 PM · System administration, Storage manager, Core Loader
vsellier added a comment to T3041: [production] Provision enough space for the search ES cluster to ingest all intrinsic metadata.

Final quotation sent for approval.
The details are:
3 PowerEdge R6515 (1u) with per server:

  • 10 disks enclosure
  • BOSS controller with 2 x 240 GB cards (for system)
  • 4 SAS 2.5" 10k 2.4 TB disks
  • SFP+ network card
  • 2 SFP cables
  • 2 power supplies with their cables
  • iDRAC Enterprise
  • Rack mount rails with cable management
Mar 30 2021, 2:53 PM · System administration, Archive search
vsellier accepted D5383: sys-info: Rework the how-to deploy to add details on architecture.

lgtm

Mar 30 2021, 2:14 PM
vsellier closed T3188: staging/journal: create douardda credentials as Resolved.

credentials sent by PM

Mar 30 2021, 12:52 PM · System administration
vsellier updated the task description for T3188: staging/journal: create douardda credentials.
Mar 30 2021, 12:44 PM · System administration
vsellier added a comment to T3188: staging/journal: create douardda credentials.
  • unprivileged user :
username=swh-douardda
password=XXXXX
Mar 30 2021, 12:43 PM · System administration
vsellier changed the status of T3188: staging/journal: create douardda credentials from Open to Work in Progress.
Mar 30 2021, 12:19 PM · System administration
vsellier triaged T3188: staging/journal: create douardda credentials as Normal priority.
Mar 30 2021, 12:18 PM · System administration
vsellier closed D5377: network: Remove unecessary route between internal network and VLAN1300.
Mar 30 2021, 10:00 AM
vsellier committed rSPSITE6935f3532507: network: Remove unecessary route between internal network and VLAN1300 (authored by vsellier).
network: Remove unecessary route between internal network and VLAN1300
Mar 30 2021, 10:00 AM
vsellier updated the diff for D5377: network: Remove unecessary route between internal network and VLAN1300.

rebase

Mar 30 2021, 9:59 AM

Mar 29 2021

vsellier requested review of D5377: network: Remove unecessary route between internal network and VLAN1300.
Mar 29 2021, 5:58 PM

Mar 26 2021

vsellier committed rDSNIP64291f923a5a: explicit a couple of relations (authored by vsellier).
explicit a couple of relations
Mar 26 2021, 2:37 PM
vsellier added a comment to T3165: Generate historical data from the new counters series.

The final counters architecture looks like this with this improvement:

Mar 26 2021, 12:38 PM · System administration, Monitoring
vsellier committed rDSNIPcf77331a0f8f: Upgrade counters architecture to handle the historical data management (authored by vsellier).
Upgrade counters architecture to handle the historical data management
Mar 26 2021, 12:33 PM
vsellier added a comment to T3165: Generate historical data from the new counters series.

An improvement idea came to me during the refactoring: the script can be split and integrated into the 'swh-counters' codebase.

Mar 26 2021, 11:40 AM · System administration, Monitoring
vsellier accepted D5343: docs/sys-info: Update deployment documentation.

lgtm

Mar 26 2021, 10:28 AM
vsellier added inline comments to D5343: docs/sys-info: Update deployment documentation.
Mar 26 2021, 10:23 AM
vsellier accepted D5342: docs/sys-info: Update information and rework sentence phrasing.

lgtm

Mar 26 2021, 10:17 AM

Mar 25 2021

vsellier added inline comments to D5342: docs/sys-info: Update information and rework sentence phrasing.
Mar 25 2021, 6:52 PM
vsellier requested changes to D5342: docs/sys-info: Update information and rework sentence phrasing.
Mar 25 2021, 6:51 PM
vsellier accepted D5341: docs: Unify doc and git READMEs.

LGTM

Mar 25 2021, 5:58 PM
vsellier closed T3175: Prepare production environment as Resolved.

Node counters1.internal.softwareheritage.org was deployed with terraform. The inventory section was created accordingly [1].
The journal_client is running.

Mar 25 2021, 5:36 PM · Roadmap 2021, System administration, Monitoring
vsellier closed T3175: Prepare production environment, a subtask of T2912: Next generation archive counters, as Resolved.
Mar 25 2021, 5:36 PM · Roadmap 2021, System administration, Monitoring, Web app
vsellier committed rSPRE17a1de991af9: production: add counters1 node (authored by vsellier).
production: add counters1 node
Mar 25 2021, 5:34 PM
vsellier closed D5338: counters: Declare production node.
Mar 25 2021, 3:57 PM
vsellier committed rSPSITE83ab9664ca65: counters: Declare production node (authored by vsellier).
counters: Declare production node
Mar 25 2021, 3:57 PM
vsellier requested review of D5338: counters: Declare production node.
Mar 25 2021, 3:33 PM
vsellier added a revision to T3175: Prepare production environment: D5338: counters: Declare production node.
Mar 25 2021, 3:33 PM · Roadmap 2021, System administration, Monitoring
vsellier committed rSENV02e36ac6de2b: vagrant: add prod-counters1 vm (authored by vsellier).
vagrant: add prod-counters1 vm
Mar 25 2021, 3:31 PM
vsellier changed the status of T3175: Prepare production environment from Open to Work in Progress.
Mar 25 2021, 2:52 PM · Roadmap 2021, System administration, Monitoring
vsellier changed the status of T3165: Generate historical data from the new counters series, a subtask of T2912: Next generation archive counters, from Open to Work in Progress.
Mar 25 2021, 2:38 PM · Roadmap 2021, System administration, Monitoring, Web app
vsellier changed the status of T3165: Generate historical data from the new counters series from Open to Work in Progress.
Mar 25 2021, 2:38 PM · System administration, Monitoring
vsellier closed T3164: Expose counters in prometheus format as Resolved.
Mar 25 2021, 2:38 PM · System administration, Monitoring
vsellier closed T3164: Expose counters in prometheus format, a subtask of T2912: Next generation archive counters, as Resolved.
Mar 25 2021, 2:38 PM · Roadmap 2021, System administration, Monitoring, Web app
vsellier accepted D5334: README: Update description.
Mar 25 2021, 2:16 PM
vsellier closed D5332: counters: count objects from more topics.
Mar 25 2021, 12:30 PM
vsellier committed rSPSITE9558e50ce902: counters: count objects from more topics (authored by vsellier).
counters: count objects from more topics
Mar 25 2021, 12:30 PM
vsellier added a comment to D5332: counters: count objects from more topics.

:)
thanks

Mar 25 2021, 12:29 PM
vsellier added a comment to T3164: Expose counters in prometheus format.

The counters are now exposed through a /metrics endpoint and ingested by prometheus.
They are well tagged per environment so we will be able to isolate the counters for each one:
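
As an illustration of per-environment tagging, here is a sketch that renders counters in the Prometheus text exposition format. The series name `swh_archive_object_count` is the one proposed in T3164; the label names `environment` and `object_type`, the helper `render_metrics` and the sample values are assumptions, not the actual swh-counters output.

```python
def render_metrics(counters, environment, metric="swh_archive_object_count"):
    """Render a mapping of object type -> count as Prometheus text exposition lines."""
    lines = [f"# TYPE {metric} gauge"]
    for object_type, value in sorted(counters.items()):
        # One sample per object type, tagged with the environment it comes from.
        labels = f'environment="{environment}",object_type="{object_type}"'
        lines.append(f"{metric}{{{labels}}} {value}")
    return "\n".join(lines) + "\n"


# Made-up values, for illustration only.
sample = render_metrics({"content": 10, "directory": 3}, "staging")
print(sample, end="")
```

With such labels, a query can select one environment by filtering on the `environment` label.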

Mar 25 2021, 12:27 PM · System administration, Monitoring
vsellier added a revision to T3164: Expose counters in prometheus format: D5332: counters: count objects from more topics.
Mar 25 2021, 12:14 PM · System administration, Monitoring
vsellier requested review of D5332: counters: count objects from more topics.
Mar 25 2021, 12:14 PM
vsellier closed D5329: counters: fix wrong service port in prometheus job.
Mar 25 2021, 11:34 AM
vsellier committed rSPSITE15c5ae96f603: counters: fix wrong service port in prometheus job (authored by vsellier).
counters: fix wrong service port in prometheus job
Mar 25 2021, 11:34 AM
vsellier requested review of D5329: counters: fix wrong service port in prometheus job.
Mar 25 2021, 11:30 AM
vsellier closed D5324: counters: add a prometheus job to read the new metrics end-point.
Mar 25 2021, 9:27 AM
vsellier committed rSPSITE5db040600da7: counters: add a prometheus job to read the new metrics end-point (authored by vsellier).
counters: add a prometheus job to read the new metrics end-point
Mar 25 2021, 9:27 AM

Mar 24 2021

vsellier added a revision to T3164: Expose counters in prometheus format: D5324: counters: add a prometheus job to read the new metrics end-point.
Mar 24 2021, 6:42 PM · System administration, Monitoring
vsellier requested review of D5324: counters: add a prometheus job to read the new metrics end-point.
Mar 24 2021, 6:42 PM
vsellier committed rSENVf8e4410fec19: Update octocatalog-diff facts (authored by vsellier).
Update octocatalog-diff facts
Mar 24 2021, 6:36 PM
vsellier closed D5321: Allow prometheus to retrieve the counter values.
Mar 24 2021, 6:23 PM
vsellier committed rDCNT7dbe186b010a: Allow prometheus to retrieve the counter values (authored by vsellier).
Allow prometheus to retrieve the counter values
Mar 24 2021, 6:23 PM
vsellier committed rDENV3a7d8fdb9877: docker: Configure prometheus to retrieve swh-counters metrics (authored by vsellier).
docker: Configure prometheus to retrieve swh-counters metrics
Mar 24 2021, 6:21 PM
vsellier closed D5322: docker: Configure prometheus to retrieve swh-counters metrics.
Mar 24 2021, 6:21 PM
vsellier accepted D5323: keycloak/deposit: Drop option direct_grant_flow.

LGTM

Mar 24 2021, 6:20 PM
vsellier added a revision to T3164: Expose counters in prometheus format: D5322: docker: Configure prometheus to retrieve swh-counters metrics.
Mar 24 2021, 5:45 PM · System administration, Monitoring
vsellier requested review of D5322: docker: Configure prometheus to retrieve swh-counters metrics.
Mar 24 2021, 5:45 PM
vsellier requested review of D5321: Allow prometheus to retrieve the counter values.
Mar 24 2021, 5:33 PM
vsellier added a revision to T3164: Expose counters in prometheus format: D5321: Allow prometheus to retrieve the counter values.
Mar 24 2021, 5:32 PM · System administration, Monitoring
vsellier accepted D5317: Deploy memcached on deposit instance.

lgtm

Mar 24 2021, 10:39 AM
vsellier added a comment to T3164: Expose counters in prometheus format.

The current series where the counters are stored is named sql_swh_archive_object_count; the series for swh-counters could be swh_archive_object_count.

Mar 24 2021, 10:28 AM · System administration, Monitoring
vsellier committed rSENV160a357c4621: vagrant: declare moma and its certificates (authored by vsellier).
vagrant: declare moma and its certificates
Mar 24 2021, 10:25 AM
vsellier changed the status of T3164: Expose counters in prometheus format, a subtask of T2912: Next generation archive counters, from Open to Work in Progress.
Mar 24 2021, 10:08 AM · Roadmap 2021, System administration, Monitoring, Web app
vsellier changed the status of T3164: Expose counters in prometheus format from Open to Work in Progress.
Mar 24 2021, 10:08 AM · System administration, Monitoring
vsellier closed T3086: Prepare disk replacement on granet as Resolved.

The 2 remaining disks were inserted in place of the 2 old sdd and sdf.
They needed to be configured in JBOD mode:

root@granet:~# megacli -PDMakeJBOD  -physdrv[32:3] -a0
Mar 24 2021, 10:06 AM · System administration

Mar 22 2021

vsellier removed a project from T3165: Generate historical data from the new counters series: Web app.
Mar 22 2021, 6:31 PM · System administration, Monitoring
vsellier triaged T3165: Generate historical data from the new counters series as Normal priority.
Mar 22 2021, 6:31 PM · System administration, Monitoring
vsellier triaged T3164: Expose counters in prometheus format as Normal priority.
Mar 22 2021, 5:50 PM · System administration, Monitoring
vsellier closed T3159: Deploy swh-counters:v0.1.0 in staging, a subtask of T2912: Next generation archive counters, as Resolved.
Mar 22 2021, 5:34 PM · Roadmap 2021, System administration, Monitoring, Web app
vsellier closed T3159: Deploy swh-counters:v0.1.0 in staging as Resolved.

A new vm counters0.internal.staging.swh.network is deployed and hosting redis, swh-counters and its journal-client.
The lag in staging will be recovered in a couple of hours.

Mar 22 2021, 5:34 PM · Staging environment, System administration, Monitoring
vsellier closed D5297: staging: Add counters0 vm.
Mar 22 2021, 5:09 PM
vsellier committed rSPREba5211fafa29: staging: Add counters0 vm (authored by vsellier).
staging: Add counters0 vm
Mar 22 2021, 5:09 PM
vsellier committed rSPSITE221db263a3f1: counters: fix the journal client configuration (authored by vsellier).
counters: fix the journal client configuration
Mar 22 2021, 4:06 PM
vsellier requested review of D5297: staging: Add counters0 vm.
Mar 22 2021, 3:40 PM
vsellier added a revision to T3159: Deploy swh-counters:v0.1.0 in staging: D5297: staging: Add counters0 vm.
Mar 22 2021, 3:40 PM · Staging environment, System administration, Monitoring
vsellier closed D5296: Add swh-counters deployment configuration.
Mar 22 2021, 11:38 AM
vsellier committed rSPSITE1618407da8f1: Add swh-counters deployment configuration (authored by vsellier).
Add swh-counters deployment configuration
Mar 22 2021, 11:38 AM
vsellier added inline comments to D5296: Add swh-counters deployment configuration.
Mar 22 2021, 9:47 AM