Jul 11 2022

vsellier committed rSENV333381e88e93: Declare the cassandra nodes (authored by vsellier).
Declare the cassandra nodes
Jul 11 2022, 2:27 PM
vsellier committed rSPSITE02794d0df963: Install zfs and docker on the cassandra node to prepare the cass operator tests (authored by vsellier).
Install zfs and docker on the cassandra node to prepare the cass operator tests
Jul 11 2022, 2:19 PM
vsellier closed D8105: Install zfs and docker on the cassandra node to prepare the cass operator tests.
Jul 11 2022, 2:19 PM
vsellier added inline comments to D8105: Install zfs and docker on the cassandra node to prepare the cass operator tests.
Jul 11 2022, 2:18 PM
vsellier updated the diff for D8105: Install zfs and docker on the cassandra node to prepare the cass operator tests.

update dns configuration to use pergamon directly

Jul 11 2022, 2:13 PM
vsellier committed R260:ae5601358904: fake release (authored by vsellier).
fake release
Jul 11 2022, 12:08 PM
vsellier committed R260:c9a8881ffd16: bootstrap environment's values (authored by vsellier).
bootstrap environment's values
Jul 11 2022, 12:08 PM
vsellier updated the task description for T4387: Scrubber processes getting killed by OOM killer.
Jul 11 2022, 9:47 AM · System administration, Datastore Scrubber
vsellier updated the task description for T4387: Scrubber processes getting killed by OOM killer.
Jul 11 2022, 9:46 AM · System administration, Datastore Scrubber
vsellier created T4387: Scrubber processes getting killed by OOM killer.
Jul 11 2022, 9:43 AM · System administration, Datastore Scrubber
vsellier requested review of D8105: Install zfs and docker on the cassandra node to prepare the cass operator tests.
Jul 11 2022, 9:33 AM
vsellier added a revision to T4373: [cassandra] Test the new hardware: D8105: Install zfs and docker on the cassandra node to prepare the cass operator tests.
Jul 11 2022, 9:33 AM · Storage manager, System administration

Jul 7 2022

vsellier added a comment to T4379: [cassandra] create etcd / controlplane servers.

The management nodes were created correctly, but Rancher seems to be having some issues registering them in the cluster.

Jul 7 2022, 6:52 PM · Storage manager, System administration
vsellier closed T4359: Update rancher cluster to kubernetes 1.22 as Resolved.

The Kubernetes upgrade was launched through the Azure portal (it's also possible to trigger it with the az command line).
Everything looks fine:

  • The creation of a new node running version 1.22.6 was triggered
kubectl get pods -o wide; echo; kubectl get nodes -o wide
NAME                               READY   STATUS    RESTARTS      AGE   IP            NODE                              NOMINATED NODE   READINESS GATES
debian                             1/1     Running   1 (23m ago)   27m   10.244.0.63   aks-default-36212332-vmss000000   <none>           <none>
rancher-59f4c74c6f-5vlq6           1/1     Running   0             91m   10.244.0.59   aks-default-36212332-vmss000000   <none>           <none>
rancher-59f4c74c6f-92txx           1/1     Running   0             90m   10.244.0.60   aks-default-36212332-vmss000000   <none>           <none>
rancher-59f4c74c6f-cfshs           1/1     Running   0             91m   10.244.0.58   aks-default-36212332-vmss000000   <none>           <none>
rancher-webhook-6958cfcddf-2gjwn   1/1     Running   0             85d   10.244.0.26   aks-default-36212332-vmss000000   <none>           <none>
Jul 7 2022, 6:37 PM · System administration
vsellier closed T4359: Update rancher cluster to kubernetes 1.22, a subtask of T4358: Upgrade AKS versions, as Resolved.
Jul 7 2022, 6:37 PM · System administration
vsellier changed the status of T4359: Update rancher cluster to kubernetes 1.22 from Open to Work in Progress.
Jul 7 2022, 6:22 PM · System administration
vsellier changed the status of T4359: Update rancher cluster to kubernetes 1.22, a subtask of T4358: Upgrade AKS versions, from Open to Work in Progress.
Jul 7 2022, 6:22 PM · System administration
vsellier updated the diff for D8094: Declare the kubernetes cluster and management nodes for cassandra.

rebase

Jul 7 2022, 2:56 PM
vsellier accepted D8089: Provision thanos query node.
Jul 7 2022, 2:51 PM
vsellier requested review of D8094: Declare the kubernetes cluster and management nodes for cassandra.
Jul 7 2022, 12:11 PM
vsellier added a revision to T4379: [cassandra] create etcd / controlplane servers: D8094: Declare the kubernetes cluster and management nodes for cassandra.
Jul 7 2022, 12:11 PM · Storage manager, System administration
vsellier changed the status of T4379: [cassandra] create etcd / controlplane servers, a subtask of T4373: [cassandra] Test the new hardware, from Open to Work in Progress.
Jul 7 2022, 11:56 AM · Storage manager, System administration
vsellier changed the status of T4379: [cassandra] create etcd / controlplane servers from Open to Work in Progress.
Jul 7 2022, 11:56 AM · Storage manager, System administration
vsellier added a comment to D8089: Provision thanos query node.

I have no idea whether the CPU/memory/disk specs are large enough; I didn't find that information in the Thanos documentation.

Jul 7 2022, 11:52 AM
vsellier requested changes to D8089: Provision thanos query node.
Jul 7 2022, 11:52 AM

Jul 5 2022

vsellier removed a parent task for T4374: [cassandra] Test basic topology: T4373: [cassandra] Test the new hardware.
Jul 5 2022, 5:50 PM · Storage manager, System administration
vsellier removed a subtask for T4373: [cassandra] Test the new hardware: T4374: [cassandra] Test basic topology.
Jul 5 2022, 5:50 PM · Storage manager, System administration
vsellier removed a subtask for T4373: [cassandra] Test the new hardware: T4375: [cassandra] One cassandra per data disk.
Jul 5 2022, 5:50 PM · Storage manager, System administration
vsellier removed a parent task for T4375: [cassandra] One cassandra per data disk: T4373: [cassandra] Test the new hardware.
Jul 5 2022, 5:50 PM · Storage manager, System administration
vsellier added a parent task for T4374: [cassandra] Test basic topology: T4379: [cassandra] create etcd / controlplane servers.
Jul 5 2022, 5:49 PM · Storage manager, System administration
vsellier added a subtask for T4379: [cassandra] create etcd / controlplane servers: T4374: [cassandra] Test basic topology.
Jul 5 2022, 5:49 PM · Storage manager, System administration
vsellier added a parent task for T4375: [cassandra] One cassandra per data disk: T4379: [cassandra] create etcd / controlplane servers.
Jul 5 2022, 5:49 PM · Storage manager, System administration
vsellier added a subtask for T4379: [cassandra] create etcd / controlplane servers: T4375: [cassandra] One cassandra per data disk.
Jul 5 2022, 5:49 PM · Storage manager, System administration
vsellier triaged T4379: [cassandra] create etcd / controlplane servers as Normal priority.
Jul 5 2022, 5:47 PM · Storage manager, System administration
vsellier changed the status of T4373: [cassandra] Test the new hardware from Open to Work in Progress.
Jul 5 2022, 5:41 PM · Storage manager, System administration
vsellier renamed T4359: Update rancher cluster to kubernetes 1.22 from Update AKS cluster to kubernetes 1.22 to Update rancher cluster to kubernetes 1.22.
Jul 5 2022, 5:38 PM · System administration
vsellier accepted D8064: swh-graph: rename services (now production-ready, no longer dev).

Please also merge this into the staging branch and notify the sysadm IRC room when it's pushed; we will need to deploy it manually to clean up the previous services.

Jul 5 2022, 3:55 PM
vsellier triaged T4375: [cassandra] One cassandra per data disk as Normal priority.
Jul 5 2022, 9:52 AM · Storage manager, System administration
vsellier triaged T4374: [cassandra] Test basic topology as Normal priority.
Jul 5 2022, 9:43 AM · Storage manager, System administration
vsellier triaged T4373: [cassandra] Test the new hardware as Normal priority.
Jul 5 2022, 9:36 AM · Storage manager, System administration
vsellier requested changes to D8064: swh-graph: rename services (now production-ready, no longer dev).
Jul 5 2022, 8:49 AM

Jun 30 2022

vsellier committed rCJSWHb0e07c673ec0: wip - add a forge to host the local Changes (authored by vsellier).
wip - add a forge to host the local Changes
Jun 30 2022, 11:25 PM
vsellier committed rCJSWH3d562f112c91: wip - poc the swh-apps pipeline (authored by vsellier).
wip - poc the swh-apps pipeline
Jun 30 2022, 5:13 PM
vsellier committed rCJSWH278f12b744b1: wip - poc the swh-apps pipeline (authored by vsellier).
wip - poc the swh-apps pipeline
Jun 30 2022, 5:10 PM
vsellier closed D8062: fix a typo on the production objstorage vhost.
Jun 30 2022, 4:48 PM
vsellier committed rSPSITE57fb33253ca7: fix a typo on the production objstorage vhost (authored by vsellier).
fix a typo on the production objstorage vhost
Jun 30 2022, 4:47 PM
vsellier requested review of D8062: fix a typo on the production objstorage vhost.
Jun 30 2022, 4:40 PM
vsellier closed D8057: Add a docker environment to test the job-builder inside jenkins.
Jun 30 2022, 4:09 PM
vsellier committed rCJSWH21e47db56cb5: Add a docker environment to test the job-builder inside jenkins (authored by vsellier).
Add a docker environment to test the job-builder inside jenkins
Jun 30 2022, 4:09 PM
vsellier updated the diff for D8057: Add a docker environment to test the job-builder inside jenkins.

rebase

Jun 30 2022, 4:09 PM
vsellier updated the diff for D8057: Add a docker environment to test the job-builder inside jenkins.

fix the readme name

Jun 30 2022, 4:08 PM
vsellier added inline comments to D8057: Add a docker environment to test the job-builder inside jenkins.
Jun 30 2022, 4:07 PM
vsellier requested review of D8057: Add a docker environment to test the job-builder inside jenkins.
Jun 30 2022, 12:48 PM

Jun 29 2022

vsellier triaged T4360: Update gitlab kubernetes cluster to 1.22 as Normal priority.
Jun 29 2022, 9:55 AM · System administration
vsellier triaged T4359: Update rancher cluster to kubernetes 1.22 as Normal priority.
Jun 29 2022, 9:55 AM · System administration
vsellier added a comment to T4358: Upgrade AKS versions.

It seems the Rancher cluster can be updated to any version.
From https://rancher.com/docs/rancher/v2.6/en/installation/install-rancher-on-k8s/:

Rancher can be installed on any Kubernetes cluster. This cluster can use upstream Kubernetes, or it can use one of Rancher’s Kubernetes distributions, or it can be a managed Kubernetes cluster from a provider such as Amazon EKS.

It's also confirmed by the suse rke compatibility matrix: https://www.suse.com/assets/EN-Rancherv2.6.4-150422-0151-56.pdf

Jun 29 2022, 9:54 AM · System administration
vsellier moved T4358: Upgrade AKS versions from Backlog to in-progress on the System administration board.
Jun 29 2022, 9:45 AM · System administration
vsellier changed the status of T4358: Upgrade AKS versions from Open to Work in Progress.
Jun 29 2022, 9:45 AM · System administration

Jun 28 2022

vsellier closed T4340: swh-graph timeouts as Wontfix.

It will be solved by D7890

Jun 28 2022, 6:49 PM · Compressed graph service
vsellier closed T4313: [provenance] some process are oom killed as Resolved.
Jun 28 2022, 6:48 PM · System administration, Provenance database
vsellier committed R259:c84dcaac46d9: swh-provenance-client: update requirements-frozen.txt (authored by vsellier).
swh-provenance-client: update requirements-frozen.txt
Jun 28 2022, 6:04 PM
vsellier committed R259:920615f59e89: swh-provenance-client: update requirements-frozen.txt (authored by vsellier).
swh-provenance-client: update requirements-frozen.txt
Jun 28 2022, 5:36 PM
vsellier closed D8040: Limit the number of entries in the cache.
Jun 28 2022, 10:17 AM
vsellier committed rDPROVf5f741366383: Limit the number of entries in the cache (authored by vsellier).
Limit the number of entries in the cache
Jun 28 2022, 10:17 AM
vsellier updated the diff for D8040: Limit the number of entries in the cache.

make mypy happy

Jun 28 2022, 9:38 AM

Jun 27 2022

vsellier updated the diff for D8040: Limit the number of entries in the cache.

add missing parenthesis

Jun 27 2022, 10:59 PM
vsellier added inline comments to D8040: Limit the number of entries in the cache.
Jun 27 2022, 10:52 PM
vsellier updated the diff for D8040: Limit the number of entries in the cache.

update according to the reviews

  • simplify the cache management
  • fix the doc strings
Jun 27 2022, 10:52 PM
vsellier accepted D8038: Provision scrubber1 for checker services.
Jun 27 2022, 4:48 PM
vsellier requested review of D8040: Limit the number of entries in the cache.
Jun 27 2022, 4:10 PM
vsellier added a revision to T4313: [provenance] some process are oom killed: D8040: Limit the number of entries in the cache.
Jun 27 2022, 3:58 PM · System administration, Provenance database
vsellier changed the status of T4313: [provenance] some process are oom killed from Open to Work in Progress.
Jun 27 2022, 3:56 PM · System administration, Provenance database
vsellier accepted D8039: Install scrubber services on scrubber nodes.
Jun 27 2022, 3:24 PM
vsellier requested changes to D8038: Provision scrubber1 for checker services.
Jun 27 2022, 3:24 PM

Jun 24 2022

vsellier accepted D8034: Deploy production swh-scrubber db connection.

looks good (for what it's worth)

Jun 24 2022, 6:47 PM
vsellier added a comment to T4340: swh-graph timeouts.

It's confirmed that the issue seems to be in the Python part of the current implementation, so I'm eager to see D7890 land ;)

Jun 24 2022, 10:13 AM · Compressed graph service

Jun 22 2022

vsellier added a comment to T4340: swh-graph timeouts.

I reverse engineered the py4j communication protocol, so the next time it hangs, we should be able to tell whether the issue is on the gateway server side or on the Python side:

  • Create a named pipe
mkfifo /tmp/test
chmod a+w /tmp/test
tail -F /tmp/test
  • Query the graph
ss -ltp | grep java
<get the port number>
telnet localhost <port number>
c
o0
get_handler
s/tmp/test
e
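
The telnet session above can also be scripted. The following is a minimal Python sketch, assuming the gateway accepts exactly the newline-terminated lines typed in the steps above. The function names, the timeout and the attempt to read a reply are illustrative assumptions; they are not part of py4j or swh-graph.

```python
import socket


def build_py4j_probe(pipe_path: str) -> bytes:
    """Build the raw lines typed in the telnet session above.

    The lines are copied from the manual steps (c, o0, get_handler,
    s<path>, e); their exact meaning in the py4j protocol is not
    restated here.
    """
    lines = ["c", "o0", "get_handler", f"s{pipe_path}", "e"]
    return ("\n".join(lines) + "\n").encode("utf-8")


def send_py4j_probe(
    port: int,
    pipe_path: str,
    host: str = "localhost",
    timeout: float = 5.0,
) -> bytes:
    """Send the probe to the gateway port and return whatever comes back.

    An empty reply after the timeout is returned as b"" so the caller
    can tell a silent gateway apart from a connection failure, which
    raises an OSError instead.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_py4j_probe(pipe_path))
        try:
            return sock.recv(4096)
        except socket.timeout:
            return b""
```

With the named pipe from the first step being tailed in another terminal, calling send_py4j_probe(<port number>, "/tmp/test") replays the manual query, using the port found with ss.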
Jun 22 2022, 2:07 PM · Compressed graph service
vsellier added a comment to T4347: gitlab migration reset state routine is flaky.

Looks like something is wrong in the operator state management.
From what I found online, it could be related to the cert-manager version, but that should already be fixed. For example: https://gitlab.com/gitlab-org/cloud-native/gitlab-operator/-/issues/315
(The current cert-manager version in the cluster is 1.8.0)

Jun 22 2022, 10:40 AM · System administration, GitLab migration, Roadmap 2020

Jun 20 2022

vsellier updated the task description for T4340: swh-graph timeouts.
Jun 20 2022, 10:17 AM · Compressed graph service
vsellier updated the task description for T4340: swh-graph timeouts.
Jun 20 2022, 10:16 AM · Compressed graph service
vsellier triaged T4340: swh-graph timeouts as High priority.
Jun 20 2022, 10:16 AM · Compressed graph service

Jun 17 2022

vsellier closed T4315: [provenance] Fallback to swh-storage if a revision or its parent is not found in swh-graph as Resolved.

A dozen clients running on provenance-client01 are using the multiplexer configuration.
It seems to work correctly.
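
The fallback behaviour named in this task can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the Archive protocol, the NaiveMultiplexer class and the revision_get_parents method are hypothetical names chosen for the example, not the actual swh-provenance code from D7985.

```python
import logging
from typing import Iterable, List, Protocol

logger = logging.getLogger(__name__)


class Archive(Protocol):
    """Hypothetical minimal backend interface used by this sketch."""

    def revision_get_parents(self, revision_id: bytes) -> Iterable[bytes]:
        ...


class NaiveMultiplexer:
    """Query the backends in order and return the first non-empty answer."""

    def __init__(self, archives: List[Archive]) -> None:
        self.archives = archives

    def revision_get_parents(self, revision_id: bytes) -> List[bytes]:
        for archive in self.archives:
            try:
                parents = list(archive.revision_get_parents(revision_id))
            except Exception:
                # A failing backend must not stop the ingestion:
                # log the error and try the next one.
                logger.warning(
                    "backend %r failed for revision %r", archive, revision_id
                )
                continue
            if parents:
                return parents
        # Nothing found anywhere (this is also the result for a
        # revision that genuinely has no parent).
        return []
```

In this sketch the backends would be listed from fastest to most complete, for example a graph-based backend first and a storage-based backend last, so the slower backend is only queried when the faster one has no answer or raises an error.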

Jun 17 2022, 5:03 PM · Provenance database
vsellier committed rDPROV80434e3b2191: Reduce multiplexer logs output (authored by vsellier).
Reduce multiplexer logs output
Jun 17 2022, 9:43 AM

Jun 16 2022

vsellier accepted D8000: docs/journal-clients: Reference a new anchor title.
Jun 16 2022, 6:02 PM
vsellier closed D7962: kafka: add more options to the user management script.
Jun 16 2022, 5:49 PM
vsellier committed rSPSITE395c4a2a79de: kafka: add more options to the user management script (authored by vsellier).
kafka: add more options to the user management script
Jun 16 2022, 5:49 PM
vsellier updated the diff for D7962: kafka: add more options to the user management script.

rebase

Jun 16 2022, 5:48 PM
vsellier committed R259:6742c91df028: swh-provenance-client: update requirements-frozen.txt (authored by vsellier).
swh-provenance-client: update requirements-frozen.txt
Jun 16 2022, 10:34 AM
vsellier committed rDPROV2453c3ef3b53: Add logs relative to the cache flush performances (authored by vsellier).
Add logs relative to the cache flush performances
Jun 16 2022, 10:31 AM
vsellier committed rDPROV12c45c828f2a: Don't stop the ingestion if an error occurs in one of the archive backend (authored by vsellier).
Don't stop the ingestion if an error occurs in one of the archive backend
Jun 16 2022, 10:31 AM
vsellier committed rDPROVf5ed9de87b39: Improve origin layer logs (authored by vsellier).
Improve origin layer logs
Jun 16 2022, 10:31 AM
vsellier committed rDPROVb69c0f7689f0: Add a new multiplexed archive type (authored by vsellier).
Add a new multiplexed archive type
Jun 16 2022, 10:31 AM
vsellier closed D7985: [provenance] Implement a naive archive multiplexer.
Jun 16 2022, 10:31 AM · Provenance database
vsellier committed rDPROVd45f066a8c51: Declare the missing swh-graph dependency (authored by vsellier).
Declare the missing swh-graph dependency
Jun 16 2022, 10:31 AM
vsellier committed R259:3cb73d1fe0ca: swh-provenance-client: update requirements-frozen.txt (authored by vsellier).
swh-provenance-client: update requirements-frozen.txt
Jun 16 2022, 9:19 AM
vsellier updated the diff for D7985: [provenance] Implement a naive archive multiplexer.

Update according to the reviews

  • Add and fix license headers
  • Ensure the _revisions_count variable was computed before returning its value
Jun 16 2022, 8:34 AM · Provenance database

Jun 15 2022

vsellier added inline comments to D7985: [provenance] Implement a naive archive multiplexer.
Jun 15 2022, 8:38 PM · Provenance database
vsellier added a comment to T4064: Test GitLab migration scripts.

\o/ well done

Jun 15 2022, 6:41 PM · System administration, GitLab migration, Roadmap 2020
vsellier retitled D7985: [provenance] Implement a naive archive multiplexer from Declare the missing swh-graph dependency to [provenance] Implement a naive archive multiplexer.
Jun 15 2022, 10:57 AM · Provenance database
vsellier added a comment to D7985: [provenance] Implement a naive archive multiplexer.

I deliberately created the diff with the 3 commits inside; I just forgot to update the title ;)

Jun 15 2022, 10:55 AM · Provenance database