Oct 10 2022

vsellier committed rSPREaea3211079c0: azure: ignore user_data for hosts created by the legacy scripts (authored by vsellier).
azure: ignore user_data for hosts created by the legacy scripts
Oct 10 2022, 1:40 PM
vsellier changed the status of T4063: Deploy gitlab instance for production, a subtask of T2225: Migrate to GitLab, from Open to Work in Progress.
Oct 10 2022, 1:23 PM · meta-task, Roadmap 2022, GitLab migration, Roadmap 2020
vsellier changed the status of T4063: Deploy gitlab instance for production from Open to Work in Progress.
Oct 10 2022, 1:23 PM · System administration, GitLab migration

Oct 7 2022

vsellier added a comment to T4615: jenkins: Unstuck buster-swh image build.

I've tested with the IPs close to the Apache one (88.99.95.218 and 88.99.95.220) and we can reach them, so we are probably blocked somewhere in Apache's infrastructure.

Oct 7 2022, 6:42 PM · System administration
vsellier added a comment to T4615: jenkins: Unstuck buster-swh image build.

Apparently we're blocked somewhere at Hetzner:

% tcptraceroute downloads.apache.org
Selected device vlan1300, address XXX, port 49995 for outgoing packets
Tracing the path to downloads.apache.org (88.99.95.219) on TCP port 80 (http), 30 hops max
 1  xxx
 2  xxx 
 3  * * *
 4  xe1-1-10-paris1-rtr-131.noc.renater.fr (193.51.177.106)  1.101 ms  1.038 ms  1.018 ms
 5  renater-ias-geant-gw.par.fr.geant.net (83.97.89.9)  1.470 ms  1.343 ms  1.259 ms
 6  ae5.mx1.gen.ch.geant.net (62.40.98.182)  18.107 ms  18.143 ms  17.893 ms
 7  ae2.mx1.fra.de.geant.net (62.40.98.180)  17.804 ms  27.069 ms  18.987 ms
 8  * * *
 9  core24.fsn1.hetzner.com (213.239.252.42)  20.140 ms  20.348 ms  19.926 ms
10  ex9k1.dc1.fsn1.hetzner.com (213.239.245.234)  19.754 ms  19.677 ms  20.425 ms
11  * * *
...
Oct 7 2022, 5:24 PM · System administration
vsellier committed R260:778964d7d721: cassandra-replay: redispatch replayers after revision catchup (authored by vsellier).
cassandra-replay: redispatch replayers after revision catchup
Oct 7 2022, 4:34 PM
vsellier closed T4358: Upgrade AKS versions as Resolved.
Oct 7 2022, 9:08 AM · System administration
vsellier closed T4360: Update gitlab kubernetes cluster to 1.22, a subtask of T4358: Upgrade AKS versions, as Resolved.
Oct 7 2022, 9:08 AM · System administration
vsellier closed T4360: Update gitlab kubernetes cluster to 1.22 as Resolved.

After the operator was updated to 0.12.2, it was possible to upgrade kubernetes to 1.22.
It was done by changing the version in the azure portal:

Oct 7 2022, 9:08 AM · System administration
vsellier closed T4610: upgrade staging instance to 15.4, a subtask of T2221: Development workflow & code quality, as Resolved.
Oct 7 2022, 9:05 AM · meta-task, Roadmap 2020
vsellier closed T4610: upgrade staging instance to 15.4 as Resolved.
Oct 7 2022, 9:05 AM · System administration
vsellier updated the task description for T4610: upgrade staging instance to 15.4.
Oct 7 2022, 9:04 AM · System administration
vsellier committed rSKCONF5058d885aed1: gitlab-staging: upgrade operator to 0.12.4 (authored by vsellier).
gitlab-staging: upgrade operator to 0.12.4
Oct 7 2022, 12:14 AM
vsellier updated the task description for T4610: upgrade staging instance to 15.4.
Oct 7 2022, 12:13 AM · System administration

Oct 6 2022

vsellier committed rSKCONFabefae9c9438: gitlab-staging: fix secrets cluster ip (authored by vsellier).
gitlab-staging: fix secrets cluster ip
Oct 6 2022, 11:33 PM
vsellier committed rSKCONF1e5488d5005a: gitlab-staging: upgrade operator to 0.12.0 (authored by vsellier).
gitlab-staging: upgrade operator to 0.12.0
Oct 6 2022, 9:35 PM
vsellier closed D8635: gitlab-staging: Add the configuration to install the gitlab operator.
Oct 6 2022, 9:32 PM
vsellier committed rSKCONFfef45a541365: gitlab-staging: Add the configuration to install the gitlab operator (authored by vsellier).
gitlab-staging: Add the configuration to install the gitlab operator
Oct 6 2022, 9:32 PM
vsellier updated the task description for T4610: upgrade staging instance to 15.4.
Oct 6 2022, 9:31 PM · System administration
vsellier updated the task description for T4610: upgrade staging instance to 15.4.
Oct 6 2022, 9:30 PM · System administration
vsellier changed the status of T4610: upgrade staging instance to 15.4 from Open to Work in Progress.
Oct 6 2022, 9:17 PM · System administration
vsellier updated the diff for D8635: gitlab-staging: Add the configuration to install the gitlab operator.
  • fix the version number
  • remove the command line in comment
Oct 6 2022, 8:08 PM
vsellier added inline comments to D8635: gitlab-staging: Add the configuration to install the gitlab operator.
Oct 6 2022, 8:05 PM
vsellier added inline comments to D8635: gitlab-staging: Add the configuration to install the gitlab operator.
Oct 6 2022, 1:32 PM
vsellier requested review of D8635: gitlab-staging: Add the configuration to install the gitlab operator.
Oct 6 2022, 11:03 AM
vsellier added a revision to T4063: Deploy gitlab instance for production: D8635: gitlab-staging: Add the configuration to install the gitlab operator.
Oct 6 2022, 11:03 AM · System administration, GitLab migration

Oct 5 2022

vsellier added a comment to T4358: Upgrade AKS versions.

Looks like the gitlab operator is now compatible with kubernetes 1.22.
https://docs.gitlab.com/operator/installation.html#kubernetes

Oct 5 2022, 2:24 PM · System administration
vsellier planned changes to D8617: thanos: Declare archive-production thanos for live data querying.

Unfortunately, it can't work without T4604, as the store is configured to use the Let's Encrypt certificate:

Oct 5 2022, 10:52 AM
vsellier triaged T4604: [dynamic infra] Manage SSL certificates as Normal priority.
Oct 5 2022, 10:11 AM · System administration
vsellier renamed T4063: Deploy gitlab instance for production from Deploy gitlab instance to Deploy gitlab instance for production.
Oct 5 2022, 9:57 AM · System administration, GitLab migration

Oct 4 2022

vsellier requested review of D8617: thanos: Declare archive-production thanos for live data querying.
Oct 4 2022, 6:45 PM
vsellier added a revision to T4385: Federate prometheus instances through thanos: D8617: thanos: Declare archive-production thanos for live data querying.
Oct 4 2022, 6:45 PM · meta-task, System administration, Roadmap 2022
vsellier closed D8599: k8s-archive-production: Add an internal ingress to expose reaper webui.
Oct 4 2022, 6:42 PM
vsellier committed rSPSITE9065bc6afddd: k8s-archive-production: Add an internal ingress to expose reaper webui (authored by vsellier).
k8s-archive-production: Add an internal ingress to expose reaper webui
Oct 4 2022, 6:42 PM
vsellier updated the diff for D8599: k8s-archive-production: Add an internal ingress to expose reaper webui.

rebase

Oct 4 2022, 6:42 PM
vsellier added a comment to T4251: [swh-search] Investigate long search queries response time.

@KShivendu Any news regarding this profiling?

Oct 4 2022, 6:22 PM · System administration, Archive search
vsellier triaged T4603: move graphql to a sub url instead of the standalone vhost as Normal priority.
Oct 4 2022, 6:20 PM · System administration, GraphQL API
vsellier closed T4132: Add the graphql service in the docker environment, a subtask of T4131: Graphql service in staging, as Resolved.
Oct 4 2022, 6:16 PM · System administration, GraphQL API
vsellier closed T4132: Add the graphql service in the docker environment as Resolved.
Oct 4 2022, 6:16 PM · System administration, GraphQL API
vsellier updated the task description for T4132: Add the graphql service in the docker environment.
Oct 4 2022, 6:16 PM · System administration, GraphQL API
vsellier closed T4497: [sentry] Out of disk space as Resolved.

Closing, as there have been no alerts for almost a month.

Oct 4 2022, 6:15 PM · Sentry, System administration
vsellier committed rSKCONFee77cc15f039: argocd: ignore argocd-cm and argocd-rabc-cm changes (authored by vsellier).
argocd: ignore argocd-cm and argocd-rabc-cm changes
Oct 4 2022, 4:17 PM
vsellier closed T4534: Evaluate MetalLB as inbound loadbalancer, a subtask of T4523: Dynamic infrastructure, as Resolved.
Oct 4 2022, 3:03 PM · meta-task, System administration
vsellier closed T4534: Evaluate MetalLB as inbound loadbalancer as Resolved.

Given the latest tests, we can start using it to battle-test it.
Several pieces of documentation recommend it as the tool to manage load balancing on on-premise kubernetes deployments, for example: https://kubernetes.github.io/ingress-nginx/deploy/baremetal/#a-pure-software-solution-metallb

Oct 4 2022, 3:03 PM · System administration
vsellier added a comment to T4385: Federate prometheus instances through thanos.

thanos exposed on the production cluster with this commit: rSPRE8fade05553ed4a01e54e1b8481150c0e055e3f34

Oct 4 2022, 2:32 PM · meta-task, System administration, Roadmap 2022
vsellier committed rSPRE8fade05553ed: Export the grpc port of thanos through an ingress configuration (authored by vsellier).
Export the grpc port of thanos through an ingress configuration
Oct 4 2022, 2:20 PM
vsellier committed R260:e79a98efcf72: cassandra-replay: increase revision replayer count (authored by vsellier).
cassandra-replay: increase revision replayer count
Oct 4 2022, 9:32 AM

Oct 3 2022

vsellier created P1468 (An Untitled Masterwork).
Oct 3 2022, 4:03 PM
vsellier committed rSPREfa50a5ff1cb2: k8s-archive-production: active thanos sidecard and services (authored by vsellier).
k8s-archive-production: active thanos sidecard and services
Oct 3 2022, 3:43 PM
vsellier committed R260:b4067e7cf048: cassandra-replay: reduce load on production cluster during metallb tests (authored by vsellier).
cassandra-replay: reduce load on production cluster during metallb tests
Oct 3 2022, 3:05 PM
vsellier requested review of D8599: k8s-archive-production: Add an internal ingress to expose reaper webui.
Oct 3 2022, 1:55 PM
vsellier added a revision to T4458: Test reaper to automate the cassandra repair actions: D8599: k8s-archive-production: Add an internal ingress to expose reaper webui.
Oct 3 2022, 1:55 PM · System administration
vsellier committed rSKCONF110115991c54: reaper: Add an ingress to internally expose the webui (authored by vsellier).
reaper: Add an ingress to internally expose the webui
Oct 3 2022, 1:49 PM

Oct 1 2022

vsellier committed R260:de634df57950: cassandra-replay: redispatch replayers after snapshot catchup (authored by vsellier).
cassandra-replay: redispatch replayers after snapshot catchup
Oct 1 2022, 12:06 PM

Sep 30 2022

vsellier committed rSKCONF39fd005dd4af: archive-production: configure metallb to allow several service on same ip (authored by vsellier).
archive-production: configure metallb to allow several service on same ip
Sep 30 2022, 8:01 PM
vsellier committed rSPRE3147328ddfa6: k8s-archive-production: configure thanos sidecar to push on azure (authored by vsellier).
k8s-archive-production: configure thanos sidecar to push on azure
Sep 30 2022, 5:55 PM
vsellier closed T4461: Move argocd to a private admin url as Resolved.
Sep 30 2022, 5:27 PM · System administration
vsellier committed rSPSITE4b2879647a52: argocd: fix the missing dot in the dns declaration (authored by vsellier).
argocd: fix the missing dot in the dns declaration
Sep 30 2022, 5:22 PM
vsellier updated the task description for T4385: Federate prometheus instances through thanos.
Sep 30 2022, 3:50 PM · meta-task, System administration, Roadmap 2022
vsellier closed D8577: Disable ping on hosts/ips managed by metallb.
Sep 30 2022, 9:28 AM
vsellier committed rSPSITEbee3d6dc026c: Disable ping on hosts/ips managed by metallb (authored by vsellier).
Disable ping on hosts/ips managed by metallb
Sep 30 2022, 9:28 AM

Sep 29 2022

vsellier requested review of D8577: Disable ping on hosts/ips managed by metallb.
Sep 29 2022, 10:42 AM
vsellier added a revision to T4534: Evaluate MetalLB as inbound loadbalancer: D8577: Disable ping on hosts/ips managed by metallb.
Sep 29 2022, 10:42 AM · System administration

Sep 28 2022

vsellier committed rSKCONF01abd45d6427: argocd: Force the redirect to https for the internal ingress (authored by vsellier).
argocd: Force the redirect to https for the internal ingress
Sep 28 2022, 7:23 PM
vsellier committed rSKCONF246fb2ffa395: argocd: Delete public ingress (authored by vsellier).
argocd: Delete public ingress
Sep 28 2022, 7:23 PM
vsellier closed D8567: argocd: Remove public site.
Sep 28 2022, 7:08 PM
vsellier committed rSPSITEdd8df9301d75: argocd: Remove public site (authored by vsellier).
argocd: Remove public site
Sep 28 2022, 7:08 PM
vsellier updated the test plan for D8567: argocd: Remove public site.
Sep 28 2022, 5:48 PM
vsellier updated the diff for D8567: argocd: Remove public site.
  • Add monitoring of the internal service
Sep 28 2022, 5:47 PM
vsellier requested review of D8567: argocd: Remove public site.
Sep 28 2022, 5:29 PM
vsellier added a revision to T4461: Move argocd to a private admin url: D8567: argocd: Remove public site.
Sep 28 2022, 5:29 PM · System administration
vsellier added a comment to T4461: Move argocd to a private admin url.

Argo is now reachable on the internal network at https://argocd.internal.admin.swh.network/ but it uses a self-signed certificate.

Sep 28 2022, 12:28 PM · System administration
vsellier committed rSKCONF3bf578f9b891: Use the official kubernetes nginx ingress controller (authored by vsellier).
Use the official kubernetes nginx ingress controller
Sep 28 2022, 12:01 PM
vsellier committed rSKCONF4b7b571bba07: argocd: don't try to override the default nginx ingress service (authored by vsellier).
argocd: don't try to override the default nginx ingress service
Sep 28 2022, 10:51 AM
vsellier closed D8559: argocd: Prepare the configuration to migrate to the internal admin network.
Sep 28 2022, 10:35 AM
vsellier committed rSPSITE73ae99dafa58: argocd: Prepare the configuration to migrate to the internal admin network (authored by vsellier).
argocd: Prepare the configuration to migrate to the internal admin network
Sep 28 2022, 10:35 AM
vsellier updated the diff for D8559: argocd: Prepare the configuration to migrate to the internal admin network.

rebase

Sep 28 2022, 10:29 AM
vsellier committed rSPRE6b8883e1f458: argocd: disable default rancher ingress controller (authored by vsellier).
argocd: disable default rancher ingress controller
Sep 28 2022, 10:27 AM
vsellier requested review of D8559: argocd: Prepare the configuration to migrate to the internal admin network.
Sep 28 2022, 10:16 AM
vsellier added a revision to T4461: Move argocd to a private admin url: D8559: argocd: Prepare the configuration to migrate to the internal admin network.
Sep 28 2022, 10:16 AM · System administration
vsellier added a comment to T4461: Move argocd to a private admin url.

Hmm, I completely forgot that we'll have to think about how to manage the SSL termination.
As is, there is no cert manager deployed on the cluster and I'm not sure we want to deploy one.

Sep 28 2022, 10:09 AM · System administration
vsellier committed rSKCONF311693349c85: argocd: migrate to an internal domain (authored by vsellier).
argocd: migrate to an internal domain
Sep 28 2022, 9:56 AM
vsellier changed the status of T4461: Move argocd to a private admin url from Open to Work in Progress.

As metallb is configured, the internal domain can use the VIP.

Sep 28 2022, 9:56 AM · System administration
vsellier added a comment to D8551: first add forge now requests added to changelog.

Sure, there is (and there must be) nothing secret logged in the task.

Sep 28 2022, 9:29 AM

Sep 27 2022

vsellier committed rSKCONF8862c36c5e25: metallb: ignore fields updated during runtime (authored by vsellier).
metallb: ignore fields updated during runtime
Sep 27 2022, 11:40 PM
vsellier added a comment to T4534: Evaluate MetalLB as inbound loadbalancer.

A new test with a node completely down: it seems to recover after ~5 min, which looks related to some cache expiry somewhere.

Tue Sep 27 16:31:59 UTC 2022
60 bytes from 2e:81:20:19:02:4a (192.168.100.119): index=2926 time=1.679 msec
Tue Sep 27 16:32:01 UTC 2022
Timeout
Timeout
Tue Sep 27 16:32:03 UTC 2022
...
Tue Sep 27 16:37:56 UTC 2022
Timeout
Timeout
Tue Sep 27 16:37:58 UTC 2022
60 bytes from 2e:84:a0:44:9e:c9 (192.168.100.119): index=2927 time=814.409 msec
60 bytes from 2e:84:a0:44:9e:c9 (192.168.100.119): index=2928 time=864.574 usec
60 bytes from 2e:84:a0:44:9e:c9 (192.168.100.119): index=2929 time=973.083 msec
60 bytes from 2e:84:a0:44:9e:c9 (192.168.100.119): index=2930 time=32.151 msec
Sep 27 2022, 6:42 PM · System administration
vsellier added a comment to T4534: Evaluate MetalLB as inbound loadbalancer.

If a node is drained out of the cluster, the rebalancing occurs in ~10s, which is what is announced in the documentation.

Tue Sep 27 16:17:21 UTC 2022
60 bytes from 2e:84:a0:44:9e:c9 (192.168.100.119): index=1985 time=1.710 msec
60 bytes from 2e:84:a0:44:9e:c9 (192.168.100.119): index=1986 time=1.376 msec
Tue Sep 27 16:17:23 UTC 2022
Timeout
Tue Sep 27 16:17:25 UTC 2022
Timeout
Timeout
Tue Sep 27 16:17:27 UTC 2022
Timeout
Timeout
Tue Sep 27 16:17:29 UTC 2022
Timeout
Timeout
Tue Sep 27 16:17:31 UTC 2022
Timeout
Timeout
Tue Sep 27 16:17:33 UTC 2022
Timeout
Timeout
60 bytes from 2e:81:20:19:02:4a (192.168.100.119): index=1987 time=669.150 msec
Tue Sep 27 16:17:35 UTC 2022
60 bytes from 2e:81:20:19:02:4a (192.168.100.119): index=1988 time=959.452 usec
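
The failover delay can be read off an excerpt like the one above by hand, but a small script makes the measurement repeatable. The following is only a sketch under assumptions: it supposes the log alternates `date` lines, reply lines containing " bytes from " and "Timeout" lines, exactly as shown here; the function names are hypothetical and not part of any tool used in the task.

```python
from datetime import datetime

# Format of the date lines in the excerpt, e.g. "Tue Sep 27 16:17:21 UTC 2022".
# Assumption: English day/month abbreviations (C or en_US locale).
TS_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def parse_timestamp(line):
    """Return a datetime if the line is a date line, else None."""
    try:
        return datetime.strptime(line.strip(), TS_FORMAT)
    except ValueError:
        return None


def outage_seconds(log_text):
    """Estimate the outage length in seconds.

    The outage starts at the date line preceding the first "Timeout" and
    ends at the date line preceding the first reply seen afterwards.
    Returns None if no complete outage is found.
    """
    current = None        # timestamp of the block being read
    outage_start = None   # timestamp of the first block with a timeout
    for line in log_text.splitlines():
        ts = parse_timestamp(line)
        if ts is not None:
            current = ts
        elif line.strip() == "Timeout":
            if outage_start is None and current is not None:
                outage_start = current
        elif " bytes from " in line:
            if outage_start is not None and current is not None:
                return (current - outage_start).total_seconds()
    return None
```

Fed with the drained-node excerpt above, this returns 10.0. The date lines are two seconds apart, so the estimate is only accurate to within a couple of seconds.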
Sep 27 2022, 6:19 PM · System administration
vsellier added a comment to T4534: Evaluate MetalLB as inbound loadbalancer.

With the ingress controller correctly configured and an ingress declared, everything seems to work correctly:

vsellier@pergamon ~ % cat test-ingress.txt
GET /graphql/ HTTP/1.0
Host: archive.softwareheritage.org
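
The rest of this comment is cut off in the feed, so how the request file was actually sent is not shown. As a rough sketch of the idea (send a raw HTTP/1.0 request to a given IP while setting the Host header by hand, so the ingress routes on the host name without any DNS change), something like the following could be used. The function names, and the IP and port a caller would pass, are assumptions for illustration.

```python
import socket


def build_request(path, host):
    """Build a minimal HTTP/1.0 GET request with an explicit Host header."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def send_raw_request(ip, port, payload, timeout=5.0):
    """Send the payload to ip:port and return the raw response bytes.

    HTTP/1.0 servers close the connection after responding, so reading
    until end-of-stream collects the whole response.
    """
    chunks = []
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(payload)
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

A caller would pass the load balancer address as `ip`, for example `send_raw_request(vip, 80, build_request("/graphql/", "archive.softwareheritage.org"))`, where `vip` is whatever address the ingress service was assigned.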
Sep 27 2022, 6:04 PM · System administration
vsellier committed rSKCONF27bee789d015: production: Ensure the ingress class created by nginx is the default (authored by vsellier).
production: Ensure the ingress class created by nginx is the default
Sep 27 2022, 5:31 PM
vsellier committed R260:581d22afb81d: production: enabled graphql ingress (authored by vsellier).
production: enabled graphql ingress
Sep 27 2022, 4:40 PM
vsellier committed R260:f4cb0216d314: production: Add missing configuration entry to deploy graphql (authored by vsellier).
production: Add missing configuration entry to deploy graphql
Sep 27 2022, 4:37 PM
vsellier committed R260:1f07eaf3cece: production add mandatory values for graphql configuration (authored by vsellier).
production add mandatory values for graphql configuration
Sep 27 2022, 4:31 PM
vsellier committed R260:5f413a4e1350: production: test ingress / metallb by deploying graphql (authored by vsellier).
production: test ingress / metallb by deploying graphql
Sep 27 2022, 4:23 PM
vsellier committed rSKCONF9d4c4220af86: production: fix nginx ingress chart version (authored by vsellier).
production: fix nginx ingress chart version
Sep 27 2022, 4:11 PM
vsellier committed rSKCONF6c40e8fe60e3: production: fix nginx ingress chart version (authored by vsellier).
production: fix nginx ingress chart version
Sep 27 2022, 3:59 PM
vsellier committed rSKCONF28387e6c1fd9: production-cluster: Deploy the nginx ingress controller (authored by vsellier).
production-cluster: Deploy the nginx ingress controller
Sep 27 2022, 3:51 PM
vsellier committed rSPREacbc844f9b54: Don't use the default rancher ingress manager (authored by vsellier).
Don't use the default rancher ingress manager
Sep 27 2022, 3:46 PM
vsellier committed rSPRE833b5359f64f: temporary increase prometheus retention (authored by vsellier).
temporary increase prometheus retention
Sep 27 2022, 3:22 PM
vsellier committed rSPREea34a207f927: Update storage configuration to match the reality (authored by vsellier).
Update storage configuration to match the reality
Sep 27 2022, 3:22 PM
vsellier committed R260:3a2dd951b885: Upgrade storage replayers to last available version (authored by vsellier).
Upgrade storage replayers to last available version
Sep 27 2022, 11:48 AM