Dec 19 2019

vlorentz moved T2144: Define an architecture for end-to-end monitoring/testing from deployed to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 19 2019, 3:18 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz moved T2144: Define an architecture for end-to-end monitoring/testing from done to deployed on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 19 2019, 3:18 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz moved T2118: Deposit: End to End monitoring from in progress to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 19 2019, 3:18 PM · Sprint 2019/12 (Monitor and Conquer)
ardumont moved T2134: loader: Implement uniform loading CLI from done to deployed on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 19 2019, 2:12 PM · Sprint 2019/12 (Monitor and Conquer)
olasd moved T2133: Scheduler listener/runner: add statsd probes from done to deployed on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 19 2019, 2:07 PM · Metrics/monitoring, Scheduling utilities, Sprint 2019/12 (Monitor and Conquer)
olasd moved T1359: Add sentry support in every swh running service from done to deployed on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 19 2019, 2:06 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
olasd moved T1358: Setup a sentry service from done to deployed on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 19 2019, 2:06 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
olasd moved T1358: Setup a sentry service from in progress to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 19 2019, 2:06 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
olasd closed T1358: Setup a sentry service as Resolved.

Sentry is now available at https://sentry.softwareheritage.org/.

Dec 19 2019, 10:19 AM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2118: Deposit: End to End monitoring, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2117: Save Code Now: End to End monitoring, as Resolved.
Dec 19 2019, 10:06 AM · System administration, Monitoring, Roadmap 2021
zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2129: Journal: End to end monitoring, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T2144: Define an architecture for end-to-end monitoring/testing as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2125: Production Web UI end to end testing, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T2144: Define an architecture for end-to-end monitoring/testing, a subtask of T2126: Production Vault end to end testing, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack added a comment to T2144: Define an architecture for end-to-end monitoring/testing.

(marking as done as it was moved to the done column on the sprint board, please reopen if not ok)

Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
zack closed T1359: Add sentry support in every swh running service as Resolved.

(marking as done as it was moved to the done column on the sprint board, please reopen if not ok)

Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
zack closed T1359: Add sentry support in every swh running service, a subtask of T1358: Setup a sentry service, as Resolved.
Dec 19 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration

Dec 16 2019

vlorentz changed the status of T2118: Deposit: End to End monitoring from Open to Work in Progress.
Dec 16 2019, 4:09 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz moved T2144: Define an architecture for end-to-end monitoring/testing from in progress to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 16 2019, 3:57 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz moved T1359: Add sentry support in every swh running service from in progress to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 16 2019, 3:57 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration

Dec 13 2019

vlorentz added a revision to T2126: Production Vault end to end testing: D2454: Add a timeout to check_vault..
Dec 13 2019, 4:30 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz updated subscribers of T2144: Define an architecture for end-to-end monitoring/testing.

Current solution is to use Icinga (which can then report metrics to Prometheus, according to @olasd)

Dec 13 2019, 4:29 PM · Sprint 2019/12 (Monitor and Conquer)
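
Icinga checks follow the Nagios plugin convention: the exit status is 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN), accompanied by one line of output. As a rough illustration of an end-to-end check with a timeout, here is a minimal sketch. The function name, the `poll_status` callable and the status strings are hypothetical; this is not the actual check code from the revisions mentioned in this feed.

```python
import time

# Exit codes from the Nagios/Icinga plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_with_timeout(poll_status, timeout=10.0, interval=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll until poll_status() returns "done" or "failed", or the timeout expires.

    Returns an (exit_code, message) pair. The status strings are hypothetical.
    """
    start = clock()
    while True:
        try:
            status = poll_status()
        except Exception as exc:
            # The service could not be queried at all.
            return UNKNOWN, f"UNKNOWN - could not query status: {exc}"
        elapsed = clock() - start
        if status == "done":
            return OK, f"OK - completed in {elapsed:.1f}s"
        if status == "failed":
            return CRITICAL, "CRITICAL - request failed"
        if elapsed >= timeout:
            return CRITICAL, f"CRITICAL - still '{status}' after {timeout:.1f}s"
        sleep(interval)
```

A wrapper script would print the message and call `sys.exit()` with the returned code so that Icinga can pick up the result.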
vlorentz claimed T2126: Production Vault end to end testing.
Dec 13 2019, 4:29 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz moved T2126: Production Vault end to end testing from Backlog to in progress on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 13 2019, 4:29 PM · Sprint 2019/12 (Monitor and Conquer)

Dec 12 2019

zack moved T2134: loader: Implement uniform loading CLI from in progress to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 12 2019, 5:51 PM · Sprint 2019/12 (Monitor and Conquer)
zack moved T2124: Save Code Now: monitoring of admin infra from in progress to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 12 2019, 5:50 PM · Sprint 2019/12 (Monitor and Conquer)
anlambert closed T2124: Save Code Now: monitoring of admin infra as Resolved by committing rDWAPPS4c55d9af7b41: origin_save: Add prometheus metrics for origin save requests.
Dec 12 2019, 5:41 PM · Sprint 2019/12 (Monitor and Conquer)
ardumont updated the task description for T2120: Loaders: standalone task tests.
Dec 12 2019, 3:22 PM · Sprint 2019/12 (Monitor and Conquer)
ardumont closed T2134: loader: Implement uniform loading CLI as Resolved.
Dec 12 2019, 3:21 PM · Sprint 2019/12 (Monitor and Conquer)
ardumont closed T2134: loader: Implement uniform loading CLI, a subtask of T2120: Loaders: standalone task tests, as Resolved.
Dec 12 2019, 3:21 PM · Sprint 2019/12 (Monitor and Conquer)

Dec 11 2019

anlambert added a comment to T2124: Save Code Now: monitoring of admin infra.

On second thought, relying on prometheus_client seems a better approach for the swh-web case, so D2419 is abandoned in favor of D2432.

Dec 11 2019, 6:09 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz renamed T2144: Define an architecture for end-to-end monitoring/testing from Define an architecture for end-to-end monitoring to Define an architecture for end-to-end monitoring/testing.
Dec 11 2019, 5:21 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz added a subtask for T2129: Journal: End to end monitoring: T2144: Define an architecture for end-to-end monitoring/testing.
Dec 11 2019, 5:21 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz added a subtask for T2117: Save Code Now: End to End monitoring: T2144: Define an architecture for end-to-end monitoring/testing.
Dec 11 2019, 5:21 PM · System administration, Monitoring, Roadmap 2021
vlorentz added a subtask for T2126: Production Vault end to end testing: T2144: Define an architecture for end-to-end monitoring/testing.
Dec 11 2019, 5:21 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz added parent tasks for T2144: Define an architecture for end-to-end monitoring/testing: T2129: Journal: End to end monitoring, T2126: Production Vault end to end testing, T2125: Production Web UI end to end testing, T2118: Deposit: End to End monitoring, T2117: Save Code Now: End to End monitoring.
Dec 11 2019, 5:21 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz added a subtask for T2125: Production Web UI end to end testing: T2144: Define an architecture for end-to-end monitoring/testing.
Dec 11 2019, 5:21 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz added a subtask for T2118: Deposit: End to End monitoring: T2144: Define an architecture for end-to-end monitoring/testing.
Dec 11 2019, 5:21 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz moved T2144: Define an architecture for end-to-end monitoring/testing from Backlog to in progress on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 11 2019, 5:20 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz changed the status of T2144: Define an architecture for end-to-end monitoring/testing from Open to Work in Progress.
Dec 11 2019, 5:20 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz triaged T2144: Define an architecture for end-to-end monitoring/testing as Normal priority.
Dec 11 2019, 5:20 PM · Sprint 2019/12 (Monitor and Conquer)
ardumont updated the task description for T2134: loader: Implement uniform loading CLI.
Dec 11 2019, 4:30 PM · Sprint 2019/12 (Monitor and Conquer)
anlambert claimed T2124: Save Code Now: monitoring of admin infra.
Dec 11 2019, 4:14 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz changed the status of T2117: Save Code Now: End to End monitoring from Open to Work in Progress.
Dec 11 2019, 4:12 PM · System administration, Monitoring, Roadmap 2021
vlorentz assigned T2117: Save Code Now: End to End monitoring to anlambert.
Dec 11 2019, 4:10 PM · System administration, Monitoring, Roadmap 2021
anlambert moved T2117: Save Code Now: End to End monitoring from Backlog to in progress on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 11 2019, 4:09 PM · System administration, Monitoring, Roadmap 2021
vlorentz moved T2130: Scheduler monitoring: probe rabbitmq status from Backlog to deployed on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 11 2019, 3:42 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz closed T2142: Document how to use Sentry with the docker dev environment, a subtask of T1359: Add sentry support in every swh running service, as Resolved.
Dec 11 2019, 3:42 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
vlorentz closed T2142: Document how to use Sentry with the docker dev environment as Resolved.
Dec 11 2019, 3:42 PM · Docker environment, Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring
vlorentz added revisions to T1359: Add sentry support in every swh running service: D2428: Add sentry integration to the JS code., D2426: Initialize Sentry on Celery worker startup., D2423: Add sentry integration to swh-web, D2411: Make the CLI initialize sentry-sdk based on CLI options/envvars., D2418: Add gunicorn config script to initialize sentry-sdk based on envvars., D2420: Import gunicorn config from swh-core..
Dec 11 2019, 3:41 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
vlorentz claimed T1359: Add sentry support in every swh running service.
Dec 11 2019, 3:40 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
vlorentz moved T1359: Add sentry support in every swh running service from Backlog to in progress on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 11 2019, 3:40 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
vlorentz moved T2142: Document how to use Sentry with the docker dev environment from in progress to deployed on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 11 2019, 3:40 PM · Docker environment, Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring
vlorentz added a comment to T2142: Document how to use Sentry with the docker dev environment.

Resolved by D2424.

Dec 11 2019, 3:39 PM · Docker environment, Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring
ardumont updated the task description for T2120: Loaders: standalone task tests.
Dec 11 2019, 3:18 PM · Sprint 2019/12 (Monitor and Conquer)
ardumont updated the task description for T2120: Loaders: standalone task tests.
Dec 11 2019, 2:45 PM · Sprint 2019/12 (Monitor and Conquer)
olasd closed T2130: Scheduler monitoring: probe rabbitmq status as Resolved.

Deployed with rSPSITE6db1d7ffe

Dec 11 2019, 11:20 AM · Sprint 2019/12 (Monitor and Conquer)
douardda added a revision to T2118: Deposit: End to End monitoring: D2427: Make load-deposit and check-deposit URL argument absolute.
Dec 11 2019, 10:57 AM · Sprint 2019/12 (Monitor and Conquer)
ardumont added a comment to T2134: loader: Implement uniform loading CLI.

The good news is that it's not the big part of the task.
So, yes, I'll improve on the current implementation after I'm done deploying and migrating all the things ;)
The big part is the refactoring this implies.
I'm almost done with that (it's the deploy and migrate part I'm doing).

Dec 11 2019, 10:48 AM · Sprint 2019/12 (Monitor and Conquer)
douardda moved T2118: Deposit: End to End monitoring from Backlog to in progress on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 11 2019, 10:22 AM · Sprint 2019/12 (Monitor and Conquer)

Dec 10 2019

olasd added a comment to T2128: Monitor journal consumer lag.

Packaged and deployed the consumer group exporter on getty for both kafka clusters.

Dec 10 2019, 8:10 PM · Metrics/monitoring, Sprint 2019/12 (Monitor and Conquer)
ardumont updated the task description for T2120: Loaders: standalone task tests.
Dec 10 2019, 6:46 PM · Sprint 2019/12 (Monitor and Conquer)
anlambert added a comment to T2124: Save Code Now: monitoring of admin infra.

Yes, I agree; I ended up using statsd in D2419 after all. django-prometheus could be interesting to use, but it may be redundant with the upcoming sentry integration in swh-web.

Dec 10 2019, 11:26 AM · Sprint 2019/12 (Monitor and Conquer)
douardda added a comment to T2124: Save Code Now: monitoring of admin infra.

prometheus_client looks quite low-level to me. We already have statsd for this kind of purpose, which does not imply having to deal with one more dedicated http server or so...

Dec 10 2019, 11:12 AM · Sprint 2019/12 (Monitor and Conquer)
douardda added a comment to T2134: loader: Implement uniform loading CLI.

Sorry I'm commenting on this a bit late, but wouldn't it make more sense to have something like:

Dec 10 2019, 11:00 AM · Sprint 2019/12 (Monitor and Conquer)
ardumont updated the task description for T2120: Loaders: standalone task tests.
Dec 10 2019, 10:15 AM · Sprint 2019/12 (Monitor and Conquer)

Dec 9 2019

anlambert moved T2124: Save Code Now: monitoring of admin infra from Backlog to in progress on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 9 2019, 1:54 PM · Sprint 2019/12 (Monitor and Conquer)
douardda moved T2122: Lister monitoring: add statsd probe for each lister instance from in progress to Backlog on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 9 2019, 1:51 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz renamed T2142: Document how to use Sentry with the docker dev environment from Add sentry to the docker-dev environment to Document how to use Sentry with the docker dev environment.
Dec 9 2019, 12:43 PM · Docker environment, Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring
douardda added a comment to T2122: Lister monitoring: add statsd probe for each lister instance.

Okay I see no "simple" way of doing this with the current implementation of tasks using SWHTask.

Dec 9 2019, 12:32 PM · Sprint 2019/12 (Monitor and Conquer)
vlorentz added a project to T2142: Document how to use Sentry with the docker dev environment: Docker environment.
Dec 9 2019, 10:33 AM · Docker environment, Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring
vlorentz moved T2142: Document how to use Sentry with the docker dev environment from Backlog to in progress on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 9 2019, 10:28 AM · Docker environment, Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring
vlorentz moved T1358: Setup a sentry service from Backlog to in progress on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 9 2019, 10:28 AM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
vlorentz changed the status of T2142: Document how to use Sentry with the docker dev environment from Open to Work in Progress.
Dec 9 2019, 10:28 AM · Docker environment, Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring
vlorentz changed the status of T2142: Document how to use Sentry with the docker dev environment, a subtask of T1359: Add sentry support in every swh running service, from Open to Work in Progress.
Dec 9 2019, 10:28 AM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
vlorentz triaged T2142: Document how to use Sentry with the docker dev environment as Normal priority.
Dec 9 2019, 10:27 AM · Docker environment, Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring

Dec 7 2019

olasd merged T1361: Push rabbitmq metrics to Prometheus into T2130: Scheduler monitoring: probe rabbitmq status.
Dec 7 2019, 6:22 PM · Sprint 2019/12 (Monitor and Conquer)
olasd added a comment to T2128: Monitor journal consumer lag.

A quick test shows that https://github.com/braedon/prometheus-kafka-consumer-group-exporter does a decent job.

Dec 7 2019, 6:21 PM · Metrics/monitoring, Sprint 2019/12 (Monitor and Conquer)

Dec 6 2019

olasd changed the status of T1358: Setup a sentry service from Open to Work in Progress.

I think I've mostly coerced sentry, at url https://sentry.softwareheritage.org/, into working. I used the opportunity to start refactoring the way apache is handled in our puppet environment, as well as slowly migrating some vhosts to Let's Encrypt.

Dec 6 2019, 11:06 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
ardumont updated the task description for T2134: loader: Implement uniform loading CLI.
Dec 6 2019, 6:49 PM · Sprint 2019/12 (Monitor and Conquer)
anlambert added a comment to T2124: Save Code Now: monitoring of admin infra.

We could use [[ https://github.com/prometheus/client_python | prometheus_client ]] (packaged in debian) to generate and export Prometheus metrics for swh-web.

Dec 6 2019, 2:25 PM · Sprint 2019/12 (Monitor and Conquer)
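
To give an idea of what exporting Prometheus metrics from a web application amounts to, the sketch below hand-renders a gauge in the Prometheus text exposition format using only the standard library. In practice a client library such as prometheus_client would generate this output; the metric name, labels and values here are hypothetical and not taken from swh-web.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge family in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to a number.
    Label values are not escaped here; a real exporter must escape
    backslashes, double quotes and newlines.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in sorted(samples.items()):
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical example: number of save requests per status and visit type.
example = render_gauge(
    "save_requests",
    "Number of save requests (hypothetical metric)",
    {
        (("status", "accepted"), ("visit_type", "git")): 3,
        (("status", "pending"), ("visit_type", "git")): 1,
    },
)
```

Serving such text from an HTTP endpoint lets a Prometheus server scrape it at regular intervals.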
zack moved T2133: Scheduler listener/runner: add statsd probes from Backlog to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 6 2019, 12:30 PM · Metrics/monitoring, Scheduling utilities, Sprint 2019/12 (Monitor and Conquer)
douardda moved T2122: Lister monitoring: add statsd probe for each lister instance from Backlog to in progress on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 6 2019, 10:36 AM · Sprint 2019/12 (Monitor and Conquer)
douardda closed T2133: Scheduler listener/runner: add statsd probes as Resolved.

Closed by D2394

Dec 6 2019, 10:22 AM · Metrics/monitoring, Scheduling utilities, Sprint 2019/12 (Monitor and Conquer)
ardumont updated the task description for T2134: loader: Implement uniform loading CLI.
Dec 6 2019, 10:06 AM · Sprint 2019/12 (Monitor and Conquer)
ardumont updated the task description for T2134: loader: Implement uniform loading CLI.
Dec 6 2019, 10:04 AM · Sprint 2019/12 (Monitor and Conquer)
ardumont updated the task description for T2134: loader: Implement uniform loading CLI.
Dec 6 2019, 9:46 AM · Sprint 2019/12 (Monitor and Conquer)
ardumont updated the task description for T2120: Loaders: standalone task tests.
Dec 6 2019, 9:46 AM · Sprint 2019/12 (Monitor and Conquer)

Dec 4 2019

ardumont updated the task description for T2134: loader: Implement uniform loading CLI.
Dec 4 2019, 4:22 PM · Sprint 2019/12 (Monitor and Conquer)
zack moved T2119: Monitoring of workers from in progress to done on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 4 2019, 3:37 PM · Scheduling utilities, Sprint 2019/12 (Monitor and Conquer)
zack moved T1360: Install a sentry server from Backlog to deployed on the Sprint 2019/12 (Monitor and Conquer) board.
Dec 4 2019, 3:37 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
douardda closed T2119: Monitoring of workers as Resolved by committing rDSCHf206076232a8: celery: make SWHTask send start/end of execution statsd gauges with timestamps.
Dec 4 2019, 3:12 PM · Scheduling utilities, Sprint 2019/12 (Monitor and Conquer)
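
As an illustration of the idea in the commit title above (gauges carrying start/end timestamps), here is a minimal sketch that assumes only the plain statsd wire format (`name:value|g` sent over UDP). The metric names and helper functions are hypothetical and do not reproduce the actual SWHTask code.

```python
import socket
import time


def format_gauge(name, value):
    """Encode one statsd gauge datagram: b"<name>:<value>|g"."""
    return f"{name}:{value}|g".encode("ascii")


def run_with_gauges(task_name, func, send, clock=time.time):
    """Run func(), emitting a timestamp gauge before and after it.

    `send` is any callable taking bytes. Metric names are hypothetical.
    """
    send(format_gauge(f"swh_task_start_ts.{task_name}", clock()))
    try:
        return func()
    finally:
        # Emitted even if the task raises; a start without a matching end
        # then points at a task that is still running or was killed.
        send(format_gauge(f"swh_task_end_ts.{task_name}", clock()))


def udp_sender(host="127.0.0.1", port=8125):
    """Return a send(bytes) callable that fires datagrams at a statsd daemon."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    return lambda payload: sock.sendto(payload, (host, port))
```

Comparing the two gauges for a given task name shows whether it is currently running (start newer than end) and roughly how long the last run took.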
douardda added a revision to T2133: Scheduler listener/runner: add statsd probes: D2394: celery: add 2 statsd probes for the runner and listener.
Dec 4 2019, 10:58 AM · Metrics/monitoring, Scheduling utilities, Sprint 2019/12 (Monitor and Conquer)
douardda added a revision to T2119: Monitoring of workers: D2393: celery: make SWHTask send start/end of execution statsd gauges with timestamps.
Dec 4 2019, 10:30 AM · Scheduling utilities, Sprint 2019/12 (Monitor and Conquer)

Dec 3 2019

olasd closed T1360: Install a sentry server, a subtask of T1358: Setup a sentry service, as Resolved.
Dec 3 2019, 6:09 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
olasd closed T1360: Install a sentry server as Resolved.

The new virtual machine for sentry, [[ https://en.m.wikipedia.org/wiki/Riverside_Museum | riverside.internal.softwareheritage.org ]], has now been installed.

Dec 3 2019, 6:09 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
vlorentz lowered the priority of T1359: Add sentry support in every swh running service from High to Normal.
Dec 3 2019, 5:50 PM · Sprint 2019/12 (Monitor and Conquer), Metrics/monitoring, System administration
vlorentz lowered the priority of T2117: Save Code Now: End to End monitoring from High to Normal.
Dec 3 2019, 5:50 PM · System administration, Monitoring, Roadmap 2021
vlorentz lowered the priority of T2118: Deposit: End to End monitoring from High to Normal.
Dec 3 2019, 5:50 PM · Sprint 2019/12 (Monitor and Conquer)