Jan 8 2023

gitlab-migration closed T3587: check swh sentry instance for eventual sensitive info leakage as Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:23 PM · Sentry
gitlab-migration closed T2163: Hook up sentry to centralized authentication as Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:22 PM · Sentry
gitlab-migration closed T2161: Setup repository integration for sentry as Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 10:22 PM · Sentry
gitlab-migration changed the status of T4427: Deal with recurring Sentry reports of temporary server shutdowns from Resolved to Migrated.

This task has been migrated to GitLab.

Jan 8 2023, 4:37 PM · Core & foundations, Sentry

Oct 19 2022

gitlab-migration changed the status of T4497: [sentry] Out of disk space from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:08 PM · Sentry, System administration
gitlab-migration changed the status of T3920: Errors from vault workers are not logged to Sentry from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:05 PM · System administration, Vault, Sentry
gitlab-migration changed the status of T3619: Enable Sentry for swh-graph from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:04 PM · System administration, Sentry
gitlab-migration changed the status of T3466: Enable Sentry for the deposit server from Invalid to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:03 PM · System administration, Sentry, SWORD deposit
gitlab-migration changed the status of T3015: Sentry should have two different projects for swh-indexer and swh-indexer-storage from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:01 PM · System administration, Sentry
gitlab-migration changed the status of T2910: Sentry: Increase disk space, a subtask of T2899: Sentry doesn't react to new errors, from Resolved to Migrated.
Oct 19 2022, 6:00 PM · Sentry, System administration
gitlab-migration changed the status of T2910: Sentry: Increase disk space from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:00 PM · Sentry, System administration
gitlab-migration changed the status of T2899: Sentry doesn't react to new errors from Resolved to Migrated.

This task has been migrated to GitLab.

Oct 19 2022, 6:00 PM · Sentry, System administration
gitlab-migration closed T2162: Setup a centralized authentication service, a subtask of T2163: Hook up sentry to centralized authentication, as Migrated.
Oct 19 2022, 5:56 PM · Sentry

Oct 4 2022

vsellier closed T4497: [sentry] Out of disk space as Resolved.

Closing, as there have been no alerts for almost a month.

Oct 4 2022, 6:15 PM · Sentry, System administration

Sep 15 2022

vlorentz closed T4427: Deal with recurring Sentry reports of temporary server shutdowns as Resolved.
Sep 15 2022, 1:56 PM · Core & foundations, Sentry

Sep 6 2022

vlorentz added revisions to T4497: [sentry] Out of disk space: D8402: conftest: Refactor GraphServerProcess to be more flexible, D8403: Kill GraphServerProcess on test teardown, D8404: Return HTTP 503 on AioRpcError.
Sep 6 2022, 5:31 PM · Sentry, System administration
vsellier added a comment to T4497: [sentry] Out of disk space.

The root cause is a swh-graph experiment that generated a lot of gRPC errors, each of which is huge.

Sep 6 2022, 12:41 PM · Sentry, System administration
vsellier added a comment to T4497: [sentry] Out of disk space.

No consumer seems to have a big lag on these topics, so it should be possible to reduce the lag to unblock the server and look at which service is sending the events:

root@riverside:/var/lib/sentry-onpremise# docker-compose-1.29.2 run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --list | tr -d '\r' | xargs -t -n1 docker-compose-1.29.2 run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092  --describe --group | grep -e GROUP -e " events "
Creating sentry-self-hosted_kafka_run ... done
docker-compose-1.29.2 run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --describe --group snuba-consumers
Creating sentry-self-hosted_kafka_run ... done
GROUP           TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                  HOST            CLIENT-ID
snuba-consumers events          0          82585390        82587094        1704            -                                            -               -
docker-compose-1.29.2 run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --describe --group snuba-post-processor:sync:6fa9928e1d6911edac290242ac170014
Creating sentry-self-hosted_kafka_run ... done
GROUP                                                      TOPIC            PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                  HOST            CLIENT-ID
docker-compose-1.29.2 run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --describe --group ingest-consumer
Creating sentry-self-hosted_kafka_run ... done
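
The lag figures in the output above can also be totalled per group and topic programmatically. This is a minimal sketch, not part of the original investigation: it assumes the whitespace-separated column layout shown above (GROUP, TOPIC, PARTITION, CURRENT-OFFSET, LOG-END-OFFSET, LAG), and the function name `lag_by_group_topic` is hypothetical.

```python
from collections import defaultdict

# Sample taken from the `kafka-consumer-groups --describe` output above.
SAMPLE = """\
Creating sentry-self-hosted_kafka_run ... done
GROUP           TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                  HOST            CLIENT-ID
snuba-consumers events          0          82585390        82587094        1704            -                                            -               -
"""


def lag_by_group_topic(describe_output):
    """Sum the LAG column per (group, topic) pair.

    Assumes the six leading columns are laid out as in the sample above
    and that group and topic names contain no whitespace.
    """
    totals = defaultdict(int)
    for line in describe_output.splitlines():
        fields = line.split()
        # Skip docker-compose noise, blank lines and the header row.
        if len(fields) < 6 or fields[0] == "GROUP":
            continue
        group, topic, lag = fields[0], fields[1], fields[5]
        # Skip rows whose sixth field is not a number (for example "-",
        # or the command lines echoed by `xargs -t`).
        if not lag.isdigit():
            continue
        totals[(group, topic)] += int(lag)
    return dict(totals)


print(lag_by_group_topic(SAMPLE))
```

On the sample, the only data row is the `events` partition of `snuba-consumers`, whose lag of 1704 equals LOG-END-OFFSET minus CURRENT-OFFSET (82587094 - 82585390).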
Sep 6 2022, 11:18 AM · Sentry, System administration
vsellier added a comment to T4497: [sentry] Out of disk space.

The biggest topics are:

root@riverside:/var/lib/docker/volumes/sentry-kafka/_data# du -sch * | sort -h | tail -n 5
31M	snuba-commit-log-0
291M	outcomes-0
30G	ingest-events-0
43G	events-0
73G	total
Sep 6 2022, 11:11 AM · Sentry, System administration
vsellier changed the status of T4497: [sentry] Out of disk space from Open to Work in Progress.
Sep 6 2022, 11:09 AM · Sentry, System administration

Aug 24 2022

vlorentz removed a subtask for T4427: Deal with recurring Sentry reports of temporary server shutdowns: Restricted Maniphest Task.
Aug 24 2022, 12:35 PM · Core & foundations, Sentry

Aug 17 2022

vlorentz added a subtask for T4427: Deal with recurring Sentry reports of temporary server shutdowns: Restricted Maniphest Task.
Aug 17 2022, 3:50 PM · Core & foundations, Sentry

Aug 9 2022

vlorentz added revisions to T4427: Deal with recurring Sentry reports of temporary server shutdowns: D8223: Convert psycopg2 errors to TransientRemoteException instead of RemoteException, D8224: retry: Add constant 10s wait when retrying transient exceptions.
Aug 9 2022, 4:20 PM · Core & foundations, Sentry
vlorentz added revisions to T4427: Deal with recurring Sentry reports of temporary server shutdowns: D8219: Remove support for deprecated exception format, D8220: Add tests for RPC server's exception handling, D8221: Make the RPC client raise a specific exception class on 503.
Aug 9 2022, 11:18 AM · Core & foundations, Sentry
vlorentz claimed T4427: Deal with recurring Sentry reports of temporary server shutdowns.
Aug 9 2022, 11:17 AM · Core & foundations, Sentry
vlorentz triaged T4427: Deal with recurring Sentry reports of temporary server shutdowns as Normal priority.
Aug 9 2022, 11:17 AM · Core & foundations, Sentry

Feb 9 2022

ardumont changed the status of T3920: Errors from vault workers are not logged to Sentry from Invalid to Resolved.
Feb 9 2022, 3:03 PM · System administration, Vault, Sentry
ardumont closed T3920: Errors from vault workers are not logged to Sentry as Invalid.

In the end, it seems everything is already OK (another cooking task issue was reported [1]), so closing this.

Feb 9 2022, 3:03 PM · System administration, Vault, Sentry
ardumont added a comment to T3920: Errors from vault workers are not logged to Sentry.

I don't know how to trigger an error in the vault, currently; you'd have to change the code to manually do that :|

Feb 9 2022, 10:24 AM · System administration, Vault, Sentry
ardumont added a comment to T3920: Errors from vault workers are not logged to Sentry.

Also, that's a cooker worker issue in Sentry from about 18 hours ago (as of the time of this comment).

Feb 9 2022, 10:03 AM · System administration, Vault, Sentry
vlorentz added a comment to T3920: Errors from vault workers are not logged to Sentry.

I'll let you trigger some cooking and report your findings here.

Feb 9 2022, 10:02 AM · System administration, Vault, Sentry
ardumont added a comment to T3920: Errors from vault workers are not logged to Sentry.

Something is bothering me: aren't there already some catch-all exception handlers somewhere in the vault source code?

Feb 9 2022, 9:51 AM · System administration, Vault, Sentry

Feb 8 2022

vlorentz added a comment to T3920: Errors from vault workers are not logged to Sentry.

I wonder if replacing @worker_init.connect with @worker_process_init.connect at https://forge.softwareheritage.org/source/swh-scheduler/browse/master/swh/scheduler/celery_backend/config.py$157 would work.

Feb 8 2022, 6:43 PM · System administration, Vault, Sentry
vlorentz added a comment to T3920: Errors from vault workers are not logged to Sentry.

So I'm guessing Celery is swallowing the logs somehow, so Sentry doesn't see them.

Feb 8 2022, 6:37 PM · System administration, Vault, Sentry
ardumont added a comment to T3920: Errors from vault workers are not logged to Sentry.

So I don't currently know what's wrong (if anything is).

Feb 8 2022, 6:17 PM · System administration, Vault, Sentry
ardumont added a comment to T3920: Errors from vault workers are not logged to Sentry.

So I was wrong; it is correctly set [1].
And there are Sentry issues about workers [2].

Feb 8 2022, 6:16 PM · System administration, Vault, Sentry
ardumont moved T3920: Errors from vault workers are not logged to Sentry from in-progress to code-review/await-feedback/pause on the System administration board.
Feb 8 2022, 5:15 PM · System administration, Vault, Sentry
ardumont changed the status of T3920: Errors from vault workers are not logged to Sentry from Open to Work in Progress.
Feb 8 2022, 5:15 PM · System administration, Vault, Sentry
ardumont moved T3920: Errors from vault workers are not logged to Sentry from Backlog to Weekly backlog on the System administration board.
Feb 8 2022, 5:15 PM · System administration, Vault, Sentry
ardumont added a revision to T3920: Errors from vault workers are not logged to Sentry: D7123: Configure vault cookers to send their issue to sentry.
Feb 8 2022, 4:18 PM · System administration, Vault, Sentry
ardumont added a comment to T3920: Errors from vault workers are not logged to Sentry.

Yes, I confirm (from the #swh-sysadm discussion).

Feb 8 2022, 4:13 PM · System administration, Vault, Sentry
anlambert updated subscribers of T3920: Errors from vault workers are not logged to Sentry.

Looking at puppet configuration, my guess is that the sentry_dsn is not set for the vault cookers.

Feb 8 2022, 3:52 PM · System administration, Vault, Sentry
vlorentz triaged T3920: Errors from vault workers are not logged to Sentry as High priority.
Feb 8 2022, 2:55 PM · System administration, Vault, Sentry

Oct 15 2021

ardumont closed T3619: Enable Sentry for swh-graph as Resolved.
Oct 15 2021, 11:01 AM · System administration, Sentry
ardumont moved T3619: Enable Sentry for swh-graph from code-review/await-feedback/pause to deployed/landed/monitoring on the System administration board.
Oct 15 2021, 10:55 AM · System administration, Sentry
ardumont moved T3619: Enable Sentry for swh-graph from in-progress to code-review/await-feedback/pause on the System administration board.
Oct 15 2021, 10:42 AM · System administration, Sentry
ardumont changed the status of T3619: Enable Sentry for swh-graph from Open to Work in Progress.
Oct 15 2021, 10:42 AM · System administration, Sentry
ardumont added a revision to T3619: Enable Sentry for swh-graph: D6480: Activate sentry for swh.graph.
Oct 15 2021, 10:33 AM · System administration, Sentry
ardumont moved T3619: Enable Sentry for swh-graph from Backlog to Weekly backlog on the System administration board.
Oct 15 2021, 10:29 AM · System administration, Sentry

Sep 29 2021

ardumont added a comment to T3619: Enable Sentry for swh-graph.

In the meantime, logs are available in the dedicated dashboard.

Sep 29 2021, 3:37 PM · System administration, Sentry
vlorentz triaged T3619: Enable Sentry for swh-graph as High priority.
Sep 29 2021, 3:05 PM · System administration, Sentry

Sep 17 2021

ardumont triaged T3587: check swh sentry instance for eventual sensitive info leakage as Normal priority.
Sep 17 2021, 3:01 PM · Sentry

Sep 8 2021

vlorentz changed the status of T3466: Enable Sentry for the deposit server from Resolved to Invalid.
Sep 8 2021, 4:38 PM · System administration, Sentry, SWORD deposit
ardumont changed the status of T3466: Enable Sentry for the deposit server from Invalid to Resolved.
Sep 8 2021, 4:31 PM · System administration, Sentry, SWORD deposit

Sep 3 2021

moranegg moved T3466: Enable Sentry for the deposit server from Backlog to Landed/Tests/Validations (staging) on the SWORD deposit board.
Sep 3 2021, 11:36 AM · System administration, Sentry, SWORD deposit

Aug 5 2021

vlorentz closed T3466: Enable Sentry for the deposit server as Invalid.

uh, indeed

Aug 5 2021, 2:57 PM · System administration, Sentry, SWORD deposit
ardumont added a comment to T3466: Enable Sentry for the deposit server.

Does it?

Aug 5 2021, 2:53 PM · System administration, Sentry, SWORD deposit
vlorentz triaged T3466: Enable Sentry for the deposit server as Normal priority.
Aug 5 2021, 2:48 PM · System administration, Sentry, SWORD deposit

Jul 29 2021

ardumont moved T2910: Sentry: Increase disk space from deployed/landed/monitoring to done on the System administration board.
Jul 29 2021, 1:22 PM · Sentry, System administration

Feb 18 2021

vsellier changed the status of T3015: Sentry should have two different projects for swh-indexer and swh-indexer-storage from Invalid to Resolved.
Feb 18 2021, 9:28 AM · System administration, Sentry

Feb 11 2021

vlorentz closed T3015: Sentry should have two different projects for swh-indexer and swh-indexer-storage as Invalid.

Well my concern was about having different versions running at the same time, but Sentry is able to detect the version so that's not it.

Feb 11 2021, 5:34 PM · System administration, Sentry
vsellier claimed T3015: Sentry should have two different projects for swh-indexer and swh-indexer-storage.
Feb 11 2021, 3:34 PM · System administration, Sentry
vsellier added a comment to T3015: Sentry should have two different projects for swh-indexer and swh-indexer-storage.

I'm not sure I understand the real problem here.
As the indexer and indexer-storage are in the same source repository, their versions should match or increase in parallel. Sentry should be able to deal with it like any other version upgrade.

Feb 11 2021, 3:23 PM · System administration, Sentry
vsellier changed the status of T3015: Sentry should have two different projects for swh-indexer and swh-indexer-storage from Open to Work in Progress.
Feb 11 2021, 3:05 PM · System administration, Sentry

Feb 8 2021

vsellier moved T3015: Sentry should have two different projects for swh-indexer and swh-indexer-storage from Backlog to Weekly backlog on the System administration board.
Feb 8 2021, 12:50 PM · System administration, Sentry

Feb 2 2021

vlorentz edited projects for T3015: Sentry should have two different projects for swh-indexer and swh-indexer-storage, added: System administration; removed System administrators.
Feb 2 2021, 9:49 AM · System administration, Sentry
vlorentz triaged T3015: Sentry should have two different projects for swh-indexer and swh-indexer-storage as Normal priority.
Feb 2 2021, 9:49 AM · System administration, Sentry

Jan 6 2021

ardumont moved T2910: Sentry: Increase disk space from Backlog to deployed/landed/monitoring on the System administration board.
Jan 6 2021, 3:43 PM · Sentry, System administration

Dec 21 2020

vsellier closed T2910: Sentry: Increase disk space, a subtask of T2899: Sentry doesn't react to new errors, as Resolved.
Dec 21 2020, 7:05 PM · Sentry, System administration
vsellier closed T2910: Sentry: Increase disk space as Resolved.
Dec 21 2020, 7:05 PM · Sentry, System administration
vsellier added a comment to T2910: Sentry: Increase disk space.
  • before :
root@riverside:~# pvscan
  PV /dev/sda1   VG riverside-vg    lvm2 [<63.98 GiB / 0    free]
  Total: 1 [<63.98 GiB] / in use: 1 [<63.98 GiB] / in no VG: 0 [0   ]
root@riverside:~# df -h /
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/riverside--vg-root   60G   56G  1.4G  98% /

(2% free: some cleanup seems to have occurred since the creation of the task :) )

  • disk extended by 16 GiB on Proxmox
(extract of dmesg of riverside)
[350521.461023] sd 2:0:0:0: Capacity data has changed
[350521.461339] sd 2:0:0:0: [sda] 167772160 512-byte logical blocks: (85.9 GB/80.0 GiB)
[350521.461484] sda: detected capacity change from 68719476736 to 85899345920
  • partition resized :
root@riverside:~# parted /dev/sda
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print free                                                       
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sda: 85.9GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
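
As a cross-check of the dmesg extract above, the reported sector count and byte capacities can be converted to GiB. This is a small arithmetic sketch, not part of the original comment; the helper name `to_gib` is hypothetical and the figures are copied from the kernel messages.

```python
SECTOR_BYTES = 512           # logical block size reported by dmesg
GIB = 1024 ** 3              # bytes per GiB

sectors_after = 167_772_160  # "[sda] 167772160 512-byte logical blocks"
bytes_before = 68_719_476_736
bytes_after = sectors_after * SECTOR_BYTES


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


print(bytes_after)                         # capacity after the resize, in bytes
print(to_gib(bytes_before))                # size before, in GiB
print(to_gib(bytes_after))                 # size after, in GiB
print(to_gib(bytes_after - bytes_before))  # growth, in GiB
```

The sector count times 512 gives 85899345920 bytes, matching the "detected capacity change" line: the disk went from 64 GiB to 80 GiB, a 16 GiB extension (85.9 GB in decimal units, as parted prints it).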
Dec 21 2020, 7:05 PM · Sentry, System administration
vsellier changed the status of T2910: Sentry: Increase disk space from Open to Work in Progress.
Dec 21 2020, 6:58 PM · Sentry, System administration

Dec 19 2020

olasd added a comment to T2899: Sentry doesn't react to new errors.

I confirm that sentry is still happily processing events, more than 12 hours after the last upgrade :)

Dec 19 2020, 10:55 AM · Sentry, System administration

Dec 18 2020

ardumont added a comment to T2899: Sentry doesn't react to new errors.

\o/

Dec 18 2020, 5:16 PM · Sentry, System administration
olasd closed T2899: Sentry doesn't react to new errors as Resolved.

After pushing through the updates up to 20.12.1, it seems the events are being processed correctly. I'm somewhat confident that the updated celery in the sentry image will not exhibit the same processing bug, but I'll keep an eye on the logs for a bit...

Dec 18 2020, 4:50 PM · Sentry, System administration
olasd added a comment to T2899: Sentry doesn't react to new errors.

To look at the state of the celery queues, from https://docs.celeryproject.org/en/stable/userguide/monitoring.html#monitoring-redis-queues:

Dec 18 2020, 2:36 PM · Sentry, System administration
olasd added a comment to T2899: Sentry doesn't react to new errors.

Looks like the events are/were getting stuck at the celery stage.

Dec 18 2020, 2:30 PM · Sentry, System administration
olasd added a comment to T2899: Sentry doesn't react to new errors.

Thanks for the thorough investigation so far!

Dec 18 2020, 12:46 PM · Sentry, System administration
vsellier added a comment to T2899: Sentry doesn't react to new errors.

To rule out another possible root cause, we ran a test in a temporary project with the latest version of the Python library; it doesn't work either.

Dec 18 2020, 10:52 AM · Sentry, System administration

Dec 17 2020

vsellier added a comment to T2899: Sentry doesn't react to new errors.

We followed the event's path through the consumer code without finding anything suspicious.
As a last resort, we fully rebooted the VM but, as expected, it changed nothing at all.

Dec 17 2020, 5:37 PM · Sentry, System administration
vsellier updated subscribers of T2899: Sentry doesn't react to new errors.

@olasd, if you have some details of the version upgrades you performed yesterday, they could help with the diagnosis.

Dec 17 2020, 3:22 PM · Sentry, System administration
vsellier changed the status of T2899: Sentry doesn't react to new errors from Open to Work in Progress.
Dec 17 2020, 2:59 PM · Sentry, System administration

Jan 16 2020

olasd changed the status of T2162: Setup a centralized authentication service, a subtask of T2163: Hook up sentry to centralized authentication, from Open to Work in Progress.
Jan 16 2020, 7:30 PM · Sentry

Dec 19 2019

olasd added a subtask for T2163: Hook up sentry to centralized authentication: T2162: Setup a centralized authentication service.
Dec 19 2019, 10:28 AM · Sentry
olasd removed a parent task for T2163: Hook up sentry to centralized authentication: T2162: Setup a centralized authentication service.
Dec 19 2019, 10:28 AM · Sentry
olasd triaged T2163: Hook up sentry to centralized authentication as Normal priority.
Dec 19 2019, 10:28 AM · Sentry
olasd triaged T2161: Setup repository integration for sentry as Normal priority.
Dec 19 2019, 10:24 AM · Sentry
olasd created Sentry.
Dec 19 2019, 10:23 AM