Page MenuHomeSoftware Heritage

Make workers send task events only when required
ClosedPublic

Authored by ardumont on Oct 7 2021, 4:49 PM.

Details

Summary

Currently, that means only workers whose tasks are managed through the scheduler-runner
and scheduler listener:

  • save code now (swh-worker@loader_high_priority)
  • vault (swh-worker@vault_cooker)
  • indexer (swh-worker@indexer*)
  • lister (swh-worker@lister)

All other workers (swh-worker@loader_*) are actually scheduled through the next-gen
scheduler and no longer need to send those events. It's dealt with the scheduler journal
client. So it's currently extra work for no reason (rabbitmq, celery, scheduler).

The following commit deactivates such events for all other workers.

Related to D6408#166956
Related to T3458

Test Plan

bin/octo-diff on worker2 staging (this node has both workers with and without events):

% $SWH_PUPPET_ENVIRONMENT_HOME/bin/octocatalog-diff --octocatalog-diff-args --no-truncate-details --to staging worker2
...
*******************************************
  File[/etc/systemd/system/swh-worker@.service] =>
   parameters =>
     content =>
      @@ -1,2 +1,5 @@
      +# Managed by puppet - modifications will be overwritten
      +# In defined class profile::swh::deploy::worker::base
      +
       [Unit]
       Description=Software Heritage Worker (%i)
      @@ -8,4 +11,6 @@
      _
       Type=simple
      +# Following environment variables can be overriden in the respective
      +# swh-worker@<service>.service.d/parameters.conf
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/%i.yml
       Environment=SWH_LOG_TARGET=journal
      @@ -14,5 +19,6 @@
       Environment=LOGLEVEL=info
       Environment=CELERY_HOSTNAME=worker2.internal.staging.swh.network
      -ExecStart=/usr/bin/python3 -m celery worker -n %i@${CELERY_HOSTNAME} --app=swh.scheduler.celery_backend.config.app --pool=prefork --events --concurrency=${CONCURRENCY} --maxtasksperchild=${MAX_TASKS_PER_CHILD} -Ofair --loglevel=${LOGLEVEL} --without-gossip --without-mingle --without-heartbeat
      +Environment=CELERY_WORKER_EXTRA_ARGS=
      +ExecStart=/usr/bin/python3 -m celery worker -n %i@${CELERY_HOSTNAME} --app=swh.scheduler.celery_backend.config.app --pool=prefork ${CELERY_WORKER_EXTRA_ARGS} --concurrency=${CONCURRENCY} --maxtasksperchild=${MAX_TASKS_PER_CHILD} -Ofair --loglevel=${LOGLEVEL} --without-gossip --without-mingle --without-heartbeat
      _
       KillMode=process
*******************************************
  File[/etc/systemd/system/swh-worker@lister.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -11,2 +11,3 @@
      _
      _
      +Environment=CELERY_WORKER_EXTRA_ARGS=--events
*******************************************
  File[/etc/systemd/system/swh-worker@loader_high_priority.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -8,2 +8,3 @@
      _
      _
      +Environment=CELERY_WORKER_EXTRA_ARGS=--events
*******************************************
  File[/etc/systemd/system/swh-worker@vault_cooker.service.d/parameters.conf] =>
   parameters =>
     content =>
      @@ -11,2 +11,3 @@
      _
      _
      +Environment=CELERY_WORKER_EXTRA_ARGS=--events
*******************************************
  Profile::Swh::Deploy::Worker::Instance[checker_deposit] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[lister] =>
   parameters =>
     send_task_events =>
      + true
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_archive] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_cran] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_debian] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_deposit] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_git] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_high_priority] =>
   parameters =>
     send_task_events =>
      + true
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_mercurial] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_nixguix] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_npm] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_opam] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_pypi] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[loader_svn] =>
   parameters =>
     send_task_events =>
      + false
*******************************************
  Profile::Swh::Deploy::Worker::Instance[vault_cooker] =>
   parameters =>
     send_task_events =>
      + true
*******************************************
  Systemd::Dropin_file[swh-worker@lister/parameters.conf] =>
   parameters =>
     content =>
      @@ -11,2 +11,3 @@
      _
      _
      +Environment=CELERY_WORKER_EXTRA_ARGS=--events
*******************************************
  Systemd::Dropin_file[swh-worker@loader_high_priority/parameters.conf] =>
   parameters =>
     content =>
      @@ -8,2 +8,3 @@
      _
      _
      +Environment=CELERY_WORKER_EXTRA_ARGS=--events
*******************************************
  Systemd::Dropin_file[swh-worker@vault_cooker/parameters.conf] =>
   parameters =>
     content =>
      @@ -11,2 +11,3 @@
      _
      _
      +Environment=CELERY_WORKER_EXTRA_ARGS=--events
*******************************************
  Systemd::Unit_file[swh-worker@.service] =>
   parameters =>
     content =>
      @@ -1,2 +1,5 @@
      +# Managed by puppet - modifications will be overwritten
      +# In defined class profile::swh::deploy::worker::base
      +
       [Unit]
       Description=Software Heritage Worker (%i)
      @@ -8,4 +11,6 @@
      _
       Type=simple
      +# Following environment variables can be overriden in the respective
      +# swh-worker@<service>.service.d/parameters.conf
       Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/%i.yml
       Environment=SWH_LOG_TARGET=journal
      @@ -14,5 +19,6 @@
       Environment=LOGLEVEL=info
       Environment=CELERY_HOSTNAME=worker2.internal.staging.swh.network
      -ExecStart=/usr/bin/python3 -m celery worker -n %i@${CELERY_HOSTNAME} --app=swh.scheduler.celery_backend.config.app --pool=prefork --events --concurrency=${CONCURRENCY} --maxtasksperchild=${MAX_TASKS_PER_CHILD} -Ofair --loglevel=${LOGLEVEL} --without-gossip --without-mingle --without-heartbeat
      +Environment=CELERY_WORKER_EXTRA_ARGS=
      +ExecStart=/usr/bin/python3 -m celery worker -n %i@${CELERY_HOSTNAME} --app=swh.scheduler.celery_backend.config.app --pool=prefork ${CELERY_WORKER_EXTRA_ARGS} --concurrency=${CONCURRENCY} --maxtasksperchild=${MAX_TASKS_PER_CHILD} -Ofair --loglevel=${LOGLEVEL} --without-gossip --without-mingle --without-heartbeat
      _
       KillMode=process
*******************************************
*** End octocatalog-diff on worker2.internal.staging.swh.network

Diff Detail

Repository
rSPSITE puppet-swh-site
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

ardumont created this revision.
ardumont retitled this revision from Only activate worker tasks events for the relevant workers to wip: Only activate worker tasks events for the relevant workers.Oct 7 2021, 4:50 PM
site-modules/profile/manifests/swh/deploy/worker/instance.pp
19

That currently does not work since the $send_task_events is not propagated in that base profile ^.

ardumont edited the test plan for this revision. (Show Details)
ardumont retitled this revision from wip: Only activate worker tasks events for the relevant workers to Only activate worker task events on worker requiring it.
ardumont edited the summary of this revision. (Show Details)
ardumont edited the test plan for this revision. (Show Details)

Align diff description and the actual code

ardumont edited the test plan for this revision. (Show Details)
ardumont edited the summary of this revision. (Show Details)
ardumont edited the test plan for this revision. (Show Details)

Align code and diff

ardumont retitled this revision from Only activate worker task events on worker requiring it to wip: Only activate worker task events on worker requiring it.Oct 7 2021, 6:19 PM
ardumont retitled this revision from wip: Only activate worker task events on worker requiring it to Only activate worker task events on worker requiring it.Oct 7 2021, 6:23 PM
ardumont edited the test plan for this revision. (Show Details)
18:25 <+ardumont> the test plan is a bit huge due to the octo-diff
18:27 <+ardumont> (i have still some doubt about whether i need to adapt the scheduler code a bit like D6405, but with `if self.send_events:` instead)
18:27 -- Notice(swhbot): D6405 (author: ardumont, Abandoned) on swh-scheduler: Respect task configuration to allow ignoring task result event <https://forge.softwareheritage.org/D6405>
18:28 <+olasd> you shouldn't, no. self.events should just become a noop
18:28 <+ardumont> (well, i mean with the right predicate)
18:28 <+ardumont> ack, thx
18:29 <+olasd> as for the diff, this is going to break monitoring (the ping-restart script) and all the stuff that depends on swh-worker@xxx.service. I think a gentler approach would be using an environment variable to set the --events flag or not
18:29 <+olasd> inside the existing swh-worker@.service
18:30 <+olasd> e.g. having a CELERY_WORKER_EXTRA_ARGS= environment variable that would be empty by default
18:31 <+ardumont> yeah, the breaking part saddens me a bit
18:31 <+olasd> and set to --events for the worker instances that need it
18:31 <+ardumont> ah interesting
18:31 <+ardumont> sounds way simpler, thx
18:31 <+olasd> (and add that envvar to the command line)
18:31 <+ardumont> thx
18:31 <+ardumont> miam
18:31 <+ardumont> i'll have a stab at it tomorrow
18:31 <+olasd> sorry I'm not writing that to the diff, my mouse is behaving and I'm about to head back home
18:31 <+olasd> feel free to c/p it :P
18:32 <+ardumont> yeah, thx, it's all good

for tomorrow ^

ardumont retitled this revision from Only activate worker task events on worker requiring it to Make workers send task events only when required.Oct 8 2021, 10:50 AM
ardumont edited the summary of this revision. (Show Details)
ardumont edited the test plan for this revision. (Show Details)
ardumont edited the summary of this revision. (Show Details)
ardumont edited the test plan for this revision. (Show Details)

Adapt according to last discussion ^

site-modules/profile/manifests/swh/deploy/worker/instance.pp
19

for older version of the diff (no longer relevant now)

olasd added a subscriber: olasd.

Thanks!

This revision is now accepted and ready to land.Oct 8 2021, 10:56 AM
ardumont edited the summary of this revision. (Show Details)

Rebase