Deploy new origin intrinsic metadata journal client indexer
Closed, Public

Authored by ardumont on May 31 2022, 4:11 PM.

Details

Summary

Plan:

  • Push this commit only in staging for now (when accepted)
  • scheduler0.staging: Stop current journal client service
  • Run puppet on scheduler0.staging to redefine the journal client service (as changed in this diff)
  • Run puppet on staging workers to stop the old worker service (and declare the swh-indexer-journal-client service on workers)
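
The ordering in the plan above can be sketched as a small dry-run script. This is only an illustration, not the procedure that was actually run: the helper names (run, scheduler_step, worker_step) are hypothetical, and it assumes puppet is applied by hand with `puppet agent --test` as root on each host. The service name is the one shown in the octo-diff.

```shell
#!/bin/sh
# Sketch of the staging rollout order (assumption: run as root on each host).
# DRY_RUN=1 (the default) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# On scheduler0.staging: stop the journal client, then let puppet drop it.
scheduler_step() {
    run systemctl stop swh-indexer-journal-client
    run puppet agent --test
}

# On each staging worker: puppet removes the old worker service and
# declares the new journal client service; then check that it is running.
worker_step() {
    run puppet agent --test
    run systemctl status swh-indexer-journal-client
}
```

Calling scheduler_step or worker_step with the default DRY_RUN=1 only lists the commands, which makes it easy to review the order before setting DRY_RUN=0.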

This also adapts the instance.pp profile to clean up missing parts.

Related to T4282

Test Plan

octo-diff:

  • workers: purge old swh-worker@origin_intrinsic_metadata service and configuration files + install new indexer journal client service [1]
  • scheduler0.staging: Drop old indexer journal client service [2]

[1]

$SWH_PUPPET_ENVIRONMENT_HOME/bin/octocatalog-diff --octocatalog-diff-args --no-truncate-details --to staging worker3.internal.staging.swh.network
Found host worker3.internal.staging.swh.network
WARN     -> Environment "staging-add-prometheus-metrics" contained non-word characters, correcting name to staging_add_prometheus_metrics
WARN     -> Environment "staging-bullseye-rabbitmq-plugin" contained non-word characters, correcting name to staging_bullseye_rabbitmq_plugin
WARN     -> Environment "staging-check-journal-client" contained non-word characters, correcting name to staging_check_journal_client
WARN     -> Environment "staging-check-journal-client-2nd-implementation" contained non-word characters, correcting name to staging_check_journal_client_2nd_implementation
WARN     -> Environment "staging-check-journal-client-first-implem" contained non-word characters, correcting name to staging_check_journal_client_first_implem
WARN     -> Environment "staging-pin" contained non-word characters, correcting name to staging_pin
Cloning into '/tmp/swh-ocd.EnSguLjO/environments/production/data/private'...
done.
Cloning into '/tmp/swh-ocd.EnSguLjO/environments/staging/data/private'...
done.
*** Running octocatalog-diff on host worker3.internal.staging.swh.network
I, [2022-06-01T11:15:26.983158 #372981]  INFO -- : Catalogs compiled for worker3.internal.staging.swh.network
I, [2022-06-01T11:15:27.429249 #372981]  INFO -- : Diffs computed for worker3.internal.staging.swh.network
diff origin/production/worker3.internal.staging.swh.network current/worker3.internal.staging.swh.network
*******************************************
- Concat_fragment[profile::cron::swh-worker-indexer_origin_intrinsic_metadata-autorestart]
*******************************************
+ File[/etc/softwareheritage/indexer/journal_client.yml] =>
   parameters =>
      "ensure": "present"
      "group": "swhstorage"
      "mode": "0640"
      "notify": "Service[swh-indexer-journal-client]"
      "owner": "root"
      "content": >>>
---
journal:
  brokers:
  - journal1.internal.staging.swh.network
  group_id: swh.indexer.journal_client
  prefix: swh.journal.objects
scheduler:
  cls: remote
  args:
    url: http://scheduler0.internal.staging.swh.network:5008/
<<<
*******************************************
  File[/etc/softwareheritage/indexer_origin_intrinsic_metadata.yml] =>
   parameters =>
     ensure =>
      - present
      + absent
*******************************************
+ File[/etc/softwareheritage/journal] =>
   parameters =>
      "ensure": "directory"
      "group": "swhworker"
      "mode": "0644"
      "owner": "swhworker"
*******************************************
+ File[/etc/systemd/system/swh-indexer-journal-client.service] =>
   parameters =>
      "ensure": "file"
      "group": "root"
      "mode": "0444"
      "notify": "Class[Systemd::Systemctl::Daemon_reload]"
      "owner": "root"
      "show_diff": true
      "content": >>>
# Indexer Journal Client unit file
# Managed by puppet class profile::swh::deploy::indexer_journal_client
# Changes will be overwritten

[Unit]
Description=Software Heritage Indexer Journal Client
After=network.target

[Service]
Environment=SWH_SENTRY_DSN=https://swh::deploy::indexer::sentry_token@sentry.softwareheritage.org/5
Environment=SWH_SENTRY_ENVIRONMENT=staging
Environment=SWH_MAIN_PACKAGE=swh.indexer
User=swhstorage
Group=swhstorage
Type=simple
ExecStart=/usr/bin/swh indexer --config-file /etc/softwareheritage/indexer/journal_client.yml journal-client indexer origin-intrinsic-metadata
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
<<<
*******************************************
  File[/etc/systemd/system/swh-worker@indexer_origin_intrinsic_metadata.service.d/parameters.conf] =>
   parameters =>
     ensure =>
      - file
      + absent
*******************************************
- File[/etc/systemd/system/swh-worker@indexer_origin_intrinsic_metadata.service.d]
*******************************************
  Package[python3-swh.indexer] =>
   parameters =>
     notify =>
      - ["Profile::Swh::Deploy::Worker::Instance[indexer_content_mimetype]", "Profile::Swh::Deploy::Worker::Instance[indexer_fossology_license]", "Profile::Swh::Deploy::Worker::Instance[indexer_origin_intrinsic_metadata]"]
      + ["Profile::Swh::Deploy::Worker::Instance[indexer_content_mimetype]", "Profile::Swh::Deploy::Worker::Instance[indexer_fossology_license]"]
*******************************************
+ Package[python3-swh.journal] =>
   parameters =>
      "ensure": "installed"
*******************************************
- Profile::Cron::D[swh-worker-indexer_origin_intrinsic_metadata-autorestart]
*******************************************
  Profile::Swh::Deploy::Worker::Instance[indexer_origin_intrinsic_metadata] =>
   parameters =>
     ensure =>
      - present
      + absent
     send_task_events =>
      - true
      + false
     sentry_name =>
      - indexer
      + indexer_origin_intrinsic_metadata
*******************************************
+ Service[swh-indexer-journal-client] =>
   parameters =>
      "enable": true
      "ensure": "running"
*******************************************
  Service[swh-worker@indexer_origin_intrinsic_metadata] =>
   parameters =>
     enable =>
      - true
     ensure =>
      + absent
*******************************************
  Systemd::Dropin_file[swh-worker@indexer_origin_intrinsic_metadata/parameters.conf] =>
   parameters =>
     content =>
      - # Managed by puppet - modifications will be overwritten
# In defined class profile::swh::deploy::worker::instance

[Service]
Environment=CONCURRENCY=1
Environment=MAX_TASKS_PER_CHILD=5
Environment=LOGLEVEL=info
Environment=SWH_SENTRY_DSN=https://swh::deploy::indexer::sentry_token@sentry.softwareheritage.org/5
Environment=SWH_SENTRY_ENVIRONMENT=staging
Environment=SWH_MAIN_PACKAGE=swh.indexer


Environment=SWH_WORKER_CLI_EXTRA_ARGS=--events

     ensure =>
      - present
      + absent
*******************************************
+ Systemd::Unit_file[swh-indexer-journal-client.service] =>
   parameters =>
      "ensure": "present"
      "group": "root"
      "mode": "0444"
      "notify": ["Service[swh-indexer-journal-client]"]
      "owner": "root"
      "path": "/etc/systemd/system"
      "show_diff": true
      "content": >>>
# Indexer Journal Client unit file
# Managed by puppet class profile::swh::deploy::indexer_journal_client
# Changes will be overwritten

[Unit]
Description=Software Heritage Indexer Journal Client
After=network.target

[Service]
Environment=SWH_SENTRY_DSN=https://swh::deploy::indexer::sentry_token@sentry.softwareheritage.org/5
Environment=SWH_SENTRY_ENVIRONMENT=staging
Environment=SWH_MAIN_PACKAGE=swh.indexer
User=swhstorage
Group=swhstorage
Type=simple
ExecStart=/usr/bin/swh indexer --config-file /etc/softwareheritage/indexer/journal_client.yml journal-client indexer origin-intrinsic-metadata
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
<<<
*******************************************
*** End octocatalog-diff on worker3.internal.staging.swh.network

[2]

$SWH_PUPPET_ENVIRONMENT_HOME/bin/octocatalog-diff --octocatalog-diff-args --no-truncate-details --to staging scheduler0.internal.staging.swh.network
Found host scheduler0.internal.staging.swh.network
WARN     -> Environment "staging-add-prometheus-metrics" contained non-word characters, correcting name to staging_add_prometheus_metrics
WARN     -> Environment "staging-bullseye-rabbitmq-plugin" contained non-word characters, correcting name to staging_bullseye_rabbitmq_plugin
WARN     -> Environment "staging-check-journal-client" contained non-word characters, correcting name to staging_check_journal_client
WARN     -> Environment "staging-check-journal-client-2nd-implementation" contained non-word characters, correcting name to staging_check_journal_client_2nd_implementation
WARN     -> Environment "staging-check-journal-client-first-implem" contained non-word characters, correcting name to staging_check_journal_client_first_implem
WARN     -> Environment "staging-pin" contained non-word characters, correcting name to staging_pin
Cloning into '/tmp/swh-ocd.pFC4u6sm/environments/production/data/private'...
done.
Cloning into '/tmp/swh-ocd.pFC4u6sm/environments/staging/data/private'...
done.
*** Running octocatalog-diff on host scheduler0.internal.staging.swh.network
I, [2022-06-01T11:19:07.109910 #378588]  INFO -- : Catalogs compiled for scheduler0.internal.staging.swh.network
I, [2022-06-01T11:19:07.514855 #378588]  INFO -- : Diffs computed for scheduler0.internal.staging.swh.network
diff origin/production/scheduler0.internal.staging.swh.network current/scheduler0.internal.staging.swh.network
*******************************************
- File[/etc/softwareheritage/indexer/journal_client.yml]
*******************************************
- File[/etc/softwareheritage/indexer]
*******************************************
- File[/etc/systemd/system/swh-indexer-journal-client.service]
*******************************************
- Package[python3-swh.indexer]
*******************************************
- Service[swh-indexer-journal-client]
*******************************************
- Systemd::Unit_file[swh-indexer-journal-client.service]
*******************************************
*** End octocatalog-diff on scheduler0.internal.staging.swh.network
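
To compare long outputs like the two above at a glance, the resource-level changes can be counted. The function below is a minimal sketch, assuming the plain-text layout shown here: added resources start with "+ ", removed ones with "- ", and changed ones with two spaces, each followed by a capitalized resource type and "[". The name summarize_diff is hypothetical, and colored or differently indented output would need a different pattern.

```shell
#!/bin/sh
# Count added, removed and changed resources in octocatalog-diff text output
# read from stdin. Only resource header lines are counted; parameter-level
# "+"/"-" lines are indented further and therefore ignored.
summarize_diff() {
    awk '
        /^\+ [A-Z][A-Za-z_:]*\[/ { added++ }
        /^- [A-Z][A-Za-z_:]*\[/  { removed++ }
        /^  [A-Z][A-Za-z_:]*\[/  { changed++ }
        END { printf "added=%d removed=%d changed=%d\n", added, removed, changed }
    '
}
```

For example, saving the output to a file and running `summarize_diff < worker3.diff` (a hypothetical file name) prints a single summary line.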

Diff Detail

Repository
rSPSITE puppet-swh-site
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

manifests/site.pp
45 ↗(On Diff #28560)

Temporarily to keep production untouched

site-modules/profile/manifests/swh/deploy/worker/indexer_origin_intrinsic_metadata.pp
6

maybe purge?

  • Is the scheduler section in 'swh::deploy::indexer_journal_client::config' still needed?

site-modules/profile/manifests/swh/deploy/worker/indexer_origin_intrinsic_metadata.pp
6

anything other than present or running will remove the config files,
but it seems the services will remain in place (cf. instance.pp content)

site-modules/profile/templates/swh/deploy/journal/swh-indexer-journal-client.service.erb
22

Won't it break the configuration in production during the transition phase? (I'm not sure I clearly understand what will be going on for the indexer_content_mimetype)

site-modules/profile/manifests/swh/deploy/worker/indexer_origin_intrinsic_metadata.pp
6

yes, possible.
I'm pretty sure I'll have to clean up dangling stuff after running this anyway, so I'm not sure it's worth bothering more.
Or I can try and improve instance.pp so it deals with the purge keyword (and then add the clean-up behavior).

What do you think?

site-modules/profile/templates/swh/deploy/journal/swh-indexer-journal-client.service.erb
22

yes, that's why:

  • i've commented the origin-intrinsic-metadata entry in the swh::instance entry
  • and in the plan i've mentioned, stop puppet on saatchi first ;)
vsellier added inline comments.
site-modules/profile/manifests/swh/deploy/worker/indexer_origin_intrinsic_metadata.pp
6

as we discussed in person, there is already a clean-up part in instance.pp, so it should be cheap to add the service clean-up there

This revision is now accepted and ready to land. Jun 1 2022, 10:57 AM

after rereading this together, it seems the worker activation is missing

This revision now requires changes to proceed. Jun 1 2022, 11:02 AM
ardumont edited the test plan for this revision.

Adapt according to discussion (description and test plan updated already)

Is the scheduler section in 'swh::deploy::indexer_journal_client::config' still needed?

yes, it's still needed, since we only adapted that profile and moved it to the indexer workers.

I'm not sure if the service should be stopped before being removed but otherwise it looks ok
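
If stopping first turns out to be needed, one option is to stop and disable the old unit by hand before puppet removes its files. This is a hedged sketch, not what the diff does: the helper names (run, retire_unit) are hypothetical, and the unit name is the old worker service listed in the octo-diff.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop and disable a unit before its files are removed, so that no process
# keeps running from a unit definition that no longer exists on disk.
retire_unit() {
    unit="$1"
    run systemctl stop "$unit"
    run systemctl disable "$unit"
}
```

For instance, `retire_unit swh-worker@indexer_origin_intrinsic_metadata` on a staging worker would list (or, with DRY_RUN=0, run) the stop and disable commands before the puppet run.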

This revision is now accepted and ready to land. Jun 1 2022, 11:40 AM