Deploy opam lister/loader to staging
Closed, Migrated

Description

Plan:

  • lister v0.15 (includes opam lister)
  • loader.core: Tag v0.23 which includes the opam loader
  • Register the opam listing and loading task types in the scheduler [1]
  • D6061: Add opam lister task to the worker configuration
  • Upgrade loader.core on the staging workers and restart the swh-worker@lister service
  • Trigger first listing [2]
  • Monitoring
  • D6062: Puppetize new swh-worker@loader_opam
  • Deploy new loader
  • Start loader service

[1]

swhscheduler@scheduler0:~$ dpkg -l python3-swh.loader.core
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                    Version               Architecture Description
+++-=======================-=====================-============-=================================
ii  python3-swh.loader.core 0.23.1-1~swh1~bpo10+1 all          Software Heritage Loader Core
swhscheduler@scheduler0:~$ swh scheduler --config-file /etc/softwareheritage/scheduler/backend.yml task-type register
...
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.opam
INFO:swh.scheduler.cli.task_type:Create task type list-opam in scheduler
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin loader.opam
INFO:swh.scheduler.cli.task_type:Create task type load-opam in scheduler
...
$ psql service=staging-swh-scheduler
16:49:33 swh-scheduler@db1:5432=> select * from task_type where type like '%opam%';
+-----------+-----------------------------------+----------------------------------------+------------------+--------------+--------------+----------------+------------------+-------------+-------------+
|   type    |            description            |              backend_name              | default_interval | min_interval | max_interval | backoff_factor | max_queue_length | num_retries | retry_delay |
+-----------+-----------------------------------+----------------------------------------+------------------+--------------+--------------+----------------+------------------+-------------+-------------+
| list-opam | Lister task for the Opam registry | swh.lister.opam.tasks.OpamListerTask   | 1 day            | 1 day        | 1 day        |              1 |           (null) |      (null) | (null)      |
| load-opam | Load Opam's artifacts             | swh.loader.package.opam.tasks.LoadOpam | 1 day            | 1 day        | 1 day        |              1 |           (null) |      (null) | (null)      |
+-----------+-----------------------------------+----------------------------------------+------------------+--------------+--------------+----------------+------------------+-------------+-------------+
(2 rows)

Time: 144.033 ms

[2]

swhscheduler@scheduler0:~$ /usr/bin/swh scheduler --config-file scheduler.yml task add list-opam url=https://opam.ocaml.org instance=opam.ocaml.org
Created 1 tasks

Task 24858921
  Next run: today (2021-08-05T15:20:39.703665+00:00)
  Interval: 1 day, 0:00:00
  Type: list-opam
  Policy: recurring
  Args:
  Keyword args:
    instance: 'opam.ocaml.org'
    url: 'https://opam.ocaml.org'

Event Timeline

ardumont triaged this task as Normal priority. (Jul 20 2021, 4:48 PM)
ardumont created this task.
ardumont updated the task description.
ardumont moved this task from Backlog to Weekly backlog on the System administration board.
ardumont updated the task description.

The listing ran successfully [1], and the newly listed origins were stored in the scheduler as expected [2].

[1]

Aug 05 15:25:51 worker2 python3[825703]: [2021-08-05 15:25:51,891: INFO/MainProcess] Received task: swh.lister.opam.tasks.OpamListerTask[1dc9bf55-d8db-4b38-96e9-b41ae50d56ae]
Aug 05 15:25:51 worker2 python3[830189]: [NOTE] Will configure from built-in defaults.
Aug 05 15:25:51 worker2 python3[830189]: Checking for available remotes: rsync and local, git, mercurial, darcs. Perfect!
Aug 05 15:25:52 worker2 python3[830189]: <><> Fetching repository information ><><><><><><><><><><><><><><><><><><><><><>
Aug 05 15:26:00 worker2 python3[830189]: [opam.ocaml.org] Initialised
Aug 05 15:27:38 worker2 python3[825710]: [2021-08-05 15:27:38,662: INFO/ForkPoolWorker-4] Task swh.lister.opam.tasks.OpamListerTask[1dc9bf55-d8db-4b38-96e9-b41ae50d56ae] succeeded in 106.7339002690278s: {'pages': 3459, 'origins': 3459}

[2]

17:40:34 swh-scheduler@db1:5432=> select * from listers where name='opam';
+--------------------------------------+------+----------------+------------------------------+---------------+------------------------------+
|                  id                  | name | instance_name  |           created            | current_state |           updated            |
+--------------------------------------+------+----------------+------------------------------+---------------+------------------------------+
| 51433b50-1ae7-4385-8b78-a14fc4798d40 | opam | opam.ocaml.org | 2021-08-05 15:20:45.16259+00 | {}            | 2021-08-05 15:20:45.16259+00 |
+--------------------------------------+------+----------------+------------------------------+---------------+------------------------------+
(1 row)

Time: 10.147 ms
17:40:27 swh-scheduler@db1:5432=> select count(*) from listed_origins lo inner join listers l on l.id=lo.lister_id where l.name='opam';
+-------+
| count |
+-------+
|  3459 |
+-------+
(1 row)

Time: 2427.849 ms (00:02.428)

Scheduling those origins for loading [1] went less well [2].
I may have missed something; I am investigating what exactly.

[1]

swhscheduler@scheduler0:~$ swh scheduler --config-file /etc/softwareheritage/scheduler/backend.yml origin schedule-next opam 1

[2]

Aug 05 15:42:29 worker0 python3[863145]: [2021-08-05 15:42:29,397: ERROR/MainProcess] Received unregistered task of type 'swh.loader.package.opam.tasks.LoadOpam'.
ardumont updated the task description.
ardumont moved this task from Weekly backlog to in-progress on the System administration board.

I forgot to upgrade the swh.loader.core package, which is the one that actually contains the new package loader.
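
The "Received unregistered task" error means the worker's Celery process had no task registered under the name the scheduler sent. The following is a minimal sketch of a check that could be run before scheduling, assuming the registered task names have already been collected per worker (for instance from the worker startup banner, as in the `systemctl status` grep used in this task). The helper `workers_missing_task` and both sample snapshots are hypothetical illustrations, not part of swh.

```python
# Backend name of the load-opam task type, as stored in the scheduler's task_type table.
LOAD_OPAM = "swh.loader.package.opam.tasks.LoadOpam"


def workers_missing_task(backend_name, registered_by_worker):
    """Return the sorted names of workers that have not registered `backend_name`.

    `registered_by_worker` maps a worker name to the list of task names
    its Celery process reports as registered.
    """
    return sorted(
        worker
        for worker, tasks in registered_by_worker.items()
        if backend_name not in tasks
    )


# Hypothetical snapshot: a worker still running the old package (illustrative data only).
before_upgrade = {"worker0": []}

# Hypothetical snapshot after the upgrade and service restart.
after_upgrade = {
    "worker0": [LOAD_OPAM],
    "worker1": [LOAD_OPAM],
    "worker2": [LOAD_OPAM],
}

if __name__ == "__main__":
    print(workers_missing_task(LOAD_OPAM, before_upgrade))
    print(workers_missing_task(LOAD_OPAM, after_upgrade))
```

An empty result suggests it is safe to schedule origins for that task type; any worker listed would reject the task as unregistered.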

I upgraded the package and restarted the services; the LoadOpam task is now properly registered [2].
I restarted the scheduling for those origins [1], and loading is proceeding without crashes [3].

To follow the logs, use the Kibana dashboard created for the occasion [4].
It displays the lister and loader logs from the staging workers.

Some origins can already be browsed through the staging webapp instance [5].

[1]

swhscheduler@scheduler0:~$ swh scheduler --config-file /etc/softwareheritage/scheduler/backend.yml origin schedule-next opam 100

[2]

root@pergamon:~# clush -b -w @staging-loader-workers "systemctl status swh-worker@loader_opam" | grep ". swh.loader.package.opam.tasks.LoadOpam"
Aug 05 15:53:14 worker0 python3[870899]:   . swh.loader.package.opam.tasks.LoadOpam
Aug 05 15:53:12 worker1 python3[847882]:   . swh.loader.package.opam.tasks.LoadOpam
Aug 05 15:53:12 worker2 python3[838601]:   . swh.loader.package.opam.tasks.LoadOpam

[3]

Aug 05 15:56:04 worker2 python3[838606]: [2021-08-05 15:56:04,014: INFO/ForkPoolWorker-1] Task swh.loader.package.opam.tasks.LoadOpam[6f437ba9-28fb-4895-9ff2-cfcb9a174081] succeeded in 21.193766856915317s: {'status': 'uneventful', 'snapshot_id': '1a8893e6a86f444e8be8e7bda6cb34fb1735a00e'}
Aug 05 15:56:04 worker2 python3[838601]: [2021-08-05 15:56:04,020: INFO/MainProcess] Received task: swh.loader.package.opam.tasks.LoadOpam[e4b46d44-1aea-4e6e-b118-d127d6e4854d]
Aug 05 15:56:26 worker2 python3[838606]: [2021-08-05 15:56:26,960: INFO/ForkPoolWorker-1] Task swh.loader.package.opam.tasks.LoadOpam[84be40af-382c-4348-b190-a0146329187d] succeeded in 22.939665604033507s: {'status': 'eventful', 'snapshot_id': '3196e88b69d85c760e801ddad86be27556fb3adb'}
Aug 05 15:56:27 worker2 python3[838601]: [2021-08-05 15:56:27,195: INFO/MainProcess] Received task: swh.loader.package.opam.tasks.LoadOpam[39f28a91-1dc1-4d6b-be81-5a72a052be86]
Aug 05 15:56:36 worker2 python3[838893]: [2021-08-05 15:56:36,232: INFO/ForkPoolWorker-2] Task swh.loader.package.opam.tasks.LoadOpam[e4b46d44-1aea-4e6e-b118-d127d6e4854d] succeeded in 9.024149124044925s: {'status': 'eventful', 'snapshot_id': '126108159f6135fe43fd5f639912da19a51cee6d'}
Aug 05 15:56:36 worker2 python3[838601]: [2021-08-05 15:56:36,238: INFO/MainProcess] Received task: swh.loader.package.opam.tasks.LoadOpam[393a40cf-fca5-4b77-aa6d-bbb1ee2282c6]
Aug 05 15:56:43 worker2 python3[838893]: [2021-08-05 15:56:43,808: INFO/ForkPoolWorker-2] Task swh.loader.package.opam.tasks.LoadOpam[39f28a91-1dc1-4d6b-be81-5a72a052be86] succeeded in 7.567892697989009s: {'status': 'eventful', 'snapshot_id': '4d08b80cd884485103117b05ecc52208424c67a7'}
Aug 05 15:56:43 worker2 python3[838601]: [2021-08-05 15:56:43,814: INFO/MainProcess] Received task: swh.loader.package.opam.tasks.LoadOpam[7a7cde1e-119f-4a89-a33a-10dc98ce4ea7]

[4] http://kibana0.internal.softwareheritage.org:5601/goto/e0611ba609daa546347d59935e416232

[5] https://webapp.staging.swh.network/browse/search/?q=opam.ocaml.org&with_content=true
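
For a quick overview without the dashboard, the Celery success lines from the worker journal (as in [3]) can be tallied by loader status. This is a rough sketch, assuming the log format matches the lines shown in this task; `parse_success` and `tally_statuses` are hypothetical helper names, and results without a `status` key (such as the lister's) are counted as "unknown".

```python
import ast
import re
from collections import Counter

# Matches Celery success lines of the form:
#   Task <name>[<uuid>] succeeded in <seconds>s: {<result dict>}
SUCCESS_RE = re.compile(
    r"Task (?P<name>[\w.]+)\[(?P<task_id>[0-9a-f-]+)\] "
    r"succeeded in (?P<seconds>[\d.]+)s: (?P<result>\{.*\})\s*$"
)


def parse_success(line):
    """Return (task name, duration in seconds, result dict), or None for other lines."""
    match = SUCCESS_RE.search(line)
    if match is None:
        return None
    # The result is printed as a Python literal, so literal_eval can read it back safely.
    return match["name"], float(match["seconds"]), ast.literal_eval(match["result"])


def tally_statuses(lines):
    """Count the 'status' value of every successful task found in `lines`."""
    counts = Counter()
    for line in lines:
        parsed = parse_success(line)
        if parsed is not None:
            counts[parsed[2].get("status", "unknown")] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the worker2 journal excerpt in this task.
    sample = [
        "Aug 05 15:56:36 worker2 python3[838893]: [2021-08-05 15:56:36,232: INFO/ForkPoolWorker-2] "
        "Task swh.loader.package.opam.tasks.LoadOpam[e4b46d44-1aea-4e6e-b118-d127d6e4854d] "
        "succeeded in 9.024149124044925s: {'status': 'eventful', "
        "'snapshot_id': '126108159f6135fe43fd5f639912da19a51cee6d'}",
        "Aug 05 15:56:36 worker2 python3[838601]: [2021-08-05 15:56:36,238: INFO/MainProcess] "
        "Received task: swh.loader.package.opam.tasks.LoadOpam[393a40cf-fca5-4b77-aa6d-bbb1ee2282c6]",
    ]
    print(dict(tally_statuses(sample)))
```

The input could come from something like `journalctl -u swh-worker@loader_opam` piped to the script; how the lines are collected is left open here.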

ardumont moved this task from deployed/landed/monitoring to done on the System administration board.

Done: a full run has completed.