
staging: Deploy bower lister
Closed, Migrated, Edits Locked

Description

plan:

  • Declare new lister task in scheduler
  • R260:e3aad: Declare new lister in swh-charts
  • Add new lister task in scheduler [1]
  • Checks
    • Listing ok [2] [2']
    • Trigger specific origins for scheduling [3]
    • T4555#92215: ingestion ok on the few origins sent

[1]

swhscheduler@scheduler0:~$ swh scheduler --url http://scheduler0.internal.staging.swh.network:5008/ task add list-bower
Created 1 tasks

Task 33419582
  Next run: today (2022-09-26T11:44:22.561371+00:00)
  Interval: 1 day, 0:00:00
  Type: list-bower
  Policy: recurring
  Args:
  Keyword args:

[2]

listers [2022-09-26 11:45:16,730: INFO/MainProcess] Connected to amqp://swhconsumer:**@scheduler0.internal.staging.swh.network:5672//
listers [2022-09-26 11:45:16,973: INFO/MainProcess] lister@lister-bower-84c7fcbbf4-7r74f ready.
listers [2022-09-26 11:45:16,980: INFO/MainProcess] Task swh.lister.bower.tasks.BowerListerTask[a9250c2c-175a-487e-b2a4-dfbf079909d2] received
listers [2022-09-26 11:45:17,363: INFO/ForkPoolWorker-1] Fetching URL https://registry.bower.io/packages with params {}
listers [2022-09-26 11:50:03,596: INFO/ForkPoolWorker-1] Task swh.lister.bower.tasks.BowerListerTask[a9250c2c-175a-487e-b2a4-dfbf079909d2] succeeded in 286.4608801769791s: {'pages': 1, 'origins': 68864}
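
The "Listing ok" check reads the task result off the last log line by eye. A minimal sketch of extracting it programmatically follows, assuming the Celery success message keeps the shape shown in the log above. The helper name `parse_success` and the regular expression are illustrative and not part of swh-lister.

```python
import ast
import re

# Matches the "Task <name>[<id>] succeeded in <seconds>s: <result>" message
# as it appears in the worker log above (assumed stable for this sketch).
SUCCESS_RE = re.compile(
    r"Task (?P<name>[\w.]+)\[(?P<task_id>[0-9a-f-]+)\] "
    r"succeeded in (?P<seconds>[0-9.]+)s: (?P<result>\{.*\})"
)


def parse_success(line):
    """Return (task name, duration in seconds, result dict), or None if no match."""
    match = SUCCESS_RE.search(line)
    if match is None:
        return None
    # The result is printed as a Python dict literal, so literal_eval can read it.
    result = ast.literal_eval(match["result"])
    return match["name"], float(match["seconds"]), result


line = (
    "[2022-09-26 11:50:03,596: INFO/ForkPoolWorker-1] Task "
    "swh.lister.bower.tasks.BowerListerTask[a9250c2c-175a-487e-b2a4-dfbf079909d2] "
    "succeeded in 286.4608801769791s: {'pages': 1, 'origins': 68864}"
)
name, seconds, result = parse_success(line)
print(name, round(seconds), result["origins"])
```

Lines that are not success messages (connection, "received", "Fetching URL") return `None`, so the helper can be mapped over a whole log.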

[2']

13:46:50 swh-scheduler@db1:5432=> select now(), visit_type, count(*) from listed_origins where lister_id = ( select id from listers where name='bower') group by visit_type;
+-------------------------------+------------+-------+
|              now              | visit_type | count |
+-------------------------------+------------+-------+
| 2022-09-26 11:47:13.155334+00 | git        |  2915 |
+-------------------------------+------------+-------+
(1 row)

Time: 76.876 ms
13:50:00 swh-scheduler@db1:5432=> select now(), visit_type, count(*) from listed_origins where lister_id = ( select id from listers where name='bower') group by visit_type;
+-------------------------------+------------+-------+
|              now              | visit_type | count |
+-------------------------------+------------+-------+
| 2022-09-26 11:54:15.203022+00 | git        | 67904 |
+-------------------------------+------------+-------+
(1 row)

Time: 221.874 ms
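
The lister result in [2] reports 68864 origins, while the second query above counts 67904 `git` rows. A small sketch of that comparison, using only the two numbers from this task, is given below. The function name `listing_gap` is hypothetical, and the task does not state why the numbers differ, so the sketch only quantifies the gap.

```python
def listing_gap(reported, stored):
    """Return (reported - stored, stored / reported) for a listing run."""
    if reported <= 0:
        raise ValueError("reported must be positive")
    return reported - stored, stored / reported


reported_by_lister = 68864  # 'origins' value in the task result, see [2]
stored_git_rows = 67904     # count(*) from the second query in [2']

missing, ratio = listing_gap(reported_by_lister, stored_git_rows)
print(f"{missing} fewer rows than reported ({ratio:.1%} stored)")
```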

[3]

13:47:13 swh-scheduler@db1:5432=> select * from listers where name='bower';
+--------------------------------------+-------+---------------+-------------------------------+---------------+-------------------------------+
|                  id                  | name  | instance_name |            created            | current_state |            updated            |
+--------------------------------------+-------+---------------+-------------------------------+---------------+-------------------------------+
| f43c9008-023a-4ed1-bb0e-0c5e6f5af47b | bower | bower         | 2022-09-26 11:45:17.215894+00 | {}            | 2022-09-26 11:45:17.215894+00 |
+--------------------------------------+-------+---------------+-------------------------------+---------------+-------------------------------+
(1 row)

Time: 5.684 ms

swhscheduler@scheduler0:~$ swh scheduler -C /etc/softwareheritage/scheduler/listener-runner.yml origin send-to-celery --lister-uuid 'f43c9008-023a-4ed1-bb0e-0c5e6f5af47b' --tablesample 1 git
100 slots available in celery queue  # <--------- workaround: temporarily raised the max queue length from 100 to 200 so that more origins could be scheduled
100 visits to send to celery
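
The note on the output above says the maximum queue length was raised from 100 to 200 to free up slots. A minimal sketch of the arithmetic this implies follows, assuming the number of available slots is the maximum queue length minus the number of messages already queued. Both that formula and the queued count of 100 are inferred from the output, not confirmed by the task, and `available_slots` is a hypothetical name rather than the scheduler's function.

```python
def available_slots(max_queue_length, queued):
    """Slots left in the queue, never negative (assumed formula)."""
    return max(0, max_queue_length - queued)


queued = 100  # assumption: inferred from "100 slots available" with a limit of 200

print(available_slots(100, queued))  # with the original limit
print(available_slots(200, queued))  # with the temporarily raised limit
```

Under these assumptions the original limit leaves no room for new visits, which would explain why the limit was raised before running `send-to-celery`.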

Event Timeline

ardumont triaged this task as Normal priority. Sep 26 2022, 1:26 PM
ardumont created this task.
ardumont moved this task from Backlog to Weekly backlog on the System administration board.
ardumont changed the task status from Open to Work in Progress. Sep 26 2022, 1:39 PM
ardumont moved this task from Weekly backlog to in-progress on the System administration board.
ardumont updated the task description.

Some bower origins were visited successfully:

19:18:05 swh-scheduler@db1:5432=> select lo.url, lo.visit_type, last_seen, last_scheduled, last_successful, last_visit, last_visit_status from listed_origins lo inner join origin_visit_stats o on o.url=lo.url where lo.lister_id = ( select id from listers where name='bower') and last_visit_status='successful' order by last_successful desc limit 10;
+------------------------------------------------------+------------+-------------------------------+------------------------------+-------------------------------+-------------------------------+-------------------+
|                         url                          | visit_type |           last_seen           |        last_scheduled        |        last_successful        |          last_visit           | last_visit_status |
+------------------------------------------------------+------------+-------------------------------+------------------------------+-------------------------------+-------------------------------+-------------------+
| https://github.com/rxaviers/cldr.git                 | git        | 2022-09-28 11:56:53.560263+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:53:20.313464+00 | 2022-09-28 15:53:20.313464+00 | successful        |
| https://github.com/rxaviers/cldrjs.git               | git        | 2022-09-28 11:56:53.560263+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:53:19.821064+00 | 2022-09-28 15:53:19.821064+00 | successful        |
| https://github.com/unicode-cldr/cldr-core.git        | git        | 2022-09-28 11:56:53.560263+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:50:30.500446+00 | 2022-09-28 15:50:30.500446+00 | successful        |
| https://github.com/rxaviers/cldr-data-bower.git      | git        | 2022-09-28 11:56:53.560263+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:48:47.234603+00 | 2022-09-28 15:48:47.234603+00 | successful        |
| https://github.com/DivineOmega/cachet.js.git         | git        | 2022-09-28 11:56:48.186394+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:48:36.467171+00 | 2022-09-28 15:48:36.467171+00 | successful        |
| https://github.com/Axiacore/cachet-site-notifier.git | git        | 2022-09-28 11:56:48.186394+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:48:29.466577+00 | 2022-09-28 15:48:29.466577+00 | successful        |
| https://github.com/renie/CacheRequest.git            | git        | 2022-09-28 11:56:48.186394+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:48:27.524073+00 | 2022-09-28 15:48:27.524073+00 | successful        |
| https://github.com/vash15/cache-text.git             | git        | 2022-09-28 11:56:48.186394+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:48:23.566331+00 | 2022-09-28 15:48:23.566331+00 | successful        |
| https://github.com/Wizcorp/cachepuncher.git          | git        | 2022-09-28 11:56:48.186394+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:48:20.930672+00 | 2022-09-28 15:48:20.930672+00 | successful        |
| https://github.com/iamso/cacherrr.git                | git        | 2022-09-28 11:56:48.186394+00 | 2022-09-28 10:53:03.68185+00 | 2022-09-28 15:48:17.939761+00 | 2022-09-28 15:48:17.939761+00 | successful        |
+------------------------------------------------------+------------+-------------------------------+------------------------------+-------------------------------+-------------------------------+-------------------+
(10 rows)

Time: 4260.413 ms (00:04.260)
ardumont claimed this task.
ardumont updated the task description.
ardumont moved this task from deployed/landed/monitoring to done on the System administration board.