plan:
- Declare new lister task in scheduler
- R260:e3aad: Declare new lister in swh-charts
- Add new lister task in scheduler [1]
- Checks
- Listing ok [2] [2']
- Trigger specific origins for scheduling [3]
- T4555#92215: ingestion ok on the few origins sent
[1]
swhscheduler@scheduler0:~$ swh scheduler --url http://scheduler0.internal.staging.swh.network:5008/ task add list-bower
Created 1 tasks

Task 33419582
  Next run: today (2022-09-26T11:44:22.561371+00:00)
  Interval: 1 day, 0:00:00
  Type: list-bower
  Policy: recurring
  Args:
  Keyword args:
[2]
listers [2022-09-26 11:45:16,730: INFO/MainProcess] Connected to amqp://swhconsumer:**@scheduler0.internal.staging.swh.network:5672//
listers [2022-09-26 11:45:16,973: INFO/MainProcess] lister@lister-bower-84c7fcbbf4-7r74f ready.
listers [2022-09-26 11:45:16,980: INFO/MainProcess] Task swh.lister.bower.tasks.BowerListerTask[a9250c2c-175a-487e-b2a4-dfbf079909d2] received
listers [2022-09-26 11:45:17,363: INFO/ForkPoolWorker-1] Fetching URL https://registry.bower.io/packages with params {}
listers [2022-09-26 11:50:03,596: INFO/ForkPoolWorker-1] Task swh.lister.bower.tasks.BowerListerTask[a9250c2c-175a-487e-b2a4-dfbf079909d2] succeeded in 286.4608801769791s: {'pages': 1, 'origins': 68864}
[2']
13:46:50 swh-scheduler@db1:5432=> select now(), visit_type, count(*) from listed_origins where lister_id = ( select id from listers where name='bower') group by visit_type;
+-------------------------------+------------+-------+
|              now              | visit_type | count |
+-------------------------------+------------+-------+
| 2022-09-26 11:47:13.155334+00 | git        |  2915 |
+-------------------------------+------------+-------+
(1 row)

Time: 76.876 ms
13:50:00 swh-scheduler@db1:5432=> select now(), visit_type, count(*) from listed_origins where lister_id = ( select id from listers where name='bower') group by visit_type;
+-------------------------------+------------+-------+
|              now              | visit_type | count |
+-------------------------------+------------+-------+
| 2022-09-26 11:54:15.203022+00 | git        | 67904 |
+-------------------------------+------------+-------+
(1 row)

Time: 221.874 ms
[3]
13:47:13 swh-scheduler@db1:5432=> select * from listers where name='bower';
+--------------------------------------+-------+---------------+-------------------------------+---------------+-------------------------------+
|                  id                  | name  | instance_name |            created            | current_state |            updated            |
+--------------------------------------+-------+---------------+-------------------------------+---------------+-------------------------------+
| f43c9008-023a-4ed1-bb0e-0c5e6f5af47b | bower | bower         | 2022-09-26 11:45:17.215894+00 | {}            | 2022-09-26 11:45:17.215894+00 |
+--------------------------------------+-------+---------------+-------------------------------+---------------+-------------------------------+
(1 row)

Time: 5.684 ms

swhscheduler@scheduler0:~$ swh scheduler -C /etc/softwareheritage/scheduler/listener-runner.yml origin send-to-celery --lister-uuid 'f43c9008-023a-4ed1-bb0e-0c5e6f5af47b' --tablesample 1 git
100 slots available in celery queue  # <--------- cheated: increased temporarily the max queue length to 200 instead of 100 so we can schedule some more origins
100 visits to send to celery
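
The "Listing ok" check compares the lister's own summary in [2] with the row count in [2']. A minimal sketch of how that comparison could be scripted, assuming the Celery "succeeded" line keeps the shape shown in [2]; the helper name `parse_succeeded` and the regular expression are illustrative only and are not part of swh-scheduler or swh-lister:

```python
import ast
import re

# Hypothetical pattern for a Celery "succeeded" line as it appears in [2]:
#   Task <dotted.name>[<uuid>] succeeded in <seconds>s: <result dict>
SUCCEEDED_RE = re.compile(
    r"Task (?P<name>[\w.]+)\[(?P<id>[0-9a-f-]+)\] "
    r"succeeded in (?P<secs>[\d.]+)s: (?P<result>\{.*\})"
)


def parse_succeeded(line):
    """Return task name, id, duration and result dict, or None if no match."""
    match = SUCCEEDED_RE.search(line)
    if match is None:
        return None
    return {
        "task": match.group("name"),
        "task_id": match.group("id"),
        "seconds": float(match.group("secs")),
        # The result is printed as a Python literal, so literal_eval is enough.
        "result": ast.literal_eval(match.group("result")),
    }


# Last log line from [2], copied verbatim.
LOG_LINE = (
    "listers [2022-09-26 11:50:03,596: INFO/ForkPoolWorker-1] Task "
    "swh.lister.bower.tasks.BowerListerTask"
    "[a9250c2c-175a-487e-b2a4-dfbf079909d2] succeeded in "
    "286.4608801769791s: {'pages': 1, 'origins': 68864}"
)

summary = parse_succeeded(LOG_LINE)
db_count = 67904  # 'git' row count from the second query in [2']
gap = summary["result"]["origins"] - db_count

print(f"lister reported {summary['result']['origins']} origins, "
      f"listed_origins holds {db_count}, gap {gap}")
```

With the figures above, the lister reports 68864 origins while `listed_origins` holds 67904 `git` rows. The log does not explain the difference, so the sketch only surfaces it.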