- D8033: Implement lister
- D8174, D8321: Implement loader
- T4466#89828 Lister run in docker
- T4466#90034 Loader run in docker
- D8033: Document lister
- D8174: Document loader
- Deploy on staging
- Call for public review
- Deploy on production
AUR Lister runs in Docker report
The AUR lister runs fine in Docker, but it takes quite long (about 27 minutes for this run) to list origins.
Found 78702 AUR packages in aur_index
Successfully removed /tmp/aur_archive directory
Task swh.lister.aur.tasks.AurListerTask[a7ed0b48-3d3b-4aad-b158-6d888ff9aab5] succeeded in 1619.0577569839952s: {'pages': 78702, 'origins': 78702}
swh-scheduler=# select count(*) from listed_origins where visit_type = 'aur';
count
-------
78702
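For a rough sense of the lister's throughput, the Celery result line above can be parsed directly. This is only a sketch: `parse_task_result` is a hypothetical helper, not part of swh.lister, and the regular expression assumes the exact line format shown in this report.

```python
import ast
import re

# Celery result line copied from the report above.
LOG_LINE = (
    "Task swh.lister.aur.tasks.AurListerTask[a7ed0b48-3d3b-4aad-b158-6d888ff9aab5] "
    "succeeded in 1619.0577569839952s: {'pages': 78702, 'origins': 78702}"
)


def parse_task_result(line):
    """Hypothetical helper: return (duration_seconds, result_dict)."""
    match = re.search(r"succeeded in ([0-9.]+)s: (\{.*\})", line)
    if match is None:
        raise ValueError("not a Celery success line")
    # The result is printed as a Python dict literal, so literal_eval is enough.
    return float(match.group(1)), ast.literal_eval(match.group(2))


duration, result = parse_task_result(LOG_LINE)
minutes = duration / 60
rate = result["origins"] / duration
print(f"{result['origins']} origins in {minutes:.1f} min ({rate:.1f} origins/s)")
```

Running the same helper on the result line of a later run makes the two durations easy to compare.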
AUR Loader runs in Docker report
The AUR loader runs in Docker, but I don't understand why it starts loading origins right after the lister has completed (i.e. I have not run origin scheduled next aur qty).
For now it looks good and is quite fast because the packages it downloads are very small. It loads roughly 25000 origins in an hour without errors:
swh-scheduler=# select count(*) from origin_visit_stats where visit_type='aur' and last_visit_status='successful';
count
27057
(1 row)
swh-scheduler=# select count(*) from origin_visit_stats where visit_type='aur' and last_visit_status='failed';
count
0
(1 row)
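As a back-of-the-envelope check, the count above can be extrapolated to a full run. The sketch assumes the loading rate stays constant and takes "in an hour" literally; `projected_total_hours` is a hypothetical helper introduced only for this illustration.

```python
# Figures taken from the report: origins listed by the lister, and
# successful visits counted after roughly one hour of loading.
LISTED_ORIGINS = 78702
LOADED_SO_FAR = 27057
ELAPSED_HOURS = 1.0  # assumption: "in an hour" taken literally


def projected_total_hours(total, done, elapsed_hours):
    """Linear extrapolation; assumes the loading rate stays constant."""
    if done <= 0:
        raise ValueError("need at least one completed visit")
    return total * elapsed_hours / done


estimate = projected_total_hours(LISTED_ORIGINS, LOADED_SO_FAR, ELAPSED_HOURS)
print(f"projected full run: about {estimate:.1f} hours")
```

The estimate is only as good as the elapsed-time figure, which the report gives approximately.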
I've made a complete run in Docker.
Lister:
[2022-08-30 10:31:30,328: INFO/ForkPoolWorker-1] Task swh.lister.aur.tasks.AurListerTask[a24d7a3d-81ea-4ef9-90e7-e9cad8a3ffec] succeeded in 946.656092988007s: {'pages': 78803, 'origins': 78803}
swh-scheduler=# select count(*) from listed_origins where visit_type='aur';
-[ RECORD 1 ]
count | 78803
Loader (it takes between 2 and 3 hours to finish loading everything):
swh-scheduler=# select count(*) from origin_visit_stats where visit_type='aur' and last_visit_status='successful';
-[ RECORD 1 ]
count | 78799
swh-scheduler=# select count(*) from origin_visit_stats where visit_type='aur' and last_visit_status='failed';
-[ RECORD 1 ]
count | 4
swh-scheduler=# select count(*) from origin_visit_stats where visit_type='aur' and last_visit_status='not_found';
-[ RECORD 1 ]
count | 0
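The three per-status queries could probably be folded into a single `group by`. The sketch below tries that query against an in-memory SQLite table that stands in for the scheduler database: the table models only the two columns used in the queries above, the rows are toy data, and the `?` placeholder is SQLite syntax. It then reconciles the counts reported above against the number of listed origins.

```python
import sqlite3

# In-memory stand-in for the scheduler database (assumption: only the two
# columns used in the report's queries are modelled).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table origin_visit_stats (visit_type text, last_visit_status text)"
)
# Toy rows, not the real data.
conn.executemany(
    "insert into origin_visit_stats values (?, ?)",
    [("aur", "successful")] * 5 + [("aur", "failed")] * 2 + [("pypi", "failed")],
)

STATUS_BREAKDOWN = """
    select last_visit_status, count(*)
    from origin_visit_stats
    where visit_type = ?
    group by last_visit_status
    order by last_visit_status
"""
breakdown = dict(conn.execute(STATUS_BREAKDOWN, ("aur",)).fetchall())
print(breakdown)

# Counts reported above: every listed origin should have a final status.
reported = {"successful": 78799, "failed": 4, "not_found": 0}
listed = 78803
accounted_for = sum(reported.values())
print(f"{accounted_for} of {listed} listed origins have a visit status")
print(f"success rate: {reported['successful'] / listed:.3%}")
```

One caveat with this form: a status with no rows (such as `not_found` here) does not appear in the grouped result at all, rather than showing up with a count of 0.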