Deploy remaining next-gen listers on staging
Closed, Resolved (Public)


Event Timeline

ardumont triaged this task as Normal priority. Feb 1 2021, 3:52 PM
ardumont created this task.

Status: OK

Schedule gnu on staging:

swhworker@worker0:~$ swh scheduler --url task add list-gnu-full
Created 1 tasks

Task 17271527
  Next run: today (2021-02-01T14:51:42.852948+00:00)
  Interval: 90 days, 0:00:00
  Type: list-gnu-full
  Policy: recurring
  Keyword args:

swhworker@worker0:~$ logout

Check everything runs smoothly:

Feb 01 14:51:54 worker0 python3[161717]: [2021-02-01 14:51:54,021: INFO/ForkPoolWorker-4] Task swh.lister.gnu.tasks.GNUListerTask[7a5a3564-bd65-4dec-8107-e73cdb9cfd47] succeeded in 2.6035053330124356s: {'pages': 1, 'origins': 384}
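
The check above can also be done programmatically. The following is a minimal sketch, assuming the "succeeded" line keeps the format shown in the log above; the helper name `parse_success_line` is hypothetical and not part of swh.

```python
import ast
import re

# Pattern for the "succeeded" line format shown in the log above (assumed stable).
SUCCESS_RE = re.compile(
    r"Task (?P<task>[\w.]+)\[(?P<task_id>[0-9a-f-]+)\] "
    r"succeeded in (?P<seconds>[0-9.]+)s: (?P<result>\{.*\})"
)


def parse_success_line(line):
    """Return task name, duration and result dict, or None if the line does not match."""
    match = SUCCESS_RE.search(line)
    if match is None:
        return None
    return {
        "task": match.group("task"),
        "seconds": float(match.group("seconds")),
        # The result is printed as a Python dict literal, so literal_eval is enough.
        "result": ast.literal_eval(match.group("result")),
    }


line = (
    "Feb 01 14:51:54 worker0 python3[161717]: [2021-02-01 14:51:54,021: "
    "INFO/ForkPoolWorker-4] Task swh.lister.gnu.tasks.GNUListerTask"
    "[7a5a3564-bd65-4dec-8107-e73cdb9cfd47] succeeded in 2.6035053330124356s: "
    "{'pages': 1, 'origins': 384}"
)
info = parse_success_line(line)
print(info["task"], info["result"]["origins"])
```

The extracted `origins` value can then be compared with the count returned by the scheduler database.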


swh-scheduler=> select count(*) from listed_origins lo inner join listers l on lo.lister_id=l.id and l.name='GNU' and l.instance_name='GNU';
(1 row)

Run of the cgit lister (eclipse instance) went ok:

swhworker@worker0:~$ SWH_CONFIG_FILENAME=lister.yml swh lister run --lister cgit url= instance=eclipse base_git_url=

Which stored the listed_origins all right:

swh-scheduler=> select count(*) from listed_origins lo inner join listers l on lo.lister_id=l.id and l.name='cgit' and l.instance_name='eclipse';
(1 row)

The count is explained by the new listing adding its 1340 newly computed origins in one go.
There were already 900 origins listed before.

Note that the new cgit implementation adds a trailing / at the end of the origin urls.
It's still resolvable at least with our tryouts on either the cgit instances and

But that means the data currently listed in staging must be reworked prior to listing (otherwise we'll store 2 different
origins which are the "same").

With D4987 on the verge of being packaged, the current listed_origins (staging) got adapted with:

swh-scheduler=> begin;
swh-scheduler=> update listed_origins set url=regexp_replace(url, '/$', '') where lister_id='1af42c1b-69b0-41ef-ad98-371843de406e' ;
swh-scheduler=> select * from listed_origins where lister_id='1af42c1b-69b0-41ef-ad98-371843de406e' ;
swh-scheduler=> commit;

(so they match)
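
To illustrate why the trailing slash matters, here is a minimal sketch that mirrors `regexp_replace(url, '/$', '')` in Python. The URLs and the helper name `strip_trailing_slash` are hypothetical, chosen for illustration only.

```python
import re


def strip_trailing_slash(url):
    """Drop a single trailing slash, like regexp_replace(url, '/$', '')."""
    return re.sub(r"/$", "", url)


# Hypothetical URLs, for illustration only.
old_listing = ["https://git.example.org/project-a", "https://git.example.org/project-b"]
new_listing = ["https://git.example.org/project-a/", "https://git.example.org/project-b/"]

# Compared as raw strings, nothing matches: every origin would be stored twice.
print(len(set(old_listing) | set(new_listing)))  # 4

# Once normalized, both listings collapse onto the same 2 origins.
normalized = {strip_trailing_slash(u) for u in old_listing + new_listing}
print(len(normalized))  # 2
```

Urls without a trailing slash are left untouched, so applying the normalization twice is harmless.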

swh.lister v0.8.0 packaged with the latest packagist lister port.

Run triggered on staging:

swhworker@worker0:~$ SWH_CONFIG_FILENAME=/etc/softwareheritage/lister.yml swh lister run --lister packagist
WARNING:swh.lister.packagist.lister:Unexpected HTTP status code 404 on b'{"type":"https:\\/\\/\\/html\\/rfc2616#section-10","title":"An error occurred","status":404,"detail":"Not Found"}'

so far so good:

swh-scheduler=> select now(), l.name, l.instance_name, count(*) from listed_origins lo inner join listers l on lo.lister_id=l.id group by (l.name, l.instance_name);
              now              |    name     | instance_name |  count
 2021-02-03 12:23:19.926102+00 | Packagist   | packagist     |  286705
(14 rows)
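
The per-lister count can be tried out locally. This is a minimal sketch using an in-memory SQLite database rather than the real swh-scheduler PostgreSQL database; the two-table schema is reduced to the columns the query needs, the join on `lo.lister_id = l.id` is an assumption, and all rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed minimal schema: only the columns the query needs.
    create table listers (id text primary key, name text, instance_name text);
    create table listed_origins (lister_id text references listers(id), url text);
    """
)
# Made-up listers and origins, for illustration only.
conn.executemany(
    "insert into listers values (?, ?, ?)",
    [("l1", "GNU", "GNU"), ("l2", "cgit", "eclipse")],
)
conn.executemany(
    "insert into listed_origins values (?, ?)",
    [
        ("l1", "https://example.org/gnu/a"),
        ("l1", "https://example.org/gnu/b"),
        ("l2", "https://example.org/cgit/c"),
    ],
)

# One row per (name, instance_name) pair with its number of listed origins.
rows = conn.execute(
    """
    select l.name, l.instance_name, count(*)
    from listed_origins lo
    inner join listers l on lo.lister_id = l.id
    group by l.name, l.instance_name
    order by l.name
    """
).fetchall()
print(rows)
```
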
ardumont changed the task status from Open to Work in Progress. Feb 3 2021, 1:24 PM
ardumont updated the task description.
ardumont moved this task from Backlog to in-progress on the System administration board.
ardumont moved this task from in-progress to deployed/landed on the System administration board.
ardumont claimed this task.
ardumont moved this task from deployed/landed to done on the System administration board.