The instance is at https://gitlab.ow2.org/, and we should add it to our crawler rotation.
$ curl --head https://gitlab.ow2.org/api/v4/projects
HTTP/2 200
server: nginx
date: Tue, 27 Aug 2019 10:21:30 GMT
content-type: application/json
content-length: 19658
vary: Accept-Encoding
cache-control: no-cache
link: <https://gitlab.ow2.org/api/v4/projects?membership=false&order_by=created_at&owned=false&page=2&per_page=20&simple=false&sort=desc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false>; rel="next", <https://gitlab.ow2.org/api/v4/projects?membership=false&order_by=created_at&owned=false&page=1&per_page=20&simple=false&sort=desc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false>; rel="first", <https://gitlab.ow2.org/api/v4/projects?membership=false&order_by=created_at&owned=false&page=51&per_page=20&simple=false&sort=desc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false>; rel="last"
vary: Origin
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-next-page: 2
x-page: 1
x-per-page: 20
x-prev-page:
x-request-id: a8aqxWitUT6
x-runtime: 1.180268
x-total: 1003
x-total-pages: 51
strict-transport-security: max-age=31536000
referrer-policy: strict-origin-when-cross-origin
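The pagination headers above can be read programmatically. The following is only a minimal sketch of that idea, not the lister's actual code: the helper names `parse_link_header` and `expected_pages` are made up for illustration, and the sample header is a shortened stand-in for the real one (query strings trimmed).

```python
import math
import re


def parse_link_header(value: str) -> dict:
    """Map each rel name ("next", "first", "last", ...) to its URL."""
    links = {}
    # Each entry looks like: <URL>; rel="NAME"
    for url, rel in re.findall(r'<([^>]*)>;\s*rel="([^"]*)"', value):
        links[rel] = url
    return links


def expected_pages(total: int, per_page: int) -> int:
    """Number of pages needed to list `total` items, `per_page` at a time."""
    return math.ceil(total / per_page)


# Shortened stand-in for the link header shown above (assumption: only the
# page and per_page parameters are kept).
sample = (
    '<https://gitlab.ow2.org/api/v4/projects?page=2&per_page=20>; rel="next", '
    '<https://gitlab.ow2.org/api/v4/projects?page=1&per_page=20>; rel="first", '
    '<https://gitlab.ow2.org/api/v4/projects?page=51&per_page=20>; rel="last"'
)

links = parse_link_header(sample)
print(links["next"])
# x-total and x-per-page values taken from the response above.
print(expected_pages(1003, 20))  # 51, matching x-total-pages
```

With x-total at 1003 and x-per-page at 20, a full listing needs 51 requests at the default page size.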
Do we have an admin contact there, to make sure that cloning all their repos at once will not kill their infra?
A first round has been done:
softwareheritage-scheduler=> select status, count(*) from task where type='load-git' and policy='oneshot' and priority='high' and arguments#>>'{args,0}' like 'https://gitlab.ow2.org%' group by status;
  status   | count
-----------+-------
 completed |   960
 disabled  |    43
(2 rows)
Note: I did not investigate the 43 disabled tasks.
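As a quick sanity check, the two status counts can be reconciled against the x-total value from the earlier curl output. This is a minimal sketch with the numbers copied by hand from this thread; the variable names are illustrative only.

```python
# Status counts copied from the scheduler query above.
status_counts = {"completed": 960, "disabled": 43}
# x-total reported by the GitLab API in the earlier curl output.
api_total = 1003

scheduled = sum(status_counts.values())
missing = api_total - scheduled
print(f"scheduled={scheduled} api_total={api_total} missing={missing}")
```

The counts add up to 1003, so every project the API reported at the time got a one-shot load-git task.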
"add it to our crawler rotation": done.
$ SCHEDULER_API_URL=http://saatchi.internal.softwareheritage.org:5008/;
$ swh scheduler --url $SCHEDULER_API_URL task add list-gitlab-full api_baseurl=https://gitlab.ow2.org/api/v4 instance=ow2
$ swh scheduler --url $SCHEDULER_API_URL task list --task-type list-gitlab-full
...
Task 203527512
  Next run: in 3 months (2019-12-01 09:09:56+00:00)
  Interval: 90 days, 0:00:00
  Type: list-gitlab-full
  Policy: recurring
  Status: next_run_not_scheduled
  Priority:
  Args:
  Keyword args:
    api_baseurl: 'https://gitlab.ow2.org/api/v4'
    instance: 'ow2'
...
I expect the next run to be mostly a no-op (aside from the 43 tasks disabled in the first round).
The "standard" listing (which outputs recurring tasks with no priority) ran:
softwareheritage-scheduler=> select status, count(*) from task where type='load-git' and policy='recurring' and priority is null and arguments#>>'{args,0}' like 'https://gitlab.ow2.org%' group by status;
         status         | count
------------------------+-------
 next_run_not_scheduled |  1003
So closing this.