
gitlab.com: Full lister fails to retrieve the range information it needs to start listing
Started, Work in Progress, Normal, Public

Description

It worked for a while, since we did manage to list gitlab instances (T1139).
Investigate and fix.

Jun 29 07:34:04 worker01 python3[18335]: [2019-06-29 07:34:04,108: ERROR/ForkPoolWorker-3] Task swh.lister.gitlab.tasks.FullGitLabRelister[25599a68-d227-4d2f-9e94-7345ec7b0d2d] raised unexpected: ValueError('Problem during information fetch: 404',)
                                         Traceback (most recent call last):
                                           File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 382, in trace_task
                                             R = retval = fun(*args, **kwargs)
                                           File "/usr/lib/python3/dist-packages/swh/scheduler/task.py", line 45, in __call__
                                             return super().__call__(*args, **kwargs)
                                           File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 641, in __protected_call__
                                             return self.run(*args, **kwargs)
                                           File "/usr/lib/python3/dist-packages/swh/lister/gitlab/tasks.py", line 47, in full_gitlab_relister
                                             _, total_pages, _ = lister.get_pages_information()
                                           File "/usr/lib/python3/dist-packages/swh/lister/gitlab/lister.py", line 75, in get_pages_information
                                             'Problem during information fetch: %s' % response.status_code)
                                         ValueError: Problem during information fetch: 404

Note:

  • This is solely a problem for the main instance gitlab.com. See comments below for more details.
  • Other gitlab instances have been unstuck and are currently listed (debian, gnome, gite.lirmm, ...)

Event Timeline

ardumont created this task. Jun 29 2019, 9:53 AM
ardumont triaged this task as High priority.
ardumont updated the task description.
ardumont changed the task status from Open to Work in Progress. (Edited) Jun 30 2019, 10:47 AM
ardumont added a subscriber: douardda.

Heads up.

Depending on the gitlab instance (probably the server's gitlab version), some expected headers ("x-total", "x-total-pages") are missing from the HEAD request response:

curl -I https://gitlab.com/api/v4/projects
HTTP/1.1 200 OK
Server: nginx
Date: Sun, 30 Jun 2019 08:17:46 GMT
Content-Type: application/json
Content-Length: 19710
Vary: Accept-Encoding
Cache-Control: no-cache
Link: <https://gitlab.com/api/v4/projects?membership=false&order_by=created_at&owned=false&page=2&per_page=20&repository_checksum_failed=false&simple=false&sort=desc&starred=false&statistics=false&wiki_checksum_failed=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false>; rel="next", <https://gitlab.com/api/v4/projects?membership=false&order_by=created_at&owned=false&page=1&per_page=20&repository_checksum_failed=false&simple=false&sort=desc&starred=false&statistics=false&wiki_checksum_failed=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false>; rel="first"
Vary: Origin
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Next-Page: 2
X-Page: 1
X-Per-Page: 20
X-Prev-Page:
X-Request-Id: gZjWx0b2XL2
X-Runtime: 7.111431
Strict-Transport-Security: max-age=31536000
Referrer-Policy: strict-origin-when-cross-origin
RateLimit-Limit: 600
RateLimit-Observed: 3
RateLimit-Remaining: 597
RateLimit-Reset: 1561882726
RateLimit-ResetTime: Sun, 30 Jun 2019 08:18:46 GMT
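
The header check above can be sketched in Python as a pure function over response headers. This is only an illustration: the function name parse_pagination_headers is hypothetical and is not the lister's actual code. It reports which pagination values are usable and returns None for any header that is absent or empty.

```python
from typing import Mapping, Optional, Tuple


def parse_pagination_headers(
    headers: Mapping[str, str],
) -> Tuple[Optional[int], Optional[int], Optional[int]]:
    """Return (total, total_pages, per_page); None for a missing or empty header."""
    # HTTP header names are case-insensitive, so normalize the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        value = lowered.get(name, "").strip()
        return int(value) if value.isdigit() else None

    return as_int("x-total"), as_int("x-total-pages"), as_int("x-per-page")


# Subset of the headers returned by gitlab.com in the curl output above.
gitlab_com = {"X-Next-Page": "2", "X-Page": "1", "X-Per-Page": "20", "X-Prev-Page": ""}
print(parse_pagination_headers(gitlab_com))  # (None, None, 20)
```

A caller that gets None for the total page count would need another strategy, for instance following X-Next-Page until it comes back empty.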

All other gitlab instances are fine in that regard though:

|-------------+---------+---------------+------------|
| instance    | x-total | x-total-pages | x-per-page |
|-------------+---------+---------------+------------|
| gitlab.com  | X       | X             | ok         |
| framagit    | ok      | ok            | ok         |
| riseup      | ok      | ok            | ok         |
| inria       | ok      | ok            | ok         |
| freedesktop | ok      | ok            | ok         |
|-------------+---------+---------------+------------|
| debian      | ok      | ok            | ok         |
| gnome       | ok      | ok            | ok         |
| gite.lirmm  | ok      | ok            | ok         |
| common-lisp | ok      | ok            | ok         |
|-------------+---------+---------------+------------|

So the missing headers do not explain the 404: the problem lies elsewhere.
In my analysis, I went back to the scheduler's list-gitlab-full tasks.
The error turns out to be a manual one in the end (so pffffiiiiouuuu):

168414597 | list-gitlab-full | {"args": [], "kwargs": {"instance": "common-lisp", "api_baseurl": "https://gitlab.common-lisp.net/api/v4/projects/"}}
168414596 | list-gitlab-full | {"args": [], "kwargs": {"instance": "gite.lirmm", "api_baseurl": "https://gite.lirmm.fr/api/v4/projects/"}}          
168414595 | list-gitlab-full | {"args": [], "kwargs": {"instance": "gnome", "api_baseurl": "https://gitlab.gnome.org/api/v4/projects/"}}            
168414594 | list-gitlab-full | {"args": [], "kwargs": {"instance": "debian", "api_baseurl": "https://salsa.debian.org/api/v4/projects/"}}           
104285333 | list-gitlab-full | {"args": [], "kwargs": {"instance": "riseup", "api_baseurl": "https://0xacab.org/api/v4"}}                           
167870867 | list-gitlab-full | {"args": [], "kwargs": {"instance": "freedesktop", "api_baseurl": "https://gitlab.freedesktop.org/api/v4"}}          
 97209370 | list-gitlab-full | {"args": [], "kwargs": {"instance": "gitlab", "api_baseurl": "https://gitlab.com/api/v4"}}                           
102364637 | list-gitlab-full | {"args": [], "kwargs": {"instance": "inria", "api_baseurl": "https://gitlab.inria.fr/api/v4"}}                       
104286042 | list-gitlab-full | {"args": [], "kwargs": {"instance": "framagit", "api_baseurl": "https://framagit.org/api/v4"}}

The newer tasks (ids 1684145xx) have an incorrect api_baseurl: it ends with /projects/, while the older tasks use the bare API root (.../api/v4).
That matches a point @douardda raised with me about the base url some time ago.
We could avoid storing the full API base url in the task arguments (and let the lister fill it in).
Pro: that avoids this kind of misfire.
Con: that potentially prevents mixing instances that use different API versions (apparently not the case so far ;).
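
As a rough sketch of the "let the lister fill it in" idea, the two helpers below derive the API root from a host name and normalize an already stored url. Both function names and the api_version parameter are assumptions made for illustration; they are not part of swh.lister.

```python
def build_api_baseurl(host: str, api_version: str = "v4") -> str:
    """Build the API root for a GitLab host, e.g. https://gitlab.com/api/v4."""
    return f"https://{host}/api/{api_version}"


def normalize_api_baseurl(url: str, api_version: str = "v4") -> str:
    """Reduce a GitLab API url to its root by dropping anything after /api/<version>."""
    marker = f"/api/{api_version}"
    index = url.find(marker)
    if index == -1:
        raise ValueError(f"not a GitLab {api_version} API url: {url}")
    # Cut off any suffix such as a trailing "/projects/".
    return url[: index + len(marker)]


print(build_api_baseurl("salsa.debian.org"))
print(normalize_api_baseurl("https://salsa.debian.org/api/v4/projects/"))
```

Keeping api_version as a parameter is one way to retain the flexibility mentioned as a con, should an instance ever need a different API version.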


Regarding the missing headers, what that means though is:

  • if we were to bootstrap a full listing on the main gitlab.com, that would not work [1]
  • (hypothesis) if this is a gitlab software version issue, the same behavior could spread to the other gitlab instances in the future.

[1] The stack trace fails at the instruction _, total_pages, _ = lister.get_pages_information().
That call tries to determine the listing ranges it can distribute to the workers.
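
To illustrate why the total page count matters, here is a minimal sketch of how a known total could be split into page ranges for the workers. The function name split_page_ranges and the pages_per_task parameter are hypothetical; the actual lister may split the work differently.

```python
from typing import List, Tuple


def split_page_ranges(total_pages: int, pages_per_task: int) -> List[Tuple[int, int]]:
    """Split pages 1..total_pages into inclusive (start, end) ranges."""
    if total_pages < 0 or pages_per_task < 1:
        raise ValueError("total_pages must be >= 0 and pages_per_task >= 1")
    return [
        (start, min(start + pages_per_task - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_task)
    ]


print(split_page_ranges(10, 4))  # [(1, 4), (5, 8), (9, 10)]
```

Without x-total-pages there is no total_pages to feed into such a split, which is why a full listing cannot be bootstrapped this way on gitlab.com.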

I'll just concentrate on fixing the urls and respawn those tasks for now.

update task set arguments = '{"args": [], "kwargs": {"instance": "common-lisp", "api_baseurl": "https://gitlab.common-lisp.net/api/v4"}}' where type='list-gitlab-full' and arguments#>>'{kwargs,instance}' = 'common-lisp';
update task set arguments = '{"args": [], "kwargs": {"instance": "gite.lirmm", "api_baseurl": "https://git.lirmm.fr/api/v4"}}' where type='list-gitlab-full' and arguments#>>'{kwargs,instance}' = 'gite.lirmm';
update task set arguments = '{"args": [], "kwargs": {"instance": "gnome", "api_baseurl": "https://gitlab.gnome.org/api/v4"}}'  where type='list-gitlab-full' and arguments#>>'{kwargs,instance}' = 'gnome';
update task set arguments = '{"args": [], "kwargs": {"instance": "debian", "api_baseurl": "https://salsa.debian.org/api/v4"}}' where type='list-gitlab-full' and arguments#>>'{kwargs,instance}' = 'debian';

There we go, unification!

psql service=swh-scheduler -c "select id, type, status, arguments from task where type = 'list-gitlab-full'";
    id     |       type       |         status         |                                                  arguments
-----------+------------------+------------------------+-------------------------------------------------------------------------------------------------------------
 168414594 | list-gitlab-full | next_run_scheduled     | {"args": [], "kwargs": {"instance": "debian", "api_baseurl": "https://salsa.debian.org/api/v4"}}
 168414596 | list-gitlab-full | next_run_scheduled     | {"args": [], "kwargs": {"instance": "gite.lirmm", "api_baseurl": "https://git.lirmm.fr/api/v4"}}
 168414595 | list-gitlab-full | next_run_scheduled     | {"args": [], "kwargs": {"instance": "gnome", "api_baseurl": "https://gitlab.gnome.org/api/v4"}}
 168414597 | list-gitlab-full | next_run_scheduled     | {"args": [], "kwargs": {"instance": "common-lisp", "api_baseurl": "https://gitlab.common-lisp.net/api/v4"}}
 104285333 | list-gitlab-full | next_run_not_scheduled | {"args": [], "kwargs": {"instance": "riseup", "api_baseurl": "https://0xacab.org/api/v4"}}
 167870867 | list-gitlab-full | next_run_not_scheduled | {"args": [], "kwargs": {"instance": "freedesktop", "api_baseurl": "https://gitlab.freedesktop.org/api/v4"}}
  97209370 | list-gitlab-full | next_run_scheduled     | {"args": [], "kwargs": {"instance": "gitlab", "api_baseurl": "https://gitlab.com/api/v4"}}
 102364637 | list-gitlab-full | next_run_not_scheduled | {"args": [], "kwargs": {"instance": "inria", "api_baseurl": "https://gitlab.inria.fr/api/v4"}}
 104286042 | list-gitlab-full | next_run_scheduled     | {"args": [], "kwargs": {"instance": "framagit", "api_baseurl": "https://framagit.org/api/v4"}}
(9 rows)
ardumont renamed this task from "gitlab lister: Full lister fails to retrieve the range information it needs to start listing" to "gitlab.com: Full lister fails to retrieve the range information it needs to start listing". Jun 30 2019, 11:00 AM
ardumont lowered the priority of this task from High to Normal.
ardumont updated the task description.
ardumont removed ardumont as the assignee of this task. Jul 3 2019, 3:26 PM