
Unstick Bitbucket incremental lister
Open, High, Public

Description

The incremental Bitbucket lister is currently stuck in production (see SWH-LISTER-5K) due to an error 500 returned by that Bitbucket Web API URL.

As we can see from the lister state in the scheduler database, origins modified after 2022-04-22T17:14:21.817530+00:00 have not been listed.

softwareheritage-scheduler=> select name, current_state from listers where name = 'bitbucket';
   name    |                      current_state                      
-----------+---------------------------------------------------------
 bitbucket | {"last_repo_cdate": "2022-04-22T17:14:21.817530+00:00"}
(1 row)

Bumping the after query parameter to 2022-04-23T00:00:00.00000+00:00 makes the Bitbucket Web API return results again, so a possible mitigation would be to increment the after date query parameter in the lister until the Bitbucket Web API no longer returns an error 500 (a similar process is implemented in the gitlab lister, for instance).
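The mitigation described above could look roughly like the sketch below. It is not the lister's actual code: find_working_after, probe, step and max_attempts are hypothetical names, and the one-hour step is an arbitrary choice. The probe callable stands in for the real HTTP request with the after query parameter and only has to return the status code, so the stub used here needs no network access.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


def find_working_after(
    start: datetime,
    probe: Callable[[datetime], int],
    step: timedelta = timedelta(hours=1),
    max_attempts: int = 24,
) -> Optional[datetime]:
    """Return the first `after` value, starting at `start`, for which
    `probe` does not report an HTTP 500, or None if every attempt fails."""
    candidate = start
    for _ in range(max_attempts):
        if probe(candidate) != 500:
            return candidate
        # Skip past the window that makes the API fail and try again.
        candidate += step
    return None


def fake_probe(after: datetime) -> int:
    """Stub for the real API call: fails for any date before 2022-04-23 UTC."""
    cutoff = datetime(2022, 4, 23, tzinfo=timezone.utc)
    return 500 if after < cutoff else 200


stuck = datetime.fromisoformat("2022-04-22T17:14:21.817530+00:00")
print(find_working_after(stuck, fake_probe))
```

Note that any origin created inside the skipped window would not be listed by the incremental lister, so a small step is preferable to a large one.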

Event Timeline

anlambert triaged this task as Normal priority. May 13 2022, 11:26 AM
anlambert created this task.

A possible workaround for this, without coding anything just yet, would be to first update the scheduler's listers record for bitbucket to the unsticking date you mention [1].

If the issue arises again, then that'd be the time to actually code something in the lister.

[1]

11:43:46 softwareheritage-scheduler@belvedere:5432=> begin;
BEGIN
Time: 7.295 ms
11:44:23 *softwareheritage-scheduler@belvedere:5432=> update listers
softwareheritage-scheduler-> set current_state=jsonb_set(current_state, '{last_repo_cdate}', '"2022-04-23T00:00:00.00000+00:00"', false)
softwareheritage-scheduler-> where name='bitbucket';
UPDATE 1
Time: 6.261 ms
11:44:23 *softwareheritage-scheduler@belvedere:5432=> select name, current_state from listers where name = 'bitbucket';
+-----------+--------------------------------------------------------+
|   name    |                     current_state                      |
+-----------+--------------------------------------------------------+
| bitbucket | {"last_repo_cdate": "2022-04-23T00:00:00.00000+00:00"} |
+-----------+--------------------------------------------------------+
(1 row)

Time: 5.912 ms
11:44:23 *softwareheritage-scheduler@belvedere:5432=> rollback;
ROLLBACK
Time: 71.357 ms
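If that manual update had to be repeated, the statement could be generated rather than typed by hand. The sketch below only builds the SQL text and its parameters and does not connect to any database; build_state_bump is a hypothetical helper, and the %s placeholders assume a psycopg-style driver.

```python
import json
from datetime import datetime, timezone
from typing import Tuple


def build_state_bump(lister_name: str, new_cdate: datetime) -> Tuple[str, Tuple[str, str]]:
    """Build the UPDATE statement and parameters that set last_repo_cdate
    in the lister state, mirroring the jsonb_set call used in [1]."""
    if new_cdate.tzinfo is None:
        raise ValueError("new_cdate must be timezone-aware")
    sql = (
        "update listers "
        "set current_state = jsonb_set("
        "current_state, '{last_repo_cdate}', %s::jsonb, false) "
        "where name = %s"
    )
    # jsonb_set expects a JSON value, hence the JSON-encoded (quoted) string.
    params = (json.dumps(new_cdate.isoformat()), lister_name)
    return sql, params


sql, params = build_state_bump(
    "bitbucket", datetime(2022, 4, 23, tzinfo=timezone.utc)
)
print(sql)
print(params)
```

As in the session above, running the statement inside a transaction and checking the result before committing keeps the change reversible.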

We had a mysterious error 500 issue with one buggy repository in the bitbucket API in the past, which we had reported to their Jira to no avail, but someone at Octobus (I think it was @marmoute or @Alphare ?) managed to reach out to one of the devs at atlassian to actually debug and fix the issue.

Maybe this contact could be notified again?

In T4239#84980, @olasd wrote:

We had a mysterious error 500 issue with one buggy repository in the bitbucket API in the past, which we had reported to their Jira to no avail, but someone at Octobus (I think it was @marmoute or @Alphare ?) managed to reach out to one of the devs at atlassian to actually debug and fix the issue.

Maybe this contact could be notified again?

Indeed, see T1859#34742. I will report the issue to Bitbucket Jira then.

Our initial contact (Erik von Zijst) is no longer there, but you can try pinging others who were part of the discussion:

  • TJ Kells <tkells@atlassian.com>
  • Daniel Tao <dtao@atlassian.com>
  • Ming Gong <mgong@atlassian.com>
  • Tom Kane <tkane@atlassian.com>
anlambert raised the priority of this task from Normal to High. Wed, Sep 7, 1:30 PM

Raising priority as the lister has now been stuck in production for four months.

13:26 $ psql service=swh-scheduler
psql (12.12 (Debian 12.12-1.pgdg110+1), server 12.11 (Debian 12.11-1.pgdg110+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

softwareheritage-scheduler=> select current_state from listers where name = 'bitbucket';
                      current_state                      
---------------------------------------------------------
 {"last_repo_cdate": "2022-04-22T17:14:21.817530+00:00"}
(1 row)