
Deploy Gogs lister to staging
Closed, Migrated

Description

Deploy Gogs lister to staging.
It lists 'git' origins (nothing to do for loader).

Plan:

  • Register task type to scheduler [1]
  • Register swhbot account on the gogs site (try.gogs.io for the tryout)
  • Generate access token and install it in the credential repository
  • Update charts with new lister
  • Schedule a gogs forge to list [2]
  • Checks
    • Does not finish properly, gets stuck behind T4533.
    • Release v3.0.2 with the fix deployed -> 160 origins listed

[1]

10:25:14 swh-scheduler@db1:5432=> select * from task_type where type like 'list-gogs%';
+----------------+--------------------------------+----------------------------------------+------------------+--------------+--------------+----------------+------------------+-------------+-------------+
|      type      |          description           |              backend_name              | default_interval | min_interval | max_interval | backoff_factor | max_queue_length | num_retries | retry_delay |
+----------------+--------------------------------+----------------------------------------+------------------+--------------+--------------+----------------+------------------+-------------+-------------+
| list-gogs-full | Full update of a Gogs instance | swh.lister.gogs.tasks.FullGogsRelister | 90 days          | 90 days      | 90 days      |              1 |             1000 |      (null) | (null)      |
+----------------+--------------------------------+----------------------------------------+------------------+--------------+--------------+----------------+------------------+-------------+-------------+
(1 row)

Time: 1159.368 ms (00:01.159)

[2]

swhscheduler@scheduler0:~$ swh scheduler --url http://scheduler0.internal.staging.swh.network:5008/ task add list-gogs-full https://try.gogs.io/api/v1/
Created 1 tasks

Task 33419470
  Next run: today (2022-09-13T09:12:08.867813+00:00)
  Interval: 90 days, 0:00:00
  Type: list-gogs-full
  Policy: recurring
  Args:
    'https://try.gogs.io/api/v1/'
  Keyword args:
13:52:08 swh-scheduler@db1:5432=> select now(), visit_type, count(*) from listed_origins where lister_id = ( select id from listers where name='gogs') group by visit_type;
+-------------------------------+------------+-------+
|              now              | visit_type | count |
+-------------------------------+------------+-------+
| 2022-09-21 11:52:14.859315+00 | git        |   160 |
+-------------------------------+------------+-------+
(1 row)

Time: 81.638 ms

Event Timeline

vlorentz triaged this task as Normal priority.Sep 1 2022, 10:26 AM
vlorentz added a project: Origin-Gitea/Gogs.
ardumont updated the task description.
ardumont changed the task status from Open to Work in Progress.Sep 13 2022, 11:31 AM
ardumont moved this task from Backlog to in-progress on the System administration board.

It did not appear to be doing much [0]. However, T4533 was raised and [1] went down a bit earlier.
I've stopped the lister in the meantime.

[0]

14:18:12 swh-scheduler@db1:5432=> select now(), visit_type, count(*) from listed_origins where lister_id = (select id from listers where name='gogs') group by visit_type;
+-----+------------+-------+
| now | visit_type | count |
+-----+------------+-------+
+-----+------------+-------+
(0 rows)

Time: 78.367 ms
14:18:20 swh-scheduler@db1:5432=> select now(), * from listers where name='gogs';
+-------------------------------+--------------------------------------+------+---------------+-------------------------------+---------------+-------------------------------+
|              now              |                  id                  | name | instance_name |            created            | current_state |            updated            |
+-------------------------------+--------------------------------------+------+---------------+-------------------------------+---------------+-------------------------------+
| 2022-09-13 12:18:29.230675+00 | ec6c8408-42f9-47b8-b045-5a812d4a0eff | gogs | try.gogs.io   | 2022-09-13 09:13:00.647726+00 | {}            | 2022-09-13 09:13:00.647726+00 |
+-------------------------------+--------------------------------------+------+---------------+-------------------------------+---------------+-------------------------------+
(1 row)

Time: 4.721 ms

[1] https://try.gogs.io/

Thanks. I will try to reproduce this locally.

try.gogs.io being down must have been a coincidence last week.

I re-triggered a listing (debug mode) and it just gets stuck on the issue (from the task mentioned above).

[2022-09-20 09:08:32,414: INFO/MainProcess] Task swh.lister.gogs.tasks.FullGogsRelister[740fe66e-1b02-43ce-a696-0f6f22272617] received
[2022-09-20 09:08:32,542: DEBUG/ForkPoolWorker-1] Loading config file /etc/swh/config.yml
[2022-09-20 09:08:32,819: INFO/ForkPoolWorker-1] Using authentication credentials from user swhbot
[2022-09-20 09:08:32,826: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search with params {'limit': 50, 'page': 1}
[2022-09-20 09:08:34,404: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=2 with params {}
[2022-09-20 09:08:34,733: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=3 with params {}
[2022-09-20 09:08:35,115: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=4 with params {}
[2022-09-20 09:08:35,480: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=5 with params {}
[2022-09-20 09:08:35,937: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=6 with params {}
[2022-09-20 09:08:36,233: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=7 with params {}
[2022-09-20 09:08:36,539: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=8 with params {}
[2022-09-20 09:08:36,911: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=9 with params {}
[2022-09-20 09:08:37,313: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=10 with params {}
[2022-09-20 09:08:37,768: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=11 with params {}
[2022-09-20 09:08:38,091: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=12 with params {}
[2022-09-20 09:08:38,419: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=13 with params {}
[2022-09-20 09:08:38,798: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=14 with params {}
[2022-09-20 09:08:39,165: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=15 with params {}
[2022-09-20 09:08:39,455: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=16 with params {}
[2022-09-20 09:08:39,843: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=17 with params {}
[2022-09-20 09:08:40,047: WARNING/ForkPoolWorker-1] Unexpected HTTP status code 500 on https://try.gogs.io/api/v1/repos/search?page=17: b''
[2022-09-20 09:08:40,390: ERROR/ForkPoolWorker-1] Task swh.lister.gogs.tasks.FullGogsRelister[740fe66e-1b02-43ce-a696-0f6f22272617] raised unexpected: HTTPError('500 Server Error: Internal Server Error for url: https://try.gogs.io/api/v1/repos/search?page=17')
Traceback (most recent call last):
  File "/opt/swh/.local/lib/python3.10/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/opt/swh/.local/lib/python3.10/site-packages/sentry_sdk/integrations/celery.py", line 204, in _inner
    reraise(*exc_info)
  File "/opt/swh/.local/lib/python3.10/site-packages/sentry_sdk/_compat.py", line 54, in reraise
    raise value
  File "/opt/swh/.local/lib/python3.10/site-packages/sentry_sdk/integrations/celery.py", line 199, in _inner
    return f(*args, **kwargs)
  File "/opt/swh/.local/lib/python3.10/site-packages/swh/scheduler/task.py", line 61, in __call__
    result = super().__call__(*args, **kwargs)
  File "/opt/swh/.local/lib/python3.10/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/opt/swh/.local/lib/python3.10/site-packages/swh/lister/gogs/tasks.py", line 23, in list_gogs_full
    return lister.run().dict()
  File "/opt/swh/.local/lib/python3.10/site-packages/swh/lister/pattern.py", line 127, in run
    for page in self.get_pages():
  File "/opt/swh/.local/lib/python3.10/site-packages/swh/lister/gogs/lister.py", line 166, in get_pages
    response = self.page_request(next_link, {})
  File "/opt/swh/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 324, in wrapped_f
    return self(f, *args, **kw)
  File "/opt/swh/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/opt/swh/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 349, in iter
    return fut.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/opt/swh/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/opt/swh/.local/lib/python3.10/site-packages/swh/lister/gogs/lister.py", line 136, in page_request
    response.raise_for_status()
  File "/opt/swh/.local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://try.gogs.io/api/v1/repos/search?page=17

So a fix is in order, as @vlorentz and @anlambert suggested and described [1].

[1] T4533#91327

@ardumont because of T4423 we will face 500 for https://try.gogs.io/api/v1/repos/search?page=17 no matter how many times we retry (until it is fixed by Gogs maintainers). So should we not ignore the page and move on to the next one?

In this forge task, I also suggested temporarily changing the page size to 1 in order to skip the fatal repos.

@ardumont because of T4423 we will face 500 for https://try.gogs.io/api/v1/repos/search?page=17 no matter how many times we retry (until it is fixed by Gogs maintainers).

Right!

So should we not ignore the page and move on to the next one?

That'd make sense.

In this forge task, I also suggested temporarily changing the page size to 1 in order to skip the fatal repos.

Thanks for reminding me of this, I didn't recall it (and it's there alright ;).

I can imagine that even with a page size of 1, you could still end up with a 500, so we might as well ignore the 500 as you suggested above.
That way we have a chance to complete the listing.
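
A minimal sketch of the "ignore the 500 and move on" idea discussed above. This is not the actual swh.lister.gogs code: it assumes a hypothetical `fetch(page)` callable that returns a status code and a list of repos, and a last page number known in advance, whereas the real lister follows next links.

```python
from typing import Callable, Iterator, List, Tuple

# Hypothetical fetcher type: given a page number, return (status_code, repos).
Fetch = Callable[[int], Tuple[int, List[dict]]]


def iter_pages(fetch: Fetch, last_page: int) -> Iterator[List[dict]]:
    """Yield repo pages 1..last_page, skipping pages that answer HTTP 500."""
    for page in range(1, last_page + 1):
        status, repos = fetch(page)
        if status == 500:
            # Retrying does not help when the server-side failure is
            # deterministic, so skip this page and continue with the next.
            continue
        yield repos


def fake_fetch(page: int) -> Tuple[int, List[dict]]:
    """Stand-in for the remote API: page 17 always fails (assumption)."""
    if page == 17:
        return 500, []
    return 200, [{"id": page}]


pages = list(iter_pages(fake_fetch, last_page=20))
```

With the stand-in fetcher, the listing completes and only the failing page is missing from `pages`.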

I was wondering why the number of listed repos was so low; it turns out Gogs has an option, set when creating a repository, to not list it through the API.

Just tested the lister in docker and I got a lot more origins listed:

docker-swh-lister-1  | [2022-09-21 12:24:57,583: INFO/MainProcess] Task swh.lister.gogs.tasks.FullGogsRelister[2b0ba017-4917-44dc-9511-e860bb431322] received
docker-swh-lister-1  | [2022-09-21 12:24:57,584: DEBUG/ForkPoolWorker-1] Loading config file /lister.yml
docker-swh-lister-1  | [2022-09-21 12:24:57,594: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search with params {'limit': 50, 'page': 1}
docker-swh-lister-1  | [2022-09-21 12:24:58,556: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=2 with params {}
docker-swh-lister-1  | [2022-09-21 12:24:58,791: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=3 with params {}
docker-swh-lister-1  | [2022-09-21 12:24:59,018: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=4 with params {}
docker-swh-lister-1  | [2022-09-21 12:24:59,263: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=5 with params {}
docker-swh-lister-1  | [2022-09-21 12:24:59,580: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=6 with params {}
docker-swh-lister-1  | [2022-09-21 12:24:59,845: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=7 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:00,094: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=8 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:00,323: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=9 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:00,560: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=10 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:00,787: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=11 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:01,020: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=12 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:01,324: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=13 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:01,794: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=14 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:02,191: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=15 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:02,549: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=16 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:02,967: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=17 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:03,250: WARNING/ForkPoolWorker-1] Unexpected HTTP status code 500 on https://try.gogs.io/api/v1/repos/search?page=17: b''
docker-swh-lister-1  | [2022-09-21 12:25:03,250: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=18 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:03,515: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=19 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:03,765: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=20 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:04,078: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=21 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:04,444: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=22 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:04,695: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=23 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:04,948: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=24 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:05,262: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=25 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:05,814: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=26 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:06,184: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=27 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:06,454: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=28 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:06,688: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=29 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:06,935: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=30 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:07,247: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=31 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:07,500: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=32 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:07,734: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=33 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:08,064: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=34 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:08,337: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=35 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:08,581: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=36 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:08,818: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=37 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:09,057: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=38 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:09,289: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=39 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:09,531: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=40 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:09,756: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=41 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:09,999: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=42 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:10,213: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=43 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:10,444: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=44 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:10,689: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=45 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:10,914: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=46 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:11,163: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=47 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:11,390: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=48 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:11,710: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=49 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:12,068: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=50 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:12,342: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=51 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:12,619: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=52 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:12,862: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=53 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:13,169: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=54 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:13,549: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search?page=55 with params {}
docker-swh-lister-1  | [2022-09-21 12:25:13,827: DEBUG/ForkPoolWorker-1] Start from server, version: 0.9, properties: {'capabilities': {'publisher_confirms': True, 'exchange_exchange_bindings': True, 'basic.nack': True, 'consumer_cancel_notify': True, 'connection.blocked': True, 'consumer_priorities': True, 'authentication_failure_close': True, 'per_consumer_qos': True, 'direct_reply_to': True}, 'cluster_name': 'rabbit@0f4428ad1388', 'copyright': 'Copyright (C) 2007-2018 Pivotal Software, Inc.', 'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'platform': 'Erlang/OTP 19.2.1', 'product': 'RabbitMQ', 'version': '3.6.16'}, mechanisms: [b'AMQPLAIN', b'PLAIN'], locales: ['en_US']
docker-swh-lister-1  | [2022-09-21 12:25:13,828: DEBUG/ForkPoolWorker-1] using channel_id: 1
docker-swh-lister-1  | [2022-09-21 12:25:13,831: DEBUG/ForkPoolWorker-1] Channel open
docker-swh-lister-1  | [2022-09-21 12:25:13,833: INFO/ForkPoolWorker-1] Task swh.lister.gogs.tasks.FullGogsRelister[2b0ba017-4917-44dc-9511-e860bb431322] succeeded in 16.23296177299926s: {'pages': 55, 'origins': 575}

Nevertheless, there are fewer origins in the scheduler db; maybe some get listed more than once, need to check.

14:27 $ doco exec swh-scheduler /bin/bash
swh@94a1eb125b9f:/$ psql swh-scheduler;
psql (12.12 (Debian 12.12-1.pgdg110+1))
Type "help" for help.

swh-scheduler=# select name, id from listers;
     name      |                  id                  
---------------+--------------------------------------
 save-code-now | ee7641fa-bf80-4acb-a41d-a76203c2a677
 gogs          | 2cc0c371-a6b6-4b09-819f-806041513044
(2 rows)

swh-scheduler=# select count(*) from listed_origins where lister_id = '2cc0c371-a6b6-4b09-819f-806041513044';
 count 
-------
   447
(1 row)

Nevertheless, there are fewer origins in the scheduler db; maybe some get listed more than once, need to check.

Confirmed after hacking on the lister code: some origins can be listed more than once, but listing https://try.gogs.io should return 447 origins, not 160.
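
For illustration, a small sketch of how a run can report more origins than end up in the scheduler db when some repositories appear on several pages. It assumes each repo is a dict with a `clone_url` key used as the origin URL; the sample data is made up.

```python
from typing import Iterable, List, Tuple


def count_origins(pages: Iterable[List[dict]]) -> Tuple[int, int]:
    """Return (listed, unique): repos seen in total vs distinct origin URLs."""
    seen = set()
    listed = 0
    for page in pages:
        for repo in page:
            listed += 1
            # Assumption: the origin URL is the repo's "clone_url" field.
            seen.add(repo["clone_url"])
    return listed, len(seen)


# Made-up sample: the second page repeats one repo from the first page.
sample_pages = [
    [{"clone_url": "https://example.org/a.git"},
     {"clone_url": "https://example.org/b.git"}],
    [{"clone_url": "https://example.org/b.git"},
     {"clone_url": "https://example.org/c.git"}],
]
listed, unique = count_origins(sample_pages)
```

Here `listed` counts every entry returned by the API, while `unique` is what a table keyed on the origin URL would hold.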

I noticed your remarks, but that's not what's happening in staging [3], for some unknown reason [1].

Version: image: softwareheritage/lister:20220921.1 -> which means lister v3.0.2 (so
including Kumar's fix) [2]

Did you activate something special in the account token you are using?

[1]

│ listers [2022-09-23 08:25:26,261: INFO/MainProcess] Task swh.lister.gogs.tasks.FullGogsRelister[781652e8-0f3b-4418-908c-4f6e8aad7812] received                                                                                              │
│ listers [2022-09-23 08:25:26,507: INFO/ForkPoolWorker-1] Using authentication credentials from user swhbot                                                                                                                                  │
│ listers [2022-09-23 08:25:27,567: INFO/ForkPoolWorker-1] Task swh.lister.gogs.tasks.FullGogsRelister[781652e8-0f3b-4418-908c-4f6e8aad7812] succeeded in 1.0819350309902802s: {'pages': 1, 'origins': 0}

[2] https://forge.softwareheritage.org/source/swh-apps/history/master/;swh-lister-20220921.1

[3]

10:24:32 swh-scheduler@db1:5432=> select now(), * from task where type = 'list-gogs-full';
+-------------------------------+----------+----------------+---------------------------------------------------------+-------------------------------+------------------+------------------------+-----------+--------------+----------+
|              now              |    id    |      type      |                        arguments                        |           next_run            | current_interval |         status         |  policy   | retries_left | priority |
+-------------------------------+----------+----------------+---------------------------------------------------------+-------------------------------+------------------+------------------------+-----------+--------------+----------+
| 2022-09-23 08:32:15.149015+00 | 33419470 | list-gogs-full | {"args": ["https://try.gogs.io/api/v1/"], "kwargs": {}} | 2022-12-22 08:25:27.569387+00 | 90 days          | next_run_not_scheduled | recurring |            0 | (null)   |
+-------------------------------+----------+----------------+---------------------------------------------------------+-------------------------------+------------------+------------------------+-----------+--------------+----------+
(1 row)

Time: 4.898 ms

As a shot in the dark, I've changed the list task arguments [1], but I'd be surprised if that changed anything...

...

Nope, still the same [2]

[1]

10:39:13 swh-scheduler@db1:5432=> select now(), id, type, next_run, arguments, status from task where type = 'list-gogs-full';
+-------------------------------+----------+----------------+-------------------------------+----------------------------------------------------------------+------------------------+
|              now              |    id    |      type      |           next_run            |                           arguments                            |         status         |
+-------------------------------+----------+----------------+-------------------------------+----------------------------------------------------------------+------------------------+
| 2022-09-23 08:39:16.460374+00 | 33419470 | list-gogs-full | 2022-12-22 08:25:27.569387+00 | {"args": ["https://try.gogs.io/api/v1/"], "kwargs": {}}        | disabled               |
| 2022-09-23 08:39:16.460374+00 | 33419563 | list-gogs-full | 2022-12-22 08:37:47.615091+00 | {"args": [], "kwargs": {"url": "https://try.gogs.io/api/v1/"}} | next_run_not_scheduled |
+-------------------------------+----------+----------------+-------------------------------+----------------------------------------------------------------+------------------------+
(2 rows)

Time: 4.914 ms

[2]

│ listers [2022-09-23 08:41:26,598: DEBUG/ForkPoolWorker-2] Loading config file /etc/swh/config.yml                                                                                                                                           │
│ listers [2022-09-23 08:41:26,793: INFO/ForkPoolWorker-2] Using authentication credentials from user swhbot                                                                                                                                  │
│ listers [2022-09-23 08:41:26,803: DEBUG/ForkPoolWorker-2] Fetching URL https://try.gogs.io/api/v1/repos/search with params {'limit': 50, 'page': 17}                                                                                        │
│ listers [2022-09-23 08:41:27,591: DEBUG/ForkPoolWorker-2] Start from server, version: 0.9, properties: {'capabilities': {'publisher_confirms': True, 'exchange_exchange_bindings': True, 'basic.nack': True, 'consumer_cancel_notify': True │
│ listers [2022-09-23 08:41:27,595: DEBUG/ForkPoolWorker-2] using channel_id: 1                                                                                                                                                               │
│ listers [2022-09-23 08:41:27,598: DEBUG/ForkPoolWorker-2] Channel open                                                                                                                                                                      │
│ listers [2022-09-23 08:41:27,609: INFO/ForkPoolWorker-2] Task swh.lister.gogs.tasks.FullGogsRelister[feb27905-3458-40bc-af46-298ae695d214] succeeded in 0.9511380520416424s: {'pages': 1, 'origins': 0}                                     │

Ah, I think I got it...

The workaround from @KShivendu assumes the URL has already been built correctly... and
then, when it hits an issue, it builds the new link from the one received in the
response to the first query... [1]
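
As a rough sketch of what "building the new link" from a failing one can look like: the helper below bumps the `page` query parameter of a URL. The function name is hypothetical and this is not the code from the actual fix; it only assumes that pagination is driven by a `page` parameter, as the logged URLs suggest.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def next_page_url(url: str) -> str:
    """Return the same URL with its `page` query parameter incremented."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # A URL without an explicit page is treated as page 1 (assumption).
    page = int(query.get("page", ["1"])[0])
    query["page"] = [str(page + 1)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


skipped_to = next_page_url("https://try.gogs.io/api/v1/repos/search?page=17")
```

This kind of helper works whether the failing link came from a previous response or from saved state, since it only looks at the URL itself.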

But on my side, the "stateless" lister starts from the problematic page (17), and then
simply stops... [2].

Note that the terminology is off here for a lister typed as "stateless". If it
is indeed a stateless lister, it should start from scratch each time it runs. If it
behaves incrementally (as the behavior described here suggests), then it should
probably be typed differently.

[1]

| [2022-09-21 12:24:57,594: DEBUG/ForkPoolWorker-1] Fetching URL https://try.gogs.io/api/v1/repos/search with params {'limit': 50, 'page': 1}
...
| [2022-09-21 12:25:03,250: WARNING/ForkPoolWorker-1] Unexpected HTTP status code 500 on https://try.gogs.io/api/v1/repos/search?page=17: b''  <- problematic page kumar worked around

[2] Staging lister starts from this erratic page and stops.

...
listers [2022-09-23 08:41:26,803: DEBUG/ForkPoolWorker-2] Fetching URL https://try.gogs.io/api/v1/repos/search with params {'limit': 50, 'page': 17}
listers [2022-09-23 08:41:27,609: INFO/ForkPoolWorker-2] Task swh.lister.gogs.tasks.FullGogsRelister[feb27905-3458-40bc-af46-298ae695d214] succeeded in 0.9511380520416424s: {'pages': 1, 'origins': 0}

But on my side, the "stateless" lister starts from the problematic page (17), and then
simply stops... [2].

Note that the terminology is off here for a lister typed as "stateless". If it
is indeed a stateless lister, it should start from scratch each time it runs. If it
behaves incrementally (as the behavior described here suggests), then it should
probably be typed differently.

Bingo, it is a stateful lister... so never mind my remark ^

Either I remembered an old stateless implementation or I was thinking of another
implementation altogether.

I'll unstick the situation by resetting that state.

@KShivendu, you may want to deal with that edge case though (please create another
task/diff if you want to address it). If the first page of the saved state is not
working, the next time it triggers, the current lister implementation won't list anything
¯\_(ツ)_/¯.

[1]

10:41:16 swh-scheduler@db1:5432=> select * from listers where name='gogs';
+--------------------------------------+------+---------------+-------------------------------+-------------------------------------------------------------------------------------------------------+-------------------------------+
|                  id                  | name | instance_name |            created            |                                             current_state                                             |            updated            |
+--------------------------------------+------+---------------+-------------------------------+-------------------------------------------------------------------------------------------------------+-------------------------------+
| ec6c8408-42f9-47b8-b045-5a812d4a0eff | gogs | try.gogs.io   | 2022-09-13 09:13:00.647726+00 | {"last_seen_repo_id": 7591, "last_seen_next_link": "https://try.gogs.io/api/v1/repos/search?page=17"} | 2022-09-20 09:08:40.081281+00 |
+--------------------------------------+------+---------------+-------------------------------+-------------------------------------------------------------------------------------------------------+-------------------------------+
(1 row)

Time: 5.468 ms

I'll unstick the situation by resetting that state.

State reset [1] and listing re-triggered [2], which resulted in nearly the same number of
origins as @anlambert got (576 vs. 575) [3].

[1]

11:00:17 swh-scheduler@db1:5432=> update listers set current_state='{}' where name='gogs';
UPDATE 1
Time: 162.686 ms

[2]

11:00:24 swh-scheduler@db1:5432=> update task set next_run=now(), status='next_run_not_scheduled' where type='list-gogs-full' and id=33419563;
UPDATE 1
Time: 168.393 ms

[3]

│ listers [2022-09-23 09:02:03,628: DEBUG/ForkPoolWorker-3] Fetching URL https://try.gogs.io/api/v1/repos/search?page=55 with params {}
| ...
│ listers [2022-09-23 09:02:04,270: INFO/ForkPoolWorker-3] Task swh.lister.gogs.tasks.FullGogsRelister[e8d72e76-4059-405c-b175-9a2055849365] succeeded in 89.15930224396288s: {'pages': 55, 'origins': 576}
ardumont claimed this task.
ardumont moved this task from deployed/landed/monitoring to done on the System administration board.

Either I remembered an old stateless implementation or I was thinking of another
implementation altogether.

The latter, I remembered the pubdev lister which is stateless... closing the subject ;)

Thanks to @anlambert, here is another possible public Gogs instance [1] to check against.

[1] https://gogs.univ-littoral.fr/explore/repos