diff --git a/sysadm/deployment/howto-process-add-forge-now-requests.rst b/sysadm/deployment/howto-process-add-forge-now-requests.rst
new file mode 100644
--- /dev/null
+++ b/sysadm/deployment/howto-process-add-forge-now-requests.rst
@@ -0,0 +1,245 @@
+.. _how-to-process-add-forge-now-requests:
+
+How to process add-forge-now requests
+=====================================
+
+.. admonition:: Intended audience
+   :class: important
+
+   sysadm staff members
+
+Processing is semi-automatic for now. Documenting the steps here is a first step towards
+automating them.
+
+
+Introduction
+------------
+
+A forge ticket (`see for example the git.afpy.org ticket
+`_) should
+have been opened by a moderator.
+
+This means the `moderation process is ongoing
+`_ and the upstream
+forge (to be ingested) has been notified that we will start the ingestion soon.
+
+
+.. _add-forge-now-testing-on-staging:
+
+Testing on staging
+------------------
+
+To ensure we can ingest that forge, we start by testing a subset of that forge's listing
+on staging. It's a pre-flight check to determine that we have the right amount of
+information.
+
+On a staging node (usually the scheduling node of the domain), run:
+
+.. code::
+
+   swh scheduler --url http://scheduler0.internal.staging.swh.network:5008/ \
+     add-forge-now --preset staging \
+     register-lister gitea \
+     url=<url>
+
+
+For example, for the forge `git.afpy.org `_, which is a `gitea
+`_ instance, we'd run:
+
+.. code::
+
+   swh scheduler --url http://scheduler0.internal.staging.swh.network:5008/ \
+     add-forge-now --preset staging \
+     register-lister gitea \
+     url=https://git.afpy.org/api/v1/
+
+   INFO:swh.lister.pattern:Max origins per page set, truncated 36 page results down to 30
+   INFO:swh.lister.pattern:Disabling origins before sending them to the scheduler
+   INFO:swh.lister.pattern:Reached page limit of 3, terminating
+
+
+Ensure the :ref:`lister got registered <check-lister-is-registered>` in the staging
+scheduler db.
+
+After a bit of time, you can :ref:`check that origins from that forge got listed
+<check-origins-got-listed>` in the scheduler db.
+
+
+Still on a staging node, we trigger the first ingestion for those origins:
+
+.. code::
+
+   swh scheduler add-forge-now --preset staging \
+     schedule-first-visits \
+     --visit-type <visit-type> \
+     --visit-type <another-visit-type> \
+     --lister-name <lister-name> \
+     --lister-instance-name <lister-instance-name>
+
+For our particular instance:
+
+.. code::
+
+   swh scheduler add-forge-now --preset staging \
+     schedule-first-visits \
+     --visit-type git \
+     --lister-name gitea \
+     --lister-instance-name git.afpy.org
+
+   100 slots available in celery queue
+   15 visits to send to celery
+
+After some time, :ref:`check that those origins got ingested, at least in part
+<check-origins-got-ingested>`.
+
+If everything is fine, let's :ref:`schedule that forge in production
+<add-forge-now-deploying-on-production>`.
+
+
+.. _add-forge-now-deploying-on-production:
+
+Deploying on production
+-----------------------
+
+After :ref:`successfully testing the forge ingestion in staging
+<add-forge-now-testing-on-staging>`, it's time to deploy the full and recurrent listing
+for that forge.
+
+Let's start by registering the lister for that forge as usual:
+
+.. code::
+
+   swh scheduler --url http://saatchi.internal.softwareheritage.org:5008/ \
+     add-forge-now --preset production \
+     register-lister <lister-name> \
+     url=<url>
+
+For example:
+
+.. code::
+
+   swh scheduler --url http://saatchi.internal.softwareheritage.org:5008/ \
+     add-forge-now --preset production \
+     register-lister gitea \
+     url=https://git.afpy.org/api/v1/
+
+Ensure the :ref:`lister got registered <check-lister-is-registered>` in the production
+scheduler db.
+
+After a bit of time, you can :ref:`check that origins from that forge got listed
+<check-origins-got-listed>` in the scheduler db.
+
+Once the listing is through, trigger the add-forge-now scheduling to make a first pass
+on that forge (``--preset production`` can be omitted, as in the example below).
+
+.. code::
+
+   swh scheduler --url http://saatchi.internal.softwareheritage.org:5008/ \
+     add-forge-now --preset production \
+     schedule-first-visits \
+     --visit-type <visit-type> \
+     --lister-name <lister-name> \
+     --lister-instance-name <lister-instance-name>
+
+For example:
+
+.. code::
+
+   swh scheduler --url http://saatchi.internal.softwareheritage.org:5008/ \
+     add-forge-now \
+     schedule-first-visits \
+     --visit-type git \
+     --lister-name gitea \
+     --lister-instance-name git.afpy.org
+
+   10000 slots available in celery queue
+   37 visits to send to celery
+
+After a while, :ref:`check that those origins have been ingested, at least in part
+<check-origins-got-ingested>`. You can then notify the moderator in the ticket that the
+first ingestion is done.
+
+.. _add-forge-now-checks:
+
+Usual checks
+------------
+
+The following demonstrates the usual checks run against the scheduler db. Each check
+shows the generic query to execute, followed by an actual execution (with sampled
+output).
+
+.. _check-lister-is-registered:
+
+Check the lister is registered
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code::
+
+   select * from listers where name='<lister-name>' and instance_name='<lister-instance-name>';
+
+For example:
+
+.. code::
+
+   2022-12-06 11:50:17 swh-scheduler@db1:5432 λ select * from listers where name='gitea' and instance_name='git.afpy.org';
+   +--------------------------------------+-------+---------------+-------------------------------+
+   |                  id                  | name  | instance_name |            created            | ...
+   +--------------------------------------+-------+---------------+-------------------------------+
+   | d07d1c90-5016-4ab6-91ac-3300f8eb4fc6 | gitea | git.afpy.org  | 2022-12-06 10:47:46.975571+00 |
+   +--------------------------------------+-------+---------------+-------------------------------+
+   (1 row)
+
+   Time: 4.109 ms
+
+.. _check-origins-got-listed:
+
+Check origins got listed
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code::
+
+   select * from listed_origins
+   where lister_id = (select id from listers where name='<lister-name>'
+                      and instance_name='<lister-instance-name>');
+
+.. code::
+
+   2022-12-06 11:50:24 swh-scheduler@db1:5432 λ select * from listed_origins where lister_id = (select id from listers where name='gitea' and instance_name='git.afpy.org');
+   +--------------------------------------+-----------------------------------------------------------+------------+
+   |              lister_id               |                            url                            | visit_type | ...
+   +--------------------------------------+-----------------------------------------------------------+------------+
+   | d07d1c90-5016-4ab6-91ac-3300f8eb4fc6 | https://git.afpy.org/AFPy/afpy.org.git                    | git        |
+   | d07d1c90-5016-4ab6-91ac-3300f8eb4fc6 | https://git.afpy.org/foxmask/baeuda.git                   | git        |
+   | d07d1c90-5016-4ab6-91ac-3300f8eb4fc6 | https://git.afpy.org/fcode/boilerplate-python.git         | git        |
+   | ...
+   +--------------------------------------+-----------------------------------------------------------+------------+
+   (15 rows)
+
+   Time: 1225.399 ms (00:01.225)
+
+
+.. _check-origins-got-ingested:
+
+Check origins got ingested
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code::
+
+   select * from origin_visit_stats
+   where visit_type='<visit-type>'
+   and url like 'https://<forge-hostname>%';
+
+.. code::
+
+   2022-12-06 18:47:43 swh-scheduler@db1:5432 λ select * from origin_visit_stats where visit_type='git' and url like 'https://git.afpy.org%';
+   +-----------------------------------------------------------+------------+--------------------------------------------+-------------------------------+---------------------------+----------------------+-------------------+-------------------------------+-------------------------------+-------------------+
+   |                            url                            | visit_type |               last_snapshot                |        last_scheduled         | next_visit_queue_position | next_position_offset | successive_visits |        last_successful        |          last_visit           | last_visit_status |
+   +-----------------------------------------------------------+------------+--------------------------------------------+-------------------------------+---------------------------+----------------------+-------------------+-------------------------------+-------------------------------+-------------------+
+   | https://git.afpy.org/fcode/boilerplate-python.git         | git        | \x812e8ff75a6424c51cfd22a9202503f21cccf13d | 2022-12-06 17:16:22.384523+00 |              167043832784 |                    4 |                 1 | 2022-12-06 17:17:28.58761+00  | 2022-12-06 17:17:28.58761+00  | successful        |
+   | https://git.afpy.org/AFPy/infra.git                       | git        | \x2696466a836f7411e35d3802d0de31fb5a3a1c5d | 2022-12-06 17:16:22.384523+00 |              167043839992 |                    4 |                 1 | 2022-12-06 17:27:31.499761+00 | 2022-12-06 17:27:31.499761+00 | successful        |
+   | https://git.afpy.org/mdk/git-xss-locator.git              | git        | \xa55b9be6a67299b668084a68c2775081f3cfe255 | 2022-12-06 17:16:22.384523+00 |              167043822872 |                    4 |                 1 | 2022-12-06 17:17:11.161265+00 | 2022-12-06 17:17:11.161265+00 | successful        |
+   ...
+   +-----------------------------------------------------------+------------+--------------------------------------------+-------------------------------+---------------------------+----------------------+-------------------+-------------------------------+-------------------------------+-------------------+
+   (15 rows)
+
+   Time: 933.779 ms
diff --git a/sysadm/deployment/index.rst b/sysadm/deployment/index.rst
--- a/sysadm/deployment/index.rst
+++ b/sysadm/deployment/index.rst
@@ -11,3 +11,4 @@
    howto-debian-packaging
    jenkins
    argocd
+   howto-process-add-forge-now-requests