swh/lister/cpan/__init__.py
Show All 14 Lines | |||||
Origins retrieving strategy | Origins retrieving strategy | ||||
--------------------------- | --------------------------- | ||||
To get a list of all package names and their associated release artifacts we call | To get a list of all package names and their associated release artifacts we call | ||||
a first `http api endpoint`_ that returns results and a ``_scroll_id`` that will | a first `http api endpoint`_ that returns results and a ``_scroll_id`` that will | ||||
be used to scroll pages through the `search`_ endpoint. | be used to scroll pages through the `search`_ endpoint. | ||||
The lister is incremental: it stores the UTC date of its last execution as | |||||
``lister.state.last_listing_date``. When present, that value is used to keep only results | |||||
whose UTC `date`_ is greater than or equal to it. | |||||
Page listing | Page listing | ||||
------------ | ------------ | ||||
Each page returns a list of ``results``, which are raw data from the API response. | Each page returns a list of ``results``, which are raw data from the API response. | ||||
Origins from page | Origins from page | ||||
----------------- | ----------------- | ||||
Show All 22 Lines | |||||
You can follow the lister execution by displaying the logs of the swh-lister service:: | You can follow the lister execution by displaying the logs of the swh-lister service:: | ||||
docker compose logs -f swh-lister | docker compose logs -f swh-lister | ||||
.. _cpan.org: https://cpan.org/ | .. _cpan.org: https://cpan.org/ | ||||
.. _metacpan.org: https://metacpan.org/ | .. _metacpan.org: https://metacpan.org/ | ||||
.. _http api endpoint: https://explorer.metacpan.org/?url=/release/ | .. _http api endpoint: https://explorer.metacpan.org/?url=/release/ | ||||
.. _search: https://github.com/metacpan/metacpan-api/blob/master/docs/API-docs.md#search-without-constraints # noqa: B950 | .. _search: https://github.com/metacpan/metacpan-api/blob/master/docs/API-docs.md#search-without-constraints | ||||
.. _date: https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html#date-params | |||||
""" | """ # noqa: B950 | ||||
def register(): | def register(): | ||||
from .lister import CpanLister | from .lister import CpanLister | ||||
return { | return { | ||||
"lister": CpanLister, | "lister": CpanLister, | ||||
"task_modules": ["%s.tasks" % __name__], | "task_modules": ["%s.tasks" % __name__], | ||||
} | } |
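The incremental strategy described in the docstring (filter on ``last_listing_date``, then scroll with ``_scroll_id``) can be sketched roughly as follows. This is a minimal illustration, not the lister's actual code: `build_query`, `iter_pages`, the injected `fetch` callable, the `size` default and the `hits.hits` response shape are all assumptions made for the example. Only the ``_scroll_id`` key and the `date` field come from the docstring.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterator, List, Optional


def to_utc_string(dt: datetime) -> str:
    """Format a datetime as a UTC timestamp; naive values are assumed to be UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


def build_query(
    last_listing_date: Optional[datetime], size: int = 1000
) -> Dict[str, Any]:
    """Build a search body (hypothetical shape).

    Without a stored date, everything is listed. With one, only releases
    whose ``date`` is greater than or equal to it are requested.
    """
    if last_listing_date is None:
        return {"size": size, "query": {"match_all": {}}}
    return {
        "size": size,
        "query": {"range": {"date": {"gte": to_utc_string(last_listing_date)}}},
    }


def iter_pages(
    fetch: Callable[[Optional[str], Dict[str, Any]], Dict[str, Any]],
    query: Dict[str, Any],
) -> Iterator[List[Dict[str, Any]]]:
    """Yield pages of raw results until the scroll is exhausted.

    ``fetch(scroll_id, query)`` is an injected callable (an assumption of this
    sketch): it performs the first request when ``scroll_id`` is None and a
    scroll request otherwise.
    """
    scroll_id: Optional[str] = None
    while True:
        response = fetch(scroll_id, query)
        results = response.get("hits", {}).get("hits", [])
        if not results:
            break  # an empty page marks the end of the scroll
        yield results
        scroll_id = response.get("_scroll_id")
        if scroll_id is None:
            break  # nothing to scroll with, stop after this page
```

Injecting `fetch` keeps the paging logic independent of any HTTP client, so it can be exercised with a fake that returns canned pages.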