swh/lister/cpan/__init__.py
Show All 10 Lines | |||||
The Cpan lister list origins from `cpan.org`_, the Comprehensive Perl Archive | The Cpan lister list origins from `cpan.org`_, the Comprehensive Perl Archive | ||||
Network. It provides search features via `metacpan.org`_. | Network. It provides search features via `metacpan.org`_. | ||||
As of September 2022 `cpan.org`_ list 43675 package names. | As of September 2022 `cpan.org`_ list 43675 package names. | ||||
Origins retrieving strategy | Origins retrieving strategy | ||||
--------------------------- | --------------------------- | ||||
To get a list of all package names we call a first `http api endpoint`_ that | To get a list of all package names and their associated release artifacts we call | ||||
retrieve results and a ``_scroll_id`` that will be used to scroll pages through | a first `http api endpoint`_ that retrieve results and a ``_scroll_id`` that will | ||||
`search`_ endpoint. | be used to scroll pages through `search`_ endpoint. | ||||
Page listing | Page listing | ||||
------------ | ------------ | ||||
Each page returns a list of ``results`` which are raw data from api response. | Each page returns a list of ``results`` which are raw data from api response. | ||||
Origins from page | Origins from page | ||||
----------------- | ----------------- | ||||
Show All 22 Lines | Then schedule a Cpan listing task:: | ||||
docker compose exec swh-scheduler swh scheduler task add -p oneshot list-cpan | docker compose exec swh-scheduler swh scheduler task add -p oneshot list-cpan | ||||
You can follow lister execution by displaying logs of swh-lister service:: | You can follow lister execution by displaying logs of swh-lister service:: | ||||
docker compose logs -f swh-lister | docker compose logs -f swh-lister | ||||
.. _cpan.org: https://cpan.org/ | .. _cpan.org: https://cpan.org/ | ||||
.. _metacpan.org: https://metacpan.org/ | .. _metacpan.org: https://metacpan.org/ | ||||
.. _http api endpoint: https://explorer.metacpan.org/?url=/distribution/ | .. _http api endpoint: https://explorer.metacpan.org/?url=/release/ | ||||
.. _search: https://github.com/metacpan/metacpan-api/blob/master/docs/API-docs.md#search-without-constraints # noqa: B950 | .. _search: https://github.com/metacpan/metacpan-api/blob/master/docs/API-docs.md#search-without-constraints # noqa: B950 | ||||
""" | """ | ||||
def register(): | def register(): | ||||
from .lister import CpanLister | from .lister import CpanLister | ||||
return { | return { | ||||
"lister": CpanLister, | "lister": CpanLister, | ||||
"task_modules": ["%s.tasks" % __name__], | "task_modules": ["%s.tasks" % __name__], | ||||
} | } |
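
The scroll-based paging described in the docstring's "Origins retrieving strategy" section can be sketched roughly as follows. This is only an illustration, not the lister's actual code: the ``iter_pages`` helper, the ``fetch`` callable, the ``scroll`` / ``scroll_id`` parameter names and the ``hits.hits`` response layout are all assumptions (an Elasticsearch-style response is assumed); only the ``_scroll_id`` field name comes from the docstring.

```python
from typing import Any, Callable, Dict, Iterator, List

Page = List[Dict[str, Any]]
# Hypothetical transport: takes a URL and query parameters, returns decoded JSON.
Fetch = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_pages(fetch: Fetch, first_url: str, scroll_url: str) -> Iterator[Page]:
    """Yield pages of raw results until the API returns an empty page."""
    # First call: returns the initial results plus a ``_scroll_id``.
    data = fetch(first_url, {"scroll": "1m"})
    while True:
        # Assumed response layout: {"_scroll_id": ..., "hits": {"hits": [...]}}
        results = data["hits"]["hits"]
        if not results:
            break
        yield results
        # Follow-up calls pass the latest scroll id to the search endpoint.
        data = fetch(scroll_url, {"scroll": "1m", "scroll_id": data["_scroll_id"]})
```

Passing the transport in as a callable keeps the paging logic testable without network access; in practice ``fetch`` could wrap whatever HTTP client the lister already uses.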