Details
- Reviewers: vlorentz
- Group Reviewers: Reviewers
- Maniphest Tasks: T2833: cpan.loader - archive Perl modules from CPAN
- Commits: rDLS05cd1de1cde7: cpan: Fix module version extraction for some edge cases
Diff Detail
- Repository: rDLS Listers
- Lint: Automatic diff as part of commit; lint not applicable.
- Unit: Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build is green
Patch application report for D8648 (id=31231)
Could not rebase; Attempt merge onto 108816f232...
Updating 108816f..8d26db1
Fast-forward
 swh/lister/cpan/__init__.py | 8 +-
 swh/lister/cpan/lister.py | 144 ++++++++++--
 ...TU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw== | 50 -----
 ...NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1 | 16 --
 .../v1__search_scroll_page1 | 247 +++++++++++++++++++++
 .../v1__search_scroll_page2 | 39 ++++
 .../v1__search_scroll_page3 | 85 +++++++
 .../v1__search_scroll_page4 | 131 +++++++++++
 ...ibution__search,fields=name,size=1000,scroll=1m | 52 -----
 .../https_fastapi.metacpan.org/v1_release__search | 246 ++++++++++++++++++++
 swh/lister/cpan/tests/test_lister.py | 166 ++++++++++++--
 11 files changed, 1025 insertions(+), 159 deletions(-)
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page2
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page3
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page4
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_distribution__search,fields=name,size=1000,scroll=1m
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_release__search
Changes applied before test
commit 8d26db1cf78bddfb005addd2bc41fdca44fc19f4
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Mon Oct 10 15:55:54 2022 +0200

    cpan: Fix module version extraction for some edge cases

    CPAN API can return versions that are not of str type: either int or float. When version equals 0, it means that version failed to be parsed by CPAN so we try to extract it from release name in that case. Otherwise we ensure to convert the version to str type.

    Related to T2833

commit 2177ac9f5a08c2bd276f494b2aa4c8f0d4239e65
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Tue Sep 27 16:34:38 2022 +0200

    cpan: Improve listing process by querying the metacpan release endpoint

    Instead of querying the metacpan distribution endpoint to list origins, prefer to use the release endpoint instead enabling to list all artifacts associated to CPAN packages by scrolling results.

    Compared to previous implementation, it enables to compute a last_update date for all CPAN packages but also to obtain artifact sha256 checksums that will be used by the CPAN loader to check downloads integrity.

    As the multiple versions of a module are spread across multiple pages from the CPAN API, origins are sent to the scheduler once all pages processed, it is also faster to proceed that way.

    Also compute extrinsic metadata URL for each perl module versions in order for the cpan loader to query it.

    Related to T2833
See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/775/ for more details.
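The listing strategy described in the second commit message above (collect releases from every page, then emit one origin per distribution) can be sketched roughly as follows. This is only an illustration, not the code from swh/lister/cpan/lister.py: the function name `group_releases`, the record field names (`distribution`, `version`, `download_url`, `checksum_sha256`, `date`) and the shape of the returned dictionary are assumptions, and the pages are plain in-memory lists rather than real API responses.

```python
from collections import defaultdict
from datetime import datetime
from typing import Any, Dict, Iterable, List


def group_releases(
    pages: Iterable[List[Dict[str, Any]]]
) -> Dict[str, Dict[str, Any]]:
    """Group release records from all pages by distribution name.

    The field names used here are assumptions made for this sketch.
    """
    origins: Dict[str, Dict[str, Any]] = defaultdict(
        lambda: {"artifacts": [], "last_update": None}
    )
    for page in pages:
        for release in page:
            entry = origins[release["distribution"]]
            entry["artifacts"].append(
                {
                    "version": str(release["version"]),
                    "url": release["download_url"],
                    "sha256": release["checksum_sha256"],
                }
            )
            # Keep the most recent release date as the origin's last update.
            date = datetime.fromisoformat(release["date"])
            if entry["last_update"] is None or date > entry["last_update"]:
                entry["last_update"] = date
    # Results are returned only after every page has been consumed, because
    # the versions of one distribution may be spread over several pages.
    return dict(origins)
```

Returning the grouped result only at the end mirrors the point made in the commit message: an origin cannot be considered complete until the last page has been read.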
Build has FAILED
Patch application report for D8648 (id=31253)
Could not rebase; Attempt merge onto 108816f232...
Updating 108816f..2777809
Fast-forward
 swh/lister/cpan/__init__.py | 8 +-
 swh/lister/cpan/lister.py | 158 ++++++++++---
 ...TU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw== | 50 -----
 ...NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1 | 16 --
 .../v1__search_scroll_page1 | 247 +++++++++++++++++++++
 .../v1__search_scroll_page2 | 39 ++++
 .../v1__search_scroll_page3 | 85 +++++++
 .../v1__search_scroll_page4 | 131 +++++++++++
 ...ibution__search,fields=name,size=1000,scroll=1m | 52 -----
 .../https_fastapi.metacpan.org/v1_release__search | 246 ++++++++++++++++++++
 swh/lister/cpan/tests/test_lister.py | 165 ++++++++++++--
 11 files changed, 1037 insertions(+), 160 deletions(-)
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page2
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page3
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page4
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_distribution__search,fields=name,size=1000,scroll=1m
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_release__search
Changes applied before test
commit 27778090c535fa473ea08bd6f5a9e0a491de573a
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Mon Oct 10 15:55:54 2022 +0200

    cpan: Fix module version extraction for some edge cases

    CPAN API can return versions that are not of str type: either int or float. When version equals 0, it means that version failed to be parsed by CPAN so we try to extract it from release name in that case. Otherwise we ensure to convert the version to str type.

    Related to T2833

commit 5042a43e31c091d186a7e38c36df0235f6cd65e7
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Tue Sep 27 16:34:38 2022 +0200

    cpan: Improve listing process by querying the metacpan release endpoint

    Instead of querying the metacpan distribution endpoint to list origins, prefer to use the release endpoint instead enabling to list all artifacts associated to CPAN packages by scrolling results.

    Compared to previous implementation, it enables to compute a last_update date for all CPAN packages but also to obtain artifact sha256 checksums that will be used by the CPAN loader to check downloads integrity.

    As the multiple versions of a module are spread across multiple pages from the CPAN API, origins are sent to the scheduler once all pages processed, it is also faster to proceed that way.

    Related to T2833
Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/778/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/778/console
Build has FAILED
Patch application report for D8648 (id=31259)
Could not rebase; Attempt merge onto 108816f232...
Updating 108816f..729d9b6
Fast-forward
 swh/lister/cpan/__init__.py | 8 +-
 swh/lister/cpan/lister.py | 158 ++++++++++---
 ...TU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw== | 50 -----
 ...NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1 | 16 --
 .../v1__search_scroll_page1 | 247 +++++++++++++++++++++
 .../v1__search_scroll_page2 | 39 ++++
 .../v1__search_scroll_page3 | 85 +++++++
 .../v1__search_scroll_page4 | 131 +++++++++++
 ...ibution__search,fields=name,size=1000,scroll=1m | 52 -----
 .../https_fastapi.metacpan.org/v1_release__search | 246 ++++++++++++++++++++
 swh/lister/cpan/tests/test_lister.py | 165 ++++++++++++--
 11 files changed, 1037 insertions(+), 160 deletions(-)
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page2
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page3
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page4
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_distribution__search,fields=name,size=1000,scroll=1m
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_release__search
Changes applied before test
commit 729d9b64da81df1ef2d81034b96a16b16a8d9544
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Mon Oct 10 15:55:54 2022 +0200

    cpan: Fix module version extraction for some edge cases

    CPAN API can return versions that are not of str type: either int or float. When version equals 0, it means that version failed to be parsed by CPAN so we try to extract it from release name in that case. Otherwise we ensure to convert the version to str type.

    Related to T2833

commit 5121157ce326d32411e32f9f984f9a1f6e8710ae
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Tue Sep 27 16:34:38 2022 +0200

    cpan: Improve listing process by querying the metacpan release endpoint

    Instead of querying the metacpan distribution endpoint to list origins, prefer to use the release endpoint instead enabling to list all artifacts associated to CPAN packages by scrolling results.

    Compared to previous implementation, it enables to compute a last_update date for all CPAN packages but also to obtain artifact sha256 checksums that will be used by the CPAN loader to check downloads integrity.

    As the multiple versions of a module are spread across multiple pages from the CPAN API, origins are sent to the scheduler once all pages processed, it is also faster to proceed that way.

    Related to T2833
Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/780/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/780/console
Build is green
Patch application report for D8648 (id=31261)
Could not rebase; Attempt merge onto 108816f232...
Updating 108816f..a64077d
Fast-forward
 swh/lister/cpan/__init__.py | 8 +-
 swh/lister/cpan/lister.py | 159 ++++++++++---
 ...TU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw== | 50 -----
 ...NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1 | 16 --
 .../v1__search_scroll_page1 | 247 +++++++++++++++++++++
 .../v1__search_scroll_page2 | 39 ++++
 .../v1__search_scroll_page3 | 85 +++++++
 .../v1__search_scroll_page4 | 131 +++++++++++
 ...ibution__search,fields=name,size=1000,scroll=1m | 52 -----
 .../https_fastapi.metacpan.org/v1_release__search | 246 ++++++++++++++++++++
 swh/lister/cpan/tests/test_lister.py | 165 ++++++++++++--
 11 files changed, 1038 insertions(+), 160 deletions(-)
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page2
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page3
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page4
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_distribution__search,fields=name,size=1000,scroll=1m
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_release__search
Changes applied before test
commit a64077d2251605f4aced9c2a35d649d8b7a56ef7
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Mon Oct 10 15:55:54 2022 +0200

    cpan: Fix module version extraction for some edge cases

    CPAN API can return versions that are not of str type: either int or float. When version equals 0, it means that version failed to be parsed by CPAN so we try to extract it from release name in that case. Otherwise we ensure to convert the version to str type.

    Related to T2833

commit e09a31c4c0072ff93453215aa772a7cfcabec5f1
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Tue Sep 27 16:34:38 2022 +0200

    cpan: Improve listing process by querying the metacpan release endpoint

    Instead of querying the metacpan distribution endpoint to list origins, prefer to use the release endpoint instead enabling to list all artifacts associated to CPAN packages by scrolling results.

    Compared to previous implementation, it enables to compute a last_update date for all CPAN packages but also to obtain artifact sha256 checksums that will be used by the CPAN loader to check downloads integrity.

    As the multiple versions of a module are spread across multiple pages from the CPAN API, origins are sent to the scheduler once all pages processed, it is also faster to proceed that way.

    Related to T2833
See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/782/ for more details.
Build is green
Patch application report for D8648 (id=31263)
Could not rebase; Attempt merge onto 108816f232...
Updating 108816f..05cd1de
Fast-forward
 swh/lister/cpan/__init__.py | 8 +-
 swh/lister/cpan/lister.py | 158 ++++++++++---
 ...TU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw== | 50 -----
 ...NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1 | 16 --
 .../v1__search_scroll_page1 | 247 +++++++++++++++++++++
 .../v1__search_scroll_page2 | 39 ++++
 .../v1__search_scroll_page3 | 85 +++++++
 .../v1__search_scroll_page4 | 131 +++++++++++
 ...ibution__search,fields=name,size=1000,scroll=1m | 52 -----
 .../https_fastapi.metacpan.org/v1_release__search | 246 ++++++++++++++++++++
 swh/lister/cpan/tests/test_lister.py | 165 ++++++++++++--
 11 files changed, 1037 insertions(+), 160 deletions(-)
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll,scroll=1m,scroll_id=cXVlcnlUaGVuRmV0Y2g7Mzs5NTU1MTQ1NTk6eXptdmszQUNUam1XbVJjRjRkRk9Udzs5NTQ5NjQ5NjI6ZHZIZWxCb3BUZi1Cb3NwRDB5NmRQUTs5NTU1MTQ1NjA6eXptdmszQUNUam1XbVJjRjRkRk9UdzswOw==_visit1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page1
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page2
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page3
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1__search_scroll_page4
 delete mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_distribution__search,fields=name,size=1000,scroll=1m
 create mode 100644 swh/lister/cpan/tests/data/https_fastapi.metacpan.org/v1_release__search
Changes applied before test
commit 05cd1de1cde7ed26ca46d970e4635ba142af9031
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Mon Oct 10 15:55:54 2022 +0200

    cpan: Fix module version extraction for some edge cases

    CPAN API can return versions that are not of str type: either int or float. When version equals 0, it means that version failed to be parsed by CPAN so we try to extract it from release name in that case. Otherwise we ensure to convert the version to str type.

    Related to T2833

commit f57b8f3a2c49080ae9bc11217b8d6ef4ed8c564e
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Tue Sep 27 16:34:38 2022 +0200

    cpan: Improve listing process by querying the metacpan release endpoint

    Instead of querying the metacpan distribution endpoint to list origins, prefer to use the release endpoint instead enabling to list all artifacts associated to CPAN packages by scrolling results.

    Compared to previous implementation, it enables to compute a last_update date for all CPAN packages but also to obtain artifact sha256 checksums that will be used by the CPAN loader to check downloads integrity.

    As the multiple versions of a module are spread across multiple pages from the CPAN API, origins are sent to the scheduler once all pages processed, it is also faster to proceed that way.

    Related to T2833
See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/784/ for more details.
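The version handling described in the "Fix module version extraction" commit message can be illustrated with a small sketch. It is not the code that landed in swh/lister/cpan/lister.py: the function name `normalize_version` and the regular expression used to pull a version out of a release name are assumptions made for the example, and the real lister may apply a different rule.

```python
import re
from typing import Union

# Hypothetical pattern: take the trailing "-<version>" part of a release
# name such as "Foo-Bar-1.23". The actual lister may use another rule.
_VERSION_FROM_NAME = re.compile(r"-v?(\d[\w.]*)$")


def normalize_version(version: Union[str, int, float], release_name: str) -> str:
    """Return the release version as a str, whatever type the API gave."""
    # Per the commit message, a version equal to 0 means that CPAN failed
    # to parse it, so fall back to the release name in that case.
    if version == 0:
        match = _VERSION_FROM_NAME.search(release_name)
        if match:
            return match.group(1)
    # Otherwise (or if the name yields nothing), convert to str.
    return str(version)
```

For instance, `normalize_version(0, "Foo-Bar-1.23")` falls back to the release name, while `normalize_version(1.5, "Foo-Bar-1.5")` simply converts the float to a string.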