diff --git a/PKG-INFO b/PKG-INFO
index 208999e..a92a4b5 100644
--- a/PKG-INFO
+++ b/PKG-INFO
@@ -1,56 +1,56 @@
 Metadata-Version: 2.1
 Name: swh.loader.core
-Version: 2.6.0
+Version: 2.6.1
 Summary: Software Heritage Base Loader
 Home-page: https://forge.softwareheritage.org/diffusion/DLDBASE
 Author: Software Heritage developers
 Author-email: swh-devel@inria.fr
 License: UNKNOWN
 Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest
 Project-URL: Funding, https://www.softwareheritage.org/donate
 Project-URL: Source, https://forge.softwareheritage.org/source/swh-loader-core
 Project-URL: Documentation, https://docs.softwareheritage.org/devel/swh-loader-core/
 Platform: UNKNOWN
 Classifier: Programming Language :: Python :: 3
 Classifier: Intended Audience :: Developers
 Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
 Classifier: Operating System :: OS Independent
 Classifier: Development Status :: 5 - Production/Stable
 Requires-Python: >=3.7
 Description-Content-Type: text/markdown
 Provides-Extra: testing
 License-File: LICENSE
 License-File: AUTHORS
 
 Software Heritage - Loader foundations
 ======================================
 
 The Software Heritage Loader Core is a low-level loading utilities and helpers used by :term:`loaders `.
 
 The main entry points are classes:
 - :class:`swh.loader.core.loader.BaseLoader` for loaders (e.g. svn)
 - :class:`swh.loader.core.loader.DVCSLoader` for DVCS loaders (e.g. hg, git, ...)
 - :class:`swh.loader.package.loader.PackageLoader` for Package loaders (e.g. PyPI, Npm, ...)
 
 Package loaders
 ---------------
 
 This package also implements many package loaders directly, out of convenience, as they usually are quite similar and each fits in a single file.
 They all roughly follow these steps, explained in the :py:meth:`swh.loader.package.loader.PackageLoader.load` documentation.
 See the :ref:`package-loader-tutorial` for details.
 VCS loaders
 -----------
 
 Unlike package loaders, VCS loaders remain in separate packages, as they often need more advanced conversions and very VCS-specific operations.
 This usually involves getting the branches of a repository and recursively loading revisions in the history (and directory trees in these revisions), until a known revision is found
diff --git a/debian/changelog b/debian/changelog
index 03c087d..79c5b01 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,1687 +1,1693 @@
-swh-loader-core (2.6.0-1~swh1~bpo10+1) buster-swh; urgency=medium
+swh-loader-core (2.6.1-1~swh1) unstable-swh; urgency=medium
 
-  * Rebuild for buster-swh
+  * New upstream release 2.6.1 - (tagged by Antoine R. Dumont
+    (@ardumont) on 2022-04-08 11:03:06
+    +0200)
+  * Upstream changes: - v2.6.1 - Rename metadata key in data
+    received from the deposit server - origin/master npm: Add all
+    fields we use to the ExtID manifest - npm: Include package
+    version id in ExtID manifest
 
- -- Software Heritage autobuilder (on jenkins-debian1) Wed, 02 Mar 2022 13:01:13 +0000
+ -- Software Heritage autobuilder (on jenkins-debian1) Fri, 08 Apr 2022 09:13:17 +0000
 
 swh-loader-core (2.6.0-1~swh1) unstable-swh; urgency=medium
 
   * New upstream release 2.6.0 - (tagged by Valentin Lorentz on 2022-03-02 13:54:45 +0100)
   * Upstream changes: - v2.6.0 - * Update for the new output format of the Deposit's API.
 
  -- Software Heritage autobuilder (on jenkins-debian1) Wed, 02 Mar 2022 12:58:43 +0000
 
 swh-loader-core (2.5.4-1~swh2) unstable-swh; urgency=medium
 
   * Bump new release with opam tests deactivated
 
  -- Antoine R. Dumont (@ardumont) Fri, 25 Feb 2022 12:40:40 +0100
 
 swh-loader-core (2.5.4-1~swh1) unstable-swh; urgency=medium
 
   * New upstream release 2.5.4 - (tagged by Antoine R.
Dumont (@ardumont) on 2022-02-25 10:23:51 +0100) * Upstream changes: - v2.5.4 - loader/opam/tests: Do not run actual opam init command call -- Software Heritage autobuilder (on jenkins-debian1) Fri, 25 Feb 2022 09:28:10 +0000 swh-loader-core (2.5.3-1~swh1) unstable-swh; urgency=medium * New upstream release 2.5.3 - (tagged by Antoine R. Dumont (@ardumont) on 2022-02-24 16:02:53 +0100) * Upstream changes: - v2.5.3 - opam: Allow build to run the opam init completely -- Software Heritage autobuilder (on jenkins-debian1) Thu, 24 Feb 2022 15:07:20 +0000 swh-loader-core (2.5.2-1~swh1) unstable-swh; urgency=medium * New upstream release 2.5.2 - (tagged by Valentin Lorentz on 2022-02-24 09:52:26 +0100) * Upstream changes: - v2.5.2 - * deposit: Remove unused raw_info -- Software Heritage autobuilder (on jenkins-debian1) Thu, 24 Feb 2022 08:57:52 +0000 swh-loader-core (2.5.1-1~swh1) unstable-swh; urgency=medium * New upstream release 2.5.1 - (tagged by Antoine R. Dumont (@ardumont) on 2022-02-16 15:27:02 +0100) * Upstream changes: - v2.5.1 - Add URL and directory to CLI loader status echo - Fix load_maven scheduling task name - docs: Fix typo detected with codespell - pre-commit: Bump hooks and add new one to check commit message spelling -- Software Heritage autobuilder (on jenkins-debian1) Wed, 16 Feb 2022 14:30:47 +0000 swh-loader-core (2.5.0-1~swh1) unstable-swh; urgency=medium * New upstream release 2.5.0 - (tagged by Antoine R. 
Dumont (@ardumont) on 2022-02-08 10:46:14 +0100) * Upstream changes: - v2.5.0 - Move visit date helper from hg loader to core -- Software Heritage autobuilder (on jenkins-debian1) Tue, 08 Feb 2022 09:49:53 +0000 swh-loader-core (2.4.1-1~swh1) unstable-swh; urgency=medium * New upstream release 2.4.1 - (tagged by Nicolas Dandrimont on 2022-02-03 14:12:05 +0100) * Upstream changes: - Release swh.loader.core 2.4.1 - fix Person mangling -- Software Heritage autobuilder (on jenkins-debian1) Thu, 03 Feb 2022 13:17:35 +0000 swh-loader-core (2.3.0-1~swh1) unstable-swh; urgency=medium * New upstream release 2.3.0 - (tagged by Nicolas Dandrimont on 2022-01-24 11:18:43 +0100) * Upstream changes: - Release swh.loader.core - Stop using the deprecated 'TimestampWithTimezone.offset' attribute - Include clone_with_timeout utility from swh.loader.mercurial -- Software Heritage autobuilder (on jenkins-debian1) Mon, 24 Jan 2022 10:22:35 +0000 swh-loader-core (2.2.0-1~swh1) unstable-swh; urgency=medium * New upstream release 2.2.0 - (tagged by Antoine R. Dumont (@ardumont) on 2022-01-18 14:33:08 +0100) * Upstream changes: - v2.2.0 - tests: Replace 'offset' and 'negative_utc' with 'offset_bytes' - deposit: Remove 'negative_utc' from test data - tests: Use TimestampWithTimezone.from_datetime() instead of the constructor - Add releases notes (from user-provided Atom document) to release messages. 
- deposit: Strip 'offset_bytes' from date dicts to support swh-model 4.0.0 - Pin mypy and drop type annotations which makes mypy unhappy -- Software Heritage autobuilder (on jenkins-debian1) Tue, 18 Jan 2022 15:52:53 +0000 swh-loader-core (2.1.1-1~swh1) unstable-swh; urgency=medium * New upstream release 2.1.1 - (tagged by Valentin Lorentz on 2021-12-09 17:14:12 +0100) * Upstream changes: - v2.1.1 - * nixguix: Fix crash when filtering extids on archives that were already loaded, but only from different URLs -- Software Heritage autobuilder (on jenkins-debian1) Thu, 09 Dec 2021 16:17:54 +0000 swh-loader-core (2.1.0-1~swh1) unstable-swh; urgency=medium * New upstream release 2.1.0 - (tagged by Valentin Lorentz on 2021-12-09 16:34:51 +0100) * Upstream changes: - v2.1.0 - * maven: various refactorings - * nixguix: Filter out releases with URLs different from the expected one -- Software Heritage autobuilder (on jenkins-debian1) Thu, 09 Dec 2021 15:38:14 +0000 swh-loader-core (2.0.0-1~swh1) unstable-swh; urgency=medium * New upstream release 2.0.0 - (tagged by Antoine R. Dumont (@ardumont) on 2021-12-07 15:53:23 +0100) * Upstream changes: - v2.0.0 - package-loaders: Add support for extid versions, and bump it for Debian - debian: Remove the extrinsic version from release names - debian: Fix confusion between the two versions -- Software Heritage autobuilder (on jenkins-debian1) Tue, 07 Dec 2021 14:57:19 +0000 swh-loader-core (1.3.0-1~swh1) unstable-swh; urgency=medium * New upstream release 1.3.0 - (tagged by Antoine Lambert on 2021-12-07 10:54:49 +0100) * Upstream changes: - version 1.3.0 -- Software Heritage autobuilder (on jenkins-debian1) Tue, 07 Dec 2021 09:58:53 +0000 swh-loader-core (1.2.1-1~swh1) unstable-swh; urgency=medium * New upstream release 1.2.1 - (tagged by Antoine R. 
Dumont (@ardumont) on 2021-12-03 16:15:32 +0100) * Upstream changes: - v1.2.1 - package.loader: Deduplicate extid target -- Software Heritage autobuilder (on jenkins-debian1) Fri, 03 Dec 2021 15:19:13 +0000 swh-loader-core (1.2.0-1~swh1) unstable-swh; urgency=medium * New upstream release 1.2.0 - (tagged by Antoine R. Dumont (@ardumont) on 2021-12-03 12:16:04 +0100) * Upstream changes: - v1.2.0 - debian: Rename loading task function to fix scheduling - debian: Handle extra sha1 sum in source package metadata - debian: Remove unused date parameter of DebianLoader - package.loader: Deduplicate target SWHIDs - package-loader-tutorial: Update to mention releases instead of revisions - package-loader-tutorial: Add a checklist - package-loader-tutorial: Highlight the recommendation to submit the loader early. -- Software Heritage autobuilder (on jenkins-debian1) Fri, 03 Dec 2021 11:19:52 +0000 swh-loader-core (1.1.0-1~swh1) unstable-swh; urgency=medium * New upstream release 1.1.0 - (tagged by Valentin Lorentz on 2021-11-22 11:58:11 +0100) * Upstream changes: - v1.1.0 - * Package loader: Uniformize author and message -- Software Heritage autobuilder (on jenkins-debian1) Mon, 22 Nov 2021 11:01:45 +0000 swh-loader-core (1.0.1-1~swh1) unstable-swh; urgency=medium * New upstream release 1.0.1 - (tagged by Valentin Lorentz on 2021-11-10 14:47:52 +0100) * Upstream changes: - v1.0.1 - * utils: Add types and let log instruction do the formatting - * Fix tests when run by gbp on Sid. -- Software Heritage autobuilder (on jenkins-debian1) Wed, 10 Nov 2021 13:53:43 +0000 swh-loader-core (1.0.0-1~swh1) unstable-swh; urgency=medium * New upstream release 1.0.0 - (tagged by Valentin Lorentz on 2021-11-10 14:25:24 +0100) * Upstream changes: - v1.0.0 - Main change: thismakes package loaders write releases instead of revisions - Other more-or-less related changes: - * Add missing documentation for `get_metadata_authority`. 
- * opam: Write package definitions to the extrinsic metadata storage - * deposit: Remove 'parent' deposit - * cleanup tests and unused code - * Document how each package loader populates fields. - * Refactor package loaders to make the version part of BasePackageInfo -- Software Heritage autobuilder (on jenkins-debian1) Wed, 10 Nov 2021 13:38:43 +0000 swh-loader-core (0.25.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.25.0 - (tagged by Antoine R. Dumont (@ardumont) on 2021-09-29 09:19:10 +0200) * Upstream changes: - v0.25.0 - Allow opam loader to actually use multi-instance opam root - opam: Define a initialize_opam_root parameter for opam loader -- Software Heritage autobuilder (on jenkins-debian1) Wed, 29 Sep 2021 07:26:12 +0000 swh-loader-core (0.23.5-1~swh1) unstable-swh; urgency=medium * New upstream release 0.23.5 - (tagged by Antoine R. Dumont (@ardumont) on 2021-09-24 17:31:22 +0200) * Upstream changes: - v0.23.5 - opam: Initialize opam root directory outside the constructor -- Software Heritage autobuilder (on jenkins-debian1) Fri, 24 Sep 2021 15:34:52 +0000 swh-loader-core (0.23.4-1~swh1) unstable-swh; urgency=medium * New upstream release 0.23.4 - (tagged by Antoine R. 
Dumont (@ardumont) on 2021-09-20 11:53:11 +0200) * Upstream changes: - v0.23.4 - Ensure that filename fallback out of an url is properly sanitized -- Software Heritage autobuilder (on jenkins-debian1) Mon, 20 Sep 2021 09:56:31 +0000 swh-loader-core (0.23.3-1~swh1) unstable-swh; urgency=medium * New upstream release 0.23.3 - (tagged by Antoine Lambert on 2021-09-16 10:47:40 +0200) * Upstream changes: - version 0.23.3 -- Software Heritage autobuilder (on jenkins-debian1) Thu, 16 Sep 2021 08:51:47 +0000 swh-loader-core (0.23.2-1~swh1) unstable-swh; urgency=medium * New upstream release 0.23.2 - (tagged by Valentin Lorentz on 2021-08-12 12:22:44 +0200) * Upstream changes: - v0.23.2 - * deposit: Update status_detail on loader failure -- Software Heritage autobuilder (on jenkins-debian1) Thu, 12 Aug 2021 10:25:44 +0000 swh-loader-core (0.23.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.23.1 - (tagged by Antoine R. Dumont (@ardumont) on 2021-08-05 16:11:02 +0200) * Upstream changes: - v0.23.1 - Fix pypi upload issue. 
-- Software Heritage autobuilder (on jenkins-debian1) Thu, 05 Aug 2021 14:20:37 +0000 swh-loader-core (0.22.3-1~swh1) unstable-swh; urgency=medium * New upstream release 0.22.3 - (tagged by Valentin Lorentz on 2021-06-25 14:50:40 +0200) * Upstream changes: - v0.22.3 - * Use the postgresql class to instantiate storage in tests - * package-loader-tutorial: Add anchor so it can be referenced from swh-docs -- Software Heritage autobuilder (on jenkins-debian1) Fri, 25 Jun 2021 12:57:33 +0000 swh-loader-core (0.22.2-1~swh1) unstable-swh; urgency=medium * New upstream release 0.22.2 - (tagged by Antoine Lambert on 2021-06-10 16:11:30 +0200) * Upstream changes: - version 0.22.2 -- Software Heritage autobuilder (on jenkins-debian1) Thu, 10 Jun 2021 14:19:06 +0000 swh-loader-core (0.22.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.22.1 - (tagged by Antoine Lambert on 2021-05-27 14:02:35 +0200) * Upstream changes: - version 0.22.1 -- Software Heritage autobuilder (on jenkins-debian1) Thu, 27 May 2021 12:20:04 +0000 swh-loader-core (0.22.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.22.0 - (tagged by Valentin Lorentz on 2021-04-15 15:13:56 +0200) * Upstream changes: - v0.22.0 - Documentation: - * Document the big picture view of VCS and package loaders - * Add a package loader tutorial. - * Write an overview of how to write VCS loaders. 
- * Fix various Sphinx warnings - Package loaders: - * Add sha512 as a valid field in dsc metadata - * package loaders: Stop reading/writing Revision.metadata -- Software Heritage autobuilder (on jenkins-debian1) Thu, 15 Apr 2021 13:18:13 +0000 swh-loader-core (0.21.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.21.0 - (tagged by Valentin Lorentz on 2021-03-30 17:19:13 +0200) * Upstream changes: - v0.21.0 - * tests: recompute ids when evolving RawExtrinsicMetadata objects, to support swh-model 2.0.0 - * deposit.loader: Make archive.tar the default_filename - * debian: Make resolve_revision_from use the sha256 of the .dsc - * package.loader.*: unify package "cache"/deduplication using ExtIDs - * package.loader: Lookup packages from the ExtID storage - * package.loader: Write to the ExtID storage -- Software Heritage autobuilder (on jenkins-debian1) Tue, 30 Mar 2021 15:26:35 +0000 swh-loader-core (0.20.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.20.0 - (tagged by Valentin Lorentz on 2021-03-02 10:52:18 +0100) * Upstream changes: - v0.20.0 - * RawExtrinsicMetadata: update to use the API in swh-model 1.0.0 -- Software Heritage autobuilder (on jenkins-debian1) Tue, 02 Mar 2021 09:57:21 +0000 swh-loader-core (0.19.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.19.0 - (tagged by Antoine R. Dumont (@ardumont) on 2021-02-25 15:52:12 +0100) * Upstream changes: - v0.19.0 - deposit: Make deposit loader deal with tarball as well - deposit: Update deposit status when the load status is 'partial' - Make finalize_visit a method instead of nested function. -- Software Heritage autobuilder (on jenkins-debian1) Thu, 25 Feb 2021 14:55:54 +0000 swh-loader-core (0.18.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.18.1 - (tagged by Antoine R. 
Dumont (@ardumont) on 2021-02-19 18:02:58 +0100) * Upstream changes: - v0.18.1 - nixguix: Fix missing max_content_size constructor parameter -- Software Heritage autobuilder (on jenkins-debian1) Fri, 19 Feb 2021 17:06:33 +0000 swh-loader-core (0.18.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.18.0 - (tagged by Antoine R. Dumont (@ardumont) on 2021-02-17 13:13:24 +0100) * Upstream changes: - v0.18.0 - core.loader: Merge Loader into BaseLoader - Unify loader instantiation - nixguix: Ensure interaction with the origin url for edge case tests -- Software Heritage autobuilder (on jenkins-debian1) Wed, 17 Feb 2021 12:16:47 +0000 swh-loader-core (0.17.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.17.0 - (tagged by Antoine R. Dumont (@ardumont) on 2021-02-11 11:20:55 +0100) * Upstream changes: - v0.17.0 - package: Mark visit as not_found when relevant - package: Mark visit status as failed when relevant - core: Allow vcs loaders to deal with not_found status - core: Mark visit status as failed when relevant - loader: Make loader write the origin_visit_status' type -- Software Heritage autobuilder (on jenkins-debian1) Thu, 11 Feb 2021 10:23:42 +0000 swh-loader-core (0.16.0-1~swh2) unstable-swh; urgency=medium * Bump dependencies -- Antoine R. Dumont (@ardumont) Wed, 03 Feb 2021 14:25:26 +0100 swh-loader-core (0.16.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.16.0 - (tagged by Antoine R. 
Dumont (@ardumont) on 2021-02-03 14:14:01 +0100) * Upstream changes: - v0.16.0 - Adapt origin_get_latest_visit_status according to latest api change - Add a cli section in the doc - tox.ini: Add swh.core[testing] requirement - Small docstring improvements in the deposit loader code -- Software Heritage autobuilder (on jenkins-debian1) Wed, 03 Feb 2021 13:17:30 +0000 swh-loader-core (0.15.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.15.0 - (tagged by Nicolas Dandrimont on 2020-11-03 17:21:21 +0100) * Upstream changes: - Release swh-loader-core v0.15.0 - Attach raw extrinsic metadata to directories, not revisions - Handle a bunch of deprecation warnings: - explicit args in swh.objstorage get_objstorage - id -> target for raw extrinsic metadata objects - positional arguments for storage.raw_extrinsic_metadata_get -- Software Heritage autobuilder (on jenkins-debian1) Tue, 03 Nov 2020 16:26:20 +0000 swh-loader-core (0.14.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.14.0 - (tagged by Valentin Lorentz on 2020-10-16 18:23:28 +0200) * Upstream changes: - v0.14.0 - * npm: write metadata on revisions instead of snapshots. - * pypi: write metadata on revisions instead of snapshots. - * deposit.loader: Avoid unnecessary metadata json transformation -- Software Heritage autobuilder (on jenkins-debian1) Fri, 16 Oct 2020 16:26:14 +0000 swh-loader-core (0.13.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.13.1 - (tagged by Antoine R. Dumont (@ardumont) on 2020-10-02 16:54:05 +0200) * Upstream changes: - v0.13.1 - core.loader: Allow config parameter passing through constructor - tox.ini: pin black to the pre-commit version (19.10b0) to avoid flip-flops -- Software Heritage autobuilder (on jenkins-debian1) Fri, 02 Oct 2020 14:55:59 +0000 swh-loader-core (0.13.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.13.0 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-10-02 13:18:55 +0200) * Upstream changes: - v0.13.0 - package.loader: Migrate away from SWHConfig mixin - core.loader: Migrate away from SWHConfig mixin - Expose deposit configuration only within the deposit tests -- Software Heritage autobuilder (on jenkins-debian1) Fri, 02 Oct 2020 11:21:55 +0000 swh-loader-core (0.12.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.12.0 - (tagged by Antoine R. Dumont (@ardumont) on 2020-10-01 16:03:45 +0200) * Upstream changes: - v0.12.0 - deposit: Adapt loader to send extrinsic raw metadata to the metadata storage - core.loader: Log information about origin currently being ingested - Adapt cli declaration entrypoint to swh.core 0.3 -- Software Heritage autobuilder (on jenkins-debian1) Thu, 01 Oct 2020 14:04:59 +0000 swh-loader-core (0.11.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.11.0 - (tagged by Antoine R. Dumont (@ardumont) on 2020-09-18 10:19:56 +0200) * Upstream changes: - v0.11.0 - loader: Stop materializing full lists of objects to be stored - tests.get_stats: Don't return a 'person' count - python: Reorder imports with isort - pre-commit: Add isort hook and configuration - pre-commit: Update flake8 hook configuration - cli: speedup the `swh` cli command startup time -- Software Heritage autobuilder (on jenkins-debian1) Fri, 18 Sep 2020 09:12:18 +0000 swh-loader-core (0.10.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.10.0 - (tagged by Antoine R. Dumont (@ardumont) on 2020-09-04 13:19:29 +0200) * Upstream changes: - v0.10.0 - loader: Adapt to latest storage revision_get change - origin/master Rename metadata format 'original-artifact-json' to 'original-artifacts-json'. - Tell pytest not to recurse in dotdirs. - package loader: Add the 'url' to the 'original_artifact' extrinsic metadata. - Write 'original_artifact' metadata to the extrinsic metadata storage. - Move parts of _load_revision to a new _load_directory method. 
- tests: Don't use naive datetimes. - package.loader: Split the warning message into multiple chunks - Replace calls to snapshot_get with snapshot_get_all_branches. -- Software Heritage autobuilder (on jenkins-debian1) Fri, 04 Sep 2020 11:28:09 +0000 swh-loader-core (0.9.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.9.1 - (tagged by Antoine R. Dumont (@ardumont) on 2020-08-08 14:47:52 +0200) * Upstream changes: - v0.9.1 - nixguix: Make the unsupported artifact extensions configurable - package.loader: Log a failure summary report at the end of the task -- Software Heritage autobuilder (on jenkins-debian1) Sat, 08 Aug 2020 12:51:33 +0000 swh-loader-core (0.9.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.9.0 - (tagged by Antoine R. Dumont (@ardumont) on 2020-08-07 22:57:14 +0200) * Upstream changes: - v0.9.0 - nixguix: Filter out unsupported artifact extensions - swh.loader.tests: Use snapshot_get_all_branches in check_snapshot - test_npm: Adapt content_get_metadata call to content_get - npm: Fix assertion to use the correct storage api -- Software Heritage autobuilder (on jenkins-debian1) Fri, 07 Aug 2020 21:00:40 +0000 swh-loader-core (0.8.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.8.1 - (tagged by Antoine R. Dumont (@ardumont) on 2020-08-06 16:48:38 +0200) * Upstream changes: - v0.8.1 - Adapt code according to storage signature -- Software Heritage autobuilder (on jenkins-debian1) Thu, 06 Aug 2020 14:50:39 +0000 swh-loader-core (0.8.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.8.0 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-08-05 10:16:36 +0200) * Upstream changes: - v0.8.0 - archive: fix docstring - nixguix: Fix docstring - nixguix: Align error message formatting using f-string - nixguix: Fix format issue in error message - Convert the 'metadata' and 'info' cached-properties/lazy-attributes into methods - cran: fix call to logger.warning - pypi: Load the content of the API's response as extrinsic snapshot metadata - Add a default value for RawExtrinsicMetadataCore.discovery_date - npm: Load the content of the API's response as extrinsic snapshot metadata - Make retrieve_sources use generic api_info instead of duplicating its code - nixguix: Load the content of sources.json as extrinsic snapshot metadata - Update tests to accept PagedResult from storage.raw_extrinsic_metadata_get -- Software Heritage autobuilder (on jenkins-debian1) Wed, 05 Aug 2020 08:19:20 +0000 swh-loader-core (0.7.3-1~swh1) unstable-swh; urgency=medium * New upstream release 0.7.3 - (tagged by Valentin Lorentz on 2020-07-30 19:16:21 +0200) * Upstream changes: - v0.7.3 - core.loader: Fix Iterable/List typing issues - package.loader: Fix type warning -- Software Heritage autobuilder (on jenkins-debian1) Thu, 30 Jul 2020 17:23:57 +0000 swh-loader-core (0.7.2-1~swh1) unstable-swh; urgency=medium * New upstream release 0.7.2 - (tagged by Valentin Lorentz on 2020-07-29 11:41:39 +0200) * Upstream changes: - v0.7.2 - * Fix typo in message logged on extrinsic metadata loading errors. - * Don't pass non-sequence iterables to the storage API. -- Software Heritage autobuilder (on jenkins-debian1) Wed, 29 Jul 2020 09:45:52 +0000 swh-loader-core (0.7.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.7.1 - (tagged by Antoine R. Dumont (@ardumont) on 2020-07-28 12:14:02 +0200) * Upstream changes: - v0.7.1 - Apply rename of object_metadata to raw_extrinsic_metadata. 
-- Software Heritage autobuilder (on jenkins-debian1) Tue, 28 Jul 2020 10:16:56 +0000 swh-loader-core (0.6.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.6.1 - (tagged by Antoine R. Dumont (@ardumont) on 2020-07-23 11:12:29 +0200) * Upstream changes: - v0.6.1 - npm.loader: Fix null author parsing corner case - npm.loader: Fix author parsing corner case - npm.loader: Extract _author_str function + add types, tests - core.loader: docs: Update origin_add reference -- Software Heritage autobuilder (on jenkins-debian1) Thu, 23 Jul 2020 09:15:41 +0000 swh-loader-core (0.6.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.6.0 - (tagged by Valentin Lorentz on 2020-07-20 13:23:22 +0200) * Upstream changes: - v0.6.0 - * Use the new object_metadata_add endpoint instead of origin_metadata_add. - * Apply renaming of MetadataAuthorityType.DEPOSIT to MetadataAuthorityType.DEPOSIT_CLIENT. -- Software Heritage autobuilder (on jenkins-debian1) Mon, 20 Jul 2020 11:27:53 +0000 swh-loader-core (0.5.10-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.10 - (tagged by Antoine R. Dumont (@ardumont) on 2020-07-17 15:10:42 +0200) * Upstream changes: - v0.5.10 - test_init: Decrease assertion checks so debian package builds fine - test_nixguix: Simplify the nixguix specific check_snapshot function -- Software Heritage autobuilder (on jenkins-debian1) Fri, 17 Jul 2020 13:13:19 +0000 swh-loader-core (0.5.9-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.9 - (tagged by Antoine R. Dumont (@ardumont) on 2020-07-17 11:52:38 +0200) * Upstream changes: - v0.5.9 - test.check_snapshot: Drop accepting using dict for snapshot comparison - test: Check against snapshot model object -- Software Heritage autobuilder (on jenkins-debian1) Fri, 17 Jul 2020 09:55:12 +0000 swh-loader-core (0.5.8-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.8 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-07-16 17:18:17 +0200) * Upstream changes: - v0.5.8 - test_init: Use snapshot object -- Software Heritage autobuilder (on jenkins-debian1) Thu, 16 Jul 2020 15:20:49 +0000 swh-loader-core (0.5.7-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.7 - (tagged by Antoine R. Dumont (@ardumont) on 2020-07-16 16:10:57 +0200) * Upstream changes: - v0.5.7 - test_init: Fix tests using the latest swh-storage fixture -- Software Heritage autobuilder (on jenkins-debian1) Thu, 16 Jul 2020 14:14:59 +0000 swh-loader-core (0.5.5-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.5 - (tagged by Antoine R. Dumont (@ardumont) on 2020-07-15 12:34:09 +0200) * Upstream changes: - v0.5.5 - check_snapshot: Check existence down to contents - Expose a pytest_plugin module so other loaders can reuse for tests - pytest: Remove no longer needed pytest setup - Fix branches types in tests - Small code improvement in package/loader.py -- Software Heritage autobuilder (on jenkins-debian1) Wed, 15 Jul 2020 10:37:11 +0000 swh-loader-core (0.5.4-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.4 - (tagged by Antoine R. Dumont (@ardumont) on 2020-07-10 09:52:21 +0200) * Upstream changes: - v0.5.4 - Clean up the swh.scheduler / swh.storage pytest plugin imports -- Software Heritage autobuilder (on jenkins-debian1) Fri, 10 Jul 2020 07:54:56 +0000 swh-loader-core (0.5.3-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.3 - (tagged by Antoine R. Dumont (@ardumont) on 2020-07-09 09:46:21 +0200) * Upstream changes: - v0.5.3 - Update the revision metadata field as an immutable dict - tests: Use dedicated storage and scheduler fixtures - loaders.tests: Simplify and add coverage to check_snapshot -- Software Heritage autobuilder (on jenkins-debian1) Thu, 09 Jul 2020 07:48:33 +0000 swh-loader-core (0.5.2-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.2 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-07-07 12:29:17 +0200) * Upstream changes: - v0.5.2 - nixguix/loader: Check further the source entry only if it's valid - nixguix/loader: Allow version both as string or integer - Move remaining common test utility functions to top-level arborescence - Move common test utility function to the top-level arborescence - Define common test helper function - Reuse swh.model.from_disk.iter_directory function -- Software Heritage autobuilder (on jenkins-debian1) Tue, 07 Jul 2020 10:31:36 +0000 swh-loader-core (0.5.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.1 - (tagged by Antoine R. Dumont (@ardumont) on 2020-07-01 12:32:54 +0200) * Upstream changes: - v0.5.1 - Use origin_add instead of deprecated origin_add_one endpoint - Migrate to use object's "object_type" field when computing objects -- Software Heritage autobuilder (on jenkins-debian1) Wed, 01 Jul 2020 10:34:59 +0000 swh-loader-core (0.5.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.0 - (tagged by Antoine R. Dumont (@ardumont) on 2020-06-29 13:18:41 +0200) * Upstream changes: - v0.5.0 - loader*: Drop obsolete origin visit fields -- Software Heritage autobuilder (on jenkins-debian1) Mon, 29 Jun 2020 11:20:59 +0000 swh-loader-core (0.4.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.4.0 - (tagged by Antoine R. Dumont (@ardumont) on 2020-06-23 15:02:20 +0200) * Upstream changes: - v0.4.0 - loader: Retrieve latest snapshot with snapshot-get-latest function -- Software Heritage autobuilder (on jenkins-debian1) Tue, 23 Jun 2020 13:14:09 +0000 swh-loader-core (0.3.2-1~swh1) unstable-swh; urgency=medium * New upstream release 0.3.2 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-06-22 15:13:05 +0200) * Upstream changes: - v0.3.2 - Add helper function to ensure loader visit are as expected -- Software Heritage autobuilder (on jenkins-debian1) Mon, 22 Jun 2020 13:15:41 +0000 swh-loader-core (0.3.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.3.1 - (tagged by Antoine Lambert on 2020-06-12 16:43:18 +0200) * Upstream changes: - version 0.3.1 -- Software Heritage autobuilder (on jenkins-debian1) Fri, 12 Jun 2020 14:47:42 +0000 swh-loader-core (0.3.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.3.0 - (tagged by Antoine R. Dumont (@ardumont) on 2020-06-12 11:05:41 +0200) * Upstream changes: - v0.3.0 - Migrate to new storage.origin_visit_add endpoint - loader: Migrate to origin visit status - test_deposits: Fix origin_metadata_get which is a paginated endpoint - Fix a potential UnboundLocalError in clean_dangling_folders() -- Software Heritage autobuilder (on jenkins-debian1) Fri, 12 Jun 2020 09:08:17 +0000 swh-loader-core (0.2.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.2.0 - (tagged by David Douard on 2020-06-04 14:20:08 +0200) * Upstream changes: - v0.2.0 -- Software Heritage autobuilder (on jenkins-debian1) Thu, 04 Jun 2020 12:25:57 +0000 swh-loader-core (0.1.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.1.0 - (tagged by Nicolas Dandrimont on 2020-05-29 16:01:11 +0200) * Upstream changes: - Release swh.loader.core v0.1.0 - Make sure partial visits don't reference unloaded snapshots - Ensure proper behavior when loading into partial archives (e.g. staging) - Improve test coverage -- Software Heritage autobuilder (on jenkins-debian1) Fri, 29 May 2020 14:05:36 +0000 swh-loader-core (0.0.97-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.97 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-05-26 14:22:51 +0200) * Upstream changes: - v0.0.97 - nixguix: catch and log artifact resolution failures - nixguix: Override known_artifacts to filter out "evaluation" branch - nixguix.tests: Add missing __init__ file -- Software Heritage autobuilder (on jenkins-debian1) Tue, 26 May 2020 12:25:35 +0000 swh-loader-core (0.0.96-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.96 - (tagged by Valentin Lorentz on 2020-05-19 18:42:23 +0200) * Upstream changes: - v0.0.96 - * Pass bytes instead a dict to origin_metadata_add. -- Software Heritage autobuilder (on jenkins-debian1) Tue, 19 May 2020 16:45:03 +0000 swh-loader-core (0.0.95-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.95 - (tagged by Valentin Lorentz on 2020-05-19 14:44:01 +0200) * Upstream changes: - v0.0.95 - * Use the new swh-storage API for storing metadata. -- Software Heritage autobuilder (on jenkins-debian1) Tue, 19 May 2020 12:47:48 +0000 swh-loader-core (0.0.94-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.94 - (tagged by Antoine R. Dumont (@ardumont) on 2020-05-15 12:49:22 +0200) * Upstream changes: - v0.0.94 - deposit: Adapt loader to use the latest deposit update api - tests: Use proper date initialization - setup.py: add documentation link -- Software Heritage autobuilder (on jenkins-debian1) Fri, 15 May 2020 10:52:16 +0000 swh-loader-core (0.0.93-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.93 - (tagged by Antoine R. Dumont (@ardumont) on 2020-04-23 16:43:16 +0200) * Upstream changes: - v0.0.93 - deposit.loader: Build revision out of the deposit api read metadata -- Software Heritage autobuilder (on jenkins-debian1) Thu, 23 Apr 2020 14:46:48 +0000 swh-loader-core (0.0.92-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.92 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-04-23 11:49:30 +0200) * Upstream changes: - v0.0.92 - deposit.loader: Fix revision metadata redundancy in deposit metadata - loader.deposit: Clarify FIXME intent - test_nixguix: Remove the incorrect fixme - test_nixguix: Add a fixme note on test_loader_two_visits - package.nixguix: Ensure the revisions are structurally sound -- Software Heritage autobuilder (on jenkins-debian1) Thu, 23 Apr 2020 09:52:18 +0000 swh-loader-core (0.0.91-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.91 - (tagged by Antoine R. Dumont (@ardumont) on 2020-04-21 15:59:55 +0200) * Upstream changes: - v0.0.91 - deposit.loader: Fix committer date appropriately - tests_deposit: Define specific requests_mock_datadir fixture - nixguix: Move helper function below the class definition - setup: Update the minimum required runtime python3 version -- Software Heritage autobuilder (on jenkins-debian1) Tue, 21 Apr 2020 14:02:51 +0000 swh-loader-core (0.0.90-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.90 - (tagged by Antoine R. Dumont (@ardumont) on 2020-04-15 14:27:01 +0200) * Upstream changes: - v0.0.90 - Improve exception handling -- Software Heritage autobuilder (on jenkins-debian1) Wed, 15 Apr 2020 12:30:07 +0000 swh-loader-core (0.0.89-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.89 - (tagged by Antoine R. Dumont (@ardumont) on 2020-04-14 15:48:15 +0200) * Upstream changes: - v0.0.89 - package.utils: Define a timeout on download connections - package.loader: Clear proxy buffer state when failing to load revision - Fix a couple of storage args deprecation warnings - cli: Sort loaders list and fix some tests - Add a pyproject.toml file to target py37 for black - Enable black -- Software Heritage autobuilder (on jenkins-debian1) Tue, 14 Apr 2020 15:30:08 +0000 swh-loader-core (0.0.88-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.88 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-04-03 15:52:07 +0200) * Upstream changes: - v0.0.88 - v0.0.88 nixguix: validate and clean sources.json structure -- Software Heritage autobuilder (on jenkins-debian1) Fri, 03 Apr 2020 13:54:24 +0000 swh-loader-core (0.0.87-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.87 - (tagged by Antoine R. Dumont (@ardumont) on 2020-04-02 14:37:37 +0200) * Upstream changes: - v0.0.87 - nixguix: rename the `url` source attribute to `urls` - nixguix: rename the test file - nixguix: add the integrity attribute in release metadata -- Software Heritage autobuilder (on jenkins-debian1) Thu, 02 Apr 2020 12:39:58 +0000 swh-loader-core (0.0.86-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.86 - (tagged by Antoine R. Dumont (@ardumont) on 2020-03-26 16:15:24 +0100) * Upstream changes: - v0.0.86 - core.loader: Remove origin_visit_update call from DVCSLoader class -- Software Heritage autobuilder (on jenkins-debian1) Thu, 26 Mar 2020 15:19:29 +0000 swh-loader-core (0.0.85-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.85 - (tagged by Antoine R. Dumont (@ardumont) on 2020-03-26 15:36:58 +0100) * Upstream changes: - v0.0.85 - core.loader: Allow core loader to update origin_visit in one call - Rename the functional loader to nixguix loader -- Software Heritage autobuilder (on jenkins-debian1) Thu, 26 Mar 2020 14:43:17 +0000 swh-loader-core (0.0.84-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.84 - (tagged by Antoine R. Dumont (@ardumont) on 2020-03-24 11:29:49 +0100) * Upstream changes: - v0.0.84 - test: Use storage endpoint to check latest origin visit status - package.loader: Fix status visit to 'partial' - package.loader: add a test to reproduce EOFError error -- Software Heritage autobuilder (on jenkins-debian1) Tue, 24 Mar 2020 10:32:55 +0000 swh-loader-core (0.0.83-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.83 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-03-23 15:16:14 +0100) * Upstream changes: - v0.0.83 - Make the swh.loader.package exception handling more granular - package.loader: Reference a snapshot on partial visit - package.loader: Extract a _load_snapshot method - functional: create a branch named evaluation pointing to the evaluation commit - package.loader: add extra_branches method -- Software Heritage autobuilder (on jenkins-debian1) Mon, 23 Mar 2020 14:19:43 +0000 swh-loader-core (0.0.82-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.82 - (tagged by Antoine R. Dumont (@ardumont) on 2020-03-18 11:55:48 +0100) * Upstream changes: - v0.0.82 - functional.loader: Add loader - package.loader: ignore non tarball source -- Software Heritage autobuilder (on jenkins-debian1) Wed, 18 Mar 2020 10:59:38 +0000 swh-loader-core (0.0.81-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.81 - (tagged by Antoine R. Dumont (@ardumont) on 2020-03-16 13:14:33 +0100) * Upstream changes: - v0.0.81 - Migrate to latest storage.origin_visit_add api change - Move Person parsing to swh- model. -- Software Heritage autobuilder (on jenkins-debian1) Mon, 16 Mar 2020 12:17:43 +0000 swh-loader-core (0.0.80-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.80 - (tagged by Valentin Lorentz on 2020-02-28 17:05:14 +0100) * Upstream changes: - v0.0.80 - * use swh-model objects instead of dicts. -- Software Heritage autobuilder (on jenkins-debian1) Fri, 28 Feb 2020 16:10:06 +0000 swh-loader-core (0.0.79-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.79 - (tagged by Antoine R. Dumont (@ardumont) on 2020-02-25 11:40:05 +0100) * Upstream changes: - v0.0.79 - Move revision loading logic to its own function. - Use swh-storage validation proxy earlier in the pipeline. - Use swh-storage validation proxy. - Add missing __init__.py and fix tests. 
-- Software Heritage autobuilder (on jenkins-debian1) Tue, 25 Feb 2020 10:48:07 +0000 swh-loader-core (0.0.78-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.78 - (tagged by Antoine R. Dumont (@ardumont) on 2020-02-06 15:28:11 +0100) * Upstream changes: - v0.0.78 - tests: Use new get_storage signature - loader.core.converters: Prefer the with open pattern to read file - test_converters: Add coverage on prepare_contents method - test_converters: Migrate to pytest - loader.core/package: Call storage's (skipped_)content_add endpoints -- Software Heritage autobuilder (on jenkins-debian1) Thu, 06 Feb 2020 15:09:05 +0000 swh-loader-core (0.0.77-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.77 - (tagged by Antoine R. Dumont (@ardumont) on 2020-01-30 10:32:08 +0100) * Upstream changes: - v0.0.77 - loader.npm: If no upload time provided, use artifact's mtime if provided - loader.npm: Fail ingestion if at least 1 artifact has no upload time -- Software Heritage autobuilder (on jenkins-debian1) Thu, 30 Jan 2020 09:37:58 +0000 swh-loader-core (0.0.76-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.76 - (tagged by Antoine R. Dumont (@ardumont) on 2020-01-28 13:07:30 +0100) * Upstream changes: - v0.0.76 - npm.loader: Skip artifacts with no intrinsic metadata - pypi.loader: Skip artifacts with no intrinsic metadata - package.loader: Fix edge case when some listing returns no content - core.loader: Drop retro- compatibility class names - loader.tests: Add filter and buffer proxy storage - docs: Fix sphinx warnings - README: Update class names -- Software Heritage autobuilder (on jenkins-debian1) Tue, 28 Jan 2020 12:11:07 +0000 swh-loader-core (0.0.75-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.75 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-01-16 14:14:29 +0100) * Upstream changes: - v0.0.75 - cran.loader: Align cran loader with other package loaders -- Software Heritage autobuilder (on jenkins-debian1) Thu, 16 Jan 2020 13:17:30 +0000 swh-loader-core (0.0.74-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.74 - (tagged by Antoine R. Dumont (@ardumont) on 2020-01-15 15:30:13 +0100) * Upstream changes: - v0.0.74 - Drop no longer used retrying dependency - core.loader: Clean up indirection and retry behavior - tests: Use retry proxy storage in loaders - core.loader: Drop dead code - cran.loader: Fix parsing description file error -- Software Heritage autobuilder (on jenkins-debian1) Wed, 15 Jan 2020 14:33:57 +0000 swh-loader-core (0.0.73-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.73 - (tagged by Antoine R. Dumont (@ardumont) on 2020-01-09 10:00:21 +0100) * Upstream changes: - v0.0.73 - package.cran: Name CRAN task appropriately -- Software Heritage autobuilder (on jenkins-debian1) Thu, 09 Jan 2020 09:05:07 +0000 swh-loader-core (0.0.72-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.72 - (tagged by Antoine R. Dumont (@ardumont) on 2020-01-06 16:37:58 +0100) * Upstream changes: - v0.0.72 - package.loader: Fail fast when unable to create origin/origin_visit - cran.loader: Add implementation -- Software Heritage autobuilder (on jenkins-debian1) Mon, 06 Jan 2020 15:50:08 +0000 swh-loader-core (0.0.71-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.71 - (tagged by Antoine R. Dumont (@ardumont) on 2019-12-20 14:22:31 +0100) * Upstream changes: - v0.0.71 - package.utils: Drop unneeded hashes from download computation -- Software Heritage autobuilder (on jenkins-debian1) Fri, 20 Dec 2019 13:26:09 +0000 swh-loader-core (0.0.70-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.70 - (tagged by Antoine R. 
Dumont (@ardumont) on 2019-12-20 11:32:09 +0100) * Upstream changes: - v0.0.70 - debian.loader: Improve and fix revision resolution's corner cases -- Software Heritage autobuilder (on jenkins-debian1) Fri, 20 Dec 2019 10:39:34 +0000 swh-loader-core (0.0.69-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.69 - (tagged by Antoine R. Dumont (@ardumont) on 2019-12-12 16:21:59 +0100) * Upstream changes: - v0.0.69 - loader.core: Fix correctly loader initialization -- Software Heritage autobuilder (on jenkins-debian1) Thu, 12 Dec 2019 15:26:13 +0000 swh-loader-core (0.0.68-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.68 - (tagged by Antoine R. Dumont (@ardumont) on 2019-12-12 15:45:21 +0100) * Upstream changes: - v0.0.68 - loader.core: Fix initialization issue in dvcs loaders -- Software Heritage autobuilder (on jenkins-debian1) Thu, 12 Dec 2019 14:49:12 +0000 swh-loader-core (0.0.67-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.67 - (tagged by Antoine R. Dumont (@ardumont) on 2019-12-12 14:02:47 +0100) * Upstream changes: - v0.0.67 - loader.core: Type methods - loader.core: Transform data input into list - loader.core: Add missing conversion step on content -- Software Heritage autobuilder (on jenkins-debian1) Thu, 12 Dec 2019 13:07:47 +0000 swh-loader-core (0.0.66-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.66 - (tagged by Antoine R. Dumont (@ardumont) on 2019-12-12 12:01:14 +0100) * Upstream changes: - v0.0.66 - Drop deprecated behavior -- Software Heritage autobuilder (on jenkins-debian1) Thu, 12 Dec 2019 11:05:17 +0000 swh-loader-core (0.0.65-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.65 - (tagged by Antoine R. 
Dumont (@ardumont) on 2019-12-12 11:42:46 +0100) * Upstream changes: - v0.0.65 - loader.cli: Improve current implementation - tasks: Enforce kwargs use in task message -- Software Heritage autobuilder (on jenkins-debian1) Thu, 12 Dec 2019 10:51:02 +0000 swh-loader-core (0.0.64-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.64 - (tagged by Antoine R. Dumont (@ardumont) on 2019-12-10 09:49:06 +0100) * Upstream changes: - v0.0.64 - requirements-test: Add missing test dependency - tests: Refactor using pytest-mock's mocker fixture - loader.cli: Add tests around cli - package.npm: Align loader instantiation - loader.cli: Reference new loader cli -- Software Heritage autobuilder (on jenkins-debian1) Tue, 10 Dec 2019 08:56:02 +0000 swh-loader-core (0.0.63-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.63 - (tagged by Antoine R. Dumont (@ardumont) on 2019-12-05 16:01:49 +0100) * Upstream changes: - v0.0.63 - Add missing inclusion instruction -- Software Heritage autobuilder (on jenkins-debian1) Thu, 05 Dec 2019 15:05:39 +0000 swh-loader-core (0.0.62-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.62 - (tagged by Antoine R. Dumont (@ardumont) on 2019-12-05 15:46:46 +0100) * Upstream changes: - v0.0.62 - Move package loaders to their own namespace -- Software Heritage autobuilder (on jenkins-debian1) Thu, 05 Dec 2019 14:50:19 +0000 swh-loader-core (0.0.61-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.61 - (tagged by Antoine R. Dumont (@ardumont) on 2019-11-28 17:25:49 +0100) * Upstream changes: - v0.0.61 - pypi: metadata -> revision: Deal with previous metadata format - npm: metadata -> revision: Deal with previous metadata format -- Software Heritage autobuilder (on jenkins-debian1) Thu, 28 Nov 2019 16:29:47 +0000 swh-loader-core (0.0.60-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.60 - (tagged by Antoine R. 
Dumont (@ardumont) on 2019-11-26 12:09:28 +0100) * Upstream changes: - v0.0.60 - package.deposit: Fix revision- get inconsistency - package.deposit: Provide parents in any case - package.deposit: Fix url computation issue - utils: Work around header issue during download -- Software Heritage autobuilder (on jenkins-debian1) Tue, 26 Nov 2019 11:18:41 +0000 swh-loader-core (0.0.59-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.59 - (tagged by Antoine R. Dumont (@ardumont) on 2019-11-22 18:11:33 +0100) * Upstream changes: - v0.0.59 - npm: Explicitly retrieve the revision date from extrinsic metadata -- Software Heritage autobuilder (on jenkins-debian1) Fri, 22 Nov 2019 17:15:34 +0000 swh-loader-core (0.0.58-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.58 - (tagged by Antoine R. Dumont (@ardumont) on 2019-11-22 12:08:10 +0100) * Upstream changes: - v0.0.58 - package.pypi: Filter out non- sdist package type -- Software Heritage autobuilder (on jenkins-debian1) Fri, 22 Nov 2019 11:11:56 +0000 swh-loader-core (0.0.57-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.57 - (tagged by Antoine R. Dumont (@ardumont) on 2019-11-22 11:26:11 +0100) * Upstream changes: - v0.0.57 - package.pypi: Fix project url computation edge case - Use pkg_resources to get the package version instead of vcversioner -- Software Heritage autobuilder (on jenkins-debian1) Fri, 22 Nov 2019 10:31:11 +0000 swh-loader-core (0.0.56-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.56 - (tagged by Antoine R. Dumont (@ardumont) on 2019-11-21 16:12:46 +0100) * Upstream changes: - v0.0.56 - package.tasks: Rename appropriately load_deb_package task type name - Fix typos reported by codespell - Add a pre-commit config file -- Software Heritage autobuilder (on jenkins-debian1) Thu, 21 Nov 2019 15:16:23 +0000 swh-loader-core (0.0.55-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.55 - (tagged by Antoine R. 
Dumont (@ardumont) on 2019-11-21 13:51:03 +0100) * Upstream changes: - v0.0.55 - package.tasks: Rename load_archive into load_archive_files - Migrate tox.ini to extras = xxx instead of deps = .[testing] - Merge tox test environments -- Software Heritage autobuilder (on jenkins-debian1) Thu, 21 Nov 2019 12:56:07 +0000 swh-loader-core (0.0.54-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.54 - (tagged by Antoine R. Dumont (@ardumont) on 2019-11-21 11:29:20 +0100) * Upstream changes: - v0.0.54 - loader.package.deposit: Drop swh.deposit.client requirement - Include all requirements in MANIFEST.in -- Software Heritage autobuilder (on jenkins-debian1) Thu, 21 Nov 2019 10:32:23 +0000 swh-loader-core (0.0.53-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.53 - (tagged by Antoine R. Dumont (@ardumont) on 2019-11-20 14:26:36 +0100) * Upstream changes: - v0.0.53 - loader.package.tasks: Document tasks - Define correctly the setup.py's entry_points -- Software Heritage autobuilder (on jenkins-debian1) Wed, 20 Nov 2019 13:30:10 +0000 swh-loader-core (0.0.52-1~swh3) unstable-swh; urgency=medium * Update dh-python version constraint -- Antoine R. Dumont (@ardumont) Wed, 20 Nov 2019 12:03:00 +0100 swh-loader-core (0.0.52-1~swh2) unstable-swh; urgency=medium * Add egg-info to pybuild.testfiles. -- Antoine R. Dumont (@ardumont) Wed, 20 Nov 2019 11:42:42 +0100 swh-loader-core (0.0.52-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.52 - (tagged by Antoine R. 
Dumont (@ardumont) on 2019-11-19 15:15:40 +0100) * Upstream changes: - v0.0.52 - Ensure BufferedLoader and UnbufferedLoader do flush their storage - loader.package: Register loader package tasks - package.tasks: Rename debian task to load_deb -- Software Heritage autobuilder (on jenkins-debian1) Tue, 19 Nov 2019 14:18:41 +0000 swh-loader-core (0.0.51-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.51 - (tagged by David Douard on 2019-11-18 17:05:17 +0100) * Upstream changes: - v0.0.51 -- Software Heritage autobuilder (on jenkins-debian1) Mon, 18 Nov 2019 16:09:44 +0000 swh-loader-core (0.0.50-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.50 - (tagged by Antoine R. Dumont (@ardumont) on 2019-11-13 15:56:55 +0100) * Upstream changes: - v0.0.50 - package.loader: Check snapshot_id is set as returned value - package.loader: Ensure the origin visit type is set appropriately - package.loader: Fix serialization issue - package.debian: Align origin_visit type to 'deb' as in production -- Software Heritage autobuilder (on jenkins-debian1) Wed, 13 Nov 2019 15:04:37 +0000 swh-loader-core (0.0.49-1~swh2) unstable-swh; urgency=medium * Update dependencies -- Antoine R. Dumont Fri, 08 Nov 2019 14:07:20 +0100 swh-loader-core (0.0.49-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.49 - (tagged by Antoine R. 
Dumont (@ardumont) on 2019-11-08 13:21:56 +0100) * Upstream changes: - v0.0.49 - New package loader implementations: archive, pypi, npm, deposit, debian -- Software Heritage autobuilder (on jenkins-debian1) Fri, 08 Nov 2019 12:29:47 +0000 swh-loader-core (0.0.48-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.48 - (tagged by Stefano Zacchiroli on 2019-10-01 16:49:39 +0200) * Upstream changes: - v0.0.48 - * typing: minimal changes to make a no-op mypy run pass -- Software Heritage autobuilder (on jenkins-debian1) Tue, 01 Oct 2019 14:52:59 +0000 swh-loader-core (0.0.47-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.47 - (tagged by Antoine Lambert on 2019-10-01 11:32:50 +0200) * Upstream changes: - version 0.0.47: Workaround HashCollision errors -- Software Heritage autobuilder (on jenkins-debian1) Tue, 01 Oct 2019 09:35:38 +0000 swh-loader-core (0.0.46-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.46 - (tagged by Antoine R. Dumont (@ardumont) on 2019-09-06 18:30:42 +0200) * Upstream changes: - v0.0.46 - pytest.ini: Remove warnings about our custom markers - pep8: Fix log.warning calls - core/loader: Fix get_save_data_path implementation - Fix validation errors in test. -- Software Heritage autobuilder (on jenkins-debian1) Fri, 06 Sep 2019 16:33:13 +0000 swh-loader-core (0.0.45-1~swh2) unstable-swh; urgency=medium * Fix missing build dependency -- Antoine R. Dumont (@ardumont) Tue, 03 Sep 2019 14:12:13 +0200 swh-loader-core (0.0.45-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.45 - (tagged by Antoine R. 
Dumont (@ardumont) on 2019-09-03 10:38:36 +0200) * Upstream changes: - v0.0.45 - loader: Provide visit type when calling origin_visit_add - loader: Drop keys 'perms' and 'path' from content before sending to the - storage - swh.loader.package: Implement GNU loader - docs: add code of conduct document -- Software Heritage autobuilder (on jenkins-debian1) Tue, 03 Sep 2019 08:41:49 +0000 swh-loader-core (0.0.44-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.44 - (tagged by Valentin Lorentz on 2019-06-25 12:18:27 +0200) * Upstream changes: - Drop use of deprecated methods fetch_history_* -- Software Heritage autobuilder (on jenkins-debian1) Wed, 26 Jun 2019 09:40:59 +0000 swh-loader-core (0.0.43-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.43 - (tagged by Valentin Lorentz on 2019-06-18 16:21:58 +0200) * Upstream changes: - Use origin urls instead of origin ids. -- Software Heritage autobuilder (on jenkins-debian1) Wed, 19 Jun 2019 09:33:53 +0000 swh-loader-core (0.0.42-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.42 - (tagged by David Douard on 2019-05-20 11:28:49 +0200) * Upstream changes: - v0.0.42 - update/fix requirements -- Software Heritage autobuilder (on jenkins-debian1) Mon, 20 May 2019 09:33:47 +0000 swh-loader-core (0.0.41-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.41 - (tagged by Antoine R. 
Dumont (@ardumont) on 2019-04-11 11:46:00 +0200) * Upstream changes: - v0.0.41 - core.loader: Migrate to latest snapshot_add, origin_visit_update api - core.loader: Count only the effectively new objects ingested - test_utils: Add coverage on utils module -- Software Heritage autobuilder (on jenkins-debian1) Thu, 11 Apr 2019 09:52:55 +0000 swh-loader-core (0.0.40-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.40 - (tagged by Antoine Lambert on 2019-03-29 10:57:14 +0100) * Upstream changes: - version 0.0.40 -- Software Heritage autobuilder (on jenkins-debian1) Fri, 29 Mar 2019 10:02:37 +0000 swh-loader-core (0.0.39-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.39 - (tagged by Antoine R. Dumont (@ardumont) on 2019-01-30 11:10:39 +0100) * Upstream changes: - v0.0.39 -- Software Heritage autobuilder (on jenkins-debian1) Wed, 30 Jan 2019 10:13:56 +0000 swh-loader-core (0.0.35-1~swh1) unstable-swh; urgency=medium * v0.0.35 * tests: Initialize tox.ini use * tests, debian/*: Migrate to pytest -- Antoine R. Dumont (@ardumont) Tue, 23 Oct 2018 15:47:22 +0200 swh-loader-core (0.0.34-1~swh1) unstable-swh; urgency=medium * v0.0.34 * setup: prepare for PyPI upload * README.md: Simplify module description * core.tests: Install tests fixture for derivative loaders to use -- Antoine R. Dumont (@ardumont) Tue, 09 Oct 2018 14:11:29 +0200 swh-loader-core (0.0.33-1~swh1) unstable-swh; urgency=medium * v0.0.33 * loader/utils: Add clean_dangling_folders function to ease clean up * loader/core: Add optional pre_cleanup for dangling files cleaning -- Antoine R. Dumont (@ardumont) Fri, 09 Mar 2018 14:41:17 +0100 swh-loader-core (0.0.32-1~swh1) unstable-swh; urgency=medium * v0.0.32 * Improve origin_visit initialization step * Properly sandbox the prepare statement so that if it breaks, we can * update appropriately the visit with the correct status -- Antoine R. 
Dumont (@ardumont) Wed, 07 Mar 2018 11:06:27 +0100 swh-loader-core (0.0.31-1~swh1) unstable-swh; urgency=medium * Release swh.loader.core v0.0.31 * Remove backwards-compatibility when sending snapshots -- Nicolas Dandrimont Tue, 13 Feb 2018 18:52:20 +0100 swh-loader-core (0.0.30-1~swh1) unstable-swh; urgency=medium * Release swh.loader.core v0.0.30 * Update Debian metadata for snapshot-related breakage -- Nicolas Dandrimont Tue, 06 Feb 2018 14:22:53 +0100 swh-loader-core (0.0.29-1~swh1) unstable-swh; urgency=medium * Release swh.loader.core v0.0.29 * Replace occurrences with snapshots * Enhance logging on error cases -- Nicolas Dandrimont Tue, 06 Feb 2018 14:13:11 +0100 swh-loader-core (0.0.28-1~swh1) unstable-swh; urgency=medium * v0.0.28 * Add stateless loader base class * Remove bare exception handlers -- Antoine R. Dumont (@ardumont) Tue, 19 Dec 2017 17:48:09 +0100 swh-loader-core (0.0.27-1~swh1) unstable-swh; urgency=medium * v0.0.27 * Migrate from indexer's indexer_configuration to storage's tool notion. -- Antoine R. Dumont (@ardumont) Thu, 07 Dec 2017 10:36:23 +0100 swh-loader-core (0.0.26-1~swh1) unstable-swh; urgency=medium * v0.0.26 * Fix send_provider method -- Antoine R. Dumont (@ardumont) Tue, 05 Dec 2017 15:40:57 +0100 swh-loader-core (0.0.25-1~swh1) unstable-swh; urgency=medium * v0.0.25 * swh.loader.core: Fix to retrieve the provider_id as an actual id * swh.loader.core: Fix log format error * swh.loader.core: Align log message according to conventions -- Antoine R. Dumont (@ardumont) Wed, 29 Nov 2017 12:55:45 +0100 swh-loader-core (0.0.24-1~swh1) unstable-swh; urgency=medium * v0.0.24 * Added metadata injection possible from loader core -- Antoine R. Dumont (@ardumont) Fri, 24 Nov 2017 11:35:40 +0100 swh-loader-core (0.0.23-1~swh1) unstable-swh; urgency=medium * v0.0.23 * loader: Fix dangling data flush -- Antoine R. 
Dumont (@ardumont) Tue, 07 Nov 2017 16:25:20 +0100 swh-loader-core (0.0.22-1~swh1) unstable-swh; urgency=medium * v0.0.22 * core.loader: Use the global setup set in swh.core.config * core.loader: Properly batch object insertions for big requests -- Antoine R. Dumont (@ardumont) Mon, 30 Oct 2017 18:50:00 +0100 swh-loader-core (0.0.21-1~swh1) unstable-swh; urgency=medium * v0.0.21 * swh.loader.core: Only send origin if not already sent before -- Antoine R. Dumont (@ardumont) Tue, 24 Oct 2017 16:30:53 +0200 swh-loader-core (0.0.20-1~swh1) unstable-swh; urgency=medium * v0.0.20 * Permit to add 'post_load' actions in loaders -- Antoine R. Dumont (@ardumont) Fri, 13 Oct 2017 14:30:37 +0200 swh-loader-core (0.0.19-1~swh1) unstable-swh; urgency=medium * v0.0.19 * Permit to add 'post_load' actions in loaders -- Antoine R. Dumont (@ardumont) Fri, 13 Oct 2017 14:14:14 +0200 swh-loader-core (0.0.18-1~swh1) unstable-swh; urgency=medium * Release swh.loader.core version 0.0.18 * Update packaging runes -- Nicolas Dandrimont Thu, 12 Oct 2017 18:07:53 +0200 swh-loader-core (0.0.17-1~swh1) unstable-swh; urgency=medium * Release swh.loader.core v0.0.17 * Allow iterating when fetching and storing data * Allow overriding the status of the loaded visit * Allow overriding the status of the load itself -- Nicolas Dandrimont Wed, 11 Oct 2017 16:38:29 +0200 swh-loader-core (0.0.16-1~swh1) unstable-swh; urgency=medium * Release swh.loader.core v0.0.16 * Migrate from swh.model.git to swh.model.from_disk -- Nicolas Dandrimont Fri, 06 Oct 2017 14:46:41 +0200 swh-loader-core (0.0.15-1~swh1) unstable-swh; urgency=medium * v0.0.15 * docs: Add sphinx apidoc generation skeleton * docs: Add a simple README.md explaining the module's goal * swh.loader.core.loader: Unify origin_visit add/update function call -- Antoine R. Dumont (@ardumont) Fri, 29 Sep 2017 11:47:37 +0200 swh-loader-core (0.0.14-1~swh1) unstable-swh; urgency=medium * v0.0.14 * Add the blake2s256 hash computation -- Antoine R. 
Dumont (@ardumont) Sat, 25 Mar 2017 18:20:52 +0100 swh-loader-core (0.0.13-1~swh1) unstable-swh; urgency=medium * v0.0.13 * Improve core loader's interface api -- Antoine R. Dumont (@ardumont) Wed, 22 Feb 2017 13:43:54 +0100 swh-loader-core (0.0.12-1~swh1) unstable-swh; urgency=medium * v0.0.12 * Update storage configuration reading -- Antoine R. Dumont (@ardumont) Thu, 15 Dec 2016 18:34:41 +0100 swh-loader-core (0.0.11-1~swh1) unstable-swh; urgency=medium * v0.0.11 * d/control: Bump dependency to latest storage * Fix: Objects can be injected even though global loading failed * Populate the counters in fetch_history * Open open/close fetch_history function in the core loader -- Antoine R. Dumont (@ardumont) Wed, 24 Aug 2016 14:38:55 +0200 swh-loader-core (0.0.10-1~swh1) unstable-swh; urgency=medium * v0.0.10 * d/control: Update dependency -- Antoine R. Dumont (@ardumont) Sat, 11 Jun 2016 02:26:50 +0200 swh-loader-core (0.0.9-1~swh1) unstable-swh; urgency=medium * v0.0.9 * Improve default task that initialize storage as well -- Antoine R. Dumont (@ardumont) Fri, 10 Jun 2016 15:12:14 +0200 swh-loader-core (0.0.8-1~swh1) unstable-swh; urgency=medium * v0.0.8 * Migrate specific converter to the right module * Fix dangling parameter -- Antoine R. Dumont (@ardumont) Wed, 08 Jun 2016 18:09:23 +0200 swh-loader-core (0.0.7-1~swh1) unstable-swh; urgency=medium * v0.0.7 * Fix on revision conversion -- Antoine R. Dumont (@ardumont) Wed, 08 Jun 2016 16:19:02 +0200 swh-loader-core (0.0.6-1~swh1) unstable-swh; urgency=medium * v0.0.6 * d/control: Bump dependency on swh-model * d/control: Add missing description * Keep the abstraction for all entities * Align parameter definition order * Fix missing option in DEFAULT ones * Decrease verbosity * Fix missing origin_id assignment * d/rules: Add target to run tests during packaging -- Antoine R. Dumont (@ardumont) Wed, 08 Jun 2016 16:00:40 +0200 swh-loader-core (0.0.5-1~swh1) unstable-swh; urgency=medium * v0.0.5 -- Antoine R. 
Dumont (@ardumont) Wed, 25 May 2016 12:17:06 +0200 swh-loader-core (0.0.4-1~swh1) unstable-swh; urgency=medium * v0.0.4 * Rename package from python3-swh.loader to python3-swh.loader.core -- Antoine R. Dumont (@ardumont) Wed, 25 May 2016 11:44:48 +0200 swh-loader-core (0.0.3-1~swh1) unstable-swh; urgency=medium * v0.0.3 * Improve default configuration * Rename package from swh-loader-vcs to swh-loader -- Antoine R. Dumont (@ardumont) Wed, 25 May 2016 11:23:06 +0200 swh-loader-core (0.0.2-1~swh1) unstable-swh; urgency=medium * v0.0.2 * Fix: Flush data even when no data is sent to swh-storage -- Antoine R. Dumont (@ardumont) Tue, 24 May 2016 16:41:49 +0200 swh-loader-core (0.0.1-1~swh1) unstable-swh; urgency=medium * Initial release * v0.0.1 -- Antoine R. Dumont (@ardumont) Wed, 13 Apr 2016 16:54:47 +0200 diff --git a/pytest.ini b/pytest.ini index 276cddc..478264b 100644 --- a/pytest.ini +++ b/pytest.ini @@ -1,7 +1,7 @@ [pytest] -norecursedirs = docs .* +norecursedirs = build docs .* markers = db: marks tests as using a db (deselect with '-m "not db"') fs: marks tests as using the filesystem (deselect with '-m "not fs"') diff --git a/swh.loader.core.egg-info/PKG-INFO b/swh.loader.core.egg-info/PKG-INFO index 208999e..a92a4b5 100644 --- a/swh.loader.core.egg-info/PKG-INFO +++ b/swh.loader.core.egg-info/PKG-INFO @@ -1,56 +1,56 @@ Metadata-Version: 2.1 Name: swh.loader.core -Version: 2.6.0 +Version: 2.6.1 Summary: Software Heritage Base Loader Home-page: https://forge.softwareheritage.org/diffusion/DLDBASE Author: Software Heritage developers Author-email: swh-devel@inria.fr License: UNKNOWN Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest Project-URL: Funding, https://www.softwareheritage.org/donate Project-URL: Source, https://forge.softwareheritage.org/source/swh-loader-core Project-URL: Documentation, https://docs.softwareheritage.org/devel/swh-loader-core/ Platform: UNKNOWN Classifier: Programming Language :: Python :: 3 Classifier: Intended 
Audience :: Developers Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3) Classifier: Operating System :: OS Independent Classifier: Development Status :: 5 - Production/Stable Requires-Python: >=3.7 Description-Content-Type: text/markdown Provides-Extra: testing License-File: LICENSE License-File: AUTHORS Software Heritage - Loader foundations ====================================== The Software Heritage Loader Core is a low-level loading utilities and helpers used by :term:`loaders `. The main entry points are classes: - :class:`swh.loader.core.loader.BaseLoader` for loaders (e.g. svn) - :class:`swh.loader.core.loader.DVCSLoader` for DVCS loaders (e.g. hg, git, ...) - :class:`swh.loader.package.loader.PackageLoader` for Package loaders (e.g. PyPI, Npm, ...) Package loaders --------------- This package also implements many package loaders directly, out of convenience, as they usually are quite similar and each fits in a single file. They all roughly follow these steps, explained in the :py:meth:`swh.loader.package.loader.PackageLoader.load` documentation. See the :ref:`package-loader-tutorial` for details. VCS loaders ----------- Unlike package loaders, VCS loaders remain in separate packages, as they often need more advanced conversions and very VCS-specific operations. 
This usually involves getting the branches of a repository and recursively loading revisions in the history (and directory trees in these revisions), until a known revision is found diff --git a/swh.loader.core.egg-info/SOURCES.txt b/swh.loader.core.egg-info/SOURCES.txt index dbf32c7..7a9d27f 100644 --- a/swh.loader.core.egg-info/SOURCES.txt +++ b/swh.loader.core.egg-info/SOURCES.txt @@ -1,216 +1,218 @@ .gitignore .pre-commit-config.yaml AUTHORS CODE_OF_CONDUCT.md CONTRIBUTORS LICENSE MANIFEST.in Makefile README.rst conftest.py mypy.ini pyproject.toml pytest.ini requirements-swh.txt requirements-test.txt requirements.txt setup.cfg setup.py tox.ini docs/.gitignore docs/Makefile docs/README.rst docs/cli.rst docs/conf.py docs/index.rst docs/package-loader-specifications.rst docs/package-loader-tutorial.rst docs/vcs-loader-overview.rst docs/_static/.placeholder docs/_templates/.placeholder swh/__init__.py swh.loader.core.egg-info/PKG-INFO swh.loader.core.egg-info/SOURCES.txt swh.loader.core.egg-info/dependency_links.txt swh.loader.core.egg-info/entry_points.txt swh.loader.core.egg-info/requires.txt swh.loader.core.egg-info/top_level.txt swh/loader/__init__.py swh/loader/cli.py swh/loader/exception.py swh/loader/pytest_plugin.py swh/loader/core/__init__.py swh/loader/core/converters.py swh/loader/core/loader.py swh/loader/core/py.typed swh/loader/core/utils.py swh/loader/core/tests/__init__.py swh/loader/core/tests/test_converters.py swh/loader/core/tests/test_loader.py swh/loader/core/tests/test_utils.py swh/loader/package/__init__.py swh/loader/package/loader.py swh/loader/package/py.typed swh/loader/package/utils.py swh/loader/package/archive/__init__.py swh/loader/package/archive/loader.py swh/loader/package/archive/tasks.py swh/loader/package/archive/tests/__init__.py swh/loader/package/archive/tests/test_archive.py swh/loader/package/archive/tests/test_tasks.py swh/loader/package/archive/tests/data/not_gzipped_tarball.tar.gz 
swh/loader/package/archive/tests/data/https_ftp.gnu.org/gnu_8sync_8sync-0.1.0.tar.gz swh/loader/package/archive/tests/data/https_ftp.gnu.org/gnu_8sync_8sync-0.1.0.tar.gz_visit1 swh/loader/package/archive/tests/data/https_ftp.gnu.org/gnu_8sync_8sync-0.1.0.tar.gz_visit2 swh/loader/package/archive/tests/data/https_ftp.gnu.org/gnu_8sync_8sync-0.2.0.tar.gz swh/loader/package/cran/__init__.py swh/loader/package/cran/loader.py swh/loader/package/cran/tasks.py swh/loader/package/cran/tests/__init__.py swh/loader/package/cran/tests/test_cran.py swh/loader/package/cran/tests/test_tasks.py swh/loader/package/cran/tests/data/description/KnownBR swh/loader/package/cran/tests/data/description/acepack swh/loader/package/cran/tests/data/https_cran.r-project.org/src_contrib_1.4.0_Recommended_KernSmooth_2.22-6.tar.gz swh/loader/package/debian/__init__.py swh/loader/package/debian/loader.py swh/loader/package/debian/tasks.py swh/loader/package/debian/tests/__init__.py swh/loader/package/debian/tests/test_debian.py swh/loader/package/debian/tests/test_tasks.py swh/loader/package/debian/tests/data/http_deb.debian.org/debian_pool_contrib_c_cicero_cicero_0.7.2-3.diff.gz swh/loader/package/debian/tests/data/http_deb.debian.org/debian_pool_contrib_c_cicero_cicero_0.7.2-3.dsc swh/loader/package/debian/tests/data/http_deb.debian.org/debian_pool_contrib_c_cicero_cicero_0.7.2-4.diff.gz swh/loader/package/debian/tests/data/http_deb.debian.org/debian_pool_contrib_c_cicero_cicero_0.7.2-4.dsc swh/loader/package/debian/tests/data/http_deb.debian.org/debian_pool_contrib_c_cicero_cicero_0.7.2.orig.tar.gz swh/loader/package/debian/tests/data/http_deb.debian.org/onefile.txt swh/loader/package/deposit/__init__.py swh/loader/package/deposit/loader.py swh/loader/package/deposit/tasks.py swh/loader/package/deposit/tests/__init__.py swh/loader/package/deposit/tests/conftest.py swh/loader/package/deposit/tests/test_deposit.py swh/loader/package/deposit/tests/test_tasks.py 
swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_666_meta swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_666_raw swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_777_meta swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_777_raw swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_888_meta swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_888_raw swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_999_meta swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_999_raw swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello-2.10.zip swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello-2.12.tar.gz swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.11.json swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.12.json swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.13.json swh/loader/package/maven/__init__.py swh/loader/package/maven/loader.py swh/loader/package/maven/tasks.py swh/loader/package/maven/tests/__init__.py swh/loader/package/maven/tests/test_maven.py swh/loader/package/maven/tests/test_tasks.py swh/loader/package/maven/tests/data/https_maven.org/sprova4j-0.1.0-sources.jar swh/loader/package/maven/tests/data/https_maven.org/sprova4j-0.1.0.pom swh/loader/package/maven/tests/data/https_maven.org/sprova4j-0.1.1-sources.jar swh/loader/package/maven/tests/data/https_maven.org/sprova4j-0.1.1.pom swh/loader/package/nixguix/__init__.py swh/loader/package/nixguix/loader.py swh/loader/package/nixguix/tasks.py swh/loader/package/nixguix/tests/__init__.py 
swh/loader/package/nixguix/tests/conftest.py swh/loader/package/nixguix/tests/test_nixguix.py swh/loader/package/nixguix/tests/test_tasks.py swh/loader/package/nixguix/tests/data/https_example.com/file.txt swh/loader/package/nixguix/tests/data/https_fail.com/truncated-archive.tgz swh/loader/package/nixguix/tests/data/https_ftp.gnu.org/gnu_8sync_8sync-0.1.0.tar.gz swh/loader/package/nixguix/tests/data/https_ftp.gnu.org/gnu_8sync_8sync-0.1.0.tar.gz_visit1 swh/loader/package/nixguix/tests/data/https_ftp.gnu.org/gnu_8sync_8sync-0.1.0.tar.gz_visit2 swh/loader/package/nixguix/tests/data/https_ftp.gnu.org/gnu_8sync_8sync-0.2.0.tar.gz swh/loader/package/nixguix/tests/data/https_github.com/owner-1_repository-1_revision-1.tgz swh/loader/package/nixguix/tests/data/https_github.com/owner-2_repository-1_revision-1.tgz swh/loader/package/nixguix/tests/data/https_github.com/owner-3_repository-1_revision-1.tgz swh/loader/package/nixguix/tests/data/https_nix-community.github.io/nixpkgs-swh_sources-EOFError.json swh/loader/package/nixguix/tests/data/https_nix-community.github.io/nixpkgs-swh_sources.json swh/loader/package/nixguix/tests/data/https_nix-community.github.io/nixpkgs-swh_sources.json_visit1 swh/loader/package/nixguix/tests/data/https_nix-community.github.io/nixpkgs-swh_sources_special.json swh/loader/package/nixguix/tests/data/https_nix-community.github.io/nixpkgs-swh_sources_special.json_visit1 swh/loader/package/npm/__init__.py swh/loader/package/npm/loader.py swh/loader/package/npm/tasks.py swh/loader/package/npm/tests/__init__.py swh/loader/package/npm/tests/test_npm.py swh/loader/package/npm/tests/test_tasks.py swh/loader/package/npm/tests/data/https_registry.npmjs.org/@aller_shared_-_shared-0.1.0.tgz swh/loader/package/npm/tests/data/https_registry.npmjs.org/@aller_shared_-_shared-0.1.1-alpha.14.tgz swh/loader/package/npm/tests/data/https_registry.npmjs.org/jammit-express_-_jammit-express-0.0.1.tgz 
swh/loader/package/npm/tests/data/https_registry.npmjs.org/nativescript-telerik-analytics_-_nativescript-telerik-analytics-1.0.0.tgz swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.0.2.tgz +swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.0.3-beta.tgz swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.0.3.tgz swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.0.4.tgz swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.0.5.tgz swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.1.0.tgz swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.2.0.tgz swh/loader/package/npm/tests/data/https_replicate.npmjs.com/@aller_shared swh/loader/package/npm/tests/data/https_replicate.npmjs.com/catify swh/loader/package/npm/tests/data/https_replicate.npmjs.com/jammit-express swh/loader/package/npm/tests/data/https_replicate.npmjs.com/jammit-no-time swh/loader/package/npm/tests/data/https_replicate.npmjs.com/nativescript-telerik-analytics swh/loader/package/npm/tests/data/https_replicate.npmjs.com/org +swh/loader/package/npm/tests/data/https_replicate.npmjs.com/org_version_mismatch swh/loader/package/npm/tests/data/https_replicate.npmjs.com/org_visit1 swh/loader/package/opam/__init__.py swh/loader/package/opam/loader.py swh/loader/package/opam/tasks.py swh/loader/package/opam/tests/__init__.py swh/loader/package/opam/tests/test_opam.py swh/loader/package/opam/tests/test_tasks.py swh/loader/package/opam/tests/data/fake_opam_repo/_repo swh/loader/package/opam/tests/data/fake_opam_repo/version swh/loader/package/opam/tests/data/fake_opam_repo/repo/loadertest/lock swh/loader/package/opam/tests/data/fake_opam_repo/repo/loadertest/repos-config swh/loader/package/opam/tests/data/fake_opam_repo/repo/loadertest/packages/agrid/agrid.0.1/opam swh/loader/package/opam/tests/data/fake_opam_repo/repo/loadertest/packages/directories/directories.0.1/opam 
swh/loader/package/opam/tests/data/fake_opam_repo/repo/loadertest/packages/directories/directories.0.2/opam swh/loader/package/opam/tests/data/fake_opam_repo/repo/loadertest/packages/directories/directories.0.3/opam swh/loader/package/opam/tests/data/fake_opam_repo/repo/loadertest/packages/ocb/ocb.0.1/opam swh/loader/package/opam/tests/data/https_github.com/OCamlPro_agrid_archive_0.1.tar.gz swh/loader/package/opam/tests/data/https_github.com/OCamlPro_directories_archive_0.1.tar.gz swh/loader/package/opam/tests/data/https_github.com/OCamlPro_directories_archive_0.2.tar.gz swh/loader/package/opam/tests/data/https_github.com/OCamlPro_directories_archive_0.3.tar.gz swh/loader/package/opam/tests/data/https_github.com/OCamlPro_ocb_archive_0.1.tar.gz swh/loader/package/pypi/__init__.py swh/loader/package/pypi/loader.py swh/loader/package/pypi/tasks.py swh/loader/package/pypi/tests/__init__.py swh/loader/package/pypi/tests/test_pypi.py swh/loader/package/pypi/tests/test_tasks.py swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/0805nexter-1.1.0.tar.gz swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/0805nexter-1.1.0.zip swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/0805nexter-1.2.0.zip swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/0805nexter-1.3.0.zip swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/0805nexter-1.4.0.zip swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/nexter-1.1.0.tar.gz swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/nexter-1.1.0.zip swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/packages_70_97_c49fb8ec24a7aaab54c3dbfbb5a6ca1431419d9ee0f6c363d9ad01d2b8b1_0805nexter-1.3.0.zip swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/packages_86_10_c9555ec63106153aaaad753a281ff47f4ac79e980ff7f5d740d6649cd56a_upymenu-0.0.1.tar.gz 
swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/packages_c4_a0_4562cda161dc4ecbbe9e2a11eb365400c0461845c5be70d73869786809c4_0805nexter-1.2.0.zip swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/packages_c4_a0_4562cda161dc4ecbbe9e2a11eb365400c0461845c5be70d73869786809c4_0805nexter-1.2.0.zip_visit1 swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/packages_ec_65_c0116953c9a3f47de89e71964d6c7b0c783b01f29fa3390584dbf3046b4d_0805nexter-1.1.0.zip swh/loader/package/pypi/tests/data/https_files.pythonhosted.org/packages_ec_65_c0116953c9a3f47de89e71964d6c7b0c783b01f29fa3390584dbf3046b4d_0805nexter-1.1.0.zip_visit1 swh/loader/package/pypi/tests/data/https_pypi.org/pypi_0805nexter_json swh/loader/package/pypi/tests/data/https_pypi.org/pypi_0805nexter_json_visit1 swh/loader/package/pypi/tests/data/https_pypi.org/pypi_nexter_json swh/loader/package/pypi/tests/data/https_pypi.org/pypi_upymenu_json swh/loader/package/tests/__init__.py swh/loader/package/tests/common.py swh/loader/package/tests/test_conftest.py swh/loader/package/tests/test_loader.py swh/loader/package/tests/test_loader_metadata.py swh/loader/package/tests/test_utils.py swh/loader/tests/__init__.py swh/loader/tests/conftest.py swh/loader/tests/py.typed swh/loader/tests/test_cli.py swh/loader/tests/test_init.py swh/loader/tests/data/0805nexter-1.1.0.tar.gz \ No newline at end of file diff --git a/swh/loader/package/deposit/loader.py b/swh/loader/package/deposit/loader.py index 964b880..794f33d 100644 --- a/swh/loader/package/deposit/loader.py +++ b/swh/loader/package/deposit/loader.py @@ -1,381 +1,381 @@ # Copyright (C) 2019-2021 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information import datetime from datetime import timezone import json import logging from typing import Any, Dict, Iterator, List, 
Mapping, Optional, Sequence, Tuple, Union import attr import requests from swh.core.config import load_from_envvar from swh.loader.core.loader import DEFAULT_CONFIG from swh.loader.package.loader import ( BasePackageInfo, PackageLoader, RawExtrinsicMetadataCore, ) from swh.loader.package.utils import cached_method, download from swh.model.hashutil import hash_to_bytes, hash_to_hex from swh.model.model import ( MetadataAuthority, MetadataAuthorityType, MetadataFetcher, ObjectType, Person, Release, Sha1Git, TimestampWithTimezone, ) from swh.storage.algos.snapshot import snapshot_get_all_branches from swh.storage.interface import StorageInterface logger = logging.getLogger(__name__) def now() -> datetime.datetime: return datetime.datetime.now(tz=timezone.utc) @attr.s class DepositPackageInfo(BasePackageInfo): filename = attr.ib(type=str) # instead of Optional[str] author_date = attr.ib(type=datetime.datetime) """codemeta:dateCreated if any, deposit completed_date otherwise""" commit_date = attr.ib(type=datetime.datetime) """codemeta:datePublished if any, deposit completed_date otherwise""" client = attr.ib(type=str) id = attr.ib(type=int) """Internal ID of the deposit in the deposit DB""" collection = attr.ib(type=str) """The collection in the deposit; see SWORD specification.""" author = attr.ib(type=Person) committer = attr.ib(type=Person) release_notes = attr.ib(type=Optional[str]) @classmethod def from_metadata( cls, metadata: Dict[str, Any], url: str, filename: str, version: str ) -> "DepositPackageInfo": # Note: # `date` and `committer_date` are always transmitted by the deposit read api # which computes itself the values. The loader needs to use those to create the # release. 
- metadata_raw: str = metadata["metadata_raw"] + raw_metadata: str = metadata["raw_metadata"] depo = metadata["deposit"] return cls( url=url, filename=filename, version=version, author_date=depo["author_date"], commit_date=depo["committer_date"], client=depo["client"], id=depo["id"], collection=depo["collection"], author=parse_author(depo["author"]), committer=parse_author(depo["committer"]), release_notes=depo["release_notes"], directory_extrinsic_metadata=[ RawExtrinsicMetadataCore( discovery_date=now(), - metadata=metadata_raw.encode(), + metadata=raw_metadata.encode(), format="sword-v2-atom-codemeta-v2", ) ], ) def extid(self) -> None: # For now, we don't try to deduplicate deposits. There is little point anyway, # as it only happens when the exact same tarball was deposited twice. return None class DepositLoader(PackageLoader[DepositPackageInfo]): """Load a deposited artifact into swh archive. """ visit_type = "deposit" def __init__( self, storage: StorageInterface, url: str, deposit_id: str, deposit_client: "ApiClient", max_content_size: Optional[int] = None, default_filename: str = "archive.tar", ): """Constructor Args: url: Origin url to associate the artifacts/metadata to deposit_id: Deposit identity deposit_client: Deposit api client """ super().__init__(storage=storage, url=url, max_content_size=max_content_size) self.deposit_id = deposit_id self.client = deposit_client self.default_filename = default_filename @classmethod def from_configfile(cls, **kwargs: Any): """Instantiate a loader from the configuration loaded from the SWH_CONFIG_FILENAME envvar, with potential extra keyword arguments if their value is not None. 
Args: kwargs: kwargs passed to the loader instantiation """ config = dict(load_from_envvar(DEFAULT_CONFIG)) config.update({k: v for k, v in kwargs.items() if v is not None}) deposit_client = ApiClient(**config.pop("deposit")) return cls.from_config(deposit_client=deposit_client, **config) def get_versions(self) -> Sequence[str]: # only 1 branch 'HEAD' with no alias since we only have 1 snapshot # branch return ["HEAD"] def get_metadata_authority(self) -> MetadataAuthority: provider = self.metadata()["provider"] assert provider["provider_type"] == MetadataAuthorityType.DEPOSIT_CLIENT.value return MetadataAuthority( type=MetadataAuthorityType.DEPOSIT_CLIENT, url=provider["provider_url"], metadata={ "name": provider["provider_name"], **(provider["metadata"] or {}), }, ) def get_metadata_fetcher(self) -> MetadataFetcher: tool = self.metadata()["tool"] return MetadataFetcher( name=tool["name"], version=tool["version"], metadata=tool["configuration"], ) def get_package_info( self, version: str ) -> Iterator[Tuple[str, DepositPackageInfo]]: p_info = DepositPackageInfo.from_metadata( self.metadata(), url=self.url, filename=self.default_filename, version=version, ) yield "HEAD", p_info def download_package( self, p_info: DepositPackageInfo, tmpdir: str ) -> List[Tuple[str, Mapping]]: """Override to allow use of the dedicated deposit client """ return [self.client.archive_get(self.deposit_id, tmpdir, p_info.filename)] def build_release( self, p_info: DepositPackageInfo, uncompressed_path: str, directory: Sha1Git, ) -> Optional[Release]: message = ( f"{p_info.client}: Deposit {p_info.id} in collection {p_info.collection}" ) if p_info.release_notes: message += "\n\n" + p_info.release_notes if not message.endswith("\n"): message += "\n" return Release( name=p_info.version.encode(), message=message.encode(), author=p_info.author, date=TimestampWithTimezone.from_dict(p_info.author_date), target=directory, target_type=ObjectType.DIRECTORY, synthetic=True, ) def 
get_extrinsic_origin_metadata(self) -> List[RawExtrinsicMetadataCore]: metadata = self.metadata() - metadata_raw: str = metadata["metadata_raw"] + raw_metadata: str = metadata["raw_metadata"] origin_metadata = json.dumps( { - "metadata": [metadata_raw], + "metadata": [raw_metadata], "provider": metadata["provider"], "tool": metadata["tool"], } ).encode() return [ RawExtrinsicMetadataCore( discovery_date=now(), - metadata=metadata_raw.encode(), + metadata=raw_metadata.encode(), format="sword-v2-atom-codemeta-v2", ), RawExtrinsicMetadataCore( discovery_date=now(), metadata=origin_metadata, format="original-artifacts-json", ), ] @cached_method def metadata(self): """Returns metadata from the deposit server""" return self.client.metadata_get(self.deposit_id) def load(self) -> Dict: # First making sure the deposit is known on the deposit's RPC server # prior to trigger a loading try: self.metadata() except ValueError: logger.error(f"Unknown deposit {self.deposit_id}, ignoring") return {"status": "failed"} # Then usual loading return super().load() def finalize_visit( self, status_visit: str, errors: Optional[List[str]] = None, **kwargs ) -> Dict[str, Any]: r = super().finalize_visit(status_visit=status_visit, **kwargs) success = status_visit == "full" # Update deposit status try: if not success: self.client.status_update( self.deposit_id, status="failed", errors=errors, ) return r snapshot_id = hash_to_bytes(r["snapshot_id"]) snapshot = snapshot_get_all_branches(self.storage, snapshot_id) if not snapshot: return r branches = snapshot.branches logger.debug("branches: %s", branches) if not branches: return r rel_id = branches[b"HEAD"].target release = self.storage.release_get([rel_id])[0] if not release: return r # update the deposit's status to success with its # release-id and directory-id self.client.status_update( self.deposit_id, status="done", release_id=hash_to_hex(rel_id), directory_id=hash_to_hex(release.target), snapshot_id=r["snapshot_id"], origin_url=self.url, 
) except Exception: logger.exception("Problem when trying to update the deposit's status") return {"status": "failed"} return r def parse_author(author) -> Person: """See prior fixme """ return Person( fullname=author["fullname"].encode("utf-8"), name=author["name"].encode("utf-8"), email=author["email"].encode("utf-8"), ) class ApiClient: """Private Deposit Api client """ def __init__(self, url, auth: Optional[Mapping[str, str]]): self.base_url = url.rstrip("/") self.auth = None if not auth else (auth["username"], auth["password"]) def do(self, method: str, url: str, *args, **kwargs): """Internal method to deal with requests, possibly with basic http authentication. Args: method (str): supported http methods as in get/post/put Returns: The request's execution output """ method_fn = getattr(requests, method) if self.auth: kwargs["auth"] = self.auth return method_fn(url, *args, **kwargs) def archive_get( self, deposit_id: Union[int, str], tmpdir: str, filename: str ) -> Tuple[str, Dict]: """Retrieve deposit's archive artifact locally """ url = f"{self.base_url}/{deposit_id}/raw/" return download(url, dest=tmpdir, filename=filename, auth=self.auth) def metadata_url(self, deposit_id: Union[int, str]) -> str: return f"{self.base_url}/{deposit_id}/meta/" def metadata_get(self, deposit_id: Union[int, str]) -> Dict[str, Any]: """Retrieve deposit's metadata artifact as json """ url = self.metadata_url(deposit_id) r = self.do("get", url) if r.ok: return r.json() msg = f"Problem when retrieving deposit metadata at {url}" logger.error(msg) raise ValueError(msg) def status_update( self, deposit_id: Union[int, str], status: str, errors: Optional[List[str]] = None, release_id: Optional[str] = None, directory_id: Optional[str] = None, snapshot_id: Optional[str] = None, origin_url: Optional[str] = None, ): """Update deposit's information including status, and persistent identifiers result of the loading. 
""" url = f"{self.base_url}/{deposit_id}/update/" payload: Dict[str, Any] = {"status": status} if release_id: payload["release_id"] = release_id if directory_id: payload["directory_id"] = directory_id if snapshot_id: payload["snapshot_id"] = snapshot_id if origin_url: payload["origin_url"] = origin_url if errors: payload["status_detail"] = {"loading": errors} self.do("put", url, json=payload) diff --git a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_666_meta b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_666_meta index 758fdd2..b76dc9e 100644 --- a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_666_meta +++ b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_666_meta @@ -1,51 +1,51 @@ { "origin": { "url": "https://hal-test.archives-ouvertes.fr/some-external-id", "type": "deposit" }, - "metadata_raw" : "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother one", + "raw_metadata" : "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother one", "provider": { "provider_name": "hal", "provider_type": "deposit_client", "provider_url": "https://hal-test.archives-ouvertes.fr/", "metadata": null }, "tool": { "name": "swh-deposit", "version": "0.0.1", "configuration": { "sword_version": "2" } }, "deposit": { "id": "666", "client": "hal", "collection": "hal", "author": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "author_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "committer": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "committer_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "revision_parents": [], "release_notes": 
null } } diff --git a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_777_meta b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_777_meta index 8b46bcd..1a2c258 100644 --- a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_777_meta +++ b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_777_meta @@ -1,51 +1,51 @@ { "origin": { "url": "https://hal-test.archives-ouvertes.fr/some-external-id", "type": "deposit" }, - "metadata_raw": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", + "raw_metadata": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", "provider": { "provider_name": "hal", "provider_type": "deposit_client", "provider_url": "https://hal-test.archives-ouvertes.fr/", "metadata": null }, "tool": { "name": "swh-deposit", "version": "0.0.1", "configuration": { "sword_version": "2" } }, "deposit": { "id": 777, "client": "hal", "collection": "hal", "author": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "author_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "committer": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "committer_date": { "timestamp": { "seconds": 1507474800, "microseconds": 0 }, "offset": 0 }, "revision_parents": [], "release_notes": null } } diff --git a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_888_meta b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_888_meta index 30cc188..fbf3272 100644 --- a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_888_meta +++ 
b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_888_meta @@ -1,51 +1,51 @@ { "origin": { "url": "https://hal-test.archives-ouvertes.fr/hal-123456", "type": "deposit" }, - "metadata_raw": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", + "raw_metadata": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", "provider": { "provider_name": "hal", "provider_type": "deposit_client", "provider_url": "https://hal-test.archives-ouvertes.fr/", "metadata": null }, "tool": { "name": "swh-deposit", "version": "0.0.1", "configuration": { "sword_version": "2" } }, "deposit": { "id": 888, "client": "hal", "collection": "hal", "author": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "author_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "committer": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "committer_date": { "timestamp": { "seconds": 1507474800, "microseconds": 0 }, "offset": 0 }, "revision_parents": [], "release_notes": null } } diff --git a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_999_meta b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_999_meta index bad1d1d..62de3de 100644 --- a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_999_meta +++ b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/1_private_999_meta @@ -1,51 +1,51 @@ { "origin": { "url": "https://hal-test.archives-ouvertes.fr/hal-123456", "type": "deposit" }, - "metadata_raw": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", + "raw_metadata": 
"some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", "provider": { "provider_name": "hal", "provider_type": "deposit_client", "provider_url": "https://hal-test.archives-ouvertes.fr/", "metadata": null }, "tool": { "name": "swh-deposit", "version": "0.0.1", "configuration": { "sword_version": "2" } }, "deposit": { "id": 999, "client": "hal", "collection": "hal", "author": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "author_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "committer": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "committer_date": { "timestamp": { "seconds": 1507474800, "microseconds": 0 }, "offset": 0 }, "revision_parents": [], "release_notes": "This release adds this and that." } } diff --git a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json index 758fdd2..b76dc9e 100644 --- a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json +++ b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json @@ -1,51 +1,51 @@ { "origin": { "url": "https://hal-test.archives-ouvertes.fr/some-external-id", "type": "deposit" }, - "metadata_raw" : "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother one", + "raw_metadata" : "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother one", "provider": { "provider_name": "hal", "provider_type": "deposit_client", "provider_url": "https://hal-test.archives-ouvertes.fr/", "metadata": null }, "tool": { "name": "swh-deposit", "version": "0.0.1", "configuration": { "sword_version": "2" 
} }, "deposit": { "id": "666", "client": "hal", "collection": "hal", "author": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "author_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "committer": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "committer_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "revision_parents": [], "release_notes": null } } diff --git a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.11.json b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.11.json index 8b46bcd..1a2c258 100644 --- a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.11.json +++ b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.11.json @@ -1,51 +1,51 @@ { "origin": { "url": "https://hal-test.archives-ouvertes.fr/some-external-id", "type": "deposit" }, - "metadata_raw": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", + "raw_metadata": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", "provider": { "provider_name": "hal", "provider_type": "deposit_client", "provider_url": "https://hal-test.archives-ouvertes.fr/", "metadata": null }, "tool": { "name": "swh-deposit", "version": "0.0.1", "configuration": { "sword_version": "2" } }, "deposit": { "id": 777, "client": "hal", "collection": "hal", "author": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "author_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "committer": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, 
"committer_date": { "timestamp": { "seconds": 1507474800, "microseconds": 0 }, "offset": 0 }, "revision_parents": [], "release_notes": null } } diff --git a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.12.json b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.12.json index 30cc188..fbf3272 100644 --- a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.12.json +++ b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.12.json @@ -1,51 +1,51 @@ { "origin": { "url": "https://hal-test.archives-ouvertes.fr/hal-123456", "type": "deposit" }, - "metadata_raw": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", + "raw_metadata": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", "provider": { "provider_name": "hal", "provider_type": "deposit_client", "provider_url": "https://hal-test.archives-ouvertes.fr/", "metadata": null }, "tool": { "name": "swh-deposit", "version": "0.0.1", "configuration": { "sword_version": "2" } }, "deposit": { "id": 888, "client": "hal", "collection": "hal", "author": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "author_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "committer": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "committer_date": { "timestamp": { "seconds": 1507474800, "microseconds": 0 }, "offset": 0 }, "revision_parents": [], "release_notes": null } } diff --git a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.13.json b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.13.json index bad1d1d..62de3de 100644 --- 
a/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.13.json +++ b/swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.13.json @@ -1,51 +1,51 @@ { "origin": { "url": "https://hal-test.archives-ouvertes.fr/hal-123456", "type": "deposit" }, - "metadata_raw": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", + "raw_metadata": "some-external-idhttps://hal-test.archives-ouvertes.fr/some-external-id2017-10-07T15:17:08Zsome awesome authoranother oneno one", "provider": { "provider_name": "hal", "provider_type": "deposit_client", "provider_url": "https://hal-test.archives-ouvertes.fr/", "metadata": null }, "tool": { "name": "swh-deposit", "version": "0.0.1", "configuration": { "sword_version": "2" } }, "deposit": { "id": 999, "client": "hal", "collection": "hal", "author": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "author_date": { "timestamp": { "seconds": 1507389428, "microseconds": 0 }, "offset": 0 }, "committer": { "name": "Software Heritage", "fullname": "Software Heritage", "email": "robot@softwareheritage.org" }, "committer_date": { "timestamp": { "seconds": 1507474800, "microseconds": 0 }, "offset": 0 }, "revision_parents": [], "release_notes": "This release adds this and that." 
} } diff --git a/swh/loader/package/deposit/tests/test_deposit.py b/swh/loader/package/deposit/tests/test_deposit.py index 7ee0e4b..64476a4 100644 --- a/swh/loader/package/deposit/tests/test_deposit.py +++ b/swh/loader/package/deposit/tests/test_deposit.py @@ -1,557 +1,557 @@ # Copyright (C) 2019-2021 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information import datetime import json import re import pytest from swh.core.pytest_plugin import requests_mock_datadir_factory from swh.loader.package.deposit.loader import ApiClient, DepositLoader from swh.loader.package.loader import now from swh.loader.tests import assert_last_visit_matches, check_snapshot, get_stats from swh.model.hashutil import hash_to_bytes, hash_to_hex from swh.model.model import ( Origin, Person, RawExtrinsicMetadata, Release, Snapshot, SnapshotBranch, TargetType, TimestampWithTimezone, ) from swh.model.model import MetadataAuthority, MetadataAuthorityType, MetadataFetcher from swh.model.model import ObjectType as ModelObjectType from swh.model.swhids import CoreSWHID, ExtendedObjectType, ExtendedSWHID, ObjectType DEPOSIT_URL = "https://deposit.softwareheritage.org/1/private" @pytest.fixture def requests_mock_datadir(requests_mock_datadir): """Enhance default mock data to mock put requests as the loader does some internal update queries there. 
""" requests_mock_datadir.put(re.compile("https")) return requests_mock_datadir def test_deposit_init_ok(swh_storage, deposit_client, swh_loader_config): url = "some-url" deposit_id = 999 loader = DepositLoader( swh_storage, url, deposit_id, deposit_client, default_filename="archive.zip" ) # Something that does not exist assert loader.url == url assert loader.client is not None assert loader.client.base_url == swh_loader_config["deposit"]["url"] def test_deposit_from_configfile(swh_config): """Ensure the deposit instantiation is ok """ loader = DepositLoader.from_configfile( url="some-url", deposit_id="666", default_filename="archive.zip" ) assert isinstance(loader.client, ApiClient) def test_deposit_loading_unknown_deposit( swh_storage, deposit_client, requests_mock_datadir ): """Loading an unknown deposit should fail no origin, no visit, no snapshot """ # private api url form: 'https://deposit.s.o/1/private/hal/666/raw/' url = "some-url" unknown_deposit_id = 667 loader = DepositLoader( swh_storage, url, unknown_deposit_id, deposit_client, default_filename="archive.zip", ) # does not exist actual_load_status = loader.load() assert actual_load_status == {"status": "failed"} stats = get_stats(loader.storage) assert { "content": 0, "directory": 0, "origin": 0, "origin_visit": 0, "release": 0, "revision": 0, "skipped_content": 0, "snapshot": 0, } == stats requests_mock_datadir_missing_one = requests_mock_datadir_factory( ignore_urls=[f"{DEPOSIT_URL}/666/raw/",] ) def test_deposit_loading_failure_to_retrieve_1_artifact( swh_storage, deposit_client, requests_mock_datadir_missing_one ): """Deposit with missing artifact ends up with an uneventful/partial visit """ # private api url form: 'https://deposit.s.o/1/private/hal/666/raw/' url = "some-url-2" deposit_id = 666 requests_mock_datadir_missing_one.put(re.compile("https")) loader = DepositLoader( swh_storage, url, deposit_id, deposit_client, default_filename="archive.zip" ) actual_load_status = loader.load() assert 
actual_load_status["status"] == "uneventful" assert actual_load_status["snapshot_id"] is not None assert_last_visit_matches(loader.storage, url, status="partial", type="deposit") stats = get_stats(loader.storage) assert { "content": 0, "directory": 0, "origin": 1, "origin_visit": 1, "release": 0, "revision": 0, "skipped_content": 0, "snapshot": 1, } == stats # Retrieve the information for deposit status update query to the deposit urls = [ m for m in requests_mock_datadir_missing_one.request_history if m.url == f"{DEPOSIT_URL}/{deposit_id}/update/" ] assert len(urls) == 1 update_query = urls[0] body = update_query.json() expected_body = { "status": "failed", "status_detail": { "loading": [ "Failed to load branch HEAD for some-url-2: Fail to query " "'https://deposit.softwareheritage.org/1/private/666/raw/'. Reason: 404" ] }, } assert body == expected_body def test_deposit_loading_ok(swh_storage, deposit_client, requests_mock_datadir): url = "https://hal-test.archives-ouvertes.fr/some-external-id" deposit_id = 666 loader = DepositLoader( swh_storage, url, deposit_id, deposit_client, default_filename="archive.zip" ) actual_load_status = loader.load() expected_snapshot_id = "338b45d87e02fb5cbf324694bc4a898623d6a30f" assert actual_load_status == { "status": "eventful", "snapshot_id": expected_snapshot_id, } assert_last_visit_matches( loader.storage, url, status="full", type="deposit", snapshot=hash_to_bytes(expected_snapshot_id), ) release_id_hex = "2566a64a27bc00362e265be9666d7606750530a1" release_id = hash_to_bytes(release_id_hex) expected_snapshot = Snapshot( id=hash_to_bytes(expected_snapshot_id), branches={ b"HEAD": SnapshotBranch(target=release_id, target_type=TargetType.RELEASE,), }, ) check_snapshot(expected_snapshot, storage=loader.storage) release = loader.storage.release_get([release_id])[0] date = TimestampWithTimezone.from_datetime( datetime.datetime(2017, 10, 7, 15, 17, 8, tzinfo=datetime.timezone.utc) ) person = Person( fullname=b"Software Heritage", 
name=b"Software Heritage", email=b"robot@softwareheritage.org", ) assert release == Release( id=release_id, name=b"HEAD", message=b"hal: Deposit 666 in collection hal\n", author=person, date=date, target_type=ModelObjectType.DIRECTORY, target=b"\xfd-\xf1-\xc5SL\x1d\xa1\xe9\x18\x0b\x91Q\x02\xfbo`\x1d\x19", synthetic=True, metadata=None, ) # check metadata fetcher = MetadataFetcher(name="swh-deposit", version="0.0.1",) authority = MetadataAuthority( type=MetadataAuthorityType.DEPOSIT_CLIENT, url="https://hal-test.archives-ouvertes.fr/", ) # Check origin metadata orig_meta = loader.storage.raw_extrinsic_metadata_get( Origin(url).swhid(), authority ) assert orig_meta.next_page_token is None raw_meta = loader.client.metadata_get(deposit_id) - metadata_raw: str = raw_meta["metadata_raw"] + raw_metadata: str = raw_meta["raw_metadata"] # 2 raw metadata xml + 1 json dict assert len(orig_meta.results) == 2 orig_meta0 = orig_meta.results[0] assert orig_meta0.authority == authority assert orig_meta0.fetcher == fetcher # Check directory metadata assert release.target_type == ModelObjectType.DIRECTORY directory_swhid = CoreSWHID( object_type=ObjectType.DIRECTORY, object_id=release.target ) actual_dir_meta = loader.storage.raw_extrinsic_metadata_get( directory_swhid, authority ) assert actual_dir_meta.next_page_token is None assert len(actual_dir_meta.results) == 1 dir_meta = actual_dir_meta.results[0] assert dir_meta.authority == authority assert dir_meta.fetcher == fetcher - assert dir_meta.metadata.decode() == metadata_raw + assert dir_meta.metadata.decode() == raw_metadata # Retrieve the information for deposit status update query to the deposit urls = [ m for m in requests_mock_datadir.request_history if m.url == f"{DEPOSIT_URL}/{deposit_id}/update/" ] assert len(urls) == 1 update_query = urls[0] body = update_query.json() expected_body = { "status": "done", "release_id": release_id_hex, "directory_id": hash_to_hex(release.target), "snapshot_id": expected_snapshot_id, 
"origin_url": url, } assert body == expected_body stats = get_stats(loader.storage) assert { "content": 303, "directory": 12, "origin": 1, "origin_visit": 1, "release": 1, "revision": 0, "skipped_content": 0, "snapshot": 1, } == stats def test_deposit_loading_ok_2(swh_storage, deposit_client, requests_mock_datadir): """Field dates should be se appropriately """ external_id = "some-external-id" url = f"https://hal-test.archives-ouvertes.fr/{external_id}" deposit_id = 777 loader = DepositLoader( swh_storage, url, deposit_id, deposit_client, default_filename="archive.zip" ) actual_load_status = loader.load() expected_snapshot_id = "3449b8ff31abeacefd33cca60e3074c1649dc3a1" assert actual_load_status == { "status": "eventful", "snapshot_id": expected_snapshot_id, } assert_last_visit_matches( loader.storage, url, status="full", type="deposit", snapshot=hash_to_bytes(expected_snapshot_id), ) release_id = "ba6c9a59ae3256e765d32b211cc183dc2380aed7" expected_snapshot = Snapshot( id=hash_to_bytes(expected_snapshot_id), branches={ b"HEAD": SnapshotBranch( target=hash_to_bytes(release_id), target_type=TargetType.RELEASE ) }, ) check_snapshot(expected_snapshot, storage=loader.storage) raw_meta = loader.client.metadata_get(deposit_id) # Ensure the date fields are set appropriately in the release # Retrieve the release release = loader.storage.release_get([hash_to_bytes(release_id)])[0] assert release # swh-deposit uses the numeric 'offset_minutes' instead of the bytes offset # attribute, because its dates are always well-formed, and it can only send # JSON-serializable data. 
release_date_dict = { "timestamp": release.date.timestamp.to_dict(), "offset": release.date.offset_minutes(), } assert release_date_dict == raw_meta["deposit"]["author_date"] assert not release.metadata provider = { "provider_name": "hal", "provider_type": "deposit_client", "provider_url": "https://hal-test.archives-ouvertes.fr/", "metadata": None, } tool = { "name": "swh-deposit", "version": "0.0.1", "configuration": {"sword_version": "2"}, } fetcher = MetadataFetcher(name="swh-deposit", version="0.0.1",) authority = MetadataAuthority( type=MetadataAuthorityType.DEPOSIT_CLIENT, url="https://hal-test.archives-ouvertes.fr/", ) # Check the origin metadata swh side origin_extrinsic_metadata = loader.storage.raw_extrinsic_metadata_get( Origin(url).swhid(), authority ) assert origin_extrinsic_metadata.next_page_token is None - metadata_raw: str = raw_meta["metadata_raw"] + raw_metadata: str = raw_meta["raw_metadata"] # 1 raw metadata xml + 1 json dict assert len(origin_extrinsic_metadata.results) == 2 origin_swhid = Origin(url).swhid() expected_metadata = [] origin_meta = origin_extrinsic_metadata.results[0] expected_metadata.append( RawExtrinsicMetadata( target=origin_swhid, discovery_date=origin_meta.discovery_date, - metadata=metadata_raw.encode(), + metadata=raw_metadata.encode(), format="sword-v2-atom-codemeta-v2", authority=authority, fetcher=fetcher, ) ) origin_metadata = { - "metadata": [metadata_raw], + "metadata": [raw_metadata], "provider": provider, "tool": tool, } expected_metadata.append( RawExtrinsicMetadata( target=origin_swhid, discovery_date=origin_extrinsic_metadata.results[-1].discovery_date, metadata=json.dumps(origin_metadata).encode(), format="original-artifacts-json", authority=authority, fetcher=fetcher, ) ) assert sorted(origin_extrinsic_metadata.results) == sorted(expected_metadata) # Check the release metadata swh side assert release.target_type == ModelObjectType.DIRECTORY directory_swhid = ExtendedSWHID( 
object_type=ExtendedObjectType.DIRECTORY, object_id=release.target ) actual_directory_metadata = loader.storage.raw_extrinsic_metadata_get( directory_swhid, authority ) assert actual_directory_metadata.next_page_token is None assert len(actual_directory_metadata.results) == 1 release_swhid = CoreSWHID( object_type=ObjectType.RELEASE, object_id=hash_to_bytes(release_id) ) dir_metadata_template = RawExtrinsicMetadata( target=directory_swhid, format="sword-v2-atom-codemeta-v2", authority=authority, fetcher=fetcher, origin=url, release=release_swhid, # to satisfy the constructor discovery_date=now(), metadata=b"", ) expected_directory_metadata = [] dir_metadata = actual_directory_metadata.results[0] expected_directory_metadata.append( RawExtrinsicMetadata.from_dict( { **{ k: v for (k, v) in dir_metadata_template.to_dict().items() if k != "id" }, "discovery_date": dir_metadata.discovery_date, - "metadata": metadata_raw.encode(), + "metadata": raw_metadata.encode(), } ) ) assert sorted(actual_directory_metadata.results) == sorted( expected_directory_metadata ) # Retrieve the information for deposit status update query to the deposit urls = [ m for m in requests_mock_datadir.request_history if m.url == f"{DEPOSIT_URL}/{deposit_id}/update/" ] assert len(urls) == 1 update_query = urls[0] body = update_query.json() expected_body = { "status": "done", "release_id": release_id, "directory_id": hash_to_hex(release.target), "snapshot_id": expected_snapshot_id, "origin_url": url, } assert body == expected_body def test_deposit_loading_ok_3(swh_storage, deposit_client, requests_mock_datadir): """Deposit loading can happen on tarball artifacts as well The latest deposit changes introduce the internal change. 
""" external_id = "hal-123456" url = f"https://hal-test.archives-ouvertes.fr/{external_id}" deposit_id = 888 loader = DepositLoader(swh_storage, url, deposit_id, deposit_client) actual_load_status = loader.load() expected_snapshot_id = "4677843de89e398f1d6bfedc9ca9b89c451c55c8" assert actual_load_status == { "status": "eventful", "snapshot_id": expected_snapshot_id, } assert_last_visit_matches( loader.storage, url, status="full", type="deposit", snapshot=hash_to_bytes(expected_snapshot_id), ) def test_deposit_loading_ok_release_notes( swh_storage, deposit_client, requests_mock_datadir ): url = "https://hal-test.archives-ouvertes.fr/some-external-id" deposit_id = 999 loader = DepositLoader( swh_storage, url, deposit_id, deposit_client, default_filename="archive.zip" ) actual_load_status = loader.load() expected_snapshot_id = "a307acffb7c29bebb3daf1bcb680bb3f452890a8" assert actual_load_status == { "status": "eventful", "snapshot_id": expected_snapshot_id, } assert_last_visit_matches( loader.storage, url, status="full", type="deposit", snapshot=hash_to_bytes(expected_snapshot_id), ) release_id_hex = "f5e8ec02ede57edbe061afa7fc2a07bb7d14a700" release_id = hash_to_bytes(release_id_hex) expected_snapshot = Snapshot( id=hash_to_bytes(expected_snapshot_id), branches={ b"HEAD": SnapshotBranch(target=release_id, target_type=TargetType.RELEASE,), }, ) check_snapshot(expected_snapshot, storage=loader.storage) release = loader.storage.release_get([release_id])[0] date = TimestampWithTimezone.from_datetime( datetime.datetime(2017, 10, 7, 15, 17, 8, tzinfo=datetime.timezone.utc) ) person = Person( fullname=b"Software Heritage", name=b"Software Heritage", email=b"robot@softwareheritage.org", ) assert release == Release( id=release_id, name=b"HEAD", message=( b"hal: Deposit 999 in collection hal\n\nThis release adds this and that.\n" ), author=person, date=date, target_type=ModelObjectType.DIRECTORY, target=b"\xfd-\xf1-\xc5SL\x1d\xa1\xe9\x18\x0b\x91Q\x02\xfbo`\x1d\x19", 
synthetic=True, metadata=None, ) diff --git a/swh/loader/package/npm/loader.py b/swh/loader/package/npm/loader.py index a302cbf..b082a0f 100644 --- a/swh/loader/package/npm/loader.py +++ b/swh/loader/package/npm/loader.py @@ -1,296 +1,309 @@ # Copyright (C) 2019-2021 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information from codecs import BOM_UTF8 import json import logging import os +import string from typing import Any, Dict, Iterator, List, Optional, Sequence, Tuple, Union from urllib.parse import quote import attr import chardet from swh.loader.package.loader import ( BasePackageInfo, PackageLoader, - PartialExtID, RawExtrinsicMetadataCore, ) from swh.loader.package.utils import api_info, cached_method, release_name -from swh.model.hashutil import hash_to_bytes from swh.model.model import ( MetadataAuthority, MetadataAuthorityType, ObjectType, Person, Release, Sha1Git, TimestampWithTimezone, ) from swh.storage.interface import StorageInterface logger = logging.getLogger(__name__) EMPTY_PERSON = Person.from_fullname(b"") -EXTID_TYPE = "npm-archive-sha1" -EXTID_VERSION = 0 - - @attr.s class NpmPackageInfo(BasePackageInfo): raw_info = attr.ib(type=Dict[str, Any]) + package_name = attr.ib(type=str) date = attr.ib(type=Optional[str]) shasum = attr.ib(type=str) """sha1 checksum""" + # we cannot rely only on $shasum, as it is technically possible for two versions + # of the same package to have the exact same tarball. + # But the release data (message and date) are extrinsic to the content of the + # package, so they differ between versions. + # So we need every attribute used to build the release object to be part of the + # manifest. 
+ MANIFEST_FORMAT = string.Template( + "date $date\nname $package_name\nshasum $shasum\nurl $url\nversion $version" + ) + EXTID_TYPE = "npm-manifest-sha256" + EXTID_VERSION = 0 + @classmethod def from_metadata( cls, project_metadata: Dict[str, Any], version: str ) -> "NpmPackageInfo": package_metadata = project_metadata["versions"][version] url = package_metadata["dist"]["tarball"] + assert package_metadata["name"] == project_metadata["name"] + # No date available in intrinsic metadata: retrieve it from the API # metadata, using the version number that the API claims this package # has. extrinsic_version = package_metadata["version"] if "time" in project_metadata: date = project_metadata["time"][extrinsic_version] elif "mtime" in package_metadata: date = package_metadata["mtime"] else: date = None return cls( + package_name=package_metadata["name"], url=url, filename=os.path.basename(url), date=date, shasum=package_metadata["dist"]["shasum"], version=extrinsic_version, raw_info=package_metadata, directory_extrinsic_metadata=[ RawExtrinsicMetadataCore( format="replicate-npm-package-json", metadata=json.dumps(package_metadata).encode(), ) ], ) - def extid(self) -> PartialExtID: - return (EXTID_TYPE, EXTID_VERSION, hash_to_bytes(self.shasum)) - class NpmLoader(PackageLoader[NpmPackageInfo]): """Load npm origin's artifact releases into swh archive. """ visit_type = "npm" def __init__( self, storage: StorageInterface, url: str, max_content_size: Optional[int] = None, ): """Constructor Args str: origin url (e.g. 
https://www.npmjs.com/package/) """ super().__init__(storage=storage, url=url, max_content_size=max_content_size) self.package_name = url.split("https://www.npmjs.com/package/")[1] safe_name = quote(self.package_name, safe="") self.provider_url = f"https://replicate.npmjs.com/{safe_name}/" self._info: Dict[str, Any] = {} self._versions = None @cached_method def _raw_info(self) -> bytes: return api_info(self.provider_url) @cached_method def info(self) -> Dict: """Return the project metadata information (fetched from npm registry) """ return json.loads(self._raw_info()) def get_versions(self) -> Sequence[str]: return sorted(list(self.info()["versions"].keys())) def get_default_version(self) -> str: return self.info()["dist-tags"].get("latest", "") def get_metadata_authority(self): return MetadataAuthority( type=MetadataAuthorityType.FORGE, url="https://npmjs.com/", metadata={}, ) def get_package_info(self, version: str) -> Iterator[Tuple[str, NpmPackageInfo]]: p_info = NpmPackageInfo.from_metadata( project_metadata=self.info(), version=version ) yield release_name(version), p_info def build_release( self, p_info: NpmPackageInfo, uncompressed_path: str, directory: Sha1Git ) -> Optional[Release]: + # Metadata from NPM is not intrinsic to tarballs. + # This means two package versions can have the same tarball, but different + # metadata. To avoid mixing up releases, every field used to build the + # release object must be part of NpmPackageInfo.MANIFEST_FORMAT. 
i_metadata = extract_intrinsic_metadata(uncompressed_path) if not i_metadata: return None author = extract_npm_package_author(i_metadata) + assert self.package_name == p_info.package_name msg = ( - f"Synthetic release for NPM source package {self.package_name} " + f"Synthetic release for NPM source package {p_info.package_name} " f"version {p_info.version}\n" ) if p_info.date is None: url = p_info.url artifact_name = os.path.basename(url) raise ValueError( "Origin %s: Cannot determine upload time for artifact %s." % (p_info.url, artifact_name) ) date = TimestampWithTimezone.from_iso8601(p_info.date) # FIXME: this is to remain bug-compatible with earlier versions: date = attr.evolve(date, timestamp=attr.evolve(date.timestamp, microseconds=0)) r = Release( name=p_info.version.encode(), message=msg.encode(), author=author, date=date, target=directory, target_type=ObjectType.DIRECTORY, synthetic=True, ) return r def _author_str(author_data: Union[Dict, List, str]) -> str: """Parse author from package.json author fields """ if isinstance(author_data, dict): author_str = "" name = author_data.get("name") if name is not None: if isinstance(name, str): author_str += name elif isinstance(name, list): author_str += _author_str(name[0]) if len(name) > 0 else "" email = author_data.get("email") if email is not None: author_str += f" <{email}>" result = author_str elif isinstance(author_data, list): result = _author_str(author_data[0]) if len(author_data) > 0 else "" else: result = author_data return result def extract_npm_package_author(package_json: Dict[str, Any]) -> Person: """ Extract package author from a ``package.json`` file content and return it in swh format. 
Args: package_json: Dict holding the content of parsed ``package.json`` file Returns: Person """ for author_key in ("author", "authors"): if author_key in package_json: author_data = package_json[author_key] if author_data is None: return EMPTY_PERSON author_str = _author_str(author_data) return Person.from_fullname(author_str.encode()) return EMPTY_PERSON def _lstrip_bom(s, bom=BOM_UTF8): if s.startswith(bom): return s[len(bom) :] else: return s def load_json(json_bytes): """ Try to load JSON from bytes and return a dictionary. First try to decode from utf-8. If the decoding failed, try to detect the encoding and decode again with replace error handling. If JSON is malformed, an empty dictionary will be returned. Args: json_bytes (bytes): binary content of a JSON file Returns: dict: JSON data loaded in a dictionary """ json_data = {} try: json_str = _lstrip_bom(json_bytes).decode("utf-8") except UnicodeDecodeError: encoding = chardet.detect(json_bytes)["encoding"] if encoding: json_str = json_bytes.decode(encoding, "replace") try: json_data = json.loads(json_str) except json.decoder.JSONDecodeError: pass return json_data def extract_intrinsic_metadata(dir_path: str) -> Dict: """Given an uncompressed path holding the pkginfo file, returns a pkginfo parsed structure as a dict. The release artifact contains at their root one folder. For example: $ tar tvf zprint-0.0.6.tar.gz drwxr-xr-x root/root 0 2018-08-22 11:01 zprint-0.0.6/ ... Args: dir_path (str): Path to the uncompressed directory representing a release artifact from npm. Returns: the pkginfo parsed structure as a dict if any or None if none was present. 
""" # Retrieve the root folder of the archive if not os.path.exists(dir_path): return {} lst = os.listdir(dir_path) if len(lst) == 0: return {} project_dirname = lst[0] package_json_path = os.path.join(dir_path, project_dirname, "package.json") if not os.path.exists(package_json_path): return {} with open(package_json_path, "rb") as package_json_file: package_json_bytes = package_json_file.read() return load_json(package_json_bytes) diff --git a/swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.0.3-beta.tgz b/swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.0.3-beta.tgz new file mode 100644 index 0000000..bc20daa Binary files /dev/null and b/swh/loader/package/npm/tests/data/https_registry.npmjs.org/org_-_org-0.0.3-beta.tgz differ diff --git a/swh/loader/package/npm/tests/data/https_replicate.npmjs.com/org_version_mismatch b/swh/loader/package/npm/tests/data/https_replicate.npmjs.com/org_version_mismatch new file mode 100644 index 0000000..fc08add --- /dev/null +++ b/swh/loader/package/npm/tests/data/https_replicate.npmjs.com/org_version_mismatch @@ -0,0 +1,141 @@ +{ + "_id": "org_version_mismatch", + "_rev": "4-22484cc537f12d3023241211ee34e39d", + "name": "org_version_mismatch", + "description": "A parser and converter for org-mode notation", + "dist-tags": { + "latest": "0.0.3" + }, + "versions": { + "0.0.3-beta": { + "name": "org_version_mismatch", + "description": "A parser and converter for org-mode notation", + "homepage": "http://mooz.github.com/org-js", + "keywords": [ + "org-mode", + "emacs", + "parser" + ], + "author": { + "name": "mooz", + "email": "stillpedant@gmail.com" + }, + "main": "./lib/org.js", + "version": "0.0.3-beta", + "directories": { + "test": "./tests" + }, + "repository": { + "type": "git", + "url": "git://github.com/mooz/org-js.git" + }, + "bugs": { + "url": "https://github.com/mooz/org-js/issues" + }, + "_id": "org@0.0.3-beta", + "dist": { + "shasum": "6a44220f88903a6dfc3b47d010238058f9faf3a0", 
+ "tarball": "https://registry.npmjs.org/org/-/org-0.0.3-beta.tgz" + }, + "_from": ".", + "_npmVersion": "1.2.25", + "_npmUser": { + "name": "mooz", + "email": "stillpedant@gmail.com" + }, + "maintainers": [ + { + "name": "mooz", + "email": "stillpedant@gmail.com" + } + ] + }, + "0.0.3": { + "name": "org_version_mismatch", + "description": "A parser and converter for org-mode notation", + "homepage": "http://mooz.github.com/org-js", + "bugs": { + "url": "http://github.com/mooz/org-s/issues" + }, + "keywords": [ + "org-mode", + "emacs", + "parser" + ], + "author": { + "name": "Masafumi Oyamada", + "email": "stillpedant@gmail.com", + "url": "http://mooz.github.io/" + }, + "licenses": [ + { + "type": "MIT" + } + ], + "main": "./lib/org.js", + "version": "0.0.3", + "directories": { + "test": "./tests" + }, + "repository": { + "type": "git", + "url": "git://github.com/mooz/org-js.git" + }, + "_id": "org@0.0.3", + "dist": { + "shasum": "6a44220f88903a6dfc3b47d010238058f9faf3a0", + "tarball": "https://registry.npmjs.org/org/-/org-0.0.3.tgz" + }, + "_from": ".", + "_npmVersion": "1.2.25", + "_npmUser": { + "name": "mooz", + "email": "stillpedant@gmail.com" + }, + "maintainers": [ + { + "name": "mooz", + "email": "stillpedant@gmail.com" + } + ] + } + }, + "readme": "org-js\n======\n\nParser and converter for org-mode () notation written in JavaScript.\n\nInteractive Editor\n------------------\n\nFor working example, see http://mooz.github.com/org-js/editor/.\n\nInstallation\n------------\n\n npm install org\n\nSimple example of org -> HTML conversion\n----------------------------------------\n\n```javascript\nvar org = require(\"org\");\n\nvar parser = new org.Parser();\nvar orgDocument = parser.parse(orgCode);\nvar orgHTMLDocument = orgDocument.convert(org.ConverterHTML, {\n headerOffset: 1,\n exportFromLineNumber: false,\n suppressSubScriptHandling: false,\n suppressAutoLink: false\n});\n\nconsole.dir(orgHTMLDocument); // => { title, contentHTML, tocHTML, toc 
}\nconsole.log(orgHTMLDocument.toString()) // => Rendered HTML\n```\n\nWriting yet another converter\n-----------------------------\n\nSee `lib/org/converter/html.js`.\n", + "maintainers": [ + { + "name": "mooz", + "email": "stillpedant@gmail.com" + } + ], + "time": { + "modified": "2019-01-05T01:37:44Z", + "created": "2014-01-01T15:40:31Z", + "0.0.3-beta": "2014-01-01T15:40:33Z", + "0.0.3": "2014-01-01T15:55:45Z" + }, + "author": { + "name": "Masafumi Oyamada", + "email": "stillpedant@gmail.com", + "url": "http://mooz.github.io/" + }, + "repository": { + "type": "git", + "url": "git://github.com/mooz/org-js.git" + }, + "users": { + "nak2k": true, + "bgschaid": true, + "422665vijay": true, + "nontau": true + }, + "homepage": "http://mooz.github.com/org-js", + "keywords": [ + "org-mode", + "emacs", + "parser" + ], + "bugs": { + "url": "http://github.com/mooz/org-s/issues" + }, + "readmeFilename": "README.md" +} diff --git a/swh/loader/package/npm/tests/test_npm.py b/swh/loader/package/npm/tests/test_npm.py index d00b9a9..63e5924 100644 --- a/swh/loader/package/npm/tests/test_npm.py +++ b/swh/loader/package/npm/tests/test_npm.py @@ -1,641 +1,729 @@ # Copyright (C) 2019-2021 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information import datetime import json import os import pytest from swh.loader.package import __version__ from swh.loader.package.npm.loader import ( NpmLoader, _author_str, extract_npm_package_author, ) from swh.loader.tests import assert_last_visit_matches, check_snapshot, get_stats from swh.model.hashutil import hash_to_bytes from swh.model.model import ( Person, RawExtrinsicMetadata, Release, Snapshot, SnapshotBranch, TargetType, TimestampWithTimezone, ) from swh.model.model import MetadataAuthority, MetadataAuthorityType, MetadataFetcher from swh.model.model import ObjectType as 
ModelObjectType from swh.model.swhids import CoreSWHID, ExtendedObjectType, ExtendedSWHID, ObjectType from swh.storage.interface import PagedResult @pytest.fixture def org_api_info(datadir) -> bytes: with open(os.path.join(datadir, "https_replicate.npmjs.com", "org"), "rb",) as f: return f.read() def test_npm_author_str(): for author, expected_author in [ ("author", "author"), ( ["Al from quantum leap", "hal from 2001 space odyssey"], "Al from quantum leap", ), ([], ""), ({"name": "groot", "email": "groot@galaxy.org",}, "groot <groot@galaxy.org>"), ({"name": "somebody",}, "somebody"), ({"email": "no@one.org"}, " <no@one.org>"), # note first elt is an extra blank ({"name": "no one", "email": None,}, "no one"), ({"email": None,}, ""), ({"name": None}, ""), ({"name": None, "email": None,}, ""), ({}, ""), (None, None), ({"name": []}, "",), ( {"name": ["Susan McSween", "William H. Bonney", "Doc Scurlock",]}, "Susan McSween", ), (None, None), ]: assert _author_str(author) == expected_author def test_npm_extract_npm_package_author(datadir): package_metadata_filepath = os.path.join( datadir, "https_replicate.npmjs.com", "org_visit1" ) with open(package_metadata_filepath) as json_file: package_metadata = json.load(json_file) assert extract_npm_package_author(package_metadata["versions"]["0.0.2"]) == Person( fullname=b"mooz <stillpedant@gmail.com>", name=b"mooz", email=b"stillpedant@gmail.com", ) assert extract_npm_package_author(package_metadata["versions"]["0.0.3"]) == Person( fullname=b"Masafumi Oyamada <stillpedant@gmail.com>", name=b"Masafumi Oyamada", email=b"stillpedant@gmail.com", ) package_json = json.loads( """ { "name": "highlightjs-line-numbers.js", "version": "2.7.0", "description": "Highlight.js line numbers plugin.", "main": "src/highlightjs-line-numbers.js", "dependencies": {}, "devDependencies": { "gulp": "^4.0.0", "gulp-rename": "^1.4.0", "gulp-replace": "^0.6.1", "gulp-uglify": "^1.2.0" }, "repository": { "type": "git", "url": "https://github.com/wcoder/highlightjs-line-numbers.js.git" }, "author": "Yauheni Pakala <evgeniy.pakalo@gmail.com>", "license": "MIT", 
"bugs": { "url": "https://github.com/wcoder/highlightjs-line-numbers.js/issues" }, "homepage": "http://wcoder.github.io/highlightjs-line-numbers.js/" }""" ) assert extract_npm_package_author(package_json) == Person( fullname=b"Yauheni Pakala <evgeniy.pakalo@gmail.com>", name=b"Yauheni Pakala", email=b"evgeniy.pakalo@gmail.com", ) package_json = json.loads( """ { "name": "3-way-diff", "version": "0.0.1", "description": "3-way diffing of JavaScript objects", "main": "index.js", "authors": [ { "name": "Shawn Walsh", "url": "https://github.com/shawnpwalsh" }, { "name": "Markham F Rollins IV", "url": "https://github.com/mrollinsiv" } ], "keywords": [ "3-way diff", "3 way diff", "three-way diff", "three way diff" ], "devDependencies": { "babel-core": "^6.20.0", "babel-preset-es2015": "^6.18.0", "mocha": "^3.0.2" }, "dependencies": { "lodash": "^4.15.0" } }""" ) assert extract_npm_package_author(package_json) == Person( fullname=b"Shawn Walsh", name=b"Shawn Walsh", email=None ) package_json = json.loads( """ { "name": "yfe-ynpm", "version": "1.0.0", "homepage": "http://gitlab.ywwl.com/yfe/yfe-ynpm", "repository": { "type": "git", "url": "git@gitlab.ywwl.com:yfe/yfe-ynpm.git" }, "author": [ "fengmk2 <fengmk2@gmail.com> (https://fengmk2.com)", "xufuzi (https://7993.org)" ], "license": "MIT" }""" ) assert extract_npm_package_author(package_json) == Person( fullname=b"fengmk2 <fengmk2@gmail.com> (https://fengmk2.com)", name=b"fengmk2", email=b"fengmk2@gmail.com", ) package_json = json.loads( """ { "name": "umi-plugin-whale", "version": "0.0.8", "description": "Internal contract component", "authors": { "name": "xiaohuoni", "email": "448627663@qq.com" }, "repository": "alitajs/whale", "devDependencies": { "np": "^3.0.4", "umi-tools": "*" }, "license": "MIT" }""" ) assert extract_npm_package_author(package_json) == Person( fullname=b"xiaohuoni <448627663@qq.com>", name=b"xiaohuoni", email=b"448627663@qq.com", ) package_json_no_authors = json.loads( """{ "authors": null, "license": "MIT" }""" ) assert 
extract_npm_package_author(package_json_no_authors) == Person.from_fullname( b"" ) def normalize_hashes(hashes): if isinstance(hashes, str): return hash_to_bytes(hashes) if isinstance(hashes, list): return [hash_to_bytes(x) for x in hashes] return {hash_to_bytes(k): hash_to_bytes(v) for k, v in hashes.items()} _expected_new_contents_first_visit = normalize_hashes( [ "4ce3058e16ab3d7e077f65aabf855c34895bf17c", "858c3ceee84c8311adc808f8cdb30d233ddc9d18", "0fa33b4f5a4e0496da6843a38ff1af8b61541996", "85a410f8ef8eb8920f2c384a9555566ad4a2e21b", "9163ac8025923d5a45aaac482262893955c9b37b", "692cf623b8dd2c5df2c2998fd95ae4ec99882fb4", "18c03aac6d3e910efb20039c15d70ab5e0297101", "41265c42446aac17ca769e67d1704f99e5a1394d", "783ff33f5882813dca9239452c4a7cadd4dba778", "b029cfb85107aee4590c2434a3329bfcf36f8fa1", "112d1900b4c2e3e9351050d1b542c9744f9793f3", "5439bbc4bd9a996f1a38244e6892b71850bc98fd", "d83097a2f994b503185adf4e719d154123150159", "d0939b4898e83090ee55fd9d8a60e312cfadfbaf", "b3523a26f7147e4af40d9d462adaae6d49eda13e", "cd065fb435d6fb204a8871bcd623d0d0e673088c", "2854a40855ad839a54f4b08f5cff0cf52fca4399", "b8a53bbaac34ebb8c6169d11a4b9f13b05c583fe", "0f73d56e1cf480bded8a1ecf20ec6fc53c574713", "0d9882b2dfafdce31f4e77fe307d41a44a74cefe", "585fc5caab9ead178a327d3660d35851db713df1", "e8cd41a48d79101977e3036a87aeb1aac730686f", "5414efaef33cceb9f3c9eb5c4cc1682cd62d14f7", "9c3cc2763bf9e9e37067d3607302c4776502df98", "3649a68410e354c83cd4a38b66bd314de4c8f5c9", "e96ed0c091de1ebdf587104eaf63400d1974a1fe", "078ca03d2f99e4e6eab16f7b75fbb7afb699c86c", "38de737da99514de6559ff163c988198bc91367a", ] ) _expected_new_directories_first_visit = normalize_hashes( [ "3370d20d6f96dc1c9e50f083e2134881db110f4f", "42753c0c2ab00c4501b552ac4671c68f3cf5aece", "d7895533ef5edbcffdea3f057d9fef3a1ef845ce", "80579be563e2ef3e385226fe7a3f079b377f142c", "3b0ddc6a9e58b4b53c222da4e27b280b6cda591c", "bcad03ce58ac136f26f000990fc9064e559fe1c0", "5fc7e82a1bc72e074665c6078c6d3fad2f13d7ca", 
"e3cd26beba9b1e02f6762ef54bd9ac80cc5f25fd", "584b5b4b6cf7f038095e820b99386a9c232de931", "184c8d6d0d242f2b1792ef9d3bf396a5434b7f7a", "bb5f4ee143c970367eb409f2e4c1104898048b9d", "1b95491047add1103db0dfdfa84a9735dcb11e88", "a00c6de13471a2d66e64aca140ddb21ef5521e62", "5ce6c1cd5cda2d546db513aaad8c72a44c7771e2", "c337091e349b6ac10d38a49cdf8c2401ef9bb0f2", "202fafcd7c0f8230e89d5496ad7f44ab12b807bf", "775cc516543be86c15c1dc172f49c0d4e6e78235", "ff3d1ead85a14f891e8b3fa3a89de39db1b8de2e", ] ) _expected_new_releases_first_visit = normalize_hashes( { "d38cc0b571cd41f3c85513864e049766b42032a7": ( "42753c0c2ab00c4501b552ac4671c68f3cf5aece" ), "62bf7076bae9aa2cb4d6cb3bf7ce0ea4fdd5b295": ( "3370d20d6f96dc1c9e50f083e2134881db110f4f" ), "6e976db82f6c310596b21fb0ed8b11f507631434": ( "d7895533ef5edbcffdea3f057d9fef3a1ef845ce" ), } ) def package_url(package): return "https://www.npmjs.com/package/%s" % package def package_metadata_url(package): return "https://replicate.npmjs.com/%s/" % package def test_npm_loader_first_visit(swh_storage, requests_mock_datadir, org_api_info): package = "org" url = package_url(package) loader = NpmLoader(swh_storage, url) actual_load_status = loader.load() expected_snapshot_id = hash_to_bytes("0996ca28d6280499abcf485b51c4e3941b057249") assert actual_load_status == { "status": "eventful", "snapshot_id": expected_snapshot_id.hex(), } assert_last_visit_matches( swh_storage, url, status="full", type="npm", snapshot=expected_snapshot_id ) release_id = "d38cc0b571cd41f3c85513864e049766b42032a7" versions = [ ("0.0.2", release_id), ("0.0.3", "62bf7076bae9aa2cb4d6cb3bf7ce0ea4fdd5b295"), ("0.0.4", "6e976db82f6c310596b21fb0ed8b11f507631434"), ] expected_snapshot = Snapshot( id=expected_snapshot_id, branches={ b"HEAD": SnapshotBranch( target=b"releases/0.0.4", target_type=TargetType.ALIAS ), **{ b"releases/" + version_name.encode(): SnapshotBranch( target=hash_to_bytes(version_id), target_type=TargetType.RELEASE, ) for (version_name, version_id) in versions }, }, ) 
check_snapshot(expected_snapshot, swh_storage) assert swh_storage.release_get([hash_to_bytes(release_id)])[0] == Release( name=b"0.0.2", message=b"Synthetic release for NPM source package org version 0.0.2\n", target=hash_to_bytes("42753c0c2ab00c4501b552ac4671c68f3cf5aece"), target_type=ModelObjectType.DIRECTORY, synthetic=True, author=Person( fullname=b"mooz <stillpedant@gmail.com>", name=b"mooz", email=b"stillpedant@gmail.com", ), date=TimestampWithTimezone.from_datetime( datetime.datetime(2014, 1, 1, 15, 40, 33, tzinfo=datetime.timezone.utc) ), id=hash_to_bytes(release_id), ) contents = swh_storage.content_get(_expected_new_contents_first_visit) count = sum(0 if content is None else 1 for content in contents) assert count == len(_expected_new_contents_first_visit) assert ( list(swh_storage.directory_missing(_expected_new_directories_first_visit)) == [] ) assert list(swh_storage.release_missing(_expected_new_releases_first_visit)) == [] metadata_authority = MetadataAuthority( type=MetadataAuthorityType.FORGE, url="https://npmjs.com/", ) for (version_name, release_id) in versions: release = swh_storage.release_get([hash_to_bytes(release_id)])[0] assert release.target_type == ModelObjectType.DIRECTORY directory_id = release.target directory_swhid = ExtendedSWHID( object_type=ExtendedObjectType.DIRECTORY, object_id=directory_id, ) release_swhid = CoreSWHID( object_type=ObjectType.RELEASE, object_id=hash_to_bytes(release_id), ) expected_metadata = [ RawExtrinsicMetadata( target=directory_swhid, authority=metadata_authority, fetcher=MetadataFetcher( name="swh.loader.package.npm.loader.NpmLoader", version=__version__, ), discovery_date=loader.visit_date, format="replicate-npm-package-json", metadata=json.dumps( json.loads(org_api_info)["versions"][version_name] ).encode(), origin="https://www.npmjs.com/package/org", release=release_swhid, ) ] assert swh_storage.raw_extrinsic_metadata_get( directory_swhid, metadata_authority, ) == PagedResult(next_page_token=None, results=expected_metadata,) 
stats = get_stats(swh_storage) assert { "content": len(_expected_new_contents_first_visit), "directory": len(_expected_new_directories_first_visit), "origin": 1, "origin_visit": 1, "release": len(_expected_new_releases_first_visit), "revision": 0, "skipped_content": 0, "snapshot": 1, } == stats def test_npm_loader_incremental_visit(swh_storage, requests_mock_datadir_visits): package = "org" url = package_url(package) loader = NpmLoader(swh_storage, url) expected_snapshot_id = hash_to_bytes("0996ca28d6280499abcf485b51c4e3941b057249") actual_load_status = loader.load() assert actual_load_status == { "status": "eventful", "snapshot_id": expected_snapshot_id.hex(), } assert_last_visit_matches( swh_storage, url, status="full", type="npm", snapshot=expected_snapshot_id ) stats = get_stats(swh_storage) assert { "content": len(_expected_new_contents_first_visit), "directory": len(_expected_new_directories_first_visit), "origin": 1, "origin_visit": 1, "release": len(_expected_new_releases_first_visit), "revision": 0, "skipped_content": 0, "snapshot": 1, } == stats # reset loader internal state del loader._cached_info del loader._cached__raw_info actual_load_status2 = loader.load() assert actual_load_status2["status"] == "eventful" snap_id2 = actual_load_status2["snapshot_id"] assert snap_id2 is not None assert snap_id2 != actual_load_status["snapshot_id"] assert_last_visit_matches(swh_storage, url, status="full", type="npm") stats = get_stats(swh_storage) assert { # 3 new releases artifacts "content": len(_expected_new_contents_first_visit) + 14, "directory": len(_expected_new_directories_first_visit) + 15, "origin": 1, "origin_visit": 2, "release": len(_expected_new_releases_first_visit) + 3, "revision": 0, "skipped_content": 0, "snapshot": 2, } == stats urls = [ m.url for m in requests_mock_datadir_visits.request_history if m.url.startswith("https://registry.npmjs.org") ] assert len(urls) == len(set(urls)) # we visited each artifact once across 
@pytest.mark.usefixtures("requests_mock_datadir") def test_npm_loader_version_divergence(swh_storage): - package = "@aller_shared" + package = "@aller/shared" url = package_url(package) loader = NpmLoader(swh_storage, url) actual_load_status = loader.load() - expected_snapshot_id = hash_to_bytes("ebbe6397d0c2a6cf7cba40fa5b043c59dd4f2497") + expected_snapshot_id = hash_to_bytes("68eed3d3bc852e7f435a84f18ee77e23f6884be2") assert actual_load_status == { "status": "eventful", "snapshot_id": expected_snapshot_id.hex(), } assert_last_visit_matches( swh_storage, url, status="full", type="npm", snapshot=expected_snapshot_id ) expected_snapshot = Snapshot( id=expected_snapshot_id, branches={ b"HEAD": SnapshotBranch( target_type=TargetType.ALIAS, target=b"releases/0.1.0" ), b"releases/0.1.0": SnapshotBranch( target_type=TargetType.RELEASE, - target=hash_to_bytes("04c66f3a82aa001e8f1b45246b58b82d2b0ca0df"), + target=hash_to_bytes("0c486b50b407f847ef7581f595c2b6c2062f1089"), ), b"releases/0.1.1-alpha.14": SnapshotBranch( target_type=TargetType.RELEASE, - target=hash_to_bytes("90cc04dc72193f3b1444f10e1c525bee2ea9dac6"), + target=hash_to_bytes("79d80c87c0a8d104a216cc539baad962a454802a"), ), }, ) check_snapshot(expected_snapshot, swh_storage) stats = get_stats(swh_storage) assert { # 1 new releases artifacts "content": 534, "directory": 153, "origin": 1, "origin_visit": 1, "release": 2, "revision": 0, "skipped_content": 0, "snapshot": 1, } == stats +def test_npm_loader_duplicate_shasum(swh_storage, requests_mock_datadir): + """Test with two versions that have exactly the same tarball""" + package = "org_version_mismatch" + url = package_url(package) + loader = NpmLoader(swh_storage, url) + + actual_load_status = loader.load() + expected_snapshot_id = hash_to_bytes("ac867a4c22ba4e22a022d319f309714477412a5a") + assert actual_load_status == { + "status": "eventful", + "snapshot_id": expected_snapshot_id.hex(), + } + + assert_last_visit_matches( + swh_storage, url, status="full", 
type="npm", snapshot=expected_snapshot_id + ) + + beta_release_id = "e6d5490a02ac2a8dcd49702f9ccd5a64c90a46f1" + release_id = "f6985f437e28db6eb1b7533230e05ed99f2c91f0" + versions = [ + ("0.0.3-beta", beta_release_id), + ("0.0.3", release_id), + ] + + expected_snapshot = Snapshot( + id=expected_snapshot_id, + branches={ + b"HEAD": SnapshotBranch( + target=b"releases/0.0.3", target_type=TargetType.ALIAS + ), + **{ + b"releases/" + + version_name.encode(): SnapshotBranch( + target=hash_to_bytes(version_id), target_type=TargetType.RELEASE, + ) + for (version_name, version_id) in versions + }, + }, + ) + check_snapshot(expected_snapshot, swh_storage) + + assert swh_storage.release_get([hash_to_bytes(beta_release_id)])[0] == Release( + name=b"0.0.3-beta", + message=( + b"Synthetic release for NPM source package org_version_mismatch " + b"version 0.0.3-beta\n" + ), + target=hash_to_bytes("3370d20d6f96dc1c9e50f083e2134881db110f4f"), + target_type=ModelObjectType.DIRECTORY, + synthetic=True, + author=Person.from_fullname(b"Masafumi Oyamada <stillpedant@gmail.com>"), + date=TimestampWithTimezone.from_datetime( + datetime.datetime(2014, 1, 1, 15, 40, 33, tzinfo=datetime.timezone.utc) + ), + id=hash_to_bytes(beta_release_id), + ) + + assert swh_storage.release_get([hash_to_bytes(release_id)])[0] == Release( + name=b"0.0.3", + message=( + b"Synthetic release for NPM source package org_version_mismatch " + b"version 0.0.3\n" + ), + target=hash_to_bytes("3370d20d6f96dc1c9e50f083e2134881db110f4f"), + target_type=ModelObjectType.DIRECTORY, + synthetic=True, + author=Person.from_fullname(b"Masafumi Oyamada <stillpedant@gmail.com>"), + date=TimestampWithTimezone.from_datetime( + datetime.datetime(2014, 1, 1, 15, 55, 45, tzinfo=datetime.timezone.utc) + ), + id=hash_to_bytes(release_id), + ) + + # Check incremental re-load keeps it unchanged + + loader = NpmLoader(swh_storage, url) + + actual_load_status = loader.load() + assert actual_load_status == { + "status": "uneventful", + "snapshot_id": expected_snapshot_id.hex(), + } + + 
assert_last_visit_matches( + swh_storage, url, status="full", type="npm", snapshot=expected_snapshot_id + ) + + def test_npm_artifact_with_no_intrinsic_metadata(swh_storage, requests_mock_datadir): """Skip artifact with no intrinsic metadata during ingestion """ package = "nativescript-telerik-analytics" url = package_url(package) loader = NpmLoader(swh_storage, url) actual_load_status = loader.load() # no branch as one artifact without any intrinsic metadata expected_snapshot = Snapshot( id=hash_to_bytes("1a8893e6a86f444e8be8e7bda6cb34fb1735a00e"), branches={}, ) assert actual_load_status == { "status": "eventful", "snapshot_id": expected_snapshot.id.hex(), } assert_last_visit_matches( swh_storage, url, status="full", type="npm", snapshot=expected_snapshot.id ) check_snapshot(expected_snapshot, swh_storage) def test_npm_artifact_with_no_upload_time(swh_storage, requests_mock_datadir): """With no upload time, the artifact is skipped """ package = "jammit-no-time" url = package_url(package) loader = NpmLoader(swh_storage, url) actual_load_status = loader.load() # no branch, as the artifact without an upload time is skipped expected_snapshot = Snapshot( id=hash_to_bytes("1a8893e6a86f444e8be8e7bda6cb34fb1735a00e"), branches={}, ) assert actual_load_status == { "status": "uneventful", "snapshot_id": expected_snapshot.id.hex(), } assert_last_visit_matches( swh_storage, url, status="partial", type="npm", snapshot=expected_snapshot.id ) check_snapshot(expected_snapshot, swh_storage) def test_npm_artifact_use_mtime_if_no_time(swh_storage, requests_mock_datadir): """With no upload time, the artifact's mtime is used instead """ package = "jammit-express" url = package_url(package) loader = NpmLoader(swh_storage, url) actual_load_status = loader.load() expected_snapshot_id = hash_to_bytes("33b8f105d48ce16b6c59158af660e0cc78bcbef4") assert actual_load_status == { "status": "eventful", "snapshot_id": expected_snapshot_id.hex(), } # artifact is used expected_snapshot = Snapshot( id=expected_snapshot_id, 
branches={ b"HEAD": SnapshotBranch( target_type=TargetType.ALIAS, target=b"releases/0.0.1" ), b"releases/0.0.1": SnapshotBranch( target_type=TargetType.RELEASE, target=hash_to_bytes("3e3b800570869fa9b3dbc302500553e62400cc06"), ), }, ) assert_last_visit_matches( swh_storage, url, status="full", type="npm", snapshot=expected_snapshot.id ) check_snapshot(expected_snapshot, swh_storage) def test_npm_no_artifact(swh_storage, requests_mock_datadir): """If no artifact at all is found for the origin, the visit fails completely """ package = "catify" url = package_url(package) loader = NpmLoader(swh_storage, url) actual_load_status = loader.load() assert actual_load_status == { "status": "failed", } assert_last_visit_matches(swh_storage, url, status="failed", type="npm") def test_npm_origin_not_found(swh_storage, requests_mock_datadir): url = package_url("non-existent-url") loader = NpmLoader(swh_storage, url) assert loader.load() == {"status": "failed"} assert_last_visit_matches( swh_storage, url, status="not_found", type="npm", snapshot=None )
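
The `org_version_mismatch` fixture used by `test_npm_loader_duplicate_shasum` declares two versions whose `dist.shasum` values are identical. The following is a minimal sketch of how such duplicates could be spotted in registry metadata before loading. `group_versions_by_shasum` is a hypothetical helper, not part of `swh.loader.package`, and the `metadata` dictionary is a trimmed-down copy of the fixture that keeps only the fields the sketch reads.

```python
from collections import defaultdict
from typing import Dict, List


def group_versions_by_shasum(package_metadata: dict) -> Dict[str, List[str]]:
    """Map each tarball shasum to the versions that declare it.

    Hypothetical helper for illustration only; it is not the loader's API.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for version, info in package_metadata.get("versions", {}).items():
        shasum = info.get("dist", {}).get("shasum")
        if shasum is not None:
            groups[shasum].append(version)
    return dict(groups)


# Trimmed-down copy of the org_version_mismatch fixture.
metadata = {
    "name": "org_version_mismatch",
    "versions": {
        "0.0.3-beta": {
            "dist": {"shasum": "6a44220f88903a6dfc3b47d010238058f9faf3a0"}
        },
        "0.0.3": {
            "dist": {"shasum": "6a44220f88903a6dfc3b47d010238058f9faf3a0"}
        },
    },
}

# Keep only the shasums shared by more than one version.
duplicates = {
    shasum: versions
    for shasum, versions in group_versions_by_shasum(metadata).items()
    if len(versions) > 1
}
```

Here both versions land in the same group, which is the situation the test covers: two distinct releases (`0.0.3-beta` and `0.0.3`) that target the same directory.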