diff --git a/PKG-INFO b/PKG-INFO
index bba6a3a..22a8a13 100644
--- a/PKG-INFO
+++ b/PKG-INFO
@@ -1,103 +1,103 @@
 Metadata-Version: 2.1
 Name: swh.loader.npm
-Version: 0.0.2
+Version: 0.0.3
 Summary: Software Heritage loader for npm packages
 Home-page: https://forge.softwareheritage.org/source/swh-loader-npm.git
 Author: Software Heritage developers
 Author-email: swh-devel@inria.fr
 License: UNKNOWN
 Project-URL: Source, https://forge.softwareheritage.org/source/swh-loader-npm
-Project-URL: Funding, https://www.softwareheritage.org/donate
 Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest
+Project-URL: Funding, https://www.softwareheritage.org/donate
 Description: swh-loader-npm
         ==============
 
         Software Heritage loader to ingest [`npm`](https://www.npmjs.com/) packages into the archive.
 
         # What does the loader do?
 
         The npm loader visits and loads a npm package [1].
 
         Each visit will result in:
 
         - 1 snapshot (which targets n revisions ; 1 per package release version)
         - 1 revision (which targets 1 directory ; the package release version uncompressed)
 
         [1] https://docs.npmjs.com/about-packages-and-modules
 
         ## First visit
 
         Given a npm package (origin), the loader, for the first visit:
 
         - retrieves information for the given package (notably released versions)
         - then for each associated released version:
           - retrieves the associated tarball (with checks)
           - uncompresses locally the archive
           - computes the hashes of the uncompressed directory
           - then creates a revision (using ``package.json`` metadata file) targeting such directory
         - finally, creates a snapshot targeting all seen revisions (uncompressed npm package released versions and metadata).
 
         ## Next visit
 
         The loader starts by checking if something changed since the last visit. If nothing changed, the visit's snapshot is left unchanged. The new visit targets the same snapshot.
 
         If something changed, the already seen package release versions are skipped. Only the new ones are loaded.
         In the end, the loader creates a new snapshot based on the previous one. Thus, the new snapshot targets both the old and new package release versions.
 
         # Development
 
         ## Configuration file
 
         ### Location
 
         Either:
 
         - `/etc/softwareheritage/loader/npm.yml`
         - `~/.config/swh/loader/npm.yml`
 
         ### Configuration sample
 
         ```lang=yaml
         storage:
           cls: remote
           args:
             url: http://localhost:5002/
         debug: false
         ```
 
         ## Local run
 
         The built-in command-line will run the loader for a specified npm package.
 
         For instance, to load `jquery`:
 
         ```lang=bash
         $ python3 -m swh.loader.npm.loader jquery
         ```
 
         If you need more control, you can use the loader directly. It expects three arguments:
 
         - `package_name` (required): a npm package name
         - `package_url` (optional): URL of the npm package description (human-readable html page) that will be used as the associated origin URL in the archive
         - `project_metadata_url` (optional): URL of the npm package metadata information (machine-parsable JSON document)
 
         ```lang=python
         import logging
 
         from urllib.parse import quote
 
         from swh.loader.npm.loader import NpmLoader
 
         logging.basicConfig(level=logging.DEBUG)
 
         package_name='webpack'
 
         NpmLoader().load(package_name,
                          'https://www.npmjs.com/package/%s/' % package_name,
                          'https://replicate.npmjs.com/%s/' % quote(package_name, safe=''))
         ```
 
 Platform: UNKNOWN
 Classifier: Programming Language :: Python :: 3
 Classifier: Intended Audience :: Developers
 Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
 Classifier: Operating System :: OS Independent
 Classifier: Development Status :: 3 - Alpha
 Description-Content-Type: text/markdown
 Provides-Extra: testing
diff --git a/swh.loader.npm.egg-info/PKG-INFO b/swh.loader.npm.egg-info/PKG-INFO
index bba6a3a..22a8a13 100644
--- a/swh.loader.npm.egg-info/PKG-INFO
+++ b/swh.loader.npm.egg-info/PKG-INFO
@@ -1,103 +1,103 @@
 Metadata-Version: 2.1
 Name: swh.loader.npm
-Version: 0.0.2
+Version: 0.0.3
 Summary: Software Heritage loader for npm packages
 Home-page: https://forge.softwareheritage.org/source/swh-loader-npm.git
 Author: Software Heritage developers
 Author-email: swh-devel@inria.fr
 License: UNKNOWN
 Project-URL: Source, https://forge.softwareheritage.org/source/swh-loader-npm
-Project-URL: Funding, https://www.softwareheritage.org/donate
 Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest
+Project-URL: Funding, https://www.softwareheritage.org/donate
 Description: swh-loader-npm
         ==============
 
         Software Heritage loader to ingest [`npm`](https://www.npmjs.com/) packages into the archive.
 
         # What does the loader do?
 
         The npm loader visits and loads a npm package [1].
 
         Each visit will result in:
 
         - 1 snapshot (which targets n revisions ; 1 per package release version)
         - 1 revision (which targets 1 directory ; the package release version uncompressed)
 
         [1] https://docs.npmjs.com/about-packages-and-modules
 
         ## First visit
 
         Given a npm package (origin), the loader, for the first visit:
 
         - retrieves information for the given package (notably released versions)
         - then for each associated released version:
           - retrieves the associated tarball (with checks)
           - uncompresses locally the archive
           - computes the hashes of the uncompressed directory
           - then creates a revision (using ``package.json`` metadata file) targeting such directory
         - finally, creates a snapshot targeting all seen revisions (uncompressed npm package released versions and metadata).
 
         ## Next visit
 
         The loader starts by checking if something changed since the last visit. If nothing changed, the visit's snapshot is left unchanged. The new visit targets the same snapshot.
 
         If something changed, the already seen package release versions are skipped. Only the new ones are loaded.
         In the end, the loader creates a new snapshot based on the previous one. Thus, the new snapshot targets both the old and new package release versions.
 
         # Development
 
         ## Configuration file
 
         ### Location
 
         Either:
 
         - `/etc/softwareheritage/loader/npm.yml`
         - `~/.config/swh/loader/npm.yml`
 
         ### Configuration sample
 
         ```lang=yaml
         storage:
           cls: remote
           args:
             url: http://localhost:5002/
         debug: false
         ```
 
         ## Local run
 
         The built-in command-line will run the loader for a specified npm package.
 
         For instance, to load `jquery`:
 
         ```lang=bash
         $ python3 -m swh.loader.npm.loader jquery
         ```
 
         If you need more control, you can use the loader directly. It expects three arguments:
 
         - `package_name` (required): a npm package name
         - `package_url` (optional): URL of the npm package description (human-readable html page) that will be used as the associated origin URL in the archive
         - `project_metadata_url` (optional): URL of the npm package metadata information (machine-parsable JSON document)
 
         ```lang=python
         import logging
 
         from urllib.parse import quote
 
         from swh.loader.npm.loader import NpmLoader
 
         logging.basicConfig(level=logging.DEBUG)
 
         package_name='webpack'
 
         NpmLoader().load(package_name,
                          'https://www.npmjs.com/package/%s/' % package_name,
                          'https://replicate.npmjs.com/%s/' % quote(package_name, safe=''))
         ```
 
 Platform: UNKNOWN
 Classifier: Programming Language :: Python :: 3
 Classifier: Intended Audience :: Developers
 Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
 Classifier: Operating System :: OS Independent
 Classifier: Development Status :: 3 - Alpha
 Description-Content-Type: text/markdown
 Provides-Extra: testing
diff --git a/swh/loader/npm/tasks.py b/swh/loader/npm/tasks.py
index 68e7411..948aadc 100644
--- a/swh/loader/npm/tasks.py
+++ b/swh/loader/npm/tasks.py
@@ -1,13 +1,13 @@
 # Copyright (C) 2019 The Software Heritage developers
 # See the AUTHORS file at the top-level directory of this distribution
 # License: GNU General Public License version 3, or any later version
 # See top-level LICENSE file for more information
 
 from celery import current_app as app
 
 from swh.loader.npm.loader import NpmLoader
 
 
 @app.task(name=__name__ + '.LoadNpm')
-def load_pypi(package_name, package_url=None, package_metadata_url=None):
+def load_npm(package_name, package_url=None, package_metadata_url=None):
     return NpmLoader().load(package_name, package_url, package_metadata_url)
diff --git a/version.txt b/version.txt
index acfef66..150d422 100644
--- a/version.txt
+++ b/version.txt
@@ -1 +1 @@
-v0.0.2-0-ge303859
\ No newline at end of file
+v0.0.3-0-g43af640
\ No newline at end of file
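
Note on the README example carried in this diff: the metadata URL is built with `quote(package_name, safe='')`. The sketch below illustrates why `safe=''` probably matters, namely that scoped npm package names contain a `/` which `quote` leaves unescaped by default. The helper name `metadata_url` and the sample package `@angular/core` are illustrative assumptions, not part of the loader; only the URL pattern is taken from the README.

```python
from urllib.parse import quote


def metadata_url(package_name: str) -> str:
    """Hypothetical helper mirroring the URL pattern from the README example."""
    # safe='' makes quote() escape '/' as well; by default '/' is kept as-is,
    # which would split a scoped name into two URL path segments.
    return 'https://replicate.npmjs.com/%s/' % quote(package_name, safe='')


if __name__ == '__main__':
    # Unscoped name: nothing to escape.
    print(metadata_url('webpack'))
    # Scoped name (illustrative): both '@' and '/' are percent-encoded.
    print(metadata_url('@angular/core'))
    # For comparison, the default leaves '/' untouched.
    print(quote('@angular/core'))
```

With the default `safe='/'`, the scoped name would keep its slash, so the whole name would no longer sit in a single path segment of the URL.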