
diff --git a/.gitignore b/.gitignore
index 21e2c07..bb863ea 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,11 +1,11 @@
*.pyc
*.sw?
*~
.coverage
.eggs/
__pycache__
*.egg-info/
version.txt
build/
dist/
-.tox
+.tox/
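
The `.tox` → `.tox/` change above narrows the ignore pattern: with the trailing slash, gitignore matches only a directory named `.tox`, never a regular file with that name. A quick sketch of the difference (assumes `git` is installed; paths are throwaway):

``` sh
# With '.tox/' in .gitignore, a directory named .tox is ignored,
# but a plain file named .tox is not.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo '.tox/' > .gitignore

mkdir .tox && touch .tox/log
git check-ignore -q .tox/log && dir_match=yes || dir_match=no

rm -rf .tox && touch .tox            # .tox is now a regular file
git check-ignore -q .tox && file_match=yes || file_match=no
echo "dir: $dir_match, file: $file_match"
```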
diff --git a/PKG-INFO b/PKG-INFO
index 7d31452..981fa98 100644
--- a/PKG-INFO
+++ b/PKG-INFO
@@ -1,136 +1,136 @@
Metadata-Version: 2.1
Name: swh.loader.pypi
-Version: 0.0.4
+Version: 0.0.5
Summary: Software Heritage PyPI Loader
Home-page: https://forge.softwareheritage.org/source/swh-loader-pypi
Author: Software Heritage developers
Author-email: swh-devel@inria.fr
License: UNKNOWN
Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest
Project-URL: Funding, https://www.softwareheritage.org/donate
Project-URL: Source, https://forge.softwareheritage.org/source/swh-loader-pypi
Description: swh-loader-pypi
====================
SWH PyPI loader's source code repository
# What does the loader do?
The PyPI loader visits and loads a PyPI project [1].
Each visit will result in:
- 1 snapshot (which targets n revisions; one per release artifact)
- 1 revision (which targets 1 directory; the uncompressed release artifact)
[1] https://pypi.org/help/#packages
## First visit
Given a PyPI project (origin), the loader, for the first visit:
- retrieves information for the given project (including releases)
- then for each associated release
- for each associated source distribution (type 'sdist') release
artifact (possibly many per release)
- retrieves the associated artifact archive (with checks)
- uncompresses locally the archive
- computes the hashes of the uncompressed directory
- then creates a revision (using PKG-INFO metadata file) targeting
such directory
- finally, creates a snapshot targeting all seen revisions
(uncompressed PyPI artifact and metadata).
## Next visit
The loader starts by checking if something changed since the last
visit. If nothing changed, the visit's snapshot is left
unchanged. The new visit targets the same snapshot.
If something changed, the already seen release artifacts are skipped.
Only the new ones are loaded. In the end, the loader creates a new
snapshot based on the previous one. Thus, the new snapshot targets
both the old and new PyPI release artifacts.
## Terminology
- 1 project: a PyPI project (used as swh origin). This is a collection
of releases.
- 1 release: a specific version of the (PyPI) project. It's a
collection of information and associated source release
artifacts (type 'sdist')
- 1 release artifact: a source release artifact (distributed by a PyPI
maintainer). In swh, we are specifically
interested in the 'sdist' type (source code).
## Edge cases
- If no release provides release artifacts, those are skipped
- If a release artifact holds no PKG-INFO file (at the root of the
archive), the release artifact is skipped.
- If a problem occurs during a fetch action (e.g. release artifact
download), the load fails and the visit is marked as 'partial'.
# Development
## Configuration file
### Location
Either:
- /etc/softwareheritage/
- ~/.config/swh/
- ~/.swh/
Note: we will refer to that location as $SWH_CONFIG_PATH below
### Configuration sample
$SWH_CONFIG_PATH/loader/pypi.yml:
```
storage:
cls: remote
args:
url: http://localhost:5002/
```
## Local run
The built-in command-line will run the loader for a project in the
main PyPI archive.
For instance, to load arrow:
``` sh
python3 -m swh.loader.pypi.loader arrow
```
If you need more control, you can use the loader directly. It expects
three arguments:
- project: a PyPI project name (e.g. arrow)
- project_url: URL of the PyPI project (human-readable html page)
- project_metadata_url: URL of the PyPI metadata information
(machine-parsable json document)
``` python
import logging
logging.basicConfig(level=logging.DEBUG)
from swh.loader.pypi.tasks import LoadPyPI
project='arrow'
LoadPyPI().run(project, 'https://pypi.org/pypi/%s/' % project, 'https://pypi.org/pypi/%s/json' % project)
```
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 5 - Production/Stable
Description-Content-Type: text/markdown
Provides-Extra: testing
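
The first-visit/next-visit behaviour described in the README above can be sketched as follows. The structures are illustrative stand-ins, not the real swh data model: keys are (filename, sha256) pairs, values are revision ids (None when not yet loaded).

``` python
# Sketch of the incremental visit logic: artifacts already seen in the
# previous snapshot are skipped, and the new snapshot targets both the
# old and the new release artifacts.
def plan_visit(previous, current):
    to_load = {k: v for k, v in current.items() if k not in previous}
    snapshot = {**previous, **to_load}  # old + new release artifacts
    return to_load, snapshot

previous = {('pkg-1.0.zip', 'aa'): b'rev1'}
current = {('pkg-1.0.zip', 'aa'): None, ('pkg-1.1.zip', 'bb'): None}
to_load, snapshot = plan_visit(previous, current)
# only the 1.1 artifact is fetched; the snapshot keeps both
```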
diff --git a/debian/control b/debian/control
index 673b7e6..dbd286f 100644
--- a/debian/control
+++ b/debian/control
@@ -1,29 +1,29 @@
Source: swh-loader-pypi
Maintainer: Software Heritage developers <swh-devel@inria.fr>
Section: python
Priority: optional
Build-Depends: debhelper (>= 9),
dh-python (>= 2),
python3-all,
python3-arrow,
python3-nose,
python3-pkginfo,
python3-requests,
python3-setuptools,
python3-swh.core,
- python3-swh.loader.core (>= 0.0.34~),
- python3-swh.model (>= 0.0.27~),
+ python3-swh.loader.core (>= 0.0.35~),
+ python3-swh.model (>= 0.0.28~),
python3-swh.storage (>= 0.0.108~),
python3-swh.scheduler,
python3-vcversioner
Standards-Version: 3.9.6
Homepage: https://forge.softwareheritage.org/source/swh-loader-pypi.git
Package: python3-swh.loader.pypi
Architecture: all
Depends: python3-swh.core,
- python3-swh.loader.core (>= 0.0.34~),
- python3-swh.model (>= 0.0.27~),
+ python3-swh.loader.core (>= 0.0.35~),
+ python3-swh.model (>= 0.0.28~),
python3-swh.storage (>= 0.0.108~),
${misc:Depends}, ${python3:Depends}
Description: Software Heritage PyPI Loader
diff --git a/debian/rules b/debian/rules
index 71548ae..f4d87c1 100755
--- a/debian/rules
+++ b/debian/rules
@@ -1,12 +1,12 @@
#!/usr/bin/make -f
export PYBUILD_NAME=swh.loader.pypi
-export PYBUILD_TEST_ARGS=--with-doctest -sva !db,!fs
+export PYBUILD_TEST_ARGS=-m 'not db and not fs'
%:
dh $@ --with python3 --buildsystem=pybuild
override_dh_install:
dh_install
rm -v $(CURDIR)/debian/python3-*/usr/lib/python*/dist-packages/swh/__init__.py
rm -v $(CURDIR)/debian/python3-*/usr/lib/python*/dist-packages/swh/loader/__init__.py
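
The PYBUILD_TEST_ARGS change above swaps nose's attribute selector (`-a '!db,!fs'` style) for pytest's marker expression syntax. A sketch of how the new expression behaves (assumes pytest is installed; the demo test file is made up):

``` sh
# Tests marked 'db' or 'fs' are deselected; everything else still runs.
cat > /tmp/test_markers_demo.py <<'EOF'
import pytest

@pytest.mark.db
def test_needs_db():     # deselected by "-m 'not db and not fs'"
    assert True

def test_plain():        # still runs
    assert True
EOF
result=$(python3 -m pytest /tmp/test_markers_demo.py -m 'not db and not fs' -q)
echo "$result"
```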
diff --git a/docs/index.rst b/docs/index.rst
index a1468c6..933500f 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,19 +1,15 @@
.. _swh-loader-pypi:
Software Heritage - PyPI loader
===============================
Loader for `PyPI <https://pypi.org/>`_ source code releases.
+Reference Documentation
+-----------------------
+
.. toctree::
:maxdepth: 2
- :caption: Contents:
-
-
-Indices and tables
-==================
-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
+ /apidoc/swh.loader.pypi
diff --git a/pytest.ini b/pytest.ini
new file mode 100644
index 0000000..afa4cf3
--- /dev/null
+++ b/pytest.ini
@@ -0,0 +1,2 @@
+[pytest]
+norecursedirs = docs
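
The new pytest.ini only excludes docs/ from test collection. Since the suite now uses custom markers (`db`, `fs`, as seen in the test changes below), those could also be registered here to silence pytest's unknown-marker warnings. This is a suggested addition, not part of the diff:

```
[pytest]
norecursedirs = docs
markers =
    db: tests that need a database
    fs: tests that touch the filesystem
```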
diff --git a/requirements-swh.txt b/requirements-swh.txt
index 5155478..9e518b1 100644
--- a/requirements-swh.txt
+++ b/requirements-swh.txt
@@ -1,5 +1,5 @@
swh.core
-swh.model >= 0.0.27
+swh.model >= 0.0.28
swh.storage >= 0.0.108
swh.scheduler
-swh.loader.core >= 0.0.34
+swh.loader.core >= 0.0.35
diff --git a/requirements-test.txt b/requirements-test.txt
index f3c7e8e..e079f8a 100644
--- a/requirements-test.txt
+++ b/requirements-test.txt
@@ -1 +1 @@
-nose
+pytest
diff --git a/setup.py b/setup.py
index fe8a048..0afea9a 100755
--- a/setup.py
+++ b/setup.py
@@ -1,66 +1,66 @@
#!/usr/bin/env python3
# Copyright (C) 2015-2018 The Software Heritage developers
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information
from setuptools import setup, find_packages
from os import path
from io import open
here = path.abspath(path.dirname(__file__))
# Get the long description from the README file
with open(path.join(here, 'README.md'), encoding='utf-8') as f:
long_description = f.read()
def parse_requirements(name=None):
if name:
reqf = 'requirements-%s.txt' % name
else:
reqf = 'requirements.txt'
requirements = []
if not path.exists(reqf):
return requirements
with open(reqf) as f:
for line in f.readlines():
line = line.strip()
if not line or line.startswith('#'):
continue
requirements.append(line)
return requirements
setup(
name='swh.loader.pypi',
description='Software Heritage PyPI Loader',
long_description=long_description,
long_description_content_type='text/markdown',
author='Software Heritage developers',
author_email='swh-devel@inria.fr',
url='https://forge.softwareheritage.org/source/swh-loader-pypi',
packages=find_packages(),
scripts=[], # scripts to package
install_requires=parse_requirements() + parse_requirements('swh'),
- test_requires=parse_requirements('test'),
+ tests_require=parse_requirements('test'),
setup_requires=['vcversioner'],
extras_require={'testing': parse_requirements('test')},
vcversioner={'version_module_paths': ['swh/loader/pypi/_version.py']},
include_package_data=True,
classifiers=[
"Programming Language :: Python :: 3",
"Intended Audience :: Developers",
"License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
"Operating System :: OS Independent",
"Development Status :: 5 - Production/Stable",
],
project_urls={
'Bug Reports': 'https://forge.softwareheritage.org/maniphest',
'Funding': 'https://www.softwareheritage.org/donate',
'Source': 'https://forge.softwareheritage.org/source/swh-loader-pypi',
},
)
diff --git a/swh.loader.pypi.egg-info/PKG-INFO b/swh.loader.pypi.egg-info/PKG-INFO
index 7d31452..981fa98 100644
--- a/swh.loader.pypi.egg-info/PKG-INFO
+++ b/swh.loader.pypi.egg-info/PKG-INFO
@@ -1,136 +1,136 @@
Metadata-Version: 2.1
Name: swh.loader.pypi
-Version: 0.0.4
+Version: 0.0.5
Summary: Software Heritage PyPI Loader
Home-page: https://forge.softwareheritage.org/source/swh-loader-pypi
Author: Software Heritage developers
Author-email: swh-devel@inria.fr
License: UNKNOWN
Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest
Project-URL: Funding, https://www.softwareheritage.org/donate
Project-URL: Source, https://forge.softwareheritage.org/source/swh-loader-pypi
Description: swh-loader-pypi
====================
SWH PyPI loader's source code repository
# What does the loader do?
The PyPI loader visits and loads a PyPI project [1].
Each visit will result in:
- 1 snapshot (which targets n revisions; one per release artifact)
- 1 revision (which targets 1 directory; the uncompressed release artifact)
[1] https://pypi.org/help/#packages
## First visit
Given a PyPI project (origin), the loader, for the first visit:
- retrieves information for the given project (including releases)
- then for each associated release
- for each associated source distribution (type 'sdist') release
artifact (possibly many per release)
- retrieves the associated artifact archive (with checks)
- uncompresses locally the archive
- computes the hashes of the uncompressed directory
- then creates a revision (using PKG-INFO metadata file) targeting
such directory
- finally, creates a snapshot targeting all seen revisions
(uncompressed PyPI artifact and metadata).
## Next visit
The loader starts by checking if something changed since the last
visit. If nothing changed, the visit's snapshot is left
unchanged. The new visit targets the same snapshot.
If something changed, the already seen release artifacts are skipped.
Only the new ones are loaded. In the end, the loader creates a new
snapshot based on the previous one. Thus, the new snapshot targets
both the old and new PyPI release artifacts.
## Terminology
- 1 project: a PyPI project (used as swh origin). This is a collection
of releases.
- 1 release: a specific version of the (PyPI) project. It's a
collection of information and associated source release
artifacts (type 'sdist')
- 1 release artifact: a source release artifact (distributed by a PyPI
maintainer). In swh, we are specifically
interested in the 'sdist' type (source code).
## Edge cases
- If no release provides release artifacts, those are skipped
- If a release artifact holds no PKG-INFO file (at the root of the
archive), the release artifact is skipped.
- If a problem occurs during a fetch action (e.g. release artifact
download), the load fails and the visit is marked as 'partial'.
# Development
## Configuration file
### Location
Either:
- /etc/softwareheritage/
- ~/.config/swh/
- ~/.swh/
Note: we will refer to that location as $SWH_CONFIG_PATH below
### Configuration sample
$SWH_CONFIG_PATH/loader/pypi.yml:
```
storage:
cls: remote
args:
url: http://localhost:5002/
```
## Local run
The built-in command-line will run the loader for a project in the
main PyPI archive.
For instance, to load arrow:
``` sh
python3 -m swh.loader.pypi.loader arrow
```
If you need more control, you can use the loader directly. It expects
three arguments:
- project: a PyPI project name (e.g. arrow)
- project_url: URL of the PyPI project (human-readable html page)
- project_metadata_url: URL of the PyPI metadata information
(machine-parsable json document)
``` python
import logging
logging.basicConfig(level=logging.DEBUG)
from swh.loader.pypi.tasks import LoadPyPI
project='arrow'
LoadPyPI().run(project, 'https://pypi.org/pypi/%s/' % project, 'https://pypi.org/pypi/%s/json' % project)
```
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 5 - Production/Stable
Description-Content-Type: text/markdown
Provides-Extra: testing
diff --git a/swh.loader.pypi.egg-info/SOURCES.txt b/swh.loader.pypi.egg-info/SOURCES.txt
index 36ce350..7a270c1 100644
--- a/swh.loader.pypi.egg-info/SOURCES.txt
+++ b/swh.loader.pypi.egg-info/SOURCES.txt
@@ -1,49 +1,51 @@
.gitignore
AUTHORS
LICENSE
MANIFEST.in
Makefile
README.md
+pytest.ini
requirements-swh.txt
requirements-test.txt
requirements.txt
setup.py
+tox.ini
version.txt
debian/changelog
debian/compat
debian/control
debian/copyright
debian/rules
debian/source/format
docs/.gitignore
docs/Makefile
docs/conf.py
docs/index.rst
docs/_static/.placeholder
docs/_templates/.placeholder
swh/__init__.py
swh.loader.pypi.egg-info/PKG-INFO
swh.loader.pypi.egg-info/SOURCES.txt
swh.loader.pypi.egg-info/dependency_links.txt
swh.loader.pypi.egg-info/requires.txt
swh.loader.pypi.egg-info/top_level.txt
swh/loader/__init__.py
swh/loader/pypi/.gitignore
swh/loader/pypi/__init__.py
swh/loader/pypi/_version.py
swh/loader/pypi/client.py
swh/loader/pypi/converters.py
swh/loader/pypi/loader.py
swh/loader/pypi/tasks.py
swh/loader/pypi/tests/__init__.py
swh/loader/pypi/tests/common.py
swh/loader/pypi/tests/test_client.py
swh/loader/pypi/tests/test_converters.py
swh/loader/pypi/tests/test_loader.py
swh/loader/pypi/tests/resources/0805nexter+new-made-up-release.json
swh/loader/pypi/tests/resources/0805nexter-unpublished-release.json
swh/loader/pypi/tests/resources/0805nexter.json
swh/loader/pypi/tests/resources/archives/0805nexter-1.1.0.zip
swh/loader/pypi/tests/resources/archives/0805nexter-1.2.0.zip
swh/loader/pypi/tests/resources/archives/0805nexter-1.3.0.zip
swh/loader/pypi/tests/resources/archives/0805nexter-1.4.0.zip
\ No newline at end of file
diff --git a/swh.loader.pypi.egg-info/requires.txt b/swh.loader.pypi.egg-info/requires.txt
index b267d74..271685a 100644
--- a/swh.loader.pypi.egg-info/requires.txt
+++ b/swh.loader.pypi.egg-info/requires.txt
@@ -1,13 +1,13 @@
arrow
pkginfo
requests
setuptools
swh.core
-swh.loader.core>=0.0.34
-swh.model>=0.0.27
+swh.loader.core>=0.0.35
+swh.model>=0.0.28
swh.scheduler
swh.storage>=0.0.108
vcversioner
[testing]
-nose
+pytest
diff --git a/swh/loader/pypi/_version.py b/swh/loader/pypi/_version.py
index d48e243..070ce78 100644
--- a/swh/loader/pypi/_version.py
+++ b/swh/loader/pypi/_version.py
@@ -1,5 +1,5 @@
# This file is automatically generated by setup.py.
-__version__ = '0.0.4'
-__sha__ = 'gc993a89'
-__revision__ = 'gc993a89'
+__version__ = '0.0.5'
+__sha__ = 'gedb3d4e'
+__revision__ = 'gedb3d4e'
diff --git a/swh/loader/pypi/loader.py b/swh/loader/pypi/loader.py
index 797b787..797f575 100644
--- a/swh/loader/pypi/loader.py
+++ b/swh/loader/pypi/loader.py
@@ -1,310 +1,311 @@
# Copyright (C) 2018 The Software Heritage developers
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information
import os
import shutil
from tempfile import mkdtemp
import arrow
from swh.loader.core.utils import clean_dangling_folders
from swh.loader.core.loader import SWHLoader
from swh.model.from_disk import Directory
from swh.model.identifiers import (
revision_identifier, snapshot_identifier,
identifier_to_bytes, normalize_timestamp
)
from swh.storage.algos.snapshot import snapshot_get_all_branches
from .client import PyPIClient, PyPIProject
TEMPORARY_DIR_PREFIX_PATTERN = 'swh.loader.pypi.'
DEBUG_MODE = '** DEBUG MODE **'
class PyPILoader(SWHLoader):
CONFIG_BASE_FILENAME = 'loader/pypi'
ADDITIONAL_CONFIG = {
'temp_directory': ('str', '/tmp/swh.loader.pypi/'),
'cache': ('bool', False),
'cache_dir': ('str', ''),
'debug': ('bool', False), # NOT FOR PRODUCTION
}
def __init__(self, client=None):
super().__init__(logging_class='swh.loader.pypi.PyPILoader')
self.origin_id = None
if not client:
temp_directory = self.config['temp_directory']
os.makedirs(temp_directory, exist_ok=True)
self.temp_directory = mkdtemp(
suffix='-%s' % os.getpid(),
prefix=TEMPORARY_DIR_PREFIX_PATTERN,
dir=temp_directory)
self.pypi_client = PyPIClient(
temp_directory=self.temp_directory,
cache=self.config['cache'],
cache_dir=self.config['cache_dir'])
else:
self.temp_directory = client.temp_directory
self.pypi_client = client
self.debug = self.config['debug']
self.done = False
def pre_cleanup(self):
"""To prevent disk explosion if some other workers exploded
in mid-air (OOM killed), we try and clean up dangling files.
"""
if self.debug:
self.log.warn('%s Will not pre-clean up temp dir %s' % (
DEBUG_MODE, self.temp_directory
))
return
clean_dangling_folders(self.config['temp_directory'],
pattern_check=TEMPORARY_DIR_PREFIX_PATTERN,
log=self.log)
def cleanup(self):
"""Clean up temporary disk use
"""
if self.debug:
self.log.warn('%s Will not clean up temp dir %s' % (
DEBUG_MODE, self.temp_directory
))
return
if os.path.exists(self.temp_directory):
self.log.debug('Clean up %s' % self.temp_directory)
shutil.rmtree(self.temp_directory)
def prepare_origin_visit(self, project_name, project_url,
project_metadata_url=None):
"""Prepare the origin visit information
Args:
project_name (str): Project's simple name
project_url (str): Project's main url
project_metadata_url (str): Project's metadata url
"""
self.origin = {
'url': project_url,
'type': 'pypi',
}
self.visit_date = None # loader core will populate it
def _known_artifacts(self, last_snapshot):
"""Retrieve the known releases/artifact for the origin_id.
Args
snapshot (dict): Last snapshot for the visit
Returns:
list of (filename, sha256) tuples.
"""
if not last_snapshot or 'branches' not in last_snapshot:
return {}
revs = [rev['target'] for rev in last_snapshot['branches'].values()]
known_revisions = self.storage.revision_get(revs)
ret = {}
for revision in known_revisions:
if 'original_artifact' in revision['metadata']:
artifact = revision['metadata']['original_artifact']
ret[artifact['filename'], artifact['sha256']] = revision['id']
return ret
def _last_snapshot(self):
"""Retrieve the last snapshot
"""
snapshot = self.storage.snapshot_get_latest(self.origin_id)
if snapshot and snapshot.pop('next_branch', None):
- return snapshot_get_all_branches(self.storage, snapshot['id'])
+ snapshot = snapshot_get_all_branches(self.storage, snapshot['id'])
+ return snapshot
def prepare(self, project_name, project_url,
project_metadata_url=None):
"""Keep reference to the origin url (project) and the
project metadata url
Args:
project_name (str): Project's simple name
project_url (str): Project's main url
project_metadata_url (str): Project's metadata url
"""
self.project_name = project_name
self.origin_url = project_url
self.project_metadata_url = project_metadata_url
self.project = PyPIProject(self.pypi_client, self.project_name,
self.project_metadata_url)
self._prepare_state()
def _prepare_state(self):
"""Initialize internal state (snapshot, contents, directories, etc...)
This is called from `prepare` method.
"""
last_snapshot = self._last_snapshot()
self.known_artifacts = self._known_artifacts(last_snapshot)
# and the artifacts
# that will be the source of data to retrieve
self.new_artifacts = self.project.download_new_releases(
self.known_artifacts
)
# temporary state
self._contents = []
self._directories = []
self._revisions = []
self._load_status = 'uneventful'
self._visit_status = 'full'
def fetch_data(self):
"""Called once per release artifact version (can be many for one
release).
This will for each call:
- retrieve a release artifact (associated to a release version)
- Uncompress it and compute the necessary information
- Computes the swh objects
Returns:
True as long as data to fetch exist
"""
data = None
if self.done:
return False
try:
data = next(self.new_artifacts)
self._load_status = 'eventful'
except StopIteration:
self.done = True
return False
project_info, author, release, artifact, dir_path = data
dir_path = dir_path.encode('utf-8')
directory = Directory.from_disk(path=dir_path, data=True)
_objects = directory.collect()
self._contents = _objects['content'].values()
self._directories = _objects['directory'].values()
date = normalize_timestamp(
int(arrow.get(artifact['date']).timestamp))
name = release['name'].encode('utf-8')
message = release['message'].encode('utf-8')
if message:
message = b'%s: %s' % (name, message)
else:
message = name
_revision = {
'synthetic': True,
'metadata': {
'original_artifact': artifact,
'project': project_info,
},
'author': author,
'date': date,
'committer': author,
'committer_date': date,
'message': message,
'directory': directory.hash,
'parents': [],
'type': 'tar',
}
_revision['id'] = identifier_to_bytes(
revision_identifier(_revision))
self._revisions.append(_revision)
artifact_key = artifact['filename'], artifact['sha256']
self.known_artifacts[artifact_key] = _revision['id']
return not self.done
def target_from_artifact(self, filename, sha256):
target = self.known_artifacts.get((filename, sha256))
if target:
return {
'target': target,
'target_type': 'revision',
}
return None
def generate_and_load_snapshot(self):
branches = {}
for release, artifacts in self.project.all_release_artifacts().items():
default_release = self.project.default_release()
if len(artifacts) == 1:
# Only one artifact for this release, generate a single branch
branch_name = 'releases/%s' % release
filename, sha256 = artifacts[0]
target = self.target_from_artifact(filename, sha256)
branches[branch_name.encode('utf-8')] = target
if release == default_release:
branches[b'HEAD'] = {
'target_type': 'alias',
'target': branch_name.encode('utf-8'),
}
if not target:
self._visit_status = 'partial'
else:
# Several artifacts for this release, generate a separate
# pointer for each of them
for filename, sha256 in artifacts:
branch_name = 'releases/%s/%s' % (release, filename)
target = self.target_from_artifact(filename, sha256)
branches[branch_name.encode('utf-8')] = target
if not target:
self._visit_status = 'partial'
snapshot = {
'branches': branches,
}
snapshot['id'] = identifier_to_bytes(
snapshot_identifier(snapshot))
self.maybe_load_snapshot(snapshot)
def store_data(self):
"""(override) This sends collected objects to storage.
"""
self.maybe_load_contents(self._contents)
self.maybe_load_directories(self._directories)
self.maybe_load_revisions(self._revisions)
if self.done:
self.generate_and_load_snapshot()
self.flush()
def load_status(self):
return {
'status': self._load_status,
}
def visit_status(self):
return self._visit_status
if __name__ == '__main__':
import logging
import sys
logging.basicConfig(level=logging.DEBUG)
if len(sys.argv) != 2:
logging.error('Usage: %s <module-name>' % sys.argv[0])
sys.exit(1)
module_name = sys.argv[1]
loader = PyPILoader()
loader.load(
module_name,
'https://pypi.org/projects/%s/' % module_name,
'https://pypi.org/pypi/%s/json' % module_name,
)
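
The `_last_snapshot` change in loader.py above fixes a control-flow bug: the old code only returned a value inside the `if` branch, so a snapshot that was already complete (no `next_branch` to page through) fell off the end of the function and was silently dropped as None. A minimal sketch of the bug and the fix, using illustrative stand-ins for the swh.storage calls rather than the real signatures:

``` python
def last_snapshot_buggy(get_latest, get_all_branches):
    snapshot = get_latest()
    if snapshot and snapshot.pop('next_branch', None):
        return get_all_branches(snapshot['id'])
    # falls through: a *complete* snapshot is silently dropped (returns None)

def last_snapshot_fixed(get_latest, get_all_branches):
    snapshot = get_latest()
    if snapshot and snapshot.pop('next_branch', None):
        snapshot = get_all_branches(snapshot['id'])
    return snapshot  # complete snapshots are now returned too

complete = {'id': 's1', 'branches': {}}
assert last_snapshot_buggy(lambda: dict(complete), lambda i: None) is None
assert last_snapshot_fixed(lambda: dict(complete), lambda i: None) == complete

# the paginated case still fetches all branches
partial = {'id': 's2', 'branches': {}, 'next_branch': b'x'}
full = last_snapshot_fixed(lambda: dict(partial),
                           lambda i: {'id': i, 'branches': {'all': 1}})
assert full == {'id': 's2', 'branches': {'all': 1}}
```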
diff --git a/swh/loader/pypi/tests/common.py b/swh/loader/pypi/tests/common.py
index 8acbea0..8ebedc2 100644
--- a/swh/loader/pypi/tests/common.py
+++ b/swh/loader/pypi/tests/common.py
@@ -1,56 +1,56 @@
# Copyright (C) 2018 The Software Heritage developers
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information
import json
import shutil
import os
import tempfile
-from nose.plugins.attrib import attr
+import pytest
from unittest import TestCase
from swh.loader.pypi.client import PyPIClient, PyPIProject
RESOURCES_PATH = './swh/loader/pypi/tests/resources'
class PyPIClientWithCache(PyPIClient):
"""Force the use of the cache to bypass pypi calls
"""
def __init__(self, temp_directory, cache_dir):
super().__init__(temp_directory=temp_directory,
cache=True, cache_dir=cache_dir)
-@attr('fs')
+@pytest.mark.fs
class WithProjectTest(TestCase):
def setUp(self):
project = '0805nexter'
project_metadata_file = '%s/%s.json' % (RESOURCES_PATH, project)
with open(project_metadata_file) as f:
data = json.load(f)
temp_dir = tempfile.mkdtemp(
dir='/tmp/', prefix='swh.loader.pypi.tests-')
project_metadata_url = 'https://pypi.org/pypi/%s/json' % project
# Will use the pypi with cache
client = PyPIClientWithCache(
temp_directory=temp_dir, cache_dir=RESOURCES_PATH)
self.project = PyPIProject(
client=client,
project=project,
project_metadata_url=project_metadata_url,
data=data)
self.data = data
self.temp_dir = temp_dir
self.project_name = project
def tearDown(self):
if os.path.exists(self.temp_dir):
shutil.rmtree(self.temp_dir)
diff --git a/swh/loader/pypi/tests/test_client.py b/swh/loader/pypi/tests/test_client.py
index f364b61..a237986 100644
--- a/swh/loader/pypi/tests/test_client.py
+++ b/swh/loader/pypi/tests/test_client.py
@@ -1,97 +1,97 @@
# Copyright (C) 2018 The Software Heritage developers
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information
import os
from swh.loader.pypi import converters
from swh.loader.pypi.client import _project_pkginfo
from .common import WithProjectTest
class PyPIProjectTest(WithProjectTest):
def test_download_new_releases(self):
actual_releases = self.project.download_new_releases([])
expected_release_artifacts = {
'1.1.0': {
'archive_type': 'zip',
'blake2s256': 'df9413bde66e6133b10cadefad6fcf9cbbc369b47831089112c846d79f14985a', # noqa
'date': '2016-01-31T05:28:42',
'filename': '0805nexter-1.1.0.zip',
'sha1': '127d8697db916ba1c67084052196a83319a25000',
'sha1_git': '4b8f1350e6d9fa00256e974ae24c09543d85b196',
'sha256': '52cd128ad3afe539478abc7440d4b043384295fbe6b0958a237cb6d926465035', # noqa
'size': 862,
'url': 'https://files.pythonhosted.org/packages/ec/65/c0116953c9a3f47de89e71964d6c7b0c783b01f29fa3390584dbf3046b4d/0805nexter-1.1.0.zip', # noqa
},
'1.2.0': {
'archive_type': 'zip',
'blake2s256': '67010586b5b9a4aaa3b1c386f9dc8b4c99e6e40f37732a717a5f9b9b1185e588', # noqa
'date': '2016-01-31T05:51:25',
'filename': '0805nexter-1.2.0.zip',
'sha1': 'd55238554b94da7c5bf4a349ece0fe3b2b19f79c',
'sha1_git': '8638d33a96cb25d8319af21417f00045ec6ee810',
'sha256': '49785c6ae39ea511b3c253d7621c0b1b6228be2f965aca8a491e6b84126d0709', # noqa
'size': 898,
'url': 'https://files.pythonhosted.org/packages/c4/a0/4562cda161dc4ecbbe9e2a11eb365400c0461845c5be70d73869786809c4/0805nexter-1.2.0.zip', # noqa
}
}
expected_releases = {
'1.1.0': {
'name': '1.1.0',
'message': '',
},
'1.2.0': {
'name': '1.2.0',
'message': '',
},
}
dir_paths = []
for pkginfo, author, release, artifact, dir_path in actual_releases:
version = pkginfo['version']
expected_pkginfo = _project_pkginfo(dir_path)
- self.assertEquals(pkginfo, expected_pkginfo)
+ self.assertEqual(pkginfo, expected_pkginfo)
expected_author = converters.author(expected_pkginfo)
self.assertEqual(author, expected_author)
expected_artifact = expected_release_artifacts[version]
self.assertEqual(artifact, expected_artifact)
expected_release = expected_releases[version]
self.assertEqual(release, expected_release)
self.assertTrue(version in dir_path)
self.assertTrue(self.project_name in dir_path)
# path still exists
self.assertTrue(os.path.exists(dir_path))
dir_paths.append(dir_path)
# Ensure uncompressed paths have been destroyed
for dir_path in dir_paths:
# path no longer exists
self.assertFalse(os.path.exists(dir_path))
def test_all_release_artifacts(self):
expected_release_artifacts = {
'1.1.0': [(
'0805nexter-1.1.0.zip',
'52cd128ad3afe539478abc7440d4b043'
'384295fbe6b0958a237cb6d926465035',
)],
'1.2.0': [(
'0805nexter-1.2.0.zip',
'49785c6ae39ea511b3c253d7621c0b1b'
'6228be2f965aca8a491e6b84126d0709',
)],
}
self.assertEqual(
self.project.all_release_artifacts(),
expected_release_artifacts,
)
def test_default_release(self):
self.assertEqual(self.project.default_release(), '1.2.0')
diff --git a/swh/loader/pypi/tests/test_converters.py b/swh/loader/pypi/tests/test_converters.py
index effca39..214f80b 100644
--- a/swh/loader/pypi/tests/test_converters.py
+++ b/swh/loader/pypi/tests/test_converters.py
@@ -1,121 +1,121 @@
# Copyright (C) 2018 The Software Heritage developers
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information
from unittest import TestCase
from swh.loader.pypi.converters import EMPTY_AUTHOR, author
from .common import WithProjectTest
class Test(WithProjectTest):
def test_info(self):
actual_info = self.project.info()
expected_info = {
'home_page': self.data['info']['home_page'],
'description': self.data['info']['description'],
'summary': self.data['info']['summary'],
'license': self.data['info']['license'],
'package_url': self.data['info']['package_url'],
'project_url': self.data['info']['project_url'],
'upstream': self.data['info']['project_urls']['Homepage'],
}
self.assertEqual(expected_info, actual_info)
def test_author(self):
info = self.data['info']
actual_author = author(info)
name = info['author'].encode('utf-8')
email = info['author_email'].encode('utf-8')
expected_author = {
'fullname': b'%s <%s>' % (name, email),
'name': name,
'email': email,
}
self.assertEqual(expected_author, actual_author)
def test_no_author(self):
actual_author = author({})
self.assertEqual(EMPTY_AUTHOR, actual_author)
def test_partial_author(self):
actual_author = author({'author': 'someone'})
expected_author = {
'name': b'someone',
'fullname': b'someone',
'email': None,
}
self.assertEqual(expected_author, actual_author)
class ParseAuthorTest(TestCase):
def test_author_basic(self):
data = {
'author': "i-am-groot",
'author_email': 'iam@groot.org',
}
actual_author = author(data)
expected_author = {
'fullname': b'i-am-groot <iam@groot.org>',
'name': b'i-am-groot',
'email': b'iam@groot.org',
}
- self.assertEquals(actual_author, expected_author)
+ self.assertEqual(actual_author, expected_author)
def test_author_malformed(self):
data = {
'author': "['pierre', 'paul', 'jacques']",
'author_email': None,
}
actual_author = author(data)
expected_author = {
'fullname': b"['pierre', 'paul', 'jacques']",
'name': b"['pierre', 'paul', 'jacques']",
'email': None,
}
- self.assertEquals(actual_author, expected_author)
+ self.assertEqual(actual_author, expected_author)
def test_author_malformed_2(self):
data = {
'author': '[marie, jeanne]',
'author_email': '[marie@some, jeanne@thing]',
}
actual_author = author(data)
expected_author = {
'fullname': b'[marie, jeanne] <[marie@some, jeanne@thing]>',
'name': b'[marie, jeanne]',
'email': b'[marie@some, jeanne@thing]',
}
- self.assertEquals(actual_author, expected_author)
+ self.assertEqual(actual_author, expected_author)
def test_author_malformed_3(self):
data = {
'author': '[marie, jeanne, pierre]',
'author_email': '[marie@somewhere.org, jeanne@somewhere.org]',
}
actual_author = author(data)
expected_author = {
'fullname': b'[marie, jeanne, pierre] <[marie@somewhere.org, jeanne@somewhere.org]>', # noqa
'name': b'[marie, jeanne, pierre]',
'email': b'[marie@somewhere.org, jeanne@somewhere.org]',
}
- self.assertEquals(actual_author, expected_author)
+ self.assertEqual(actual_author, expected_author)
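Taken together, the `author()` tests above pin down a normalization contract: PyPI's free-form `author`/`author_email` strings become a person dict of UTF-8 byte strings, with `fullname` built as `name <email>` when both parts are present. The sketch below is a hypothetical re-derivation of that behavior from the assertions alone (the real function ships with `swh.loader.pypi`; the `EMPTY_AUTHOR` constant is assumed from the `test_no_author` case):

```python
# Hypothetical re-derivation of author() from the test assertions above;
# not the actual swh.loader.pypi implementation.
EMPTY_AUTHOR = {'fullname': b'', 'name': None, 'email': None}


def author(info):
    """Normalize PyPI 'author'/'author_email' metadata into a person
    dict of UTF-8 byte strings."""
    name = info.get('author')
    email = info.get('author_email')
    if not name and not email:
        return EMPTY_AUTHOR
    name = name.encode('utf-8') if name else None
    email = email.encode('utf-8') if email else None
    if name and email:
        fullname = b'%s <%s>' % (name, email)
    else:
        # Only one part present: it alone becomes the fullname.
        fullname = name or email
    return {'fullname': fullname, 'name': name, 'email': email}
```

Note that the malformed-input tests expect no parsing of list-like strings such as `"['pierre', 'paul', 'jacques']"`: the value is passed through verbatim as bytes.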
diff --git a/swh/loader/pypi/tests/test_loader.py b/swh/loader/pypi/tests/test_loader.py
index 4849489..258c957 100644
--- a/swh/loader/pypi/tests/test_loader.py
+++ b/swh/loader/pypi/tests/test_loader.py
@@ -1,475 +1,477 @@
# Copyright (C) 2016-2018 The Software Heritage developers
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information
import json
-import shutil
import tempfile
-from nose.plugins.attrib import attr
+import pytest
from swh.loader.core.tests import BaseLoaderTest, LoaderNoStorage
from swh.loader.pypi.client import PyPIProject
from swh.loader.pypi.loader import PyPILoader
from swh.model import hashutil
from .common import RESOURCES_PATH, PyPIClientWithCache
class TestPyPILoader(LoaderNoStorage, PyPILoader):
"""Real PyPILoader for test purposes (storage and pypi interactions
inhibited)
"""
def __init__(self, project_name, json_filename=None):
if not json_filename: # defaulting to using same name as project
json_filename = '%s.json' % project_name
project_metadata_file = '%s/%s' % (RESOURCES_PATH, json_filename)
project_metadata_url = 'https://pypi.org/pypi/%s/json' % project_name
with open(project_metadata_file) as f:
data = json.load(f)
self.temp_dir = tempfile.mkdtemp(
dir='/tmp/', prefix='swh.loader.pypi.tests-')
# Will use the pypi with cache
client = PyPIClientWithCache(
temp_directory=self.temp_dir, cache_dir=RESOURCES_PATH)
super().__init__(client=client)
self.project = PyPIProject(
client=client,
project=project_name,
project_metadata_url=project_metadata_url,
data=data)
def prepare(self, project_name, origin_url,
origin_metadata_url=None):
self.project_name = project_name
self.origin_url = origin_url
self.origin_metadata_url = origin_metadata_url
self.visit = 1 # first visit
self._prepare_state()
-@attr('fs')
+@pytest.mark.fs
class PyPIBaseLoaderTest(BaseLoaderTest):
"""Loader Test Mixin to prepare the pypi to 'load' in a test context.
In this setup, the loader uses the cache to load data so no
network interaction (no storage, no pypi).
"""
def setUp(self, project_name='0805nexter',
dummy_pypi_instance='https://dummy.org'):
self.tmp_root_path = tempfile.mkdtemp(
dir='/tmp', prefix='swh.loader.pypi.tests-')
self._project = project_name
self._origin_url = '%s/pypi/%s/' % (dummy_pypi_instance, project_name)
self._project_metadata_url = '%s/pypi/%s/json' % (
dummy_pypi_instance, project_name)
class PyPILoaderNoSnapshot(TestPyPILoader):
"""Same as TestPyPILoader with no prior snapshot seen
"""
def _last_snapshot(self):
return None
class LoaderITest(PyPIBaseLoaderTest):
def setUp(self, project_name='0805nexter',
dummy_pypi_instance='https://dummy.org'):
super().setUp(project_name, dummy_pypi_instance)
self.loader = PyPILoaderNoSnapshot(project_name=project_name)
def test_load(self):
"""Load a pypi origin
"""
# when
self.loader.load(
self._project, self._origin_url, self._project_metadata_url)
# then
self.assertCountContents(
6, '3 contents per release artifact files (2)')
self.assertCountDirectories(4)
self.assertCountRevisions(
2, '2 releases so 2 revisions should be created')
self.assertCountReleases(0, 'No release is created in the pypi loader')
self.assertCountSnapshots(1, 'Only 1 snapshot targeting all revisions')
expected_contents = [
'a61e24cdfdab3bb7817f6be85d37a3e666b34566',
'938c33483285fd8ad57f15497f538320df82aeb8',
'a27576d60e08c94a05006d2e6d540c0fdb5f38c8',
'405859113963cb7a797642b45f171d6360425d16',
'e5686aa568fdb1d19d7f1329267082fe40482d31',
'83ecf6ec1114fd260ca7a833a2d165e71258c338',
]
self.assertContentsOk(expected_contents)
expected_directories = [
'05219ba38bc542d4345d5638af1ed56c7d43ca7d',
'cf019eb456cf6f78d8c4674596f1c9a97ece8f44',
'b178b66bd22383d5f16f4f5c923d39ca798861b4',
'c3a58f8b57433a4b56caaa5033ae2e0931405338',
]
self.assertDirectoriesOk(expected_directories)
# {revision hash: directory hash}
expected_revisions = {
'4c99891f93b81450385777235a37b5e966dd1571': '05219ba38bc542d4345d5638af1ed56c7d43ca7d', # noqa
'e445da4da22b31bfebb6ffc4383dbf839a074d21': 'b178b66bd22383d5f16f4f5c923d39ca798861b4', # noqa
}
self.assertRevisionsOk(expected_revisions)
expected_branches = {
'releases/1.1.0': {
'target': '4c99891f93b81450385777235a37b5e966dd1571',
'target_type': 'revision',
},
'releases/1.2.0': {
'target': 'e445da4da22b31bfebb6ffc4383dbf839a074d21',
'target_type': 'revision',
},
'HEAD': {
'target': 'releases/1.2.0',
'target_type': 'alias',
},
}
self.assertSnapshotOk('ba6e158ada75d0b3cfb209ffdf6daa4ed34a227a',
expected_branches)
self.assertEqual(self.loader.load_status(), {'status': 'eventful'})
self.assertEqual(self.loader.visit_status(), 'full')
class PyPILoaderWithSnapshot(TestPyPILoader):
"""This loader provides a snapshot and lists corresponding seen
release artifacts.
"""
def _last_snapshot(self):
"""Return last visited snapshot"""
return {
'id': b'\xban\x15\x8a\xdau\xd0\xb3\xcf\xb2\t\xff\xdfm\xaaN\xd3J"z', # noqa
'branches': {
b'releases/1.1.0': {
'target': b'L\x99\x89\x1f\x93\xb8\x14P'
b'8Ww#Z7\xb5\xe9f\xdd\x15q',
'target_type': 'revision'
},
b'releases/1.2.0': {
'target': b'\xe4E\xdaM\xa2+1\xbf'
b'\xeb\xb6\xff\xc48=\xbf\x83'
b'\x9a\x07M!',
'target_type': 'revision'
},
b'HEAD': {
'target': b'releases/1.2.0',
'target_type': 'alias'
},
},
}
def _known_artifacts(self, last_snapshot):
"""List corresponding seen release artifacts"""
return {
(
'0805nexter-1.1.0.zip',
'52cd128ad3afe539478abc7440d4b043384295fbe6b0958a237cb6d926465035' # noqa
): b'L\x99\x89\x1f\x93\xb8\x14P8Ww#Z7\xb5\xe9f\xdd\x15q',
(
'0805nexter-1.2.0.zip',
'49785c6ae39ea511b3c253d7621c0b1b6228be2f965aca8a491e6b84126d0709' # noqa
): b'\xe4E\xdaM\xa2+1\xbf\xeb\xb6\xff\xc48=\xbf\x83\x9a\x07M!',
}
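The `_known_artifacts` mapping above keys each previously seen release artifact by its `(filename, sha256)` pair and maps it to the revision it produced. A minimal sketch of how an incremental visit can use such a mapping to skip already-loaded artifacts (the helper name and artifact dict shape here are illustrative assumptions, not the loader's actual API):

```python
def filter_unseen_artifacts(release_artifacts, known_artifacts):
    """Yield only artifacts whose (filename, sha256) key was not seen
    in a prior visit; keys already present map to existing revisions
    and need no re-download."""
    for artifact in release_artifacts:
        key = (artifact['filename'], artifact['sha256'])
        if key not in known_artifacts:
            yield artifact
```

With an unchanged project, every artifact key is already known, nothing is yielded, and the visit can reuse the previous snapshot unchanged, which is exactly what `LoaderNoNewChangesSinceLastVisitITest` below asserts.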
class LoaderNoNewChangesSinceLastVisitITest(PyPIBaseLoaderTest):
"""This scenario makes use of the incremental nature of the loader.
If nothing changes in between visits, the snapshot for the visit
must stay the same as the first visit.
"""
def setUp(self, project_name='0805nexter',
dummy_pypi_instance='https://dummy.org'):
super().setUp(project_name, dummy_pypi_instance)
self.loader = PyPILoaderWithSnapshot(project_name=project_name)
def test_load(self):
"""Load a PyPI origin without new changes results in 1 same snapshot
"""
# when
self.loader.load(
self._project, self._origin_url, self._project_metadata_url)
# then
self.assertCountContents(0)
self.assertCountDirectories(0)
self.assertCountRevisions(0)
self.assertCountReleases(0)
self.assertCountSnapshots(1)
self.assertContentsOk([])
self.assertDirectoriesOk([])
self.assertRevisionsOk(expected_revisions={})
expected_snapshot_id = 'ba6e158ada75d0b3cfb209ffdf6daa4ed34a227a'
expected_branches = {
'releases/1.1.0': {
'target': '4c99891f93b81450385777235a37b5e966dd1571',
'target_type': 'revision',
},
'releases/1.2.0': {
'target': 'e445da4da22b31bfebb6ffc4383dbf839a074d21',
'target_type': 'revision',
},
'HEAD': {
'target': 'releases/1.2.0',
'target_type': 'alias',
},
}
self.assertSnapshotOk(expected_snapshot_id, expected_branches)
_id = hashutil.hash_to_hex(self.loader._last_snapshot()['id'])
- self.assertEquals(expected_snapshot_id, _id)
+ self.assertEqual(expected_snapshot_id, _id)
self.assertEqual(self.loader.load_status(), {'status': 'uneventful'})
self.assertEqual(self.loader.visit_status(), 'full')
class LoaderNewChangesSinceLastVisitITest(PyPIBaseLoaderTest):
"""In this scenario, a visit has already taken place.
An existing snapshot exists.
This time, the PyPI project has changed, a new release (with 1 new
release artifact) has been uploaded. The old releases did not
change.
The visit results in a new snapshot.
The new snapshot shares the same history as prior visit's
snapshot. It holds a new branch targeting the new revision.
"""
def setUp(self, project_name='0805nexter',
dummy_pypi_instance='https://dummy.org'):
super().setUp(project_name, dummy_pypi_instance)
self.loader = PyPILoaderWithSnapshot(
project_name=project_name,
json_filename='0805nexter+new-made-up-release.json')
def test_load(self):
"""Load a PyPI origin with changes results in 1 new snapshot
"""
# when
self.loader.load(
self._project, self._origin_url, self._project_metadata_url)
# then
- self.assertCountContents(4,
- "3 + 1 new content (only change between 1.2.0 and 1.3.0 archives)")
+ self.assertCountContents(
+ 4, ("3 + 1 new content (only change between "
+ "1.2.0 and 1.3.0 archives)"))
self.assertCountDirectories(2)
self.assertCountRevisions(
1, "1 new revision targeting that new directory id")
self.assertCountReleases(0)
self.assertCountSnapshots(1)
expected_contents = [
'92689fa2b7fb4d4fc6fb195bf73a50c87c030639', # new one
'405859113963cb7a797642b45f171d6360425d16',
'83ecf6ec1114fd260ca7a833a2d165e71258c338',
'e5686aa568fdb1d19d7f1329267082fe40482d31',
]
self.assertContentsOk(expected_contents)
expected_directories = [
'e226e7e4ad03b4fc1403d69a18ebdd6f2edd2b3a',
'52604d46843b898f5a43208045d09fcf8731631b',
]
self.assertDirectoriesOk(expected_directories)
expected_revisions = {
'fb46e49605b0bbe69f8c53d315e89370e7c6cb5d': 'e226e7e4ad03b4fc1403d69a18ebdd6f2edd2b3a', # noqa
}
self.assertRevisionsOk(expected_revisions)
old_revisions = {
'4c99891f93b81450385777235a37b5e966dd1571': '05219ba38bc542d4345d5638af1ed56c7d43ca7d', # noqa
'e445da4da22b31bfebb6ffc4383dbf839a074d21': 'b178b66bd22383d5f16f4f5c923d39ca798861b4', # noqa
}
for rev, dir_id in old_revisions.items():
expected_revisions[rev] = dir_id
expected_snapshot_id = '07322209e51618410b5e43ca4af7e04fe5113c9d'
expected_branches = {
'releases/1.1.0': {
'target': '4c99891f93b81450385777235a37b5e966dd1571',
'target_type': 'revision',
},
'releases/1.2.0': {
'target': 'e445da4da22b31bfebb6ffc4383dbf839a074d21',
'target_type': 'revision',
},
'releases/1.3.0': {
'target': 'fb46e49605b0bbe69f8c53d315e89370e7c6cb5d',
'target_type': 'revision',
},
'HEAD': {
'target': 'releases/1.3.0',
'target_type': 'alias',
},
}
self.assertSnapshotOk(expected_snapshot_id, expected_branches)
_id = hashutil.hash_to_hex(self.loader._last_snapshot()['id'])
self.assertNotEqual(expected_snapshot_id, _id)
self.assertEqual(self.loader.load_status(), {'status': 'eventful'})
self.assertEqual(self.loader.visit_status(), 'full')
class PyPILoaderWithSnapshot2(TestPyPILoader):
"""This loader provides a snapshot and lists corresponding seen
release artifacts.
"""
def _last_snapshot(self):
"""Return last visited snapshot"""
return {
'id': b'\x072"\t\xe5\x16\x18A\x0b^C\xcaJ\xf7\xe0O\xe5\x11<\x9d', # noqa
'branches': {
b'releases/1.1.0': {
'target': b'L\x99\x89\x1f\x93\xb8\x14P8Ww#Z7\xb5\xe9f\xdd\x15q', # noqa
'target_type': 'revision'
},
b'releases/1.2.0': {
'target': b'\xe4E\xdaM\xa2+1\xbf\xeb\xb6\xff\xc48=\xbf\x83\x9a\x07M!', # noqa
'target_type': 'revision'
},
b'releases/1.3.0': {
'target': b'\xfbF\xe4\x96\x05\xb0\xbb\xe6\x9f\x8cS\xd3\x15\xe8\x93p\xe7\xc6\xcb]', # noqa
'target_type': 'revision'
},
b'HEAD': {
'target': b'releases/1.3.0', # noqa
'target_type': 'alias'
},
}
}
def _known_artifacts(self, last_snapshot):
"""Map previously seen release artifacts to their revision"""
return {
(
'0805nexter-1.1.0.zip',
'52cd128ad3afe539478abc7440d4b043384295fbe6b0958a237cb6d926465035' # noqa
): b'L\x99\x89\x1f\x93\xb8\x14P8Ww#Z7\xb5\xe9f\xdd\x15q',
(
'0805nexter-1.2.0.zip',
'49785c6ae39ea511b3c253d7621c0b1b6228be2f965aca8a491e6b84126d0709' # noqa
): b'\xe4E\xdaM\xa2+1\xbf\xeb\xb6\xff\xc48=\xbf\x83\x9a\x07M!',
(
'0805nexter-1.3.0.zip',
'7097c49fb8ec24a7aaab54c3dbfbb5a6ca1431419d9ee0f6c363d9ad01d2b8b1' # noqa
): b'\xfbF\xe4\x96\x05\xb0\xbb\xe6\x9f\x8cS\xd3\x15\xe8\x93p\xe7\xc6\xcb]', # noqa
}
class LoaderChangesOldReleaseArtifactRemovedSinceLastVisit(PyPIBaseLoaderTest):
"""In this scenario, a visit has already taken place. An existing
snapshot exists.
The PyPI project has changed:
- a new release has been uploaded
- an older one has been removed
The visit should result in a new snapshot. Such snapshot shares some of
the same branches as prior visit (but not all):
- new release artifact branch exists
- old release artifact branch has been removed
- the other unchanged release artifact branches are left unchanged
"""
def setUp(self, project_name='0805nexter',
dummy_pypi_instance='https://dummy.org'):
super().setUp(project_name, dummy_pypi_instance)
self.loader = PyPILoaderWithSnapshot2(
project_name=project_name,
json_filename='0805nexter-unpublished-release.json')
def test_load(self):
"""Load PyPI origin with removed artifact + changes ~> 1 new snapshot
"""
# when
self.loader.load(
self._project, self._origin_url, self._project_metadata_url)
# then
- self.assertCountContents(4,
- "3 + 1 new content (only change between 1.3.0 and 1.4.0 archives)")
+ self.assertCountContents(
+ 4, ("3 + 1 new content (only change between "
+ "1.3.0 and 1.4.0 archives)"))
self.assertCountDirectories(2)
- self.assertCountRevisions(1,
- "This results in 1 new revision targeting that new directory id")
+ self.assertCountRevisions(
+ 1, ("This results in 1 new revision targeting "
+ "that new directory id"))
self.assertCountReleases(0)
self.assertCountSnapshots(1)
expected_contents = [
'e2d68a197e3a3ad0fc6de28749077892c2148043', # new one
'405859113963cb7a797642b45f171d6360425d16',
'83ecf6ec1114fd260ca7a833a2d165e71258c338',
'e5686aa568fdb1d19d7f1329267082fe40482d31',
]
self.assertContentsOk(expected_contents)
expected_directories = [
'a2b7621f3e52eb3632657f6e3436bd08202db56f', # new one
'770e21215ecac53cea331d8ea4dc0ffc9d979367',
]
self.assertDirectoriesOk(expected_directories)
expected_revisions = {
# 1.4.0
'5e91875f096ac48c98d74acf307439a3490f2827': '770e21215ecac53cea331d8ea4dc0ffc9d979367', # noqa
}
self.assertRevisionsOk(expected_revisions)
expected_snapshot_id = 'bb0b0c29040678eadb6dae9e43e496cc860123e4'
expected_branches = {
'releases/1.2.0': {
'target': 'e445da4da22b31bfebb6ffc4383dbf839a074d21',
'target_type': 'revision',
},
'releases/1.3.0': {
'target': 'fb46e49605b0bbe69f8c53d315e89370e7c6cb5d',
'target_type': 'revision',
},
'releases/1.4.0': {
'target': '5e91875f096ac48c98d74acf307439a3490f2827',
'target_type': 'revision',
},
'HEAD': {
'target': 'releases/1.4.0',
'target_type': 'alias',
},
}
self.assertSnapshotOk(expected_snapshot_id, expected_branches)
_id = hashutil.hash_to_hex(self.loader._last_snapshot()['id'])
self.assertNotEqual(expected_snapshot_id, _id)
self.assertEqual(self.loader.load_status(), {'status': 'eventful'})
self.assertEqual(self.loader.visit_status(), 'full')
diff --git a/tox.ini b/tox.ini
new file mode 100644
index 0000000..0fb07c6
--- /dev/null
+++ b/tox.ini
@@ -0,0 +1,16 @@
+[tox]
+envlist=flake8,py3
+
+[testenv:py3]
+deps =
+ .[testing]
+ pytest-cov
+commands =
+ pytest --cov=swh --cov-branch {posargs}
+
+[testenv:flake8]
+skip_install = true
+deps =
+ flake8
+commands =
+ {envpython} -m flake8
diff --git a/version.txt b/version.txt
index 906bd94..dadde7e 100644
--- a/version.txt
+++ b/version.txt
@@ -1 +1 @@
-v0.0.4-0-gc993a89
\ No newline at end of file
+v0.0.5-0-gedb3d4e
\ No newline at end of file
