diff --git a/PKG-INFO b/PKG-INFO index 3a27ae1..b5abc3b 100644 --- a/PKG-INFO +++ b/PKG-INFO @@ -1,28 +1,28 @@ Metadata-Version: 2.1 Name: swh.core -Version: 0.0.55 +Version: 0.0.56 Summary: Software Heritage core utilities Home-page: https://forge.softwareheritage.org/diffusion/DCORE/ Author: Software Heritage developers Author-email: swh-devel@inria.fr License: UNKNOWN Project-URL: Source, https://forge.softwareheritage.org/source/swh-core -Project-URL: Funding, https://www.softwareheritage.org/donate Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest +Project-URL: Funding, https://www.softwareheritage.org/donate Description: swh-core ======== core library for swh's modules: - config parser - hash computations - serialization - logging mechanism Platform: UNKNOWN Classifier: Programming Language :: Python :: 3 Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3) Classifier: Operating System :: OS Independent Classifier: Development Status :: 5 - Production/Stable Description-Content-Type: text/markdown Provides-Extra: testing diff --git a/debian/changelog b/debian/changelog index b320f50..3ec2ca9 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,412 +1,414 @@ -swh-core (0.0.55-1~swh1~bpo9+1) stretch-swh; urgency=medium +swh-core (0.0.56-1~swh1) unstable-swh; urgency=medium - * Rebuild for stretch-swh + * New upstream release 0.0.56 - (tagged by David Douard + on 2019-03-19 10:17:06 +0100) + * Upstream changes: - v0.0.56 - -- Software Heritage autobuilder (on jenkins-debian1) Tue, 19 Feb 2019 11:33:45 +0000 + -- Software Heritage autobuilder (on jenkins-debian1) Tue, 19 Mar 2019 09:27:18 +0000 swh-core (0.0.55-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.55 - (tagged by Antoine R. 
Dumont (@ardumont) on 2019-02-19 12:28:26 +0100) * Upstream changes: - v0.0.55 - Fix runtime dependencies -- Software Heritage autobuilder (on jenkins-debian1) Tue, 19 Feb 2019 11:32:28 +0000 swh-core (0.0.54-1~swh2) unstable-swh; urgency=medium * New upstream release 0.0.54 * Upstream changes: - Add missing build dependencies -- Antoine R. Dumont (@ardumont) Tue, 12 Feb 2019 16:25:34 +0000 swh-core (0.0.54-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.54 - (tagged by Valentin Lorentz on 2019-02-11 16:47:18 +0100) * Upstream changes: - Add test for BaseDb.connect. -- Software Heritage autobuilder (on jenkins-debian1) Tue, 12 Feb 2019 12:37:43 +0000 swh-core (0.0.53-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.53 - (tagged by Antoine R. Dumont (@ardumont) on 2019-02-08 09:09:30 +0100) * Upstream changes: - v0.0.53 - Fix debian build -- Software Heritage autobuilder (on jenkins-debian1) Fri, 08 Feb 2019 08:12:31 +0000 swh-core (0.0.52-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.52 - (tagged by David Douard on 2019-02-06 15:24:04 +0100) * Upstream changes: - v0.0.52 -- Software Heritage autobuilder (on jenkins-debian1) Wed, 06 Feb 2019 14:27:14 +0000 swh-core (0.0.51-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.51 - (tagged by David Douard on 2019-02-01 14:28:27 +0100) * Upstream changes: - v0.0.51 -- Software Heritage autobuilder (on jenkins-debian1) Fri, 01 Feb 2019 13:31:45 +0000 swh-core (0.0.50-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.50 - (tagged by Nicolas Dandrimont on 2019-01-09 15:50:58 +0100) * Upstream changes: - Release swh.core v0.0.50 - Add statsd client module - Log used config files -- Software Heritage autobuilder (on jenkins-debian1) Wed, 09 Jan 2019 14:54:37 +0000 swh-core (0.0.49-1~swh1) unstable-swh; urgency=medium * Make DbTestFixture.setUp() accept and pass *args and **kwargs. 
-- Software Heritage autobuilder (on jenkins-debian1) Tue, 08 Jan 2019 16:38:02 +0000 swh-core (0.0.48-1~swh1) unstable-swh; urgency=medium * v0.0.48 * swh.core.cli: Update swh-db-init to make it idemtpotent -- Antoine R. Dumont (@ardumont) Tue, 08 Jan 2019 15:33:15 +0000 swh-core (0.0.47-1~swh1) unstable-swh; urgency=medium * v0.0.47 * swh.core.cli: Fix flag -- Antoine R. Dumont (@ardumont) Tue, 08 Jan 2019 15:16:09 +0000 swh-core (0.0.46-1~swh1) unstable-swh; urgency=medium * v0.0.46 * utils.grouper: Improve implementation * Remove now-obsolete information about swh.core.worker -- Antoine R. Dumont (@ardumont) Tue, 08 Jan 2019 14:37:34 +0000 swh-core (0.0.45-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.45 * Compatibility with recent msgpack * Debian packaging-related cleanups -- Nicolas Dandrimont Thu, 22 Nov 2018 21:09:53 +0100 swh-core (0.0.44-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.44 * Refactor the database testing fixtures * Stop unsafe serialization/deserialization constructs * Update tests to use nose -- Nicolas Dandrimont Thu, 18 Oct 2018 18:20:12 +0200 swh-core (0.0.43-1~swh1) unstable-swh; urgency=medium * v0.0.43 * Fix missing dependency declaration -- Antoine R. Dumont (@ardumont) Thu, 11 Oct 2018 15:47:06 +0200 swh-core (0.0.42-1~swh1) unstable-swh; urgency=medium * v0.0.42 * Fix missing dependency declaration -- Antoine R. Dumont (@ardumont) Thu, 11 Oct 2018 15:45:25 +0200 swh-core (0.0.41-1~swh1) unstable-swh; urgency=medium * Add functions to generate HTTP API clients and servers from databases. * Summary: This moves the interesting parts of D505 into the core, so other components can use them as well. 
* Test Plan: `make test` * Reviewers: ardumont, seirl, #reviewers * Reviewed By: ardumont, #reviewers * Subscribers: douardda * Differential Revision: https://forge.softwareheritage.org/D507 -- Valentin Lorentz Thu, 11 Oct 2018 10:57:27 +0200 swh-core (0.0.40-1~swh1) unstable-swh; urgency=medium * v0.0.40 * swh.core.api.SWHRemoteAPI: Permit to set a query timeout option -- Antoine R. Dumont (@ardumont) Thu, 24 May 2018 12:10:03 +0200 swh-core (0.0.39-1~swh1) unstable-swh; urgency=medium * v0.0.39 * package: Add missing runtime dependency -- Antoine R. Dumont (@ardumont) Thu, 26 Apr 2018 15:24:22 +0200 swh-core (0.0.38-1~swh1) unstable-swh; urgency=medium * v0.0.38 * tests: Use more reasonable psql options for db restores * swh.core.serializers: Add custom types serialization -- Antoine R. Dumont (@ardumont) Thu, 26 Apr 2018 15:15:27 +0200 swh-core (0.0.37-1~swh1) unstable-swh; urgency=medium * v0.0.37 * Move test fixture in swh.core.tests.server_testing module -- Antoine R. Dumont (@ardumont) Wed, 25 Apr 2018 15:00:02 +0200 swh-core (0.0.36-1~swh1) unstable-swh; urgency=medium * v0.0.36 * Migrate swh.loader.tar.tarball module in swh.core -- Antoine R. 
Dumont (@ardumont) Wed, 06 Dec 2017 12:03:29 +0100 swh-core (0.0.35-1~swh1) unstable-swh; urgency=medium * Release swh.core version 0.0.35 * Update packaging runes -- Nicolas Dandrimont Thu, 12 Oct 2017 18:07:50 +0200 swh-core (0.0.34-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.34 * New modular database test fixture -- Nicolas Dandrimont Mon, 07 Aug 2017 18:29:48 +0200 swh-core (0.0.33-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.33 * Be more conservative with remote API responses -- Nicolas Dandrimont Mon, 19 Jun 2017 19:01:38 +0200 swh-core (0.0.32-1~swh1) unstable-swh; urgency=medium * Release swh-core v0.0.32 * Add asynchronous streaming methods for internal APIs * Remove task arguments from systemd-journal loggers -- Nicolas Dandrimont Tue, 09 May 2017 14:04:22 +0200 swh-core (0.0.31-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.31 * Add explicit dependency on python3-systemd -- Nicolas Dandrimont Fri, 07 Apr 2017 15:11:26 +0200 swh-core (0.0.30-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.30 * drop swh.core.hashutil (moved to swh.model.hashutil) * add a systemd logger -- Nicolas Dandrimont Fri, 07 Apr 2017 11:49:15 +0200 swh-core (0.0.29-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.29 * Catch proper exception in the base API client -- Nicolas Dandrimont Thu, 02 Feb 2017 00:19:25 +0100 swh-core (0.0.28-1~swh1) unstable-swh; urgency=medium * v0.0.28 * Refactoring some common code into swh.core -- Antoine R. Dumont (@ardumont) Thu, 26 Jan 2017 14:54:22 +0100 swh-core (0.0.27-1~swh1) unstable-swh; urgency=medium * v0.0.27 * Fix issue with default boolean value -- Antoine R. 
Dumont (@ardumont) Thu, 20 Oct 2016 16:15:20 +0200 swh-core (0.0.26-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.26 * Raise an exception when a configuration file exists and is unreadable -- Nicolas Dandrimont Wed, 12 Oct 2016 10:16:09 +0200 swh-core (0.0.25-1~swh1) unstable-swh; urgency=medium * v0.0.25 * Add new function utils.cwd -- Antoine R. Dumont (@ardumont) Thu, 29 Sep 2016 21:29:37 +0200 swh-core (0.0.24-1~swh1) unstable-swh; urgency=medium * v0.0.24 * Deal with edge case in logger regarding json -- Antoine R. Dumont (@ardumont) Thu, 22 Sep 2016 12:21:09 +0200 swh-core (0.0.23-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.23 * Properly fix the PyYAML dependency -- Nicolas Dandrimont Tue, 23 Aug 2016 16:20:29 +0200 swh-core (0.0.22-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.22 * Proper loading of yaml and ini files in all paths -- Nicolas Dandrimont Fri, 19 Aug 2016 15:45:55 +0200 swh-core (0.0.21-1~swh1) unstable-swh; urgency=medium * v0.0.21 * Update test tools -- Antoine R. Dumont (@ardumont) Tue, 19 Jul 2016 14:47:01 +0200 swh-core (0.0.20-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.20 * Add some generic bytes <-> escaped unicode methods -- Nicolas Dandrimont Tue, 14 Jun 2016 16:54:41 +0200 swh-core (0.0.19-1~swh1) unstable-swh; urgency=medium * v0.0.19 * Resurrect swh.core.utils -- Antoine R. Dumont (@ardumont) Fri, 15 Apr 2016 12:40:43 +0200 swh-core (0.0.18-1~swh1) unstable-swh; urgency=medium * v0.0.18 * Add swh.core.utils * serializers: support UUIDs all around -- Antoine R. 
Dumont (@ardumont) Sat, 26 Mar 2016 11:16:33 +0100 swh-core (0.0.17-1~swh1) unstable-swh; urgency=medium * Release swh.core v0.0.17 * Allow serialization of UUIDs -- Nicolas Dandrimont Fri, 04 Mar 2016 11:40:56 +0100 swh-core (0.0.16-1~swh1) unstable-swh; urgency=medium * Release swh.core version 0.0.16 * add bytehex_to_hash and hash_to_bytehex in hashutil * move scheduling utilities to swh.scheduler -- Nicolas Dandrimont Fri, 19 Feb 2016 18:12:10 +0100 swh-core (0.0.15-1~swh1) unstable-swh; urgency=medium * Release v0.0.15 * Add hashutil.hash_git_object -- Nicolas Dandrimont Wed, 16 Dec 2015 16:31:26 +0100 swh-core (0.0.14-1~swh1) unstable-swh; urgency=medium * v0.0.14 * Add simple README * Update license * swh.core.hashutil.hashfile can now deal with filepath as bytes -- Antoine R. Dumont (@ardumont) Fri, 23 Oct 2015 11:13:14 +0200 swh-core (0.0.13-1~swh1) unstable-swh; urgency=medium * Prepare deployment of swh.core v0.0.13 -- Nicolas Dandrimont Fri, 09 Oct 2015 17:32:49 +0200 swh-core (0.0.12-1~swh1) unstable-swh; urgency=medium * Prepare deployment of swh.core v0.0.12 -- Nicolas Dandrimont Tue, 06 Oct 2015 17:34:34 +0200 swh-core (0.0.11-1~swh1) unstable-swh; urgency=medium * Prepare deployment of swh.core v0.0.11 -- Nicolas Dandrimont Sat, 03 Oct 2015 15:57:03 +0200 swh-core (0.0.10-1~swh1) unstable-swh; urgency=medium * Prepare deploying swh.core v0.0.10 -- Nicolas Dandrimont Sat, 03 Oct 2015 12:28:52 +0200 swh-core (0.0.9-1~swh1) unstable-swh; urgency=medium * Prepare deploying swh.core v0.0.9 -- Nicolas Dandrimont Sat, 03 Oct 2015 11:36:55 +0200 swh-core (0.0.8-1~swh1) unstable-swh; urgency=medium * Prepare deployment of swh.core v0.0.8 -- Nicolas Dandrimont Thu, 01 Oct 2015 12:31:44 +0200 swh-core (0.0.7-1~swh1) unstable-swh; urgency=medium * Prepare deployment of swh.core v0.0.7 -- Nicolas Dandrimont Thu, 01 Oct 2015 11:29:04 +0200 swh-core (0.0.6-1~swh1) unstable-swh; urgency=medium * Prepare deployment of swh.core v0.0.6 -- Nicolas Dandrimont Tue, 29 
Sep 2015 16:48:44 +0200 swh-core (0.0.5-1~swh1) unstable-swh; urgency=medium * Prepare v0.0.5 deployment -- Nicolas Dandrimont Tue, 29 Sep 2015 16:08:32 +0200 swh-core (0.0.4-1~swh1) unstable-swh; urgency=medium * Tagging swh.core 0.0.4 -- Nicolas Dandrimont Fri, 25 Sep 2015 15:41:26 +0200 swh-core (0.0.3-1~swh1) unstable-swh; urgency=medium * Tag swh.core v0.0.3 -- Nicolas Dandrimont Fri, 25 Sep 2015 11:07:10 +0200 swh-core (0.0.2-1~swh1) unstable-swh; urgency=medium * Deploy v0.0.2 -- Nicolas Dandrimont Wed, 23 Sep 2015 12:08:50 +0200 swh-core (0.0.1-1~swh1) unstable-swh; urgency=medium * Initial release * Tag v0.0.1 for deployment -- Nicolas Dandrimont Tue, 22 Sep 2015 14:52:26 +0200 diff --git a/swh.core.egg-info/PKG-INFO b/swh.core.egg-info/PKG-INFO index 3a27ae1..b5abc3b 100644 --- a/swh.core.egg-info/PKG-INFO +++ b/swh.core.egg-info/PKG-INFO @@ -1,28 +1,28 @@ Metadata-Version: 2.1 Name: swh.core -Version: 0.0.55 +Version: 0.0.56 Summary: Software Heritage core utilities Home-page: https://forge.softwareheritage.org/diffusion/DCORE/ Author: Software Heritage developers Author-email: swh-devel@inria.fr License: UNKNOWN Project-URL: Source, https://forge.softwareheritage.org/source/swh-core -Project-URL: Funding, https://www.softwareheritage.org/donate Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest +Project-URL: Funding, https://www.softwareheritage.org/donate Description: swh-core ======== core library for swh's modules: - config parser - hash computations - serialization - logging mechanism Platform: UNKNOWN Classifier: Programming Language :: Python :: 3 Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3) Classifier: Operating System :: OS Independent Classifier: Development Status :: 5 - Production/Stable Description-Content-Type: text/markdown Provides-Extra: testing diff --git a/swh.core.egg-info/SOURCES.txt b/swh.core.egg-info/SOURCES.txt index a635254..08bd88a 
100644 --- a/swh.core.egg-info/SOURCES.txt +++ b/swh.core.egg-info/SOURCES.txt @@ -1,40 +1,41 @@ MANIFEST.in Makefile README.md requirements-swh.txt requirements.txt setup.py version.txt swh/__init__.py swh.core.egg-info/PKG-INFO swh.core.egg-info/SOURCES.txt swh.core.egg-info/dependency_links.txt swh.core.egg-info/entry_points.txt swh.core.egg-info/requires.txt swh.core.egg-info/top_level.txt swh/core/__init__.py swh/core/api_async.py swh/core/cli.py swh/core/config.py swh/core/logger.py swh/core/statsd.py swh/core/tarball.py swh/core/utils.py swh/core/api/__init__.py swh/core/api/asynchronous.py -swh/core/api/negotiate.py +swh/core/api/negotiation.py swh/core/api/serializers.py swh/core/db/__init__.py swh/core/db/common.py swh/core/db/db_utils.py swh/core/sql/log-schema.sql swh/core/tests/__init__.py +swh/core/tests/conftest.py swh/core/tests/db_testing.py swh/core/tests/server_testing.py swh/core/tests/test_api.py swh/core/tests/test_config.py swh/core/tests/test_db.py swh/core/tests/test_logger.py swh/core/tests/test_serializers.py swh/core/tests/test_statsd.py swh/core/tests/test_utils.py \ No newline at end of file diff --git a/swh/core/api/__init__.py b/swh/core/api/__init__.py index a62316a..7526e96 100644 --- a/swh/core/api/__init__.py +++ b/swh/core/api/__init__.py @@ -1,309 +1,323 @@ # Copyright (C) 2015-2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information import collections import functools import inspect import json import logging import pickle import requests import datetime from flask import Flask, Request, Response, request, abort from .serializers import (decode_response, encode_data_client as encode_data, msgpack_dumps, msgpack_loads, SWHJSONDecoder) -from .negotiate import (Formatter as FormatterBase, - Negotiator as NegotiatorBase, - negotiate as _negotiate) +from 
.negotiation import (Formatter as FormatterBase, + Negotiator as NegotiatorBase, + negotiate as _negotiate) logger = logging.getLogger(__name__) # support for content negotation class Negotiator(NegotiatorBase): def best_mimetype(self): return request.accept_mimetypes.best_match( - self.accept_mimetypes, 'text/html') + self.accept_mimetypes, 'application/json') def _abort(self, status_code, err=None): return abort(status_code, err) def negotiate(formatter_cls, *args, **kwargs): return _negotiate(Negotiator, formatter_cls, *args, **kwargs) class Formatter(FormatterBase): def _make_response(self, body, content_type): return Response(body, content_type=content_type) class SWHJSONEncoder(json.JSONEncoder): def default(self, obj): if isinstance(obj, (datetime.datetime, datetime.date)): return obj.isoformat() if isinstance(obj, datetime.timedelta): return str(obj) # Let the base class default method raise the TypeError return super().default(obj) class JSONFormatter(Formatter): format = 'json' mimetypes = ['application/json'] def render(self, obj): return json.dumps(obj, cls=SWHJSONEncoder) class MsgpackFormatter(Formatter): format = 'msgpack' mimetypes = ['application/x-msgpack'] def render(self, obj): return msgpack_dumps(obj) # base API classes class RemoteException(Exception): pass def remote_api_endpoint(path): def dec(f): f._endpoint_path = path return f return dec +class APIError(Exception): + """API Error""" + def __str__(self): + return ('An unexpected error occurred in the backend: {}' + .format(self.args)) + + class MetaSWHRemoteAPI(type): """Metaclass for SWHRemoteAPI, which adds a method for each endpoint of the database it is designed to access. See for example :class:`swh.indexer.storage.api.client.RemoteStorage`""" def __new__(cls, name, bases, attributes): # For each method wrapped with @remote_api_endpoint in an API backend # (eg. :class:`swh.indexer.storage.IndexerStorage`), add a new # method in RemoteStorage, with the same documentation. 
# # Note that, despite the usage of decorator magic (eg. functools.wrap), # this never actually calls an IndexerStorage method. backend_class = attributes.get('backend_class', None) for base in bases: if backend_class is not None: break backend_class = getattr(base, 'backend_class', None) if backend_class: for (meth_name, meth) in backend_class.__dict__.items(): if hasattr(meth, '_endpoint_path'): cls.__add_endpoint(meth_name, meth, attributes) return super().__new__(cls, name, bases, attributes) @staticmethod def __add_endpoint(meth_name, meth, attributes): wrapped_meth = inspect.unwrap(meth) @functools.wraps(meth) # Copy signature and doc def meth_(*args, **kwargs): # Match arguments and parameters post_data = inspect.getcallargs( wrapped_meth, *args, **kwargs) # Remove arguments that should not be passed self = post_data.pop('self') post_data.pop('cur', None) post_data.pop('db', None) # Send the request. return self.post(meth._endpoint_path, post_data) attributes[meth_name] = meth_ class SWHRemoteAPI(metaclass=MetaSWHRemoteAPI): """Proxy to an internal SWH API """ backend_class = None """For each method of `backend_class` decorated with :func:`remote_api_endpoint`, a method with the same prototype and docstring will be added to this class. Calls to this new method will be translated into HTTP requests to a remote server. 
This backend class will never be instantiated, it only serves as a template.""" - def __init__(self, api_exception, url, timeout=None): - super().__init__() - self.api_exception = api_exception + api_exception = APIError + """The exception class to raise in case of communication error with + the server.""" + + def __init__(self, url, api_exception=None, + timeout=None, chunk_size=4096, **kwargs): + if api_exception: + self.api_exception = api_exception base_url = url if url.endswith('/') else url + '/' self.url = base_url self.session = requests.Session() self.timeout = timeout + self.chunk_size = chunk_size def _url(self, endpoint): return '%s%s' % (self.url, endpoint) - def raw_post(self, endpoint, data, **opts): + def raw_verb(self, verb, endpoint, **opts): + if 'chunk_size' in opts: + # if the chunk_size argument has been passed, consider the user + # also wants stream=True, otherwise, what's the point. + opts['stream'] = True if self.timeout and 'timeout' not in opts: opts['timeout'] = self.timeout try: - return self.session.post( + return getattr(self.session, verb)( self._url(endpoint), - data=data, **opts ) except requests.exceptions.ConnectionError as e: raise self.api_exception(e) - def raw_get(self, endpoint, params=None, **opts): - if self.timeout and 'timeout' not in opts: - opts['timeout'] = self.timeout - try: - return self.session.get( - self._url(endpoint), - params=params, - **opts - ) - except requests.exceptions.ConnectionError as e: - raise self.api_exception(e) - - def post(self, endpoint, data, params=None): - data = encode_data(data) - response = self.raw_post( - endpoint, data, params=params, + def post(self, endpoint, data, **opts): + if isinstance(data, (collections.Iterator, collections.Generator)): + data = (encode_data(x) for x in data) + else: + data = encode_data(data) + chunk_size = opts.pop('chunk_size', self.chunk_size) + response = self.raw_verb( + 'post', endpoint, data=data, headers={'content-type': 'application/x-msgpack', - 
'accept': 'application/x-msgpack'}) - return self._decode_response(response) - - def get(self, endpoint, params=None): - response = self.raw_get( - endpoint, params=params, - headers={'accept': 'application/x-msgpack'}) - return self._decode_response(response) - - def post_stream(self, endpoint, data, params=None): - if not isinstance(data, collections.Iterable): - raise ValueError("`data` must be Iterable") - response = self.raw_post( - endpoint, data, params=params, - headers={'accept': 'application/x-msgpack'}) - - return self._decode_response(response) - - def get_stream(self, endpoint, params=None, chunk_size=4096): - response = self.raw_get(endpoint, params=params, stream=True, - headers={'accept': 'application/x-msgpack'}) - return response.iter_content(chunk_size) + 'accept': 'application/x-msgpack'}, + **opts) + if opts.get('stream') or \ + response.headers.get('transfer-encoding') == 'chunked': + return response.iter_content(chunk_size) + else: + return self._decode_response(response) + + def post_stream(self, endpoint, data, **opts): + return self.post(endpoint, data, stream=True, **opts) + + def get(self, endpoint, **opts): + chunk_size = opts.pop('chunk_size', self.chunk_size) + response = self.raw_verb( + 'get', endpoint, + headers={'accept': 'application/x-msgpack'}, + **opts) + if opts.get('stream') or \ + response.headers.get('transfer-encoding') == 'chunked': + return response.iter_content(chunk_size) + else: + return self._decode_response(response) + + def get_stream(self, endpoint, **opts): + return self.get(endpoint, stream=True, **opts) def _decode_response(self, response): if response.status_code == 404: return None if response.status_code == 500: data = decode_response(response) if 'exception_pickled' in data: raise pickle.loads(data['exception_pickled']) else: raise RemoteException(data['exception']) # XXX: this breaks language-independence and should be # replaced by proper unserialization if response.status_code == 400: raise 
pickle.loads(decode_response(response)) elif response.status_code != 200: raise RemoteException( "Unexpected status code for API request: %s (%s)" % ( response.status_code, response.content, ) ) return decode_response(response) + def __repr__(self): + return '<{} url={}>'.format(self.__class__.__name__, self.url) + class BytesRequest(Request): """Request with proper escaping of arbitrary byte sequences.""" encoding = 'utf-8' encoding_errors = 'surrogateescape' ENCODERS = { 'application/x-msgpack': msgpack_dumps, 'application/json': json.dumps, } def encode_data_server(data, content_type='application/x-msgpack'): encoded_data = ENCODERS[content_type](data) return Response( encoded_data, mimetype=content_type, ) def decode_request(request): content_type = request.mimetype data = request.get_data() if not data: return {} if content_type == 'application/x-msgpack': r = msgpack_loads(data) elif content_type == 'application/json': r = json.loads(data, cls=SWHJSONDecoder) else: raise ValueError('Wrong content type `%s` for API request' % content_type) return r def error_handler(exception, encoder): # XXX: this breaks language-independence and should be # replaced by proper serialization of errors logging.exception(exception) response = encoder(pickle.dumps(exception)) response.status_code = 400 return response class SWHServerAPIApp(Flask): """For each endpoint of the given `backend_class`, tells app.route to call a function that decodes the request and sends it to the backend object provided by the factory. :param Any backend_class: The class of the backend, which will be analyzed to look for API endpoints. 
:param Callable[[], backend_class] backend_factory: A function with no argument that returns an instance of `backend_class`.""" request_class = BytesRequest def __init__(self, *args, backend_class=None, backend_factory=None, **kwargs): super().__init__(*args, **kwargs) if backend_class is not None: if backend_factory is None: raise TypeError('Missing argument backend_factory') for (meth_name, meth) in backend_class.__dict__.items(): if hasattr(meth, '_endpoint_path'): self.__add_endpoint(meth_name, meth, backend_factory) def __add_endpoint(self, meth_name, meth, backend_factory): from flask import request @self.route('/'+meth._endpoint_path, methods=['POST']) @functools.wraps(meth) # Copy signature and doc def _f(): # Call the actual code obj_meth = getattr(backend_factory(), meth_name) return encode_data_server(obj_meth(**decode_request(request))) diff --git a/swh/core/api/negotiate.py b/swh/core/api/negotiation.py similarity index 100% rename from swh/core/api/negotiate.py rename to swh/core/api/negotiation.py diff --git a/swh/core/config.py b/swh/core/config.py index b258311..e234210 100644 --- a/swh/core/config.py +++ b/swh/core/config.py @@ -1,359 +1,360 @@ # Copyright (C) 2015 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information import configparser import logging import os import yaml from itertools import chain from copy import deepcopy logger = logging.getLogger(__name__) SWH_CONFIG_DIRECTORIES = [ '~/.config/swh', '~/.swh', '/etc/softwareheritage', ] SWH_GLOBAL_CONFIG = 'global.ini' SWH_DEFAULT_GLOBAL_CONFIG = { 'content_size_limit': ('int', 100 * 1024 * 1024), 'log_db': ('str', 'dbname=softwareheritage-log'), } SWH_CONFIG_EXTENSIONS = [ '.yml', '.ini', ] # conversion per type _map_convert_fn = { 'int': int, 'bool': lambda x: x.lower() == 'true', 'list[str]': lambda x: [value.strip() for 
value in x.split(',')], 'list[int]': lambda x: [int(value.strip()) for value in x.split(',')], } _map_check_fn = { 'int': lambda x: isinstance(x, int), 'bool': lambda x: isinstance(x, bool), 'list[str]': lambda x: (isinstance(x, list) and all(isinstance(y, str) for y in x)), 'list[int]': lambda x: (isinstance(x, list) and all(isinstance(y, int) for y in x)), } def exists_accessible(file): """Check whether a file exists, and is accessible. Returns: True if the file exists and is accessible False if the file does not exist Raises: PermissionError if the file cannot be read. """ try: os.stat(file) except PermissionError: raise except FileNotFoundError: return False else: if os.access(file, os.R_OK): return True else: raise PermissionError("Permission denied: %r" % file) def config_basepath(config_path): """Return the base path of a configuration file""" if config_path.endswith(('.ini', '.yml')): return config_path[:-4] return config_path def read_raw_config(base_config_path): """Read the raw config corresponding to base_config_path. Can read yml or ini files. """ yml_file = base_config_path + '.yml' if exists_accessible(yml_file): logger.info('Loading config file %s', yml_file) with open(yml_file) as f: return yaml.safe_load(f) ini_file = base_config_path + '.ini' if exists_accessible(ini_file): config = configparser.ConfigParser() config.read(ini_file) if 'main' in config._sections: logger.info('Loading config file %s', ini_file) return config._sections['main'] else: logger.warning('Ignoring config file %s (no [main] section)', ini_file) return {} def config_exists(config_path): """Check whether the given config exists""" basepath = config_basepath(config_path) return any(exists_accessible(basepath + extension) for extension in SWH_CONFIG_EXTENSIONS) def read(conf_file=None, default_conf=None): """Read the user's configuration file. Fill in the gap using `default_conf`. 
`default_conf` is similar to this:: DEFAULT_CONF = { 'a': ('str', '/tmp/swh-loader-git/log'), 'b': ('str', 'dbname=swhloadergit') 'c': ('bool', true) 'e': ('bool', None) 'd': ('int', 10) } If conf_file is None, return the default config. """ conf = {} if conf_file: base_config_path = config_basepath(os.path.expanduser(conf_file)) conf = read_raw_config(base_config_path) if not default_conf: default_conf = {} # remaining missing default configuration key are set # also type conversion is enforced for underneath layer for key in default_conf: nature_type, default_value = default_conf[key] val = conf.get(key, None) if val is None: # fallback to default value conf[key] = default_value elif not _map_check_fn.get(nature_type, lambda x: True)(val): # value present but not in the proper format, force type conversion conf[key] = _map_convert_fn.get(nature_type, lambda x: x)(val) return conf def priority_read(conf_filenames, default_conf=None): """Try reading the configuration files from conf_filenames, in order, and return the configuration from the first one that exists. default_conf has the same specification as it does in read. 
""" # Try all the files in order for filename in conf_filenames: full_filename = os.path.expanduser(filename) if config_exists(full_filename): return read(full_filename, default_conf) # Else, return the default configuration return read(None, default_conf) def merge_default_configs(base_config, *other_configs): """Merge several default config dictionaries, from left to right""" full_config = base_config.copy() for config in other_configs: full_config.update(config) return full_config def merge_configs(base, other): """Merge two config dictionaries This does merge config dicts recursively, with the rules, for every value of the dicts (with 'val' not being a dict): - None + type -> type - type + None -> None - dict + dict -> dict (merged) - val + dict -> TypeError - dict + val -> TypeError - val + val -> val (other) - so merging + for instance: - { - 'key1': { - 'skey1': value1, - 'skey2': {'sskey1': value2}, - }, - 'key2': value3, - } + >>> d1 = { + ... 'key1': { + ... 'skey1': 'value1', + ... 'skey2': {'sskey1': 'value2'}, + ... }, + ... 'key2': 'value3', + ... } with - { - 'key1': { - 'skey1': value4, - 'skey2': {'sskey2': value5}, - }, - 'key3': value6, - } + >>> d2 = { + ... 'key1': { + ... 'skey1': 'value4', + ... 'skey2': {'sskey2': 'value5'}, + ... }, + ... 'key3': 'value6', + ... } will give: - { - 'key1': { - 'skey1': value4, # <-- note this - 'skey2': { - 'sskey1': value2, - 'sskey2': value5, - }, - }, - 'key2': value3, - 'key3': value6, - } + >>> d3 = { + ... 'key1': { + ... 'skey1': 'value4', # <-- note this + ... 'skey2': { + ... 'sskey1': 'value2', + ... 'sskey2': 'value5', + ... }, + ... }, + ... 'key2': 'value3', + ... 'key3': 'value6', + ... } + >>> assert merge_configs(d1, d2) == d3 Note that no type checking is done for anything but dicts. 
""" if not isinstance(base, dict) or not isinstance(other, dict): raise TypeError( 'Cannot merge a %s with a %s' % (type(base), type(other))) output = {} allkeys = set(chain(base.keys(), other.keys())) for k in allkeys: vb = base.get(k) vo = other.get(k) if isinstance(vo, dict): output[k] = merge_configs(vb is not None and vb or {}, vo) elif isinstance(vb, dict) and k in other and other[k] is not None: output[k] = merge_configs(vb, vo is not None and vo or {}) elif k in other: output[k] = deepcopy(vo) else: output[k] = deepcopy(vb) return output def swh_config_paths(base_filename): """Return the Software Heritage specific configuration paths for the given filename.""" return [os.path.join(dirname, base_filename) for dirname in SWH_CONFIG_DIRECTORIES] def prepare_folders(conf, *keys): """Prepare the folder mentioned in config under keys. """ def makedir(folder): if not os.path.exists(folder): os.makedirs(folder) for key in keys: makedir(conf[key]) def load_global_config(): """Load the global Software Heritage config""" return priority_read( swh_config_paths(SWH_GLOBAL_CONFIG), SWH_DEFAULT_GLOBAL_CONFIG, ) def load_named_config(name, default_conf=None, global_conf=True): """Load the config named `name` from the Software Heritage configuration paths. If global_conf is True (default), read the global configuration too. """ conf = {} if global_conf: conf.update(load_global_config()) conf.update(priority_read(swh_config_paths(name), default_conf)) return conf class SWHConfig: """Mixin to add configuration parsing abilities to classes The class should override the class attributes: - DEFAULT_CONFIG (default configuration to be parsed) - CONFIG_BASE_FILENAME (the filename of the configuration to be used) This class defines one classmethod, parse_config_file, which parses a configuration file using the default config as set in the class attribute. 
     """

     DEFAULT_CONFIG = {}
     CONFIG_BASE_FILENAME = ''

     @classmethod
     def parse_config_file(cls, base_filename=None, config_filename=None,
                           additional_configs=None, global_config=True):
         """Parse the configuration file associated to the current class.

         By default, parse_config_file will load the configuration
         cls.CONFIG_BASE_FILENAME from one of the Software Heritage
         configuration directories, in order, unless it is overridden
         by base_filename or config_filename (which shortcuts the file
         lookup completely).

         Args:
             - base_filename (str) overrides the default
                 cls.CONFIG_BASE_FILENAME
             - config_filename (str) sets the file to parse instead of
                 the defaults set from cls.CONFIG_BASE_FILENAME
             - additional_configs (list of default configuration dicts)
                 allows to override or extend the configuration set in
                 cls.DEFAULT_CONFIG.
             - global_config (bool): Load the global configuration (default:
                 True)
         """

         if config_filename:
             config_filenames = [config_filename]
         elif 'SWH_CONFIG_FILENAME' in os.environ:
             config_filenames = [os.environ['SWH_CONFIG_FILENAME']]
         else:
             if not base_filename:
                 base_filename = cls.CONFIG_BASE_FILENAME
             config_filenames = swh_config_paths(base_filename)
         if not additional_configs:
             additional_configs = []

         full_default_config = merge_default_configs(cls.DEFAULT_CONFIG,
                                                     *additional_configs)

         config = {}
         if global_config:
             config = load_global_config()

         config.update(priority_read(config_filenames, full_default_config))

         return config
diff --git a/swh/core/tests/conftest.py b/swh/core/tests/conftest.py
new file mode 100644
index 0000000..5d8dcd5
--- /dev/null
+++ b/swh/core/tests/conftest.py
@@ -0,0 +1,2 @@
+import os
+os.environ['LC_ALL'] = 'C.UTF-8'
diff --git a/swh/core/tests/test_api.py b/swh/core/tests/test_api.py
index e1009b1..1b978d8 100644
--- a/swh/core/tests/test_api.py
+++ b/swh/core/tests/test_api.py
@@ -1,81 +1,81 @@
 # Copyright (C) 2018 The Software Heritage developers
 # See the AUTHORS file at the top-level directory of this distribution
 # License: GNU General Public License version 3, or any later version
 # See top-level LICENSE file for more information

 import unittest

 import requests_mock
 from werkzeug.wrappers import BaseResponse
 from werkzeug.test import Client as WerkzeugTestClient

 from swh.core.api import (
     error_handler, encode_data_server,
     remote_api_endpoint, SWHRemoteAPI,
     SWHServerAPIApp)


 class ApiTest(unittest.TestCase):
     def test_server(self):
         testcase = self
         nb_endpoint_calls = 0

         class TestStorage:
             @remote_api_endpoint('test_endpoint_url')
             def test_endpoint(self, test_data, db=None, cur=None):
                 nonlocal nb_endpoint_calls
                 nb_endpoint_calls += 1

                 testcase.assertEqual(test_data, 'spam')
                 return 'egg'

         app = SWHServerAPIApp('testapp',
                               backend_class=TestStorage,
                               backend_factory=lambda: TestStorage())

         @app.errorhandler(Exception)
         def my_error_handler(exception):
             return error_handler(exception, encode_data_server)

         client = WerkzeugTestClient(app, BaseResponse)
         res = client.post('/test_endpoint_url',
                           headers={'Content-Type': 'application/x-msgpack'},
                           data=b'\x81\xa9test_data\xa4spam')

         self.assertEqual(nb_endpoint_calls, 1)
         self.assertEqual(b''.join(res.response), b'\xa3egg')

     def test_client(self):
         class TestStorage:
             @remote_api_endpoint('test_endpoint_url')
             def test_endpoint(self, test_data, db=None, cur=None):
                 pass

         nb_http_calls = 0

         def callback(request, context):
             nonlocal nb_http_calls
             nb_http_calls += 1
             self.assertEqual(request.headers['Content-Type'],
                              'application/x-msgpack')
             self.assertEqual(request.body, b'\x81\xa9test_data\xa4spam')
             context.headers['Content-Type'] = 'application/x-msgpack'
             context.content = b'\xa3egg'
             return b'\xa3egg'

         adapter = requests_mock.Adapter()
         adapter.register_uri('POST',
                              'mock://example.com/test_endpoint_url',
                              content=callback)

         class Testclient(SWHRemoteAPI):
             backend_class = TestStorage

             def __init__(self, *args, **kwargs):
                 super().__init__(*args, **kwargs)
                 self.session.mount('mock', adapter)

-        c = Testclient('foo', 'mock://example.com/')
+        c = Testclient(url='mock://example.com/')
         res = c.test_endpoint('spam')

         self.assertEqual(nb_http_calls, 1)
         self.assertEqual(res, 'egg')
diff --git a/version.txt b/version.txt
index c1a00f1..053ca14 100644
--- a/version.txt
+++ b/version.txt
@@ -1 +1 @@
-v0.0.55-0-g2ef40b2
\ No newline at end of file
+v0.0.56-0-g577e933
\ No newline at end of file