diff --git a/.gitignore b/.gitignore index 8b22a4c43..1b812e289 100644 --- a/.gitignore +++ b/.gitignore @@ -1,12 +1,12 @@ *.pyc *.sw? *~ \#* .\#* .coverage .eggs/ resources/test/ __pycache__ version.txt -swh.web.ui.egg-info +swh.web.egg-info docs/build/ diff --git a/MANIFEST.in b/MANIFEST.in index 2a563258c..c377988a1 100644 --- a/MANIFEST.in +++ b/MANIFEST.in @@ -1,6 +1,7 @@ include Makefile include requirements.txt include requirements-swh.txt include version.txt +include swh/web/db.sqlite3 recursive-include swh/web/static * recursive-include swh/web/api/templates * diff --git a/debian/control b/debian/control index 8cbe05f4a..844b1838f 100644 --- a/debian/control +++ b/debian/control @@ -1,29 +1,28 @@ Source: swh-web Maintainer: Software Heritage developers Section: python Priority: optional Build-Depends: debhelper (>= 9), dh-python, python3-all, python3-docutils, python3-nose, python3-django, python3-djangorestframework, - python3-djangorestframework-extensions, python3-pygments, python3-setuptools, + python3-yaml, python3-swh.core (>= 0.0.20~), python3-swh.model (>= 0.0.15~), - python3-swh.storage (>= 0.0.83~), - python3-yaml + python3-swh.storage (>= 0.0.83~) Standards-Version: 3.9.6 Homepage: https://forge.softwareheritage.org/diffusion/DWUI/ Package: python3-swh.web Architecture: all Depends: python3-swh.core (>= 0.0.20~), python3-swh.model (>= 0.0.15~), python3-swh.storage (>= 0.0.83~), ${misc:Depends}, ${python3:Depends} Description: Software Heritage Web Applications diff --git a/requirements.txt b/requirements.txt index ed1eb3c2b..c569f9eb5 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,14 +1,15 @@ # Add here external Python modules dependencies, one per line. Module names # should match https://pypi.python.org/pypi names. 
For the full spec or # dependency lines, see https://pip.readthedocs.org/en/1.1/requirements.html # Runtime dependencies django django_rest_framework django_extensions python-dateutil docutils pygments +yaml # Test dependencies diff --git a/setup.py b/setup.py index 0221b62cf..d03de4311 100755 --- a/setup.py +++ b/setup.py @@ -1,30 +1,32 @@ #!/usr/bin/env python3 from setuptools import setup def parse_requirements(): requirements = [] for reqf in ('requirements.txt', 'requirements-swh.txt'): with open(reqf) as f: for line in f.readlines(): line = line.strip() if not line or line.startswith('#'): continue requirements.append(line) return requirements setup( - name='swh.web.ui', + name='swh.web', description='Software Heritage Web UI', author='Software Heritage developers', author_email='swh-devel@inria.fr', url='https://forge.softwareheritage.org/diffusion/DWUI/', - packages=['swh.web.ui', 'swh.web.ui.views', 'swh.web.ui.tests'], + packages=['swh.web', 'swh.web.api', + 'swh.web.api.views', 'swh.web.api.tests', + 'swh.web.api.templatetags'], scripts=[], install_requires=parse_requirements(), setup_requires=['vcversioner'], vcversioner={}, include_package_data=True, ) diff --git a/swh/web/api/apidoc.py b/swh/web/api/apidoc.py index 4f4e551da..92ff191e3 100644 --- a/swh/web/api/apidoc.py +++ b/swh/web/api/apidoc.py @@ -1,305 +1,305 @@ # Copyright (C) 2015-2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU Affero General Public License version 3, or any later version # See top-level LICENSE file for more information import re from collections import defaultdict from functools import wraps from enum import Enum -from django.urls import reverse +from django.core.urlresolvers import reverse from rest_framework.decorators import api_view from swh.web.api.apiurls import APIUrls from swh.web.api.apiresponse import make_api_response, error_response class argtypes(Enum): # noqa: N801 """Class for centralizing argument type descriptions """ ts = 'timestamp' int = 'integer' str = 'string' path = 'path' sha1 = 'sha1' uuid = 'uuid' sha1_git = 'sha1_git' algo_and_hash = 'hash_type:hash' class rettypes(Enum): # noqa: N801 """Class for centralizing return type descriptions """ octet_stream = 'octet stream' list = 'list' dict = 'dict' class excs(Enum): # noqa: N801 """Class for centralizing exception type descriptions """ badinput = 'BadInputExc' notfound = 'NotFoundExc' class APIDocException(Exception): """ Custom exception to signal errors in the use of the APIDoc decorators """ class route(object): # noqa: N801 """Decorate an API method to register it in the API doc route index and create the corresponding Flask route. This decorator is responsible for bootstrapping the linking of subsequent decorators, as well as traversing the decorator stack to obtain the documentation data from it. Args: route: documentation page's route noargs: set to True if the route has no arguments, and its result should be displayed anytime its documentation is requested. Default to False hidden: set to True to remove the endpoint from being listed in the /api endpoints. Default to False. tags: Further information on api endpoints. 
Two values are possibly expected: - hidden: remove the entry points from the listing - upcoming: display the entry point but it is not followable """ def __init__(self, route, noargs=False, tags=[], handle_response=False, api_version='1'): super().__init__() self.route = route self.urlpattern = '^' + api_version + route + '$' self.noargs = noargs self.tags = set(tags) self.handle_response = handle_response # @apidoc.route() Decorator call def __call__(self, f): # If the route is not hidden, add it to the index if 'hidden' not in self.tags: APIUrls.index_add_route(self.route, f.__doc__, tags=self.tags) # If the decorated route has arguments, we create a specific # documentation view if not self.noargs: - @api_view() + @api_view(['GET', 'HEAD']) def doc_view(request): doc_data = self.get_doc_data(f) return make_api_response(request, None, doc_data) view_name = self.route[1:-1].replace('/', '-') APIUrls.index_add_url_pattern(self.urlpattern, doc_view, view_name) @wraps(f) def documented_view(request, **kwargs): doc_data = self.get_doc_data(f) try: rv = f(request, **kwargs) except Exception as exc: return error_response(request, exc, doc_data) if self.handle_response: return rv else: return make_api_response(request, rv, doc_data) return documented_view def filter_api_url(self, endpoint, route_re, noargs): doc_methods = {'GET', 'HEAD', 'OPTIONS'} if re.match(route_re, endpoint['rule']): if endpoint['methods'] == doc_methods and not noargs: return False return True def build_examples(self, urls, args): """Build example documentation. Args: f: function urls: information relative to url for that function args: information relative to arguments for that function Yields: example based on default parameter value if any """ s = set() r = [] for data_url in urls: url = data_url['rule'] defaults = {arg['name']: arg['default'] for arg in args if arg['name'] in url} if defaults and None not in defaults.values(): url = reverse(data_url['name'], kwargs=defaults) if url in s: continue s.add(url) r.append(url) return r def get_doc_data(self, f): """Build documentation data for the decorated function""" data = { 'route': self.route, 'noargs': self.noargs, } data.update(getattr(f, 'doc_data', {})) if not f.__doc__: raise APIDocException('Apidoc %s: expected a docstring' ' for function %s' % (self.__class__.__name__, f.__name__)) data['docstring'] = f.__doc__ route_re = re.compile('.*%s$' % data['route']) endpoint_list = APIUrls.get_method_endpoints(f) data['urls'] = [url for url in endpoint_list if self.filter_api_url(url, route_re, data['noargs'])] if 'args' in data: data['examples'] = self.build_examples(data['urls'], data['args']) data['heading'] = '%s Documentation' % data['route'] return data class DocData(object): """Base description of optional input/output setup for a route. """ destination = None def __init__(self): self.doc_data = {} def __call__(self, f): if not hasattr(f, 'doc_data'): f.doc_data = defaultdict(list) f.doc_data[self.destination].append(self.doc_data) return f class arg(DocData): # noqa: N801 """ Decorate an API method to display an argument's information on the doc page specified by @route above. Args: name: the argument's name. MUST match the method argument's name to create the example request URL. 
default: the argument's default value argtype: the argument's type as an Enum value from apidoc.argtypes argdoc: the argument's documentation string """ destination = 'args' def __init__(self, name, default, argtype, argdoc): super().__init__() self.doc_data = { 'name': name, 'type': argtype.value, 'doc': argdoc, 'default': default } class header(DocData): # noqa: N801 """ Decorate an API method to display header information the api can potentially return in the response. Args: name: the header name doc: the information about that header """ destination = 'headers' def __init__(self, name, doc): super().__init__() self.doc_data = { 'name': name, 'doc': doc, } class param(DocData): # noqa: N801 """Decorate an API method to display query parameter information the api can potentially accept. Args: name: parameter's name default: parameter's default value argtype: parameter's type as an Enum value from apidoc.argtypes doc: the information about that header """ destination = 'params' def __init__(self, name, default, argtype, doc): super().__init__() self.doc_data = { 'name': name, 'type': argtype.value, 'default': default, 'doc': doc, } class raises(DocData): # noqa: N801 """Decorate an API method to display information pertaining to an exception that can be raised by this method. Args: exc: the exception name as an Enum value from apidoc.excs doc: the exception's documentation string """ destination = 'excs' def __init__(self, exc, doc): super().__init__() self.doc_data = { 'exc': exc.value, 'doc': doc } class returns(DocData): # noqa: N801 """Decorate an API method to display information about its return value. Args: rettype: the return value's type as an Enum value from apidoc.rettypes retdoc: the return value's documentation string """ destination = 'returns' def __init__(self, rettype=None, retdoc=None): super().__init__() self.doc_data = { 'type': rettype.value, 'doc': retdoc } diff --git a/swh/web/api/apiresponse.py b/swh/web/api/apiresponse.py index 4e0f7af20..9c7b471b8 100644 --- a/swh/web/api/apiresponse.py +++ b/swh/web/api/apiresponse.py @@ -1,173 +1,173 @@ # Copyright (C) 2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information import json from rest_framework.response import Response from swh.storage.exc import StorageDBError, StorageAPIError from swh.web.api import utils from swh.web.api.exc import NotFoundExc, ForbiddenExc def compute_link_header(rv, options): """Add Link header in returned value results. Expects rv to be a dict with 'results' and 'headers' key: 'results': the returned value expected to be shown 'headers': dictionary with link-next and link-prev Args: rv (dict): with keys: - 'headers': potential headers with 'link-next' and 'link-prev' keys - 'results': containing the result to return options (dict): the initial dict to update with result if any Returns: Dict with optional keys 'link-next' and 'link-prev'. 
""" link_headers = [] if 'headers' not in rv: return {} rv_headers = rv['headers'] if 'link-next' in rv_headers: link_headers.append('<%s>; rel="next"' % ( rv_headers['link-next'])) if 'link-prev' in rv_headers: link_headers.append('<%s>; rel="previous"' % ( rv_headers['link-prev'])) if link_headers: link_header_str = ','.join(link_headers) headers = options.get('headers', {}) headers.update({ 'Link': link_header_str }) return headers return {} def filter_by_fields(request, data): """Extract a request parameter 'fields' if it exists to permit the filtering on the data dict's keys. If such field is not provided, returns the data as is. """ - fields = request.query_params.get('fields') + fields = utils.get_query_params(request).get('fields') if fields: fields = set(fields.split(',')) data = utils.filter_field_keys(data, fields) return data def transform(rv): """Transform an eventual returned value with multiple layer of information with only what's necessary. If the returned value rv contains the 'results' key, this is the associated value which is returned. Otherwise, return the initial dict without the potential 'headers' key. """ if 'results' in rv: return rv['results'] if 'headers' in rv: rv.pop('headers') return rv def make_api_response(request, data, doc_data={}, options={}): """Generates an API response based on the requested mimetype. Args: request: a DRF Request object data: raw data to return in the API response doc_data: documentation data for HTML response options: optionnal data that can be used to generate the response Returns: a DRF Response a object """ if data: options['headers'] = compute_link_header(data, options) data = transform(data) data = filter_by_fields(request, data) doc_env = doc_data headers = {} if 'headers' in options: doc_env['headers_data'] = options['headers'] headers = options['headers'] # get request status code doc_env['status_code'] = options.get('status', 200) response_args = {'status': doc_env['status_code'], 'headers': headers, 'content_type': request.accepted_media_type} # when requesting HTML, typically when browsing the API through its # documented views, we need to enrich the input data with documentation # related ones and inform DRF that we request HTML template rendering if request.accepted_media_type == 'text/html': if data: data = json.dumps(data, sort_keys=True, indent=4, separators=(',', ': ')) doc_env['response_data'] = data doc_env['request'] = request doc_env['heading'] = utils.shorten_path(str(request.path)) response_args['data'] = doc_env response_args['template_name'] = 'apidoc.html' # otherwise simply return the raw data and let DRF picks # the correct renderer (JSON or YAML) else: response_args['data'] = data return Response(**response_args) def error_response(request, error, doc_data): """Private function to create a custom error response. 
Args: request: a DRF Request object error: the exception that caused the error doc_data: documentation data for HTML response """ error_code = 400 if isinstance(error, NotFoundExc): error_code = 404 elif isinstance(error, ForbiddenExc): error_code = 403 elif isinstance(error, StorageDBError): error_code = 503 elif isinstance(error, StorageAPIError): error_code = 503 error_opts = {'status': error_code} error_data = { 'exception': error.__class__.__name__, 'reason': str(error), } return make_api_response(request, error_data, doc_data, options=error_opts) diff --git a/swh/web/api/apiurls.py b/swh/web/api/apiurls.py index 734c79211..f93cc9279 100644 --- a/swh/web/api/apiurls.py +++ b/swh/web/api/apiurls.py @@ -1,124 +1,125 @@ # Copyright (C) 2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information import re from django.conf.urls import url from rest_framework.decorators import api_view class APIUrls(object): """ Class to manage API documentation URLs. * Indexes all routes documented using apidoc's decorators. * Tracks endpoint/request processing method relationships for use in generating related urls in API documentation """ apidoc_routes = {} method_endpoints = {} urlpatterns = [] @classmethod def get_app_endpoints(cls): return cls.apidoc_routes @classmethod def get_method_endpoints(cls, f): if f.__name__ not in cls.method_endpoints: cls.method_endpoints[f.__name__] = cls.group_routes_by_method(f) return cls.method_endpoints[f.__name__] @classmethod def group_routes_by_method(cls, f): """ Group URL endpoints according to their processing method. Returns: A dict where keys are the processing method names, and values are the routes that are bound to the key method. """ rules = [] for urlp in cls.urlpatterns: endpoint = urlp.callback.__name__ if endpoint != f.__name__: continue - method_names = urlp.callback.view_class.http_method_names + method_names = urlp.callback.http_method_names url_rule = urlp.regex.pattern.replace('^', '/').replace('$', '') url_rule_params = re.findall('\([^)]+\)', url_rule) for param in url_rule_params: param_name = re.findall('<(.*)>', param) param_name = param_name[0] if len(param_name) > 0 else None if param_name and hasattr(f, 'doc_data'): param_index = \ next(i for (i, d) in enumerate(f.doc_data['args']) if d['name'] == param_name) if param_index is not None: url_rule = url_rule.replace( param, '<' + f.doc_data['args'][param_index]['name'] + ': ' + f.doc_data['args'][param_index]['type'] + '>') rule_dict = {'rule': '/api' + url_rule, 'name': urlp.name, 'methods': {method.upper() for method in method_names} } rules.append(rule_dict) return rules @classmethod def index_add_route(cls, route, docstring, **kwargs): """ Add a route to the self-documenting API reference """ route_view_name = route[1:-1].replace('/', '-') if route not in cls.apidoc_routes: d = {'docstring': docstring, 'route_view_name': route_view_name} for k, v in kwargs.items(): d[k] = v cls.apidoc_routes[route] = d @classmethod def index_add_url_pattern(cls, url_pattern, view, view_name): cls.urlpatterns.append(url(url_pattern, view, name=view_name)) @classmethod def get_url_patterns(cls): return cls.urlpatterns class api_route(object): # noqa: N801 """ Decorator to ease the registration of an API endpoint using the Django REST Framework. 
Args: url_pattern: the url pattern used by DRF to identify the API route view_name: the name of the API view associated to the route used to reverse the url methods: array of HTTP methods supported by the API route """ def __init__(self, url_pattern=None, view_name=None, methods=['GET', 'HEAD'], api_version='1'): super().__init__() self.url_pattern = '^' + api_version + url_pattern + '$' self.view_name = view_name self.methods = methods def __call__(self, f): # create a DRF view from the wrapped function @api_view(self.methods) def api_view_f(*args, **kwargs): return f(*args, **kwargs) - # small hack for correctly generating API endpoints index doc + # small hacks for correctly generating API endpoints index doc api_view_f.__name__ = f.__name__ + api_view_f.http_method_names = self.methods # register the route and its view in the endpoints index APIUrls.index_add_url_pattern(self.url_pattern, api_view_f, self.view_name) return f diff --git a/swh/web/api/tests/test_apiresponse.py b/swh/web/api/tests/test_apiresponse.py index cc4855ed9..84bb15c18 100644 --- a/swh/web/api/tests/test_apiresponse.py +++ b/swh/web/api/tests/test_apiresponse.py @@ -1,177 +1,176 @@ # Copyright (C) 2015-2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU Affero General Public License version 3, or any later version # See top-level LICENSE file for more information import json import unittest from rest_framework.test import APIRequestFactory from nose.tools import istest from unittest.mock import patch from swh.web.api.apiresponse import ( compute_link_header, transform, make_api_response, filter_by_fields ) api_request_factory = APIRequestFactory() class SWHComputeLinkHeaderTest(unittest.TestCase): @istest def compute_link_header(self): rv = { 'headers': {'link-next': 'foo', 'link-prev': 'bar'}, 'results': [1, 2, 3] } options = {} # when headers = compute_link_header( rv, options) self.assertEquals(headers, { 'Link': '; rel="next",; rel="previous"', }) @istest def compute_link_header_nothing_changed(self): rv = {} options = {} # when headers = compute_link_header( rv, options) self.assertEquals(headers, {}) @istest def compute_link_header_nothing_changed_2(self): rv = {'headers': {}} options = {} # when headers = compute_link_header( rv, options) self.assertEquals(headers, {}) class SWHTransformProcessorTest(unittest.TestCase): @istest def transform_only_return_results_1(self): rv = {'results': {'some-key': 'some-value'}} self.assertEquals(transform(rv), {'some-key': 'some-value'}) @istest def transform_only_return_results_2(self): rv = {'headers': {'something': 'do changes'}, 'results': {'some-key': 'some-value'}} self.assertEquals(transform(rv), {'some-key': 'some-value'}) @istest def transform_do_remove_headers(self): rv = {'headers': {'something': 'do changes'}, 'some-key': 'some-value'} self.assertEquals(transform(rv), {'some-key': 'some-value'}) @istest def transform_do_nothing(self): rv = {'some-key': 'some-value'} self.assertEquals(transform(rv), {'some-key': 'some-value'}) class RendererTestCase(unittest.TestCase): @patch('swh.web.api.apiresponse.json') @patch('swh.web.api.apiresponse.filter_by_fields') @patch('swh.web.api.apiresponse.utils.shorten_path') @istest def swh_multi_response_mimetype(self, mock_shorten_path, mock_filter, mock_json): # given data = { 'data': [12, 34], 'id': 'adc83b19e793491b1c6ea0fd8b46cd9f32e592fc' } mock_filter.return_value = data mock_shorten_path.return_value = 'my_short_path' accepted_response_formats = 
{'html': 'text/html', 'yaml': 'application/yaml', 'json': 'application/json'} for format in accepted_response_formats: request = api_request_factory.get('/api/test/path/') mime_type = accepted_response_formats[format] setattr(request, 'accepted_media_type', mime_type) if mime_type == 'text/html': expected_data = { 'response_data': json.dumps(data), 'request': request, 'headers_data': {}, 'heading': 'my_short_path', 'status_code': 200 } mock_json.dumps.return_value = json.dumps(data) else: expected_data = data # when rv = make_api_response(request, data) # then mock_filter.assert_called_with(request, data) self.assertEqual(rv.data, expected_data) self.assertEqual(rv.status_code, 200) if mime_type == 'text/html': self.assertEqual(rv.template_name, 'apidoc.html') @istest def swh_filter_renderer_do_nothing(self): # given input_data = {'a': 'some-data'} request = api_request_factory.get('/api/test/path/', data={}) setattr(request, 'query_params', request.GET) # when actual_data = filter_by_fields(request, input_data) # then self.assertEquals(actual_data, input_data) - @patch('swh.web.api.apiresponse.utils') + @patch('swh.web.api.apiresponse.utils.filter_field_keys') @istest - def swh_filter_renderer_do_filter(self, mock_utils): + def swh_filter_renderer_do_filter(self, mock_ffk): # given - mock_utils.filter_field_keys.return_value = {'a': 'some-data'} + mock_ffk.return_value = {'a': 'some-data'} request = api_request_factory.get('/api/test/path/', data={'fields': 'a,c'}) setattr(request, 'query_params', request.GET) input_data = {'a': 'some-data', 'b': 'some-other-data'} # when actual_data = filter_by_fields(request, input_data) # then self.assertEquals(actual_data, {'a': 'some-data'}) - mock_utils.filter_field_keys.assert_called_once_with(input_data, - {'a', 'c'}) + mock_ffk.assert_called_once_with(input_data, {'a', 'c'}) diff --git a/swh/web/api/utils.py b/swh/web/api/utils.py index f0f91caed..7ab109c34 100644 --- a/swh/web/api/utils.py +++ b/swh/web/api/utils.py @@ -1,400 +1,411 @@ # Copyright (C) 2015-2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU Affero General Public License version 3, or any later version # See top-level LICENSE file for more information import re -from django.urls import reverse +from django.core.urlresolvers import reverse from datetime import datetime, timezone from dateutil import parser from .exc import BadInputExc def filter_endpoints(url_map, prefix_url_rule, blacklist=[]): """Filter endpoints by prefix url rule. Args: - url_map: Url Werkzeug.Map of rules - prefix_url_rule: prefix url string - blacklist: blacklist of some url Returns: Dictionary of url_rule with values methods and endpoint. The key is the url, the associated value is a dictionary of 'methods' (possible http methods) and 'endpoint' (python function) """ out = {} for r in url_map: rule = r['rule'] if rule == prefix_url_rule or rule in blacklist: continue if rule.startswith(prefix_url_rule): out[rule] = {'methods': sorted(map(str, r['methods'])), 'endpoint': r['endpoint']} return out def fmap(f, data): """Map f to data at each level. This must keep the origin data structure type: - map -> map - dict -> dict - list -> list - None -> None Args: f: function that expects one argument. data: data to traverse to apply the f function. list, map, dict or bare value. Returns: The same data-structure with modified values by the f function. 
""" if data is None: return data if isinstance(data, map): return map(lambda y: fmap(f, y), (x for x in data)) if isinstance(data, list): return [fmap(f, x) for x in data] if isinstance(data, dict): return {k: fmap(f, v) for (k, v) in data.items()} return f(data) def prepare_data_for_view(data, encoding='utf-8'): def prepare_data(s): # Note: can only be 'data' key with bytes of raw content if isinstance(s, bytes): try: return s.decode(encoding) except: return "Cannot decode the data bytes, try and set another " \ "encoding in the url (e.g. ?encoding=utf8) or " \ "download directly the " \ "content's raw data." if isinstance(s, str): return re.sub(r'/api/1/', r'/browse/', s) return s return fmap(prepare_data, data) def filter_field_keys(data, field_keys): """Given an object instance (directory or list), and a csv field keys to filter on. Return the object instance with filtered keys. Note: Returns obj as is if it's an instance of types not in (dictionary, list) Args: - data: one object (dictionary, list...) to filter. - field_keys: csv or set of keys to filter the object on Returns: obj filtered on field_keys """ if isinstance(data, map): return map(lambda x: filter_field_keys(x, field_keys), data) if isinstance(data, list): return [filter_field_keys(x, field_keys) for x in data] if isinstance(data, dict): return {k: v for (k, v) in data.items() if k in field_keys} return data def person_to_string(person): """Map a person (person, committer, tagger, etc...) to a string. """ return ''.join([person['name'], ' <', person['email'], '>']) def parse_timestamp(timestamp): """Given a time or timestamp (as string), parse the result as datetime. Returns: a timezone-aware datetime representing the parsed value. None if the parsing fails. Samples: - 2016-01-12 - 2016-01-12T09:19:12+0100 - Today is January 1, 2047 at 8:21:00AM - 1452591542 """ if not timestamp: return None try: return parser.parse(timestamp, ignoretz=False, fuzzy=True) except: try: return datetime.utcfromtimestamp(float(timestamp)).replace( tzinfo=timezone.utc) except (ValueError, OverflowError) as e: raise BadInputExc(e) def enrich_object(object): """Enrich an object (revision, release) with link to the 'target' of type 'target_type'. Args: object: An object with target and target_type keys (e.g. release, revision) Returns: Object enriched with target_url pointing to the right swh.web.ui.api urls for the pointing object (revision, release, content, directory) """ obj = object.copy() if 'target' in obj and 'target_type' in obj: if obj['target_type'] == 'revision': obj['target_url'] = reverse('revision', kwargs={'sha1_git': obj['target']}) elif obj['target_type'] == 'release': obj['target_url'] = reverse('release', kwargs={'sha1_git': obj['target']}) elif obj['target_type'] == 'content': obj['target_url'] = \ reverse('content', kwargs={'q': 'sha1_git:' + obj['target']}) elif obj['target_type'] == 'directory': obj['target_url'] = reverse('directory', kwargs={'sha1_git': obj['target']}) if 'author' in obj: author = obj['author'] obj['author_url'] = reverse('person', kwargs={'person_id': author['id']}) return obj enrich_release = enrich_object def enrich_directory(directory, context_url=None): """Enrich directory with url to content or directory. 
""" if 'type' in directory: target_type = directory['type'] target = directory['target'] if target_type == 'file': directory['target_url'] = \ reverse('content', kwargs={'q': 'sha1_git:%s' % target}) if context_url: directory['file_url'] = context_url + directory['name'] + '/' else: directory['target_url'] = reverse('directory', kwargs={'sha1_git': target}) if context_url: directory['dir_url'] = context_url + directory['name'] + '/' return directory def enrich_metadata_endpoint(content): """Enrich metadata endpoint with link to the upper metadata endpoint. """ c = content.copy() c['content_url'] = reverse('content', args=['sha1:%s' % c['id']]) return c def enrich_content(content, top_url=False): """Enrich content with links to: - data_url: its raw data - filetype_url: its filetype information """ for h in ['sha1', 'sha1_git', 'sha256']: if h in content: q = '%s:%s' % (h, content[h]) if top_url: content['content_url'] = reverse('content', kwargs={'q': q}) content['data_url'] = reverse('content-raw', kwargs={'q': q}) content['filetype_url'] = reverse('content-filetype', kwargs={'q': q}) content['language_url'] = reverse('content-language', kwargs={'q': q}) content['license_url'] = reverse('content-license', kwargs={'q': q}) break return content def enrich_entity(entity): """Enrich entity with """ if 'uuid' in entity: entity['uuid_url'] = reverse('entity', kwargs={'uuid': entity['uuid']}) if 'parent' in entity and entity['parent']: entity['parent_url'] = reverse('entity', kwargs={'uuid': entity['parent']}) return entity def _get_path_list(path_string): """Helper for enrich_revision: get a list of the sha1 id of the navigation breadcrumbs, ordered from the oldest to the most recent. Args: path_string: the path as a '/'-separated string Returns: The navigation context as a list of sha1 revision ids """ return path_string.split('/') def _get_revision_contexts(rev_id, context): """Helper for enrich_revision: retrieve for the revision id and potentially the navigation breadcrumbs the context to pass to parents and children of of the revision. Args: rev_id: the revision's sha1 id context: the current navigation context Returns: The context for parents, children and the url of the direct child as a tuple in that order. """ context_for_parents = None context_for_children = None url_direct_child = None if not context: return (rev_id, None, None) path_list = _get_path_list(context) context_for_parents = '%s/%s' % (context, rev_id) prev_for_children = path_list[:-1] if len(prev_for_children) > 0: context_for_children = '/'.join(prev_for_children) child_id = path_list[-1] # This commit is not the first commit in the path if context_for_children: url_direct_child = reverse('revision-context', kwargs={'sha1_git': child_id, 'context': context_for_children}) # This commit is the first commit in the path else: url_direct_child = reverse('revision', kwargs={'sha1_git': child_id}) return (context_for_parents, context_for_children, url_direct_child) def _make_child_url(rev_children, context): """Helper for enrich_revision: retrieve the list of urls corresponding to the children of the current revision according to the navigation breadcrumbs. 
Args: rev_children: a list of revision id context: the '/'-separated navigation breadcrumbs Returns: the list of the children urls according to the context """ children = [] for child in rev_children: if context and child != _get_path_list(context)[-1]: children.append(reverse('revision', kwargs={'sha1_git': child})) elif not context: children.append(reverse('revision', kwargs={'sha1_git': child})) return children def enrich_revision(revision, context=None): """Enrich revision with links where it makes sense (directory, parents). Keep track of the navigation breadcrumbs if they are specified. Args: revision: the revision as a dict context: the navigation breadcrumbs as a /-separated string of revision sha1_git """ ctx_parents, ctx_children, url_direct_child = _get_revision_contexts( revision['id'], context) revision['url'] = reverse('revision', kwargs={'sha1_git': revision['id']}) revision['history_url'] = reverse('revision-log', kwargs={'sha1_git': revision['id']}) if context: revision['history_context_url'] = reverse( 'revision-log', kwargs={'sha1_git': revision['id'], 'prev_sha1s': context}) if 'author' in revision: author = revision['author'] revision['author_url'] = reverse('person', kwargs={'person_id': author['id']}) if 'committer' in revision: committer = revision['committer'] revision['committer_url'] = \ reverse('person', kwargs={'person_id': committer['id']}) if 'directory' in revision: revision['directory_url'] = \ reverse('directory', kwargs={'sha1_git': revision['directory']}) if 'parents' in revision: parents = [] for parent in revision['parents']: parents.append({ 'id': parent, 'url': reverse('revision', kwargs={'sha1_git': parent}) }) revision['parents'] = parents if 'children' in revision: children = _make_child_url(revision['children'], context) if url_direct_child: children.append(url_direct_child) revision['children_urls'] = children else: if url_direct_child: revision['children_urls'] = [url_direct_child] if 'message_decoding_failed' in revision: revision['message_url'] = reverse('revision-raw-message', kwargs={'sha1_git': revision['id']}) return revision def shorten_path(path): """Shorten the given path: for each hash present, only return the first 8 characters followed by an ellipsis""" sha256_re = r'([0-9a-f]{8})[0-9a-z]{56}' sha1_re = r'([0-9a-f]{8})[0-9a-f]{32}' ret = re.sub(sha256_re, r'\1...', path) return re.sub(sha1_re, r'\1...', ret) + + +def get_query_params(request): + """Utility functions for retrieving query parameters from a DRF request + object. 
Its purpose is to handle multiple versions of DRF.""" + if hasattr(request, 'query_params'): + # DRF >= 3.0 uses query_params attribute + return request.query_params + else: + # while DRF < 3.0 uses QUERY_PARAMS attribute + return request.QUERY_PARAMS diff --git a/swh/web/api/views/__init__.py b/swh/web/api/views/__init__.py index 8f1783fd7..10e897e5d 100644 --- a/swh/web/api/views/__init__.py +++ b/swh/web/api/views/__init__.py @@ -1,92 +1,92 @@ # Copyright (C) 2015-2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU Affero General Public License version 3, or any later version # See top-level LICENSE file for more information from django.conf.urls import url from rest_framework.response import Response from rest_framework.decorators import api_view from types import GeneratorType from swh.web.api.exc import NotFoundExc from swh.web.api.apiurls import APIUrls, api_route # canned doc string snippets that are used in several doc strings _doc_arg_content_id = """A "[hash_type:]hash" content identifier, where hash_type is one of "sha1" (the default), "sha1_git", "sha256", and hash is a checksum obtained with the hash_type hashing algorithm.""" _doc_arg_last_elt = 'element to start listing from, for pagination purposes' _doc_arg_per_page = 'number of elements to list, for pagination purposes' _doc_exc_bad_id = 'syntax error in the given identifier(s)' _doc_exc_id_not_found = 'no object matching the given criteria could be found' _doc_ret_revision_meta = 'metadata of the revision identified by sha1_git' _doc_ret_revision_log = """list of dictionaries representing the metadata of each revision found in the commit log heading to revision sha1_git. For each commit at least the following information are returned: author/committer, authoring/commit timestamps, revision id, commit message, parent (i.e., immediately preceding) commits, "root" directory id.""" _doc_header_link = """indicates that a subsequent result page is available, pointing to it""" def _api_lookup(lookup_fn, *args, notfound_msg='Object not found', enrich_fn=lambda x: x): """Capture a redundant behavior of: - looking up the backend with a criteria (be it an identifier or checksum) passed to the function lookup_fn - if nothing is found, raise an NotFoundExc exception with error message notfound_msg. - Otherwise if something is returned: - either as list, map or generator, map the enrich_fn function to it and return the resulting data structure as list. - either as dict and pass to enrich_fn and return the dict enriched. Args: - criteria: discriminating criteria to lookup - lookup_fn: function expects one criteria and optional supplementary *args. - notfound_msg: if nothing matching the criteria is found, raise NotFoundExc with this error message. - enrich_fn: Function to use to enrich the result returned by lookup_fn. Default to the identity function if not provided. - *args: supplementary arguments to pass to lookup_fn. Raises: NotFoundExp or whatever `lookup_fn` raises. """ res = lookup_fn(*args) if not res: raise NotFoundExc(notfound_msg) if isinstance(res, (map, list, GeneratorType)): return [enrich_fn(x) for x in res] return enrich_fn(res) -@api_view() +@api_view(['GET', 'HEAD']) def api_home(request): return Response({}, template_name='api.html') APIUrls.urlpatterns.append(url(r'^$', api_home, name='homepage')) @api_route(r'/', 'endpoints') def api_endpoints(request): """Display the list of opened api endpoints. 
""" routes = APIUrls.get_app_endpoints().copy() for route, doc in routes.items(): doc['doc_intro'] = doc['docstring'].split('\n\n')[0] # Return a list of routes with consistent ordering env = { 'doc_routes': sorted(routes.items()) } return Response(env, template_name="api-endpoints.html") diff --git a/swh/web/api/views/content.py b/swh/web/api/views/content.py index 67593acb7..47e0f2a61 100644 --- a/swh/web/api/views/content.py +++ b/swh/web/api/views/content.py @@ -1,341 +1,341 @@ # Copyright (C) 2015-2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU Affero General Public License version 3, or any later version # See top-level LICENSE file for more information import functools from django.http import QueryDict -from django.urls import reverse +from django.core.urlresolvers import reverse from django.http import HttpResponse from swh.web.api import service, utils from swh.web.api import apidoc as api_doc from swh.web.api.exc import NotFoundExc, ForbiddenExc from swh.web.api.apiurls import api_route from swh.web.api.views import ( _api_lookup, _doc_exc_id_not_found, _doc_header_link, _doc_arg_last_elt, _doc_arg_per_page, _doc_exc_bad_id, _doc_arg_content_id ) @api_route(r'/content/(?P.+)/provenance/', 'content-provenance') @api_doc.route('/content/provenance/', tags=['hidden']) @api_doc.arg('q', default='sha1_git:88b9b366facda0b5ff8d8640ee9279bed346f242', argtype=api_doc.argtypes.algo_and_hash, argdoc=_doc_arg_content_id) @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""List of provenance information (dict) for the matched content.""") def api_content_provenance(request, q): """Return content's provenance information if any. """ def _enrich_revision(provenance): p = provenance.copy() p['revision_url'] = \ reverse('revision', kwargs={'sha1_git': provenance['revision']}) p['content_url'] = \ reverse('content', kwargs={'q': 'sha1_git:%s' % provenance['content']}) p['origin_url'] = \ reverse('origin', kwargs={'origin_id': provenance['origin']}) p['origin_visits_url'] = \ reverse('origin-visits', kwargs={'origin_id': provenance['origin']}) p['origin_visit_url'] = \ reverse('origin-visit', kwargs={'origin_id': provenance['origin'], 'visit_id': provenance['visit']}) return p return _api_lookup( service.lookup_content_provenance, q, notfound_msg='Content with {} not found.'.format(q), enrich_fn=_enrich_revision) @api_route(r'/content/(?P.+)/filetype/', 'content-filetype') @api_doc.route('/content/filetype/', tags=['upcoming']) @api_doc.arg('q', default='sha1:1fc6129a692e7a87b5450e2ba56e7669d0c5775d', argtype=api_doc.argtypes.algo_and_hash, argdoc=_doc_arg_content_id) @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""Filetype information (dict) for the matched content.""") def api_content_filetype(request, q): """Get information about the detected MIME type of a content object. 
""" return _api_lookup( service.lookup_content_filetype, q, notfound_msg='No filetype information found for content {}.'.format(q), enrich_fn=utils.enrich_metadata_endpoint) @api_route(r'/content/(?P.+)/language/', 'content-language') @api_doc.route('/content/language/', tags=['upcoming']) @api_doc.arg('q', default='sha1:1fc6129a692e7a87b5450e2ba56e7669d0c5775d', argtype=api_doc.argtypes.algo_and_hash, argdoc=_doc_arg_content_id) @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""Language information (dict) for the matched content.""") def api_content_language(request, q): """Get information about the detected (programming) language of a content object. """ return _api_lookup( service.lookup_content_language, q, notfound_msg='No language information found for content {}.'.format(q), enrich_fn=utils.enrich_metadata_endpoint) @api_route(r'/content/(?P.+)/license/', 'content-license') @api_doc.route('/content/license/', tags=['upcoming']) @api_doc.arg('q', default='sha1:1fc6129a692e7a87b5450e2ba56e7669d0c5775d', argtype=api_doc.argtypes.algo_and_hash, argdoc=_doc_arg_content_id) @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""License information (dict) for the matched content.""") def api_content_license(request, q): """Get information about the detected license of a content object. """ return _api_lookup( service.lookup_content_license, q, notfound_msg='No license information found for content {}.'.format(q), enrich_fn=utils.enrich_metadata_endpoint) @api_route(r'/content/(?P.+)/ctags/', 'content-ctags') @api_doc.route('/content/ctags/', tags=['upcoming']) @api_doc.arg('q', default='sha1:1fc6129a692e7a87b5450e2ba56e7669d0c5775d', argtype=api_doc.argtypes.algo_and_hash, argdoc=_doc_arg_content_id) @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""Ctags symbol (dict) for the matched content.""") def api_content_ctags(request, q): """Get information about all `Ctags `_-style symbols defined in a content object. """ return _api_lookup( service.lookup_content_ctags, q, notfound_msg='No ctags symbol found for content {}.'.format(q), enrich_fn=utils.enrich_metadata_endpoint) @api_route(r'/content/(?P.+)/raw/', 'content-raw') @api_doc.route('/content/raw/', handle_response=True) @api_doc.arg('q', default='adc83b19e793491b1c6ea0fd8b46cd9f32e592fc', argtype=api_doc.argtypes.algo_and_hash, argdoc=_doc_arg_content_id) @api_doc.param('filename', default=None, argtype=api_doc.argtypes.str, doc='User\'s desired filename. If provided, the downloaded' ' content will get that filename.') @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.octet_stream, retdoc='The raw content data as an octet stream') def api_content_raw(request, q): """Get the raw content of a content object (AKA "blob"), as a byte sequence. """ def generate(content): yield content['data'] content_raw = service.lookup_content_raw(q) if not content_raw: raise NotFoundExc('Content %s is not found.' 
% q) content_filetype = service.lookup_content_filetype(q) if not content_filetype: raise NotFoundExc('Content %s is not available for download.' % q) mimetype = content_filetype['mimetype'] if 'text/' not in mimetype: raise ForbiddenExc('Only textual content is available for download. ' 'Actual content mimetype is %s.' % mimetype) - filename = request.query_params.get('filename') + filename = utils.get_query_params(request).get('filename') if not filename: filename = 'content_%s_raw' % q.replace(':', '_') response = HttpResponse(generate(content_raw), content_type='application/octet-stream') response['Content-disposition'] = 'attachment; filename=%s' % filename return response @api_route(r'/content/symbol/search/', 'content-symbol', methods=['POST']) @api_route(r'/content/symbol/(?P.+)/', 'content-symbol') @api_doc.route('/content/symbol/', tags=['upcoming']) @api_doc.arg('q', default='hello', argtype=api_doc.argtypes.str, argdoc="""An expression string to lookup in swh's raw content""") @api_doc.header('Link', doc=_doc_header_link) @api_doc.param('last_sha1', default=None, argtype=api_doc.argtypes.str, doc=_doc_arg_last_elt) @api_doc.param('per_page', default=10, argtype=api_doc.argtypes.int, doc=_doc_arg_per_page) @api_doc.returns(rettype=api_doc.rettypes.list, retdoc="""A list of dict whose content matches the expression. Each dict has the following keys: - id (bytes): identifier of the content - name (text): symbol whose content match the expression - kind (text): kind of the symbol that matched - lang (text): Language for that entry - line (int): Number line for the symbol """) def api_content_symbol(request, q=None): """Search content objects by `Ctags `_-style symbol (e.g., function name, data type, method, ...). """ result = {} - last_sha1 = request.query_params.get('last_sha1', None) - per_page = int(request.query_params.get('per_page', '10')) + last_sha1 = utils.get_query_params(request).get('last_sha1', None) + per_page = int(utils.get_query_params(request).get('per_page', '10')) def lookup_exp(exp, last_sha1=last_sha1, per_page=per_page): return service.lookup_expression(exp, last_sha1, per_page) symbols = _api_lookup( lookup_exp, q, notfound_msg="No indexed raw content match expression '{}'.".format(q), enrich_fn=functools.partial(utils.enrich_content, top_url=True)) if symbols: l = len(symbols) if l == per_page: query_params = QueryDict('', mutable=True) new_last_sha1 = symbols[-1]['sha1'] query_params['last_sha1'] = new_last_sha1 - if request.query_params.get('per_page'): + if utils.get_query_params(request).get('per_page'): query_params['per_page'] = per_page result['headers'] = { 'link-next': reverse('content-symbol', kwargs={'q': q}) + '?' 
+ query_params.urlencode() } result.update({ 'results': symbols }) return result @api_route(r'/content/known/search/', 'content-known', methods=['POST']) @api_route(r'/content/known/(?P(?!search).*)/', 'content-known') @api_doc.route('/content/known/', tags=['hidden']) @api_doc.arg('q', default='adc83b19e793491b1c6ea0fd8b46cd9f32e592fc', argtype=api_doc.argtypes.sha1, argdoc='content identifier as a sha1 checksum') @api_doc.param('q', default=None, argtype=api_doc.argtypes.str, doc="""(POST request) An algo_hash:hash string, where algo_hash is one of sha1, sha1_git or sha256 and hash is the hash to search for in SWH""") @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""a dictionary with results (found/not found for each given identifier) and statistics about how many identifiers were found""") def api_check_content_known(request, q=None): """Check whether some content (AKA "blob") is present in the archive. Lookup can be performed by various means: - a GET request with one or several hashes, separated by ',' - a POST request with one or several hashes, passed as (multiple) values for parameter 'q' """ response = {'search_res': None, 'search_stats': None} search_stats = {'nbfiles': 0, 'pct': 0} search_res = None queries = [] # GET: Many hash separated values request if q: hashes = q.split(',') for v in hashes: queries.append({'filename': None, 'sha1': v}) # POST: Many hash requests in post form submission elif request.method == 'POST': data = request.data # Remove potential inputs with no associated value for k, v in data.items(): if v is not None: if k == 'q' and len(v) > 0: queries.append({'filename': None, 'sha1': v}) elif v != '': queries.append({'filename': k, 'sha1': v}) if queries: lookup = service.lookup_multiple_hashes(queries) result = [] l = len(queries) for el in lookup: res_d = {'sha1': el['sha1'], 'found': el['found']} if 'filename' in el and el['filename']: res_d['filename'] = el['filename'] result.append(res_d) search_res = result nbfound = len([x for x in lookup if x['found']]) search_stats['nbfiles'] = l search_stats['pct'] = (nbfound / l) * 100 response['search_res'] = search_res response['search_stats'] = search_stats return response @api_route(r'/content/(?P.+)/', 'content') @api_doc.route('/content/') @api_doc.arg('q', default='adc83b19e793491b1c6ea0fd8b46cd9f32e592fc', argtype=api_doc.argtypes.algo_and_hash, argdoc=_doc_arg_content_id) @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""known metadata for content identified by q""") def api_content_metadata(request, q): """Get information about a content (AKA "blob") object. 
""" return _api_lookup( service.lookup_content, q, notfound_msg='Content with {} not found.'.format(q), enrich_fn=utils.enrich_content) diff --git a/swh/web/api/views/origin.py b/swh/web/api/views/origin.py index 114d26653..09b38c2cb 100644 --- a/swh/web/api/views/origin.py +++ b/swh/web/api/views/origin.py @@ -1,178 +1,178 @@ # Copyright (C) 2015-2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU Affero General Public License version 3, or any later version # See top-level LICENSE file for more information from django.http import QueryDict -from django.urls import reverse +from django.core.urlresolvers import reverse from swh.web.api import service, utils from swh.web.api import apidoc as api_doc from swh.web.api.apiurls import api_route from swh.web.api.views import ( _api_lookup, _doc_exc_id_not_found, _doc_header_link, _doc_arg_last_elt, _doc_arg_per_page ) @api_route(r'/origin/(?P[0-9]+)/', 'origin') @api_route(r'/origin/(?P[a-z]+)/url/(?P.+)', 'origin') @api_doc.route('/origin/') @api_doc.arg('origin_id', default=1, argtype=api_doc.argtypes.int, argdoc='origin identifier (when looking up by ID)') @api_doc.arg('origin_type', default='git', argtype=api_doc.argtypes.str, argdoc='origin type (when looking up by type+URL)') @api_doc.arg('origin_url', default='https://github.com/hylang/hy', argtype=api_doc.argtypes.path, argdoc='origin URL (when looking up by type+URL)') @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""The metadata of the origin corresponding to the given criteria""") def api_origin(request, origin_id=None, origin_type=None, origin_url=None): """Get information about a software origin. Software origins might be looked up by origin type and canonical URL (e.g., "git" + a "git clone" URL), or by their unique (but otherwise meaningless) identifier. """ ori_dict = { 'id': origin_id, 'type': origin_type, 'url': origin_url } ori_dict = {k: v for k, v in ori_dict.items() if ori_dict[k]} if 'id' in ori_dict: error_msg = 'Origin with id %s not found.' % ori_dict['id'] else: error_msg = 'Origin with type %s and URL %s not found' % ( ori_dict['type'], ori_dict['url']) def _enrich_origin(origin): if 'id' in origin: o = origin.copy() o['origin_visits_url'] = \ reverse('origin-visits', kwargs={'origin_id': origin['id']}) return o return origin return _api_lookup( service.lookup_origin, ori_dict, notfound_msg=error_msg, enrich_fn=_enrich_origin) @api_route(r'/origin/(?P[0-9]+)/visits/', 'origin-visits') @api_doc.route('/origin/visits/') @api_doc.arg('origin_id', default=1, argtype=api_doc.argtypes.int, argdoc='software origin identifier') @api_doc.header('Link', doc=_doc_header_link) @api_doc.param('last_visit', default=None, argtype=api_doc.argtypes.int, doc=_doc_arg_last_elt) @api_doc.param('per_page', default=10, argtype=api_doc.argtypes.int, doc=_doc_arg_per_page) @api_doc.returns(rettype=api_doc.rettypes.list, retdoc="""a list of dictionaries describing individual visits. For each visit, its identifier, timestamp (as UNIX time), outcome, and visit-specific URL for more information are given.""") def api_origin_visits(request, origin_id): """Get information about all visits of a given software origin. 
""" result = {} - per_page = int(request.query_params.get('per_page', '10')) - last_visit = request.query_params.get('last_visit') + per_page = int(utils.get_query_params(request).get('per_page', '10')) + last_visit = utils.get_query_params(request).get('last_visit') if last_visit: last_visit = int(last_visit) def _lookup_origin_visits( origin_id, last_visit=last_visit, per_page=per_page): return service.lookup_origin_visits( origin_id, last_visit=last_visit, per_page=per_page) def _enrich_origin_visit(origin_visit): ov = origin_visit.copy() ov['origin_visit_url'] = reverse('origin-visit', kwargs={'origin_id': origin_id, 'visit_id': ov['visit']}) return ov r = _api_lookup( _lookup_origin_visits, origin_id, notfound_msg='No origin {} found'.format(origin_id), enrich_fn=_enrich_origin_visit) if r: l = len(r) if l == per_page: new_last_visit = r[-1]['visit'] query_params = QueryDict('', mutable=True) query_params['last_visit'] = new_last_visit - if request.query_params.get('per_page'): + if utils.get_query_params(request).get('per_page'): query_params['per_page'] = per_page result['headers'] = { 'link-next': reverse('origin-visits', kwargs={'origin_id': origin_id}) + '?' + query_params.urlencode() } result.update({ 'results': r }) return result @api_route(r'/origin/(?P[0-9]+)/visit/(?P[0-9]+)/', 'origin-visit') @api_doc.route('/origin/visit/') @api_doc.arg('origin_id', default=1, argtype=api_doc.argtypes.int, argdoc='software origin identifier') @api_doc.arg('visit_id', default=1, argtype=api_doc.argtypes.int, argdoc="""visit identifier, relative to the origin identified by origin_id""") @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""dictionary containing both metadata for the entire visit (e.g., timestamp as UNIX time, visit outcome, etc.) and what was at the software origin during the visit (i.e., a mapping from branches to other archive objects)""") def api_origin_visit(request, origin_id, visit_id): """Get information about a specific visit of a software origin. 
""" def _enrich_origin_visit(origin_visit): ov = origin_visit.copy() ov['origin_url'] = reverse('origin', kwargs={'origin_id': ov['origin']}) if 'occurrences' in ov: ov['occurrences'] = { k: utils.enrich_object(v) for k, v in ov['occurrences'].items() } return ov return _api_lookup( service.lookup_origin_visit, origin_id, visit_id, notfound_msg=('No visit {} for origin {} found' .format(visit_id, origin_id)), enrich_fn=_enrich_origin_visit) diff --git a/swh/web/api/views/revision.py b/swh/web/api/views/revision.py index 4be5cf01c..fec1caca9 100644 --- a/swh/web/api/views/revision.py +++ b/swh/web/api/views/revision.py @@ -1,420 +1,420 @@ # Copyright (C) 2015-2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU Affero General Public License version 3, or any later version # See top-level LICENSE file for more information from django.http import QueryDict -from django.urls import reverse +from django.core.urlresolvers import reverse from django.http import HttpResponse from swh.web.api import service, utils from swh.web.api import apidoc as api_doc from swh.web.api.apiurls import api_route from swh.web.api.views import ( _api_lookup, _doc_exc_id_not_found, _doc_header_link, _doc_arg_per_page, _doc_exc_bad_id, _doc_ret_revision_log, _doc_ret_revision_meta ) def _revision_directory_by(revision, path, request_path, limit=100, with_data=False): """Compute the revision matching criterion's directory or content data. Args: revision: dictionary of criterions representing a revision to lookup path: directory's path to lookup request_path: request path which holds the original context to limit: optional query parameter to limit the revisions log (default to 100). For now, note that this limit could impede the transitivity conclusion about sha1_git not being an ancestor of with_data: indicate to retrieve the content's raw data if path resolves to a content. """ def enrich_directory_local(dir, context_url=request_path): return utils.enrich_directory(dir, context_url) rev_id, result = service.lookup_directory_through_revision( revision, path, limit=limit, with_data=with_data) content = result['content'] if result['type'] == 'dir': # dir_entries result['content'] = list(map(enrich_directory_local, content)) else: # content result['content'] = utils.enrich_content(content) return result @api_route(r'/revision/origin/(?P[0-9]+)/log/', 'revision-origin-log') @api_route(r'/revision/origin/(?P[0-9]+)' r'/ts/(?P.+)/log/', 'revision-origin-log') @api_route(r'/revision/origin/(?P[0-9]+)' r'/branch/(?P.+)' r'/ts/(?P.+)/log/', 'revision-origin-log') @api_route(r'/revision/origin/(?P[0-9]+)' r'/branch/(?P.+)/log/', 'revision-origin-log') @api_doc.route('/revision/origin/log/') @api_doc.arg('origin_id', default=1, argtype=api_doc.argtypes.int, argdoc="The revision's SWH origin identifier") @api_doc.arg('branch_name', default='refs/heads/master', argtype=api_doc.argtypes.path, argdoc="""(Optional) The revision's branch name within the origin specified. 
Defaults to 'refs/heads/master'.""") @api_doc.arg('ts', default='2000-01-17T11:23:54+00:00', argtype=api_doc.argtypes.ts, argdoc="""(Optional) A time or timestamp string to parse""") @api_doc.header('Link', doc=_doc_header_link) @api_doc.param('per_page', default=10, argtype=api_doc.argtypes.int, doc=_doc_arg_per_page) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc=_doc_ret_revision_log) def api_revision_log_by(request, origin_id, branch_name='refs/heads/master', ts=None): """Show the commit log for a revision, searching for it based on software origin, branch name, and/or visit timestamp. This endpoint behaves like ``/log``, but operates on the revision that has been found at a given software origin, close to a given point in time, pointed by a given branch. """ result = {} - per_page = int(request.query_params.get('per_page', '10')) + per_page = int(utils.get_query_params(request).get('per_page', '10')) if ts: ts = utils.parse_timestamp(ts) def lookup_revision_log_by_with_limit(o_id, br, ts, limit=per_page+1): return service.lookup_revision_log_by(o_id, br, ts, limit) error_msg = 'No revision matching origin %s ' % origin_id error_msg += ', branch name %s' % branch_name error_msg += (' and time stamp %s.' % ts) if ts else '.' rev_get = _api_lookup( lookup_revision_log_by_with_limit, origin_id, branch_name, ts, notfound_msg=error_msg, enrich_fn=utils.enrich_revision) l = len(rev_get) if l == per_page+1: revisions = rev_get[:-1] last_sha1_git = rev_get[-1]['id'] params = {k: v for k, v in {'origin_id': origin_id, 'branch_name': branch_name, 'ts': ts, }.items() if v is not None} query_params = QueryDict('', mutable=True) query_params['sha1_git'] = last_sha1_git - if request.query_params.get('per_page'): + if utils.get_query_params(request).get('per_page'): query_params['per_page'] = per_page result['headers'] = { 'link-next': reverse('revision-origin-log', kwargs=params) + (('?' 
+ query_params.urlencode()) if len(query_params) > 0 else '') } else: revisions = rev_get result.update({'results': revisions}) return result @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)/directory/', 'revision-directory') @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)/directory/(?P<path>.+)/', 'revision-directory') @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)' r'/branch/(?P<branch_name>.+)/directory/', 'revision-directory') @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)' r'/branch/(?P<branch_name>.+)/ts/(?P<ts>.+)/directory/', 'revision-directory') @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)' r'/branch/(?P<branch_name>.+)/directory/(?P<path>.+)/', 'revision-directory') @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)' r'/branch/(?P<branch_name>.+)/ts/(?P<ts>.+)' r'/directory/(?P<path>.+)/', 'revision-directory') @api_doc.route('/revision/origin/directory/', tags=['hidden']) @api_doc.arg('origin_id', default=1, argtype=api_doc.argtypes.int, argdoc="The revision's origin's SWH identifier") @api_doc.arg('branch_name', default='refs/heads/master', argtype=api_doc.argtypes.path, argdoc="""The optional branch for the given origin (default to master""") @api_doc.arg('ts', default='2000-01-17T11:23:54+00:00', argtype=api_doc.argtypes.ts, argdoc="""Optional timestamp (default to the nearest time crawl of timestamp)""") @api_doc.arg('path', default='Dockerfile', argtype=api_doc.argtypes.path, argdoc='The path to the directory or file to display') @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""The metadata of the revision corresponding to the given criteria""") def api_directory_through_revision_origin(request, origin_id, branch_name="refs/heads/master", ts=None, path=None, with_data=False): """Display directory or content information through a revision identified by origin/branch/timestamp. """ if ts: ts = utils.parse_timestamp(ts) return _revision_directory_by({'origin_id': origin_id, 'branch_name': branch_name, 'ts': ts }, path, request.path, with_data=with_data) @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)/', 'revision-origin') @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)' r'/branch/(?P<branch_name>.+)/', 'revision-origin') @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)' r'/branch/(?P<branch_name>.+)/ts/(?P<ts>.+)/', 'revision-origin') @api_route(r'/revision/origin/(?P<origin_id>[0-9]+)/ts/(?P<ts>.+)/', 'revision-origin') @api_doc.route('/revision/origin/') @api_doc.arg('origin_id', default=1, argtype=api_doc.argtypes.int, argdoc='software origin identifier') @api_doc.arg('branch_name', default='refs/heads/master', argtype=api_doc.argtypes.path, argdoc="""(optional) fully-qualified branch name, e.g., "refs/heads/master". Defaults to the master branch.""") @api_doc.arg('ts', default=None, argtype=api_doc.argtypes.ts, argdoc="""(optional) timestamp close to which the revision pointed by the given branch should be looked up. Defaults to now.""") @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc=_doc_ret_revision_meta) def api_revision_with_origin(request, origin_id, branch_name="refs/heads/master", ts=None): """Get information about a revision, searching for it based on software origin, branch name, and/or visit timestamp. This endpoint behaves like ``/revision``, but operates on the revision that has been found at a given software origin, close to a given point in time, pointed by a given branch.
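The origin log endpoint above (and /revision/log/ further down) asks the service layer for per_page + 1 revisions: the extra element only signals that a further page exists, and its id becomes the cursor used to build the link-next entry. A stripped-down, hypothetical illustration of that look-ahead pattern (the names are not taken from this diff):

def lookahead_page(fetch, per_page=10):
    # Fetch one extra item to detect whether a next page exists.
    items = fetch(per_page + 1)
    if len(items) == per_page + 1:
        # Drop the look-ahead item; its id is the cursor for the next page.
        return items[:per_page], items[per_page]['id']
    return items, None

# e.g. results, next_cursor = lookahead_page(lambda limit: revisions[:limit])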
""" ts = utils.parse_timestamp(ts) return _api_lookup( service.lookup_revision_by, origin_id, branch_name, ts, notfound_msg=('Revision with (origin_id: {}, branch_name: {}' ', ts: {}) not found.'.format(origin_id, branch_name, ts)), enrich_fn=utils.enrich_revision) @api_route(r'/revision/(?P[0-9a-f]+)/prev/(?P[0-9a-f/]+)/', 'revision-context') @api_doc.route('/revision/prev/', tags=['hidden']) @api_doc.arg('sha1_git', default='ec72c666fb345ea5f21359b7bc063710ce558e39', argtype=api_doc.argtypes.sha1_git, argdoc="The revision's sha1_git identifier") @api_doc.arg('context', default='6adc4a22f20bbf3bbc754f1ec8c82be5dfb5c71a', argtype=api_doc.argtypes.path, argdoc='The navigation breadcrumbs -- use at your own risk') @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc='The metadata of the revision identified by sha1_git') def api_revision_with_context(request, sha1_git, context): """Return information about revision with id sha1_git. """ def _enrich_revision(revision, context=context): return utils.enrich_revision(revision, context) return _api_lookup( service.lookup_revision, sha1_git, notfound_msg='Revision with sha1_git %s not found.' % sha1_git, enrich_fn=_enrich_revision) @api_route(r'/revision/(?P[0-9a-f]+)/', 'revision') @api_doc.route('/revision/') @api_doc.arg('sha1_git', default='aafb16d69fd30ff58afdd69036a26047f3aebdc6', argtype=api_doc.argtypes.sha1_git, argdoc="revision identifier") @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc=_doc_ret_revision_meta) def api_revision(request, sha1_git): """Get information about a revision. Revisions are identified by SHA1 checksums, compatible with Git commit identifiers. See ``revision_identifier`` in our `data model module `_ for details about how they are computed. 
""" return _api_lookup( service.lookup_revision, sha1_git, notfound_msg='Revision with sha1_git {} not found.'.format(sha1_git), enrich_fn=utils.enrich_revision) @api_route(r'/revision/(?P[0-9a-f]+)/raw/', 'revision-raw-message') @api_doc.route('/revision/raw/', tags=['hidden'], handle_response=True) @api_doc.arg('sha1_git', default='ec72c666fb345ea5f21359b7bc063710ce558e39', argtype=api_doc.argtypes.sha1_git, argdoc="The queried revision's sha1_git identifier") @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.octet_stream, retdoc="""The message of the revision identified by sha1_git as a downloadable octet stream""") def api_revision_raw_message(request, sha1_git): """Return the raw data of the message of revision identified by sha1_git """ raw = service.lookup_revision_message(sha1_git) response = HttpResponse(raw['message'], content_type='application/octet-stream') response['Content-disposition'] = \ 'attachment;filename=rev_%s_raw' % sha1_git return response @api_route(r'/revision/(?P[0-9a-f]+)/directory/', 'revision-directory') @api_route(r'/revision/(?P[0-9a-f]+)/directory/(?P.+)/', 'revision-directory') @api_doc.route('/revision/directory/') @api_doc.arg('sha1_git', default='ec72c666fb345ea5f21359b7bc063710ce558e39', argtype=api_doc.argtypes.sha1_git, argdoc='revision identifier') @api_doc.arg('dir_path', default='Documentation/BUG-HUNTING', argtype=api_doc.argtypes.path, argdoc="""path relative to the root directory of revision identifier by sha1_git""") @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc="""either a list of directory entries with their metadata, or the metadata of a single directory entry""") def api_revision_directory(request, sha1_git, dir_path=None, with_data=False): """Get information about directory (entry) objects associated to revisions. Each revision is associated to a single "root" directory. This endpoint behaves like ``/directory/``, but operates on the root directory associated to a given revision. """ return _revision_directory_by({'sha1_git': sha1_git}, dir_path, request.path, with_data=with_data) @api_route(r'/revision/(?P[0-9a-f]+)/log/', 'revision-log') @api_route(r'/revision/(?P[0-9a-f]+)' r'/prev/(?P[0-9a-f/]+)/log/', 'revision-log') @api_doc.route('/revision/log/') @api_doc.arg('sha1_git', default='37fc9e08d0c4b71807a4f1ecb06112e78d91c283', argtype=api_doc.argtypes.sha1_git, argdoc='revision identifier') @api_doc.arg('prev_sha1s', default='6adc4a22f20bbf3bbc754f1ec8c82be5dfb5c71a', argtype=api_doc.argtypes.path, argdoc="""(Optional) Navigation breadcrumbs (descendant revisions previously visited). If multiple values, use / as delimiter. """) @api_doc.header('Link', doc=_doc_header_link) @api_doc.param('per_page', default=10, argtype=api_doc.argtypes.int, doc=_doc_arg_per_page) @api_doc.raises(exc=api_doc.excs.badinput, doc=_doc_exc_bad_id) @api_doc.raises(exc=api_doc.excs.notfound, doc=_doc_exc_id_not_found) @api_doc.returns(rettype=api_doc.rettypes.dict, retdoc=_doc_ret_revision_log) def api_revision_log(request, sha1_git, prev_sha1s=None): """Get a list of all revisions heading to a given one, i.e., show the commit log. 
""" result = {} - per_page = int(request.query_params.get('per_page', '10')) + per_page = int(utils.get_query_params(request).get('per_page', '10')) def lookup_revision_log_with_limit(s, limit=per_page+1): return service.lookup_revision_log(s, limit) error_msg = 'Revision with sha1_git %s not found.' % sha1_git rev_get = _api_lookup(lookup_revision_log_with_limit, sha1_git, notfound_msg=error_msg, enrich_fn=utils.enrich_revision) l = len(rev_get) if l == per_page+1: rev_backward = rev_get[:-1] new_last_sha1 = rev_get[-1]['id'] query_params = QueryDict('', mutable=True) - if request.query_params.get('per_page'): + if utils.get_query_params(request).get('per_page'): query_params['per_page'] = per_page result['headers'] = { 'link-next': reverse('revision-log', kwargs={'sha1_git': new_last_sha1}) + (('?' + query_params.urlencode()) if len(query_params) > 0 else '') } else: rev_backward = rev_get if not prev_sha1s: # no nav breadcrumbs, so we're done revisions = rev_backward else: rev_forward_ids = prev_sha1s.split('/') rev_forward = _api_lookup( service.lookup_revision_multiple, rev_forward_ids, notfound_msg=error_msg, enrich_fn=utils.enrich_revision) revisions = rev_forward + rev_backward result.update({ 'results': revisions }) return result diff --git a/swh/web/settings.py b/swh/web/settings.py index 63f904e79..d52dbbe25 100644 --- a/swh/web/settings.py +++ b/swh/web/settings.py @@ -1,181 +1,180 @@ # Copyright (C) 2017 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information """ Django settings for swhweb project. Generated by 'django-admin startproject' using Django 1.11.3. For more information on this file, see https://docs.djangoproject.com/en/1.11/topics/settings/ For the full list of settings and their values, see https://docs.djangoproject.com/en/1.11/ref/settings/ """ import os from swh.web.config import get_config swh_web_config = get_config() # Build paths inside the project like this: os.path.join(BASE_DIR, ...) PROJECT_DIR = os.path.dirname(os.path.abspath(__file__)) # Quick-start development settings - unsuitable for production # See https://docs.djangoproject.com/en/1.11/howto/deployment/checklist/ # SECURITY WARNING: keep the secret key used in production secret! SECRET_KEY = swh_web_config['secret_key'] # SECURITY WARNING: don't run with debug turned on in production! 
DEBUG = swh_web_config['debug'] ALLOWED_HOSTS = ['127.0.0.1', 'localhost', 'testserver'] # Application definition INSTALLED_APPS = [ 'django.contrib.admin', 'django.contrib.auth', 'django.contrib.contenttypes', 'django.contrib.sessions', 'django.contrib.messages', 'django.contrib.staticfiles', - 'django_extensions', 'rest_framework', 'swh.web.api' ] MIDDLEWARE = [ 'django.middleware.security.SecurityMiddleware', 'django.contrib.sessions.middleware.SessionMiddleware', 'django.middleware.common.CommonMiddleware', 'django.middleware.csrf.CsrfViewMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'django.contrib.messages.middleware.MessageMiddleware', 'django.middleware.clickjacking.XFrameOptionsMiddleware', ] ROOT_URLCONF = 'swh.web.urls' TEMPLATES = [ { 'BACKEND': 'django.template.backends.django.DjangoTemplates', 'DIRS': [], 'APP_DIRS': True, 'OPTIONS': { 'context_processors': [ 'django.template.context_processors.debug', 'django.template.context_processors.request', 'django.contrib.auth.context_processors.auth', 'django.contrib.messages.context_processors.messages', ], }, }, ] WSGI_APPLICATION = 'swh.web.wsgi.application' # Database # https://docs.djangoproject.com/en/1.11/ref/settings/#databases DATABASES = { 'default': { 'ENGINE': 'django.db.backends.sqlite3', 'NAME': os.path.join(PROJECT_DIR, 'db.sqlite3'), } } # Password validation # https://docs.djangoproject.com/en/1.11/ref/settings/#auth-password-validators AUTH_PASSWORD_VALIDATORS = [ { 'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator', # noqa }, { 'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator', # noqa }, { 'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator', # noqa }, { 'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator', # noqa }, ] # Internationalization # https://docs.djangoproject.com/en/1.11/topics/i18n/ LANGUAGE_CODE = 'en-us' TIME_ZONE = 'UTC' USE_I18N = True USE_L10N = True USE_TZ = True # Static files (CSS, JavaScript, Images) # https://docs.djangoproject.com/en/1.11/howto/static-files/ STATIC_URL = '/static/' STATICFILES_DIRS = [ os.path.join(PROJECT_DIR, "static") ] INTERNAL_IPS = ['127.0.0.1'] REST_FRAMEWORK = { 'DEFAULT_RENDERER_CLASSES': ( 'rest_framework.renderers.JSONRenderer', 'swh.web.api.renderers.YAMLRenderer', 'rest_framework.renderers.TemplateHTMLRenderer' ), 'DEFAULT_THROTTLE_CLASSES': ( 'rest_framework.throttling.AnonRateThrottle', ), 'DEFAULT_THROTTLE_RATES': { 'anon': None if DEBUG else swh_web_config['limiter_rate'], } } LOGGING = { 'version': 1, 'disable_existing_loggers': False, 'filters': { 'require_debug_false': { '()': 'django.utils.log.RequireDebugFalse', }, 'require_debug_true': { '()': 'django.utils.log.RequireDebugTrue', }, }, 'handlers': { 'console': { 'level': 'INFO', 'filters': ['require_debug_true'], 'class': 'logging.StreamHandler', }, 'file': { 'level': 'DEBUG', 'filters': ['require_debug_false'], 'class': 'logging.FileHandler', 'filename': os.path.join(swh_web_config['log_dir'], 'swh-web.log'), }, }, 'loggers': { 'django': { 'handlers': ['console', 'file'], 'level': 'DEBUG', 'propagate': True, }, }, }
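The REST_FRAMEWORK block above wires in swh.web.api.renderers.YAMLRenderer next to the stock JSON and HTML renderers, matching the new python3-yaml dependency. That renderer is not shown in this diff; a minimal sketch of how such a DRF renderer is typically written (an assumption, the real implementation may differ):

import yaml

from rest_framework import renderers


class YAMLRenderer(renderers.BaseRenderer):
    # Hypothetical sketch of a YAML renderer for Django REST framework.
    media_type = 'application/yaml'
    format = 'yaml'

    def render(self, data, accepted_media_type=None, renderer_context=None):
        if data is None:
            return b''
        return yaml.dump(data, default_flow_style=False).encode('utf-8')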