swh/web/api/views/raw.py
- This file was added.
# Copyright (C) 2018-2019 The Software Heritage developers
vlorentz: should be `2022` instead
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU Affero General Public License version 3, or any later version
# See top-level LICENSE file for more information
from django.http import HttpResponse
from swh.model.git_objects import (
    # content_git_object,
    directory_git_object,
    revision_git_object,
    release_git_object,
    snapshot_git_object,
)
import swh.model.model as model
from swh.model.swhids import CoreSWHID, ObjectType
from swh.web.api.apidoc import api_doc, format_docstring
from swh.web.api.apiurls import api_route
from swh.web.api.views.utils import api_lookup
from swh.web.common import archive
@api_route(
    r"/raw/(?P<swhid>{SWHID_RE})/",
    "api-1-raw-object",
)
@api_doc("/raw")
@format_docstring()
def api_raw_object(_request, swhid_r):
    """
    .. http:get:: /api/1/raw/<swhid>

        Get the object corresponding to the SWHID in raw form.

        This endpoint exposes the internal representation (see
        :func:`swh.model.git_objects.*_git_object` in our data
        model module for details), and so can be used to fetch a binary
        blob which hashes to the same identifier.

        :param string swhid: the object's SWHID
[Done] vlorentz: This ref won't work because it's not the name of an existing function. Use this instead: see ``*_git_object`` functions in :mod:`swh.model.git_objects`
        :resheader Content-Type: application/octet-stream

        :statuscode 200: no error
        :statuscode 400: an invalid SWHID has been provided
        :statuscode 404: the requested object can not be found in the archive

        **Example:**

        .. parsed-literal::
[Done] vlorentz: Don't invalid SWHIDs raise a 404 too?
[Done] Ericson2314: I have no idea :). This was inherited from the snapshot handler the patch was originally based on. I am just hoping the thrown exceptions do that! :)
[Done] vlorentz: I don't see anything in the code that would raise a 400, so it's either a 404 (if Django rejects based on the regexp) or a 500.
[Done] Ericson2314: What happens if `CoreSWHID.from_string` throws a `ValidationError`? Still, regardless, you are right that even if that is a 400, regex non-matches will be a 404. One of the swh.web.view.vault ones taking a SWHID mentions the 404, but the others don't, so I will just drop it.
[Not Done] vlorentz: It raises a 500 because it won't be caught (afaik). ack
            :swh_web_api:`raw/swh:1:snp:6a3a2cf0b2b90ce7ae1cf0a221ed68035b686f5a`
    """
    swhid = CoreSWHID.from_string(swhid_r)
    data = api_lookup(
        archive.lookup_object,
        swhid.object_type,
        swhid.object_id,
        complete=True,
        notfound_msg=f"Object with id {swhid_r} not found.",
    )
    json_to_git = {
        # ObjectType.CONTENT: lambda obj: content_git_object(
        #     model.Content.from_dict(obj)
        # ),
        ObjectType.DIRECTORY: lambda obj: directory_git_object(
            model.Directory.from_dict(obj)
        ),
        ObjectType.REVISION: lambda obj: revision_git_object(
            model.Revision.from_dict(obj)
        ),
        ObjectType.RELEASE: lambda obj: release_git_object(
            model.Release.from_dict(obj)
        ),
        ObjectType.SNAPSHOT: lambda obj: snapshot_git_object(
            model.Snapshot.from_dict(obj)
        ),
    }[swhid.object_type]
[Done] vlorentz: You can use this function: https://docs.softwareheritage.org/devel/apidoc/swh.core.api.classes.html#swh.core.api.classes.stream_results_optional
It's shorter and more efficient (avoids a list copy on each loop).
    results = json_to_git(data)
    response = HttpResponse(results, content_type="application/octet-stream")
    # swhid is a parsed CoreSWHID rather than a str, so the filename is built
    # from the raw string swhid_r.
    response["Content-disposition"] = "attachment;filename=%s_raw" % swhid_r.replace(
        ":", "_"
    )
    return response
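The docstring above says the returned blob hashes to the same identifier as the SWHID. A client could check that with something like the minimal sketch below. `raw_object_matches_swhid` is a hypothetical helper, not part of swh-web, and it assumes the response body is the complete git-style manifest (header included) whose SHA-1 is the hash part of the SWHID.

```python
import hashlib


def raw_object_matches_swhid(swhid: str, raw: bytes) -> bool:
    """Return True if the SHA-1 of `raw` equals the hash part of `swhid`.

    Hypothetical client-side helper; assumes a core SWHID of the form
    "swh:1:<type>:<40 hex digits>".
    """
    # Unpacking raises ValueError if the string does not have four fields.
    scheme, version, _object_type, hex_id = swhid.split(":")
    if scheme != "swh" or version != "1":
        raise ValueError(f"not a version 1 core SWHID: {swhid!r}")
    return hashlib.sha1(raw).hexdigest() == hex_id


# The empty Git tree serializes to b"tree 0\0".
EMPTY_TREE_SWHID = "swh:1:dir:4b825dc642cb6eb9a060e54bf8d69288fbee4904"
print(raw_object_matches_swhid(EMPTY_TREE_SWHID, b"tree 0\0"))
```

In practice `raw` would be the body of a GET on `/api/1/raw/<swhid>/`; the empty tree is used here only because its serialization and hash are well known.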
[Done] vlorentz: It's a paginated endpoint too. Use this: https://docs.softwareheritage.org/devel/apidoc/swh.storage.algos.snapshot.html#swh.storage.algos.snapshot.snapshot_get_all_branches
[Not Done] vlorentz: This usually does not matter, but some directories' git_object cannot be entirely rebuilt just from the list of entries, for various reasons. (And please add a test for it; there is an example here: https://forge.softwareheritage.org/source/swh-vault/browse/master/swh/vault/tests/test_cookers.py$1121-1141 )
[Not Done] vlorentz: Oh my bad, it should be `[object_id]` instead of `[0]` (the method returns a dict for some reason).
[Done] Ericson2314: This makes me think directory reassembly is subtle enough that it deserves its own function. I therefore opened D7720 to make one analogous to the snapshot one. We could still do an integration test for the web interface, but with this division of labor `directory_get_all_entries` and `directory_git_object` can also be tested in isolation.
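On the status-code thread above (regex mismatch gives a 404, an uncaught `ValidationError` gives a 500): one way to get a deliberate 400 is to validate the string up front and raise an exception the API layer knows how to translate. The sketch below is standalone and illustrative only. `BadInput`, `CORE_SWHID_RE`, and `parse_core_swhid` are hypothetical names, and the pattern is an assumed shape for core SWHIDs rather than the `SWHID_RE` used by the route.

```python
import re
from typing import Tuple

# Assumed shape of a core SWHID: "swh:1:<type>:<40 lowercase hex digits>".
CORE_SWHID_RE = re.compile(r"swh:1:(cnt|dir|rev|rel|snp):([0-9a-f]{40})")


class BadInput(ValueError):
    """Hypothetical exception that an API layer would translate to HTTP 400."""


def parse_core_swhid(raw: str) -> Tuple[str, bytes]:
    """Split a core SWHID into (object type, binary object id).

    Raises BadInput instead of letting a lower-level error escape uncaught.
    """
    match = CORE_SWHID_RE.fullmatch(raw)
    if match is None:
        raise BadInput(f"invalid core SWHID: {raw!r}")
    object_type, hex_id = match.groups()
    return object_type, bytes.fromhex(hex_id)


object_type, object_id = parse_core_swhid(
    "swh:1:snp:6a3a2cf0b2b90ce7ae1cf0a221ed68035b686f5a"
)
print(object_type, object_id.hex())
```

Whether a malformed identifier ever reaches the view depends on how strict the URL regexp is, so a check like this would only matter for strings the route accepts but the parser rejects.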