Changeset View
Standalone View
swh/deposit/utils.py
# Copyright (C) 2018-2019 The Software Heritage developers | # Copyright (C) 2018-2019 The Software Heritage developers | ||||
# See the AUTHORS file at the top-level directory of this distribution | # See the AUTHORS file at the top-level directory of this distribution | ||||
# License: GNU General Public License version 3, or any later version | # License: GNU General Public License version 3, or any later version | ||||
# See top-level LICENSE file for more information | # See top-level LICENSE file for more information | ||||
import iso8601 | import iso8601 | ||||
from types import GeneratorType | from types import GeneratorType | ||||
from swh.model.identifiers import normalize_timestamp | from swh.model.identifiers import normalize_timestamp | ||||
def origin_url_from(deposit): | |||||
"""Given a deposit instance, return the associated origin url | |||||
Args: | |||||
deposit (Deposit): The deposit from which derives the origin url | |||||
Returns | |||||
The associated origin url | |||||
""" | |||||
base_url = deposit.client.provider_url | |||||
zack: More a question than a request of change (because I'm new to this code base), but where in this… | |||||
Done Inline Actions
So far, we don't check. In the current workflow with hal, it is provided (and i don't expect that to change because the slug is still mandatory on the api). I can add the check but i'm not sure what to do if the cases arise, raising? ardumont: > More a question than a request of change (because I'm new to this code base), but where in… | |||||
Not Done Inline ActionsMy point is that *today* we are sure it's around, but tomorrow the code that make sure the ID is there might be changed and, when that happens, we will not notice that is missing and will start filling the archive with URLs that only contain the domain name and no ID. Please add the check (an assertion would be enough) as a classic defensive programming measure. zack: My point is that *today* we are sure it's around, but tomorrow the code that make sure the ID… | |||||
external_id = deposit.external_id | |||||
return '%s/%s' % (base_url.rstrip('/'), external_id) | |||||
def merge(*dicts): | def merge(*dicts): | ||||
"""Given an iterator of dicts, merge them losing no information. | """Given an iterator of dicts, merge them losing no information. | ||||
Args: | Args: | ||||
*dicts: arguments are all supposed to be dict to merge into one | *dicts: arguments are all supposed to be dict to merge into one | ||||
Returns: | Returns: | ||||
dict merged without losing information | dict merged without losing information | ||||
▲ Show 20 Lines • Show All 63 Lines • Show Last 20 Lines |
More a question than a request of change (because I'm new to this code base), but where in this change we verify that *an* external_id is actually present and not simply missing? It is totally right to make external_id optional *as a metadata field*, but it should not be missing entirely---the client should be providing an UUID if it's not being provided by other means. In that case, though, the code that generate the complete URL should check that, one way or another, *an* external_id is present, as a safety net.