swh/model/identifiers.py
 def raw_extrinsic_metadata_identifier(metadata: Dict[str, Any]) -> str:
     """Return the intrinsic identifier for a RawExtrinsicMetadata object.
     A raw_extrinsic_metadata identifier is a salted sha1 (using the git
     hashing algorithm with the ``raw_extrinsic_metadata`` object type) of
     a manifest following the format:
     ```
     target $ExtendedSwhid
-    discovery_date $ISO8601
+    discovery_date $Timestamp
     authority $StrWithoutSpaces $IRI
     fetcher $Str $Version
     format $StrWithoutSpaces
     origin $IRI <- optional
     visit $IntInDecimal <- optional
     snapshot $CoreSwhid <- optional
     release $CoreSwhid <- optional
     revision $CoreSwhid <- optional
@@ 9 unchanged lines not shown @@
     $StrWithoutSpaces and $Version are ASCII strings, and may not contain spaces.
     $Str is an UTF-8 string.
     $CoreSwhid are core SWHIDs, as defined in :ref:`persistent-identifiers`.
     $ExtendedSwhid is a core SWHID, with extra types allowed ('ori' for
     origins and 'emd' for raw extrinsic metadata)
+    $Timestamp is a decimal representation of the integer number of seconds since
+    the UNIX epoch (1970-01-01 00:00:00 UTC), with no leading '0'
+    (unless the timestamp value is zero) and no timezone.
+    It may be negative by prefixing it with a '-', which must not be followed
+    by a '0'.
     Newlines in $Bytes, $Str, and $Iri are escaped as with other git fields,
     ie. by adding a space after them.
     Returns:
     str: the intrinsic identifier for `metadata`
     """
+    timestamp = metadata["discovery_date"].timestamp()
     headers = [
         (b"target", str(metadata["target"]).encode()),
-        (b"discovery_date", metadata["discovery_date"].isoformat().encode("ascii")),
+        (b"discovery_date", str(int(timestamp)).encode("ascii")),
         (
             b"authority",
             f"{metadata['authority']['type']} {metadata['authority']['url']}".encode(),
         ),
         (
             b"fetcher",
             f"{metadata['fetcher']['name']} {metadata['fetcher']['version']}".encode(),
         ),
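The `$Timestamp` rules in the docstring can be checked with a small standalone sketch. The names `format_timestamp`, `is_valid_timestamp`, and `TIMESTAMP_RE` are hypothetical and do not appear in the changeset. The sketch also assumes that rounding down with `math.floor` is acceptable. The changeset itself uses `int()`, which truncates toward zero, so the two approaches differ for sub-second instants before the epoch.

```python
import datetime
import math
import re

# Hypothetical pattern for $Timestamp as described in the docstring:
# "0", or an optional '-' followed by digits with no leading '0'.
TIMESTAMP_RE = re.compile(r"0|-?[1-9][0-9]*")


def is_valid_timestamp(text: str) -> bool:
    """Return True if `text` follows the $Timestamp rules (hypothetical helper)."""
    return TIMESTAMP_RE.fullmatch(text) is not None


def format_timestamp(dt: datetime.datetime) -> bytes:
    """Serialize a timezone-aware datetime as $Timestamp (hypothetical helper).

    Assumption: fractional seconds are rounded down with math.floor,
    unlike int(), which truncates toward zero.
    """
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    seconds = math.floor(dt.timestamp())
    return str(seconds).encode("ascii")


utc = datetime.timezone.utc
epoch = datetime.datetime(1970, 1, 1, tzinfo=utc)
# Half a second before the epoch: the float timestamp is -0.5.
just_before = datetime.datetime(1969, 12, 31, 23, 59, 59, 500000, tzinfo=utc)

print(format_timestamp(epoch))        # b'0'
print(format_timestamp(just_before))  # b'-1' (rounded down)
print(int(just_before.timestamp()))   # 0 (truncated toward zero)
```

With truncation, every instant in the open interval from one second before the epoch to one second after it serializes as `0`. Both approaches produce strings that satisfy the formatting rules, since Python never renders an integer as `-0` or with a leading zero.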