The rounding algorithm wasn't specified
Depends on D5004.
Differential D5008
identifiers: Properly define the behavior of raw_extrinsic_metadata on negative timestamps. vlorentz on Feb 4 2021, 12:26 PM. Authored by
Details
The rounding algorithm wasn't specified Depends on D5004.
Diff Detail
Event TimelineComment Actions Build is green Patch application report for D5008 (id=17878)Could not rebase; Attempt merge onto 0c16581283... Updating 0c16581..e338317 Fast-forward swh/model/hashutil.py | 9 +- swh/model/identifiers.py | 94 ++++++++++++++ swh/model/model.py | 8 +- swh/model/tests/test_identifiers.py | 243 ++++++++++++++++++++++++++++++++++++ swh/model/tests/test_model.py | 6 +- 5 files changed, 357 insertions(+), 3 deletions(-) Changes applied before testcommit e3383172e94518217be3905ed5e3f7acd0dd80fd Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Feb 4 12:24:46 2021 +0100 identifiers: Properly define the behavior of raw_extrinsic_metadata on negative timestamps. commit 03d282b9af891b6d48e83bd5301e57c0472d5b03 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Feb 4 11:02:39 2021 +0100 identifiers: Change the manifest format of raw_extrinsic_metadata to use integer instead of ISO8601 Serializing as ISO8601 makes the hash brittle, because the database may change the timezone silently and/or lose precision in the microseconds. As we do not need precise timestamp, using an integer is good enough, and is consistant with the git format. The manifest also does not need to contain a timezone, as it only represents the timezone of the system that fetched this metadata, which is useless data. commit 266b88dcaaa0cab48c67e62ebca51f0a4599c435 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Fri Jan 29 15:08:49 2021 +0100 model: Add 'id' field to RawExtrinsicMetadata So that they can be properly deduplicated and referenced. commit 272468f3b5a96c8854a26efe333c32cba4504aff Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon Jan 25 12:31:12 2021 +0100 identifiers: Add raw_extrinsic_metadata_identifier This will be used to compute an intrisic identifier for RawExtrinsicMetadata; which can be used for deduplication and refering to it like any other sha1_git instead of needed to use a tuple of its fields. See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/216/ for more details. Comment Actions I still find slightly unfortunate that this logic is separate from the one used for revisions/releases, but I guess it makes sense as the "input types" are so dissimilar.
Comment Actions Build is green Patch application report for D5008 (id=17977)Could not rebase; Attempt merge onto 0c16581283... Updating 0c16581..35aab96 Fast-forward swh/model/hashutil.py | 9 +- swh/model/identifiers.py | 99 ++++++++++++++ swh/model/model.py | 8 +- swh/model/tests/test_identifiers.py | 263 ++++++++++++++++++++++++++++++++++++ swh/model/tests/test_model.py | 6 +- 5 files changed, 382 insertions(+), 3 deletions(-) Changes applied before testcommit 35aab964f65f25ba6c3acf161a49682d100aa450 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Feb 4 12:24:46 2021 +0100 identifiers: Properly define the behavior of raw_extrinsic_metadata on negative timestamps. commit 03d282b9af891b6d48e83bd5301e57c0472d5b03 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Feb 4 11:02:39 2021 +0100 identifiers: Change the manifest format of raw_extrinsic_metadata to use integer instead of ISO8601 Serializing as ISO8601 makes the hash brittle, because the database may change the timezone silently and/or lose precision in the microseconds. As we do not need precise timestamp, using an integer is good enough, and is consistant with the git format. The manifest also does not need to contain a timezone, as it only represents the timezone of the system that fetched this metadata, which is useless data. commit 266b88dcaaa0cab48c67e62ebca51f0a4599c435 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Fri Jan 29 15:08:49 2021 +0100 model: Add 'id' field to RawExtrinsicMetadata So that they can be properly deduplicated and referenced. commit 272468f3b5a96c8854a26efe333c32cba4504aff Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon Jan 25 12:31:12 2021 +0100 identifiers: Add raw_extrinsic_metadata_identifier This will be used to compute an intrisic identifier for RawExtrinsicMetadata; which can be used for deduplication and refering to it like any other sha1_git instead of needed to use a tuple of its fields. See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/220/ for more details.
Comment Actions it would be nice to have an explanation in the commit message of why this is actually needed. Explain why the original implementation is incorrect, and how it fixes this old incorrect behavior. Comment Actions
still relevant ;) Comment Actions Build is green Patch application report for D5008 (id=18555)Could not rebase; Attempt merge onto 8e0119962b... Updating 8e01199..a0d2933 Fast-forward swh/model/cli.py | 92 +++--- swh/model/hashutil.py | 9 +- swh/model/identifiers.py | 283 ++++++------------ swh/model/model.py | 8 +- swh/model/tests/test_cli.py | 6 +- swh/model/tests/test_identifiers.py | 575 ++++++++++++++++-------------------- swh/model/tests/test_model.py | 7 +- 7 files changed, 434 insertions(+), 546 deletions(-) Changes applied before testcommit a0d29337130e4d063e25f51a8b8bbce0401039fc Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Feb 4 12:24:46 2021 +0100 identifiers: Properly define the behavior of raw_extrinsic_metadata on negative timestamps. The rounding algorithm wasn't specified commit ff1423f8fab417a7e146c036ae09c0e6dd0fe4e2 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Feb 4 11:02:39 2021 +0100 identifiers: Change the manifest format of raw_extrinsic_metadata to use integer instead of ISO8601 Serializing as ISO8601 makes the hash brittle, because the database may change the timezone silently and/or lose precision in the microseconds. As we do not need precise timestamp, using an integer is good enough, and is consistant with the git format. The manifest also does not need to contain a timezone, as it only represents the timezone of the system that fetched this metadata, which is useless data. commit 8cbe59ee6251ac834a3d98b04146d2dfeb8da075 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Fri Jan 29 15:08:49 2021 +0100 model: Add 'id' field to RawExtrinsicMetadata So that they can be properly deduplicated and referenced. commit d88a5e13f2ffea5c0ebfad24e29ba41e86af20c0 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon Jan 25 12:31:12 2021 +0100 identifiers: Add raw_extrinsic_metadata_identifier This will be used to compute an intrisic identifier for RawExtrinsicMetadata; which can be used for deduplication and refering to it like any other sha1_git instead of needed to use a tuple of its fields. commit bf4ab4336f7b43d442988c47d3dd70bb82b595c5 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Wed Mar 3 10:44:48 2021 +0100 identifiers: Remove the deprecated SWHID class Other packages don't use it anymore. commit 1e924e84198a895003d6f649b8e3471cd93a7c7b Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Wed Mar 3 10:44:27 2021 +0100 cli: stop using the deprecated SWHID class See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/276/ for more details. Comment Actions Build is green Patch application report for D5008 (id=18589)Could not rebase; Attempt merge onto bf4ab4336f... Updating bf4ab43..4386330 Fast-forward swh/model/hashutil.py | 9 +- swh/model/identifiers.py | 99 ++++++++++++++ swh/model/model.py | 8 +- swh/model/tests/test_identifiers.py | 258 ++++++++++++++++++++++++++++++++++++ swh/model/tests/test_model.py | 7 +- 5 files changed, 378 insertions(+), 3 deletions(-) Changes applied before testcommit 43863306859ea90d7701db7ee5ea68c2dbff95ce Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Feb 4 12:24:46 2021 +0100 identifiers: Properly define the behavior of raw_extrinsic_metadata on negative timestamps. The rounding algorithm wasn't specified commit 3ce412506cd942d1a5e7340bc309f43da2b2adeb Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Feb 4 11:02:39 2021 +0100 identifiers: Change the manifest format of raw_extrinsic_metadata to use integer instead of ISO8601 Serializing as ISO8601 makes the hash brittle, because the database may change the timezone silently and/or lose precision in the microseconds. As we do not need precise timestamp, using an integer is good enough, and is consistant with the git format. The manifest also does not need to contain a timezone, as it only represents the timezone of the system that fetched this metadata, which is useless data. commit fc808e1fc9e59ffae1e82ea483d529025d7d1436 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Fri Jan 29 15:08:49 2021 +0100 model: Add 'id' field to RawExtrinsicMetadata So that they can be properly deduplicated and referenced. commit f6eab95253f13f28fe4d4652fc471e3e8a0b5565 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Mon Jan 25 12:31:12 2021 +0100 identifiers: Add raw_extrinsic_metadata_identifier This will be used to compute an intrisic identifier for RawExtrinsicMetadata; which can be used for deduplication and refering to it like any other sha1_git instead of needed to use a tuple of its fields. See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/280/ for more details. |