
identifiers: Add raw_extrinsic_metadata_identifier
Closed, Public

Authored by vlorentz on Jan 25 2021, 12:31 PM.

Details

Summary

This will be used to compute an intrinsic identifier for RawExtrinsicMetadata,
which can be used for deduplication and for referring to it like any other sha1_git
instead of needing to use a tuple of its fields.

T2703
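
To illustrate the idea in the summary, here is a minimal sketch of a git-style sha1 identifier computed over a serialized record. It is not the code from this diff: the helper names, the "raw_extrinsic_metadata" object type string, the header fields, and the manifest layout are all assumptions made for the example. Only the hashing construction (sha1 over a "type length NUL" header followed by the content) is the standard one git uses.

```python
import hashlib
from typing import Dict


def git_like_sha1(object_type: str, manifest: bytes) -> str:
    # Same construction git uses for its objects:
    # sha1(b"<type> <length>" + NUL byte + manifest).
    header = f"{object_type} {len(manifest)}\0".encode("ascii")
    return hashlib.sha1(header + manifest).hexdigest()


def metadata_manifest(headers: Dict[str, str], payload: bytes) -> bytes:
    # Hypothetical layout (an assumption, not the format specified in this diff):
    # one "key value" line per header in sorted key order, a blank line,
    # then the raw payload.
    lines = [f"{key} {value}".encode("utf-8") for key, value in sorted(headers.items())]
    return b"\n".join(lines) + b"\n\n" + payload


# Illustrative field names and values only.
headers = {"authority": "forge https://example.org/", "format": "json"}
first = git_like_sha1("raw_extrinsic_metadata", metadata_manifest(headers, b'{"k": 1}'))
again = git_like_sha1("raw_extrinsic_metadata", metadata_manifest(dict(headers), b'{"k": 1}'))
other = git_like_sha1("raw_extrinsic_metadata", metadata_manifest(headers, b'{"k": 2}'))

# Equal records share one identifier, so a set of identifiers deduplicates them.
unique_ids = {first, again, other}
print(len(unique_ids))  # 2 distinct identifiers for 3 records
```

A single 40-character hex string can then stand in for the whole tuple of fields when storing or referencing a record.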

Diff Detail

Repository
rDMOD Data model
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D4935 (id=17550)

Rebasing onto 9af451fd62...

First, rewinding head to replay your work on top of it...
Applying: identifiers: Add raw_extrinsic_metadata_identifier
Changes applied before test
commit 1c195e7b099042dc4a6a7fcea57a95e91002e2b5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jan 25 12:31:12 2021 +0100

    identifiers: Add raw_extrinsic_metadata_identifier
    
    This will be used to compute an intrisic identifier for RawExtrinsicMetadata;
    which can be used for deduplication and refering to it like any other sha1_git
    instead of needed to use a tuple of its fields.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/202/ for more details.

clearer specification in the docstring

Build is green

Patch application report for D4935 (id=17552)

Rebasing onto 9af451fd62...

First, rewinding head to replay your work on top of it...
Applying: identifiers: Add raw_extrinsic_metadata_identifier
Changes applied before test
commit b0b3e195b638734c94ebbcc8cab3b155f2137705
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jan 25 12:31:12 2021 +0100

    identifiers: Add raw_extrinsic_metadata_identifier
    
    This will be used to compute an intrisic identifier for RawExtrinsicMetadata;
    which can be used for deduplication and refering to it like any other sha1_git
    instead of needed to use a tuple of its fields.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/203/ for more details.

anlambert added a subscriber: anlambert.

Looks good to me.

swh/model/identifiers.py
729–730

This does not compute a snapshot identifier.

761

same here

This revision is now accepted and ready to land. Jan 29 2021, 1:48 PM
olasd added a subscriber: olasd.
olasd added inline comments.
swh/model/identifiers.py
793

I never remember which is the default encoding; can we be explicit here?

For the record:

12:57 <vlorentz> olasd: if I have to add an argument to .encode(), then it makes lines 725 and 729 harder to read
12:57 <vlorentz> (because they go over 88 cols)
12:58 <+olasd> vlorentz: I guess the fact that the explicit .encodes() use ascii makes the default one more obvious
12:58 <+olasd> (i.e. don't sweat it)
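
On the question raised in this thread: in Python 3, str.encode() with no argument uses UTF-8. The short check below shows this with a sample string chosen for the example (it does not come from the diff). For pure-ASCII text the default and an explicit "ascii" give the same bytes, which is why mixing the two styles is harmless there.

```python
sample = "métadonnées"  # arbitrary non-ASCII text, chosen for illustration

# The default encoding of str.encode() is UTF-8.
default_bytes = sample.encode()
utf8_bytes = sample.encode("utf-8")
assert default_bytes == utf8_bytes

# For ASCII-only text, the default and an explicit "ascii" agree.
ascii_only = "discovery_date"
assert ascii_only.encode() == ascii_only.encode("ascii")

# An explicit "ascii" is stricter: it rejects non-ASCII characters.
try:
    sample.encode("ascii")
    ascii_rejected = False
except UnicodeEncodeError:
    ascii_rejected = True
assert ascii_rejected
```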
  • rebase
  • fix doc
  • improve doc

Build is green

Patch application report for D4935 (id=17746)

Rebasing onto cad940dc8c...

Current branch diff-target is up to date.
Changes applied before test
commit 2a807789f22fdbf3838684b6b1ce8cf5a599b754
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jan 25 12:31:12 2021 +0100

    identifiers: Add raw_extrinsic_metadata_identifier
    
    This will be used to compute an intrisic identifier for RawExtrinsicMetadata;
    which can be used for deduplication and refering to it like any other sha1_git
    instead of needed to use a tuple of its fields.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/205/ for more details.

Build is green

Patch application report for D4935 (id=17866)

Rebasing onto 0c16581283...

Current branch diff-target is up to date.
Changes applied before test
commit 272468f3b5a96c8854a26efe333c32cba4504aff
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jan 25 12:31:12 2021 +0100

    identifiers: Add raw_extrinsic_metadata_identifier
    
    This will be used to compute an intrisic identifier for RawExtrinsicMetadata;
    which can be used for deduplication and refering to it like any other sha1_git
    instead of needed to use a tuple of its fields.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/210/ for more details.

Big rebase, please review again.

Build is green

Patch application report for D4935 (id=18552)

Could not rebase; Attempt merge onto 8e0119962b...

Updating 8e01199..d88a5e1
Fast-forward
 swh/model/cli.py                    |  92 +++++---
 swh/model/hashutil.py               |   9 +-
 swh/model/identifiers.py            | 263 +++++++---------------
 swh/model/tests/test_cli.py         |   6 +-
 swh/model/tests/test_identifiers.py | 432 ++++++++++--------------------------
 5 files changed, 258 insertions(+), 544 deletions(-)
Changes applied before test
commit d88a5e13f2ffea5c0ebfad24e29ba41e86af20c0
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jan 25 12:31:12 2021 +0100

    identifiers: Add raw_extrinsic_metadata_identifier
    
    This will be used to compute an intrisic identifier for RawExtrinsicMetadata;
    which can be used for deduplication and refering to it like any other sha1_git
    instead of needed to use a tuple of its fields.

commit bf4ab4336f7b43d442988c47d3dd70bb82b595c5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 3 10:44:48 2021 +0100

    identifiers: Remove the deprecated SWHID class
    
    Other packages don't use it anymore.

commit 1e924e84198a895003d6f649b8e3471cd93a7c7b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 3 10:44:27 2021 +0100

    cli: stop using the deprecated SWHID class

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/273/ for more details.

olasd added a subscriber: zack.

Please fix the last inline issue before landing :-)

swh/model/identifiers.py
734–745

You still have the spurious colons that @zack had mentioned.

This revision is now accepted and ready to land. Mar 4 2021, 11:05 AM

remove colons, add more tabulations

Build is green

Patch application report for D4935 (id=18586)

Rebasing onto bf4ab4336f...

Current branch diff-target is up to date.
Changes applied before test
commit f6eab95253f13f28fe4d4652fc471e3e8a0b5565
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Jan 25 12:31:12 2021 +0100

    identifiers: Add raw_extrinsic_metadata_identifier
    
    This will be used to compute an intrisic identifier for RawExtrinsicMetadata;
    which can be used for deduplication and refering to it like any other sha1_git
    instead of needed to use a tuple of its fields.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/277/ for more details.