Page MenuHomeSoftware Heritage

Update the HashableObject interface to take the object itself
ClosedPublic

Authored by olasd on Oct 19 2020, 4:26 PM.

Details

Summary

This will enable a gradual enhancement of the functions in identifiers.py to
take model objects directly, and return the bytes of the hash instead of an
hex representation.

Related to T2703
Related to T2704
Related to T2713

Test Plan

no changes to tox tests

Diff Detail

Repository
rDMOD Data model
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D4307 (id=15220)

Rebasing onto 9224c8ca6e...

Current branch diff-target is up to date.
Changes applied before test
commit 572fd837e558476170271e90b8043213f5270045
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Oct 19 15:30:53 2020 +0200

    Update the HashableObject interface to take the object itself
    
    This will enable a gradual enhancement of the functions in identifiers.py to
    take model objects directly, and return the bytes of the hash instead of an
    hex representation.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/149/ for more details.

shouldn't you add -> bytes to all the compute_hash methods?

shouldn't you add -> bytes to all the compute_hash methods?

Good point

And it's confusing why compute_hash can't just return self.id; I think I liked it better as a staticmethod.

Or at least add a comment/docstring on the abstract one

And it's confusing why compute_hash can't just return self.id; I think I liked it better as a staticmethod.

Is it? It's an initialization method that's mostly used when you don't have a self.id.

I want to (eventually) make swh.model.identifiers.*_identifier take a partially initialized instance of the model object (avoiding a bunch of dict transformations, and giving them stronger typing).

Or at least add a comment/docstring on the abstract one

Would you have a suggestion for such a comment?

In D4307#107125, @olasd wrote:

And it's confusing why compute_hash can't just return self.id; I think I liked it better as a staticmethod.

Is it? It's an initialization method that's mostly used when you don't have a self.id.

Yes, but it's not obvious that it's an initialization method.

Would you have a suggestion for such a comment?

In the docstring, you could add This is called by the constructor to set id` if it is not already defined`

Build is green

Patch application report for D4307 (id=15237)

Rebasing onto 9224c8ca6e...

Current branch diff-target is up to date.
Changes applied before test
commit 2cc5b855157bafdcc8250b1055a02583f12554c3
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Oct 19 15:30:53 2020 +0200

    Update the HashableObject interface to take the object itself
    
    This will enable a gradual enhancement of the functions in identifiers.py to
    take model objects directly, and return the bytes of the hash instead of an
    hex representation.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/150/ for more details.

This revision is now accepted and ready to land.Oct 20 2020, 10:54 AM

Build is green

Patch application report for D4307 (id=15381)

Rebasing onto aa84d8d941...

Current branch diff-target is up to date.
Changes applied before test
commit 498a1076416ad19f2628b20cb972096b2dfb3854
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Oct 19 15:30:53 2020 +0200

    Update the HashableObject interface to take the object itself
    
    This will enable a gradual enhancement of the functions in identifiers.py to
    take model objects directly, and return the bytes of the hash instead of an
    hex representation.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/157/ for more details.