Page MenuHomeSoftware Heritage

Deprecate identifiers.py
ClosedPublic

Authored by vlorentz on Sep 23 2021, 4:54 PM.

Details

Summary
  1. Add a warning
  2. Move identifier/manifest documentation to git_objects.py
  3. Remove all imports of that module.

Motivation:

  • SWHID classes were moved to swhids.py
  • manifest computation functions were moved to git_objects.py
  • Only reexports and trivial wrappers of model.py remain

Depends on D6327 and D6328.

Diff Detail

Repository
rDMOD Data model
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D6330 (id=23012)

Could not rebase; Attempt merge onto 0dd33cdf7d...

Updating 0dd33cd..f56becc
Fast-forward
 swh/model/cli.py                    |   23 +-
 swh/model/from_disk.py              |   22 +-
 swh/model/git_objects.py            |  583 ++++++++++++++++
 swh/model/hypothesis_strategies.py  |   11 +-
 swh/model/identifiers.py            | 1279 ++---------------------------------
 swh/model/model.py                  |  129 +++-
 swh/model/swhids.py                 |  448 ++++++++++++
 swh/model/tests/swh_model_data.py   |    2 +-
 swh/model/tests/test_identifiers.py |  412 ++++++-----
 swh/model/tests/test_model.py       |  105 +--
 10 files changed, 1394 insertions(+), 1620 deletions(-)
 create mode 100644 swh/model/git_objects.py
 create mode 100644 swh/model/swhids.py
Changes applied before test
commit f56becc196ed6dd4b211c97096654c4400b047ec
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 23 16:52:22 2021 +0200

    Deprecate identifiers.py
    
    1. Add a warning
    2. Move identifier/manifest documentation to git_objects.py
    3. Remove all imports of that module.
    
    Motivation:
    
    * SWHID classes were moved to swhids.py
    * manifest computation functions were moved to git_objects.py
    * Only reexports and trivial wrappers of model.py remain

commit df73e74b40ccc24fd2c756e79bcf3eac6fc8b8b5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 22 18:43:18 2021 +0200

    test_identifiers.py: Make sha1_git literals more consistent.

commit 510df60c31815f0e568d6b179a68087aa2c7394a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 22 17:59:25 2021 +0200

    Remove identifier_to_bytes and identifier_to_hex
    
    They are not used anywhere.

commit 9e8a547622239ebf949ddbacec42e113e525c245
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 22 17:41:15 2021 +0200

    Move manifest computation functions from identifiers.py to git_objects.py
    
    Since they are used by the vault for non-identifier-related stuff,
    I think it makes sense to move them to a new module.
    
    identifiers.py is now an empty shell, as all its features were moved to
    other modules and it only contains reexports and backward-compat
    functions. Therefore, it should be considered deprecated from now on.

commit 57ae405d312879bec19107d29a20c2c290d7861d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 22 17:32:48 2021 +0200

    Refactor identifiers & model to make *_git_object() functions work on model classes instead of dicts
    
    Since we now use these classes everywhere, computing hashes required
    using to_dict() just to compute identifiers, which can be a performance
    bottleneck in code computing many checksums.

commit 6a72f88c5687f5b6c05f9b25eb73bce2d772f97c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 22 17:19:10 2021 +0200

    test_identifiers.py: Fix/update malformed data dicts
    
    A future commit will make identifier computation use the attrs classes,
    which are strict about what they accept.

commit 9ec683264c415731286005dff823e1099ef358c3
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 22 16:36:30 2021 +0200

    Move SWHID classes and functions from identifiers.py to swhids.py
    
    identifiers.py initially worked only on bare sha1_git.
    
    I chose to add the SWHID classes in that module because
    of the name, but the SWHID code didn't actually interact
    with the other functions in the module, so it now feels
    out of place to me.

See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/382/ for more details.

Cool stuff!

Are you sure the removals in swh/model/tests/test_model.py are properly covered in another set of tests? For instance, I don't seem to find a computation of the id of snapshot_example after your removals.

they didn't actually check the value, only made sure it matched the return value computed by identifiers.py.

And snapshot_example is still tested, it's just copied to self.all_types in the constructor

This revision is now accepted and ready to land.Sep 24 2021, 9:28 AM
This revision was automatically updated to reflect the committed changes.