Add automatic object identifier computation support
Closed, Public

Authored by anlambert on Nov 28 2019, 5:21 PM.

Details

Summary

Add support for automatically computing identifiers in the following object models:
Directory, Release, Revision, Snapshot.

If the identifier is not provided as a parameter, it is computed when the model
is initialized.

This is useful when writing tests that populate the archive using models
instead of raw dicts.

For instance, if we have the following directory entry model:

dir_target = ...
dir_entry = DirectoryEntry(name=b'directory', type='dir',
                           target=dir_target,
                           perms=DentryPerms.directory)

Previously, creating a valid Directory model required computing the identifier by hand:

dir_id = hash_to_bytes(directory_identifier({
    'entries': [dir_entry.to_dict()]
}))

dir = Directory(id=dir_id, entries=[dir_entry])

After applying this diff, you can simply write:

dir = Directory(entries=[dir_entry])

Test Plan

Tests have been added. Some raw data from test_identifiers are reused to avoid duplication.

Diff Detail

Repository: rDMOD Data model
Branch: model-id-computation
Lint: No Linters Available
Unit: No Unit Test Coverage
Build Status: Buildable 9447
    Build 13869: tox-on-jenkins (Jenkins)
    Build 13868: arc lint + arc unit

Event Timeline

vlorentz added a subscriber: vlorentz.

Nice!

I think instead you should add a method that computes the hash to each class, then implement __attrs_post_init__ only in BaseModel (or a new abstract class for hashable objects).

That will make your code shorter, and also make it easier to implement hash checking when we add it.
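A minimal sketch of the structure suggested here, under stated assumptions: it uses the standard library's dataclasses, whose `__post_init__` hook stands in for attrs' `__attrs_post_init__`, and the names `HashableObject`, `ToyDirectory`, and `compute_hash` are hypothetical. The toy hash (SHA-1 over sorted entry names) is only a placeholder and is not how swh.model computes directory identifiers.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


class HashableObject:
    """Hypothetical base class: fills in `id` once, after initialization."""

    def compute_hash(self) -> bytes:
        # Each concrete model provides its own hash computation.
        raise NotImplementedError

    def __post_init__(self):
        # Only compute the identifier when the caller did not provide one.
        if self.id is None:
            # Frozen dataclasses forbid plain attribute assignment,
            # so bypass the frozen check explicitly.
            object.__setattr__(self, "id", self.compute_hash())


@dataclass(frozen=True)
class ToyDirectory(HashableObject):
    entries: Tuple[bytes, ...]
    id: Optional[bytes] = None

    def compute_hash(self) -> bytes:
        # Placeholder hash: SHA-1 over the sorted entry names.
        h = hashlib.sha1()
        for name in sorted(self.entries):
            h.update(name + b"\0")
        return h.digest()


computed = ToyDirectory(entries=(b"README", b"src"))
explicit = ToyDirectory(entries=(b"README",), id=b"\x00" * 20)
```

With this layout the post-init hook lives in one place, and a later hash check could reuse `compute_hash` by comparing its result with a caller-supplied `id`.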

swh/model/model.py
235–240

You should do this, for consistency with other classes:

return cls(
    branches={name: ... for ... in d.pop('branches').items()},
    **d)

317–318

I think you can remove this line entirely

350–353

same
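For illustration, the `return cls(...)` shape suggested above for lines 235–240 could look roughly like the following. This is a sketch, not the code under review: `ToySnapshot` and `ToyBranch` are hypothetical stand-ins written with standard-library dataclasses, and the handling of `None` branches is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ToyBranch:
    target: bytes
    target_type: str

    @classmethod
    def from_dict(cls, d):
        return cls(**d)


@dataclass(frozen=True)
class ToySnapshot:
    branches: Dict[bytes, Optional[ToyBranch]]
    id: Optional[bytes] = None

    @classmethod
    def from_dict(cls, d):
        # Shallow copy so that pop() does not mutate the caller's dict.
        d = dict(d)
        return cls(
            branches={
                # Assumption: a missing branch target is represented as None.
                name: ToyBranch.from_dict(branch) if branch else None
                for name, branch in d.pop("branches").items()
            },
            # Remaining keys (here only `id`) are passed through unchanged.
            **d)
```

Popping the nested key first and forwarding the rest with `**d` keeps the method short and means new top-level fields need no extra code.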

This revision now requires changes to proceed. Nov 29 2019, 1:57 PM

Update: Rebase and address @vlorentz's comments

swh/model/model.py
341–344

forgot this one

swh/model/model.py
341–344

Sigh .. I am really not efficient when I am sick and tired.

Update: Forgot to process Directory model

This revision is now accepted and ready to land. Nov 29 2019, 3:53 PM