
Add automatic object identifier computation support
Closed, Public

Authored by anlambert on Nov 28 2019, 5:21 PM.

Details

Summary

Add support for automatically computing the identifier of the following object models:
Directory, Release, Revision, Snapshot.

If the identifier is not provided as a parameter, it is computed when the model
is initialized.

This is useful when writing tests that populate the archive using models
instead of raw dicts.

For instance, given the following directory entry model:

dir_target = ...
dir_entry = DirectoryEntry(name=b'directory', type='dir',
                           target=dir_target,
                           perms=DentryPerms.directory)

Previously, creating a valid Directory model required the following:

dir_id = hash_to_bytes(directory_identifier({
    'entries': [dir_entry.to_dict()]
}))

dir = Directory(id=dir_id, entries=[dir_entry])

After applying this diff, you can simply write:

dir = Directory(entries=[dir_entry])

Test Plan

Tests have been added. Some raw data from test_identifiers are reused to avoid duplication.

Diff Detail

Repository
rDMOD Data Model
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

anlambert created this revision. Nov 28 2019, 5:21 PM
vlorentz requested changes to this revision. Nov 29 2019, 1:57 PM
vlorentz added a subscriber: vlorentz.

Nice!

I think instead you should add a method that computes the hash to each class, then implement __attrs_post_init__ only in BaseModel (or a new abstract class for hashable objects).

That will make your code shorter, and also make it easier to implement hash checking when we add it.
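
The suggestion above can be sketched roughly as follows. This is not the code from the diff: it uses the standard library's dataclasses and `__post_init__` as a stand-in for attrs and `__attrs_post_init__`, and the names `HashableObject`, `compute_hash` and `check`, as well as the SHA-1 placeholder digests, are assumptions made for illustration only.

```python
import hashlib
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Tuple


class HashableObject(ABC):
    """Hypothetical base class: the post-init hook lives here only."""

    @abstractmethod
    def compute_hash(self) -> bytes:
        """Each concrete model says how its own identifier is derived."""

    def __post_init__(self) -> None:
        # Fill in the id only when the caller did not provide one.
        # object.__setattr__ is needed because the subclasses are frozen.
        if not self.id:
            object.__setattr__(self, "id", self.compute_hash())

    def check(self) -> bool:
        # Hash checking becomes a one-liner once compute_hash exists.
        return self.id == self.compute_hash()


@dataclass(frozen=True)
class Directory(HashableObject):
    entries: Tuple[bytes, ...]
    id: bytes = b""

    def compute_hash(self) -> bytes:
        # Placeholder digest, NOT the real directory identifier algorithm.
        return hashlib.sha1(b"directory:" + b"\0".join(sorted(self.entries))).digest()


@dataclass(frozen=True)
class Snapshot(HashableObject):
    branches: Tuple[Tuple[bytes, bytes], ...]
    id: bytes = b""

    def compute_hash(self) -> bytes:
        # Placeholder digest, NOT the real snapshot identifier algorithm.
        payload = b"".join(n + b"\0" + t for n, t in sorted(self.branches))
        return hashlib.sha1(b"snapshot:" + payload).digest()


directory = Directory(entries=(b"a", b"b"))  # id is computed automatically
assert directory.check()
```

With this layout, each model only states how its identifier is derived, while the decision of when to compute it stays in a single place.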

swh/model/model.py
235–240

You should do this, for consistency with other classes:

return cls(
    branches={name: ... for ... in d.pop('branches').items()},
    **d)
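
As a rough illustration of the pattern requested here, the sketch below pops the nested key, converts its values, and forwards the remaining keys unchanged. The `Branch` class, its fields, the handling of `None` branches and the defensive copy of `d` are assumptions for the sake of a runnable example, not the actual contents of swh/model/model.py; dataclasses stand in for attrs.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Branch:
    """Hypothetical stand-in for a snapshot branch model."""
    target: bytes
    target_type: str

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "Branch":
        return cls(**d)


@dataclass(frozen=True)
class Snapshot:
    branches: Dict[bytes, Optional[Branch]]
    id: bytes = b""

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "Snapshot":
        d = dict(d)  # shallow copy so that pop() leaves the caller's dict intact
        return cls(
            # Convert only the nested values; every other key is passed through.
            branches={
                name: Branch.from_dict(branch) if branch else None
                for name, branch in d.pop("branches").items()
            },
            **d)


raw = {
    "id": b"\x00" * 20,
    "branches": {
        b"master": {"target": b"\x01" * 20, "target_type": "revision"},
        b"dangling": None,
    },
}
snapshot = Snapshot.from_dict(raw)
```

Popping the nested key and splatting the rest means new top-level fields do not require touching `from_dict` again.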
317–318

I think you can remove this line entirely

349–352

same

This revision now requires changes to proceed. Nov 29 2019, 1:57 PM
anlambert updated this revision to Diff 8375. Nov 29 2019, 2:47 PM

Update: Rebase and address @vlorentz's comments

anlambert updated this revision to Diff 8376. Nov 29 2019, 2:51 PM

Update: Rename a variable

vlorentz added inline comments. Nov 29 2019, 3:44 PM
swh/model/model.py
342–345

forgot this one

anlambert added inline comments. Nov 29 2019, 3:50 PM
swh/model/model.py
342–345

Sigh... I am really not efficient when I am sick and tired.

anlambert updated this revision to Diff 8380. Nov 29 2019, 3:52 PM

Update: Forgot to process Directory model

vlorentz accepted this revision. Nov 29 2019, 3:53 PM
This revision is now accepted and ready to land. Nov 29 2019, 3:53 PM