Details
- Reviewers
  ardumont
- Group Reviewers
  Reviewers
- Commits
  rDMOD29312dff6d96: Add support for model object anonymization
Diff Detail
- Repository
  rDMOD Data model
- Branch
  anonymize
- Lint
  Lint Skipped
- Unit
  Unit Tests Skipped
- Build Status
  Buildable 12446
  Build 18890: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
  Build 18889: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D3171 (id=11258)
Rebasing onto cce3036634...
Current branch diff-target is up to date.
Changes applied before test
commit 0292f52f53294f2a5e809218d67557940abeb34e
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 19 16:04:30 2020 +0200

    Add support for model object anonymization

    Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable. For Revision, Release and Person, the method returns an anonymized version of the object.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/60/ for more details.
swh/model/model.py, lines 146–147:
name and email are just display helpers. The anonymous version should probably only hash the fullname data.
Shouldn't we make anonymized objects error when their compute_hash() method is called?
Maybe, but that would require keeping the information "this is an anonymized object" somewhere, which is not the case for now. This idea can be dealt with later, maybe?
swh/model/model.py, lines 146–147:
Right, I read the code in identifier.py too fast and thought all 3 were concatenated for hash computation, but they are not, as you point out.
That's why I'm asking now, so you/we don't have to make code changes later. But if you're comfortable with it, then fine.
Typos + comments/docstrings + hash on the fullname in Person.anonymize().
Also ensure the persons_d() strategy does not generate data that looks like
an anonymized person.
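To illustrate the kind of check this update describes, here is a minimal sketch of a predicate that a test data generator could use to reject person data that looks anonymized. The helper name looks_anonymized is hypothetical, and the 32-byte length assumes the fullname is hashed with SHA-256, which this revision does not state; the actual persons_d() code may differ.

```python
import hashlib
from typing import Optional

# Assumption: anonymization replaces fullname with a SHA-256 digest (32 bytes).
ANONYMIZED_FULLNAME_LENGTH = hashlib.sha256().digest_size


def looks_anonymized(
    fullname: bytes, name: Optional[bytes], email: Optional[bytes]
) -> bool:
    """Hypothetical helper: True when the triple could pass for an anonymized
    person, i.e. a digest-sized fullname with name and email both unset."""
    return (
        len(fullname) == ANONYMIZED_FULLNAME_LENGTH
        and name is None
        and email is None
    )
```

A generator would simply discard any draw for which this predicate returns True, so that randomly generated persons are never mistaken for anonymized ones.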
Build is green
Patch application report for D3171 (id=11267)
Rebasing onto cce3036634...
Current branch diff-target is up to date.
Changes applied before test
commit e40fe471031bc85f9d40be163cba9d7351a02888
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 19 16:04:30 2020 +0200

    Add support for model object anonymization

    Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable. For Person, the method returns a Person with hashed fullname (and unset name and email). For Revision and Release, the method returns an anonymized version of the object, i.e. with instances of Person replaced by anonymized ones.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/61/ for more details.
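For readers skimming the timeline, the behaviour described in the commit message can be sketched roughly as follows. This is only an illustration, not the code from the diff: it uses standard-library dataclasses as a stand-in for the real model classes, assumes SHA-256 as the hash function, and the fields shown on Revision and the Content class are simplified placeholders.

```python
import hashlib
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class BaseModel:
    def anonymize(self) -> Optional["BaseModel"]:
        """Default: return None, meaning the object is not anonymizable."""
        return None


@dataclass(frozen=True)
class Person(BaseModel):
    fullname: bytes
    name: Optional[bytes] = None
    email: Optional[bytes] = None

    def anonymize(self) -> "Person":
        # Only fullname is hashed; name and email are unset.
        # Assumption: SHA-256 is the hash function.
        digest = hashlib.sha256(self.fullname).digest()
        return Person(fullname=digest, name=None, email=None)


@dataclass(frozen=True)
class Revision(BaseModel):
    # Simplified placeholder fields, not the full model.
    message: bytes
    author: Person
    committer: Person

    def anonymize(self) -> "Revision":
        # Same revision, with each Person replaced by its anonymized version.
        return replace(
            self,
            author=self.author.anonymize(),
            committer=self.committer.anonymize(),
        )


@dataclass(frozen=True)
class Content(BaseModel):
    # Placeholder for a model class that keeps the default anonymize().
    data: bytes
```

With these definitions, calling anonymize() on a Revision leaves the message untouched and swaps both persons, while calling it on a Content returns None.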
swh/model/hypothesis_strategies.py, line 102 (On Diff #11267):
\o/
Build is green
Patch application report for D3171 (id=11268)
Rebasing onto cce3036634...
Current branch diff-target is up to date.
Changes applied before test
commit 0f3af381835fc2f1e3e420519d0bba7aef3d8ce6
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 19 16:04:30 2020 +0200

    Add support for model object anonymization

    Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable. For Person, the method returns a Person with hashed fullname (and unset name and email). For Revision and Release, the method returns an anonymized version of the object, i.e. with instances of Person replaced by anonymized ones.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/62/ for more details.
Build is green
Patch application report for D3171 (id=11270)
Rebasing onto cce3036634...
Current branch diff-target is up to date.
Changes applied before test
commit 29312dff6d96ac1c9bc18bf98de1d2e27a76c334
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 19 16:04:30 2020 +0200

    Add support for model object anonymization

    Simply add a BaseModel.anonymize() method. Default implementation returns None, meaning the object is not anonymizable. For Person, the method returns a Person with hashed fullname (and unset name and email). For Revision and Release, the method returns an anonymized version of the object, i.e. with instances of Person replaced by anonymized ones.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/63/ for more details.