This will be used by the loaders instead of collect(), because collect()
returns nested dictionaries, and its internals (deep_update) depend heavily
on working with dicts.
Details
- Reviewers: ardumont, olasd
- Group Reviewers: Reviewers
- Commits: rDMOD9cf7a04a3e0c: Add method MerkleNode.iter_tree, to visit all nodes in the subtree of a node.
Diff Detail
- Repository: rDMOD Data model
- Branch: iter-tree
- Lint: No Linters Available
- Unit: No Unit Test Coverage
- Build Status: Buildable 10769
  - Build 16167: tox-on-jenkins Jenkins
  - Build 16166: arc lint + arc unit
Event Timeline
Build is green
See https://jenkins.softwareheritage.org/job/DMOD/job/tox/190/ for more details.
Sounds about right.
Got a couple of remarks/questions.
| swh/model/merkle.py | |
|---|---|
| 275 | So as olasd mentioned on IRC, Iterator['MerkleNode']? Also, can't we import the MerkleNode and use it plainly here? |
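For reference, a minimal sketch of the quoted annotation suggested in this comment. `AnnotatedNode` is a hypothetical stand-in, not the actual MerkleNode class; it only illustrates why the class name is written as a string when a method of the class being defined returns instances of that class.

```python
from typing import Iterator, Sequence


class AnnotatedNode:
    """Hypothetical stand-in for MerkleNode; only the annotation matters here."""

    def __init__(self, name: str, children: Sequence['AnnotatedNode'] = ()):
        self.name = name
        self.children = list(children)

    # The class name is quoted because AnnotatedNode is not bound yet while
    # the class body is still executing. Writing Iterator[AnnotatedNode]
    # without quotes raises NameError when annotations are evaluated eagerly
    # at definition time.
    def iter_tree(self) -> Iterator['AnnotatedNode']:
        yield self
        for child in self.children:
            yield from child.iter_tree()


tree = AnnotatedNode('root', [AnnotatedNode('a'), AnnotatedNode('b')])
names = [node.name for node in tree.iter_tree()]
```

An alternative, assuming the module can adopt it, is `from __future__ import annotations`, which defers evaluation of all annotations so the quotes are no longer needed.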
I've just noticed something: the top-level _iter_tree will walk the tree twice: once when computing self.hash, and again when recursing over the children. Of course the recursive calls will have a cached self.hash and will only need to walk the tree once.
There must be a way to avoid that, maybe by yielding the children before yielding oneself, but I couldn't really work it out with the deduplication.
I don't think we should really care (collect() has had that issue for as long as it has existed), but I thought I'd mention it in case you can find a way to avoid it.
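The double walk described above can be reproduced with a toy model. This is a sketch under stated assumptions: `ToyNode`, its `hash_computations` counter, and the SHA-1 based hash are all made up for illustration and are not the swh.model implementation. The only behavior borrowed from the discussion is that the hash is computed recursively from the children, is cached afterwards, and is used as the deduplication key.

```python
import hashlib
from typing import Dict, Iterator, Optional, Set


class ToyNode:
    """Toy Merkle node (hypothetical); not the swh.model MerkleNode."""

    hash_computations = 0  # class-wide counter, for the demonstration only

    def __init__(self, data: bytes,
                 children: Optional[Dict[bytes, 'ToyNode']] = None):
        self.data = data
        self.children = children or {}
        self._hash: Optional[bytes] = None

    @property
    def hash(self) -> bytes:
        # Computing the hash recurses into every child, then caches the result.
        if self._hash is None:
            ToyNode.hash_computations += 1
            digest = hashlib.sha1(self.data)
            for name in sorted(self.children):
                digest.update(name + self.children[name].hash)
            self._hash = digest.digest()
        return self._hash

    def iter_tree(self, seen: Optional[Set[bytes]] = None) -> Iterator['ToyNode']:
        if seen is None:
            seen = set()
        # First walk: on the top-level call, reading self.hash hashes the
        # whole subtree before anything is yielded.
        if self.hash not in seen:
            seen.add(self.hash)
            yield self
            # Second walk: the recursion visits the children again, but each
            # child now finds its hash already cached.
            for child in self.children.values():
                yield from child.iter_tree(seen)


leaf = ToyNode(b'leaf')
root = ToyNode(b'root', {b'a': leaf, b'b': ToyNode(b'mid', {b'c': leaf})})
nodes = list(root.iter_tree())
```

In this sketch every hash is computed exactly once (the counter ends at the number of distinct nodes), so the cost of the second pass is the traversal itself rather than any re-hashing, which is consistent with the remark that the issue is probably not worth worrying about.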
| swh/model/merkle.py | |
|---|---|
| 275 | This is the MerkleNode. You can't reference the class you're currently defining. |
| 282 | If we want to be pedantic about it, seen should probably be a tuple (self.type, self.hash), even though for all intents and purposes the probability of collision across types is very low. |
| 283 | I guess that line is not really needed. |
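A minimal sketch of the deduplication keyed on (type, hash) suggested for line 282. `TypedNode` and its constructor arguments are hypothetical; the real MerkleNode computes its hash rather than receiving it, and the type names used below are only placeholders.

```python
from typing import Iterator, List, Optional, Set, Tuple


class TypedNode:
    """Hypothetical node carrying a type tag and a precomputed hash."""

    def __init__(self, node_type: str, node_hash: bytes,
                 children: Optional[List['TypedNode']] = None):
        self.type = node_type
        self.hash = node_hash
        self.children = children or []

    def iter_tree(
        self, seen: Optional[Set[Tuple[str, bytes]]] = None
    ) -> Iterator['TypedNode']:
        if seen is None:
            seen = set()
        # Deduplicate per (type, hash) pair rather than per bare hash, so two
        # nodes of different types that share a hash are both visited.
        key = (self.type, self.hash)
        if key in seen:
            return
        seen.add(key)
        yield self
        for child in self.children:
            yield from child.iter_tree(seen)


shared = b'\x00' * 20
root = TypedNode('directory', b'\x01' * 20, [
    TypedNode('content', shared),
    TypedNode('directory', shared),  # same hash, different type: visited
    TypedNode('content', shared),    # same type and hash: skipped
])
visited_types = [node.type for node in root.iter_tree()]
```

With a set of bare hashes, the second child would have been skipped; with the tuple key, only the exact duplicate is.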
| swh/model/merkle.py | |
|---|---|
| 275 | Thanks, I did not notice (d'oh)! |
Build is green
See https://jenkins.softwareheritage.org/job/DMOD/job/tox/191/ for more details.