- reorder conditionals to minimize the number of tests needed
- use hasattr() instead of the expensive isinstance()
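
For illustration, a minimal sketch of the two changes above. This is not the actual swh.model code: the `dictify` body, the `Point` class, and the choice of `to_dict` as the attribute probed by hasattr() are all assumptions made for the example.

```python
from typing import Any


class Point:
    """Hypothetical model object exposing to_dict(), used only for this sketch."""

    def __init__(self, x: int) -> None:
        self.x = x

    def to_dict(self) -> dict:
        return {"x": self.x}


def dictify(value: Any) -> Any:
    """Recursively convert a value into plain containers (illustrative only)."""
    # Duck-typed check: anything with a to_dict() method converts itself.
    # This replaces an isinstance() test against a model base class.
    if hasattr(value, "to_dict"):
        return value.to_dict()
    # Container checks follow; the best order depends on which types
    # actually dominate the data being converted.
    if isinstance(value, dict):
        return {k: dictify(v) for k, v in value.items()}
    if isinstance(value, tuple):
        return tuple(dictify(v) for v in value)
    # Everything else (str, bytes, int, None, ...) is returned unchanged.
    return value
```

The order of the branches is a guess; in practice it should follow the measured frequency of each type.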
Motivation: Ironically, this is the bottleneck of my checksumming
script.
Benchmark:
In [1]: from swh.storage import get_storage
In [2]: from swh.model.identifiers import CoreSWHID, _BaseSWHID
In [3]: s = get_storage('remote', url='http://moma.internal.softwareheritage.org:5002/')
In [4]: rev = s.revision_get([bytes.fromhex("747675816d815e86b7482b5a0acb9110eeeec590")])[0]
Before this commit:
In [18]: %timeit rev.to_dict()
10000 loops, best of 5: 70.4 µs per loop
In [19]: %timeit rev.to_dict()
10000 loops, best of 5: 69.3 µs per loop
After this commit:
In [5]: %timeit rev.to_dict()
10000 loops, best of 5: 48.4 µs per loop
In [6]: %timeit rev.to_dict()
10000 loops, best of 5: 45.7 µs per loop
In [7]: %timeit rev.to_dict()
10000 loops, best of 5: 47.5 µs per loop
Unfortunately, there isn't much more we can do: 90% of the time is spent
constructing a dict (even when replacing the dictcomp
{k: dictify(v) for k, v in value.items()} with map() + dict()).
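
For reference, a sketch of the two dict-construction variants mentioned above. The exact map() + dict() spelling that was tried is not recorded here, so the one below is only one possible form; `dictify` is a trivial stand-in and the function names are hypothetical.

```python
from typing import Any


def dictify(value: Any) -> Any:
    """Stand-in for the real helper: returns the value unchanged."""
    return value


def convert_with_dictcomp(value: dict) -> dict:
    # The dict comprehension quoted in the message above.
    return {k: dictify(v) for k, v in value.items()}


def convert_with_map(value: dict) -> dict:
    # One possible map() + dict() spelling (an assumption): map dictify
    # over the values and pair the results back with the keys.
    return dict(zip(value.keys(), map(dictify, value.values())))
```

Both build the same dict, so either can be timed with %timeit on a representative value.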