From within swh-model at least ;)
Details
Diff Detail
- Repository
- rDMOD Data model
- Branch
- master
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 1429 Build 1773: arc lint + arc unit
Event Timeline
A few comments inline, most notably regarding the behavior of file hashing.
swh/model/from_disk.py | ||
---|---|---|
80–82 | I don't like the verbosity of passing the length argument to from_data to avoid one call to len() which is a pretty minor thing compared to actually hashing the data :) However I won't argue that this needs to change. | |
127–129 | This should do multiple reads with a fixed chunk size, rather than relying on the iterator behavior of file objects: the iterator will split the input line by line, which will make us read binary files all at once and possibly blow up our memory usage. I now notice that MultiHash.from_file should do the same :) |
swh/model/from_disk.py | ||
---|---|---|
80–82 | A minor +1 on not passing length here too, mostly because it opens up to inconsistency risks: one might pass a length that is not the actual length of data, which makes this API more dangerous than it could be. |
swh/model/from_disk.py | ||
---|---|---|
80–82 |
I don't either...
ok. But...
To be fair, the internal hash_* were already this way (cf. e.g. hash_file). I was trying to find a middleground here. What i would actually like to see from the MultiHash from_* is a way to ask for the length (raw_data even?) computed from the content/path/file. That'd be consistent with our use, we often want it (more than not) in the model clients (loaders mostly). For example, a parameter to the *digest method? What do you think? | |
127–129 | Right! |
swh/model/from_disk.py | ||
---|---|---|
80–82 | Only hash_file really needs the length to be passed by the caller (Python file objects don't generally know their size), all other API calls can infer it from the source. That's also the way the old API was designed, and I missed that when reviewing the new API. I can propose a diff to hashutil to clean that up? | |
127–129 | 👌 |