They will be used by loaders, so they can deal only with canonical
model objects, instead of having to do the same conversion themselves.
Depends on D2702.
Differential D2703
Add to_model() method to from_disk.{Content,Directory}, to convert to canonical model objects. vlorentz on Feb 20 2020, 4:51 PM. Authored by
Details
They will be used by loaders, so they can deal only with canonical Depends on D2702.
Diff Detail
Event TimelineComment Actions Build is green Comment Actions That'd be a lot more work downstream, but wouldn't it make sense to have get_data return model objects directly instead of dicts? As most users are calling collect() on a directory object, that would remove a large chunk of indirection from loaders. What do you think?
Comment Actions
No, in that case it's a SkippedContent.
What's the difference? Comment Actions They don't exactly return the same thing; from_disk.Content.get_data() also returns perms, name, and path.
I don't understand what collect() has to do with it Comment Actions Then you're generating SkippedContents for all files that are supposed to be lazy-loaded from disk. That's even more wrong than my original expectation.
The difference between what and what? Comment Actions That's really just an abstraction failure/leak in its implementation. These fields are not useful to the callers if the API of the returned Contents and Directorys is sensible (see also the lazy loading comments).
collect() is the external API of Directory that's used by loaders to get the lists of objects to load. That function calls get_data() on every node in the tree, which currently converts the data structure to a dict. So you don't have any Content objects on which to call .to_model by the time you've collect()ed, which makes to_model's usefulness on its own quite limited. Comment Actions I mean, what's the point of lazy-loading, instead of using the data argument of from_disk? ack
ack Comment Actions Ah, right. Most loaders that are using from_disk will be re-processing files that they've already imported in one way or another: for instance, the svn loader calls runs it for every commit in turn; package manager loaders for every version in turn; lazy-loading allows us to only carry the full data in memory at the point we're actually sending it away to the archive, rather than all the time. In testing, the memory consumption savings were quite substantial. Comment Actions Build has FAILED Link to build: https://jenkins.softwareheritage.org/job/DMOD/job/tox/186/ Comment Actions Build is green Comment Actions Thanks!
Comment Actions Build has FAILED Link to build: https://jenkins.softwareheritage.org/job/DMOD/job/tox/188/ |