Adapt according to review:
- Drop swh.loader.pattern and move the Loader class into the swh.loader.core.loader module
- Drop the unneeded self.create_authorities and self.create_fetchers
@ardumont points out that the base PackageLoader doesn't inherit from BaseLoader, which explains the new (common) base class. I think the new class could just as well be next to BaseLoader, and doesn't warrant the introduction of a pattern module.
Are the new pattern module / pattern.Loader class really needed? It looks like these methods could live in the BaseLoader class directly.
This is great, thanks!
Build is green
Add missing test for the CLI run edge case
Build is green
Fix unused import
Build has FAILED
Our logging handler swh.core.logger.JournalHandler already knows how to pull some metadata from the celery tasks:
...
Build is green
Adapt according to suggestion
In D4012#99590, @ardumont wrote: To be clear, my main issue today is that when I try to look through our logs to
investigate, or just to read what's going on (after a deployment, for example), I
don't have any clues immediately... In my mind, the Kibana information is not enough by itself, so I think I need
to cross-reference it with, say, Sentry to get some more context... It's
currently quite frustrating, up to the point of "oh well, I have
some other urgent matters somewhere else..." (sometimes I push through, but
sometimes I fail).
Build is green
Simplify to just one log statement
OK, I'll adapt.
I would not be against a nudge in the right direction to actually improve the logging.
In D4012#99525, @olasd wrote: I don't think the origin url and visit type should be sent in the task result; they're arguments of the task already.
If we want them logged by the worker when the task ends (which I agree would be useful), then we should improve logging on the worker/celery side to show some of the task arguments (for instance, a "url" argument if there is one) instead of, or in addition to, the task id.
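To make the suggestion above concrete, here is a minimal sketch of a log line that shows selected task arguments next to the task id. It only uses the standard library; the helper names (`describe_task`, `log_task_end`) and the list of logged argument names are hypothetical, and hooking this into the actual worker/celery logging is left out.

```python
import logging

# Hypothetical: names of the task arguments worth surfacing in log lines.
LOGGED_ARGUMENTS = ("url", "visit_type")


def describe_task(task_id, task_name, kwargs, keys=LOGGED_ARGUMENTS):
    """Build a one-line task description that includes selected arguments."""
    # Only show the arguments that are actually present for this task.
    shown = [f"{key}={kwargs[key]!r}" for key in keys if key in kwargs]
    suffix = f" ({', '.join(shown)})" if shown else ""
    return f"Task {task_name}[{task_id}]{suffix}"


def log_task_end(logger, task_id, task_name, kwargs, status):
    """Log the end of a task with its arguments, not only its id."""
    message = f"{describe_task(task_id, task_name, kwargs)} ended: {status}"
    logger.info(message)
    return message
```

With something like this, the url and visit type stay task arguments and never need to be copied into the task result.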
I think the second point mostly happened: the storage is returning statistics to the loader, but the loaders don't generally collect them.
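As a rough sketch of what collecting those statistics on the loader side could look like, assuming each storage call returns a dict of counters (the `StatsCollector` name and the key names used below are hypothetical):

```python
from collections import Counter


class StatsCollector:
    """Accumulate the counter dicts returned by storage calls (assumed shape)."""

    def __init__(self):
        self.totals = Counter()

    def record(self, summary):
        # Tolerate calls that return nothing; otherwise add counts key by key.
        self.totals.update(summary or {})
        return summary

    def as_dict(self):
        return dict(self.totals)
```

A loader could wrap each storage call with `record(...)` and report `as_dict()` once the visit ends.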
I'm afraid the only way to properly solve this is to wait until we stop writing metadata to the revision table
Reopening as I'm still refactoring/cleaning up more modules.
In the end, it's more dead code, since we only go through it when the storage used is an in-memory instance.
This is no longer the case: tests are now using a pg-storage instance.