Page MenuHomeSoftware Heritage

Make the swh.loader.package exception handling more granular
ClosedPublic

Authored by ardumont on Mar 20 2020, 6:11 PM.

Details

Summary

This change generalizes the behavior of loading most revisions and generating a
partial snapshot when we failed to load some of them.

Depends on D2860

Test Plan

tox

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 11276
Build 17044: tox-on-jenkinsJenkins
Build 17043: arc lint + arc unit

Event Timeline

I'm fine with the idea of being more resilient.
Since we want to maximize the ingestion for package loaders.

ardumont added a reviewer: olasd.

Build on the first iteration:

  • make test pass
  • add docstring
  • call sentry_sdk to push exception to sentry

Remove spurious print statements

swh/loader/package/loader.py
15

I patched the staging node currently running the functional loader.
But i think we can add this anyway.

olasd requested changes to this revision.Mar 23 2020, 11:25 AM
olasd added inline comments.
swh/loader/package/loader.py
377

I guess you can remove the loaded return value now; seems like the function always raises an exception when it fails to load now. This will simplify the caller as well.

This revision now requires changes to proceed.Mar 23 2020, 11:25 AM

Remove unneeded loaded flag from _load_revision

This revision is now accepted and ready to land.Mar 23 2020, 11:34 AM
lewo added inline comments.
swh/loader/package/loader.py
357

I actually don't really like these except Exception statements. I think it would be better to first catch all expected errors, and then fallback on except Exception if it is a unexpected exception (and tag it as unexpected in sentry). For instance, we know some archives will be broken, this would lead to a "expected error".

This would allow us to write better tests, since we could test the type of the raised exception.

swh/loader/package/loader.py
343

I knew this bothered me for some reason and then it slipped my mind [1]

[1] https://sentry.softwareheritage.org/share/issue/c0d1fbfcd865486bb41eea35552f50dd/

status visit can only be ongoing, partial or full...

swh/loader/package/loader.py
343

Fixed in D2870