pipeline storage: Add retry behavior on flushing failures
ClosedPublic

Authored by ardumont on Jan 30 2020, 12:34 PM.

Details

Summary

Currently, spurious "hash collision" errors occur frequently during ingestion [1] [2] [3].
The last loading step (flush) fails on most loaders (git, npm, etc.).

This commit adds retry behavior to the flush method,
which should decrease the frequency of that error.

Any hash collisions that still occur after retrying should then be real hash
collisions.
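
To illustrate the idea of retrying a failing flush, here is a minimal sketch using only the standard library. It is not the code of this diff: the names `retry`, `TransientFlushError`, `FlakyStorage` and `RetryingProxyStorage`, as well as the attempt count and delay, are assumptions made for the example.

```python
import time
from functools import wraps


class TransientFlushError(Exception):
    """Hypothetical stand-in for a transient error raised while flushing."""


def retry(max_attempts=3, delay=0.0, retry_on=(TransientFlushError,)):
    """Retry the decorated callable on the given exceptions.

    The last failure is re-raised once max_attempts calls have been made.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the error
                    time.sleep(delay * attempt)  # linear backoff
        return wrapper
    return decorator


class FlakyStorage:
    """Hypothetical backend whose flush fails a fixed number of times."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def flush(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise TransientFlushError("simulated transient failure")
        return {"flushed": True}


class RetryingProxyStorage:
    """Hypothetical proxy that forwards flush to a backend, with retries."""

    def __init__(self, storage):
        self.storage = storage

    @retry(max_attempts=3)
    def flush(self):
        return self.storage.flush()
```

With this sketch, a backend that fails twice and then succeeds is flushed on the third call, while a backend that keeps failing still raises after the third attempt, so a persistent error is not hidden.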

[1] https://sentry.softwareheritage.org/share/issue/102aace238fe4ba6b49bcc5531f7c2bf/

[2] https://sentry.softwareheritage.org/share/issue/8e8b48a1d94c465b8109e76311ecdbe7/

[3] https://sentry.softwareheritage.org/share/issue/d4f1208b7eec4b43b11e38494ff039cc/

Test Plan

tox

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.