Page MenuHomeSoftware Heritage

Add more type checks in metadata dictionary.
ClosedPublic

Authored by vlorentz on Feb 5 2019, 10:16 AM.

Diff Detail

Repository
rDCIDX Metadata indexer
Branch
more-type-checks
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 4158
Build 5481: tox-on-jenkinsJenkins
Build 5480: arc lint + arc unit

Event Timeline

As always with this kind of diff, I'm quite puzzled: we add a lot of 'if' conditions (on types), but what are the unexpected types that the code may encounter? As is, we just (silently) ignore a bunch of values.

I get it that we parse (normally) structured files from the wide, so we shall expect nothing valid. But shouldn't we at least notify the failure? shouldn't there be a metedata entry that state the parsing was successful, partial or a complete failure?

Also, you add many defensive statements, but none of them are actually tested, unless I'm mistaken. Once again, please add tests for these invalid input resiliency checks.

douardda requested changes to this revision.Feb 7 2019, 9:33 AM
This revision now requires changes to proceed.Feb 7 2019, 9:33 AM
  • Use hypothesis to generate adversarial inputs.
  • Fix bugs found by hypothesis.
  • Fix doctest typo
  • Smaller hypothesis examples.
  • Fix another bug found by hypothesis.

Let's keep on, but I really don't like the turn this code takes...

This revision is now accepted and ready to land.Feb 12 2019, 9:32 AM
This revision was automatically updated to reflect the committed changes.