Page MenuHomeSoftware Heritage

Add more type checks in metadata dictionary.

Authored by vlorentz on Feb 5 2019, 10:16 AM.

Diff Detail

rDCIDX Object indexer
No Linters Available
No Unit Test Coverage
Build Status
Buildable 4124
Build 5432: tox-on-jenkinsJenkins
Build 5431: arc lint + arc unit

Event Timeline

vlorentz created this revision.Feb 5 2019, 10:16 AM

As always with this kind of diff, I'm quite puzzled: we add a lot of 'if' conditions (on types), but what are the unexpected types that the code may encounter? As is, we just (silently) ignore a bunch of values.

I get it that we parse (normally) structured files from the wide, so we shall expect nothing valid. But shouldn't we at least notify the failure? shouldn't there be a metedata entry that state the parsing was successful, partial or a complete failure?

Also, you add many defensive statements, but none of them are actually tested, unless I'm mistaken. Once again, please add tests for these invalid input resiliency checks.

douardda requested changes to this revision.Feb 7 2019, 9:33 AM
This revision now requires changes to proceed.Feb 7 2019, 9:33 AM
vlorentz updated this revision to Diff 3491.Feb 8 2019, 3:28 PM
  • Use hypothesis to generate adversarial inputs.
  • Fix bugs found by hypothesis.
vlorentz updated this revision to Diff 3492.Feb 8 2019, 3:40 PM
  • rebase
vlorentz updated this revision to Diff 3493.Feb 8 2019, 3:44 PM
  • Fix doctest typo
  • Smaller hypothesis examples.
vlorentz updated this revision to Diff 3494.Feb 8 2019, 4:20 PM
  • Fix another bug found by hypothesis.
douardda accepted this revision.Feb 12 2019, 9:32 AM

Let's keep on, but I really don't like the turn this code takes...

This revision is now accepted and ready to land.Feb 12 2019, 9:32 AM
vlorentz updated this revision to Diff 3528.Feb 12 2019, 12:19 PM
  • Rebase + squash
This revision was automatically updated to reflect the committed changes.