
Split Content class into two classes, for missing and non-missing contents.
Closed, Public

Authored by vlorentz on Feb 4 2020, 3:50 PM.

Details

Summary

I'm not very happy with the names though.
Suggestions welcome

Diff Detail

Repository
rDMOD Data model
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

vlorentz created this revision. Feb 4 2020, 3:50 PM
zack added a subscriber: zack. Feb 4 2020, 4:14 PM

s/non missing/present/
would be an improvement.

"Missing" isn't perfect, maybe, but it's consistent with SQL storage tables at least.

In D2623#62430, @zack wrote:

s/non missing/present/
would be an improvement.

Indeed

"Missing" isn't perfect, maybe, but it's consistent with SQL storage tables at least.

No, SQL tables use "skipped". But I prefer "missing" because it's more generic (it also includes content we couldn't find)

zack added a comment. Feb 4 2020, 4:29 PM

"Missing" isn't perfect, maybe, but it's consistent with SQL storage tables at least.

No, SQL tables use "skipped". But I prefer "missing" because it's more generic (it also includes content we couldn't find)

Ah, good point — I was misremembering.
I agree with you that "missing" is better.

ardumont added a comment. Edited Feb 4 2020, 4:49 PM

"Missing" isn't perfect, maybe, but it's consistent with SQL storage tables at least.

I remembered as much, but the table is named "skipped", not "missing".
What uses "missing" are the storage endpoints.

So bonus points for improving consistency.

swh/model/hypothesis_strategies.py, line 136

As zack said, "present" is better.

I also like "existing", but it can be ambiguous:
if it's missing, it does not exist within the archive...
yet it does exist for real. Like I said, ambiguous.

Go for "present" ;)

Is this already covered by current tests?

Is this already covered by current tests?

Yes, via the hypothesis strategies

ardumont accepted this revision. Feb 5 2020, 11:19 AM
This revision is now accepted and ready to land. Feb 5 2020, 11:19 AM
vlorentz updated this revision to Diff 9383. Feb 5 2020, 11:21 AM

rename non-missing -> present

vlorentz updated this revision to Diff 9384. Feb 5 2020, 11:27 AM
  • rename missing -> skipped

This avoids confusion with the terminology used in swh-storage,
where "content missing" means we never saw that content before,
while "skipped content" means we saw it but didn't ingest it
for some reason.
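
The distinction above can be sketched in Python. This is only a minimal illustration, not the code from this revision: the class names `BaseContent`, `PresentContent` and `SkippedContent`, their fields, and the helper `missing_hashes` are all hypothetical and chosen just to show the terminology.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class BaseContent:
    """Fields shared by both kinds of content (hypothetical, simplified)."""
    sha1: bytes
    length: int


@dataclass(frozen=True)
class PresentContent(BaseContent):
    """Content that was seen and ingested (hypothetical name)."""
    data: Optional[bytes] = None
    status: str = "visible"


@dataclass(frozen=True)
class SkippedContent(BaseContent):
    """Content that was seen but not ingested, for a stated reason."""
    reason: str = ""
    status: str = "absent"

    def __post_init__(self):
        # A skipped content without a reason would be indistinguishable
        # from an incomplete record, so reject it early.
        if not self.reason:
            raise ValueError("a skipped content must state why it was skipped")


def missing_hashes(seen: Iterable[BaseContent], wanted: Iterable[bytes]) -> List[bytes]:
    """Return the hashes that were never seen at all.

    Both present and skipped contents count as seen; only hashes with
    no record whatsoever are "missing" in the swh-storage sense.
    """
    known = {content.sha1 for content in seen}
    return [h for h in wanted if h not in known]
```

With one present and one skipped record, only a hash that matches neither is reported as missing, which is the reason for keeping "skipped" and "missing" as separate words.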

  • rename missing -> skipped

This avoids confusion with the terminology used in swh-storage,
where "content missing" means we never saw that content before,
while "skipped content" means we saw it but didn't ingest it
for some reason.

as per oral discussion, agreed!

ardumont accepted this revision. Feb 5 2020, 4:09 PM