It also makes the weirdness of the dichotomy between content and skipped_content more glaring. In hindsight, having two somewhat incompatible tables populated by the same call to content_add is a bit weird.

It's also really the only place where we handle holes in the graph, while we really should handle them for all object types.

I'll accept the diff because length=-1 is a valid input to content_add. Until we untangle the management of the content and skipped_content tables, you can add a cross-check to make sure that length is -1 only when the content is marked as absent.

This revision is now accepted and ready to land. Aug 19 2019, 4:50 PM

Only allow length=-1 if status=absent.
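
For readers following along, the cross-check can be sketched roughly as below. This is a minimal illustration and not the code from the diff: `ContentSketch` is a hypothetical, simplified stand-in for the Content model, and the "visible" default status is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentSketch:
    """Hypothetical, simplified stand-in for a content record."""

    length: int
    status: str = "visible"  # assumed default, for illustration only

    def __post_init__(self):
        if self.status == "absent":
            # Absent content may not know its size: -1 means "unknown".
            if self.length < -1:
                raise ValueError("Length must be non-negative or -1.")
        elif self.length < 0:
            # Any other status must carry a real, non-negative length.
            raise ValueError(
                "Length must be non-negative unless status=absent."
            )


# Accepted: the length is unknown, but the content is marked absent.
ContentSketch(length=-1, status="absent")

# Rejected: length=-1 on content that is not marked absent.
try:
    ContentSketch(length=-1, status="visible")
except ValueError as exc:
    print(exc)
```

Doing the check at construction time means an inconsistent pair of length and status is rejected before it can reach either table.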

This revision was landed with ongoing or failed builds. Aug 19 2019, 4:56 PM

Closed by commit rDMOD19634f258915: Allow -1 as Content length. (authored by vlorentz).

This revision was automatically updated to reflect the committed changes.

Harbormaster failed remote builds in B7312: Diff 6276! Aug 19 2019, 4:56 PM

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DMOD/job/tox/108/
See console output for more information: https://jenkins.softwareheritage.org/job/DMOD/job/tox/108/console

[ I'm jumping in here, but obviously I'm missing the IRL context. ]

In D1862#43313, @olasd wrote:

> It's also really the only place where we handle holes in the graph, while we really should handle them for all object types.

This is indeed an important discrepancy that we should fix.
To expand, and make sure we are on the same page, this is the only place where we can store some (incomplete) information about objects that are missing in full.
For other missing objects we just don't store any information at all about them, so all links to them from elsewhere are just dangling.

A list of all missing objects would already be useful in itself, and we don't have one today.
A place to store partial information about any kind of missing object would be a plus.

Does this summary sound correct?

(If so, we should stash this info in a dedicated task and discuss the matter further there.)
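
To make the "dangling links" point concrete, here is a small sketch of how a list of missing objects could be derived from the references alone. It is an illustration under assumptions, not a description of the storage: the graph is a toy in-memory dict, the identifiers are made up, and `find_dangling` is a hypothetical helper.

```python
def find_dangling(objects):
    """Map each referenced-but-absent identifier to its referrers.

    `objects` maps an object identifier to the list of identifiers
    it points to. Anything pointed to but not present as a key is
    a hole in the graph.
    """
    missing = {}
    for obj_id, targets in objects.items():
        for target in targets:
            if target not in objects:
                missing.setdefault(target, []).append(obj_id)
    return {target: sorted(refs) for target, refs in missing.items()}


# Toy graph: dir1 points to cnt2, which was never stored.
graph = {
    "rev1": ["dir1"],
    "dir1": ["cnt1", "cnt2"],
    "cnt1": [],
}

print(find_dangling(graph))  # {'cnt2': ['dir1']}
```

This only recovers holes whose identifier is known to the referrer; it says nothing about the partial information we might want to store for them.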

> so all links to them from elsewhere are just dangling.

It's worse than that. Holes in VCSs other than git can't even be referenced, because we don't know their sha1_git.

> Does this summary sound correct?

Yes

> (If so, we should stash this info in a dedicated task and discuss the matter further there.)

I always assumed there was one, but I can't find it. So here it is: T1957

Revision Contents

Path: swh/model/model.py
Size: 6 lines

Diff 6277
