Page MenuHomeSoftware Heritage

loader.core: Make check_snapshot ensure snapshot object references exist
ClosedPublic

Authored by ardumont on Jul 9 2020, 4:17 PM.

Details

Summary

When asserting the snapshot is the one expected, this now check the references
targeted by the snapshot do exist.

This also checks that the alias present in the snapshot actually target
branches in the snapshot.

For now, those check concern only release and revision objects (not yet up to
directories and contents).

The function still allows some branches to not be resolvable. Like for example
the "evaluation" branch in the case of the nixguix loader.

Related to T2483

Depends on D3473

Test Plan

tox

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build was aborted

Patch application report for D3478 (id=12309)

Could not rebase; Attempt merge onto 39ffcde657...

Updating 39ffcde..0bd2f4a
Fast-forward
 swh/loader/tests/__init__.py  | 113 ++++++++++++++---
 swh/loader/tests/test_init.py | 288 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 342 insertions(+), 59 deletions(-)
Changes applied before test
commit 0bd2f4a60809940e6c73405f3706310c7212cb13
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Related to T2483

commit 5d02595567ba777aa081a740b86db956e0e8ca2d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 8 17:36:21 2020 +0200

    check_snapshot: Allow to check using both Snapshot and dict
    
    So we can smoothly transition to stop using snapshot as dict.
    
    Related to T2483

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/123/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/123/console

Rebase on latest master

Now the build is red because apparently loader-nixguix inserts incomplete
snapshot (I'll check further) and the new check-snapshot implementation detected it (nice or what !?)

Build has FAILED

Patch application report for D3478 (id=12318)

Could not rebase; Attempt merge onto ed4e0b38c1...

Updating ed4e0b3..b48bd3b
Fast-forward
 swh/loader/tests/__init__.py  | 113 ++++++++++++++---
 swh/loader/tests/test_init.py | 288 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 342 insertions(+), 59 deletions(-)
Changes applied before test
commit b48bd3bb7efdbeffccf22b2bd286d914cfc9a13a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Related to T2483

commit b5ccfd7ae14ad8f415417281e4e135779a839a61
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 8 17:36:21 2020 +0200

    check_snapshot: Allow to check using both Snapshot and dict
    
    So we can smoothly transition to stop using snapshot as dict.
    
    Related to T2483

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/126/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/126/console

(build currently fails because some loaders inserts incomplete data apparently)

ah but yeah the evaluation branch...
took me too long to see it, i'll update check-snapshot so the information is displayed.

And i'll need to add the means to explicitely allow some holes in there.

  • Open safelist parameter to allow some objects to be unchecked
  • Override check-snapshot for the test_nixguix so tests are fine (using that safelist parameter)
  • Improve reported message when failing the assertions so we know what branch and target is reported missing

Build is green

Patch application report for D3478 (id=12327)

Could not rebase; Attempt merge onto ed4e0b38c1...

Updating ed4e0b3..6073467
Fast-forward
 swh/loader/package/nixguix/tests/test_nixguix.py |  13 +-
 swh/loader/tests/__init__.py                     | 127 ++++++++--
 swh/loader/tests/test_init.py                    | 295 +++++++++++++++++++----
 3 files changed, 374 insertions(+), 61 deletions(-)
Changes applied before test
commit 60734673df500704ac91ff8cd36328e070abcad4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Allow to loosen up the check by safe-listing some object type (for example, for
    the nixguix loader, allows the branch "evaluation" with type revision to be non
    resolvable)
    
    Related to T2483

commit b5ccfd7ae14ad8f415417281e4e135779a839a61
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 8 17:36:21 2020 +0200

    check_snapshot: Allow to check using both Snapshot and dict
    
    So we can smoothly transition to stop using snapshot as dict.
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/127/ for more details.

I have trouble grasping the content of this diff, so I'll leave it to future me / someone else.

Some style comments though:

swh/loader/tests/__init__.py
103–116

way too many newlines here

118

that's not a great name. What about allowed_empty?

156

Could you rename it to branches_by_target_type, or something of the sort?

157

That's not a very helpful comment :/

172–174

you can do the joining directly on this line

177

space should be before the line break

185–187

same

189

concat

swh/loader/tests/test_init.py
62–154

Why .from_dict() instead of building them directly?

swh/loader/tests/__init__.py
118

sounds good.

172–174

inside the string you mean?
(it was there originally but it was unreadable)

swh/loader/tests/test_init.py
62–154

eager to go to test the function and then i forgot to refactor.

Also i wanted initially to reuse the storage_data objects and drop those.
But then, realized they are not consisten in regards to ids as it is here.
so really, i'm missing the refactor step to make those direct objects ;)

Thanks.

swh/loader/tests/__init__.py
103–116

What's the change proposal?
(I did not get the implied solution)

157

"alias" consistency?
I want to check that the alias reference does not target something that does not exist.

swh/loader/tests/__init__.py
103–116
class InconsistentAliasBranchError(AssertionError):
    """When an alias branch targets an inexistent branch."""
    pass
157

"check that the alias reference does not target something that does not exist"
-> write that ;)

172–174

on the contrary, not inside the string: `missing_objs = ', '.join(str((object_to_branch[rev], rev.hex())) for rev in not_found)

swh/loader/tests/__init__.py
103–116

heh, right, though the pass will get spaced by black (i think) ;)

Thanks for the review, will adapt once i rebase on D3473 ;)

ardumont edited the summary of this revision. (Show Details)
ardumont edited the test plan for this revision. (Show Details)
  • Rebase on latest master
  • Adapt according to review

Build is green

Patch application report for D3478 (id=12351)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit b7050398a9cc253c438c172b72b05f19b8016d55
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Allow to loosen up the check by safe-listing some object type (for example, for
    the nixguix loader, allows the branch "evaluation" with type revision to be non
    resolvable)
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/131/ for more details.

swh/loader/tests/test_init.py
76

Still need to convert those into model objects (incoming heh ;)

adapt according to review: make model objects ;)

swh/loader/tests/test_init.py
456

That's not needed, drop it.

Build is green

Patch application report for D3478 (id=12352)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit 016405ca812f49b7c60517f799423494c52c819e
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Allow to loosen up the check by safe-listing some object type (for example, for
    the nixguix loader, allows the branch "evaluation" with type revision to be non
    resolvable)
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/132/ for more details.

Build is green

Patch application report for D3478 (id=12353)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit 04a0e9406155f22cea815f0544a5f75b034deb45
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Allow to loosen up the check by safe-listing some object type (for example, for
    the nixguix loader, allows the branch "evaluation" with type revision to be non
    resolvable)
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/133/ for more details.

Build is green

Patch application report for D3478 (id=12354)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit f6bc2ff4db2ab4320410b78fdf602e81f0e1ef74
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    When asserting the snapshot is the one expected, this now check the references
    targeted by the snapshot do exist.
    
    This also checks that the alias present in the snapshot actually target
    branches in the snapshot.
    
    For now, those check concern only release and revision objects (not yet up to
    directories and contents).
    
    The function still allows some branches to not be resolvable. Like for example
    the "evaluation" branch in the case of the nixguix loader.
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/134/ for more details.

Rework commit message (again)

Build is green

Patch application report for D3478 (id=12355)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit fe28c44349dd21ec1906058a83d4ca91147ee2e6
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    When asserting the snapshot is the one expected, this now checks the references
    targeted by the snapshot do exist as well.
    
    This also checks that the alias present in the snapshot actually target
    branches in the snapshot.
    
    For now, those checks concern only release and revision objects (it does not
    check up to directories and contents yet).
    
    The function still allows to declare that some branches can be empty. Like for
    example the "evaluation" branch in the case of the nixguix loader.
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/135/ for more details.

swh/loader/tests/test_init.py
456

done

This revision is now accepted and ready to land.Jul 10 2020, 4:12 PM