Page MenuHomeSoftware Heritage

loader.core: Make check_snapshot ensure snapshot object references exist
ClosedPublic

Authored by ardumont on Jul 9 2020, 4:17 PM.

Details

Summary

When asserting the snapshot is the one expected, this now check the references
targeted by the snapshot do exist.

This also checks that the alias present in the snapshot actually target
branches in the snapshot.

For now, those check concern only release and revision objects (not yet up to
directories and contents).

The function still allows some branches to not be resolvable. Like for example
the "evaluation" branch in the case of the nixguix loader.

Related to T2483

Depends on D3473

Test Plan

tox

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Branch
refactor-check
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 13539
Build 20722: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 20721: arc lint + arc unit

Unit TestsFailed

TimeTest
194 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.loader.package.nixguix.tests.test_nixguix::test_eoferror
swh_config = '/tmp/pytest-of-jenkins/pytest-0/test_eoferror0/loader.yml' requests_mock_datadir = <requests_mock.mocker.Mocker object at 0x7fad82049a58>
194 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.loader.package.nixguix.tests.test_nixguix::test_evaluation_branch
swh_config = '/tmp/pytest-of-jenkins/pytest-0/test_evaluation_branch0/loader.yml' requests_mock_datadir = <requests_mock.mocker.Mocker object at 0x7fad82281eb8>
219 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.loader.package.nixguix.tests.test_nixguix::test_loader_incremental
swh_config = '/tmp/pytest-of-jenkins/pytest-0/test_loader_incremental0/loader.yml' requests_mock_datadir = <requests_mock.mocker.Mocker object at 0x7fad821c0128>
200 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.loader.package.nixguix.tests.test_nixguix::test_loader_two_visits
swh_config = '/tmp/pytest-of-jenkins/pytest-0/test_loader_two_visits0/loader.yml' requests_mock_datadir_visits = <requests_mock.mocker.Mocker object at 0x7fad8229eb38>
194 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.loader.package.nixguix.tests.test_nixguix::test_raise_exception
swh_config = '/tmp/pytest-of-jenkins/pytest-0/test_raise_exception0/loader.yml' requests_mock_datadir = <requests_mock.mocker.Mocker object at 0x7fad8219dac8> mocker = <pytest_mock.plugin.MockFixture object at 0x7fad821df940>
View Full Test Results (5 Failed · 137 Passed)

Event Timeline

ardumont created this revision.Jul 9 2020, 4:17 PM

Build was aborted

Patch application report for D3478 (id=12309)

Could not rebase; Attempt merge onto 39ffcde657...

Updating 39ffcde..0bd2f4a
Fast-forward
 swh/loader/tests/__init__.py  | 113 ++++++++++++++---
 swh/loader/tests/test_init.py | 288 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 342 insertions(+), 59 deletions(-)
Changes applied before test
commit 0bd2f4a60809940e6c73405f3706310c7212cb13
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Related to T2483

commit 5d02595567ba777aa081a740b86db956e0e8ca2d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 8 17:36:21 2020 +0200

    check_snapshot: Allow to check using both Snapshot and dict
    
    So we can smoothly transition to stop using snapshot as dict.
    
    Related to T2483

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/123/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/123/console

ardumont updated this revision to Diff 12318.EditedJul 9 2020, 5:40 PM

Rebase on latest master

Now the build is red because apparently loader-nixguix inserts incomplete
snapshot (I'll check further) and the new check-snapshot implementation detected it (nice or what !?)

Build has FAILED

Patch application report for D3478 (id=12318)

Could not rebase; Attempt merge onto ed4e0b38c1...

Updating ed4e0b3..b48bd3b
Fast-forward
 swh/loader/tests/__init__.py  | 113 ++++++++++++++---
 swh/loader/tests/test_init.py | 288 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 342 insertions(+), 59 deletions(-)
Changes applied before test
commit b48bd3bb7efdbeffccf22b2bd286d914cfc9a13a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Related to T2483

commit b5ccfd7ae14ad8f415417281e4e135779a839a61
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 8 17:36:21 2020 +0200

    check_snapshot: Allow to check using both Snapshot and dict
    
    So we can smoothly transition to stop using snapshot as dict.
    
    Related to T2483

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/126/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/126/console

ardumont edited the test plan for this revision. (Show Details)Jul 9 2020, 5:43 PM
ardumont edited the test plan for this revision. (Show Details)Jul 9 2020, 5:47 PM

(build currently fails because some loaders inserts incomplete data apparently)

ah but yeah the evaluation branch...
took me too long to see it, i'll update check-snapshot so the information is displayed.

And i'll need to add the means to explicitely allow some holes in there.

ardumont updated this revision to Diff 12327.Jul 9 2020, 6:28 PM
  • Open safelist parameter to allow some objects to be unchecked
  • Override check-snapshot for the test_nixguix so tests are fine (using that safelist parameter)
  • Improve reported message when failing the assertions so we know what branch and target is reported missing
ardumont edited the test plan for this revision. (Show Details)Jul 9 2020, 6:28 PM

Build is green

Patch application report for D3478 (id=12327)

Could not rebase; Attempt merge onto ed4e0b38c1...

Updating ed4e0b3..6073467
Fast-forward
 swh/loader/package/nixguix/tests/test_nixguix.py |  13 +-
 swh/loader/tests/__init__.py                     | 127 ++++++++--
 swh/loader/tests/test_init.py                    | 295 +++++++++++++++++++----
 3 files changed, 374 insertions(+), 61 deletions(-)
Changes applied before test
commit 60734673df500704ac91ff8cd36328e070abcad4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Allow to loosen up the check by safe-listing some object type (for example, for
    the nixguix loader, allows the branch "evaluation" with type revision to be non
    resolvable)
    
    Related to T2483

commit b5ccfd7ae14ad8f415417281e4e135779a839a61
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 8 17:36:21 2020 +0200

    check_snapshot: Allow to check using both Snapshot and dict
    
    So we can smoothly transition to stop using snapshot as dict.
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/127/ for more details.

ardumont edited the summary of this revision. (Show Details)Jul 9 2020, 6:38 PM

I have trouble grasping the content of this diff, so I'll leave it to future me / someone else.

Some style comments though:

swh/loader/tests/__init__.py
104–117

way too many newlines here

123

that's not a great name. What about allowed_empty?

154

Could you rename it to branches_by_target_type, or something of the sort?

172

That's not a very helpful comment :/

187–189

you can do the joining directly on this line

192

space should be before the line break

200–202

same

204

concat

swh/loader/tests/test_init.py
60–152

Why .from_dict() instead of building them directly?

ardumont added inline comments.Jul 10 2020, 12:00 PM
swh/loader/tests/__init__.py
123

sounds good.

187–189

inside the string you mean?
(it was there originally but it was unreadable)

swh/loader/tests/test_init.py
60–152

eager to go to test the function and then i forgot to refactor.

Also i wanted initially to reuse the storage_data objects and drop those.
But then, realized they are not consisten in regards to ids as it is here.
so really, i'm missing the refactor step to make those direct objects ;)

Thanks.

ardumont added inline comments.Jul 10 2020, 12:04 PM
swh/loader/tests/__init__.py
104–117

What's the change proposal?
(I did not get the implied solution)

172

"alias" consistency?
I want to check that the alias reference does not target something that does not exist.

vlorentz added inline comments.Jul 10 2020, 12:11 PM
swh/loader/tests/__init__.py
104–117
class InconsistentAliasBranchError(AssertionError):
    """When an alias branch targets an inexistent branch."""
    pass
172

"check that the alias reference does not target something that does not exist"
-> write that ;)

187–189

on the contrary, not inside the string: `missing_objs = ', '.join(str((object_to_branch[rev], rev.hex())) for rev in not_found)

ardumont added inline comments.Jul 10 2020, 12:53 PM
swh/loader/tests/__init__.py
104–117

heh, right, though the pass will get spaced by black (i think) ;)

Thanks for the review, will adapt once i rebase on D3473 ;)

ardumont edited the summary of this revision. (Show Details)Jul 10 2020, 1:09 PM
ardumont updated this revision to Diff 12351.Jul 10 2020, 2:51 PM
ardumont edited the summary of this revision. (Show Details)
ardumont edited the test plan for this revision. (Show Details)
  • Rebase on latest master
  • Adapt according to review

Build is green

Patch application report for D3478 (id=12351)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit b7050398a9cc253c438c172b72b05f19b8016d55
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Allow to loosen up the check by safe-listing some object type (for example, for
    the nixguix loader, allows the branch "evaluation" with type revision to be non
    resolvable)
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/131/ for more details.

ardumont added inline comments.Jul 10 2020, 3:01 PM
swh/loader/tests/test_init.py
74

Still need to convert those into model objects (incoming heh ;)

ardumont updated this revision to Diff 12352.Jul 10 2020, 3:11 PM

adapt according to review: make model objects ;)

ardumont added inline comments.Jul 10 2020, 3:12 PM
swh/loader/tests/test_init.py
455

That's not needed, drop it.

Build is green

Patch application report for D3478 (id=12352)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit 016405ca812f49b7c60517f799423494c52c819e
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Allow to loosen up the check by safe-listing some object type (for example, for
    the nixguix loader, allows the branch "evaluation" with type revision to be non
    resolvable)
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/132/ for more details.

ardumont updated this revision to Diff 12353.Jul 10 2020, 3:13 PM

Drop unneeded instructions

Build is green

Patch application report for D3478 (id=12353)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit 04a0e9406155f22cea815f0544a5f75b034deb45
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    Up to revision and release check.
    
    Allow to loosen up the check by safe-listing some object type (for example, for
    the nixguix loader, allows the branch "evaluation" with type revision to be non
    resolvable)
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/133/ for more details.

ardumont updated this revision to Diff 12354.Jul 10 2020, 3:19 PM

Rework commit message

Build is green

Patch application report for D3478 (id=12354)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit f6bc2ff4db2ab4320410b78fdf602e81f0e1ef74
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    When asserting the snapshot is the one expected, this now check the references
    targeted by the snapshot do exist.
    
    This also checks that the alias present in the snapshot actually target
    branches in the snapshot.
    
    For now, those check concern only release and revision objects (not yet up to
    directories and contents).
    
    The function still allows some branches to not be resolvable. Like for example
    the "evaluation" branch in the case of the nixguix loader.
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/134/ for more details.

ardumont updated this revision to Diff 12355.Jul 10 2020, 3:24 PM

Rework commit message (again)

Build is green

Patch application report for D3478 (id=12355)

Rebasing onto 72590f9980...

Current branch diff-target is up to date.
Changes applied before test
commit fe28c44349dd21ec1906058a83d4ca91147ee2e6
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 9 16:05:47 2020 +0200

    Make check_snapshot ensure snapshot object references exist
    
    When asserting the snapshot is the one expected, this now checks the references
    targeted by the snapshot do exist as well.
    
    This also checks that the alias present in the snapshot actually target
    branches in the snapshot.
    
    For now, those checks concern only release and revision objects (it does not
    check up to directories and contents yet).
    
    The function still allows to declare that some branches can be empty. Like for
    example the "evaluation" branch in the case of the nixguix loader.
    
    Related to T2483

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/135/ for more details.

ardumont added inline comments.Jul 10 2020, 4:03 PM
swh/loader/tests/test_init.py
455

done

vlorentz accepted this revision.Jul 10 2020, 4:12 PM
This revision is now accepted and ready to land.Jul 10 2020, 4:12 PM