
Initialize DB schema and postgresql storage checker
ClosedPublic

Authored by vlorentz on Mar 16 2022, 12:34 PM.

Details

Test Plan

Tests will fail for now because they depend on D7359.

Diff Detail

Repository
rDSCRUB Datastore Scrubber
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 27489
Build 43014: Phabricator diff pipeline on jenkins
Build 43013: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D7360 (id=26591)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit 2f81a6f554357a040c539619fb7fdc004278de32
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/1/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/1/console

remove hypothesis, add missing deps

Build has FAILED

Patch application report for D7360 (id=26592)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit 356a9957843aa290d0acaea2255488c9419e1887
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/2/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/2/console

replace unnecessary test data

Build has FAILED

Patch application report for D7360 (id=26593)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit 5e1dce96bf29bb372f3bb677fbc77c99bb88d816
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/3/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/3/console

Build has FAILED

Patch application report for D7360 (id=26605)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit f982347f75d0d96bafc6c92f4dea6ca6a4dec737
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/4/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/4/console

Build has FAILED

Patch application report for D7360 (id=26606)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit ccd098c4436b7b09ccde5bd26363d5ff66344565
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/5/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/5/console

add test with more than 1 corrupt object

Build has FAILED

Patch application report for D7360 (id=26608)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit 785ee4f78ad45b87da0ace58b487a9046a990ed8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/6/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/6/console

OK, I started reviewing this; it's long.
It looks OK from afar; I have stopped at test_cli so far.

Can you please:

  • land the diffs this diff depends on (afaict they are accepted now, so afterwards you can trigger the build here again).
  • link this diff to the related task

TIA

waiting for jenkins to release swh-storage

Build has FAILED

Patch application report for D7360 (id=26608)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit 785ee4f78ad45b87da0ace58b487a9046a990ed8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/7/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/7/console

Build has FAILED

Patch application report for D7360 (id=26608)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit 785ee4f78ad45b87da0ace58b487a9046a990ed8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/8/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/8/console

Build has FAILED

Patch application report for D7360 (id=26608)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit 785ee4f78ad45b87da0ace58b487a9046a990ed8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/9/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/9/console

Build has FAILED

Patch application report for D7360 (id=26636)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit a8d2533f341b08ed5cbaf93fe1243e4d52e282c3
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/10/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/10/console

Build has FAILED

Patch application report for D7360 (id=26637)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit 72bcf77907d3c481e38a228b8997ef8a000db076
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/11/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/11/console

Build is green

Patch application report for D7360 (id=26638)

Rebasing onto 5e030a65a9...

Current branch diff-target is up to date.
Changes applied before test
commit be9a35c0c3977cb2b2f4ad2e9378d9cb394611f5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 16 12:33:37 2022 +0100

    Initialize DB schema and postgresql storage checker

See https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/12/ for more details.

lgtm

couple of questions.

'corrupted' feels more correct than 'corrupt' (even though, from a quick survey, 'corrupt' looks to be OK in English too...)

swh/scrubber/db.py
25

?

swh/scrubber/tests/test_storage_postgresql.py
25

neat, i did not realize we could do that ;)

78

not a big fan of the object_ name...

105

These are actually the same test: one on a specific snapshot, the other on multiple snapshots.
Is it necessary to keep the first one?

This revision is now accepted and ready to land. Mar 17 2022, 3:34 PM
swh/scrubber/tests/test_storage_postgresql.py
78

I couldn't think of a better name. obj could work too, but it is discouraged by PEP 8:

If your public attribute name collides with a reserved keyword, append a single trailing underscore to your attribute name. This is preferable to an abbreviation or corrupted spelling.

https://peps.python.org/pep-0008/#designing-for-inheritance
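For illustration, here is a minimal sketch of the trailing-underscore convention quoted above. The `describe` function is hypothetical and not part of the diff. Note that `object` is a builtin rather than a reserved keyword, so applying the PEP 8 rule to it is an extension of the convention, not something the PEP states literally.

```python
import builtins


def describe(object_):
    """Return a short description of `object_`.

    The trailing underscore keeps the name readable while avoiding
    shadowing the builtin `object` (hypothetical helper, for illustration).
    """
    return f"{type(object_).__name__}: {object_!r}"


# The builtin is still reachable under its usual name.
assert object is builtins.object

print(describe(42))  # int: 42
```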

105

The first one checks, for example, that the checker does not report every object after the first corrupt one.
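A rough sketch of why both tests can be useful, using a toy checker rather than the actual swh-scrubber code. Everything here (`compute_id`, `find_corrupt`, the in-memory `store`) is a hypothetical stand-in: the single-corruption case pins down that exactly one object is reported, which the multi-corruption case alone would not distinguish as clearly.

```python
import hashlib
from typing import Iterable, List, Tuple


def compute_id(data: bytes) -> str:
    """Toy intrinsic identifier: the SHA-1 of the content."""
    return hashlib.sha1(data).hexdigest()


def find_corrupt(objects: Iterable[Tuple[str, bytes]]) -> List[str]:
    """Return the stored ids whose content no longer hashes to that id."""
    return [id_ for id_, data in objects if compute_id(data) != id_]


contents = [b"a", b"b", b"c", b"d"]
store = [(compute_id(c), c) for c in contents]

# Case 1: a single corrupt object. A checker that kept reporting every
# object after the first mismatch would also return ids 2 and 3 here.
single = list(store)
single[1] = (store[1][0], b"tampered")
assert find_corrupt(single) == [store[1][0]]

# Case 2: several corrupt objects, each reported exactly once.
multi = list(store)
multi[1] = (store[1][0], b"tampered")
multi[3] = (store[3][0], b"also tampered")
assert find_corrupt(multi) == [store[1][0], store[3][0]]
```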