Page MenuHomeSoftware Heritage

Add OriginLocator, to list candidate origins to recover objects from
ClosedPublic

Authored by vlorentz on Mar 25 2022, 2:28 PM.

Diff Detail

Repository
rDSCRUB Datastore Scrubber
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 28019
Build 43882: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 43881: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D7432 (id=26899)

Rebasing onto 4144ac789f...

Current branch diff-target is up to date.
Changes applied before test
commit d70dc6b0958ae0ca0b8dc9312eddbbd52191b937
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 23 13:49:19 2022 +0100

    Add OriginLocator, to list candidate origins to recover objects from

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/20/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/20/console

Harbormaster returned this revision to the author for changes because remote builds failed.Mar 25 2022, 2:29 PM
Harbormaster failed remote builds in B27813: Diff 26899!

Build has FAILED

Patch application report for D7432 (id=26901)

Rebasing onto 4144ac789f...

Current branch diff-target is up to date.
Changes applied before test
commit ce497b9800818cf5929301a4a6f174a019120038
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 23 13:49:19 2022 +0100

    Add OriginLocator, to list candidate origins to recover objects from

Link to build: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/22/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/22/console

Harbormaster returned this revision to the author for changes because remote builds failed.Mar 25 2022, 2:44 PM
Harbormaster failed remote builds in B27815: Diff 26901!

Build fails because the latest swh-graph release can't be uploaded on pypi

Build is green

Patch application report for D7432 (id=26901)

Rebasing onto 4144ac789f...

Current branch diff-target is up to date.
Changes applied before test
commit ce497b9800818cf5929301a4a6f174a019120038
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Mar 23 13:49:19 2022 +0100

    Add OriginLocator, to list candidate origins to recover objects from

See https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/27/ for more details.

You are missing the diff description.
It's not only about the need described by the diff title.

You are actually introducing the origin locator but also:

  • changing the schema to invert position of the id to check and the provenance column
  • modifying some internal details (regarding parameter position as well)

I'm unclear whether that needs to go in the same diff though? The diff is a tad big to review.
Anyway, some suggestions, questions inline.

Cheers,

swh/scrubber/cli.py
157

TTBOMU, you are missing the same check on the 'graph' key.

swh/scrubber/db.py
112

Don't you want some exclusion?
Otherwise, we'll probably check multiple times the same objects.
Not that big of a deal, just asking.

swh/scrubber/origin_locator.py
33

Can you please add some docstrings here as well?

63

docstring on that one?

83

You are actually introducing the origin locator but also:

  • changing the schema to invert position of the id to check and the provenance column

Because I don't want to send a diff just for that

You are missing the diff description.
It's not only about the need described by the diff title.

  • modifying some internal details (regarding parameter position as well)

What parameter position did I change? And the rest is either needed for this diff, or changed for consistency.

swh/scrubber/cli.py
157

yeah, D7433 adds it

swh/scrubber/db.py
112

Yes, good point. But this function is actually replaced by D7433 so not worth it.

swh/scrubber/db.py
59

What parameter position did I change? And the rest is either needed for this diff, or changed for consistency.

this one ^ for example

swh/scrubber/db.py
59

I changed the type (from a model object to just a SWHID), so I also changed the order for consistency with the DB schema

i forgot to update the diff's status btw.

This revision is now accepted and ready to land.Mar 29 2022, 6:02 PM
This revision was landed with ongoing or failed builds.Mar 30 2022, 1:31 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D7432 (id=27100)

Rebasing onto 21ee7e2f67...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-32-D7432.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DSCRUB/job/tests-on-diff/32/ for more details.