Page MenuHomeSoftware Heritage

Make release_add support adding the same object twice in the same call
ClosedPublic

Authored by douardda on Mar 5 2020, 2:20 PM.

Details

Summary

This is an edge case, but the mirror infrastructure is apparently hitting it. We
modify the SQL query to be properly idempotent.

Depends on D2773

Test Plan

added new unit test

Diff Detail

Repository
rDSTO Storage manager
Branch
arcpatch-D2771
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 11435
Build 17325: Phabricator diff pipeline on jenkinsJenkins · Jenkins console · Jenkins
Build 17324: arc lint + arc unit

Event Timeline

I'm guessing you don't want to do the same change for other object types, but you're introducing a subtle difference in the behavior of release_add vs the other ones.
Are you sure you want that?

swh/storage/sql/40-swh-func.sql
685

why the where is needed? the python code already calls release_missing in the same transaction

I'm guessing you don't want to do the same change for other object types, but you're introducing a subtle difference in the behavior of release_add vs the other ones.
Are you sure you want that?

Adding contents/skipped_contents already has that behavior; If anything we should align other endpoints to do the same thing. So yes, I'm sure I want that :)

In D2771#66215, @olasd wrote:

So yes, I'm sure I want that :)

I meant, are you sure you don't want to do the same thing for the other endpoints as well?

This revision is now accepted and ready to land.Mar 5 2020, 3:58 PM
vlorentz requested changes to this revision.Mar 6 2020, 1:19 AM

I really think we should either have it for all object types or none at all.

This revision now requires changes to proceed.Mar 6 2020, 1:19 AM

I really think we should either have it for all object types or none at all.

I agree it should be done everywhere, that said, this fix problems we have right now, so no reason not merge it as is.

swh/storage/sql/40-swh-func.sql
685

ping?

swh/storage/sql/40-swh-func.sql
685

It'd be great if we could just paste that snippet into an explain...
we can't because the tmp_release does not exist, /me is sad.

swh/storage/sql/40-swh-func.sql
685

It's not strictly needed; it's more of a belt + suspenders kind of thing.

I really think we should either have it for all object types or none at all.

I created T2316.

swh/storage/sql/40-swh-func.sql
685

I don't get the point, but whatever...

This revision is now accepted and ready to land.Mar 16 2020, 4:20 PM

Noop change to trigger build

This revision is now accepted and ready to land.Mar 27 2020, 6:49 PM
douardda edited reviewers, added: olasd; removed: douardda.

rebase and improve a bit the test as well as the cassandra and in_memory backends

Build has FAILED

Patch application report for D2771 (id=11961)

Rebasing onto 10443b8a17...

Current branch diff-target is up to date.
Changes applied before test
commit ca0cb992b9d48ce24ad96b2f659e85a1cc51da88
Author: David Douard <david.douard@sdfa3.org>
Date:   Mon Jun 29 16:23:45 2020 +0200

    Make release_add support adding the same object twice in the same call
    
    This is an edge case, but the mirror infrastructure is apparently hitting it.
    We modify the SQL query to be properly idempotent.
    
    Also ensure in_memory and cassandra backends behave the same.
    
    Note: this revision was mostly written by
          Nicolas Dandrimont <nicolas@dandrimont.eu>.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/385/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/385/console

Build is green

Patch application report for D2771 (id=11962)

Rebasing onto 10443b8a17...

Current branch diff-target is up to date.
Changes applied before test
commit dc1878b9fb507a64e4bf8a601efecdd1be8aa2d3
Author: David Douard <david.douard@sdfa3.org>
Date:   Mon Jun 29 16:23:45 2020 +0200

    Make release_add support adding the same object twice in the same call
    
    This is an edge case, but the mirror infrastructure is apparently hitting it.
    We modify the SQL query to be properly idempotent.
    
    Also ensure in_memory and cassandra backends behave the same.
    
    Note: this revision was mostly written by
          Nicolas Dandrimont <nicolas@dandrimont.eu>.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/386/ for more details.