Page MenuHomeSoftware Heritage

Refactor the Storage.add_origin API
ClosedPublic

Authored by douardda on Jun 22 2020, 11:37 AM.

Details

Summary

make it consistent with other add_xxx methods, ie. make it return a
summary dict.

Test Plan

Tests of all dependencies (but swh-web, see D3326) should pass OK.

Diff Detail

Unit TestsFailed

TimeTest
742 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_cassandra.TestCassandraStorage::test_origin_add
self = <swh.storage.tests.test_cassandra.TestCassandraStorage object at 0x7f9f643d3eb8> swh_storage = <swh.storage.validate.ValidatingProxyStorage object at 0x7f9f643d1908>
699 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_cassandra.TestCassandraStorage::test_origin_add_from_generator
self = <swh.storage.tests.test_cassandra.TestCassandraStorage object at 0x7f9f9c5e20b8> swh_storage = <swh.storage.validate.ValidatingProxyStorage object at 0x7f9fd02fb278>
610 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_cassandra.TestCassandraStorage::test_origin_add_twice
self = <swh.storage.tests.test_cassandra.TestCassandraStorage object at 0x7f9f9c2f52e8> swh_storage = <swh.storage.validate.ValidatingProxyStorage object at 0x7f9f94287a20>
717 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_cassandra.TestCassandraStorage::test_origin_metadata_add
self = <swh.storage.tests.test_cassandra.TestCassandraStorage object at 0x7f9f00516be0> swh_storage = <swh.storage.validate.ValidatingProxyStorage object at 0x7f9f0006e6d8>
699 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_cassandra.TestCassandraStorage::test_origin_metadata_add_dict
self = <swh.storage.tests.test_cassandra.TestCassandraStorage object at 0x7f9f00727668> swh_storage = <swh.storage.validate.ValidatingProxyStorage object at 0x7f9ef871a518>
View Full Test Results (21 Failed · 727 Passed · 17 Skipped)

Event Timeline

Build has FAILED

Patch application report for D3327 (id=11786)

Rebasing onto 041543df09...

First, rewinding head to replay your work on top of it...
Applying: Refactor the Storage.add_origin API
Changes applied before test
commit 5bb22b8d6b629cba0a86d574d7e9748e17b5232c
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Jun 18 18:28:51 2020 +0200

    Refactor the Storage.add_origin API
    
    make it consistent with other add_xxx methods, ie. make it return a
    summary dict.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/323/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/323/console

Build has FAILED

Patch application report for D3327 (id=11796)

Rebasing onto 041543df09...

First, rewinding head to replay your work on top of it...
Applying: Refactor the Storage.add_origin API
Changes applied before test
commit 4a8e2cf9d048e6939ca8fa37f1a632c7b876b2c2
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Jun 18 18:28:51 2020 +0200

    Refactor the Storage.add_origin API
    
    make it consistent with other add_xxx methods, ie. make it return a
    summary dict.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/326/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/326/console

Build has FAILED

Patch application report for D3327 (id=11796)

Rebasing onto 041543df09...

First, rewinding head to replay your work on top of it...
Applying: Refactor the Storage.add_origin API
Changes applied before test
commit b666b60209ddac2a000d5e97362f3b3df932466f
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Jun 18 18:28:51 2020 +0200

    Refactor the Storage.add_origin API
    
    make it consistent with other add_xxx methods, ie. make it return a
    summary dict.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/329/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/329/console

vlorentz added a subscriber: vlorentz.

Could you change the first line of your commit message to explain what it changed?

swh/storage/cassandra/storage.py
766

It is, but it doesn't matter. If an origin gets inserted twice at the same time, both entries have the same primary key, so only the latter is kept.

767–771

Remove the brackets around the call to origin_get

This revision now requires changes to proceed.Jun 22 2020, 4:37 PM
swh/storage/cassandra/storage.py
766

ok thx

767–771

thx again

swh/storage/cassandra/storage.py
766

Now I remember: what may be inconsistent is the number of origin added retruned in the status dict {"origin:add": n} but I guess it's not really a problem.

swh/storage/cassandra/storage.py
766

yup.

all stats are approximates anyway

reword the commit message and fix cassandra impl

Build has FAILED

Patch application report for D3327 (id=11813)

Rebasing onto 2d497ff07c...

First, rewinding head to replay your work on top of it...
Applying: Make Storage.add_origin() return a sumary dict
Changes applied before test
commit 935e5c71b1bdc72805ea69ca569a7bca5af800f6
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Jun 18 18:28:51 2020 +0200

    Make Storage.add_origin() return a sumary dict
    
    make it consistent with other add_xxx methods by making it return a
    summary dict `{"origin:add": int}`.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/331/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/331/console

This revision is now accepted and ready to land.Jun 22 2020, 5:42 PM
ardumont added inline comments.
swh/storage/cassandra/storage.py
767–771

To repeat what i've said on irc, I think that filtering might be probably the root cause of the current tests failing...

add a revision to fix test_origin_visit_status_add_twice

needed for the tests to pass ok with the cassandra backend after
the refactoring.

Build has FAILED

Patch application report for D3327 (id=11836)

Rebasing onto 2d497ff07c...

First, rewinding head to replay your work on top of it...
Applying: Simplify/fix test_origin_visit_status_add_twice
Applying: Make Storage.add_origin() return a sumary dict
Changes applied before test
commit 22b602541c9919557cc9c122d8e24912d0be2d56
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Jun 18 18:28:51 2020 +0200

    Make Storage.add_origin() return a sumary dict
    
    make it consistent with other add_xxx methods by making it return a
    summary dict `{"origin:add": int}`.

commit d706e200687f569c05cceebc8645a8cfde22a9ab
Author: David Douard <david.douard@sdfa3.org>
Date:   Mon Jun 22 17:47:26 2020 +0200

    Simplify/fix test_origin_visit_status_add_twice
    
    do not make assumptions on how the backend storage will manage idempotence
    related to the journal.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/335/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/335/console

remove the previously added revision and fix the actual bug instead

[...]

Build is green

Patch application report for D3327 (id=11840)

Rebasing onto 2d497ff07c...

Current branch diff-target is up to date.
Changes applied before test
commit fb603e1b9874a3bbb393f7653ec5004bfa18b251
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Jun 18 18:28:51 2020 +0200

    Make Storage.add_origin() return a sumary dict
    
    make it consistent with other add_xxx methods by making it return a
    summary dict `{"origin:add": int}`.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/337/ for more details.