Page MenuHomeSoftware Heritage

storage*: Align origin-visit-add to take iterable of OriginVisit objects as input
ClosedPublic

Authored by ardumont on Jun 11 2020, 3:26 PM.

Details

Summary

This makes its api consistent with other add endpoints (see
origin-visit-status-add as one example).

This is preparatory work towards removing origin-visit-upsert.

It's similar to origin-visit-upsert now but it does slighly more:

  1. it checks for origin existence (raise if unknown)
  2. it's able to either deal with visit with id (replayer case) and visit with no id (the existing use case)

It does also less by delegating the validation to the model instantiation
(prior to the call of the endpoint) which is a good thing.

This also drops the dateutil.parser we want to phase out.

Related to T2310

Test Plan

tox

Diff Detail

Repository
rDSTO Storage manager
Branch
origin-visit-add
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 12776
Build 19434: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 19433: arc lint + arc unit

Unit TestsFailed

TimeTest
5,058 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_kafka_writer::test_storage_direct_writer
kafka_prefix = 'xhasbvnnlq', kafka_server = '127.0.0.1:44385' consumer = <cimpl.Consumer object at 0x7f50ff7d3a60>
10,029 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_replay::test_storage_play_with_collision
replayer_storage_and_client = (<swh.storage.in_memory.InMemoryStorage object at 0x7f50fc589048>, <swh.journal.client.JournalClient object at 0x7f50f48be860>) caplog = <_pytest.logging.LogCaptureFixture object at 0x7f50fd5ec080>
10,034 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_replay::test_storage_replayer
replayer_storage_and_client = (<swh.storage.in_memory.InMemoryStorage object at 0x7f50f7258b00>, <swh.journal.client.JournalClient object at 0x7f50f9359940>) caplog = <_pytest.logging.LogCaptureFixture object at 0x7f50f93929e8>
4 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.fixer::swh.storage.fixer._fix_content
2 msJenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.fixer::swh.storage.fixer._fix_origin
View Full Test Results (3 Failed · 763 Passed · 17 Skipped)

Event Timeline

ardumont edited the summary of this revision. (Show Details)
  • Fix mispelled comment
  • Make sure we *mostly* write to the journal prior to the backend when possible (not possible for the pg backend in the case of new origin visit).

Build has FAILED

Patch application report for D3262 (id=11568)

Rebasing onto 5d6163342f...

Current branch diff-target is up to date.
Changes applied before test
commit 2e88faeccc5d6cab044dcc83ab131fd92818a718
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jun 11 14:57:02 2020 +0200

    storage*: Align origin-visit-add to take iterable of OriginVisit objects
    
    This makes its api consistent with other add endpoints.
    
    This is preparatory work towards removing origin-visit-upsert.
    
    Related to T2310

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/254/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/254/console

Build has FAILED

Patch application report for D3262 (id=11569)

Rebasing onto 5d6163342f...

Current branch diff-target is up to date.
Changes applied before test
commit 73c45b82779da698b3f7f1bcdd1ebd67752fd801
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jun 11 14:57:02 2020 +0200

    storage*: Align origin-visit-add to take iterable of OriginVisit objects
    
    This makes its api consistent with other add endpoints.
    
    This is preparatory work towards removing origin-visit-upsert.
    
    Related to T2310

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/255/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/255/console

  • Fix most replayer missing origin-visit-add calls

Expects tests to be a tad red again (replayer wise only) because of the
in-memory storage implementation on passing existing origin-visit... unsure as
how to properly deal with that one yet.

Build has FAILED

Patch application report for D3262 (id=11571)

Rebasing onto 5d6163342f...

Current branch diff-target is up to date.
Changes applied before test
commit d95aeca3d79b59268322959cb058f876d10b93bd
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jun 11 16:18:01 2020 +0200

    in-memory: wip fix (unsure as what to do for the idempotency case for now)

commit 691cdc54a3fc78e8f281091d36df06b48d1da354
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jun 11 14:57:02 2020 +0200

    storage*: Align origin-visit-add to take iterable of OriginVisit objects
    
    This makes its api consistent with other add endpoints.
    
    This is preparatory work towards removing origin-visit-upsert.
    
    Related to T2310

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/256/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/256/console

ardumont retitled this revision from storage*: Align origin-visit-add to take iterable of OriginVisit objects to storage*: Align origin-visit-add to take iterable of OriginVisit objects as input.Jun 11 2020, 5:13 PM

Build is green

Patch application report for D3262 (id=11572)

Rebasing onto 5d6163342f...

Current branch diff-target is up to date.
Changes applied before test
commit ceed32bc9c95e2c997beaeae2f447253291e58cd
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jun 11 14:57:02 2020 +0200

    storage*: Align origin-visit-add to take iterable of OriginVisit objects
    
    This makes its api consistent with other add endpoints.
    
    This is preparatory work towards removing origin-visit-upsert.
    
    Related to T2310

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/257/ for more details.

vlorentz added inline comments.
swh/storage/cassandra/storage.py
810

visit = attr.evolve(visit, visit=visit_id)

swh/storage/in_memory.py
817–819

same

swh/storage/storage.py
841

same

swh/storage/tests/test_replay.py
73–80

cool

swh/storage/tests/test_storage.py
1623–1633

inconsistent names: expected_ori_visit_dict vs visit_status

1682–1686

nit: (origin_visit1, origin_visits) = swh_storage.origin_visit_add([visit1, visit2])

1683–1684

good

1768–1770

same

1991

may it still raise a ValueError?

1996

dead code

This revision is now accepted and ready to land.Jun 11 2020, 6:12 PM
swh/storage/cassandra/storage.py
810

oh yeah! thanks!

swh/storage/tests/test_storage.py
1623–1633

yeah, i recall it was to avoid too long lines...or something,

i'll see what i can do.

1682–1686

right, i forgot to make a pass to check the ones i missed ;)

1991

no idea, i'll check.

ardumont added inline comments.
swh/storage/tests/test_storage.py
1991

it's origin-visit-update here not add so i guess, yes it still raises.
Also, note that i will tear it down soon so don't be attached too much ;)

ardumont edited the summary of this revision. (Show Details)

Adapt according to review

Build is green

Patch application report for D3262 (id=11576)

Rebasing onto 5d6163342f...

Current branch diff-target is up to date.
Changes applied before test
commit 785e480ed1b54fde0a26094c92ea2583a3a95679
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jun 11 14:57:02 2020 +0200

    storage*: Align origin-visit-add to take iterable of OriginVisit objects
    
    This makes its api consistent with other add endpoints.
    
    This is preparatory work towards removing origin-visit-upsert.
    
    Related to T2310

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/258/ for more details.

Build has FAILED

Patch application report for D3262 (id=11578)

Rebasing onto 5d6163342f...

Current branch diff-target is up to date.
Changes applied before test
commit 37c45300bc44694fddf945bd70540b2ece5e81c3
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jun 11 14:57:02 2020 +0200

    storage*: Align origin-visit-add to take iterable of OriginVisit objects
    
    This makes its api consistent with other add endpoints.
    
    This is preparatory work towards removing origin-visit-upsert.
    
    Related to T2310

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/259/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/259/console

Build is green

Patch application report for D3262 (id=11578)

Rebasing onto 5d6163342f...

Current branch diff-target is up to date.
Changes applied before test
commit 37c45300bc44694fddf945bd70540b2ece5e81c3
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jun 11 14:57:02 2020 +0200

    storage*: Align origin-visit-add to take iterable of OriginVisit objects
    
    This makes its api consistent with other add endpoints.
    
    This is preparatory work towards removing origin-visit-upsert.
    
    Related to T2310

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/260/ for more details.