
origin-visit-add: Write origin-visit-status objects to the journal
Closed · Public

Authored by ardumont on Jun 8 2020, 11:56 AM.

Details

Summary

This also makes the instruction order consistent across the different storage
implementations. First, write objects to the journal, then write objects to the
storage backend.
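
The ordering described above can be sketched as follows. This is a minimal illustration, not the code of this diff: `SketchJournal`, `SketchStorage`, the dictionary fields, and the initial "ongoing" status are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class SketchJournal:
    """Hypothetical journal writer: records (object_type, object) pairs in order."""

    log: List[Tuple[str, Dict[str, Any]]] = field(default_factory=list)
    fail: bool = False  # set to True to simulate an unavailable journal

    def write_addition(self, object_type: str, obj: Dict[str, Any]) -> None:
        if self.fail:
            raise RuntimeError("journal unavailable")
        self.log.append((object_type, obj))


@dataclass
class SketchStorage:
    """Hypothetical backend keeping visits and visit statuses in plain lists."""

    journal: SketchJournal
    visits: List[Dict[str, Any]] = field(default_factory=list)
    statuses: List[Dict[str, Any]] = field(default_factory=list)

    def origin_visit_add(self, origin: str, date: str, visit_type: str) -> Dict[str, Any]:
        visit_id = len(self.visits) + 1
        visit = {"origin": origin, "visit": visit_id, "date": date, "type": visit_type}
        # Assumption: a freshly added visit starts with an "ongoing" status.
        status = {"origin": origin, "visit": visit_id, "date": date, "status": "ongoing"}

        # First, write both objects to the journal...
        self.journal.write_addition("origin_visit", visit)
        self.journal.write_addition("origin_visit_status", status)

        # ...then write them to the storage backend.
        self.visits.append(visit)
        self.statuses.append(status)
        return visit
```

In this sketch, a journal failure raises before anything reaches the backend, so the backend never holds a visit that the journal has not seen.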

This does not deal yet with endpoints origin-visit-update and origin-visit-upsert.

Related to T2310

Test Plan

tox

Diff Detail

Repository
rDSTO Storage manager
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 12694
Build 19299: Phabricator diff pipeline on Jenkins
Build 19298: arc lint + arc unit

Unit Tests: Failed

Time        Test
4,027 ms    Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_kafka_writer::test_storage_direct_writer
            kafka_prefix = 'iencbtyler', kafka_server = '127.0.0.1:54133'
            consumer = <cimpl.Consumer object at 0x7f9ed0258d90>
8,092 ms    Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_kafka_writer::test_storage_direct_writer_anonymized
            kafka_prefix = 'kwsrblufgh', kafka_server = '127.0.0.1:54133'
            consumer = <cimpl.Consumer object at 0x7f9ed021e158>
10,071 ms   Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_replay::test_storage_play_with_collision
            replayer_storage_and_client = (<swh.storage.in_memory.InMemoryStorage object at 0x7f9ed01a5e48>, <swh.journal.client.JournalClient object at 0x7f9ed019e898>)
            caplog = <_pytest.logging.LogCaptureFixture object at 0x7f9ed02100f0>
10,024 ms   Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_replay::test_storage_replayer
            replayer_storage_and_client = (<swh.storage.in_memory.InMemoryStorage object at 0x7f9ed020c438>, <swh.journal.client.JournalClient object at 0x7f9ed01ed358>)
            caplog = <_pytest.logging.LogCaptureFixture object at 0x7f9ed0210f28>
8 ms        Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.fixer::swh.storage.fixer._fix_content

Full Test Results: 4 Failed · 762 Passed · 17 Skipped

Event Timeline

Build was aborted

Patch application report for D3238 (id=11485)

Rebasing onto dcef916e5e...

Current branch diff-target is up to date.
Changes applied before test
commit 18de8f6087d230dcd7d09e11c1e4a99af9cb0066
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Jun 8 11:42:13 2020 +0200

    origin-visit-{add|update}: Write origin-visit-status-add to the journal
    
    This also makes the instruction order consistent across the different storage
    implementations. First, write objects to the journal, then write objects to the
    storage backend.
    
    Related to T2310

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/239/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/239/console

ardumont retitled this revision from origin-visit-{add|update}: Write origin-visit-status-add to the journal to origin-visit-add: Write origin-visit-status to the journal.
ardumont edited the summary of this revision.

I first need to update the journal module to handle origin-visit-status, since this diff depends on it.

Reduce the scope to origin-visit-add only (which now also writes origin-visit-status
objects to the journal).

Other diffs will come to deal with origin-visit-update and origin-visit-upsert.

ardumont edited the test plan for this revision.
ardumont added a project: Storage manager.
ardumont edited the test plan for this revision.
swh/storage/in_memory.py
900

I just realigned the implementation with the other backends.

This also aligns the order of the journal writes (that order is tested, and the test was failing because origin-visit-status objects are now written there as well).

Build has FAILED

Patch application report for D3238 (id=11489)

Rebasing onto dcef916e5e...

Current branch diff-target is up to date.
Changes applied before test
commit 3c0203cac19396d70f6770f6962d598c31ce69c0
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Jun 8 11:42:13 2020 +0200

    origin-visit-add: Write visit status to the journal
    
    This also makes the instruction order consistent across the different storage
    implementations. First, write objects to the journal, then write objects to the
    storage backend.
    
    Related to T2310

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/242/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/242/console

swh/storage/cassandra/storage.py
899–900

why this change?

swh/storage/cassandra/storage.py
899–900

It's only a temporary change.

Now that self._origin_visit_status_add writes to the journal, the journal tests have to deal with a mix of origin-visit and origin-visit-status objects with differing dates.

Here I want to deal only with origin-visit-add, without the trouble I have had so far with the update part.
So I'll deal with that in another diff, which will revert this behavior (hence the commented-out code).

If you want, I can add that as a FIXME.

This revision is now accepted and ready to land. (Jun 8 2020, 2:52 PM)
swh/storage/cassandra/storage.py
899–900

So I'll deal with that in another diff which will revert that behavior (thus why it's commented).

D3244 ;)

Build has FAILED

Patch application report for D3238 (id=11489)

Rebasing onto 7eb44d412b...

First, rewinding head to replay your work on top of it...
Applying: origin-visit-add: Write visit status to the journal
Changes applied before test
commit d2308540cc934c30b0459f50eed017791b896c25
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Jun 8 11:42:13 2020 +0200

    origin-visit-add: Write visit status to the journal
    
    This also makes the instruction order consistent across the different storage
    implementations. First, write objects to the journal, then write objects to the
    storage backend.
    
    Related to T2310

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/249/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/249/console

  • Rebase on latest master
  • Fix journal-related tests to take origin-visit-status objects into account
ardumont retitled this revision from origin-visit-add: Write origin-visit-status to the journal to origin-visit-add: Write origin-visit-status objects to the journal. (Jun 9 2020, 2:35 PM)
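
As a rough illustration of what taking origin-visit-status objects into account can mean for a journal test, the helpers below count journal entries per object type and check that every visit has at least one matching status. The helper names and the `(object_type, object)` entry shape are assumptions made for this sketch, not the actual test code of the repository.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Tuple

# Assumed entry shape: (object_type, object-as-dict).
JournalEntry = Tuple[str, Dict[str, Any]]


def count_by_type(entries: Iterable[JournalEntry]) -> Counter:
    """Count journal entries per object type."""
    return Counter(object_type for object_type, _ in entries)


def statuses_match_visits(entries: Iterable[JournalEntry]) -> bool:
    """Return True if every origin_visit has at least one origin_visit_status."""
    entries = list(entries)
    visits = {
        (obj["origin"], obj["visit"])
        for object_type, obj in entries
        if object_type == "origin_visit"
    }
    statuses = {
        (obj["origin"], obj["visit"])
        for object_type, obj in entries
        if object_type == "origin_visit_status"
    }
    return visits <= statuses
```

A test written this way compares per-type counts rather than one flat list, so the additional origin_visit_status entries do not disturb expectations about the other object types.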

Build is green

Patch application report for D3238 (id=11517)

Rebasing onto 7eb44d412b...

Current branch diff-target is up to date.
Changes applied before test
commit 0860920774d1f907047443b4f0604ee6cbe2889b
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Jun 8 11:42:13 2020 +0200

    origin-visit-add: Write visit status to the journal
    
    This also makes the instruction order consistent across the different storage
    implementations. First, write objects to the journal, then write objects to the
    storage backend.
    
    Related to T2310

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/250/ for more details.