Make the replayer drop the Revision.metadata
ClosedPublic

Authored by douardda on Apr 2 2021, 4:15 PM.

Details

Summary

This attribute is deprecated and on the verge of being replaced by
RawExtrinsicMetadata objects, and the Kafka journal currently in production
contains a few invalid metadata entries that make the replayer unhappy.

Closes T3201.

Depends on D5413.
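
As a rough illustration of the idea in the summary (not the actual change made to swh/storage/replay.py), the sketch below assumes that revisions arrive from the journal as plain dicts and strips the deprecated `metadata` key before anything tries to validate it. The function name `drop_revision_metadata` and the dict layout are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def drop_revision_metadata(
    revisions: Iterable[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return copies of the revision dicts without their 'metadata' key.

    Hypothetical helper: the real replayer may handle this differently.
    """
    cleaned = []
    for rev in revisions:
        rev = dict(rev)  # shallow copy, so the caller's dict is left untouched
        rev.pop("metadata", None)  # no-op when the key is already absent
        cleaned.append(rev)
    return cleaned
```

Dropping the key before model objects are built means an invalid entry never reaches validation, which is presumably why the replayer no longer fails on those journal messages.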

Event Timeline

Build is green

Patch application report for D5414 (id=19365)

Could not rebase; Attempt merge onto 0a270d1a7a...

Updating 0a270d1a..c45a4a87
Fast-forward
 swh/storage/postgresql/storage.py  |   2 +
 swh/storage/replay.py              |  19 ++++++-
 swh/storage/tests/test_backfill.py |   3 +
 swh/storage/tests/test_replay.py   | 111 ++++++++++++++++++++-----------------
 4 files changed, 83 insertions(+), 52 deletions(-)
Changes applied before test
commit c45a4a870127cd7af48620f0fb0a0787b07675ef
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Apr 2 16:10:28 2021 +0200

    Make the replayer drop the Revision.metadata
    
    this attribute is deprecated and on the verge of being replaced by
    RawExtrinsicMetadata objects, and the kafka journal currently in production
    contains a few invalid metadata entries that makes the replayer unhappy.
    
    Closes #T3201.

commit e27327b798ffb5fb87f269e84fb2305a76b8b734
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Apr 2 12:56:53 2021 +0200

    Make pg Storage.extid_add() write extid objects to the journal
    
    also merge test_replay's _check_replayed and check_replayed in a single
    function.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1242/ for more details.

vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/storage/tests/test_replay.py
33

why the copy?

34–48

more readable IMO

This revision is now accepted and ready to land. Apr 2 2021, 5:05 PM
swh/storage/tests/test_replay.py
33

because I don't want to modify the original TEST_OBJECT dict

34–48

I find it more cryptic, but meh
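
The reply about line 33 (copying so that the shared test data is not modified) can be sketched as follows. This is only an illustration: `FIXTURE` is a hypothetical stand-in for the module-level test data, and the actual test may copy it differently (the sketch uses `copy.deepcopy` because it mutates nested dicts).

```python
import copy

# Hypothetical stand-in for a shared, module-level test fixture.
FIXTURE = {
    "revision": [{"id": 1, "metadata": {"k": "v"}}],
    "release": [{"id": 2}],
}


def fixture_without_revision_metadata():
    """Build a per-test variant without touching the shared fixture."""
    objects = copy.deepcopy(FIXTURE)  # nested dicts are copied too
    for rev in objects["revision"]:
        rev["metadata"] = None  # what the replayed storage is expected to hold
    return objects
```

Without the copy, the first test that blanks the metadata would change what every later test sees in the shared fixture.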

Build is green

Patch application report for D5414 (id=19400)

Could not rebase; Attempt merge onto 0a270d1a7a...

Updating 0a270d1a..bad6fe15
Fast-forward
 swh/storage/postgresql/storage.py  |   2 +
 swh/storage/replay.py              |  19 ++++++-
 swh/storage/tests/storage_tests.py |  16 ++++++
 swh/storage/tests/test_backfill.py |   3 +
 swh/storage/tests/test_replay.py   | 111 ++++++++++++++++++++-----------------
 5 files changed, 99 insertions(+), 52 deletions(-)
Changes applied before test
commit bad6fe15a488302c512b50635bbccc9e19be6bac
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Apr 2 16:10:28 2021 +0200

    Make the replayer drop the Revision.metadata
    
    this attribute is deprecated and on the verge of being replaced by
    RawExtrinsicMetadata objects, and the kafka journal currently in production
    contains a few invalid metadata entries that makes the replayer unhappy.
    
    Closes #T3201.

commit 84dcbe3d0e567157aa03e74ce724e8f3b4bc1f02
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Apr 2 12:56:53 2021 +0200

    Merge test_replay's _check_replayed and check_replayed in a single function

commit 36a7fd34f3ba81a44df97d64b81df48a3b809629
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue Apr 6 15:57:40 2021 +0200

    Fix pg Storage.extid_add(): write ExtID objects to the journal
    
    and explicitly check for extid objects in the journal in TestStorage.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1246/ for more details.

Build is green

Patch application report for D5414 (id=19403)

Could not rebase; Attempt merge onto 0a270d1a7a...

Updating 0a270d1a..39507b24
Fast-forward
 swh/storage/postgresql/storage.py  |   2 +
 swh/storage/replay.py              |  19 ++++++-
 swh/storage/tests/storage_tests.py |  16 ++++++
 swh/storage/tests/test_backfill.py |   3 +
 swh/storage/tests/test_replay.py   | 111 ++++++++++++++++++++-----------------
 5 files changed, 99 insertions(+), 52 deletions(-)
Changes applied before test
commit 39507b24d0f4bfa15347edf422bb3496b3761629
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Apr 2 16:10:28 2021 +0200

    Make the replayer drop the Revision.metadata
    
    this attribute is deprecated and on the verge of being replaced by
    RawExtrinsicMetadata objects, and the kafka journal currently in production
    contains a few invalid metadata entries that makes the replayer unhappy.
    
    Closes T3201.

commit 84dcbe3d0e567157aa03e74ce724e8f3b4bc1f02
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Apr 2 12:56:53 2021 +0200

    Merge test_replay's _check_replayed and check_replayed in a single function

commit 36a7fd34f3ba81a44df97d64b81df48a3b809629
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue Apr 6 15:57:40 2021 +0200

    Fix pg Storage.extid_add(): write ExtID objects to the journal
    
    and explicitly check for extid objects in the journal in TestStorage.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1247/ for more details.