Page MenuHomeSoftware Heritage

Check updated metadata are correctly stored in the metadata storage
ClosedPublic

Authored by ardumont on Sep 27 2020, 3:39 PM.

Details

Summary

This adds the missing checks on the current metadata part.
This also adds the origin to the metadata stored in the storage metadata (as per review suggestion).

Related to D4013

Depends on D4047

Test Plan

tox

Diff Detail

Repository
rDDEP Push deposit
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D4058 (id=14326)

Could not rebase; Attempt merge onto f26ea44926...

Updating f26ea449..75fb66cb
Fast-forward
 MANIFEST.in                                        |   1 +
 swh/deposit/__init__.py                            |  11 +
 swh/deposit/api/__init__.py                        |   7 -
 swh/deposit/api/checks.py                          |  54 ++++
 swh/deposit/api/common.py                          |  38 ++-
 swh/deposit/api/deposit_update.py                  | 233 +++++++++++++--
 swh/deposit/api/private/deposit_check.py           |  51 +---
 swh/deposit/api/private/deposit_read.py            |  23 +-
 swh/deposit/config.py                              |   6 +
 swh/deposit/tests/api/test_checks.py               |  78 +++++
 .../tests/api/test_deposit_private_check.py        |  77 +----
 .../api/test_deposit_private_read_metadata.py      |   2 +-
 swh/deposit/tests/api/test_deposit_update.py       | 325 ++++++++++++++++++++-
 swh/deposit/tests/conftest.py                      |  30 +-
 .../tests/data/atom/entry-data-empty-body.xml      |   2 +-
 .../entry-data-fail-metadata-functional-checks.xml |   7 +
 swh/deposit/tests/test_init.py                     |   2 +-
 17 files changed, 745 insertions(+), 202 deletions(-)
 create mode 100644 swh/deposit/api/checks.py
 create mode 100644 swh/deposit/tests/api/test_checks.py
 create mode 100644 swh/deposit/tests/data/atom/entry-data-fail-metadata-functional-checks.xml
Changes applied before test
commit 75fb66cb248c1741c23626e0cde55d0e1404eb21
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sun Sep 27 15:36:32 2020 +0200

    Check updated metadata are correctly stored in the metadata storage

commit 6e29d81201a00d206966c62bbca64e3177051c42
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 25 13:31:29 2020 +0200

    Add functional metadata checks prior to updating them
    
    This refactor the common code executed by the checker so we functionally check
    everything the same way.

commit f23ad6cb2a550d94b0a81b1d718f24256acb94e7
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 22 19:15:19 2020 +0200

    Allow deposit metadata update on deposit already completed
    
    This will allow to update the metadata of a deposit in "done" state. The
    metadata update occurs both in the deposit backend and in the metadata
    storage (provided some basic checks pass).

commit 45c7c35e471ce28529bc495a076d10e2c67b9cf7
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sat Sep 26 18:37:50 2020 +0200

    Fix deposit atom test to actually check the response message
    
    Prior to this commit, the actual test was wrongfully ok because the slug was
    missing. This resulted in a 400 response which passed because we missed
    checking the actual response content. This now fixes it.

commit b04c457d65689ae5e278edac7faee34bb3a2908f
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 25 15:15:37 2020 +0200

    Fix transition deposit to deposited state with empty post on SE IRI
    
    Prior to this the content-length could be None instead of 0, failing the
    transition.
    
    As the coverage was missing, that peculiar case was missed. This fixes it.

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/145/ for more details.

vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/deposit/tests/api/test_deposit_update.py
685–695
assert page_results == PagedResult(
    results=[
        RawExtrinsicMetadata(
            type=MetadataTargetType.DIRECTORY,
            id=directory_swhid,
            authority=attr.evolve(metadata_authority, metadata=None),
            fetcher=attr.evolve(metadata_fetcher, metadata=None),
            format=..., metadata=...,
            origin=...,
        )
    ],
    next_page=token=None,
)
This revision now requires changes to proceed.Sep 28 2020, 12:02 PM

Note that some checks are commented because they currently fail. Those are
storage issues. The fetcher and authority metadata fields are not properly set
when read from the raw_extrinsic_metadata_get endpoint somehow ¯\_(ツ)_/¯.
That should most probably be the subject for another storage diff.

seen with val, it's because the resulted authority and fetcher are meant to be used as id and not as raw object...
reworked according to suggestion.

You should add an origin context to the RawExtrinsicMetadata object, to be consistent with the deposit loader.

Build is green

Patch application report for D4058 (id=14331)

Could not rebase; Attempt merge onto 45c7c35e47...

Updating 45c7c35e..56498f0d
Fast-forward
 MANIFEST.in                                        |   1 +
 swh/deposit/__init__.py                            |  11 +
 swh/deposit/api/__init__.py                        |   7 -
 swh/deposit/api/checks.py                          |  54 ++++
 swh/deposit/api/common.py                          |  38 +--
 swh/deposit/api/deposit_update.py                  | 205 +++++++++++++--
 swh/deposit/api/private/deposit_check.py           |  51 +---
 swh/deposit/api/private/deposit_read.py            |  23 +-
 swh/deposit/config.py                              |   6 +
 swh/deposit/tests/api/test_checks.py               |  78 ++++++
 swh/deposit/tests/api/test_deposit_atom.py         |   2 -
 .../tests/api/test_deposit_private_check.py        |  77 +-----
 .../api/test_deposit_private_read_metadata.py      |   2 +-
 swh/deposit/tests/api/test_deposit_update.py       | 284 ++++++++++++++++++++-
 swh/deposit/tests/conftest.py                      |  30 +--
 .../entry-data-fail-metadata-functional-checks.xml |   7 +
 swh/deposit/tests/test_init.py                     |   2 +-
 17 files changed, 681 insertions(+), 197 deletions(-)
 create mode 100644 swh/deposit/api/checks.py
 create mode 100644 swh/deposit/tests/api/test_checks.py
 create mode 100644 swh/deposit/tests/data/atom/entry-data-fail-metadata-functional-checks.xml
Changes applied before test
commit 56498f0d69c8c242ad8d7d60e7527b9d99d41995
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sun Sep 27 15:36:32 2020 +0200

    Check updated metadata are correctly stored in the metadata storage

commit 92f99745f35b95b1717d9344bd9b4e3f74b0cc20
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 25 13:31:29 2020 +0200

    Add functional metadata checks prior to updating them
    
    This refactor the common code executed by the checker so we functionally check
    everything the same way.

commit d1f3f69c9016ad6bb7d49515fc187dd1c85039eb
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 22 19:15:19 2020 +0200

    Allow deposit metadata update on deposit already completed
    
    This will allow to update the metadata of a deposit in "done" state. The
    metadata update occurs both in the deposit backend and in the metadata
    storage (provided some basic checks pass).

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/148/ for more details.

Add origin to the raw extrinsic metadata stored in the metadata storage

Build is green

Patch application report for D4058 (id=14334)

Could not rebase; Attempt merge onto d1f3f69c90...

Updating d1f3f69c..36d6b760
Fast-forward
 MANIFEST.in                                        |  1 +
 swh/deposit/api/checks.py                          | 54 +++++++++++++
 swh/deposit/api/deposit_update.py                  | 12 +++
 swh/deposit/api/private/deposit_check.py           | 51 +------------
 swh/deposit/tests/api/test_checks.py               | 78 +++++++++++++++++++
 .../tests/api/test_deposit_private_check.py        | 77 +------------------
 swh/deposit/tests/api/test_deposit_update.py       | 89 +++++++++++++++++++++-
 .../entry-data-fail-metadata-functional-checks.xml |  7 ++
 8 files changed, 243 insertions(+), 126 deletions(-)
 create mode 100644 swh/deposit/api/checks.py
 create mode 100644 swh/deposit/tests/api/test_checks.py
 create mode 100644 swh/deposit/tests/data/atom/entry-data-fail-metadata-functional-checks.xml
Changes applied before test
commit 36d6b760f4a2746b01dddf788bffc81bd7b36742
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Sun Sep 27 15:36:32 2020 +0200

    Check updated metadata are correctly stored in the metadata storage

commit 92f99745f35b95b1717d9344bd9b4e3f74b0cc20
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 25 13:31:29 2020 +0200

    Add functional metadata checks prior to updating them
    
    This refactor the common code executed by the checker so we functionally check
    everything the same way.

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/150/ for more details.

swh/deposit/tests/api/test_deposit_update.py
682–683

so that's the raw XML, right?

swh/deposit/tests/api/test_deposit_update.py
682–683

yes, in the end, as we were already inconsistent as to what swhid we attached the metadata too (revision with the loader and not the directory), i did not have any more restriction to change this anyways to actually do the right thing...

This revision is now accepted and ready to land.Sep 28 2020, 2:00 PM