deposit.cli.client: Allow user to define the metadata provenance url
Closed, Public

Authored by ardumont on Feb 21 2022, 6:05 PM.

Details

Summary

If the user provides --metadata-provenance-url, the generated XML forwards that
information to the deposit server. If the user provides the metadata file directly
instead, a warning is logged when that file does not contain the metadata provenance
URL.

$ swh deposit upload  --help | grep -C2 metadata-provenance
                                  server.

  --metadata-provenance-url TEXT  (Optional) Provenance metadata url to
                                  indicate from where the metadata is coming
                                  from.

Related to T3677
Depends on D7210
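
To make the two code paths in the summary concrete, here is a minimal sketch, not the code from this diff. It assumes the provenance URL is carried as a schema:url element inside swh:metadata-provenance, itself inside swh:deposit; the namespace URIs, the function names (build_metadata_xml, has_provenance_url) and the warning text are assumptions made for illustration.

```python
import logging
import xml.etree.ElementTree as ET
from typing import Optional

logger = logging.getLogger(__name__)

# Namespace URIs assumed for this sketch; check them against the deposit docs.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "swh": "https://www.softwareheritage.org/schema/2018/deposit",
    "schema": "http://schema.org/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def build_metadata_xml(title: str, provenance_url: Optional[str] = None) -> str:
    """Build a minimal Atom entry, adding the provenance URL when one is given."""
    entry = ET.Element(f"{{{NS['atom']}}}entry")
    ET.SubElement(entry, f"{{{NS['atom']}}}title").text = title
    if provenance_url:
        deposit = ET.SubElement(entry, f"{{{NS['swh']}}}deposit")
        provenance = ET.SubElement(deposit, f"{{{NS['swh']}}}metadata-provenance")
        ET.SubElement(provenance, f"{{{NS['schema']}}}url").text = provenance_url
    return ET.tostring(entry, encoding="unicode")


def has_provenance_url(xml_text: str) -> bool:
    """Return True if a non-empty provenance URL is present; log a warning otherwise."""
    root = ET.fromstring(xml_text)
    url = root.find("swh:deposit/swh:metadata-provenance/schema:url", NS)
    if url is None or not (url.text or "").strip():
        # Hypothetical wording; the real client message may differ.
        logger.warning("The metadata provenance url is missing from the metadata")
        return False
    return True
```

In this sketch, the first function stands in for the case where the client generates the XML from command-line options, and the second for the case where the user hands over a ready-made metadata file.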

Test Plan

tox

Diff Detail

Repository
rDDEP Push deposit
Branch
store-warning
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 27019
Build 42250: Phabricator diff pipeline on jenkins
Build 42249: arc lint + arc unit

Event Timeline

swh/deposit/tests/data/atom/entry-only-create-origin.xml
8–10

Could you add/keep a test where this is missing?

Build is green

Patch application report for D7214 (id=26144)

Could not rebase; Attempt merge onto 40adc8c23b...

Updating 40adc8c2..74b32928
Fast-forward
 swh/deposit/api/checks.py                          | 30 ++++++--
 swh/deposit/api/private/deposit_check.py           | 80 ++++++++++++----------
 swh/deposit/cli/client.py                          | 59 +++++++++++++---
 swh/deposit/tests/api/test_checks.py               | 80 +++++++++++++++++++---
 .../tests/api/test_deposit_private_check.py        | 21 +++++-
 swh/deposit/tests/cli/test_client.py               | 28 ++++++--
 .../data/atom/entry-data-with-add-to-origin.xml    |  8 ++-
 .../tests/data/atom/entry-only-create-origin.xml   |  4 ++
 8 files changed, 239 insertions(+), 71 deletions(-)
Changes applied before test
commit 74b32928ed4f19dab6e052c666b48371008267ea
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Feb 21 18:01:17 2022 +0100

    deposit.cli.client: Allow user to define the metadata provenance url
    
    If the user is providing the `--metadata-provenance-url`, the xml generated will forward
    that information to the deposit server. If the user is providing the metadata file
    directly, a warning will be logged to notify the user of the missing metadata provenance
    url (if it is missing).
    
    Related to T3677

commit 770cc0f5152d948e32f3f0cf0640f50a17626923
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Feb 21 15:56:55 2022 +0100

    deposit_check: Actually store warning in deposit status detail
    
    Prior to this commit, only rejected deposit were storing problem details. Now that we
    can have warnings even in case of 'verified' deposit, we need to store that details for
    post-analysis.
    
    Note that this also fixes the docstring of the overall class which were out of date
    since the beginning (duplicated from another class).
    
    Related to T3677

commit 339f7dd390ca8cbe8e79a77a1bc2e3c704d5f3f1
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Feb 21 15:26:13 2022 +0100

    api.checks: Warn when suggested fields are missing from metadata
    
    This introduces a new check about the metadata provenance. While it's a suggested field,
    it's definitely something that we want deposit clients to send us. So warn when it's not
    the case. That does not reject the deposit but it's worth keeping that detail in the
    backend.
    
    Related to T3677

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/716/ for more details.
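
The two earlier commits in the report above describe a check that warns about missing suggested fields without rejecting the deposit, and a status detail that is stored even for 'verified' deposits. A rough sketch of that idea follows; the field names, the shape of the detail dictionary, and the function names are hypothetical and not taken from swh/deposit/api/checks.py.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical field names, for illustration only.
MANDATORY_FIELDS = ("title", "author")
SUGGESTED_FIELDS = ("metadata-provenance-url",)

Detail = Dict[str, List[Dict[str, Any]]]


def check_metadata(metadata: Dict[str, Any]) -> Tuple[bool, Detail]:
    """Missing mandatory fields fail the check; missing suggested fields only warn."""
    detail: Detail = {}
    missing = [f for f in MANDATORY_FIELDS if not metadata.get(f)]
    if missing:
        detail["mandatory"] = [{"summary": "Missing mandatory field", "fields": missing}]
    suggested = [f for f in SUGGESTED_FIELDS if not metadata.get(f)]
    if suggested:
        detail["suggested"] = [{"summary": "Missing suggested field", "fields": suggested}]
    return not missing, detail


def deposit_status(metadata: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Keep the detail whatever the outcome, so warnings survive for post-analysis."""
    ok, detail = check_metadata(metadata)
    return {
        "status": "verified" if ok else "rejected",
        "status_detail": detail or None,
    }
```

The point of the design is that the warning travels in the same detail structure as a rejection reason, so a verified deposit can still be inspected later for what it lacked.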

ardumont edited the summary of this revision.

Add missing case (I thought we already had it ¯\_(ツ)_/¯)

ardumont marked an inline comment as done.

Rework test docstring

Build is green

Patch application report for D7214 (id=26146)

Could not rebase; Attempt merge onto 40adc8c23b...

Updating 40adc8c2..d76fc7c0
Fast-forward
 swh/deposit/api/checks.py                          | 30 ++++++--
 swh/deposit/api/private/deposit_check.py           | 80 ++++++++++++----------
 swh/deposit/cli/client.py                          | 59 +++++++++++++---
 swh/deposit/tests/api/test_checks.py               | 80 +++++++++++++++++++---
 .../tests/api/test_deposit_private_check.py        | 21 +++++-
 swh/deposit/tests/cli/test_client.py               | 63 +++++++++++++++--
 .../entry-data-with-add-to-origin-no-prov-url.xml  | 14 ++++
 .../data/atom/entry-data-with-add-to-origin.xml    |  8 ++-
 .../tests/data/atom/entry-only-create-origin.xml   |  4 ++
 9 files changed, 287 insertions(+), 72 deletions(-)
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-add-to-origin-no-prov-url.xml
Changes applied before test
commit d76fc7c0d64d15b604a90f88715e373d5861dcfd
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Feb 21 18:01:17 2022 +0100

    deposit.cli.client: Allow user to define the metadata provenance url
    
    If the user is providing the `--metadata-provenance-url`, the xml generated will forward
    that information to the deposit server. If the user is providing the metadata file
    directly, a warning will be logged to notify the user of the missing metadata provenance
    url (if it is missing).
    
    Related to T3677

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/718/ for more details.

Build is green

Patch application report for D7214 (id=26147)

Could not rebase; Attempt merge onto 40adc8c23b...

Updating 40adc8c2..65f2445e
Fast-forward
 swh/deposit/api/checks.py                          | 30 ++++++--
 swh/deposit/api/private/deposit_check.py           | 80 ++++++++++++----------
 swh/deposit/cli/client.py                          | 59 +++++++++++++---
 swh/deposit/tests/api/test_checks.py               | 80 +++++++++++++++++++---
 .../tests/api/test_deposit_private_check.py        | 21 +++++-
 swh/deposit/tests/cli/test_client.py               | 65 ++++++++++++++++--
 .../entry-data-with-add-to-origin-no-prov-url.xml  | 14 ++++
 .../data/atom/entry-data-with-add-to-origin.xml    |  8 ++-
 .../tests/data/atom/entry-only-create-origin.xml   |  4 ++
 9 files changed, 288 insertions(+), 73 deletions(-)
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-add-to-origin-no-prov-url.xml
Changes applied before test
commit 65f2445eb94ddf2b3538d3f1c1539d33bc95f33f
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Feb 21 18:01:17 2022 +0100

    deposit.cli.client: Allow user to define the metadata provenance url
    
    If the user is providing the `--metadata-provenance-url`, the xml generated will forward
    that information to the deposit server. If the user is providing the metadata file
    directly, a warning will be logged to notify the user of the missing metadata provenance
    url (if it is missing).
    
    Related to T3677

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/719/ for more details.

swh/deposit/cli/client.py
287

did not touch that part ¯\_(ツ)_/¯

swh/deposit/tests/data/atom/entry-only-create-origin.xml
8–10

done, reload the page to see it ;)

To rebase and adapt after D7212 lands.

swh/deposit/tests/cli/test_client.py
944–957

and rename the file accordingly, because this test checks the behavior when <swh:metadata-provenance> is missing, not when <swh:metadata-provenance> is present but the <schema:url> inside it is missing.
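
The distinction drawn in this comment can be sketched as a small classifier that separates "element absent" from "element present without a URL". This is only an illustration: the namespace URIs, the function name and the returned labels are assumptions, not identifiers from the test suite.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for this sketch.
NS = {
    "swh": "https://www.softwareheritage.org/schema/2018/deposit",
    "schema": "http://schema.org/",
}


def provenance_state(xml_text: str) -> str:
    """Tell apart a missing provenance element from one that lacks its URL."""
    root = ET.fromstring(xml_text)
    provenance = root.find(".//swh:metadata-provenance", NS)
    if provenance is None:
        return "no-provenance-element"
    url = provenance.find("schema:url", NS)
    if url is None or not (url.text or "").strip():
        return "provenance-without-url"
    return "ok"
```

A fixture named after the first case should therefore omit the whole element, rather than keep the element and drop only its URL.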

swh/deposit/tests/cli/test_client.py
944–957

not wrong there ;)
will adapt.

Adapted according to val's suggestion

Build is green

Patch application report for D7214 (id=26153)

Rebasing onto 770cc0f515...

Current branch diff-target is up to date.
Changes applied before test
commit 463e6d21001932d8c1ac8987675ffde8b61e2a32
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Feb 21 18:01:17 2022 +0100

    deposit.cli.client: Allow user to define the metadata provenance url
    
    If the user is providing the `--metadata-provenance-url`, the xml generated will forward
    that information to the deposit server. If the user is providing the metadata file
    directly, a warning will be logged to notify the user of the missing metadata provenance
    url (if it is missing).
    
    Related to T3677

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/720/ for more details.

This revision is now accepted and ready to land. (Feb 22 2022, 11:26 AM)

Build is green

Patch application report for D7214 (id=26169)

Rebasing onto a10ed57bf8...

Current branch diff-target is up to date.
Changes applied before test
commit b9f565aaa34c731fb4f4c6832e8115591f0d8c54
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Feb 21 18:01:17 2022 +0100

    deposit.cli.client: Allow user to define the metadata provenance url
    
    If the user is providing the `--metadata-provenance-url`, the xml generated will forward
    that information to the deposit server. If the user is providing the metadata file
    directly, a warning will be logged to notify the user of the missing metadata provenance
    url (if it is missing).
    
    Related to T3677

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/726/ for more details.