
metadata-only: Restrict swhid use of qualifier anchor and visit on snapshot
Closed · Public

Authored by ardumont on Nov 17 2020, 3:30 PM.

Details

Summary

I don't know what a SWHID whose anchor and visit qualifiers both target a
snapshot would mean, but technically it is a valid one (as far as I understand
the grammar [1]).

Nonetheless, we currently have a technical restriction when writing extrinsic
metadata: only one snapshot reference is possible there.

Let's discuss what to do then ;)

In this diff, I propose to log such a case and return a 400 Bad Request.

That's the gist of the discussion: is that reasonable? If not, what should we do
instead?

[1] https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#syntax

Related to T2537 D4475

Test Plan

tox

Diff Detail

Repository
rDDEP Push deposit
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 17143
Build 26460: Phabricator diff pipeline on jenkins
Build 26459: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D4493 (id=15938)

Could not rebase; Attempt merge onto c1a45162d3...

Updating c1a45162..e943e216
Fast-forward
 swh/deposit/api/common.py                          | 145 +++++++++++-
 swh/deposit/api/deposit_update.py                  |  95 +-------
 swh/deposit/config.py                              |   5 +
 swh/deposit/parsers.py                             |  12 +
 swh/deposit/tests/api/test_deposit_metadata.py     | 256 +++++++++++++++++++++
 swh/deposit/tests/api/test_parsers.py              |  19 +-
 swh/deposit/tests/conftest.py                      |   6 +-
 .../tests/data/atom/entry-data-with-origin.xml     |  13 ++
 .../tests/data/atom/entry-data-with-swhid.xml      |  13 ++
 swh/deposit/tests/test_utils.py                    |  61 ++++-
 swh/deposit/utils.py                               |  40 +++-
 11 files changed, 555 insertions(+), 110 deletions(-)
 create mode 100644 swh/deposit/tests/api/test_deposit_metadata.py
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-origin.xml
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-swhid.xml
Changes applied before test
commit e943e2163ee7315a48fe2b251f5b125a6a5dabda
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Nov 17 15:28:27 2020 +0100

    metadata-only: Restrict swhid use of qualifier anchor and visit on snapshot
    
    Related to T2537

commit 85dc1f7f3d910ffff66ef00dd875794997c4b7b3
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Nov 13 18:27:23 2020 +0100

    Adapt existing POST to a collection to allow metadata-only deposit
    
    The following endpoint is adapted to allow it:
    
    POST /1/<collection>
    HEADER: Content-type: application/atom+xml;type=entry
    And the xml provided by the client contains either:
    
    <swh:deposit>
      <swh:reference>
        <swh:object swhid="{swhid}" />
      </swh:reference>
    </swh:deposit>
    or:
    
    <swh:deposit>
      <swh:reference>
        <swh:origin url="{url}" />
      </swh:reference>
    </swh:deposit>
    
    Any invalid swhid raises a 400 bad request. If everything passes, a 201
    response with usual deposit receipt is received by the deposit user. The
    deposit row is updated with status "done", complete_date and reception_date to
    the same and current date, the swhid columns are updated as well if any.
    
    Note: This technically is reusing the existing code from the metadata update
    scenario. The current metadata update still happens the same way as before.
    
    Related to T2537

commit ce7ab89f267d84dabe032d2dcbe26f21b7e165f0
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Nov 13 14:32:05 2020 +0100

    metadata-only: Start deposit
    
    It does nothing right now but add the detection of the metadata-only deposit.
    And fails in case of invalid input.
    
    Related to T2537

Link to build: https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/340/
See console output for more information: https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/340/console
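
For illustration, the request body described in the commit message above could be built roughly as follows. This is only a sketch, not code from the diff: the helper name build_metadata_only_entry is hypothetical, and the namespace URI bound to the swh: prefix is an assumption to check against the deposit documentation. The body would then be sent as POST /1/<collection> with Content-type: application/atom+xml;type=entry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Assumption: namespace URI bound to the swh: prefix (not stated in this diff).
SWH_NS = "https://www.softwareheritage.org/schema/2018/deposit"

# Use readable prefixes instead of the generated ns0/ns1 when serializing.
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("swh", SWH_NS)


def build_metadata_only_entry(*, swhid=None, origin_url=None):
    """Return an Atom entry referencing either a SWHID or an origin URL.

    Hypothetical helper: exactly one of the two references must be given,
    mirroring the "either ... or" wording of the commit message.
    """
    if (swhid is None) == (origin_url is None):
        raise ValueError("provide exactly one of swhid or origin_url")

    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    deposit = ET.SubElement(entry, f"{{{SWH_NS}}}deposit")
    reference = ET.SubElement(deposit, f"{{{SWH_NS}}}reference")
    if swhid is not None:
        ET.SubElement(reference, f"{{{SWH_NS}}}object", {"swhid": swhid})
    else:
        ET.SubElement(reference, f"{{{SWH_NS}}}origin", {"url": origin_url})
    return ET.tostring(entry, encoding="unicode")
```

Passing both references, or neither, raises ValueError before any XML is produced, so a malformed reference block never reaches the server in this sketch.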

ardumont edited the summary of this revision.

Fix test

Build is green

Patch application report for D4493 (id=15939)

Could not rebase; Attempt merge onto c1a45162d3...

Updating c1a45162..be2056b9
Fast-forward
 swh/deposit/api/common.py                          | 145 +++++++++++-
 swh/deposit/api/deposit_update.py                  |  95 +-------
 swh/deposit/config.py                              |   5 +
 swh/deposit/parsers.py                             |  22 ++
 swh/deposit/tests/api/test_deposit_metadata.py     | 256 +++++++++++++++++++++
 swh/deposit/tests/api/test_parsers.py              |  19 +-
 swh/deposit/tests/conftest.py                      |   6 +-
 .../tests/data/atom/entry-data-with-origin.xml     |  13 ++
 .../tests/data/atom/entry-data-with-swhid.xml      |  13 ++
 swh/deposit/tests/test_utils.py                    |  61 ++++-
 swh/deposit/utils.py                               |  40 +++-
 11 files changed, 565 insertions(+), 110 deletions(-)
 create mode 100644 swh/deposit/tests/api/test_deposit_metadata.py
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-origin.xml
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-swhid.xml
Changes applied before test
commit be2056b943947cdec2395bbb80663a975a4e0793
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Nov 17 15:28:27 2020 +0100

    Refuse SWHID use with qualifier anchor and visit targeting both a snapshot
    
    Related to T2537

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/341/ for more details.

vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/deposit/parsers.py
204–207

'anchor=swh:1:snp:' is not supported when 'visit' is also provided.

This revision now requires changes to proceed. Nov 17 2020, 5:37 PM
ardumont added inline comments.
swh/deposit/parsers.py
204–207

thanks, clearer.

ardumont marked an inline comment as done.
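
As a rough illustration of the restriction discussed in this thread, the check could look something like the sketch below. It is not the code from swh/deposit/parsers.py: the names split_swhid, check_metadata_only_swhid and UnsupportedSwhid are hypothetical, and the parsing is a simplified reading of the SWHID grammar (core identifier followed by ;key=value qualifiers). Only the error message wording comes from the review comment above.

```python
import re

# Core identifier: swh:1:<object type>:<40 lowercase hex digits>
_CORE_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}$")


class UnsupportedSwhid(ValueError):
    """Hypothetical error that a view could translate into a 400 Bad Request."""


def split_swhid(swhid):
    """Split a qualified SWHID into (core identifier, qualifiers dict)."""
    core, *raw_qualifiers = swhid.split(";")
    if not _CORE_RE.match(core):
        raise ValueError(f"invalid core SWHID: {core!r}")
    qualifiers = {}
    for raw in raw_qualifiers:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = raw.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"invalid qualifier: {raw!r}")
        qualifiers[key] = value
    return core, qualifiers


def check_metadata_only_swhid(swhid):
    """Reject a SWHID whose anchor is a snapshot while visit is also set."""
    core, qualifiers = split_swhid(swhid)
    anchor = qualifiers.get("anchor", "")
    if "visit" in qualifiers and anchor.startswith("swh:1:snp:"):
        # Two snapshot references cannot both be stored with the metadata.
        raise UnsupportedSwhid(
            "'anchor=swh:1:snp:' is not supported when 'visit' is also provided."
        )
    return core, qualifiers
```

A snapshot anchor on its own, or a visit combined with a non-snapshot anchor, passes this check; only the combination of the two is refused.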

Adapt according to review

This revision is now accepted and ready to land. Nov 17 2020, 6:14 PM

Build is green

Patch application report for D4493 (id=15945)

Could not rebase; Attempt merge onto c1a45162d3...

Updating c1a45162..071650b7
Fast-forward
 swh/deposit/api/common.py                          | 145 +++++++++++-
 swh/deposit/api/deposit_update.py                  |  95 +-------
 swh/deposit/config.py                              |   5 +
 swh/deposit/parsers.py                             |  20 ++
 swh/deposit/tests/api/test_deposit_metadata.py     | 256 +++++++++++++++++++++
 swh/deposit/tests/api/test_parsers.py              |  21 +-
 swh/deposit/tests/conftest.py                      |   6 +-
 .../tests/data/atom/entry-data-with-origin.xml     |  13 ++
 .../tests/data/atom/entry-data-with-swhid.xml      |  13 ++
 swh/deposit/tests/test_utils.py                    |  61 ++++-
 swh/deposit/utils.py                               |  40 +++-
 11 files changed, 564 insertions(+), 111 deletions(-)
 create mode 100644 swh/deposit/tests/api/test_deposit_metadata.py
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-origin.xml
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-swhid.xml
Changes applied before test
commit 071650b783f764d40fed12d97633a3e6e816c7ce
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Nov 17 15:28:27 2020 +0100

    Refuse SWHID use with qualifier anchor and visit targeting both a snapshot
    
    Related to T2537

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/345/ for more details.

Build is green

Patch application report for D4493 (id=15948)

Could not rebase; Attempt merge onto c1a45162d3...

Updating c1a45162..546f96f4
Fast-forward
 swh/deposit/api/common.py                          | 147 +++++++++++-
 swh/deposit/api/deposit_update.py                  |  92 +-------
 swh/deposit/config.py                              |   5 +
 swh/deposit/parsers.py                             |  20 ++
 swh/deposit/tests/api/test_deposit_metadata.py     | 256 +++++++++++++++++++++
 swh/deposit/tests/api/test_parsers.py              |  21 +-
 swh/deposit/tests/conftest.py                      |   6 +-
 .../tests/data/atom/entry-data-with-origin.xml     |  13 ++
 .../tests/data/atom/entry-data-with-swhid.xml      |  13 ++
 swh/deposit/tests/test_utils.py                    |  61 ++++-
 swh/deposit/utils.py                               |  40 +++-
 11 files changed, 570 insertions(+), 104 deletions(-)
 create mode 100644 swh/deposit/tests/api/test_deposit_metadata.py
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-origin.xml
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-swhid.xml
Changes applied before test
commit 546f96f4f08200ffa48a85839b7065e2ef1f4a0c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Nov 17 15:28:27 2020 +0100

    Refuse SWHID use with qualifier anchor and visit targeting both a snapshot
    
    Related to T2537

commit 406790effb7080befe2eef6b0c7734a751e16870
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Nov 13 14:32:05 2020 +0100

    Adapt existing POST to a collection to allow metadata-only deposit
    
    The following endpoint is adapted to allow it:
    
    POST /1/<collection>
    HEADER: Content-type: application/atom+xml;type=entry
    And the xml provided by the client contains either:
    
    <swh:deposit>
      <swh:reference>
        <swh:object swhid="{swhid}" />
      </swh:reference>
    </swh:deposit>
    or:
    
    <swh:deposit>
      <swh:reference>
        <swh:origin url="{url}" />
      </swh:reference>
    </swh:deposit>
    
    Any invalid swhid raises a 400 bad request. If everything passes, a 201
    response with usual deposit receipt is received by the deposit user. The
    deposit row is updated with status "done", complete_date and reception_date to
    the same and current date, the swhid columns are updated as well if any.
    
    Note: Technically, this is reusing the existing code from the metadata update
    scenario. The current metadata update still happens the same way as before.
    
    Related to T2537

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/347/ for more details.

Build is green

Patch application report for D4493 (id=15954)

Could not rebase; Attempt merge onto c1a45162d3...

Updating c1a45162..5d1932ce
Fast-forward
 swh/deposit/api/common.py                          | 142 ++++++++++-
 swh/deposit/api/deposit_update.py                  |  83 +-----
 swh/deposit/config.py                              |   5 +
 swh/deposit/errors.py                              |  14 ++
 swh/deposit/parsers.py                             |  20 ++
 swh/deposit/tests/api/test_deposit_metadata.py     | 277 +++++++++++++++++++++
 swh/deposit/tests/api/test_parsers.py              |  21 +-
 swh/deposit/tests/conftest.py                      |   6 +-
 .../tests/data/atom/entry-data-with-origin.xml     |  13 +
 ...-with-swhid-fail-metadata-functional-checks.xml |  12 +
 .../tests/data/atom/entry-data-with-swhid.xml      |  13 +
 swh/deposit/tests/test_utils.py                    |  61 ++++-
 swh/deposit/utils.py                               |  40 ++-
 13 files changed, 609 insertions(+), 98 deletions(-)
 create mode 100644 swh/deposit/tests/api/test_deposit_metadata.py
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-origin.xml
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-swhid-fail-metadata-functional-checks.xml
 create mode 100644 swh/deposit/tests/data/atom/entry-data-with-swhid.xml
Changes applied before test
commit 5d1932ce291cdecf78f08642ce6783845c3f616d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Nov 17 15:28:27 2020 +0100

    Refuse SWHID use with qualifier anchor and visit targeting both a snapshot
    
    Related to T2537

commit b86840cd1e2278ebfbcfdc6d57f3384f28fea96c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Nov 13 14:32:05 2020 +0100

    Adapt existing POST to a collection to allow metadata-only deposit
    
    The following endpoint is adapted to allow it:
    
    POST /1/<collection>
    HEADER: Content-type: application/atom+xml;type=entry
    And the xml provided by the client contains either:
    
    <swh:deposit>
      <swh:reference>
        <swh:object swhid="{swhid}" />
      </swh:reference>
    </swh:deposit>
    or:
    
    <swh:deposit>
      <swh:reference>
        <swh:origin url="{url}" />
      </swh:reference>
    </swh:deposit>
    
    Any invalid swhid raises a 400 bad request. If everything passes, a 201
    response with usual deposit receipt is received by the deposit user. The
    deposit row is updated with status "done", complete_date and reception_date to
    the same and current date, the swhid columns are updated as well if any.
    
    Note: Technically, this is reusing the existing code from the metadata update
    scenario. The current metadata update still happens the same way as before.
    
    Related to T2537

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/353/ for more details.