Page MenuHomeSoftware Heritage

Handle <swh:create_origin> / <swh:add_to_origin> in multipart uploads
ClosedPublic

Authored by vlorentz on Dec 30 2020, 2:41 PM.

Details

Summary

They were only handled in non-multipart ones.

Depends on D4799

Diff Detail

Event Timeline

Build is green

Patch application report for D4800 (id=16982)

Could not rebase; Attempt merge onto b85ca5f976...

Updating b85ca5f9..de71d457
Fast-forward
 docs/specs/metadata_example.xml                    |  58 +-
 docs/specs/spec-meta-deposit.rst                   |  62 +-
 swh/deposit/api/common.py                          | 109 +--
 swh/deposit/tests/api/test_collection.py           | 326 +--------
 .../tests/api/test_collection_add_to_origin.py     | 158 +++++
 swh/deposit/tests/api/test_collection_post_atom.py | 415 +++++++----
 .../tests/api/test_collection_post_binary.py       | 366 +---------
 .../tests/api/test_collection_post_metadata.py     | 275 --------
 .../tests/api/test_collection_post_multipart.py    | 212 ++----
 .../tests/api/test_collection_reuse_slug.py        | 242 +++++++
 swh/deposit/tests/api/test_deposit_update.py       | 775 +--------------------
 swh/deposit/tests/api/test_deposit_update_atom.py  | 608 ++++++++++++++++
 .../tests/api/test_deposit_update_binary.py        | 408 +++++++++++
 swh/deposit/tests/cli/test_client.py               |   3 +-
 swh/deposit/tests/common.py                        |  69 ++
 swh/deposit/tests/conftest.py                      |  25 +-
 16 files changed, 2007 insertions(+), 2104 deletions(-)
 create mode 100644 swh/deposit/tests/api/test_collection_add_to_origin.py
 delete mode 100644 swh/deposit/tests/api/test_collection_post_metadata.py
 create mode 100644 swh/deposit/tests/api/test_collection_reuse_slug.py
 create mode 100644 swh/deposit/tests/api/test_deposit_update_atom.py
 create mode 100644 swh/deposit/tests/api/test_deposit_update_binary.py
Changes applied before test
commit de71d457a68de601850a926537988910c8e224cb
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Dec 30 14:41:06 2020 +0100

    Handle <swh:create_origin> / <swh:add_to_origin> in multipart uploads
    
    They were only handled in non-multipart ones.

commit 489cf772ae78e7810076135a11752f59c51c7ead
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Dec 30 14:39:50 2020 +0100

    Split <swh:create_origin> / <swh:add_to_origin> handling to its own function.
    
    It will be reused by _multipart_upload in the next commit, and makes _atom_entry smaller.

commit 7494221688dfdd61068552eea3a3552b95da4703
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Dec 23 17:37:57 2020 +0100

    Fix formatting of XML snippets in the documentation.

commit 32ae9c190221af68111f2010d515a99d6e04510f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Dec 23 17:37:24 2020 +0100

    Rephrase introduction of the metadata-only deposit documentation

commit 24a3fe5361d9b783c04a5c844ec731a164b626f7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Dec 23 15:09:52 2020 +0100

    tests: Deduplicate boilerplate when POSTing/PUTing an multipart requests.

commit 0f32502adb9f1dd9fb2eb83460afbec039d819ff
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Dec 23 14:48:40 2020 +0100

    tests: Deduplicate boilerplate when POSTing/PUTing an Atom document.

commit f1579dc754fb46f16c24f74d8fe5146cdae760df
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Dec 23 13:21:11 2020 +0100

    tests: Deduplicate boilerplate when POSTing/PUTing a zip archive.
    
    It makes it clearer which are the important arguments for the test.

commit 07b0119f30156d3abcca35988999203615eec3c5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Dec 23 10:42:14 2020 +0100

    Fix '# fmt' directives.
    
    They were not aligned properly, which causes Black to crash:
$ black --version
black, version 20.8b1

$ cat test.py
def f():
    # fmt: off
    if True:
        print()
        # fmt: on

def g():
    pass

$ black test.py
error: cannot format test.py: INTERNAL ERROR: Black produced code that is not equivalent to the source.  Please report a bug on https://github.com/psf/black/issues.  This diff might be helpful: /tmp/blk_phpp_27o.log
Oh no! 💥 💔 💥
1 file failed to reformat.
```

commit edf97f4f613bb1137fb1f15e63a3f199847f6c29
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Dec 22 17:23:34 2020 +0100

Reorganize POST/PUT SWORD tests

To group them more logically.
See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/504/ for more details.

Thanks for investigating this issue.

Should there be an "integration" test for the swh deposit upload command that I (think I) pulled from the docs ad triggered the issue in the first place?

I'm not familiar with the SWORD workflow/spec, but I'm a bit confused by the docs for this function/view, especially the replace_* arguments: wouldn't one be able to end up calling this function multiple times for a given deposit, and end up with a "cumulative" deposit?

In that case, shouldn't the origin_url be set once the full set of metadata is collected, or at least only if we're using replace_metadata?

In D4800#119751, @olasd wrote:

Should there be an "integration" test for the swh deposit upload command that I (think I) pulled from the docs ad triggered the issue in the first place?

Maybe.

I'm not familiar with the SWORD workflow/spec, but I'm a bit confused by the docs for this function/view, especially the replace_* arguments: wouldn't one be able to end up calling this function multiple times for a given deposit, and end up with a "cumulative" deposit?

Yes.

In that case, shouldn't the origin_url be set once the full set of metadata is collected, or at least only if we're using replace_metadata?

You're right, but it's harder to implement that way, and I don't see a good reason to bother with it.
It should be documented, though.

ardumont added a subscriber: ardumont.

lgtm

(I don't know what the issue mentioned is though)

iirc, the replace arguments are used in the update context (which is not cumulative then, well, at least initially, don't recall if that changed since).

swh/deposit/tests/api/test_collection_post_multipart.py
44

oops, a print ;)

This revision is now accepted and ready to land.Jan 4 2021, 11:42 AM
This revision was landed with ongoing or failed builds.Jan 4 2021, 3:50 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D4800 (id=17010)

Rebasing onto ae7bfc215f...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-513-D4800.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/513/ for more details.