- in one of the cases it was redundant to check for emptiness before checking the fields anyway
- checking for emptiness does not make sense, as there are a lot of 'errors' it does not catch in non-obvious ways: a) xmlns attributes count as non-emptiness b) unknown tags/attributes count as non-emptiness
Details
Diff Detail
- Repository
- rDDEP Push deposit
- Branch
- master
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 17070 Build 26346: Phabricator diff pipeline on jenkins Jenkins console · Jenkins Build 26345: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D4469 (id=15861)
Could not rebase; Attempt merge onto 4d7d80c1b8...
Removing swh/deposit/tests/data/atom/tei-sample.xml Merge made by the 'recursive' strategy. docs/endpoints/content.rst | 2 - docs/metadata.rst | 9 --- docs/specs/metadata_example.xml | 1 - docs/specs/spec-loading.rst | 3 +- docs/specs/spec-meta-deposit.rst | 2 - docs/specs/spec-sparse-deposit.rst | 2 - docs/user-manual.rst | 2 - swh/deposit/api/common.py | 27 ++++++-- swh/deposit/api/deposit_update.py | 10 +-- swh/deposit/tests/api/test_deposit.py | 6 +- swh/deposit/tests/api/test_deposit_atom.py | 75 ++++++++-------------- .../api/test_deposit_private_read_metadata.py | 2 +- swh/deposit/tests/api/test_deposit_update.py | 14 ++-- swh/deposit/tests/conftest.py | 2 +- swh/deposit/tests/data/atom/codemeta-sample.xml | 3 +- .../entry-data-fail-metadata-functional-checks.xml | 1 - swh/deposit/tests/data/atom/entry-data-minimal.xml | 3 +- swh/deposit/tests/data/atom/entry-data0.xml | 2 - swh/deposit/tests/data/atom/entry-data1.xml | 1 - swh/deposit/tests/data/atom/entry-data2.xml | 2 +- swh/deposit/tests/data/atom/entry-data3.xml | 1 + swh/deposit/tests/data/atom/error-with-decimal.xml | 2 - .../data/atom/error-with-external-identifier.xml | 7 ++ swh/deposit/tests/data/atom/metadata.xml | 2 - swh/deposit/tests/data/atom/tei-sample.xml | 1 - 25 files changed, 73 insertions(+), 109 deletions(-) create mode 100644 swh/deposit/tests/data/atom/error-with-external-identifier.xml delete mode 100644 swh/deposit/tests/data/atom/tei-sample.xml
Changes applied before test
commit 6bbc72af330eabb4365ee3603048fd2c839457ba Merge: 4d7d80c1 d3f03178 Author: Jenkins user <jenkins@localhost> Date: Thu Nov 12 17:37:39 2020 +0000 Merge branch 'diff-target' into HEAD commit d3f03178191fa65bbe7bf23826a5e43b47173ab4 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 12 18:37:20 2020 +0100 Check document bodies contain mandatory fields, instead of just checking for emptiness. 1. in one of the cases it was redundant to check for emptiness before checking the fields anyway 2. checking for emptiness does not make sense, as there are a lot of 'errors' it does not catch in non-obvious ways: a) xmlns attributes count as non-emptiness b) unknown tags/attributes count as non-emptiness commit 21b667dc79142e2bd7560ead2324543406531800 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 12 18:34:08 2020 +0100 Remove test_post_deposit_atom_entry_tei. We barely support it, and no one uses it. Additionally, a future commit will outright refuse documents without an atom:author or atom:title. commit a85b22ca7e5f3e2e47c3b06b4ea323bb830f1565 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 12 15:01:38 2020 +0100 Remove the <client> tag from the protocol. It didn't match the Atom specification, and isn't used anywhere. commit 4793f2ef85561d473a0b9932367b40d4a449ae7f Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Nov 12 14:47:09 2020 +0100 Remove the <external_identifier> tag from the protocol. It didn't match the Atom specification, duplicates the Slug header, while being optional and specified as 'SHOULD match the Slug'.
See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/327/ for more details.
Maybe the check for emptiness "does not make sense", i don't really see that.
In any case, I disagree with those technical changes.
With this the current deposit behavior changes...
There is a reason for the metadata checks to happen later (as in an asynchronous check task
done by a worker when the status is in deposited state).
It's to be less restrictive on the metadata input.
Because users could decide to send lots of deposit requests.
And only on the last one could there be the required metadata.
As long as the deposit state stays partial, it's fine not to send immediately
those required metadata.
With this, you are forcing the users to provide immediately the minimum required metadata.
The change is thus not exhaustive enough.
And also, what happens if that deposit request metadata is not a final one?
That then forces the subsequent eventual deposit metadata request to also send the requisite metadata.
(or to change the current deposit code to actually check previous deposit metadata request to check if
we already have those mandatory one)...