Page MenuHomeSoftware Heritage

Raise 400 instead of 500 on 'incomplete' multipart request
ClosedPublic

Authored by vlorentz on Oct 12 2022, 10:20 AM.

Details

Reviewers
lunar
ardumont
Group Reviewers
Reviewers
Maniphest Tasks
Restricted Maniphest Task
Commits
rDDEPe0d553df6ffe: Raise 400 instead of 500 on 'incomplete' multipart request
Summary

Resolves T4622

Diff Detail

Repository
rDDEP Push deposit
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D8664 (id=31287)

Rebasing onto 730e81d8dd...

First, rewinding head to replay your work on top of it...
Applying: Raise 400 instead of 500 on 'incomplete' multipart request
Changes applied before test
commit 5d2053f048d3fdb9729f411ab0bbd40924c81e41
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 12 10:19:30 2022 +0200

    Raise 400 instead of 500 on 'incomplete' multipart request

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/865/ for more details.

swh/deposit/api/common.py
561–570

This check feels redundant at this point and I’m not sure it’s adding value to users but there’s no harm either.

590

After reading the documentation, it’s unclear to me why the previous error (raised for providing more than one archive or definition) raised ERROR_CONTENT and this one raises BAD_REQUEST.

Quoting the definition :

12.1.1. sword:ErrorContent

The supplied format is not the same as that identified in the Packaging header and/or that supported by the server

Associated HTTP Status: 415 (Unsupported Media Type) or 406 (Not Acceptable)

12.1.3. sword:ErrorBadRequest

Some parameters sent with the POST were not understood. The server MUST also return a status code of 400 Bad Request.

I feel ERROR_CONTENT is probably the right one in all these cases.

631–632

As you are now raising an exception if filehandler is None at the beginning, I guess this assert could be removed.

ardumont added a subscriber: ardumont.
ardumont added inline comments.
swh/deposit/api/common.py
590

After reading the documentation, it’s unclear to me why the previous error (raised for providing more than one archive or definition)

Reasons... have been lost to the ages...

Although, I do recall a tedious deposit beginning so technical decisions [1] were made so it actually lands and gets deployed.
That must be probably why this ^.

[1] 1 "multipart" request with actually one tarball (archive) and one xml file (metadata). If more or less ~> 400.

I feel ERROR_CONTENT is probably the right one in all these cases.

In this context, that's probably a technical issue with the provided file.
For some reason, it's not a valid file hence the resulting bad request.

This revision is now accepted and ready to land.Oct 17 2022, 6:35 PM
swh/deposit/api/common.py
590

I am quite confused by your answer. I am not sure it matters regarding the actual error code, but as far as I can understand, the SWORDv2 spec mandates that there is only two parts in this case:

The client can create a resource by POSTing a multipart mime message to the Col-IRI with two parts: An Atom Entry containing the metadata for the deposit (known as the Entry Part), and the binary content as the second part (known as the Media Part), […]

Was there a time where swh-deposit behaved differently?

swh/deposit/api/common.py
590

I'm also getting confused and it feels like we're getting trapped in a loop.

After reading the documentation, it’s unclear to me why the previous error (raised
for providing more than one archive or definition)

The client can create a resource by POSTing a multipart mime message to the Col-IRI
with two parts: An Atom Entry containing the metadata for the deposit (known as the
Entry Part), and the binary content as the second part (known as the Media Part), […]

Tentatively trying to break the cycle (maybe? ;p).

That quotes replies it better than i did then.

Are you asking for us to align the error to be the same in both conditional (in the
current diff, lines 578, 563, 591?). Why not, but If so, that's a subject for another
diff (and to me all those feel like "Bad request" anyway).

Was there a time where swh-deposit behaved differently?

It's possible that at an early stage, plenty of issues regarding those steps were not
hardened enough (hence the multitude of conditionals here to alleviate this). But the
gist of the behavior must have been the same (since we used the same spec your are
quoting afair).

vlorentz marked an inline comment as done.

rebase + remove redundant assertion

swh/deposit/api/common.py
590

The two errors added by this diff are caused by expected types being missing. ERROR_CONTENT is when receiving content with unexpected types.

Because multipart requests can have arbitrarily many contents, one does not imply the other (even though they are correlated)

631–632

indeed

Build is green

Patch application report for D8664 (id=31401)

Rebasing onto 730e81d8dd...

Current branch diff-target is up to date.
Changes applied before test
commit af718a5dcb5083b16bc030bb9f4821adfabba9fc
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 12 10:19:30 2022 +0200

    Raise 400 instead of 500 on 'incomplete' multipart request

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/866/ for more details.

This revision was landed with ongoing or failed builds.Oct 27 2022, 2:18 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D8664 (id=31663)

Rebasing onto 9f8535874d...

Current branch diff-target is up to date.
Changes applied before test
commit e0d553df6ffeb5bbcf5c844a3c1836fc84941092
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 12 10:19:30 2022 +0200

    Raise 400 instead of 500 on 'incomplete' multipart request

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/869/ for more details.