Page MenuHomeSoftware Heritage

Move Deposit creation from api/common.py to api/collection.py
ClosedPublic

Authored by vlorentz on Nov 24 2020, 4:35 PM.

Details

Summary

Motivation:

  1. it's not an operation in common, only 'POST Col-IRI' may create a deposit
  2. it makes the BaseAPI._deposit_put function simpler and less complex

This looks like a very long commit, but there are actually four small
changes. In order:

  1. actually move that creation (the large block of code was moved from one file to the other)
  1. make the 'deposit_id' argument of most functions of 'int' type instead of 'Optional[int]', since we now get the id very early by creating it in the Col-IRI view.
  1. unfortunately, this means creating a Deposit and committing it directly to the DB, which changes the behavior of the API, as it used to run checks *before* committing to the DB. Therefore, instead of immediately committing to get the id, it passes the Deposit object, which will be committed later by _deposit_put if all checks pass. This means changing the 'deposit_id: int' argument of many internal functions to 'deposit: Deposit'. This is why the diff looks big, but it's actually a trivial change
  1. updated test_add_metadata_to_unknown_deposit because the API now returns the right error message, as a side-effect.

Diff Detail

Repository
rDDEP Push deposit
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D4581 (id=16273)

Rebasing onto f612cdc00e...

Current branch diff-target is up to date.
Changes applied before test
commit 85486a79e014f681aaab844998b1fa98ccc92b49
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Nov 24 16:32:42 2020 +0100

    Move Deposit creation from api/common.py to api/collection.py
    
    Motivation:
    
    1. it's not an operation in common, only 'POST Col-IRI' may create a deposit
    2. it makes the BaseAPI._deposit_put function simpler and less complex
    
    This looks like a very long commit, but there are actually four small
    changes. In order:
    
    1. actually move that creation (the large block of code was moved
       from one file to the other)
    
    2. make the 'deposit_id' argument of most functions of 'int' type
       instead of 'Optional[int]', since we now get the id very early
       by creating it in the Col-IRI view.
    
    3. unfortunately, this means creating a Deposit and committing it
       directly to the DB, which changes the behavior of the API, as it used
       to run checks *before* committing to the DB.
       Therefore, instead of immediately committing to get the id, it passes
       the Deposit object, which will be committed later by _deposit_put if
       all checks pass.
       This means changing the 'deposit_id: int' argument of many internal
       functions to 'deposit: Deposit'. This is why the diff looks big,
       but it's actually a trivial change
    
    4. updated test_add_metadata_to_unknown_deposit because the API now returns
       the right error message, as a side-effect.

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/430/ for more details.

ardumont added a subscriber: ardumont.

lgtm, a couple of remarks inline.

And thanks.

swh/deposit/api/common.py
892

^ that checks could go away now.

Also the collection_name parameter can go away, it's redundant with the deposit collection field.

And then simplify the check below.

1074

you should pass the collection as well here.

I think then the checks on the collection done with the self.checks method call could be also dropped since it's done by the getter method.

1115–1117

same pass the collection.

This revision now requires changes to proceed.Nov 25 2020, 10:27 AM
swh/deposit/api/common.py
892

Actually, no, because the deposit object is Optional.

But I have an idea to do it in a future diff with an other refactoring.

1074

indeed (for the first point), but I'll do it in a different diff (as it slightly changes the behavior of the API)

I don't understand your second point, what do you mean by "the getter method"?

ok then ;)

swh/deposit/api/common.py
1074

ack

I don't understand your second point, what do you mean by "the getter method"?

i called get_deposit_by_id the getter method for some reason.

Within its implementation, it's getting the deposit and check for collection consistency.
If the collection does not exist, well the deposit will be either not found or the existing deposit's associated collection will be different so an error should be raised.
Thus rendering the next check done in self.checks unnecessary.
(if we pass this call to get_deposit_by_id, all is fine, then the first check in self.checks will always pass)

But ok, let's land this and check further in the subsequent diff ;)

This revision is now accepted and ready to land.Nov 25 2020, 11:24 AM