Page MenuHomeSoftware Heritage

deposit_read: Simplify api to return only relevant deposit information
ClosedPublic

Authored by ardumont on Apr 22 2020, 6:15 PM.

Details

Summary

With this, the deposit information needed for the deposit loader to build the
revision are sent in the "deposit" key. It contains the following keys:

  • id: Deposit id
  • collection: collection name
  • client: client name
  • author_date: creation date
  • author: author
  • committer_date: publisher date
  • committer: publisher

This avoids:

  • unnecessary redundancy. Metadata were set in multiple places even though identical.
  • some information were no longer used (branch name, synthetic, revision type, etc...)
Test Plan

tox

Diff Detail

Repository
rDDEP Push deposit
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

ardumont created this revision.Apr 22 2020, 6:15 PM

Build is green

Patch application report for D3047 (id=10829)

Rebasing onto b91abac189...

Current branch diff-target is up to date.
Changes applied before test
commit 823acc6242a909aec1422abd27cf88a194183ee9
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Wed Apr 22 18:13:26 2020 +0200

    deposit_read: Simplify api to return only relevant deposit information
    
    This avoids:
    - avoids unnecessary redundancy: metadata was set in multiple places
    - sending no longer read information

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/5/ for more details.

just worrying over branches ;-)

swh/deposit/api/private/deposit_read.py
215

just checking that a branch master is still created for a deposit..
I can't see it here, but it may be part of the simplification..

swh/deposit/tests/api/test_deposit_private_read_metadata.py
563

who is in charge of the branch?

ardumont added inline comments.Apr 23 2020, 9:49 AM
swh/deposit/api/private/deposit_read.py
215

That entry was no longer used, thus why i removed it.
It's no longer a master branch, it's HEAD, part of the loader package unification as well.

And it's done loader side [1]

[1] https://forge.softwareheritage.org/source/swh-loader-core/browse/master/swh/loader/package/deposit/loader.py$52

ardumont added inline comments.Apr 23 2020, 10:45 AM
swh/deposit/tests/api/test_deposit_private_read_metadata.py
563

the loader (see previous comment for more details).

ardumont updated this revision to Diff 10835.Apr 23 2020, 11:05 AM

Rebase on latest master

Build is green

Patch application report for D3047 (id=10835)

Could not rebase; Attempt merge onto 042adcb6e2...

Updating 042adcb..2d9f2d3
Fast-forward
 swh/loader/package/deposit/loader.py               | 40 +++++------
 .../hello_2.10.json                                | 33 ++-------
 .../hello_2.11.json                                | 34 ++-------
 swh/loader/package/deposit/tests/test_deposit.py   | 81 +++++++++++++++++++++-
 4 files changed, 111 insertions(+), 77 deletions(-)
Changes applied before test
commit 2d9f2d38157955829d0813570c10338b9ab67865
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Wed Apr 22 17:55:27 2020 +0200

    deposit.loader: Build revision out of the deposit api read metadata

commit 96f3e296e70ce0ea2f42d0efb0f793106c813fe1
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Wed Apr 22 17:16:03 2020 +0200

    deposit.loader: Fix revision metadata redundancy in deposit metadata
    
    Related to T2374

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/29/ for more details.

ardumont edited the summary of this revision. (Show Details)Apr 23 2020, 11:11 AM
ardumont updated this revision to Diff 10836.Apr 23 2020, 11:21 AM
ardumont edited the summary of this revision. (Show Details)

Align with review and description adaptation

  • Rename fullname to client
  • Add committer entry (that could be different in the future)
ardumont edited the summary of this revision. (Show Details)Apr 23 2020, 11:22 AM

Build is green

Patch application report for D3047 (id=10836)

Rebasing onto b91abac189...

Current branch diff-target is up to date.
Changes applied before test
commit 4d26fd87e0ce286137ee1aa2d76e7ed5dc346311
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Wed Apr 22 18:13:26 2020 +0200

    deposit_read: Simplify api to return only relevant deposit information
    
    This avoids:
    - avoids unnecessary redundancy: metadata was set in multiple places
    - sending no longer read information

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/6/ for more details.

Could you change the trst with two different persons? for a better coverage of changes?

Could you change the trst with two different persons? for a better coverage of changes?

I'd say yes but we cannot.

That's currently hard-coded in the production code right now with the SWH_PERSON.
If that's no good, i'm willing to change it but i don't know with what exactly, any clues?

swh/deposit/api/private/deposit_read.py
204

@moranegg hard-coded here in "author" and "committer" (i'm only porting the previous behavior heh).

So any no go, can i land it?

ardumont updated this revision to Diff 10870.Apr 23 2020, 4:35 PM

Rebase on latest master

moranegg accepted this revision.Apr 23 2020, 4:36 PM
This revision is now accepted and ready to land.Apr 23 2020, 4:36 PM

Build is green

Patch application report for D3047 (id=10870)

Rebasing onto d1a244c7cd...

Current branch diff-target is up to date.
Changes applied before test
commit 4752ab97240a63119e99899d7d663b0f28e6a949
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Wed Apr 22 18:13:26 2020 +0200

    deposit_read: Simplify api to return only relevant deposit information
    
    This avoids:
    - avoids unnecessary redundancy: metadata was set in multiple places
    - sending no longer read information

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/8/ for more details.