Page MenuHomeSoftware Heritage

deposit.loader: Build revision out of the deposit api read metadata
ClosedPublic

Authored by ardumont on Apr 22 2020, 6:16 PM.

Details

Summary

Read "deposit" information out of the deposit read api and let the loader build
the revision out of those information.

Prior to this commit, this read the "revision" entry with data already
pre-built.

Related to D3047

Test Plan

tox

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D3048 (id=10830)

Could not rebase; Attempt merge onto 042adcb6e2...

Updating 042adcb..14b5f26
Fast-forward
 swh/loader/package/deposit/loader.py               | 40 ++++++++++------------
 .../hello_2.10.json                                | 30 ++++------------
 .../hello_2.11.json                                | 31 ++++-------------
 swh/loader/package/deposit/tests/test_deposit.py   |  7 ++--
 4 files changed, 36 insertions(+), 72 deletions(-)
Changes applied before test
commit 14b5f26a6ab16610bc5091c5d33db72ef0523f9f
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Wed Apr 22 17:55:27 2020 +0200

    deposit.loader: Build revision out of the deposit api read metadata

commit b766c456c196746f3bb2c822723604dcd4e6317d
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Wed Apr 22 17:16:03 2020 +0200

    deposit.loader: Fix revision metadata redundancy in deposit metadata
    
    Related to T2374

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/27/ for more details.

Thanks for the fast diff !

First quick review without tests.
I have a couple of questions inline.

swh/loader/package/deposit/loader.py
97

Here the author and committer is the same.
Which is what we have done with HAL. but we should leave it open to two different fields.

swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json
35

why it's not synthetic?

50–51

is this fullname about a author or committer?

swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.11.json
64

should I see here the 'extrinsic' property with the deleted metadata?

swh/loader/package/deposit/loader.py
97

So that means we should update the other diff with both author and committer.
Tell me if we should also rename to something more metadata like.

author, publisher, author_date, publisher_date or something.
The loader would then do simply a mapping here.

swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json
35

It is but it's no longer the api side which decides.
It's done loader side as well.
That's another field which got silently dropped.

50–51

Yes, it hit me tonight, should be client because it's the client name.

swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.11.json
64

No, that file is to simulate a call to the deposit_read api call.
That's prior to the loader creating the revision.

swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json
50–51

Also, i'm keeping how the revision is built, not changing that.
I don't mind renaming to client which is clearer (i'm heading there).

Align with D3047 changes:

  • "fullname" renamed to "client"
  • "committer" entry added

Build is green

Patch application report for D3048 (id=10837)

Could not rebase; Attempt merge onto 042adcb6e2...

Updating 042adcb..fca7e47
Fast-forward
 swh/loader/package/deposit/loader.py               | 41 ++++++-----
 .../hello_2.10.json                                | 40 ++++-------
 .../hello_2.11.json                                | 45 ++++--------
 swh/loader/package/deposit/tests/test_deposit.py   | 81 +++++++++++++++++++++-
 4 files changed, 126 insertions(+), 81 deletions(-)
Changes applied before test
commit fca7e4783d581b6bc9fd2f5be039e5c06df025a4
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Wed Apr 22 17:55:27 2020 +0200

    deposit.loader: Build revision out of the deposit api read metadata
    
    Prior to this commit, it was done using a "revision" field built deposit read
    api metadata server side (reading a "revision" key). It's now built out of
    deposit information loader side (reading a "deposit" key).
    
    This fixes a long time FIXME which needed to go away.

commit 96f3e296e70ce0ea2f42d0efb0f793106c813fe1
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Wed Apr 22 17:16:03 2020 +0200

    deposit.loader: Fix revision metadata redundancy in deposit metadata
    
    Related to T2374

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/30/ for more details.

swh/loader/package/deposit/loader.py
97

Diff updated. Added a new entry "committer".

swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json
50–51

diff updated: renamed to "client"

vlorentz added inline comments.
swh/loader/package/deposit/loader.py
86

why this instead of a_metadata["revision"]["message"]?

swh/loader/package/deposit/loader.py
86

It's no longer part of the a_metadata any more (see D3047 ;)

moranegg added inline comments.
swh/loader/package/deposit/loader.py
97

that's good.
no need for more "metada" like:-)

swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.10.json
35

thanks for the clarification

50–51

client is better.

swh/loader/package/deposit/tests/data/https_deposit.softwareheritage.org/hello_2.11.json
64

ok

This revision is now accepted and ready to land.Apr 23 2020, 3:04 PM