Page MenuHomeSoftware Heritage

Open a paginated list user deposits endpoint
ClosedPublic

Authored by ardumont on May 21 2021, 11:21 AM.

Details

Summary

This adds a paginated listing endpoint so authenticated user can retrieve their deposit
information in batch. This touches another part of an equivalent private listing api [1]
to allow paginated code reuse.

It's not a sword endpoint but it lists deposits in xml nonetheless [2]

Next step if this lands as-is would be to open the use of such api to the deposit cli.

Discuss [2] ;)

[1] The "all" listing api used in the admin deposit view of the webapp

[2] Checking the spec for the listing collection endpoint [3) kinda took me aback...
Thus why i removed the easy-hack from the task (i had also forgotten it was one...). In
there it says, well do an atompub (and then forwards to the atompub rfc [4] and i got
lost in there... I kinda did not want to fall in there so i took another simpler way.
Also it's somewhat for internal use within the cli, so the exposal in the main server
is an implementation detail. Anyway, feel free to propose something better ;)

[3] http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html#protocoloperations_listingcollections

[4] https://www.ietf.org/rfc/rfc5023.txt

Related to T2996

Test Plan

tox

Diff Detail

Repository
rDDEP Push deposit
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 21582
Build 33531: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 33530: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D5765 (id=20611)

Rebasing onto e2f8900d12...

Current branch diff-target is up to date.
Changes applied before test
commit 88002d5ca8bb13cee267ac68554681cb8049ed77
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 21 11:01:58 2021 +0200

    Open a list user deposits endpoints
    
    Related to T2996

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/649/ for more details.

swh/deposit/api/urls.py
87

Unsure about my comment here ^

I separated it from the main collection.py module for technical reason [1] and
readability (simpler to read it in a dedicated module).

[1] sounds unlikely that mixing the classes needed for pagination and listing will work
with the APIPost class in the main collection.CollectionAPI class

ardumont retitled this revision from Open a list user deposits endpoints to Open a paginated list user deposits endpoints.May 21 2021, 11:43 AM
ardumont retitled this revision from Open a paginated list user deposits endpoints to Open a paginated list user deposits endpoint.May 21 2021, 11:45 AM
anlambert added a subscriber: anlambert.

Looks good to me but some improvements can be added to the doc and tests implementation before landing this.

swh/deposit/api/collection_list.py
16

You should put the verb between double backticks, it will render nicely in the doc.

31

s/will be dealt with/is handled by/

32

s/attribute class/class attribute/

swh/deposit/api/urls.py
87

Yes it makes sense to proceed like this.

swh/deposit/tests/api/test_collection_list.py
18

Deposit list api requires authentication

24

If the response data is JSON, as anonymous_client is a rest_framework.test.APIClient instance, you can use response.data here instead of response.content.

66–67

As authenticated_client is an instance of rest_framework.test.APIClient, you have directly access to the data using response.data.

This revision now requires changes to proceed.May 21 2021, 12:00 PM
ardumont added inline comments.
swh/deposit/tests/api/test_collection_list.py
24

as mentioned in the description, not in case of error, the answer is then xml but gotcha, i'll check the nominal case.

I double checked to be sure ;)

[31] > /home/tony/work/inria/repo/swh/swh-environment/swh-deposit/swh/deposit/tests/api/test_collection_list.py(26)test_deposit_collection_list_is_auth_protected()
-> assert b"protected by basic authentication" in response.content
   5 frames hidden (try 'help hidden_frames')
(Pdb++) response.data
*** AttributeError: 'HttpResponse' object has no attribute 'data'
(Pdb++) ll
  17     def test_deposit_collection_list_is_auth_protected(anonymous_client):
  18         """Deposit list requires authentication
  19
  20         """
  21         url = reverse(COLLECTION_LIST, args=("test",))
  22         response = anonymous_client.get(url)
  23         assert response.status_code == status.HTTP_401_UNAUTHORIZED
  24         import pdb; pdb.set_trace()
  25
66–67

right!

ardumont marked 2 inline comments as done.
  • Adapt according to review
  • Reword commit message according to diff description
swh/deposit/tests/api/test_collection_list.py
24

Is it easy to turn the XML error response into a JSON one ?

That's weird to have different response formats for an endpoint.

Build is green

Patch application report for D5765 (id=20617)

Rebasing onto e2f8900d12...

Current branch diff-target is up to date.
Changes applied before test
commit a68f1b00d2ce00038d1efd21f7a7f7ede10d5baf
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 21 11:01:58 2021 +0200

    Open a paginated list user deposits endpoint
    
    This adds a paginated listing endpoint so authenticated user can retrieve their deposit
    information in batch. This touches another part of an equivalent private listing api [1]
    to allow paginated code reuse.
    
    It's not a sword endpoint so it lists json in the nominal case (with authorized user) [2]
    
    It's not always consistent though as it returns xml in case of issues (as it's reusing
    existing behavior for the checks).
    
    Related to T2996

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/650/ for more details.

swh/deposit/tests/api/test_collection_list.py
24

You could for instance testing on the request content_type in make_error_response_from_dict and return a JSON response when requested.

It's not a sword endpoint so it lists json in the nominal case (with authorized user) [2]

why not XML, though?

It's not a sword endpoint so it lists json in the nominal case (with authorized user) [2]

why not XML, though?

Because i wanted to check reactions first on the all "is it ok to open such endpoint that way" ?
;)

As for details, I reused the current code which is coming from the rest_framework (which
outputs json). I did not dig in further than that (might be it could convert in something else?)

I'm for consistency as well so i'd prefer the public api sends xml in all cases (and let
the cli part deal with json/dict). So if the overall idea of this diff is ok, i'll dig
in further so the actual response is consistent with the other public endpoints (xml).

swh/deposit/tests/api/test_collection_list.py
24

Agreed on the inconsistency weirdness, see my other answer to val.
Yes, we can do as you suggest.

I'm more of a mind to try and see if that other public endpoints can
return xml as the other endpoints does.

swh/deposit/tests/api/test_collection_list.py
24

Ack, I guess you will have to write a custom DRF renderer to output xml then.

swh/deposit/tests/api/test_collection_list.py
24

(1) I looked into it and found one [1] (albeit not packaged in debian). It does the job (i tried
the code out of some copy/paste ;)

But then i recalled we did not use specific xml renderer so far.
We used plain rendered response out of xml template currently.

Plus, (1) clashes with the default error answers (which are rendered
but not the same way then). I don't want to spend too much time on this.
So i'll continue on the initial road.

Thanks though.

[1] https://jpadilla.github.io/django-rest-framework-xml/renderers/#setting-the-renderers

It's not a sword endpoint so it lists json in the nominal case (with authorized user) [2]

why not XML, though?

Because i wanted to check reactions first on the all "is it ok to open such endpoint that way" ?
;)

As for details, I reused the current code which is coming from the rest_framework (which
outputs json). I did not dig in further than that (might be it could convert in something else?)

I'm for consistency as well so i'd prefer the public api sends xml in all cases (and let
the cli part deal with json/dict). So if the overall idea of this diff is ok, i'll dig
in further so the actual response is consistent with the other public endpoints (xml).

Well and now that i'm starting to list, i don't actually know what the xml must look like.
any hints @vlorentz ?

I have an ugly xml local stuff which is similar to what's the status endpoint returns
(for one deposit) but here, it's for a list of deposits (it does not work yet but you
grok the gist... maybe):

<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:sword="http://purl.org/net/sword/terms/"
       xmlns:dcterms="http://purl.org/dc/terms/"
       xmlns:sd="https://www.softwareheritage.org/schema/2018/deposit"
       >
  <link rel="next" href="{{ next }}" />
  <link rel="previous" href="{{ previous }}" />
  <sd:deposits>
    {% for deposit in results %}
    <sd:deposit>
      {% for key, value in deposit.items %}
      {% if value is not None %}<sd:{{ key }}>{{ value }}</sd:{{ key }}>{% endif %}
      {% endfor %}
    </sd:deposit>
  {% endfor %}
  </sd:deposits>
</entry>
swh/deposit/tests/api/test_collection_list.py
24

if what I am saying is possible at all ¯\_(ツ)_/¯

swh/deposit/tests/api/test_collection_list.py
24

it is, i just got side-tracked by [1] not working [2] is the answer

[1]

{% for key, value in deposit.items() %}  # d'oh

[2]

{% for key, value in deposit.items %}
swh/deposit/tests/api/test_collection_list.py
24

Yes, django template language has some limitations. You cannot call a function with arguments for instance.

Adapt according to discussions

Build is green

Patch application report for D5765 (id=20618)

Rebasing onto e2f8900d12...

Current branch diff-target is up to date.
Changes applied before test
commit b16f857c189d2111eafed1b93b1ab34475d93b73
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 21 17:51:33 2021 +0200

    Adapt endpoint to actually return xml instead of json

commit a68f1b00d2ce00038d1efd21f7a7f7ede10d5baf
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 21 11:01:58 2021 +0200

    Open a paginated list user deposits endpoint
    
    This adds a paginated listing endpoint so authenticated user can retrieve their deposit
    information in batch. This touches another part of an equivalent private listing api [1]
    to allow paginated code reuse.
    
    It's not a sword endpoint so it lists json in the nominal case (with authorized user) [2]
    
    It's not always consistent though as it returns xml in case of issues (as it's reusing
    existing behavior for the checks).
    
    Related to T2996

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/651/ for more details.

Looks good to me.

swh/deposit/templates/deposit/collection_list.xml
11 ↗(On Diff #20618)

would be more readable with a line break and indentation after the if directive

This revision is now accepted and ready to land.May 21 2021, 6:27 PM

Adapt according to suggestion
squash commits

Build is green

Patch application report for D5765 (id=20624)

Rebasing onto e2f8900d12...

Current branch diff-target is up to date.
Changes applied before test
commit c28bb48669989b779dcce61986920727438bd5ac
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 21 11:01:58 2021 +0200

    Open a paginated list user deposits endpoint
    
    This adds a paginated listing endpoint so authenticated user can retrieve their deposit
    information in batch. This touches another part of an equivalent private listing api to
    allow paginated code reuse.
    
    That new endpoint is not a sword endpoint but it lists deposits in xml relatively sword
    like nonetheless.
    
    Related to T2996

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/652/ for more details.