Page MenuHomeSoftware Heritage

add Python client for the Web API
ClosedPublic

Authored by zack on Mar 18 2020, 7:02 PM.

Diff Detail

Repository
rDWCLI Web client
Branch
feature/client
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 11211
Build 16929: tox-on-jenkinsJenkins
Build 16928: arc lint + arc unit

Event Timeline

  • Web API client: first implementation
  • Web API client: add tests
  • web client tests: add missing copyright headers

I have added a couple of inline comments that should be addressed before landing the client.

I noticed that only PIDS can be provided to the content, directory, revision, release and snapshot methods.
I imagine this is intended for validation purposes. But it could be great to also allow to pass hashes to these methods
as we can easily reconstruct the PIDS from them when we know the object type.

Also a nitpick: for consistency with the other swh modules docstrings, sentences should begin with an upper case letter.

Next step: add a CLI tool wrapping that web api client ?

requirements-swh.txt
2–3

You can remove the swh.core dependency.

swh/web/client/client.py
176–189

You could move that block in the class constructor and create a _getters instance variable.

swh/web/client/tests/conftest.py
23

remaining debug output I guess

swh/web/client/tests/test_web_api_client.py
13

You should also test web_api_client.content(pid) result.

25

You should also test web_api_client.directory(pid) result.

39

You should also test web_api_client.release(pid) result.

54

You should also test web_api_client.revision(pid) result.

This revision now requires changes to proceed.Mar 19 2020, 3:22 PM
zack marked 7 inline comments as done.
  • requirements.txt: drop redundant dep on swh.core
  • client.py: move getters table for .get() to an instance variable
  • pytest fixture: remove left-over debug print
  • tests: also test type-specific getters
  • client.py: make docstring start with uppercase letters

Thanks for the great review !

All done, except for this point, which was indeed on purpose:

I noticed that only PIDS can be provided to the content, directory, revision, release and snapshot methods.

not really for validation purposes (although it clearly helps with that), but rather because a background idea here is to be future-proof, and already adopt a convention that we have on the roadmap for API v2, i.e., use PIDs (and only PIDs) everywhere.

swh/web/client/client.py
176–189

good idea, it also allows to factor out the _get_snapshot() method, that might be useful elsewhere

swh/web/client/tests/conftest.py
23

indeed, oops :-)

swh/web/client/tests/test_web_api_client.py
13

right, I didn't want to duplicate test logic, but I think I've now found a way to test both ways of retrieving objects without doing so

(same answer applies to all other comments like this one)

not really for validation purposes (although it clearly helps with that), but rather because a background idea here is to be future-proof, and already adopt a convention that we have on the roadmap for API v2, i.e., use PIDs (and only PIDs) everywhere.

Ack, I was missing some context.

Otherwise, looks good to me !

swh/web/client/tests/test_web_api_client.py
38

nice !

This revision is now accepted and ready to land.Mar 19 2020, 4:27 PM