Page MenuHomeSoftware Heritage

add Python client for the archive WEB API
AbandonedPublic

Authored by zack on Dec 16 2019, 4:50 PM.

Details

Reviewers
None
Group Reviewers
Reviewers
Summary

not meant to land in snippets, but it's not clear where else it should land, so
I'm posting here for review purposes, we can determine where to land it later

Related: T2279

Diff Detail

Repository
rDSNIP Code snippets
Branch
webclient
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 9812
Build 14489: arc lint + arc unit

Event Timeline

zack created this revision.Dec 16 2019, 4:50 PM

Looks good overall, but I have a lot of nitpicks

zack/webclient.py
106–128

You could get rid of all these elif switches by splitting the typify function, and removing jsonify.

137

Wrong name, as it's reading from JSON, instead of producing JSON.

206–215

Can be shorter:

functions = {
    SNAPSHOT: self.snapshot,
    REVISION: self.revision,
    # ...
}
return functions[pid_.object_type](pid_)

(and it would check the type is known)

226–235

same

299–331

What about generating a list of branches instead? That's what callers of this function will have to do one way or an other, anyway.

419–420

yield from r.iter_content(chunk_size=None, decode_unicode=False)

zack updated this revision to Diff 8724.Dec 17 2019, 10:18 AM
  • simplify code and streamline returned snapshot types
zack updated this revision to Diff 8725.Dec 17 2019, 10:25 AM
zack marked 4 inline comments as done.
  • webclient: add missing error checking
zack marked an inline comment as done.Dec 17 2019, 10:25 AM
zack added inline comments.
zack/webclient.py
106–128

I don't get what you mean about typify here.

If you want to split into different typify_foo functions and a dictionary based dispatched, I disagree, as some will be oneliners and then we'll have to catch the KeyError in there to reraise a ValueError. All in all, the code will be longer.

If it's something else, can you explain?

137

jsonify() was actually producing json, with .json() from the response object, and returning it as Python data.
But I got rid of it anyway as it was overkill to have a function just for that.

226–235

I haven't done dictionary based dispatching for the iter() case, because yield is syntactically invalid within lambda, so I'd have to use def-s instead for a bunch of oneliners, and that would make the code longer. (I've added the fall through else to check for errors which was missing though.

299–331

Good idea, done.

I note that this is now inconsistent with directory, which does return the parent directory id in each entry, but it's an inconsistency that is derived from the API itself, that should probably be fixed there with 2.0 anyway.

vlorentz added inline comments.Dec 17 2019, 11:01 AM
zack/webclient.py
106–128

You don't need dictionary dispatch, because all lines calling typify have the value of obj_type defined statically; so they might as well call typify_* functions directly.

IMHO this should be a 'standalone' repo/project. I mean it should be possible to pip install it at least.

zack added a comment.Jan 9 2020, 11:02 AM

IMHO this should be a 'standalone' repo/project. I mean it should be possible to pip install it at least.

as mentioned above :-)

not meant to land in snippets, but it's not clear where else it should land, so
I'm posting here for review purposes, we can determine where to land it later

so, yeah, I agree!, but I'm not sure where…

zack edited the summary of this revision. (Show Details)
lewo added a subscriber: lewo.Wed, Mar 11, 5:01 PM
zack abandoned this revision.Mon, Mar 16, 4:10 PM

abandoned in favor of D2835