Page MenuHomeSoftware Heritage

add Python client for the archive WEB API
AbandonedPublic

Authored by zack on Dec 16 2019, 4:50 PM.

Details

Reviewers
None
Group Reviewers
Reviewers
Summary

not meant to land in snippets, but it's not clear where else it should land, so
I'm posting here for review purposes, we can determine where to land it later

Related: T2279

Diff Detail

Repository
rDSNIP Code snippets
Branch
webclient
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 9811
Build 14488: arc lint + arc unit

Event Timeline

Looks good overall, but I have a lot of nitpicks

zack/webclient.py
105–127

You could get rid of all these elif switches by splitting the typify function, and removing jsonify.

136

Wrong name, as it's reading from JSON, instead of producing JSON.

205–214

Can be shorter:

functions = {
    SNAPSHOT: self.snapshot,
    REVISION: self.revision,
    # ...
}
return functions[pid_.object_type](pid_)

(and it would check the type is known)

225–234

same

298–330

What about generating a list of branches instead? That's what callers of this function will have to do one way or an other, anyway.

418–419

yield from r.iter_content(chunk_size=None, decode_unicode=False)

  • simplify code and streamline returned snapshot types
zack marked 4 inline comments as done.
  • webclient: add missing error checking
zack added inline comments.
zack/webclient.py
105–127

I don't get what you mean about typify here.

If you want to split into different typify_foo functions and a dictionary based dispatched, I disagree, as some will be oneliners and then we'll have to catch the KeyError in there to reraise a ValueError. All in all, the code will be longer.

If it's something else, can you explain?

136

jsonify() was actually producing json, with .json() from the response object, and returning it as Python data.
But I got rid of it anyway as it was overkill to have a function just for that.

225–234

I haven't done dictionary based dispatching for the iter() case, because yield is syntactically invalid within lambda, so I'd have to use def-s instead for a bunch of oneliners, and that would make the code longer. (I've added the fall through else to check for errors which was missing though.

298–330

Good idea, done.

I note that this is now inconsistent with directory, which does return the parent directory id in each entry, but it's an inconsistency that is derived from the API itself, that should probably be fixed there with 2.0 anyway.

zack/webclient.py
105–127

You don't need dictionary dispatch, because all lines calling typify have the value of obj_type defined statically; so they might as well call typify_* functions directly.

IMHO this should be a 'standalone' repo/project. I mean it should be possible to pip install it at least.

IMHO this should be a 'standalone' repo/project. I mean it should be possible to pip install it at least.

as mentioned above :-)

not meant to land in snippets, but it's not clear where else it should land, so
I'm posting here for review purposes, we can determine where to land it later

so, yeah, I agree!, but I'm not sure where…