Page MenuHomeSoftware Heritage

Add athena subcommand to create/query AWS Athena database
ClosedPublic

Authored by seirl on Apr 14 2021, 1:55 PM.

Diff Detail

Repository
rDDATASET Datasets
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

seirl requested review of this revision.Apr 14 2021, 1:55 PM
seirl created this revision.
seirl retitled this revision from Move ORC table schema in relational.py to Add athena subcommand to create/query AWS Athena database.Apr 14 2021, 1:57 PM
seirl edited the summary of this revision. (Show Details)
zack requested changes to this revision.Apr 14 2021, 4:09 PM
zack added a subscriber: zack.

Most of my comments are minor/nice to have, although I'd like to be able to pass queries directly on the CLI.

swh/dataset/athena.py
1 ↗(On Diff #19691)

this lacks the usual copyright/license blurb

1 ↗(On Diff #19691)

also, a short docstring stating this module implements the "athena" subcommands for the CLI would be nice to have

124 ↗(On Diff #19691)

KiB, MiB, GiB, etc. ;-)

swh/dataset/cli.py
130 ↗(On Diff #19691)

maybe "manage and query"

171 ↗(On Diff #19691)

So the query must be given in a file? That seems cumbersome. How about expecting it as a string argument, either as an option, but even as the only way to pass a query; after all one can always use shell expansion $(cat ...) if the query is too long for the command line.

swh/dataset/relational.py
1

a comment line before the formatting pragma stating what this module contains would be nice to have

This revision now requires changes to proceed.Apr 14 2021, 4:09 PM

Thanks for the review!

swh/dataset/cli.py
171 ↗(On Diff #19691)

I'm mimicking how most database command lines work. You can also write <<<"query" if you want to inline it (or echo "query" | swh athena query), and heredocs are more natural this way.

Add documentation and licensing info

This revision was not accepted when it landed; it landed in state Needs Review.Apr 14 2021, 6:48 PM
This revision was not accepted when it landed; it landed in state Needs Review.
This revision was automatically updated to reflect the committed changes.