Page MenuHomeSoftware Heritage

Fix default config file may be absent in scanner cli
ClosedPublic

Authored by tenma on Sep 25 2020, 11:44 AM.

Details

Summary
  • make config_file optional
  • fail if file does not exist instead of using empty config dict

Close T2632

Diff Detail

Repository
rDTSCN Code scanner
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 15599
Build 24016: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 24015: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D4046 (id=14267)

Rebasing onto f10d3d87cb...

Current branch diff-target is up to date.
Changes applied before test
commit 0171f225b6e401e88285b55dc339d1e84162f238
Author: tenma <tenma+swh@mailbox.org>
Date:   Fri Sep 25 11:41:10 2020 +0200

    Fix default config file may be absent in scanner cli
    
    - make config_file optional
    - fail if file does not exist instead of using empty config dict

See https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/50/ for more details.

zack added inline comments.
swh/scanner/cli.py
55–66

can't this logic go into swh.core.config() somehow? otherwise this boilerplate will have to be copy/pasted in all modules that need the global config file, which seems silly to me

swh/scanner/cli.py
55–66

definitely, but until the config question is settled i.e. we agree on a spec, I will not try to modify core.config, 2 recent tries demonstrated it has no consensus. The idea is to update relevant user code when core.config is ready.

swh/scanner/cli.py
55–66

ack!

(i'll leave the actual review to others as i don't know nearly enough about the config inner workings)

on another note, do I write tests for this cli?

In D4046#100083, @tenma wrote:

on another note, do I write tests for this cli?

i was surprised to find out the other day that they didn't exist yet

proper functional testing of what the scanner does is a bit complicated (requires backend mocking, etc.) and out-of-scope for this diff

but a simple test that checks that, dunno, at least "swh scanner --help" doesn't explode would indeed be nice and would have help avoiding this bug in the first place

fail if file does not exist instead of using empty config dict

Isn't it already the default behavior?

$ swh scanner -C foo/bar
Usage: swh scanner [OPTIONS] COMMAND [ARGS]...
Try "swh scanner -h" for help.

Error: Invalid value for "-C" / "--config-file": File "foo/bar" does not exist.

fail if file does not exist instead of using empty config dict

Isn't it already the default behavior?

It no longer is if we switch to click.Path(exists=False), which is required for this diff to work.
A problem (or feature) is that the read* functions do not fail and return {} if file is unloadable.

extracted isort change from commit, which was directly merged by olasd

Build is green

Patch application report for D4046 (id=14272)

Rebasing onto a7a17e7ed6...

Current branch diff-target is up to date.
Changes applied before test
commit 834abd58927a5f3bbc7d891949cef61140e72f30
Author: tenma <tenma+swh@mailbox.org>
Date:   Fri Sep 25 14:25:47 2020 +0200

    Fix default config file may be absent in scanner cli
    
    - make config_file optional
    - fail if file does not exist instead of using empty config dict

See https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/51/ for more details.

oh, ok

swh/scanner/cli.py
55–66

IMO it's more readable like this:

if config_file is None:
    if config.config_exists(DEFAULT_CONFIG_PATH):
        config_file = DEFAULT_CONFIG_PATH
    else:
        conf = DEFAULT_CONFIG
elif not config.config_exists(config_file):
    raise FileNotFoundError(config_file)

but either way is fine with me

This revision is now accepted and ready to land.Sep 25 2020, 2:31 PM

Also, I added description of defaults in the command help, but it reads badly, do you know a better way?
Remarks:

  • clicks inline even if I use a docstring
  • moving it to the command/group docstring would resolve the problem, but causes unexpectedly the docstring to be ignored by click (we wants static strings here)

IMO it's more readable like this

I tried many things to have it more readable, yours seems OK but not complete.
Trying to to fit a combinatoric (here of 3 independent variables) problem with hierarchical solution (conditional blocks), albeit not very complex, are always hard to get right.
I tried to flatten it and that's the result.

  • moving it to the command/group docstring would resolve the problem, but causes unexpectedly the docstring to be ignored by click (we wants static strings here)

What's a static string? How could click tell the difference?

Can you show me the code you tried?

Can you show me the code you tried?

I meant "needing no evaluation". https://forge.softwareheritage.org/P785

Oh indeed, that's https://bugs.python.org/issue28739

A little hacky, but we can do this instead:

def _set_doc(doc):
    def decorator(f):
        f.__doc__ = doc
        return f
    return decorator

@swh_cli_group.group(name="scanner", context_settings=CONTEXT_SETTINGS)
@click.option(
    "-C",
    "--config-file",
    default=None,
    type=click.Path(exists=False, dir_okay=False, path_type=str),
    help=f"""YAML configuration file. If absent and cannot load the default one,
    default parameters are used.""",
)
@click.pass_context
@_set_doc(
    f"""Software Heritage Scanner tools.

    Default config path is {DEFAULT_CONFIG_PATH}.
    Default config values are {pformat(DEFAULT_CONFIG)}
    """
)
def scanner(ctx, config_file: Optional[str]):

Build is green

Patch application report for D4046 (id=14302)

Rebasing onto ad23ee03c0...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-52-D4046.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/52/ for more details.