Page MenuHomeSoftware Heritage

Add tests and fix behavior of scanner cli
ClosedPublic

Authored by tenma on Sep 29 2020, 7:53 PM.

Details

Reviewers
douardda
Group Reviewers
Reviewers
Summary
  • add smoke and unit tests for cli to have maximum coverage
  • fix behavior of cli in case an invalid config file is specified via envvar
  • rename scan to do_scan in scanner
  • fix not self-contained temp_path fixture in conftest

Need advice on whether to use a real (tree that exist in the archive) root_path for the tests, here do_scan is mocked and never actually called.

Follow-up to D4046.

Related to D4046, D4130

Diff Detail

Repository
rDTSCN Code scanner
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 15824
Build 24360: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 24359: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D4089 (id=14422)

Rebasing onto ad23ee03c0...

Current branch diff-target is up to date.
Changes applied before test
commit d4e8857620cf95df324e47046337e7180b0d90f2
Author: tenma <tenma+swh@mailbox.org>
Date:   Tue Sep 29 19:22:24 2020 +0200

    Add tests and fix behavior of scanner cli
    
    - add smoke and unit tests
    - fix behavior of cli in case an invalid config file is specified via
    envvar

Link to build: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/53/
See console output for more information: https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/53/console

OK I have been biten by the tests I just introduced. They are not self-contained, advice welcome on how to make them so.

douardda added a subscriber: douardda.
douardda added inline comments.
swh/scanner/cli.py
53

This def_path var is useless

55

This comment in not really useful either. It gives the impression we plan on having a dynamic config system, which we do not.

60

the elif are a bit disturbing here.

I also find that this whole snip of code does not really make explicit and clear the logic behind it. I'd rather write it like

if not config_file:
  if env_path:
    config_file = env_path

if config_file:
  if not config.config_exists(config_file):
    raise FileNotFound()
else:
  if config.config_exists(DEFAULT_CONFIG_PATH):
    config_file = DEFAULT_CONFIG_PATH

It's not shorter but it expresses the logic better IMHO.

68

choose either a syntax or the other, but do not let this second option a comment.

swh/scanner/tests/test_cli.py
14–16

why are these fixtures?
What aspect/feature of a fixture is used here?

21

what??? what's this directory doing here? How can you be sure it exists?

26

same, not a valid value. ~ may very well be a valid directory.

If you want a directory that does not exists, you create a temp directory (that exists), than use a path that is a subdirectoy of this and you are sure does not exists (because you have full control of it).

See e.g. the tmpdir fixture

https://docs.pytest.org/en/stable/tmpdir.html
31

I don't get it how a unit test can expect to have this an existing file?

Please use tmpdirs/tmpfiles for this kind of test.

40

not a big deal, but is this really a smoke test?

Also if you test the swh scanner --help invocation, why not verify a few sanity checks are ok on the expected result (aka the help string)?

43

This is not a TODO, according the very nature of this diff, testing this is a must have.

45

note that making this comment a docstring makes pytest display it when the test fail (in verbose mode I believe).

53

oops

83

same, needs to be done in this diff.

This revision now requires changes to proceed.Sep 30 2020, 10:13 AM
tenma retitled this revision from Add tests and fix behavior of scanner cli to WIP Add tests and fix behavior of scanner cli.Sep 30 2020, 11:47 AM
tenma marked 4 inline comments as done.
  • Rename scan to do_scan in scanner
  • Fix not self-contained temp_path fixture in conftest
  • Change tests and fix behavior of scanner cli

Build is green

Patch application report for D4089 (id=14562)

Could not rebase; Attempt merge onto ad23ee03c0...

Updating ad23ee0..2ef852d
Fast-forward
 MANIFEST.in                       |   2 +-
 swh/scanner/cli.py                |  62 ++++++++-------
 swh/scanner/scanner.py            |   2 +-
 swh/scanner/tests/conftest.py     |  13 ++--
 swh/scanner/tests/data/global.yml |   2 +
 swh/scanner/tests/test_cli.py     | 158 ++++++++++++++++++++++++++++++++++++++
 6 files changed, 203 insertions(+), 36 deletions(-)
 create mode 100644 swh/scanner/tests/data/global.yml
 create mode 100644 swh/scanner/tests/test_cli.py
Changes applied before test
commit 2ef852ddb634586db80f205d5cbb57d4bf70c833
Author: tenma <tenma+swh@mailbox.org>
Date:   Tue Sep 29 19:22:24 2020 +0200

    Add tests and fix behavior of scanner cli
    
    - fix behavior of cli in case an invalid config file is specified via
    - add smoke and unit tests
    - use a sample test config file and add it to MANIFEST
    - mock main dependencies to have very isolated tests

commit 1e688196d610517fe892e3b0c6eebffc471c4eb2
Author: tenma <tenma+swh@mailbox.org>
Date:   Thu Oct 1 21:41:11 2020 +0200

    Fix not self-contained temp_path fixture in conftest
    
    Conflicts arised between multiple users of the same session-scoped
    tmpdir/tmp_path. Fix this before introducing the new tests that trigger
    the behavior, by using a dedicated subdirectory.

commit 076f41705da20eb54bf005c3c60fc39755edb2fd
Author: tenma <tenma+swh@mailbox.org>
Date:   Thu Oct 1 21:28:45 2020 +0200

    Rename scan to do_scan in scanner
    
    It helps avoid ambiguity between uses of 'scan' and 'scanner' for the reader.
    Do qualified import in scan to make it more testable.

commit 8d96ac3bed31eb88bf575ca7b8ca8fdb262d0d90
Author: tenma <tenma+swh@mailbox.org>
Date:   Thu Oct 1 21:35:27 2020 +0200

    Remove parse_url helper that adds no real value

commit 7d7846954e35a969bba2d85f234d272e1fdabfaf
Author: tenma <tenma+swh@mailbox.org>
Date:   Tue Sep 29 19:47:54 2020 +0200

    Improve cli documentation
    
    The docstring was moved out of scanner docstring spot to work around a
    bug in our current Python version. See https://bugs.python.org/issue28739

See https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/57/ for more details.

tenma marked an inline comment as not done.

Rebased to remove doc and parse_url commits

Build is green

Patch application report for D4089 (id=14578)

Rebasing onto ad23ee03c0...

Current branch diff-target is up to date.
Changes applied before test
commit dc184535b22c0b4ab77ff7a4d0a283523d1cd729
Author: tenma <tenma+swh@mailbox.org>
Date:   Tue Sep 29 19:22:24 2020 +0200

    Add tests and fix behavior of scanner cli
    
    - fix behavior of cli in case an invalid config file is specified via
    - add smoke and unit tests
    - use a sample test config file and add it to MANIFEST
    - mock main dependencies to have very isolated tests

commit dc2d460c1a50fbd88e51869706f79ab42a6cb92b
Author: tenma <tenma+swh@mailbox.org>
Date:   Thu Oct 1 21:41:11 2020 +0200

    Fix not self-contained temp_path fixture in conftest
    
    Conflicts arised between multiple users of the same session-scoped
    tmpdir/tmp_path. Fix this before introducing the new tests that trigger
    the behavior, by using a dedicated subdirectory.

commit c43244078c08624acf5d1dd2c5ad8dfd7c266f93
Author: tenma <tenma+swh@mailbox.org>
Date:   Thu Oct 1 21:28:45 2020 +0200

    Rename scan to do_scan in scanner
    
    It helps avoid ambiguity between uses of 'scan' and 'scanner' for the reader.
    Do qualified import in scan to make it more testable.

See https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/61/ for more details.

is this actually proposed for merging now? If so, please remove "WIP" from the diff title

swh/scanner/tests/data/global.yml
2

is this a real token? if so, should it be in the testsuite? what happens when the token is revoked server-side?

swh/scanner/tests/test_cli.py
26

In this case, as expand_user is never called, this directory cannot exist.
Keep it and comment about this or not?

@zack it is not ready yet, need iteration with douardda.
Thanks for the remark on the token, should not be here.

  • Import qualified scanner.scan inside cli.scan entrypoint
  • Fix not self-contained temp_path fixture in conftest
  • cli: fix behavior in case an invalid config file is given
  • cli: add tests

Build is green

Patch application report for D4089 (id=14912)

Rebasing onto 88ded8631b...

Current branch diff-target is up to date.
Changes applied before test
commit f03942d67c02a2d9ba30f51395d2cd97f08d07c2
Author: tenma <tenma+swh@mailbox.org>
Date:   Sun Oct 11 11:06:54 2020 +0200

    cli: add tests
    
    - add smoke and unit tests
    - mock scanner to not run actual scan
    - spy on config loading to verify how config is loaded
    - use a sample test config file and add it to MANIFEST

commit ae66225ae054af10c8d15e6a463932b0cc4a7577
Author: tenma <tenma+swh@mailbox.org>
Date:   Sun Oct 11 11:19:10 2020 +0200

    cli: fix behavior in case an invalid config file is given

commit 93b5f9608e87153b31f9c86e4974232dd36a0df5
Author: tenma <tenma+swh@mailbox.org>
Date:   Thu Oct 1 21:41:11 2020 +0200

    Fix not self-contained temp_path fixture in conftest
    
    Conflicts arised between multiple users of the same session-scoped
    fixtures based on tmp_path_factory.
    Fix both the fixture itself to avoid side-effects and tests based
    on its inner workings.

commit 8347015b5fd409d92e49916ed4da83cb1d16e307
Author: tenma <tenma+swh@mailbox.org>
Date:   Thu Oct 1 21:28:45 2020 +0200

    Import qualified scanner.scan inside cli.scan entrypoint
    
    It makes cli more testable and helps avoid ambiguity between uses of 'scan' and
    'scanner' for the reader.

See https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/71/ for more details.

I'm mostly OK with this now, so I'll make it "accepted", but please refactor a bit the cli_run_[n]ok() helper functions before landing it.

swh/scanner/cli.py
57

why is this if statement not in the if not config_file below? If the env var is set but is dandling pointer but the config_file file has been explicitly given, we should not fail IMHO.

swh/scanner/tests/test_cli.py
49

Using these kind of helper functions (cli_run_[n]ok) as is (return a boolean) has a rather annoying drawbacks: on failure, the test only reports an "False is not True' kind of message, which is not so user friendly.

It's better to implement theses as assert_cli_run_ok or silmilar and make them do the assertion (and use a better assertion message is possible).

It's also prevent you from writing:

assert call_function()

which is not very meaningful for the fellow code reader (even if in this case the function name is rather explanatory).

51

it's unclear to me in what situation an execution raises an exception or prints error messages but does have a null exit code. Is this really happening ? If not, checking the exit code is enough. Otherwise, there is something else to fix :-)

This revision is now accepted and ready to land.Oct 13 2020, 1:24 PM

Build is green

Patch application report for D4089 (id=15001)

Rebasing onto c2768d171a...

First, rewinding head to replay your work on top of it...
Applying: Import qualified scanner.scan inside cli.scan entrypoint
Applying: Fix not self-contained temp_path fixture in conftest
Applying: cli: fix behavior in case an invalid config file is given
Applying: cli: add tests
Changes applied before test
commit cbeec9f1a9b3339380eb079fe71732b1dbbe7707
Author: tenma <tenma+swh@mailbox.org>
Date:   Sun Oct 11 11:06:54 2020 +0200

    cli: add tests
    
    - add smoke and unit tests
    - mock scanner to not run actual scan
    - spy on config loading to verify how config is loaded
    - use a sample test config file and add it to MANIFEST

commit 0e2f3ef8ed2155f68762d6017de69a49e6d7c435
Author: tenma <tenma+swh@mailbox.org>
Date:   Sun Oct 11 11:19:10 2020 +0200

    cli: fix behavior in case an invalid config file is given

commit 6a9ae59b5f2aea374f2b7507db7ac1a195ddb82a
Author: tenma <tenma+swh@mailbox.org>
Date:   Thu Oct 1 21:41:11 2020 +0200

    Fix not self-contained temp_path fixture in conftest
    
    Conflicts arised between multiple users of the same session-scoped
    fixtures based on tmp_path_factory.
    Fix both the fixture itself to avoid side-effects and tests based
    on its inner workings.

commit 1a74fb12188afa2ddbf16f385e90a6bb132f3312
Author: tenma <tenma+swh@mailbox.org>
Date:   Thu Oct 1 21:28:45 2020 +0200

    Import qualified scanner.scan inside cli.scan entrypoint
    
    It makes cli more testable and helps avoid ambiguity between uses of 'scan' and
    'scanner' for the reader.

See https://jenkins.softwareheritage.org/job/DTSCN/job/tests-on-diff/73/ for more details.

tenma retitled this revision from WIP Add tests and fix behavior of scanner cli to Add tests and fix behavior of scanner cli.