Page MenuHomeSoftware Heritage

Introduce a simpler base pattern for lister implementations.
AcceptedPublic

Authored by olasd on Jul 6 2020, 10:39 AM.

Details

Reviewers
ardumont
Group Reviewers
Reviewers
Summary

This new pattern uses the lister support features introduced in swh.scheduler to
replace the database management done in previous iterations of the listers.

Depends on D3424.
Requires D3423 from swh.scheduler (for tests).

Test Plan

new tox tests introduced

Diff Detail

Event Timeline

olasd created this revision.Jul 6 2020, 10:39 AM

Build has FAILED

Patch application report for D3425 (id=12126)

Could not rebase; Attempt merge onto 1408517c08...

Updating 1408517..2d96eaf
Fast-forward
 requirements-swh.txt             |   2 +-
 setup.py                         |   4 +-
 swh/lister/pattern.py            | 225 +++++++++++++++++++++++++++++++++++++++
 swh/lister/tests/conftest.py     |   7 ++
 swh/lister/tests/test_pattern.py | 101 ++++++++++++++++++
 5 files changed, 336 insertions(+), 3 deletions(-)
 create mode 100644 swh/lister/pattern.py
 create mode 100644 swh/lister/tests/conftest.py
 create mode 100644 swh/lister/tests/test_pattern.py
Changes applied before test
commit 2d96eaf45866373eb4eb48cf5ba0b0a0ef312ed6
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Jul 6 10:27:57 2020 +0200

    Intoduce a simpler base pattern for lister implementations.
    
    This new pattern uses the lister support features introduced in swh.scheduler to
    replace the database management done in previous iterations of the listers.

commit 014c446d0594c96c3cd3886bf63077758bedb068
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jun 25 12:12:47 2020 +0200

    Switch over to setuptools-scm

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/14/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/14/console

lgtm, just a few nitpicks.

swh/lister/pattern.py
84

I would go for TypeError instead

121–123

Wouldn't it make more sense to do self.config.get(self.LISTER_NAME, {}).get(self.instance, {}).get("credentials", [])?

swh/lister/tests/conftest.py
7 ↗(On Diff #12126)

Could you make it explicit?

swh/lister/tests/test_pattern.py
23

InstantiableLister

olasd added inline comments.Jul 6 2020, 4:18 PM
swh/lister/pattern.py
121–123

I used this schema to be backwards-compatible with the current config. I think there's space for a more extensible config schema, but overall we haven't really needed more than a credentials key for now.

Maybe I'll want to change this when I reach the Debian lister. We'll see.

swh/lister/tests/conftest.py
7 ↗(On Diff #12126)

The way this currently works is that we need to import the swh_scheduler fixture, and all its dependencies, whatever they may be. Making the import implicit is a PITA.

The proper way to do this is to extract the swh_scheduler fixture into a pytest plugin. That's D3430.

olasd updated this revision to Diff 12164.Jul 6 2020, 5:22 PM

Apply some review comments

Build is green

Patch application report for D3425 (id=12164)

Rebasing onto 014c446d05...

Current branch diff-target is up to date.
Changes applied before test
commit 06ac7d667c4fddfa26dc56b843f800046f92b579
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Jul 6 10:27:57 2020 +0200

    Intoduce a simpler base pattern for lister implementations.
    
    This new pattern uses the lister support features introduced in swh.scheduler to
    replace the database management done in previous iterations of the listers.

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/15/ for more details.

vlorentz added inline comments.Jul 7 2020, 10:24 AM
swh/lister/pattern.py
121–123

ok

swh/lister/tests/conftest.py
7 ↗(On Diff #12126)

I meant doing from swh.scheduler.tests.conftest import swh_scheduler, but that's fine too

(nit: did not read the diff yet but typo on "Introduce" both in the diff description and git commit message ;)

ardumont accepted this revision.Jul 7 2020, 11:50 AM

Awesome.

Sounds quite promising ;)

Should definitely simplify us life.
Can't wait to see what an old list would look like with this ;)

Something like:

--------------------------------------------------- (<- * 1000)
++++++

;)

Thanks.

This revision is now accepted and ready to land.Jul 7 2020, 11:50 AM
olasd marked an inline comment as done.Jul 7 2020, 4:09 PM
olasd added inline comments.
swh/lister/tests/conftest.py
7 ↗(On Diff #12126)

I know. But the swh_scheduler fixture depends on other fixtures, and they would have needed to be explicitly imported too.

olasd updated this revision to Diff 12312.Jul 9 2020, 5:19 PM

rebase

Build is green

Patch application report for D3425 (id=12312)

Could not rebase; Attempt merge onto 014c446d05...

Updating 014c446..34ff7c7
Fast-forward
 conftest.py                                |   6 +
 pytest.ini                                 |   3 +
 requirements-swh.txt                       |   2 +-
 requirements-test.txt                      |   3 +-
 swh/lister/bitbucket/tests/test_tasks.py   |  27 ++--
 swh/lister/cgit/tests/test_tasks.py        |  13 +-
 swh/lister/core/tests/conftest.py          |   4 +-
 swh/lister/cran/tests/test_tasks.py        |  13 +-
 swh/lister/debian/tests/test_tasks.py      |  12 +-
 swh/lister/gitea/tests/test_tasks.py       |  33 +++--
 swh/lister/github/tests/test_tasks.py      |  22 +--
 swh/lister/gitlab/tests/test_tasks.py      |  35 +++--
 swh/lister/gnu/tests/test_tasks.py         |  13 +-
 swh/lister/launchpad/tests/test_tasks.py   |  17 ++-
 swh/lister/npm/tests/test_tasks.py         |  21 ++-
 swh/lister/packagist/tests/test_tasks.py   |  12 +-
 swh/lister/pattern.py                      | 225 +++++++++++++++++++++++++++++
 swh/lister/phabricator/tests/test_tasks.py |   6 +-
 swh/lister/pypi/tests/test_tasks.py        |  13 +-
 swh/lister/tests/test_cli.py               |   2 +-
 swh/lister/tests/test_pattern.py           | 101 +++++++++++++
 tox.ini                                    |   1 +
 22 files changed, 497 insertions(+), 87 deletions(-)
 create mode 100644 conftest.py
 create mode 100644 swh/lister/pattern.py
 create mode 100644 swh/lister/tests/test_pattern.py
Changes applied before test
commit 34ff7c7da5e566854f8868b7d244817f5c1f4e18
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Jul 6 10:27:57 2020 +0200

    Intoduce a simpler base pattern for lister implementations.
    
    This new pattern uses the lister support features introduced in swh.scheduler to
    replace the database management done in previous iterations of the listers.

commit c9963d4302b93c58b60800b058a2c66a10a6fa7e
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jul 9 12:14:42 2020 +0200

    Use the new names for the swh.scheduler test fixtures

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/17/ for more details.

olasd updated this revision to Diff 12415.Wed, Jul 15, 4:20 PM

rebase once more

Build is green

Patch application report for D3425 (id=12415)

Could not rebase; Attempt merge onto c9963d4302...

Updating c9963d4..ab941d8
Fast-forward
 conftest.py                      |   4 +
 requirements-swh.txt             |   2 +-
 swh/lister/pattern.py            | 225 +++++++++++++++++++++++++++++++++++++++
 swh/lister/tests/test_pattern.py | 101 ++++++++++++++++++
 4 files changed, 331 insertions(+), 1 deletion(-)
 create mode 100644 swh/lister/pattern.py
 create mode 100644 swh/lister/tests/test_pattern.py
Changes applied before test
commit ab941d8afad48d1af47e371a66be9cce6ac02c28
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Jul 6 10:27:57 2020 +0200

    Intoduce a simpler base pattern for lister implementations.
    
    This new pattern uses the lister support features introduced in swh.scheduler to
    replace the database management done in previous iterations of the listers.

commit 5f1fbbe8a4d70c43cada6022d469965821c263a8
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jul 15 16:16:49 2020 +0200

    Make sure LC_ALL is UTF-8 for pytest-postgresql to work

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/19/ for more details.

olasd updated this revision to Diff 12452.Thu, Jul 16, 12:23 PM

Rebase on top of recent changes; Improve pattern following the implementation of an actual lister

  • drop the default config nonsense
  • add a commit_page method called on successfully recording a page of origins
  • improve docs
olasd requested review of this revision.Thu, Jul 16, 12:27 PM

Build is green

Patch application report for D3425 (id=12452)

Could not rebase; Attempt merge onto 5f1fbbe8a4...

Updating 5f1fbbe..4d9db6e
Fast-forward
 MANIFEST.in                       |   3 +-
 conftest.py                       |  19 +++
 requirements-swh.txt              |   2 +-
 swh/lister/core/tests/conftest.py |   4 +-
 swh/lister/pattern.py             | 236 ++++++++++++++++++++++++++++++++++++++
 swh/lister/tests/test_cli.py      |   2 +-
 swh/lister/tests/test_pattern.py  |  93 +++++++++++++++
 7 files changed, 352 insertions(+), 7 deletions(-)
 create mode 100644 swh/lister/pattern.py
 create mode 100644 swh/lister/tests/test_pattern.py
Changes applied before test
commit 4d9db6e09f500285f2f1a484aeb6c11be674394c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Jul 6 10:27:57 2020 +0200

    Intoduce a simpler base pattern for lister implementations.
    
    This new pattern uses the lister support features introduced in swh.scheduler to
    replace the database management done in previous iterations of the listers.

commit 211f4610df2006e939bb3f6e8f2cb7b87151dfba
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jul 16 11:45:58 2020 +0200

    Move get_scheduler monkeypatching into an explicit pytest fixture
    
    This allows us to actually run the lister instantiation code instead of relying
    on the underlying structure of the lister object. In turn, this allows future
    listers to use the scheduler right in their __init__.

commit d0c1df65f1f4b05cd9e766ee3916d25182ecdb25
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jul 16 10:27:54 2020 +0200

    Only include relevant data files in MANIFEST.in
    
    The previous include would pull all .mypy_cache directories too, which are quite
    large as they include the SQLAlchemy stubs.

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/22/ for more details.

ardumont retitled this revision from Intoduce a simpler base pattern for lister implementations. to Introduce a simpler base pattern for lister implementations..Thu, Jul 16, 1:41 PM
ardumont accepted this revision.Thu, Jul 16, 1:43 PM
This revision is now accepted and ready to land.Thu, Jul 16, 1:43 PM

There is still the typo in the commit message though ;)
"Intoduce -> Introduce" (I allowed myself to fix the diff description).

olasd updated this revision to Diff 12644.Wed, Jul 22, 12:43 PM

Some more design improvements from more lister implementations

Build is green

Patch application report for D3425 (id=12644)

Could not rebase; Attempt merge onto 5f1fbbe8a4...

Updating 5f1fbbe..7ab4f69
Fast-forward
 MANIFEST.in                       |   3 +-
 conftest.py                       |  19 +++
 requirements-swh.txt              |   2 +-
 swh/lister/core/tests/conftest.py |   4 +-
 swh/lister/pattern.py             | 238 ++++++++++++++++++++++++++++++++++++++
 swh/lister/tests/test_cli.py      |   2 +-
 swh/lister/tests/test_pattern.py  |  93 +++++++++++++++
 7 files changed, 354 insertions(+), 7 deletions(-)
 create mode 100644 swh/lister/pattern.py
 create mode 100644 swh/lister/tests/test_pattern.py
Changes applied before test
commit 7ab4f69692af27b47a41566fc6db7aeeba4dda66
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Jul 6 10:27:57 2020 +0200

    Introduce a simpler base pattern for lister implementations.
    
    This new pattern uses the lister support features introduced in swh.scheduler to
    replace the database management done in previous iterations of the listers.

commit 211f4610df2006e939bb3f6e8f2cb7b87151dfba
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jul 16 11:45:58 2020 +0200

    Move get_scheduler monkeypatching into an explicit pytest fixture
    
    This allows us to actually run the lister instantiation code instead of relying
    on the underlying structure of the lister object. In turn, this allows future
    listers to use the scheduler right in their __init__.

commit d0c1df65f1f4b05cd9e766ee3916d25182ecdb25
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jul 16 10:27:54 2020 +0200

    Only include relevant data files in MANIFEST.in
    
    The previous include would pull all .mypy_cache directories too, which are quite
    large as they include the SQLAlchemy stubs.

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/25/ for more details.