Page MenuHomeSoftware Heritage

Added LaunchpadLister
AbandonedPublic

Authored by legau on Mar 10 2020, 3:11 PM.

Details

Reviewers
ardumont
Group Reviewers
Reviewers
Summary

First draft for (T1734)

It uses only launchpadlib and no http requests. This is because we've been discussing with the people at launchpad and they're
planning on adding a new endpoint git_repositories.getAllRepositories just for this. Using only the library it will be easier to implement
once they're done.

It is functionnal but tests are not ready yet. Looking for improvements and remarks on its design as it overrides almost everything.

Diff Detail

Repository
rDLS Listers
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 11204
Build 16918: tox-on-jenkinsJenkins
Build 16917: arc lint + arc unit

Event Timeline

legau created this revision.Mar 10 2020, 3:11 PM
legau updated this revision to Diff 10135.Mar 18 2020, 4:28 PM

Updated with git_repositories.getRepositories

Tasks tests added, Still no lister tests

ardumont added a subscriber: ardumont.EditedMar 24 2020, 5:58 PM

Awesome.

Thanks for your interest and work on this.

Still no lister tests

Please add at least one integration test like the gnu one [1].
You could check other lister, they all have at least one integration test like this one.

That does not delve into the intricacies of the lister contrary to some other tests.
It allows to directly check the scheduled tasks (what's been listed and outputted in the scheduler's db).

You can then check the task's shape is somewhat in the form you expect.

[1] https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gnu/tests/test_lister.py$12-52

Cheers,

legau updated this revision to Diff 10254.Mar 25 2020, 4:29 PM
  • Added Tests and annotations
ardumont added inline comments.Mar 25 2020, 6:21 PM
requirements-test.txt
6

you should not need it for tests as it's runtime required already.

swh/lister/launchpad/__init__.py
2

2020

swh/lister/launchpad/lister.py
29

What are those parameters?

You might want to use some defined at [1]

[1] https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/__init__.py$13-19

swh/lister/launchpad/tasks.py
41

Please remove it.

swh/lister/launchpad/tests/test_lister.py
9 ↗(On Diff #10254)

I did not fully realize earlier.

Does that test currently load a real launchpad instance?
(i believe so and the diff description seems to agree ;)

I don't see the data test file.

requests_mock_datadir is a fixture to transform http requests into lookup within the ./data/<https_server-name>/remaining-file-named-after-remaining-http-query-parameters...

legau added inline comments.Mar 25 2020, 8:27 PM
swh/lister/launchpad/lister.py
29

These arguments are what the launchpad instance need to be set up as described here https://help.launchpad.net/API/launchpadlib

swh/lister/launchpad/tests/test_lister.py
9 ↗(On Diff #10254)

Yes it does, what would you recommend to use to mock launchpadlib data ?

ardumont added inline comments.Mar 26 2020, 9:25 AM
swh/lister/launchpad/tests/test_lister.py
9 ↗(On Diff #10254)

We are using pytest and this allows to define fixtures like requests_mock_datadir.

So as a first step, i'd look if there is any existing fixture around launchpadlib [1]
Note that checking how they are testing themselves should be interesting.
As that would help in understanding how to define the fixture if we need to do it ourselves.

If there is no fixture yet, you can try to define one which would mock the launchpadlib calls to return a fixed dataset of repositories to list. Just define a json file in the "tests/data/launchpadlib/dataset.json" and make the fixture returns it. Then in the tests, you can further ensure the data listed and the test dataset are consistent together.

We use pytest-mocker for that, for example [2]

[1] https://pytest.readthedocs.io/en/2.7.3/plugins_index/index.html

[2] https://forge.softwareheritage.org/source/swh-storage/browse/master/swh/storage/tests/test_retry.py$58-90

ardumont accepted this revision.Mar 27 2020, 10:52 AM

Requiring changes for the sake of the change on the testing part.
We don't want to depend on third party dataset for the testing.
That could fail for no good reasons.

I think i'm fine with the implementation otherwise.

This revision is now accepted and ready to land.Mar 27 2020, 10:52 AM
ardumont requested changes to this revision.Mar 27 2020, 10:53 AM

wrong selection sorry

This revision now requires changes to proceed.Mar 27 2020, 10:53 AM
legau updated this revision to Diff 10535.Apr 4 2020, 6:56 PM

Updated tests to mock launchpadlib

legau updated this revision to Diff 10536.Apr 4 2020, 7:05 PM

Rebase

ardumont added inline comments.Apr 5 2020, 11:10 AM
swh/lister/core/tests/conftest.py
33 ↗(On Diff #10536)

When you'll have your fixture working as you want.
Define said lister fixture in swh/lister/launchpad/tests/conftest.py (or even just before the test that uses it, i would not mind).

@pytest.fixture
def lister_launchpad(swh_listers):
    with patch('launchpadlib.launchpad.Launchpad', MagicMock()):
        lister = swh_listers['launchpad']
    ... 
    # define how the mock should behave when calling given methods...
    # what you actually defined below in the beginning of the tests.
    ...
    lister.launchpad.git_repositories.getRepositories = MagicMock(
        side_effect=[mock_lp_response(i) for i in range(3)])
    ...

    return lister

And then instead of using swh_listers in your test definition, use the fixture define here:

def test_launchpad_lister(lister_launchpad):
    ...

That way, when reading the test, we can concentrate on checking the expected behavior of the lister.
And not the setup of the fixture (for that we will read the fixture definition ;)

swh/lister/launchpad/tests/data/response0.json
2 ↗(On Diff #10536)

Please, make it readable (pretty print).

Personally, i use within emacs, M-x json-pretty-print ;)

swh/lister/launchpad/tests/test_lister.py
26 ↗(On Diff #10536)

we have a datadir fixture [1] for that, "inject" datadir in the test signature and:

response_filepath  = os.path.join(datadir, f'response{page}.json')
with open(response_filepath, encoding='utf-8') as f:
...
`

[1] https://forge.softwareheritage.org/source/swh-core/browse/master/swh/core/pytest_plugin.py$0-121

legau marked 5 inline comments as done.Apr 5 2020, 4:38 PM
legau added inline comments.
swh/lister/core/tests/conftest.py
33 ↗(On Diff #10536)

That was my first thought but swh_listers instantiates all the listers at once so in LaunchpadLister.init login to launchpad is already called when you want to retrieve swh_listers['launchpad']

legau updated this revision to Diff 10537.Apr 5 2020, 4:58 PM

Fixed JSON and added datadir fixture

legau added a comment.Apr 5 2020, 4:59 PM

I think I broke Phabricator with my commits

I think I broke Phabricator with my commits

I'm not sure what that means.

In any case, I see a merge within your commits, please use git rebase to update a branch instead [1]

Quoting [1]
rewriting old commits with git rebase (to preserve a nice, easy to bisect history)

And then update the diff again, maybe that will also help jenkins to trigger the build diff.

[1] https://wiki.softwareheritage.org/wiki/Code_review_in_Phabricator#Updating_your_branch_to_reflect_requested_changes

swh/lister/core/tests/conftest.py
33 ↗(On Diff #10536)

That should not be an issue. Have you tried?

ardumont added inline comments.Apr 6 2020, 10:31 AM
swh/lister/core/tests/conftest.py
33 ↗(On Diff #10536)

And If that becomes an issue nonetheless, do not hesitate to split the existing fixtures to be able to reuse some part within your own (as long as other tests are still working, it's fine ;):

Something like, in the main conftest:

@pytest.fixture
def lister_db_url(postgresql_proc):
    db_url = 'postgresql://{user}@{host}:{port}/{dbname}'.format(
        host=postgresql_proc.host,
        port=postgresql_proc.port,
        user='postgres',
        dbname='tests')
    logger.debug('lister db_url: %s', db_url)
    return db_url


@pytest.fixture
def swh_listers(request, lister_db_url, swh_scheduler):
    listers = {}

    # Prepare schema for all listers
    for lister_name in SUPPORTED_LISTERS:
        lister = get_lister(lister_name, db_url=lister_db_url)
        lister.scheduler = swh_scheduler  # inject scheduler fixture
        listers[lister_name] = lister
    initialize(create_engine(lister_db_url), drop_tables=True)

    # Add the load-archive-files expected by some listers (gnu, cran, ...)
    swh_scheduler.create_task_type({
        'type': 'load-archive-files',
        'description': 'Load archive files.',
        'backend_name': 'swh.loader.package.tasks.LoadArchive',
        'default_interval': '1 day',
    })

    return listers

In launchpadlister one:

@pytest.fixture
def launchpad_lister(request, lister_db_url, swh_scheduler):
    # all the necessary cogs for mocking launchpad...
    # ...
    with patch('launchpadlib.launchpad.Launchpad', MagicMock()):
        lister = get_lister('launchpad', db_url=lister_db_url)
        lister.scheduler = swh_scheduler  # inject scheduler fixture
        initialize(create_engine(lister_db_url), drop_tables=True)
    return lister
legau added inline comments.Apr 6 2020, 5:35 PM
swh/lister/core/tests/conftest.py
33 ↗(On Diff #10536)

Indeed it was an issue because when you use swh_listers fixture it is called before any patch so swh_listers['launchpad'].launchpad is already instanciated with a real Launchpad instance.
I'm using what you wrote it works great

legau updated this revision to Diff 10543.Apr 6 2020, 5:48 PM

Modified Fixtures

legau added a comment.Apr 6 2020, 5:49 PM

Build is still broken even with rebasing

ardumont added inline comments.Apr 6 2020, 6:01 PM
swh/lister/launchpad/tests/conftest.py
14

And actually, you are making the mock being injected to swh_listers without using it later.

That feels strange.

15

drop swh_listers, you don't need it any longer.

42

you are monkey-patching here and no longer mocking.

Why not the initial with patch stanza you used earlier, that felt more standard, well at least on how we define it here in the team ;)

Build is still broken even with rebasing

yes, that may have nothing to do with you ;)

How do you update your diff, something like?

arc diff origin/master --update D2799
legau added a comment.Apr 6 2020, 6:50 PM

Yes that's how I do it

legau added inline comments.Apr 6 2020, 7:00 PM
swh/lister/launchpad/tests/conftest.py
15

Yes that's right

42

lister.launchpad.git_repositories.getRepositories is already a mocked object, would it be ok just to redefine its side_effect like

lister.launchpad.git_repositories.getRepositories.side_effect =[mock_lp_response(i) for i in range(3)]

Yes that's how I do it

Ok, thanks.

I think you built that diff prior to some changes in the jenkins configuration regarding diff (the dates roughly match in my head).
(Again, that's not on you)

I'm thinking that change in between (plus the current error message in the build) made it so it won't build (somehow, jenkins is missing some stuff from the staging repository).
The "new" build never got green but it never started, it systematically fail at applying the patch altogether.

If that's ok with you, could you please abandon this diff and make a new one?

Here, select "Abandon Revision" at the bottom of the page (selection box).

And then, from your machine, create a new diff with:

arc diff --create origin/master

The drawback is that we'll lost the history exchange of the diff, but i think you mostly, if not all, answered to everything already ;)

Cheers,

swh/lister/launchpad/tests/conftest.py
42

I don't know yet, i'd like to see the current build's status first ;)
So i'm focusing on that first, see my main comment ;)

legau abandoned this revision.Apr 7 2020, 5:34 PM