launchpad: Reimplement lister using new Lister API
Closed · Public

Authored by anlambert on Jan 28 2021, 2:02 PM.

Details

Summary

Port launchpad lister to the swh.lister.pattern.Lister API.

The last update date of each listed git repository is now sent to the scheduler.

The lister can work in incremental mode: in that case, only the repositories
modified since the last listing operation are returned.

Closes T2992
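
To illustrate the incremental mode described in the summary, here is a minimal, self-contained sketch. It is not the code of this diff: `Repo`, `origins_to_send` and the record keys are hypothetical names chosen for the example, and the example.org URLs are placeholders.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, Iterator, NamedTuple, Optional


class Repo(NamedTuple):
    """Hypothetical stand-in for a git repository returned by the forge API."""

    url: str
    date_last_modified: datetime


def origins_to_send(
    repos: Iterable[Repo], since: Optional[datetime] = None
) -> Iterator[Dict[str, Any]]:
    """Yield one origin record per repository, with its last update date.

    When `since` is set (incremental mode), repositories that were not
    modified after that date are skipped. When it is None, every
    repository is listed.
    """
    for repo in repos:
        if since is not None and repo.date_last_modified <= since:
            continue  # unchanged since the previous listing
        yield {
            "url": repo.url,
            "visit_type": "git",
            "last_update": repo.date_last_modified,
        }


repos = [
    Repo("https://example.org/a", datetime(2021, 1, 1, tzinfo=timezone.utc)),
    Repo("https://example.org/b", datetime(2021, 1, 27, tzinfo=timezone.utc)),
]
# Full listing: both repositories are returned.
full = list(origins_to_send(repos))
# Incremental listing: only the repository modified after Jan 15 is returned.
incremental = list(
    origins_to_send(repos, since=datetime(2021, 1, 15, tzinfo=timezone.utc))
)
```

In this sketch the filtering is done client side for clarity; a real lister could just as well ask the remote API to filter by modification date.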

Diff Detail

Repository
rDLS Listers
Branch
launchpad-lister-new-api
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 18851
Build 29204: Phabricator diff pipeline on jenkins
Build 29203: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D4962 (id=17704)

Rebasing onto 72be074a79...

Current branch diff-target is up to date.
Changes applied before test
commit 58c0b9c10f4b94702cdac0c994bd332ae06634ec
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Wed Jan 27 18:58:54 2021 +0100

    launchpad: Reimplement lister using new Lister API
    
    Port launchpad lister to the swh.lister.pattern.Lister API.
    
    Last update date of each listed git repositories is now sent to the scheduler.
    
    The lister can work in incremental mode, only modified repositories since
    the last listing operation will be returned in that case.
    
    Closes T2992

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/213/ for more details.

Update:

  • add a test to check lister instantiation with configuration file
  • remove the test_get_lister restriction for launchpad now that everything is properly mocked

Build is green

Patch application report for D4962 (id=17707)

Rebasing onto 72be074a79...

Current branch diff-target is up to date.
Changes applied before test
commit 49b63e04f039be2c89eb5efc3f3f3ab8120a930b
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Wed Jan 27 18:58:54 2021 +0100

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/216/ for more details.

ardumont added inline comments.
debian/changelog
676 ↗(On Diff #17707)

too much info in your commits ^
I guess you tested the debian build works but you committed too much in here ;)

swh/lister/launchpad/__init__.py
11

to answer our previous discussion/question: we can apparently drop it, nothing is complaining so far, lister-wise at least.

debian/changelog
676 ↗(On Diff #17707)

oh jeez, thanks !

Remove debian folder committed by mistake

swh/lister/launchpad/__init__.py
11

yes, we can remove the remaining one in ported listers safely.

Build is green

Patch application report for D4962 (id=17710)

Rebasing onto ae17b6b9a0...

First, rewinding head to replay your work on top of it...
Applying: launchpad: Reimplement lister using new Lister API
Changes applied before test
commit 7adee7239f6ab8f47d0d41a91caf379ac7b90cc2
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Wed Jan 27 18:58:54 2021 +0100

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/218/ for more details.

lgtm so far (did not read the test)

couple of questions inline.

swh/lister/launchpad/lister.py
71

Should the computation for updating/restoring the state happen within those methods?
What's the point of the finalize method then?

Also, shouldn't there be a comparison between dates to keep only the most recent modification date?

106

Why not constrain the iso8601 conversion back and forth here?

swh/lister/launchpad/lister.py
71

Those methods are only there to serialize/deserialize the lister state so that it can be sent through RPC.

106

Because we want to manipulate datetime objects while the lister runs.
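
A minimal sketch of the split discussed in these two inline replies: the state holds a datetime object while the lister runs, and is converted to and from an ISO 8601 string only at the serialization boundary. `ListerState`, `state_to_dict` and `state_from_dict` are illustrative names here, and the standard library `datetime.fromisoformat` is used as an assumption in place of whatever ISO 8601 parser the diff relies on.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass
class ListerState:
    """Hypothetical lister state, kept as a datetime while the lister runs."""

    date_last_modified: Optional[datetime] = None


def state_to_dict(state: ListerState) -> Dict[str, Any]:
    """Serialize the state to plain types that can travel through RPC."""
    date = state.date_last_modified
    return {"date_last_modified": date.isoformat() if date is not None else None}


def state_from_dict(d: Dict[str, Any]) -> ListerState:
    """Rebuild the in-memory state from its serialized form."""
    raw = d.get("date_last_modified")
    return ListerState(
        date_last_modified=datetime.fromisoformat(raw) if raw is not None else None
    )


state = ListerState(datetime(2021, 1, 28, 14, 2, tzinfo=timezone.utc))
serialized = state_to_dict(state)  # the date is now an ISO 8601 string
restored = state_from_dict(serialized)  # back to a datetime object
```

Keeping the conversion in these two functions means the rest of the lister can compare dates directly instead of parsing strings.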

swh/lister/launchpad/lister.py
106

right, i misread that part.

yep, looks good to me.

Thanks.

swh/lister/launchpad/lister.py
71

indeed, I mixed up the two notions of date we have:

the one used for the incremental nature of the lister (the pagination "index" date stored in the state, which is what we are using here) and the "last_update" of each repo to list.
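
The two notions can be kept apart as in the sketch below. It only illustrates the distinction and is not taken from the diff: `advance_state_date` is a hypothetical helper, and a page is assumed to be a list of (url, last_update) pairs.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional, Tuple


def advance_state_date(
    state_date: Optional[datetime],
    page: Iterable[Tuple[str, datetime]],
) -> Optional[datetime]:
    """Return the most recent date among `state_date` and the page's dates.

    `state_date` is the "index" date stored in the lister state; each
    per-repo `last_update` is left untouched and would be sent as is.
    """
    for _url, last_update in page:
        if state_date is None or last_update > state_date:
            state_date = last_update
    return state_date


page = [
    ("https://example.org/a", datetime(2021, 1, 20, tzinfo=timezone.utc)),
    ("https://example.org/b", datetime(2021, 1, 27, tzinfo=timezone.utc)),
    ("https://example.org/c", datetime(2021, 1, 5, tzinfo=timezone.utc)),
]
# The state date moves forward to the newest modification seen in the page.
new_date = advance_state_date(datetime(2021, 1, 10, tzinfo=timezone.utc), page)
```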

moving along ;)

Thanks.

This revision is now accepted and ready to land. Jan 28 2021, 3:21 PM

Two listers remain to be ported: gnu and packagist. Then we will be able to drop the dependency on SQLAlchemy.

Build is green

Patch application report for D4962 (id=17714)

Rebasing onto ae17b6b9a0...

Current branch diff-target is up to date.
Changes applied before test
commit f862004700259234a690779b48dd19c37c93d5a0
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Wed Jan 27 18:58:54 2021 +0100

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/219/ for more details.