
cgit: rewrite the CGit lister
Closed, Public

Authored by douardda on Aug 30 2019, 6:03 PM.

Details

Summary

Simplify the code:

  • inherit only from ListerBase
  • implement HTTP queries directly using requests
  • get rid of convoluted code

Gather the origin_url from the git repo's "project" page instead of
building it with the 'url_prefix' hack. The lister WILL now make substantially
more requests, since it makes one request per listed git repo, but
the resulting origin_url should be pretty reliable.
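
A minimal sketch of pulling the clone URLs out of a repo's project page. It assumes the page marks them with `<a rel="vcs-git" href="...">` anchors, which may not hold for every cgit instance. The names `CloneURLParser` and `extract_clone_urls` are hypothetical and not taken from the diff. The page is passed in as a string, so the HTTP request itself (one per repo) is left out.

```python
from html.parser import HTMLParser
from typing import List


class CloneURLParser(HTMLParser):
    """Collect href values of <a rel="vcs-git"> anchors (assumed markup)."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Only anchors flagged as git clone links are of interest.
        if tag != "a":
            return
        attr = dict(attrs)
        if attr.get("rel") == "vcs-git" and attr.get("href"):
            self.urls.append(attr["href"])


def extract_clone_urls(page_html: str) -> List[str]:
    """Return the clone URLs in the order they appear on the page."""
    parser = CloneURLParser()
    parser.feed(page_html)
    return parser.urls
```

In a lister, `page_html` would be the body of the response fetched for each repo page.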

When several URLs are provided as clonable URLs, choose the http/https one first;
otherwise, choose the first one in the list.
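
The selection rule above can be sketched as follows. The function name `pick_origin_url` is hypothetical, and returning `None` for an empty list is an assumption, since the summary does not say what happens when no clone URL is found.

```python
from typing import List, Optional
from urllib.parse import urlparse


def pick_origin_url(clone_urls: List[str]) -> Optional[str]:
    """Prefer the first http/https URL, else fall back to the first URL."""
    for url in clone_urls:
        if urlparse(url).scheme in ("http", "https"):
            return url
    # No http(s) URL: take the first one listed, or None if the list is empty
    # (the empty-list behaviour is an assumption).
    return clone_urls[0] if clone_urls else None
```

For example, given `git://`, `https://` and `http://` URLs in that order, the `https://` one is picked because it is the first http/https entry.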

Add proper tests for the cgit lister.

Also, get rid of the 'time_updated' column in the model.

Depends on D1928

Diff Detail

Repository: rDLS Listers
Lint: Lint Skipped
Unit: Unit Tests Skipped
Build Status: Buildable 7562
  Build 10820: tox-on-jenkins (Jenkins)
  Build 10819: arc lint + arc unit

Event Timeline

I'll have a look at this one on Monday ;)

Doesn't this need a rebase (to make jenkins happy)?
The jenkins failure seems to be out of its depth.

I just have a minor nitpick about the tests with multiple pages.

Otherwise, this looks good, thx.

swh/lister/cgit/lister.py
131

I guess html_url, full_name, and the other unpopulated db fields default to null values (so it does not break ;).
They are also what needs to be dealt with further via T1978.

swh/lister/cgit/tests/test_lister.py
72

Don't we want to check the URLs a bit, as you did in the other tests?

Doesn't this need a rebase (to make jenkins happy)?
The jenkins failure seems to be out of its depth.

I think this needs a fix in cli.py (which is undone/deleted in D1504, but meh). Let me check that.

swh/lister/cgit/lister.py
131

That's the idea yes.

swh/lister/cgit/tests/test_lister.py
72

As you wish, master ;-)

Btw, I think we can relate this to T1861 (bullet 1, as implementation).

nahimilega added inline comments.
swh/lister/cgit/lister.py
13–14

You could add a docstring to the class, something like this: https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/packagist/lister.py$0-15
IIRC, we decided to add a docstring to each lister class showing its output.
I forgot to create a task for this (my bad).

swh/lister/cgit/lister.py
112–131

This line is the same as line 57. Maybe we could make a function for this.

swh/lister/cgit/lister.py
13–14

Either the class or the init, whichever is better suited for such documentation.

Add some docstring and ensure tests pass ok

swh/lister/cgit/lister.py
25

to be used

This revision is now accepted and ready to land.
Sep 2 2019, 12:19 PM
swh/lister/cgit/lister.py
25

seen that, and also 'gather published "Clone" URLs' (without 'the')

This revision was automatically updated to reflect the committed changes.