
[WIP] add first implementation of FusionForge lister
AbandonedPublic

Authored by anlambert on Nov 10 2017, 5:13 PM.

Details

Reviewers
None
Group Reviewers
Reviewers
Summary

first version of a FusionForge lister (T778)

This is a first attempt at writing a lister for projects hosted on a FusionForge instance, such as:

  • gforge.inria.fr (T390)
  • adullact.net (T775)
  • sourcesup.renater.fr

Currently, the lister only considers git and svn repositories and works quite well with
the three forges cited above. It should also work with other FusionForge instances, provided
the URL schemes of their hosted repositories can be handled by the current implementation.

Diff Detail

Repository
rDLS Listers
Branch
fusionforge-lister
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 1109
Build 1453: arc lint + arc unit

Event Timeline

anlambert created this revision.Nov 10 2017, 5:13 PM
olasd added a subscriber: olasd.Nov 10 2017, 5:49 PM

Hey,

Rather than having the fusionforge base url in the configuration, I think it should be set as an argument to the lister. However I think the credentials store should indeed be in a configuration file.

That way, we can deploy the fusionforge lister once and create tasks in the scheduler to list the different (public) fusionforges that we know of "dynamically" without having to redeploy a config, leaving the configuration for the credentials store.

This also makes sure that failing to list one of the forges won't impact listing the others; it will also let us have different recurrence times for different forges.
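
A minimal sketch of that split, for illustration only. The task shape, the `list-fusionforge` task type, the `https://` URLs and the credentials mapping are all hypothetical and are not taken from the diff or from the swh scheduler API:

```python
from urllib.parse import urlparse

# Hypothetical credentials store, as it might be loaded from a configuration
# file: keyed by forge hostname, never passed through task arguments.
CREDENTIALS = {
    "gforge.inria.fr": {"username": "swh", "password": "secret"},
}


def credentials_for(forge_url, store=CREDENTIALS):
    """Return the credentials configured for a forge, or None if there are none."""
    hostname = urlparse(forge_url).hostname
    return store.get(hostname)


def make_lister_task(forge_url, interval_days):
    """Build a task description for one forge (hypothetical shape)."""
    return {
        "type": "list-fusionforge",  # hypothetical task type name
        "arguments": {"args": [], "kwargs": {"forge_url": forge_url}},
        "interval_days": interval_days,
    }


# One task per forge: each can fail or recur independently of the others.
tasks = [
    make_lister_task("https://gforge.inria.fr", interval_days=1),
    make_lister_task("https://adullact.net", interval_days=7),
]
```

Adding a new public forge then only means creating one more task; the deployed configuration changes only when credentials do.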

The rest of the approach looks perfectly reasonable but I haven't looked in depth yet.

Two other test cases for you: alioth.debian.org; www.fusionforge.org.

Indeed, passing the forge base URL as an argument to the lister seems a better choice. I will update the diff accordingly next week.

Otherwise, I tried the lister several times with the alioth forge, but after a few calls to the SOAP API I got banned and could not issue any more requests (not even basic HTTP GET ones)
for a couple of hours. It looks like some request rate limiting is in place. I also hit the same kind of issue when trying to load (not list) repositories from the renater forge (however,
I was a little brutal on the number of repos being loaded concurrently).
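
One possible way to stay under such limits is to enforce a minimum delay between successive requests. The sketch below is an assumption about how that could be done, not something present in the diff; the `Throttle` class is hypothetical, and a suitable interval for each forge is unknown:

```python
import time


class Throttle:
    """Enforce a minimum delay between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Sleep just long enough to respect min_interval, then record the call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before each SOAP or HTTP request would space the requests out; the clock and sleep functions are injectable so the behaviour can be checked without actually sleeping.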

anlambert updated this revision to Diff 901.Nov 17 2017, 5:45 PM

Updating D267: add first implementation of FusionForge lister

Some improvements to the FusionForge lister:

  • pass forge url as argument to the lister instead of storing it in configuration
  • properly decode the strings describing projects, since SOAP replies from the FusionForge API are usually iso-8859-1 encoded (for instance on adullact.net, descriptions are written in French and contain many accented letters that were not decoded correctly)
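
The decoding fix in the second item can be sketched as follows. The helper name and the sample description are hypothetical; the only point illustrated is decoding the raw bytes with the encoding the forge actually uses before re-encoding them as UTF-8:

```python
def decode_soap_text(raw, declared_encoding="iso-8859-1"):
    """Decode raw bytes from a SOAP reply into a Python str.

    Already-decoded strings are returned unchanged.
    """
    if isinstance(raw, str):
        return raw
    return raw.decode(declared_encoding)


# Hypothetical project description as it would arrive on the wire.
raw_description = "Projet hébergé".encode("iso-8859-1")

description = decode_soap_text(raw_description)
utf8_bytes = description.encode("utf-8")  # safe to store or log as UTF-8
```

Decoding the same bytes as UTF-8 instead would raise `UnicodeDecodeError` here, because a lone `0xE9` byte is not valid UTF-8.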

For the moment, the lister only creates oneshot tasks. Next steps would be
to list only projects with changes since the last swh visit.

anlambert planned changes to this revision.Oct 12 2018, 2:03 PM

This needs some rework before review.

zack retitled this revision from add first implementation of FusionForge lister to [WIP] add first implementation of FusionForge lister.Oct 15 2018, 10:43 AM
zack removed 1 blocking reviewer(s): Reviewers.
anlambert abandoned this revision.Mar 16 2019, 3:30 AM

Time flies, and so do swh APIs and code hosting solutions... I honestly do not have any energy to spend on this, so I am closing it once and for all!