Oct 22 2021

anlambert closed T3645: cgit: handle "429 Too Many Requests" errors as Resolved.

Fixed in swh-lister v2.2.0 and deployed to production, closing this.

Oct 22 2021, 5:16 PM · Origin-CGit, CGit lister
anlambert added a revision to T3645: cgit: handle "429 Too Many Requests" errors: D6540: cgit: Enable to retry throttled HTTP requests.
Oct 22 2021, 3:15 PM · Origin-CGit, CGit lister
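
For readers unfamiliar with the technique named in D6540, retrying throttled requests can be sketched roughly as below. This is a minimal illustration, not the code that landed in swh-lister: the `fetch_with_retry` helper, the `send` callable, the `ThrottledError` exception and the back-off values are all hypothetical names and assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Dict[str, str], str]


class ThrottledError(Exception):
    """Raised when the server still answers 429 after the last attempt."""


def fetch_with_retry(
    send: Callable[[str], Response],
    url: str,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(url), retrying while the answer is 429 Too Many Requests.

    The wait before each retry is the Retry-After header when it holds a
    number of seconds, otherwise an exponential back-off (assumed policy).
    """
    for attempt in range(max_attempts):
        status, headers, body = send(url)
        if status != 429:
            # Any other status is returned to the caller untouched.
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** attempt
        sleep(delay)
    raise ThrottledError(f"{url} still throttled after {max_attempts} attempts")
```

Passing `send` and `sleep` as parameters keeps the sketch independent of any HTTP library and lets it be exercised without network access or real waiting.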

Oct 11 2021

vlorentz triaged T3645: cgit: handle "429 Too Many Requests" errors as Normal priority.
Oct 11 2021, 2:09 PM · Origin-CGit, CGit lister

Jan 29 2021

ardumont closed T2999: Optimize the number of HTTP requests sent by the cgit lister as Resolved.
Jan 29 2021, 5:36 PM · CGit lister
ardumont closed D4968: cgit: Compute origin urls out of a base git url when provided..
Jan 29 2021, 3:37 PM · CGit lister, Lister
swh-public-ci added a comment to D4968: cgit: Compute origin urls out of a base git url when provided..

Build is green

Jan 29 2021, 3:36 PM · CGit lister, Lister
ardumont updated the diff for D4968: cgit: Compute origin urls out of a base git url when provided..

Rebase

Jan 29 2021, 3:33 PM · CGit lister, Lister
anlambert accepted D4968: cgit: Compute origin urls out of a base git url when provided..

Looks good to me!

Jan 29 2021, 3:31 PM · CGit lister, Lister
ardumont added inline comments to D4968: cgit: Compute origin urls out of a base git url when provided..
Jan 29 2021, 3:31 PM · CGit lister, Lister
swh-public-ci added a comment to D4968: cgit: Compute origin urls out of a base git url when provided..

Build is green

Jan 29 2021, 3:31 PM · CGit lister, Lister
ardumont updated the diff for D4968: cgit: Compute origin urls out of a base git url when provided..

Adapt according to review:

  • Add more cgit instance samples
  • Rework docstring
  • Rework url computation a bit
Jan 29 2021, 3:27 PM · CGit lister, Lister
anlambert added inline comments to D4968: cgit: Compute origin urls out of a base git url when provided..
Jan 29 2021, 3:22 PM · CGit lister, Lister
ardumont added inline comments to D4968: cgit: Compute origin urls out of a base git url when provided..
Jan 29 2021, 3:15 PM · CGit lister, Lister
anlambert added inline comments to D4968: cgit: Compute origin urls out of a base git url when provided..
Jan 29 2021, 2:28 PM · CGit lister, Lister
swh-public-ci added a comment to D4968: cgit: Compute origin urls out of a base git url when provided..

Build is green

Jan 29 2021, 2:17 PM · CGit lister, Lister
ardumont updated the diff for D4968: cgit: Compute origin urls out of a base git url when provided..

Adapt according to review

Jan 29 2021, 2:14 PM · CGit lister, Lister
ardumont added inline comments to D4968: cgit: Compute origin urls out of a base git url when provided..
Jan 29 2021, 12:54 PM · CGit lister, Lister
anlambert requested changes to D4968: cgit: Compute origin urls out of a base git url when provided..

The current implementation will not work for most cases where the base cgit URL and the base clone URL differ (see inline comment).

Jan 29 2021, 12:26 PM · CGit lister, Lister
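
To illustrate the concern raised in this review, here is a rough sketch of deriving an origin URL from a separate base git URL instead of fetching each repository page. It is an assumption-laden illustration, not the code from D4968: the `compute_origin_url` function and all `example.org` URLs are hypothetical.

```python
from urllib.parse import urlparse


def compute_origin_url(cgit_base_url: str, repo_page_url: str, base_git_url: str) -> str:
    """Derive a clone URL from a repository page URL (hypothetical helper).

    The repository path is taken relative to the cgit listing URL, then
    appended to the base git URL, which may live on another host or path.
    """
    cgit_path = urlparse(cgit_base_url).path.rstrip("/") + "/"
    repo_path = urlparse(repo_page_url).path
    if not repo_path.startswith(cgit_path):
        raise ValueError(f"{repo_page_url} is not under {cgit_base_url}")
    # Drop the cgit prefix and surrounding slashes to get the relative path.
    relative = repo_path[len(cgit_path):].strip("/")
    return base_git_url.rstrip("/") + "/" + relative


# The listing is served under /cgit/ while clones use another host.
origin = compute_origin_url(
    "https://example.org/cgit/",
    "https://example.org/cgit/group/project.git/",
    "https://git.example.org/",
)
print(origin)  # https://git.example.org/group/project.git
```

Simply reusing the repository page URL as the origin URL would give `https://example.org/cgit/group/project.git/` here, which is the kind of mismatch the review comment points at.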
ardumont updated the summary of D4968: cgit: Compute origin urls out of a base git url when provided..
Jan 29 2021, 12:21 PM · CGit lister, Lister
ardumont added a revision to T2999: Optimize the number of HTTP requests sent by the cgit lister: D4968: cgit: Compute origin urls out of a base git url when provided..
Jan 29 2021, 11:41 AM · CGit lister
ardumont added a comment to T2999: Optimize the number of HTTP requests sent by the cgit lister.

I guess you mean that some cgit origin URLs previously loaded into the archive will not be the same if the approach in that task is implemented?

Jan 29 2021, 11:37 AM · CGit lister
anlambert added a comment to T2999: Optimize the number of HTTP requests sent by the cgit lister.

Analyzing the suggestions further, using the deprecated swh-lister cache db table as a data
point (production data) [1], 3 instances so far will sometimes generate wrong origin
URLs with the suggested approach.

Jan 29 2021, 11:05 AM · CGit lister
ardumont claimed T2999: Optimize the number of HTTP requests sent by the cgit lister.
Jan 29 2021, 10:54 AM · CGit lister
ardumont added a comment to T2999: Optimize the number of HTTP requests sent by the cgit lister.

Analyzing the suggestions further, using the deprecated swh-lister cache db table as a data
point (production data) [1], 3 instances so far will sometimes generate wrong origin
URLs with the suggested approach.

Jan 29 2021, 10:25 AM · CGit lister
ardumont updated the task description for T2999: Optimize the number of HTTP requests sent by the cgit lister.
Jan 29 2021, 9:30 AM · CGit lister
ardumont added a parent task for T2999: Optimize the number of HTTP requests sent by the cgit lister: T376: ingest git.eclipse.org repositories.
Jan 29 2021, 9:24 AM · CGit lister

Jan 28 2021

vsellier closed T2988: Improve cgit lister to add last modification date of the repos as Resolved.
Jan 28 2021, 2:10 PM · CGit lister, Lister

Jan 27 2021

anlambert triaged T2999: Optimize the number of HTTP requests sent by the cgit lister as Normal priority.
Jan 27 2021, 1:52 PM · CGit lister
vsellier added a revision to T2988: Improve cgit lister to add last modification date of the repos: D4954: cgit: Don't stop the listing when a repository page is not available.
Jan 27 2021, 12:42 PM · CGit lister, Lister

Jan 26 2021

vsellier added a revision to T2988: Improve cgit lister to add last modification date of the repos: D4953: cgit: Add support for last_update information during listing.
Jan 26 2021, 6:33 PM · CGit lister, Lister
vsellier changed the status of T2988: Improve cgit lister to add last modification date of the repos from Open to Work in Progress.
Jan 26 2021, 6:04 PM · CGit lister, Lister
ardumont closed T2984: Port cgit lister to the new Lister API as Resolved.
Jan 26 2021, 9:57 AM · Lister, CGit lister, Sprint 2021 01

Jan 25 2021

vsellier added a revision to T2984: Port cgit lister to the new Lister API: D4943: cgit lister: Add missing types on the init method.
Jan 25 2021, 6:33 PM · Lister, CGit lister, Sprint 2021 01
ardumont added a parent task for T2984: Port cgit lister to the new Lister API: T2442: Provide a unified API for listers to interact with the scheduler.
Jan 25 2021, 2:19 PM · Lister, CGit lister, Sprint 2021 01
vsellier triaged T2988: Improve cgit lister to add last modification date of the repos as Normal priority.
Jan 25 2021, 11:40 AM · CGit lister, Lister

Jan 22 2021

ardumont moved T2984: Port cgit lister to the new Lister API from in-progress to code review on the Sprint 2021 01 board.
Jan 22 2021, 3:53 PM · Lister, CGit lister, Sprint 2021 01
vsellier added a revision to T2984: Port cgit lister to the new Lister API: D4926: Port cgit lister to the new lister api.
Jan 22 2021, 3:53 PM · Lister, CGit lister, Sprint 2021 01
vsellier moved T2984: Port cgit lister to the new Lister API from Backlog to in-progress on the Sprint 2021 01 board.
Jan 22 2021, 11:09 AM · Lister, CGit lister, Sprint 2021 01
vsellier changed the status of T2984: Port cgit lister to the new Lister API from Open to Work in Progress.
Jan 22 2021, 11:09 AM · Lister, CGit lister, Sprint 2021 01

Oct 10 2019

ardumont updated the summary of D2077: cgit.tests: Check the tasks from the scheduler.
Oct 10 2019, 9:47 AM · Origin-CGit, CGit lister
ardumont closed D2077: cgit.tests: Check the tasks from the scheduler.
Oct 10 2019, 9:47 AM · Origin-CGit, CGit lister

Oct 9 2019

douardda accepted D2077: cgit.tests: Check the tasks from the scheduler.
Oct 9 2019, 6:11 PM · Origin-CGit, CGit lister
swh-public-ci added a comment to D2077: cgit.tests: Check the tasks from the scheduler.

Build is green
See https://jenkins.softwareheritage.org/job/DLS/job/tox/400/ for more details.

Oct 9 2019, 6:04 PM · Origin-CGit, CGit lister
ardumont updated the diff for D2077: cgit.tests: Check the tasks from the scheduler.
  • Introduce swh-listers fixture in that diff
  • Use swh.core.pytest_plugins fixture (requires swh.core >= 0.0.73)
Oct 9 2019, 6:01 PM · Origin-CGit, CGit lister

Oct 7 2019

douardda accepted D2077: cgit.tests: Check the tasks from the scheduler.
Oct 7 2019, 10:13 AM · Origin-CGit, CGit lister

Oct 6 2019

ardumont updated the summary of D2077: cgit.tests: Check the tasks from the scheduler.
Oct 6 2019, 9:56 AM · Origin-CGit, CGit lister

Oct 5 2019

swh-public-ci added a comment to D2077: cgit.tests: Check the tasks from the scheduler.

Build is green
See https://jenkins.softwareheritage.org/job/DLS/job/tox/387/ for more details.

Oct 5 2019, 6:27 PM · Origin-CGit, CGit lister
ardumont added projects to D2077: cgit.tests: Check the tasks from the scheduler: CGit lister, Origin-CGit.
Oct 5 2019, 6:25 PM · Origin-CGit, CGit lister

Sep 11 2019

ardumont closed T1861: cgit lister: Adapt lister to deal with inconsistent `git clone uri` pattern as Resolved.

D1929 took care of it ;)

Sep 11 2019, 4:12 PM · CGit lister, Lister

Jun 29 2019

ardumont updated the task description for T1861: cgit lister: Adapt lister to deal with inconsistent `git clone uri` pattern.
Jun 29 2019, 3:57 PM · CGit lister, Lister
ardumont updated the task description for T1861: cgit lister: Adapt lister to deal with inconsistent `git clone uri` pattern.
Jun 29 2019, 9:15 AM · CGit lister, Lister
ardumont added a comment to T1861: cgit lister: Adapt lister to deal with inconsistent `git clone uri` pattern.

I'd be inclined to use a combination of solutions:

  • having the listing policy determined at cgit instance lister initialization (3.)
  • Since 1. has already been done in the past, use that instead as a fallback (eclipse and freedesktop might be popular enough to sustain the load for a tad more requests than the current bare listing we do).
Jun 29 2019, 9:13 AM · CGit lister, Lister
ardumont renamed T1861: cgit lister: Adapt lister to deal with inconsistent `git clone uri` pattern from cgit lister adaptations to deal with cgit.freedesktop.org to cgit lister: Adapt lister to deal with inconsistent `git clone uri` pattern.
Jun 29 2019, 9:10 AM · CGit lister, Lister

Jun 28 2019

ardumont updated the task description for T1861: cgit lister: Adapt lister to deal with inconsistent `git clone uri` pattern.
Jun 28 2019, 8:36 PM · CGit lister, Lister
ardumont updated the task description for T1861: cgit lister: Adapt lister to deal with inconsistent `git clone uri` pattern.
Jun 28 2019, 8:11 PM · CGit lister, Lister
ardumont triaged T1861: cgit lister: Adapt lister to deal with inconsistent `git clone uri` pattern as Normal priority.
Jun 28 2019, 8:11 PM · CGit lister, Lister
nahimilega closed T1659: rewrite the CGit lister as a proper lister as Resolved by committing rDLSb972a2a88d25: swh.lister.cgit.
Jun 28 2019, 5:22 PM · CGit lister
anlambert updated the task description for T1835: List/Ingest major cgit instances.
Jun 28 2019, 11:44 AM · Lister
ardumont added a comment to T1835: List/Ingest major cgit instances.

http://git.upsilon.cc/ (@zack's git repositories) might be relevant.

Jun 28 2019, 11:38 AM · Lister

Jun 20 2019

nahimilega updated the task description for T1835: List/Ingest major cgit instances.
Jun 20 2019, 2:20 PM · Lister
ardumont triaged T1835: List/Ingest major cgit instances as Normal priority.
Jun 20 2019, 1:37 PM · Lister

Jun 19 2019

nahimilega updated subscribers of T1659: rewrite the CGit lister as a proper lister.
Jun 19 2019, 1:26 PM · CGit lister
vlorentz added a revision to T1659: rewrite the CGit lister as a proper lister: D1610: swh.lister.cgit.
Jun 19 2019, 1:23 PM · CGit lister

Jun 17 2019

zack added a comment to T1659: rewrite the CGit lister as a proper lister.

Thanks for your interest in working on this @nahimilega, it would be very useful to move forward on a bunch of pending ingestions, including Tor!

Jun 17 2019, 10:01 PM · CGit lister
nahimilega added a comment to T1659: rewrite the CGit lister as a proper lister.

but i'd say convert to python. depending on xmllint is very brittle... i already had to tweak the thing once to make it work at all, and the pipeline is kind of nasty. i think you will have to import some HTML parser at some point anyways, so you might as well bite that bullet now.

Jun 17 2019, 7:43 PM · CGit lister
anarcat added a comment to T1659: rewrite the CGit lister as a proper lister.

not that I get a vote in this, but i'd say convert to python. depending on xmllint is very brittle... i already had to tweak the thing once to make it work at all, and the pipeline is kind of nasty. i think you will have to import some HTML parser at some point anyways, so you might as well bite that bullet now.

Jun 17 2019, 7:23 PM · CGit lister
nahimilega added a comment to T1659: rewrite the CGit lister as a proper lister.

I would be pretty interested in integrating it with other listers and moving it to the common repo. I guess we can proceed in two ways.

  1. Use the script that we already have. Just run this script via python to get the list of repos.
Jun 17 2019, 7:14 PM · CGit lister

Jun 12 2019

anarcat added a comment to T1659: rewrite the CGit lister as a proper lister.

i couldn't find the time to work through the developer setup and the lister tutorial, so I used the shell script to generate a list of projects for tor gitweb.

Jun 12 2019, 10:20 PM · CGit lister
anarcat removed a parent task for T1659: rewrite the CGit lister as a proper lister: T1798: ingest Tor project source code (meta task).
Jun 12 2019, 9:32 PM · CGit lister
anarcat added a parent task for T1659: rewrite the CGit lister as a proper lister: T1799: ingest Tor git repositories.
Jun 12 2019, 9:31 PM · CGit lister
anarcat added a parent task for T1659: rewrite the CGit lister as a proper lister: T1798: ingest Tor project source code (meta task).
Jun 12 2019, 9:30 PM · CGit lister

Apr 18 2019

zack added a parent task for T1659: rewrite the CGit lister as a proper lister: T1451: ingest GNU Savannah Git repositories.
Apr 18 2019, 10:38 AM · CGit lister
zack triaged T1659: rewrite the CGit lister as a proper lister as Low priority.
Apr 18 2019, 10:38 AM · CGit lister

May 13 2016

olasd changed the visibility for CGit lister.
May 13 2016, 5:23 PM

Feb 22 2016

olasd set the image for CGit lister to Unknown Object (File).
Feb 22 2016, 8:17 PM

Oct 2 2015

zack removed a member for CGit lister: zack.
Oct 2 2015, 10:54 PM
zack created CGit lister.
Oct 2 2015, 10:54 PM