Software Heritage

Git loader
Active · Public

Members

  • This project does not have any members.

Watchers

  • This project does not have any watchers.


Recent Activity

Fri, Jun 19

olasd updated subscribers of T2459: skip exogenous branches when ingesting github/gitlab git repositories.

The heuristic you're talking about only applies to branches whose names start with refs/. All other branches are passed through unscathed, I think.

Fri, Jun 19, 5:44 PM · Git loader
zack added a comment to T2459: skip exogenous branches when ingesting github/gitlab git repositories.

As a related data point, the current graph export code applies the following heuristic to decide which outbound edges from snapshot nodes to emit:

  • keep branch names starting with refs/heads/
  • keep branch names starting with refs/tags/
  • drop everything else
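
A minimal sketch of the three rules above, for illustration only. It is not the graph export code itself: the names `KEPT_PREFIXES`, `keep_branch` and `filter_branches` are hypothetical, and branch names are modelled as plain strings here, which is an assumption.

```python
# Hypothetical illustration of the keep/drop heuristic described above.
# Names and the use of str (rather than bytes) are assumptions.
KEPT_PREFIXES = ("refs/heads/", "refs/tags/")


def keep_branch(name: str) -> bool:
    """Return True when the branch name starts with one of the kept prefixes."""
    return name.startswith(KEPT_PREFIXES)


def filter_branches(names):
    """Keep refs/heads/ and refs/tags/ branches, drop everything else."""
    return [name for name in names if keep_branch(name)]


if __name__ == "__main__":
    branches = ["refs/heads/master", "refs/tags/v1.0", "refs/pull/42/head", "HEAD"]
    print(filter_branches(branches))
```

Note that under these rules a name such as `HEAD`, which does not start with refs/, is dropped as well.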
Fri, Jun 19, 1:35 PM · Git loader
olasd closed T2410: Check and complete the gitorious.org import as Resolved.

We still need to try to ingest the zeq2 repo, but that can be done in a followup task.

Fri, Jun 19, 10:20 AM · Git loader, Origin-Gitorious
zack updated the task description for T2459: skip exogenous branches when ingesting github/gitlab git repositories.
Fri, Jun 19, 9:55 AM · Git loader
zack triaged T2459: skip exogenous branches when ingesting github/gitlab git repositories as Normal priority.
Fri, Jun 19, 9:50 AM · Git loader

May 30 2020

olasd added a comment to T2410: Check and complete the gitorious.org import.

The following repositories failed to import. Their on-disk structure is either completely empty, or only contains refs (no actual git objects stored):

May 30 2020, 12:58 PM · Git loader, Origin-Gitorious

May 29 2020

olasd added a comment to T2410: Check and complete the gitorious.org import.

After the first (naive, I guess) pass, 1470 repositories are still missing.

May 29 2020, 5:16 PM · Git loader, Origin-Gitorious

May 19 2020

olasd changed the status of T2410: Check and complete the gitorious.org import from Open to Work in Progress.

The code for loading git repositories from disk hasn't been run in production in a while, so I've decided to run the imports of the missing repos manually.

May 19 2020, 5:02 PM · Git loader, Origin-Gitorious
olasd added a comment to T2410: Check and complete the gitorious.org import.

We also have a single origin with no full visit:

May 19 2020, 12:07 PM · Git loader, Origin-Gitorious
olasd added a comment to T2410: Check and complete the gitorious.org import.

After dumping all origins starting with https://gitorious.org/ in the archive:

May 19 2020, 12:04 PM · Git loader, Origin-Gitorious
rdicosmo triaged T2410: Check and complete the gitorious.org import as High priority.
May 19 2020, 9:49 AM · Git loader, Origin-Gitorious

Apr 28 2020

ardumont added a comment to T2373: staging: git loader: failure to ingest huge repository (e.g. nixpkgs).

Currently running this again with debug logs...

Apr 28 2020, 1:20 PM · Git loader
ardumont added a comment to T2373: staging: git loader: failure to ingest huge repository (e.g. nixpkgs).

Currently running this again with debug logs...
Thanks for the input.

Apr 28 2020, 12:10 PM · Git loader
olasd added a comment to T2373: staging: git loader: failure to ingest huge repository (e.g. nixpkgs).

Reading this again, and seeing that the workers have 16GB of RAM, there's something weird going on that's not related to the volume of the packfile (which is 2GB max).

Apr 28 2020, 11:53 AM · Git loader
olasd added a comment to T2373: staging: git loader: failure to ingest huge repository (e.g. nixpkgs).

The base logic of the git loader regarding packfiles hasn't really been touched since it was first implemented: it has never really been profiled or optimized with respect to its memory usage. This issue isn't specific to the staging infra; it's only more salient there because the workers have been made with tight constraints.

Apr 28 2020, 11:49 AM · Git loader

Apr 22 2020

ardumont added a comment to T2373: staging: git loader: failure to ingest huge repository (e.g. nixpkgs).

[2] I will add some swap to that node to check if that goes further with it.

Apr 22 2020, 4:31 PM · Git loader
ardumont updated the task description for T2373: staging: git loader: failure to ingest huge repository (e.g. nixpkgs).
Apr 22 2020, 3:37 PM · Git loader
ardumont renamed T2373: staging: git loader: failure to ingest huge repository (e.g. nixpkgs) from staging: loader git: failure to ingest repository to staging: git loader: failure to ingest huge repository (e.g. nixpkgs).
Apr 22 2020, 3:33 PM · Git loader
ardumont triaged T2373: staging: git loader: failure to ingest huge repository (e.g. nixpkgs) as Normal priority.
Apr 22 2020, 3:33 PM · Git loader

Apr 21 2020

zack closed T1195: git loader: fail to ingest our own hello world repository as Resolved.
Apr 21 2020, 11:46 AM · Git loader

Apr 15 2020

ardumont closed D3019: git.loader: fix failing origin visit update step due to uninitialized internal state variables.
Apr 15 2020, 11:49 AM · Git loader
anlambert accepted D3019: git.loader: fix failing origin visit update step due to uninitialized internal state variables.

I got the exact same situation when I updated the mercurial loader to swh-model objects.

Apr 15 2020, 11:42 AM · Git loader
swh-public-ci added a comment to D3019: git.loader: fix failing origin visit update step due to uninitialized internal state variables.

Build is green

Apr 15 2020, 10:27 AM · Git loader
ardumont updated the summary of D3019: git.loader: fix failing origin visit update step due to uninitialized internal state variables.
Apr 15 2020, 10:26 AM · Git loader
ardumont updated the diff for D3019: git.loader: fix failing origin visit update step due to uninitialized internal state variables.

Improve the git commit

Apr 15 2020, 10:25 AM · Git loader
ardumont retitled D3019: git.loader: fix failing origin visit update step due to uninitialized internal state variables from git.loader: Initialize internal state in __init__ to git.loader: fix failing origin visit update step due to uninitialized internal state variables.
Apr 15 2020, 10:24 AM · Git loader
ardumont updated the summary of D3019: git.loader: fix failing origin visit update step due to uninitialized internal state variables.
Apr 15 2020, 10:24 AM · Git loader

Jan 22 2020

olasd added a comment to T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps.

I agree that this may be a useful optimization for some upstreams where getting the state of the remote repository is expensive.
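
A rough sketch of what such a check could look like, under stated assumptions: the function name `should_skip_visit` is hypothetical, the timestamp is assumed to be a UTC ISO 8601 string of the form `2020-01-21T13:33:00Z`, and "old enough" is interpreted here as "not newer than the last successful visit", which the task itself does not specify.

```python
# Hypothetical sketch of the proposed optimization; not loader code.
# Assumption: "old enough" means pushed_at is not newer than our last visit.
from datetime import datetime, timezone
from typing import Optional


def should_skip_visit(pushed_at: str, last_visit: Optional[datetime]) -> bool:
    """Return True when the upstream timestamp shows no push since last_visit."""
    if last_visit is None:
        # Never visited before: there is nothing to compare against, so load it.
        return False
    pushed = datetime.strptime(pushed_at, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return pushed <= last_visit


if __name__ == "__main__":
    last = datetime(2020, 1, 21, 13, 33, tzinfo=timezone.utc)
    print(should_skip_visit("2020-01-20T08:00:00Z", last))
    print(should_skip_visit("2020-01-22T08:00:00Z", last))
```

The benefit depends on the timestamp being cheaper to obtain than the state of the remote repository, which is the case the comment above has in mind.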

Jan 22 2020, 1:25 PM · Git loader

Jan 21 2020

zack updated the task description for T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps.
Jan 21 2020, 1:34 PM · Git loader
zack triaged T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps as Normal priority.
Jan 21 2020, 1:33 PM · Git loader
zack created T2242: GitHub loading optimization: skip repos with old enough updated_at/pushed_at timestamps.
Jan 21 2020, 1:33 PM · Git loader

Nov 19 2019

ardumont added a comment to T2094: KeyError: 'content:add' in swh.loader.core.loader.

@douardda fixed that behavior in loader.core D2299

Nov 19 2019, 11:28 AM · Git loader
douardda closed T2094: KeyError: 'content:add' in swh.loader.core.loader as Resolved.

This has been fixed by cb42fea77070

Nov 19 2019, 11:26 AM · Git loader
ardumont added a comment to T2094: KeyError: 'content:add' in swh.loader.core.loader.

Reproduced.

Nov 19 2019, 10:53 AM · Git loader

Nov 15 2019

zack triaged T2094: KeyError: 'content:add' in swh.loader.core.loader as High priority.
Nov 15 2019, 11:23 PM · Git loader
robguinness updated the task description for T2094: KeyError: 'content:add' in swh.loader.core.loader.
Nov 15 2019, 6:36 PM · Git loader
robguinness created T2094: KeyError: 'content:add' in swh.loader.core.loader.
Nov 15 2019, 6:34 PM · Git loader

Nov 5 2019

moranegg added a comment to T2059: Generate (swh) releases from all git tags.

Note that this doesn't solve the question of pulling release notes from e.g. GitHub release pages, which is something that would need to be done by some other component (T17 comes to mind).

Nov 5 2019, 1:35 PM · Git loader
olasd updated the task description for T2059: Generate (swh) releases from all git tags.
Nov 5 2019, 12:00 PM · Git loader
olasd triaged T2059: Generate (swh) releases from all git tags as Normal priority.
Nov 5 2019, 11:58 AM · Git loader

Oct 1 2019

ardumont edited P320 loader errors per loader type: ~/.config/swh/kibana/group-by.yml.
Oct 1 2019, 10:06 AM · Git loader, Mercurial loader, PyPI loader

Sep 30 2019

ardumont added a comment to T1280: git origins: latest failure reports.

To ease the analysis, here is an aggregate of the 09/2019 latest failures:

Sep 30 2019, 7:47 PM · Git loader
ardumont added a comment to T1280: git origins: latest failure reports.

New dashboards with latest errors as of 09/2019 [1]

Sep 30 2019, 6:22 PM · Git loader

Sep 10 2019

olasd closed T1988: Upgrade dulwich on celery workers as Resolved.

I've backported dulwich 0.19.13-1 to our stretch repo, upgraded all workers and they're restarting.

Sep 10 2019, 12:10 PM · System administration, Git loader

Sep 7 2019

ardumont added a comment to T1988: Upgrade dulwich on celery workers.

And nice work on the investigation and the fix within dulwich ;)

Sep 7 2019, 9:41 AM · System administration, Git loader
ardumont added a project to T1988: Upgrade dulwich on celery workers: System administration.
Sep 7 2019, 9:41 AM · System administration, Git loader
anlambert triaged T1988: Upgrade dulwich on celery workers as Normal priority.
Sep 7 2019, 12:35 AM · System administration, Git loader

Sep 6 2019

ardumont closed T1987: loader-git: failure when saving git pack as Resolved.
Sep 6 2019, 9:23 PM · Git loader
ardumont renamed T1987: loader-git: failure when saving git pack from loader-git: failure when trying to save git pack to loader-git: failure when saving git pack.
Sep 6 2019, 9:19 PM · Git loader
ardumont renamed T1987: loader-git: failure when saving git pack from loader-git: failure when trying to save data package to loader-git: failure when trying to save git pack.
Sep 6 2019, 9:18 PM · Git loader