Sep 22 2021
Sep 21 2021
Sep 20 2021
Fix [1] first
Sep 17 2021
Sep 1 2021
In T3544#69746, @olasd wrote: I can see a few alternatives to using git:// over tcp:
- Give our swh bot accounts SSH keys, and use that to clone from GitHub over ssh.
The dulwich HTTP(s) support is implemented on top of urllib(3?).
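For reference, a minimal sketch of how dulwich picks its transport per URL scheme (illustrative only; the repository URLs are placeholders and this is not the loader's code). The HTTP(S) client is layered on urllib3, git:// speaks the plain pack protocol over TCP, and ssh:// would rely on the bot account's SSH key.

```python
# Minimal sketch: dulwich chooses a different client class per URL scheme.
from dulwich.client import get_transport_and_path

for url in (
    "https://github.com/example/repo.git",    # HTTP(S) client, urllib3-based
    "git://github.com/example/repo.git",      # TCPGitClient, plain git protocol
    "ssh://git@github.com/example/repo.git",  # SSHGitClient, needs an SSH key for GitHub
):
    client, path = get_transport_and_path(url)
    print(url, "->", type(client).__name__)
```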
Aug 10 2021
Another example in production: during the stop phase of a worker, the loader was alone on the server (with 12GB of RAM) and was OOM-killed:
```
Aug 10 08:53:24 worker05 python3[871]: [2021-08-10 08:53:24,745: INFO/ForkPoolWorker-1] Load origin 'https://github.com/evands/Specs' with type 'git'
Aug 10 08:54:17 worker05 python3[871]: [62B blob data]
Aug 10 08:54:17 worker05 python3[871]: [586B blob data]
Aug 10 08:54:17 worker05 python3[871]: [473B blob data]
Aug 10 08:54:29 worker05 python3[871]: Total 782419 (delta 6), reused 5 (delta 5), pack-reused 782401
Aug 10 08:54:29 worker05 python3[871]: [2021-08-10 08:54:29,044: INFO/ForkPoolWorker-1] Listed 6 refs for repo https://github.com/evands/Specs
Aug 10 08:59:21 worker05 kernel: [ 871] 1004 871 247194 161634 1826816 46260 0 python3
Aug 10 09:08:29 worker05 systemd[1]: swh-worker@loader_git.service: Unit process 871 (python3) remains running after unit stopped.
Aug 10 09:15:29 worker05 kernel: [ 871] 1004 871 412057 372785 3145728 0 0 python3
Aug 10 09:16:57 worker05 kernel: [ 871] 1004 871 823648 784496 6443008 0 0 python3
Aug 10 09:24:44 worker05 kernel: CPU: 2 PID: 871 Comm: python3 Not tainted 5.10.0-0.bpo.7-amd64 #1 Debian 5.10.40-1~bpo10+1
Aug 10 09:24:44 worker05 kernel: [ 871] 1004 871 2800000 2760713 22286336 0 0 python3
Aug 10 09:24:44 worker05 kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-2,oom_memcg=/system.slice/system-swh\x2dworker.slice,task_memcg=/system.slice/system-swh\x2dworker.slice/swh-worker@loader_git.service,task=python3,pid=871,uid=1004
Aug 10 09:24:44 worker05 kernel: Memory cgroup out of memory: Killed process 871 (python3) total-vm:11200000kB, anon-rss:11038844kB, file-rss:4008kB, shmem-rss:0kB, UID:1004 pgtables:21764kB oom_score_adj:0
Aug 10 09:24:45 worker05 kernel: oom_reaper: reaped process 871 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
```
Aug 9 2021
[3] possibly T2373
Aug 5 2021
It's exactly the same issue AFAIK
For information, @vlorentz opened a related issue in dulwich [1].
Aug 4 2021
Jul 13 2021
I wonder if it would be worth submitting these recursive origins with "save code now", so we can try to get submodule updates close to the update of the main repository.
Jul 12 2021
I also wonder if we have a somewhat common approach to handle the SVN externals as well.
I think this is worthwhile in general, at least for repositories that are still live.
May 6 2021
In T3311#64737, @vlorentz wrote: I think the only issue with (3) is not being retroactive.
This is a good idea, thanks for raising it.
Apr 14 2021
Apr 5 2021
Apr 4 2021
I am just here to say: swh-loader-git doesn't have a CONTRIBUTORS file. You may ask the contributor to add it as well :)
Mar 15 2021
Mar 5 2021
Mar 1 2021
Lowering task priority to normal, nothing critical here.
Feb 3 2021
After mulling this over with @zack, and looking at the starved worker logs for a while, I suspect that we're also being bitten by our (early, early) choice of using celery acks_late, which only acknowledges tasks when they're done: when a worker is OOM-killed, it will never send task acknowledgements to rabbitmq, which will keep re-sending it the tasks.
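For clarity, here is what that setting looks like in a minimal Celery app (a sketch for illustration only, not the production configuration; the app name and broker URL are placeholders): with acks_late, the acknowledgement is sent only after the task finishes, so an OOM-killed worker never acks and the broker re-delivers the task.

```python
# Sketch only, not the production config: with task_acks_late the worker
# acknowledges a task after it has run to completion, so a worker killed by
# the OOM killer never sends the ack and RabbitMQ re-delivers the same task.
from celery import Celery

app = Celery("swh", broker="amqp://")  # placeholder broker URL
app.conf.task_acks_late = True         # ack after execution, not on receipt
```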
My current workaround attempt is switching pack fetches from https://github.com/* to git://github.com/*, transparently in the git loader; dulwich's git over TCP transport doesn't have to do the same "double-buffering" as the https transport, so it should allow us to fail earlier (hopefully without involving the oom killer).
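The rewrite described above could look roughly like the following (a hypothetical sketch; the helper name is made up and this is not the actual loader change):

```python
# Hypothetical sketch of the workaround described above: rewrite GitHub
# HTTPS URLs to the plain git protocol before fetching packs.
def rewrite_github_url(url: str) -> str:
    prefix = "https://github.com/"
    if url.startswith(prefix):
        return "git://github.com/" + url[len(prefix):]
    return url


assert rewrite_github_url("https://github.com/example/repo") == "git://github.com/example/repo"
```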
Attempts at mitigating the issue:
Jan 13 2021
Jan 7 2021
In T2926#56128, @rdicosmo wrote: Thanks Antoine, any way to have this kind of errors also reported in the admin dashboard for save code now?
For the record, the load failure on 2021-01-04T17:05:11Z was due to a network error (found via Kibana):
Jan 6 2021
The repository was correctly ingested on 05 January 2021, 11:56 UTC.
Jan 4 2021
Oct 16 2020
Sep 24 2020
I don't think so; the loader is storing the data elsewhere, but it still doesn't write the archive type in each of these entries.
Sep 22 2020
I suspect that this is superseded by work done by @vlorentz for the extrinsic metadata store.
I am running some of the sources on production. I have submitted the guix and nixpkgs repositories through "save code now"; I could also add the linux kernel (if the visit is old enough).
Sep 21 2020
I have opened a "fresher" dashboard on kibana with the errors (grouped by error message as kibana filters; they need toggling on/off to actually see them) [1]
I think we need to cross-reference those filtered messages with sentry to actually have some context though... (as we don't really have any with that board...).
fwiw, loader-core v0.11.0 deployed in production.
In T2373#49214, @ardumont wrote: fwiw, loader-core v0.11.0 deployed in production.
Sep 20 2020
I can confirm that with the current master HEAD of swh-loader-core (452fa224f9ca635a979cf1a8e98c88bb560ca98a), loading of the Linux kernel repo no longer OOMs.
(It failed after ~24 hours, but apparently for unrelated reasons.)
Sep 18 2020
Status on this: loader-core has been tagged 0.11.0, which includes D3976.
Build is green
rebase
Build is green
Sep 17 2020
Adding pagination to these endpoints seems like overkill.
In T2373#48877, @ardumont wrote: So the content_missing call explodes mid-air client side (`"POST /content/missing HTTP/1.1" 200 9475383`, so the client received the data). It so happens that the content_missing api takes an unlimited number of byte ids as input [1] and then "tries" to stream the results to the client (the rpc layer in the middle makes that moot).
FTR, in a test setup I made a few days ago on docker, I had a git loader crunching ~28GB of RES mem (out of the 32 available on that machine). Not sure which repo it was ingesting, but it was on codeberg.
Very likely the same issue, thanks @ardumont!
Given what @olasd said in that issue (the ingestion logic having remained pretty much the same since ever), and that I can confirm linux.git was loading just fine on my laptop no more than a year ago, the increased memory usage probably comes from elsewhere.
Anyway, it looks like a potentially important issue, so I'm raising priority and also removing the association with the docker env (as you could also reproduce this on staging).
Possibly related to T2373.
Sep 16 2020
Sep 11 2020
Let's call it fixed (until further notice).