git loader OOM when loading the linux kernel repo
Closed, Duplicate (Public)

Description

I tried to load the Linux kernel repo into a locally running swh stack in the docker development environment.

That used to work just fine, and I've used it in the past to have a non-trivial archive available locally for experimentation (this time in the context of T2600).
It no longer works for me: the git loader gets OOM-killed after reaching ~30 GB of virtual memory (VIRT) and ~20 GB of resident memory (RES).
It happened twice (the second time when the scheduler retried the same job), so I don't think it's a temporary glitch.

I'm filing this partly because it would be nice to let people load a "serious repo" locally in their tests, even if that is not a goal per se.
But also because, if it takes (at least) that much RAM to load the Linux kernel repo, there may well be other non-artificial repos out there that make our workers explode.
(Unless this is specific to some settings of the docker dev repo, in which case it would still be nice to tune them to avoid OOM kills.)

Event Timeline

zack triaged this task as Normal priority. (Sep 16 2020, 8:26 PM)
zack created this task.
zack updated the task description.
zack raised the priority of this task from Normal to High. (Sep 17 2020, 9:03 AM)
zack removed a project: Docker environment.
zack added a subscriber: olasd.

Very likely the same issue, thanks @ardumont!
Given what @olasd said in that issue (the ingestion logic has remained pretty much the same from the start), and that I can confirm linux.git was loading just fine on my laptop no more than a year ago, the increased memory usage probably comes from elsewhere.
Anyway, this looks like a potentially important issue, so I'm raising the priority and also removing the association with the docker env (as you could also reproduce this on staging).

zack renamed this task from "git loader OOM when loading the linux kernel repo (at least in the docker dev environment)" to "git loader OOM when loading the linux kernel repo". (Sep 17 2020, 9:03 AM)