Git loader
Active · Public

Recent Activity

Oct 16 2020

vlorentz added projects to T2666: GitHub releases not available in record: Data Model, Git loader.
Oct 16 2020, 2:28 PM · Git loader, Data Model
vlorentz merged T2666: GitHub releases not available in record into T2059: Generate (swh) releases from all git tags.
Oct 16 2020, 2:26 PM · Git loader

Sep 24 2020

vlorentz added a comment to T340: add missing "archive_type" property to revision.metadata JSON for all imported dsc.

I don't think so; the loader is storing the data elsewhere, but still doesn't write the archive type in each of these entries

Sep 24 2020, 11:10 AM · Git loader

Sep 22 2020

olasd closed T340: add missing "archive_type" property to revision.metadata JSON for all imported dsc as Wontfix.

I suspect that this is superseded by work done by @vlorentz for the extrinsic metadata store.

Sep 22 2020, 6:23 PM · Git loader
olasd placed T996: Load git origins with missing revisions again up for grabs.
Sep 22 2020, 4:43 PM · Git loader
ardumont added a comment to T2373: git loader OOM when loading huge repository.

running some of the sources on production. I have "save code now" guix and
nixpkgs repositories, I could also add the linux kernel (if the visit is old
enough).

Sep 22 2020, 9:45 AM · Git loader

Sep 21 2020

ardumont added a comment to T2616: Analyze the launchpad repository failures.

I have opened a "fresher" dashboard on kibana with the errors (grouped by error message as kibana filters; they need toggling on/off to actually see them) [1]
I think we need to cross those filtering messages with sentry to actually have some context though... (as we don't really have any with that board...).

Sep 21 2020, 7:27 PM · Git loader
ardumont added a comment to T2373: git loader OOM when loading huge repository.

fwiw, loader-core v0.11.0 deployed in production.

Sep 21 2020, 3:57 PM · Git loader
vlorentz closed T2373: git loader OOM when loading huge repository as Resolved.
Sep 21 2020, 3:35 PM · Git loader
zack added a comment to T2373: git loader OOM when loading huge repository.

fwiw, loader-core v0.11.0 deployed in production.

Sep 21 2020, 2:43 PM · Git loader
ardumont added a comment to T2373: git loader OOM when loading huge repository.

fwiw, loader-core v0.11.0 deployed in production.

Sep 21 2020, 2:38 PM · Git loader
ardumont renamed T2616: Analyze the launchpad repository failures from Analyze the gitea repository (codeberg) failures to Analyze the launchpad repository failures.
Sep 21 2020, 1:44 PM · Git loader
ardumont triaged T2616: Analyze the launchpad repository failures as Normal priority.
Sep 21 2020, 1:44 PM · Git loader

Sep 20 2020

zack added a comment to T2373: git loader OOM when loading huge repository.

I can confirm that with the current master HEAD of swh-loader-core (452fa224f9ca635a979cf1a8e98c88bb560ca98a), loading of the Linux kernel repo no longer OOMs.
(It failed after ~24 hours, but apparently for unrelated reasons.)

Sep 20 2020, 2:31 PM · Git loader

Sep 18 2020

ardumont changed the status of T2373: git loader OOM when loading huge repository from Open to Work in Progress.
Sep 18 2020, 3:42 PM · Git loader
ardumont added a revision to T2373: git loader OOM when loading huge repository: D3988: loaders: Move the proxy storage filter after the buffer proxy.
Sep 18 2020, 3:15 PM · Git loader
ardumont added a revision to T2373: git loader OOM when loading huge repository: D3986: loaders: Move the proxy storage filter after the buffer proxy.
Sep 18 2020, 3:11 PM · Git loader
ardumont added a comment to T2373: git loader OOM when loading huge repository.

Status on this. Loader-core has been tagged 0.11.0 which includes D3976.

Sep 18 2020, 2:57 PM · Git loader
swh-public-ci added a comment to D3978: tests: Don't check the number of created 'person' objects..

Build is green

Sep 18 2020, 11:19 AM · Git loader
vlorentz closed D3978: tests: Don't check the number of created 'person' objects..
Sep 18 2020, 11:18 AM · Git loader
vlorentz updated the diff for D3978: tests: Don't check the number of created 'person' objects..

rebase

Sep 18 2020, 11:18 AM · Git loader
swh-public-ci added a comment to D3978: tests: Don't check the number of created 'person' objects..

Build is green

Sep 18 2020, 11:16 AM · Git loader
ardumont updated the summary of D3978: tests: Don't check the number of created 'person' objects..
Sep 18 2020, 11:15 AM · Git loader

Sep 17 2020

vlorentz added a revision to T2373: git loader OOM when loading huge repository: D3976: loader: Stop materializing full lists of objects to be stored..
Sep 17 2020, 2:32 PM · Git loader
vlorentz added a comment to T2373: git loader OOM when loading huge repository.

Adding pagination to these endpoints seems quite overkill.

Sep 17 2020, 2:31 PM · Git loader
olasd added a comment to T2373: git loader OOM when loading huge repository.

So content_missing call explodes mid-air client side (`"POST /content/missing
HTTP/1.1" 200 9475383` so client received the data).

It so happens that the content_missing api takes an unlimited number of
byte ids as input [1]. It then "tries" to stream the results to the client
(the rpc layer in the middle makes that moot).

Sep 17 2020, 2:03 PM · Git loader
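
The unbounded input described in the comment above can be capped on the client side. The following is a minimal sketch under stated assumptions; it is not taken from swh-storage or swh-loader-core. The names `batched`, `missing_in_batches`, and the `query_missing` callable are hypothetical stand-ins for whatever actually performs the content_missing request, and the default batch size is arbitrary.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(ids: Iterable[bytes], size: int) -> Iterator[List[bytes]]:
    """Yield lists of at most `size` ids without materializing the whole input."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(ids)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def missing_in_batches(
    ids: Iterable[bytes],
    query_missing: Callable[[List[bytes]], Iterable[bytes]],
    size: int = 1000,
) -> Iterator[bytes]:
    """Ask `query_missing` (hypothetical) about bounded batches of ids.

    Only one batch of ids and its result are held at a time, so the size of
    each request and each response is bounded by `size`.
    """
    for batch in batched(ids, size):
        yield from query_missing(batch)
```

Because both functions are generators, a caller can feed them a lazy iterable of ids and consume the missing ids one at a time, so neither side of the call builds a full list in memory.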
zack merged T2607: git loader OOM when loading the linux kernel repo into T2373: git loader OOM when loading huge repository.
Sep 17 2020, 9:53 AM · Git loader
zack renamed T2373: git loader OOM when loading huge repository from staging: git loader: failure to ingest huge repository (e.g. nixpkgs) to git loader OOM when loading huge repository.
Sep 17 2020, 9:53 AM · Git loader
ardumont added a comment to T2373: git loader OOM when loading huge repository.

So content_missing call explodes mid-air client side (`"POST /content/missing
HTTP/1.1" 200 9475383` so client received the data).

Sep 17 2020, 9:48 AM · Git loader
douardda added a comment to T2373: git loader OOM when loading huge repository.

FTR, in a test setup I made a few days ago on docker, I had a git loader crunching ~28GB of RES mem (on 32 available on that machine). Not sure which repo it was ingesting, but it was on codeberg.

Sep 17 2020, 9:10 AM · Git loader
zack renamed T2607: git loader OOM when loading the linux kernel repo from git loader OOM when loading the linux kernel repo (at least in the docker dev environment) to git loader OOM when loading the linux kernel repo.
Sep 17 2020, 9:03 AM · Git loader
zack raised the priority of T2607: git loader OOM when loading the linux kernel repo from Normal to High.

Very likely the same issue, thanks @ardumont !
Given what @olasd said in that issue (the ingestion logic having remained pretty much the same since ever), and that I can confirm linux.git was loading just fine on my laptop no more than a year ago, the increased memory usage probably comes from elsewhere.
Anyway, it looks like a potentially important issue, so I'm raising priority and also removing the association with the docker env (as you could also reproduce this on staging).

Sep 17 2020, 9:03 AM · Git loader
ardumont added a comment to T2607: git loader OOM when loading the linux kernel repo.

possibly related to T2373.

Sep 17 2020, 8:51 AM · Git loader

Sep 16 2020

zack updated the task description for T2607: git loader OOM when loading the linux kernel repo.
Sep 16 2020, 8:28 PM · Git loader
zack triaged T2607: git loader OOM when loading the linux kernel repo as Normal priority.
Sep 16 2020, 8:26 PM · Git loader

Sep 11 2020

douardda closed T1342: Handle annotated tag with no tagger, a subtask of T1280: git origins: latest failure reports, as Resolved.
Sep 11 2020, 2:30 PM · Git loader
douardda closed T1342: Handle annotated tag with no tagger as Resolved.

Let's call it fixed (until further notice).

Sep 11 2020, 2:30 PM · Git loader
olasd placed T1280: git origins: latest failure reports up for grabs.
Sep 11 2020, 2:25 PM · Git loader

Jul 27 2020

ardumont closed T2481: Migrate dvcs loader tests code to pytest as Resolved.
Jul 27 2020, 3:21 PM · SVN Loader, Mercurial loader, Git loader, Core Loader
ardumont added a parent task for T2481: Migrate dvcs loader tests code to pytest: T2221: Development workflow & code quality.
Jul 27 2020, 3:20 PM · SVN Loader, Mercurial loader, Git loader, Core Loader

Jul 20 2020

ardumont closed T2483: tests: Make check-snapshot utility test function recursively check targetted object exists, a subtask of T2481: Migrate dvcs loader tests code to pytest, as Resolved.
Jul 20 2020, 9:17 AM · SVN Loader, Mercurial loader, Git loader, Core Loader
ardumont closed T2483: tests: Make check-snapshot utility test function recursively check targetted object exists as Resolved.
Jul 20 2020, 9:17 AM · SVN Loader, Mercurial loader, Git loader, Core Loader
ardumont closed T2484: Move sharable fixtures out of conftest into a dedicated pytest plugin, a subtask of T2481: Migrate dvcs loader tests code to pytest, as Resolved.
Jul 20 2020, 9:16 AM · SVN Loader, Mercurial loader, Git loader, Core Loader
ardumont closed T2484: Move sharable fixtures out of conftest into a dedicated pytest plugin as Resolved.
Jul 20 2020, 9:16 AM · SVN Loader, Mercurial loader, Git loader, Core Loader

Jul 17 2020

ardumont added a revision to T2484: Move sharable fixtures out of conftest into a dedicated pytest plugin: D3551: tests: Reuse pytest fixtures from swh.loader.core.
Jul 17 2020, 12:12 PM · SVN Loader, Mercurial loader, Git loader, Core Loader
ardumont added a revision to T2484: Move sharable fixtures out of conftest into a dedicated pytest plugin: D3550: tests: Reuse pytest fixtures from swh.loader.core.
Jul 17 2020, 12:04 PM · SVN Loader, Mercurial loader, Git loader, Core Loader
ardumont added a revision to T2484: Move sharable fixtures out of conftest into a dedicated pytest plugin: D3549: tests: Reuse pytest fixtures from swh.loader.core.
Jul 17 2020, 12:04 PM · SVN Loader, Mercurial loader, Git loader, Core Loader

Jul 16 2020

ardumont closed T2488: Drop loader.core BaseLoaderTest and BaseLoaderStorageTest, a subtask of T2481: Migrate dvcs loader tests code to pytest, as Resolved.
Jul 16 2020, 3:18 PM · SVN Loader, Mercurial loader, Git loader, Core Loader
ardumont closed T2488: Drop loader.core BaseLoaderTest and BaseLoaderStorageTest as Resolved.
Jul 16 2020, 3:18 PM · SVN Loader, Mercurial loader, Git loader, Core Loader