
ingest Google Code Subversion repositories
Closed, Migrated

Description

Note:

  • Map each origin URL to the old Google Code one. The scheme is http://<project-name>.googlecode.com/svn/
  • We keep loader-svn's current behavior: a revision hash divergence or an svn:externals property triggers an error, which is logged and stops the loading.
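The URL mapping in the first note can be sketched as follows. This is a minimal illustration, not the code used for the ingestion: the function name `googlecode_origin_url` and the validation rule (letters, digits and dashes only) are assumptions, and only the URL scheme itself comes from the note above.

```python
import re

# URL scheme taken from the note above.
GOOGLECODE_SVN_TEMPLATE = "http://{project}.googlecode.com/svn/"

# Assumption: a project name must be usable as a single hostname label.
_PROJECT_NAME_RE = re.compile(r"[A-Za-z0-9-]+")


def googlecode_origin_url(project_name: str) -> str:
    """Build the historical Google Code SVN origin URL for a project.

    Hypothetical helper: raises ValueError when the name cannot be
    embedded in a hostname.
    """
    name = project_name.strip()
    if not _PROJECT_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid project name: {project_name!r}")
    return GOOGLECODE_SVN_TEMPLATE.format(project=name)


if __name__ == "__main__":
    print(googlecode_origin_url("my-project"))
```

Rejecting unexpected names early keeps malformed origins out of the archive instead of failing later during loading.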

Related Objects

Status    Assigned            Task
Migrated  gitlab-migration

Event Timeline

Currently running on swh's internal infrastructure workers.

ardumont changed the task status from Open to Work in Progress. Jan 11 2017, 12:33 PM
zack renamed this task from Ingest googlecode's svn dump repositories to ingest Google Code Subversion repositories. Feb 12 2017, 6:15 PM
zack added a project: Restricted Project.
zack moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board. Feb 12 2017, 6:37 PM
zack lowered the priority of this task from High to Normal. Feb 14 2017, 9:52 AM

Command used to trigger the production of tasks:

cat INDEX.shuffle.svndump | ./bin/list-svndump-urls | SWH_WORKER_INSTANCE=swh_loader_svn python3 -u -m swh.loader.svn.producer svn-archive --visit-date 'Tue, 3 May 2016 17:16:32 +0200'

where:

zack removed projects: Restricted Project, SVN Loader. Apr 5 2017, 2:04 PM

An update: this is still work in progress.

status

~168.5k repositories to ingest out of 575k repositories.

These are already scheduled in the loader-svn queue.
The ingestion is on standby (see below).

issues

We regularly hit the following chain of failures:

  1. out of RAM -> 2. the worker is killed without a chance to run its cleanup step -> 3. out of disk space -> 4. at least one worker (the one without disk space) sits idle, consuming the remaining queued jobs without doing anything useful.

Note: an important implementation detail is that loader-svn works on disk.
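Steps 2 to 4 of the chain above could be mitigated at worker start-up, as sketched below. This is only an illustration under stated assumptions, not what loader-svn does: the helper names, the `swh.loader.svn-` directory prefix and the thresholds are all hypothetical. The idea is to remove working directories left behind by a killed worker, then refuse to take jobs when free disk space is too low.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import List, Optional

# Hypothetical prefix for the loader's on-disk working directories.
WORKDIR_PREFIX = "swh.loader.svn-"


def remove_stale_workdirs(
    root: str = tempfile.gettempdir(),
    prefix: str = WORKDIR_PREFIX,
    max_age_seconds: float = 24 * 3600,
    now: Optional[float] = None,
) -> List[Path]:
    """Delete working directories older than max_age_seconds.

    A worker killed by the OOM killer cannot run its own cleanup, so the
    next start-up does it on its behalf.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in Path(root).iterdir():
        if not (entry.is_dir() and entry.name.startswith(prefix)):
            continue
        if now - entry.stat().st_mtime > max_age_seconds:
            shutil.rmtree(entry, ignore_errors=True)
            removed.append(entry)
    return removed


def has_enough_disk(path: str, min_free_bytes: int) -> bool:
    """Return True when the filesystem holding path has enough free space."""
    return shutil.disk_usage(path).free >= min_free_bytes


if __name__ == "__main__":
    remove_stale_workdirs()
    # A worker would only start consuming jobs when this check passes.
    print(has_enough_disk(tempfile.gettempdir(), 10 * 1024**3))
```

Checking free space before consuming a job is what would stop a full worker from draining the queue without doing useful work.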

workaround in progress

For now, @olasd and I are provisioning VMs on beaubourg (almost there) so that the disk-based workers (git-disk + svn) run there.

The beaubourg hypervisor is not used as much as it could be, while louvre is quite the opposite.

This is only a workaround for now, as other tasks have higher priority (I'll link them here once they exist :)

It got restarted 2 weeks ago (Monday 18th September 2017).
It just finished (Monday 2nd October 2017).

T676 now kicks in.

Reopened since a subtask (or child task) is still open (T676).

09:41:46      +zack | (as a rule of thumb, parent tasks should not be closed if there are still child tasks open, but there might be exceptions)
zack claimed this task.
gitlab-migration changed the task status from Resolved to Migrated. Sun, Jan 8, 4:18 PM
gitlab-migration claimed this task.
gitlab-migration changed the status of subtask T328: svn / subversion loader from Resolved to Migrated.