Status | Assigned | Task
---|---|---
Unknown Object (Maniphest Task) | |
Migrated | gitlab-migration | T367 ingest Google Code repositories
Migrated | gitlab-migration | T617 ingest Google Code Subversion repositories
Unknown Object (Maniphest Task) | |
Unknown Object (Maniphest Task) | |
Unknown Object (Maniphest Task) | |
Migrated | gitlab-migration | T328 svn / subversion loader
Migrated | gitlab-migration | T386 compare svn loader performances with git-svn
Migrated | gitlab-migration | T410 Compare swh-loader-svn which injects in swh-storage remotely with bare git-svn which clones on disk
Event Timeline
I see a factor ranging from 2 for short version histories to 8 for long
histories w.r.t. git-svn, which is difficult to analyse like this.
Is there a means of getting a detailed analysis of the running times of
the different phases of the svn-swh importer, separating cloning,
processing and storing time?
> I see a factor ranging from 2 for short version histories to 8 for long
> histories w.r.t. git-svn, which is difficult to analyse like this.
Indeed.
> Is there a means of getting a detailed analysis of the running times of
> the different phases of the svn-swh importer, separating cloning,
> processing and storing time?
Unfortunately no.
I don't know how to compare fairly those tools and have the details.
swh-svn checks out one revision at a time, computes the git hashes, and sends the result to swh for injection (at each revision).
git-svn checks out one revision at a time, computes the git hashes, and stores the result on disk.
Also, I cannot really inhibit the swh-svn injection because we walk the revision tree at each revision.
Thus, there are optimizations in the discussion between swh backend and the loader...
I guess the only way to be fair with swh-svn would be to also inject the git clones using swh-loader-git and then compare the sum of git svn times (git clone + loader-git injection) with the swh-svn time...
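The comparison proposed above can be written down as a small calculation. This is only a sketch: the function name `slowdown_factor` and the timings are hypothetical and made up for illustration, they are not measurements from this task.

```python
def slowdown_factor(swh_svn_seconds, git_svn_clone_seconds, loader_git_seconds):
    """Ratio of the swh-svn time to the combined git-svn pipeline time.

    The baseline is the sum proposed in the comment above:
    git-svn clone time + swh-loader-git injection time.
    """
    baseline = git_svn_clone_seconds + loader_git_seconds
    if baseline <= 0:
        raise ValueError("baseline time must be positive")
    return swh_svn_seconds / baseline


# Made-up timings in seconds, for illustration only.
factor = slowdown_factor(
    swh_svn_seconds=800.0,
    git_svn_clone_seconds=100.0,
    loader_git_seconds=300.0,
)
print(f"swh-svn / (git-svn + loader-git) = {factor:.1f}")
```

A factor close to 1 would mean both pipelines take about the same time once injection is counted on both sides.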
The good point in favor of swh-svn though is that it does the job without falling apart.
git-svn alone does not handle the 'huge' repository well...
> Unfortunately no.
> I don't know how to compare fairly those tools and have the details.
Now, I do ^^
> Also, I cannot really inhibit the swh-svn injection because we walk the revision tree at each revision.
> Thus, there are optimizations in the discussion between swh backend and the loader...
I was wrong, I can inhibit that ^^
Running a batch on the current 5 repositories with the sending of data to storage inhibited.
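For reference, here is a minimal sketch of how a per-revision loop could accumulate time per phase and skip the storage step. It is not the swh-loader-svn code: `PhaseTimer`, `load_repository`, the `send_to_storage` flag, and the stand-in callables are all hypothetical names chosen for illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start


def load_repository(revisions, checkout, compute_hashes, send, timer,
                    send_to_storage=True):
    """Walk revisions one at a time, timing each phase separately.

    Returns the number of revisions actually sent to storage.
    """
    sent = 0
    for rev in revisions:
        with timer.phase("checkout"):
            tree = checkout(rev)
        with timer.phase("hash"):
            objects = compute_hashes(tree)
        if send_to_storage:  # assumption: sending can be switched off
            with timer.phase("send"):
                send(objects)
            sent += 1
    return sent


# Stand-in callables; a real loader would talk to svn and to swh-storage.
timer = PhaseTimer()
sent = load_repository(
    revisions=range(1, 4),
    checkout=lambda rev: f"tree-{rev}",
    compute_hashes=lambda tree: [hash(tree)],
    send=lambda objects: None,
    timer=timer,
    send_to_storage=False,
)
for name, seconds in timer.totals.items():
    print(f"{name}: {seconds:.6f}s")
```

With `send_to_storage=False` only the checkout and hash phases are recorded, which is the part that is directly comparable with what git-svn does on disk.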