
Compare swh-loader-svn which injects in swh-storage remotely with bare git-svn which clones on disk
Closed, Resolved, Public

Event Timeline

I see a factor ranging from 2 for short version histories to 8 for long
histories w.r.t. git-svn, which is difficult to analyse like this.

Is there a means of getting a detailed analysis of the running times of
the different phases of the svn-swh importer, separating cloning,
processing and storing time?

ardumont added a comment. Edited May 23 2016, 5:42 PM

> I see a factor ranging from 2 for short version histories to 8 for long
> histories w.r.t. git-svn, which is difficult to analyse like this.

Indeed.

> Is there a means of getting a detailed analysis of the running times of
> the different phases of the svn-swh importer, separating cloning,
> processing and storing time?

Unfortunately no.
I don't know how to compare those tools fairly and get the details.

swh-svn checks out one revision at a time, computes the git hashes, and sends the result to swh for injection (at each revision).
git-svn checks out one revision at a time, computes the git hashes, and stores the result on disk.
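
For reference, the "git hash compute" step that both tools perform can be sketched as below. This is only an illustration of how git derives an object identifier (SHA-1 over a `<type> <size>\0` header followed by the payload); the function names are hypothetical and this is not the actual swh-loader-svn or git-svn code.

```python
import hashlib


def git_object_id(obj_type: str, payload: bytes) -> str:
    """Return the git SHA-1 identifier of an object.

    Git hashes the header "<type> <size>\\0" followed by the raw payload.
    (Function name is hypothetical, for illustration only.)
    """
    header = f"{obj_type} {len(payload)}\0".encode("ascii")
    return hashlib.sha1(header + payload).hexdigest()


def git_blob_id(content: bytes) -> str:
    """Identifier of a file's content, as `git hash-object` would compute it."""
    return git_object_id("blob", content)


if __name__ == "__main__":
    # The empty blob has a well-known identifier in git.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Both tools pay this cost at every revision; the difference discussed here is what happens to the result afterwards (remote injection versus a write to local disk).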

Also, I cannot really inhibit the swh-svn injection because we walk the revision tree at each revision.
Thus, there are optimizations in the exchange between the swh backend and the loader...

I guess the only way to be fair to swh-svn would be to also inject the git clones using swh-loader-git, and then compare the sum of the git-svn times (git clone + loader-git injection) with the swh-svn time...
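
One possible way to get the per-phase breakdown asked about above would be to wrap each phase in a small timer. The sketch below is an assumption about how that could look, not how the loader is written: `PhaseTimer`, `load_revisions` and the `checkout`, `compute_hashes` and `store` callables are all hypothetical stand-ins.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulate wall-clock time per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Add the elapsed time even if the phase raised an exception.
            self.totals[name] += time.perf_counter() - start


def load_revisions(revisions, checkout, compute_hashes, store, timer):
    """Run the three phases for each revision and return the time totals.

    The three callables are placeholders for the real checkout, hashing
    and storage steps.
    """
    for rev in revisions:
        with timer.phase("checkout"):
            tree = checkout(rev)
        with timer.phase("hashing"):
            objects = compute_hashes(tree)
        with timer.phase("storing"):
            store(objects)
    return dict(timer.totals)


if __name__ == "__main__":
    totals = load_revisions(
        revisions=range(1, 4),
        checkout=lambda rev: {"rev": rev},
        compute_hashes=lambda tree: [tree],
        store=lambda objects: None,  # no-op: storage inhibited
        timer=PhaseTimer(),
    )
    for name, seconds in totals.items():
        print(f"{name}: {seconds:.6f}s")
```

Passing a no-op `store` gives the time without injection, and comparing the "storing" total between runs would show how much of the gap with git-svn is due to the remote storage.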


The good point in favor of swh-svn though is that it does the job without falling apart.
git-svn alone does not handle the 'huge' repository well...

> Unfortunately no.
> I don't know how to compare those tools fairly and get the details.

Now I do ^^

> Also, I cannot really inhibit the swh-svn injection because we walk the revision tree at each revision.
> Thus, there are optimizations in the exchange between the swh backend and the loader...

I was wrong, I can inhibit that ^^

I am running a batch on the current 5 repositories with the sending of data to storage inhibited.

cf. T386#6859 entitled 'comparison - with swh-storage'

ardumont closed this task as Resolved. Jun 13 2016, 4:02 PM