SVN loader: Add efficient incremental loader based on partial dumps
Closed, Public

Authored by anlambert on Sep 20 2018, 8:21 PM.

Details

Summary

This diff adds a new loader class, SWHSvnLoaderFromRemoteDump, which
loads svn repositories incrementally and efficiently.

This is a first draft in order to share the idea and the results.

The loader is based on the creation of dump files generated with the
rsvndump tool. It is not packaged in Debian so you must get it from
http://rsvndump.sourceforge.net/ first, then compile and install it.
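
As a rough illustration of the partial-dump idea, the revision range of a
repository could be cut into consecutive chunks, each of which would then be
dumped separately. This is only a sketch: revision_ranges and chunk_size are
hypothetical names, not part of this diff, and how each range is handed to
rsvndump is deliberately left out.

```python
from typing import Iterator, Tuple


def revision_ranges(start: int, end: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first, last) revision ranges covering start..end.

    Hypothetical helper: each range would correspond to one partial dump.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    first = start
    while first <= end:
        last = min(first + chunk_size - 1, end)
        yield first, last
        first = last + 1
```

Under this assumption, an incremental run would pass the revision following
the last one already loaded as start, so only new revisions are dumped.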

Related T1161

Test Plan

To do: compare loading a sample svn repository in one pass with
loading the same repository in multiple passes.
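
The comparison could look roughly like the following. It is a sketch only:
FakeLoader is a hypothetical in-memory stand-in, not the real loader or
storage, and it simply records which revisions it has seen.

```python
class FakeLoader:
    """Hypothetical stand-in that records loaded revisions in memory."""

    def __init__(self):
        self.loaded = {}  # revision number -> content

    def load(self, revisions, up_to):
        """Load every revision not loaded yet, up to `up_to` inclusive."""
        start = max(self.loaded, default=0) + 1
        for rev in range(start, up_to + 1):
            self.loaded[rev] = revisions[rev]


def load_in_passes(revisions, stops):
    """Run one load per entry of `stops` and return the final state."""
    loader = FakeLoader()
    for stop in stops:
        loader.load(revisions, stop)
    return loader.loaded


# Toy repository history: revision number -> content
history = {rev: "content of r%d" % rev for rev in range(1, 11)}

one_pass = load_in_passes(history, [10])
multi_pass = load_in_passes(history, [3, 7, 10])
```

A real test would replace the toy history with a sample repository and
compare what ends up in storage after each scenario.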

Diff Detail

Repository
rDLDSVN Subversion (SVN) loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Well, sounds good already.
That matches my understanding of T1161 \m/

I'd split the __init__ into multiple function calls (retrieve dump, split dump, ...) to clarify steps.

Also beware that, when actually implementing the tests, the calls to storage from __init__ will make things awkward.
I think moving those to prepare might be better ;)

Following my own advice on D434 ;)

Improve implementation of the loader:

  • spread previous code into logical methods
  • move the dump file creation to the prepare method (following the advice in D434)
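
The resulting structure might look something like this. Names other than
__init__ and prepare are hypothetical, and the dump step is a placeholder
rather than the real call; the point is only that construction has no side
effects, which keeps tests simple.

```python
class DumpBasedLoaderSketch:
    """Hypothetical outline; not the actual SWHSvnLoaderFromRemoteDump."""

    def __init__(self, svn_url):
        # Only record configuration here: no dump creation, no storage calls.
        self.svn_url = svn_url
        self.dump_path = None
        self.steps = []

    def prepare(self):
        # Side effects live here, split into small logical methods.
        self.dump_path = self.create_dump()
        self.steps.append("prepared")

    def create_dump(self):
        # Placeholder: the real loader would produce a dump file here.
        self.steps.append("dump created")
        return "/tmp/placeholder.dump"


loader = DumpBasedLoaderSketch("svn://example.org/repo")
steps_after_init = list(loader.steps)
loader.prepare()
```

With this layout a test can build the loader freely and replace create_dump
(or the storage calls) before invoking prepare.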

Next step: write tests!

This revision was not accepted when it landed; it landed in state Needs Review. (Sep 24 2018, 12:11 PM)
This revision was automatically updated to reflect the committed changes.