
Properly handle loading of repository sub-tree
Closed, Migrated

Description

Subversion allows checkout and export operations on a specific sub-tree of a repository, as shown below:

anlambert@carnavalet:/tmp$ svn info https://svn.code.sf.net/p/xvidcap/code/trunk/debian
Path: debian
URL: https://svn.code.sf.net/p/xvidcap/code/trunk/debian
Relative URL: ^/trunk/debian
Repository Root: https://svn.code.sf.net/p/xvidcap/code
Repository UUID: 521773ef-0118-0410-98fd-b0fa47ad2f46
Revision: 319
Node Kind: directory
Last Changed Author: charly4711
Last Changed Rev: 319
Last Changed Date: 2009-07-14 09:45:41 +0200 (mar., 14 juil. 2009)

anlambert@carnavalet:/tmp$ svn checkout https://svn.code.sf.net/p/xvidcap/code/trunk/debian xvidcap-debian
A    xvidcap-debian/rules
A    xvidcap-debian/changelog
A    xvidcap-debian/control
A    xvidcap-debian/postinst
A    xvidcap-debian/postrm
A    xvidcap-debian/copyright
A    xvidcap-debian/Makefile.am
A    xvidcap-debian/xvidcap.menu
A    xvidcap-debian/xvidcap.files
A    xvidcap-debian/compat
A    xvidcap-debian/bts
Checked out revision 319.

Currently, the subversion loader does not handle that case correctly because it relies on svnrdump.
Indeed, svnrdump filters out the repository paths outside of the sub-tree but still dumps every
commit of the root repository. The produced dump can therefore contain empty commits, namely
those that only modify paths outside of the sub-tree.

Below is an extract of the dump file generated by svnrdump dump https://svn.code.sf.net/p/xvidcap/code/trunk/debian.
It contains commits without any modification to the dumped sub-tree of the repository.

Revision-number: 15
Prop-content-length: 130
Content-length: 130

K 10
svn:author
V 10
charly4711
K 8
svn:date
V 27
2006-08-26T14:12:17.108970Z
K 7
svn:log
V 24
deleting ffmpeg-svn5528

PROPS-END

Revision-number: 16
Prop-content-length: 147
Content-length: 147

K 10
svn:author
V 10
charly4711
K 8
svn:date
V 27
2006-08-26T14:21:45.547817Z
K 7
svn:log
V 41
updated to new ffmpeg, loads of bugfixes

PROPS-END

Consequently, when loading data from such a dump, the subversion loader generates many empty
revisions that all target the same directory.
This can be observed on that repository, which was loaded on staging.
Its revision history contains many empty revisions that should not have been archived.
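
Such empty revisions can be spotted in the dump itself before anything is loaded. The sketch below is only an illustration, not the loader's code: it walks a dump stream record by record and reports the revisions that carry no Node-path record. The function names read_records and empty_revisions are hypothetical, and the parser assumes every record body is announced by a Content-length header, as in the extract above.

```python
from typing import BinaryIO, Dict, Iterator, List


def read_records(stream: BinaryIO) -> Iterator[Dict[str, str]]:
    """Yield the header block of each dump record (hypothetical helper).

    Record bodies are skipped using Content-length, so file contents that
    happen to look like headers are never misread as records.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank separator between records
        headers: Dict[str, str] = {}
        while line.strip():
            key, _, value = line.decode("utf-8").rstrip("\r\n").partition(": ")
            headers[key] = value
            line = stream.readline()
        # Skip the body (properties and/or text), if any.
        stream.read(int(headers.get("Content-length", 0)))
        yield headers


def empty_revisions(stream: BinaryIO) -> List[int]:
    """Return the revision numbers that have no node record at all."""
    empty: List[int] = []
    current = None
    node_count = 0
    for headers in read_records(stream):
        if "Revision-number" in headers:
            if current is not None and node_count == 0:
                empty.append(current)
            current = int(headers["Revision-number"])
            node_count = 0
        elif "Node-path" in headers:
            node_count += 1
    if current is not None and node_count == 0:
        empty.append(current)
    return empty
```

The function takes any binary stream, for instance sys.stdin.buffer when the output of svnrdump dump is piped into the script. Note that revision 0 of a dump never contains node records, so a caller may want to ignore it in the result.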

So the loader implementation should be improved to properly handle the loading of a sub-tree by
filtering out the commits that do not modify paths in it.
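
One possible shape for that filter is sketched below, under stated assumptions: each revision is available as a pair of its number and the list of paths it changed, expressed relative to the repository root (as svn log -v reports them). The names affects_subtree and filter_revisions and the sample data are hypothetical and are not taken from the loader.

```python
from typing import Iterable, Iterator, List, Tuple

Revision = Tuple[int, List[str]]  # (revision number, changed paths)


def affects_subtree(path: str, subtree: str) -> bool:
    """Tell whether a change on `path` can affect `subtree`.

    A change is relevant when it is on the sub-tree root, below it, or on
    one of its ancestors (e.g. a copy or deletion of a parent directory).
    """
    path = path.rstrip("/")
    subtree = subtree.rstrip("/")
    return (
        path == subtree
        or path.startswith(subtree + "/")
        or subtree.startswith(path + "/")
    )


def filter_revisions(
    revisions: Iterable[Revision], subtree: str
) -> Iterator[Revision]:
    """Keep only the revisions that change at least one relevant path."""
    for number, paths in revisions:
        if any(affects_subtree(p, subtree) for p in paths):
            yield number, paths


# Hypothetical sample data, for illustration only.
sample = [
    (1, ["/trunk/debian/rules"]),
    (2, ["/trunk/src/main.c"]),
    (3, ["/trunk/debian-old/rules"]),
    (4, ["/trunk"]),
]
kept = [number for number, _ in filter_revisions(sample, "/trunk/debian")]
```

Comparing against the sub-tree path followed by a slash avoids matching sibling directories that merely share a prefix, such as /trunk/debian-old in the sample.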