svn: Ensure proper incremental loading of a repository
Closed, Public

Authored by anlambert on Nov 19 2021, 10:52 AM.

Details

Summary

When incrementally loading a subversion repository, i.e. loading only the
new revisions committed since the last loading, we need to replay the whole
set of path modifications since the first revision in order to restore the
file states induced by svn properties set on those files.

For instance, if a file got the svn:eol-style property set in a revision
earlier than the last one loaded into the archive, we will miss that
information unless we replay path modifications from the first revision.

So replay path modifications from the first revision, and only start yielding
new data to archive once the revision to resume the loading from is reached.
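
To illustrate the idea, here is a minimal, self-contained sketch. It is not the
loader's actual code: the Change class, the export and replay functions, the
replay_from parameter and the toy three-revision history are all hypothetical
names made up for this example. It only models the effect of svn:eol-style on
exported content, and shows that replaying from the first revision while
yielding from the resume revision gives a different result than replaying from
the resume revision alone.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Change:
    """One path modification in a revision (hypothetical model)."""

    path: str
    content: Optional[bytes] = None  # new raw content, if the text changed
    props: Dict[str, str] = field(default_factory=dict)  # properties set here


def export(content: bytes, props: Dict[str, str]) -> bytes:
    """Apply the effect of svn:eol-style=CRLF on exported content."""
    if props.get("svn:eol-style") == "CRLF":
        normalized = content.replace(b"\r\n", b"\n")
        return normalized.replace(b"\n", b"\r\n")
    return content


def replay(
    revisions: List[List[Change]], resume_from: int, replay_from: int = 1
) -> Iterator[Tuple[int, Dict[str, bytes]]]:
    """Replay path modifications starting at `replay_from`, but only yield
    the exported tree for revisions >= `resume_from`."""
    contents: Dict[str, bytes] = {}
    props: Dict[str, Dict[str, str]] = {}
    for rev, changes in enumerate(revisions, start=1):
        if rev < replay_from:
            continue  # modifications (and properties) before this are lost
        for change in changes:
            if change.content is not None:
                contents[change.path] = change.content
            props.setdefault(change.path, {}).update(change.props)
        if rev >= resume_from:
            yield rev, {
                path: export(data, props.get(path, {}))
                for path, data in contents.items()
            }


# Toy history: revisions 1 and 2 are assumed to be already archived.
history = [
    [Change("a.txt", content=b"one\n")],                 # r1: add file
    [Change("a.txt", props={"svn:eol-style": "CRLF"})],  # r2: set property
    [Change("a.txt", content=b"one\ntwo\n")],            # r3: new revision
]

# Replay from r1, yield from r3: the property set in r2 is taken into account.
full = list(replay(history, resume_from=3))

# Replay only from r3: the property set in r2 is never seen.
naive = list(replay(history, resume_from=3, replay_from=3))
```

In this sketch, full contains revision 3 with CRLF line endings for a.txt,
while naive contains revision 3 with the raw LF content, which is the kind of
mismatch the change described above is meant to avoid.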

Depends on D6660

Diff Detail

Repository
rDLDSVN Subversion (SVN) loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D6661 (id=24202)

Could not rebase; Attempt merge onto 0237d07b17...

Updating 0237d07..ed0728c
Fast-forward
 swh/loader/svn/loader.py            |  2 +-
 swh/loader/svn/svn.py               | 11 +++++-
 swh/loader/svn/tests/test_loader.py | 74 ++++++++++++++++++++++++++++++++++++-
 3 files changed, 83 insertions(+), 4 deletions(-)
Changes applied before test
commit ed0728c8b02e893fe54c07e8260414e9b3479eb7
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Nov 18 22:44:52 2021 +0100

    svn: Ensure proper incremental loading of a repository
    
    When performing an incremental loading of a subversion repository, i.e. only
    load new revisions issued since the last loading, we need to replay the whole
    set of path modifications since the first revision in order to restore possible
    file states induced by setting svn properties on those files.
    
    For instance if a file got the svn:eol-style property set in a revision lesser
    than the last one loaded into the archive, we will miss that information if
    we do not start replaying the paths modifications since the first revision.
    
    So ensure to replay path modifications since first revision and start yielding
    new data to archive once we reached the revision to resume the loading from.

commit c9aaa314df27b0ce62df1ddf9008ec7dca97ad82
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Nov 18 19:58:20 2021 +0100

    loader: Ensure to reload from first revision when history got altered
    
    Previously, the reloading was performed starting revision 2.

See https://jenkins.softwareheritage.org/job/DLDSVN/job/tests-on-diff/197/ for more details.

ardumont added a subscriber: ardumont.

Nice catch.

svn...

This revision is now accepted and ready to land. Nov 19 2021, 11:51 AM