ra: Send modified objects only to storage after replaying a revision

Description

Previously, all contents and directories of the reconstructed filesystem
were sent to the storage after replaying a svn revision.
Filtering out the objects already archived, so that only new contents and
directories get written, was then delegated to the storage filtering proxy.

Proceeding like this has a huge performance impact on the loading of large
subversion repositories, as the same large sets of objects to archive are
filtered again and again after each revision replay.

This commit performs the object filtering at the loader level instead of
delegating that task to the storage filtering proxy.
This is done by maintaining the set of paths added or modified by a given
revision while replaying it. As we use the svn_ra API, that set of paths
can be computed easily and reliably.
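
The idea can be illustrated with a minimal sketch. It is not the loader's
actual code: the ModifiedPathTracker class, the objects_to_send function and
the flat path-to-object mapping are hypothetical names used for illustration
only. The sketch also assumes that the ancestor directories of a touched path
count as modified, since a directory identifier depends on its entries.

```python
from typing import Dict, List, Set


class ModifiedPathTracker:
    """Hypothetical helper recording paths touched while replaying a revision."""

    def __init__(self) -> None:
        self.modified_paths: Set[str] = set()

    def start_revision(self) -> None:
        # Forget the paths recorded for the previously replayed revision.
        self.modified_paths.clear()

    def touch(self, path: str) -> None:
        # Record the added or modified path itself.
        self.modified_paths.add(path)
        # Assumption: every ancestor directory changes too, up to the root,
        # which is represented here by the empty string.
        while "/" in path:
            path = path.rsplit("/", 1)[0]
            self.modified_paths.add(path)
        self.modified_paths.add("")


def objects_to_send(tree: Dict[str, str], modified: Set[str]) -> List[str]:
    """Return only the objects located at modified paths, in path order.

    `tree` is a simplified stand-in for the reconstructed filesystem:
    it maps each path to an opaque object identifier.
    """
    return [obj for path, obj in sorted(tree.items()) if path in modified]
```

With such a tracker, a revision that only modifies trunk/src/main.c would
send that content and its three ancestor directories to the storage, and
nothing else from the reconstructed filesystem.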

This change significantly speeds up the overall loading of a subversion
repository.

Related to T3839