
loader: Clean replay directory before post_load operation
ClosedPublic

Authored by anlambert on Dec 1 2021, 6:55 PM.

Details

Summary

The post_load operation will export the last loaded revision to a
new temporary directory to check possible revision divergence.

However, the reconstructed filesystem for that revision still exists
in another temporary directory after all revisions have been replayed.

So make sure to clean up the latter before post_load to free some disk space
and avoid possible "No space left on device" errors.
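
The ordering described above can be sketched roughly as follows. This is only an illustration, not the loader's actual code: `replay_then_check`, `root_dir` and the `replay-` prefix are hypothetical names, and a plain `tempfile.mkdtemp` call stands in for the export done during post_load.

```python
import os
import shutil
import tempfile


def replay_then_check(root_dir):
    """Hypothetical flow: replay into one temp dir, remove it, then export to another."""
    # Directory holding the filesystem reconstructed while replaying revisions.
    replay_dir = tempfile.mkdtemp(prefix="replay-", dir=root_dir)
    with open(os.path.join(replay_dir, "file.txt"), "w") as f:
        f.write("content of the last replayed revision\n")

    # Remove only the replay directory so its disk space is released
    # before the export; root_dir itself is left in place.
    shutil.rmtree(replay_dir, ignore_errors=True)

    # Stand-in for the post_load export into a new temporary directory.
    check_dir = tempfile.mkdtemp(prefix="check-revision-", dir=root_dir)
    return replay_dir, check_dir
```

With this ordering, the two directory trees never occupy disk space at the same time.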

Diff Detail

Repository
rDLDSVN Subversion (SVN) loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D6721 (id=24412)

Rebasing onto d0b14d9d08...

Current branch diff-target is up to date.
Changes applied before test
commit 5cbfe6d03fe0c11c483072fd5836048eaa57f114
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Dec 1 18:50:33 2021 +0100

    loader: Clean replay directory before post_load operation
    
    The post_load operation will export the last loaded revision to a
    new temporary directory to check possible revision divergence.
    
    However the reconstructed filesystem for that revision still exists
    in another temporary directory after all revisions have been replayed.
    
    So ensure to clean that latter before post_load to gain some disk space
    and avoid possible "No space left on device" errors.

See https://jenkins.softwareheritage.org/job/DLDSVN/job/tests-on-diff/207/ for more details.

ardumont added a subscriber: ardumont.

good idea

a suggestion inline

swh/loader/svn/loader.py
394–397

I think that's what is already done in the cleanup method...
But since cleanup is only called after post_load, too much stuff is still on disk at the post_load step.
So this ^ should be enough, shouldn't it?

397

wasn't it already done?

This revision is now accepted and ready to land. Dec 2 2021, 9:45 AM
swh/loader/svn/loader.py
394–397

I tried to use cleanup in the first place but got that error when running tests:

Traceback (most recent call last):
  File "/home/anlambert/swh/swh-environment/swh-loader-core/swh/loader/core/loader.py", line 353, in load
    self.post_load()
  File "/home/anlambert/swh/swh-environment/swh-loader-svn/swh/loader/svn/tests/test_loader.py", line 1822, in post_load
    return super().post_load(success)
  File "/home/anlambert/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 555, in post_load
    self._check_revision_divergence(
  File "/home/anlambert/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 330, in _check_revision_divergence
    checked_dir_id = self.swh_revision_hash_tree_at_svn_revision(rev)
  File "/home/anlambert/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 162, in swh_revision_hash_tree_at_svn_revision
    local_dirname, local_url = self.svnrepo.export_temporary(revision)
  File "/home/anlambert/swh/swh-environment/swh-loader-svn/swh/loader/svn/svn.py", line 204, in export_temporary
    local_dirname = tempfile.mkdtemp(
  File "/usr/lib/python3.9/tempfile.py", line 498, in mkdtemp
    _os.mkdir(file, 0o700)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-anlambert/pytest-296/test_loader_svn_empty_local_di0/swh.loader.svn.ky9d3ggx-109839/check-revision-6.3xd63zy4'
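
A plausible reading of that traceback, given that the check-revision path sits under the `swh.loader.svn.*` directory, is that the parent temporary directory no longer existed when `tempfile.mkdtemp` was called. The minimal sketch below reproduces that behaviour in isolation; the function name is hypothetical and the prefixes only mimic those seen in the path above.

```python
import shutil
import tempfile


def mkdtemp_after_parent_removed():
    """Return the exception raised when mkdtemp targets a removed parent."""
    # Parent temporary directory (prefix mimics the one in the traceback).
    parent = tempfile.mkdtemp(prefix="swh.loader.svn.")

    # Assumption: a full cleanup removes the parent directory itself.
    shutil.rmtree(parent)

    try:
        # mkdtemp does not recreate missing parents, so this call fails.
        tempfile.mkdtemp(prefix="check-revision-", dir=parent)
    except FileNotFoundError as exc:
        return exc
    return None
```

Under that assumption, removing only the replay subdirectory rather than the whole temporary root avoids the error.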