
docker-dev: fix vault test which fails the build
Closed, Migrated

Description

The build is broken [1] [2].

A recent change introduced a symlink (docs/README.rst) in the swh-core module.
The vault test uses this module to build a tarball from the vault,
and the length assertion now fails on that symlink.

So either the vault is fine and the test needs to be improved,
or the vault builds the tarball without taking the symlink case into account.
In any case, a fix is in order.

[1]

05:02:51 =================================== FAILURES ===================================
05:02:51 _____________________________ test_vault_directory _____________________________
05:02:51
05:02:51 scheduler_host = <testinfra.host.Host docker://docker_swh-scheduler_run_c030d9fb56f4>
05:02:51 git_origin = 'https://forge.softwareheritage.org/source/swh-core'
05:02:51
05:02:51     def test_vault_directory(scheduler_host, git_origin):
05:02:51         # retrieve the root directory of the master branch of the ingested git
05:02:51         # repository (by the git_origin fixture)
05:02:51         visit = apiget(f'origin/{quote_plus(git_origin)}/visit/latest')
05:02:51         snapshot = apiget(f'snapshot/{visit["snapshot"]}')
05:02:51         rev_id = snapshot["branches"]["refs/heads/master"]["target"]
05:02:51         revision = apiget(f'revision/{rev_id}')
05:02:51         dir_id = revision['directory']
05:02:51
05:02:51         # now cook it
05:02:51         cook = apiget(f'vault/directory/{dir_id}/', 'POST')
05:02:51         assert cook['obj_type'] == 'directory'
05:02:51         assert cook['obj_id'] == dir_id
05:02:51         assert cook['fetch_url'].endswith(f'vault/directory/{dir_id}/raw/')
05:02:51
05:02:51         # while it's cooking, get the directory tree from the archive
05:02:51         directory = getdirectory(dir_id)
05:02:51
05:02:51         # retrieve the cooked tar file
05:02:51         resp = pollapi(f'vault/directory/{dir_id}/raw')
05:02:51         tarf = tarfile.open(fileobj=io.BytesIO(resp.content))
05:02:51
05:02:51         # and check the tarfile seems ok wrt. 'directory'
05:02:51         assert tarf.getnames()[0] == dir_id
05:02:51         tarfiles = {t.name: t for t in tarf.getmembers()}
05:02:51
05:02:51         for fname, fdesc in directory:
05:02:51             tfinfo = tarfiles.get(join(dir_id, fname))
05:02:51             assert tfinfo, f"Missing path {fname} in retrieved tarfile"
05:02:51             if fdesc['type'] == 'file':
05:02:51 >               assert fdesc['length'] == tfinfo.size, \
05:02:51                     f"File {fname}: length mismatch"
05:02:51 E               AssertionError: File docs/README.rst: length mismatch
05:02:51 E               assert 13 == 0
05:02:51 E                +  where 0 = <TarInfo '24e386f2cd7ec6629b703e0bc6ece831bfe683b4/docs/README.rst' at 0x7f39203acb38>.size

[2] https://jenkins.softwareheritage.org/view/all/job/swh-docker-dev/772/consoleFull

Event Timeline

ardumont renamed this task from "docker-deb: fix vault test which fails the build" to "docker-dev: fix vault test which fails the build". Mar 29 2021, 9:43 AM
ardumont triaged this task as Normal priority.
ardumont created this task.

So either the vault is fine and the test needs to be improved.
Or the vault builds the tarball without taking into account the symlink case.

Based on my tests, the vault properly handles symlinks when cooking a directory; see the
ls output after extracting a cooked tarball:

anlambert@carnavalet:/tmp/9f9bc9da9acdd39caa1ff8e9f41b32c7e7c9a7a1/docs$ ls -lrt
total 24
lrwxrwxrwx 1 anlambert anlambert   13 avril  8 11:42 README.rst -> ../README.rst
-rw-r--r-- 1 anlambert anlambert   39 avril  8 11:42 Makefile
-rw-r--r-- 1 anlambert anlambert  131 avril  8 11:42 index.rst
-rw-r--r-- 1 anlambert anlambert   43 avril  8 11:42 conf.py
-rw-r--r-- 1 anlambert anlambert  396 avril  8 11:42 cli.rst
drwxr-xr-x 2 anlambert anlambert 4096 avril  8 11:44 _templates
drwxr-xr-x 2 anlambert anlambert 4096 avril  8 11:44 _static

The issue comes from the vault test in the docker environment: it looks like a symlink has a size of zero bytes
inside the tarball, while it has a size of 13 bytes (the length of its target path, ../README.rst) once extracted on the filesystem.
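
The size difference can be reproduced outside the docker environment with a minimal sketch like the one below. It only uses Python's standard tarfile module; the in-memory tarball and the variable names are illustrative and are not taken from the vault or from the test suite.

```python
import io
import tarfile

# Build a tiny tarball in memory containing a single symlink entry
# (an illustrative stand-in for docs/README.rst -> ../README.rst).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tarf:
    link = tarfile.TarInfo(name="docs/README.rst")
    link.type = tarfile.SYMTYPE
    link.linkname = "../README.rst"
    tarf.addfile(link)  # no file object: a symlink entry carries no data

# Read the tarball back and inspect the symlink member.
buf.seek(0)
with tarfile.open(fileobj=buf) as tarf:
    member = tarf.getmember("docs/README.rst")

symlink_size = member.size            # size recorded in the tar header
target_length = len(member.linkname)  # length of the link target path
print(member.issym(), symlink_size, target_length)  # prints: True 0 13
```

The link target is stored in the member's linkname field rather than as file data, which is why the size recorded for the member is 0.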

==================================================================================================== FAILURES ====================================================================================================
______________________________________________________________________________________________ test_vault_directory ______________________________________________________________________________________________

scheduler_host = <testinfra.host.Host docker://docker_swh-scheduler_run_3f4ad1d98220>, git_origin = 'https://forge.softwareheritage.org/source/swh-core'

    def test_vault_directory(scheduler_host, git_origin):
        # retrieve the root directory of the master branch of the ingested git
        # repository (by the git_origin fixture)
        visit = apiget(f"origin/{quote_plus(git_origin)}/visit/latest")
        snapshot = apiget(f'snapshot/{visit["snapshot"]}')
        rev_id = snapshot["branches"]["refs/heads/master"]["target"]
        revision = apiget(f"revision/{rev_id}")
        dir_id = revision["directory"]
    
        # now cook it
        cook = apiget(f"vault/directory/{dir_id}/", "POST")
        assert cook["obj_type"] == "directory"
        assert cook["obj_id"] == dir_id
        assert cook["fetch_url"].endswith(f"vault/directory/{dir_id}/raw/")
    
        # while it's cooking, get the directory tree from the archive
        directory = getdirectory(dir_id)
    
        # retrieve the cooked tar file
        resp = pollapi(f"vault/directory/{dir_id}/raw")
        tarf = tarfile.open(fileobj=io.BytesIO(resp.content))
    
        # and check the tarfile seems ok wrt. 'directory'
        assert tarf.getnames()[0] == dir_id
        tarfiles = {t.name: t for t in tarf.getmembers()}
    
        for fname, fdesc in directory:
            tfinfo = tarfiles.get(join(dir_id, fname))
            assert tfinfo, f"Missing path {fname} in retrieved tarfile"
            if fdesc["type"] == "file":
>               assert fdesc["length"] == tfinfo.size, f"File {fname}: length mismatch"
E               AssertionError: File docs/README.rst: length mismatch
E               assert 13 == 0
E                 +13
E                 -0

tests/test_vault.py:47: AssertionError
============================================================================================ short test summary info =============================================================================================
FAILED tests/test_vault.py::test_vault_directory - AssertionError: File docs/README.rst: length mismatch

A simple fix could be to discard any symlinks when checking for length in the test.
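
A rough sketch of that fix, assuming the loop from test_vault_directory is kept as is and only the length check is guarded with TarInfo.issym(). The helper name check_file_lengths and the sample data are hypothetical, and this is not necessarily the change that was eventually applied.

```python
import io
import tarfile
from os.path import join


def check_file_lengths(directory, tarfiles, dir_id):
    """Compare archive lengths with tar member sizes, skipping symlinks."""
    for fname, fdesc in directory:
        tfinfo = tarfiles.get(join(dir_id, fname))
        assert tfinfo, f"Missing path {fname} in retrieved tarfile"
        # Symlink members have a size of 0 in the tarball, so only
        # regular members are compared against the archive length.
        if fdesc["type"] == "file" and not tfinfo.issym():
            assert fdesc["length"] == tfinfo.size, f"File {fname}: length mismatch"


# Illustrative data: one regular file and one symlink pointing to it.
dir_id = "abc123"  # hypothetical directory id
data = b"hello\n"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as out:
    regular = tarfile.TarInfo(name=join(dir_id, "README.rst"))
    regular.size = len(data)
    out.addfile(regular, io.BytesIO(data))
    link = tarfile.TarInfo(name=join(dir_id, "docs/README.rst"))
    link.type = tarfile.SYMTYPE
    link.linkname = "../README.rst"
    out.addfile(link)

buf.seek(0)
with tarfile.open(fileobj=buf) as tarf:
    tarfiles = {t.name: t for t in tarf.getmembers()}

# Mimics what the trace shows: the archive lists the symlink with
# type 'file' and a length of 13.
directory = [
    ("README.rst", {"type": "file", "length": len(data)}),
    ("docs/README.rst", {"type": "file", "length": 13}),
]
check_file_lengths(directory, tarfiles, dir_id)  # passes: the symlink is skipped
```

A stricter variant could compare the symlink's length with len(tfinfo.linkname) instead of skipping it, at the cost of assuming the archive length always equals the target path length.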