Google code svn import - 'Eventful partial visit. Detail: too many values to unpack'
Closed, Resolved (Public)

Description

On some repositories, the load fails with an unpack error while the loader replays directory changes.

Analysis shows that this is caused by symlinks whose path contains a space: when the symlink content is read, it is split on every space, so the split yields more than the two values the code expects.
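
The failure mode can be illustrated in isolation. The sketch below assumes the symlink content has the form `link <path>` (which is how Subversion stores special-file symlinks) and uses a hypothetical helper name, `split_symlink_content`, and a made-up sample path; it is not the loader's actual code, and the fix eventually applied in swh-loader-svn may differ. It only shows that limiting the split to the first space keeps the rest of the path intact.

```python
from typing import Tuple


def split_symlink_content(data: bytes) -> Tuple[bytes, bytes]:
    """Hypothetical helper: split b'link <path>' on the first space only."""
    # maxsplit=1 keeps any further spaces inside the path itself.
    filetype, src = data.split(b' ', 1)
    return filetype, src


# Assumed sample content: a symlink whose target path contains spaces.
content = b'link path with space/target'

# Naive split, as in the traceback: yields 4 values, so unpacking into 2 fails.
try:
    filetype, src = content.split(b' ')
    naive_error = None
except ValueError as exc:
    naive_error = str(exc)

print(naive_error)
print(split_symlink_content(content))
```

Splitting once from the left is enough here because the leading `link` keyword never contains a space, while the path that follows may.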

Steps to reproduce against a local storage with the latest swh-loader-svn:

  • Use /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/g/guidanceapp/guidanceapp-repo.svndump.gz.
  • Run the following in a python3 toplevel:
dump = 'guidanceapp-repo.svndump.gz'
origin_url = 'http://foo/bar/svn'

import logging
logging.basicConfig(level=logging.DEBUG)

from swh.loader.svn.tasks import MountAndLoadSvnRepositoryTsk

t = MountAndLoadSvnRepositoryTsk()
t.run(archive_path=dump, origin_url=origin_url, visit_date='2016-05-03T15:16:32+00:00')
INFO:swh.loader.svn.SvnLoader:Archive to mount and load guidanceapp-repo.svndump.gz
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Creating svn origin for http://foo/bar/svn
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done creating svn origin for http://foo/bar/svn
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Creating origin_visit for origin 1 at time 2016-05-03T15:16:32+00:00
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done Creating origin_visit for origin 1 at time 2016-05-03T15:16:32+00:00
INFO:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:[revision_start-revision_end]: [1-176]
INFO:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Processing {'uuid': b'0ddbf88f-2c8e-4daf-859a-184bc4592f0f', 'remote_url': 'file:///tmp/tmp.zcjhxc5a.swh.loader.svn', 'local_url': b'/tmp/tmp.kbcjbspf.swh.loader/tmp.zcjhxc5a.swh.loader.svn', 'swh-origin': 1}.
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 1, swhrev: b321f5a2e3b6b1215be50f9fb3caed573f44b8bd, dir: 75ed58f260bfa4102d0e09657803511f5f0ab372
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 2, swhrev: aedfcf2d43d2df999e09fddaab59b6e374971cbd, dir: eed5d0b9d212549be902c280357d58146351bb54
...
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 163, swhrev: 4950f20b83bf6f1758d88657d22d3199ed13a95d, dir:9b37c5f0c3eb108e9128db7246a0639463329aee
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 164, swhrev: 0c12759f77083a0de1a6caccebdd4969559b1bbe, dir:957a27b8e22d16ae8ce98a6b75007580427bcb34
ERROR:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Eventful partial visit. Detail: too many values to unpack (expected 2)
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:occ: {'visit': 1, 'target': b'\xc2\xd2\xab2\x0c\x95<v\x0c\x9f\xc8z\x11\xe2\xd8\xf7\xc9\x958\x1e', 'branch': 'master', 'origin': 1, 'target_type': 'revision'}
ERROR:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Loading failure, updating to `partial` status
Traceback (most recent call last):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 217, in process_swh_revisions
    self.config['revision_packet_size']):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-core/swh/core/utils.py", line 40, in grouper
    for _data in itertools.zip_longest(*args, fillvalue=None):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 161, in process_svn_revisions
    for rev, nextrev, commit, new_objects, root_directory in gen_revs:
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/svn.py", line 266, in swh_hash_data_per_revision
    objects = self.swhreplay.compute_hashes(rev)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 333, in compute_hashes
    self.replay(rev)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 318, in replay
    self.conn.replay(rev, rev+1, self.editor)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 128, in close
    filetype, source_link = self.__make_symlink()
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 82, in __make_symlink
    filetype, src = f.read().split(b' ')
ValueError: too many values to unpack (expected 2)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-core/swh/loader/core/loader.py", line 686, in load
    self.store_data()
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 308, in store_data
    self.last_known_swh_revision)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 496, in process_repository
    svnrepo, revision_start, revision_end, revision_parents)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 236, in process_swh_revisions
    'id': _id,
swh.loader.svn.loader.SvnLoaderEventful: too many values to unpack (expected 2)
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Updating origin_visit for origin 1 with status partial
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done updating origin_visit for origin 1 with status partial
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Sending 286 contents
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done sending 286 contents
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Sending 193 directories
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done sending 193 directories
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Sending 1 occurrences
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done sending 1 occurrences
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Clean up temp directory /tmp/tmp.zcjhxc5a.swh.loader.svn for project