
google code svn import: symlink: embedded null character in src
Closed, Resolved (Public)

Description

Repository /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/s/snakepitcs/snakepitcs-repo.svndump.gz

dump = 'snakepitcs-repo.svndump.gz'
origin_url = 'http://snakepitcs-repo.googlecode.com'

import logging
logging.basicConfig(level=logging.DEBUG)

from swh.loader.svn.tasks import MountAndLoadSvnRepositoryTsk

t = MountAndLoadSvnRepositoryTsk()
t.run(archive_path=dump, origin_url=origin_url, visit_date='2016-05-03T15:16:32+00:00')

Stacktrace:

$ python3
Python 3.5.3 (default, Jan 19 2017, 14:11:04)
[GCC 6.3.0 20170118] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> dump = 'snakepitcs-repo.svndump.gz'
>>>
>>> url = dump.split('.')
>>> origin_url = 'http://%s.googlecode.com' % url[0]
>>>
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>>
>>> from swh.loader.svn.tasks import MountAndLoadSvnRepositoryTsk
>>>
>>> t = MountAndLoadSvnRepositoryTsk()
>>> t.run(archive_path=dump, origin_url=origin_url, visit_date='2016-05-03T15:16:32+00:00')
INFO:swh.loader.svn.SvnLoader:Archive to mount and load snakepitcs-repo.svndump.gz
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Creating svn origin for http://snakepitcs-repo.googlecode.com
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done creating svn origin for http://snakepitcs-repo.googlecode.com
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Creating origin_visit for origin 1 at time 2016-05-03T15:16:32+00:00
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done Creating origin_visit for origin 1 at time 2016-05-03T15:16:32+00:00
INFO:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:[revision_start-revision_end]: [1-594]
INFO:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Processing {'remote_url': 'file:///tmp/tmp.0ix7byg2.swh.loader.svn', 'uuid': b'c1685c75-f48e-402c-91f9-43ccff35c029', 'local_url': b'/tmp/tmp.9tp2_ksn.swh.loader/tmp.0ix7byg2.swh.loader.svn', 'swh-origin': 1}.
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 1, swhrev: 57b6dca1cdb5ce5f44f03b2089ff9085370e444e, dir: a9d1fb73fb683bfa494d1fe569136e0b4d644178
...
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 383, swhrev: 5b3aed6f70f648db38168c7071ca32314f95ef80, dir: f6944ad886c9e7035a176ac4b4fb22a4f3f8cb42
ERROR:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Eventful partial visit. Detail: symlink: embedded null character in src
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:occ: {'visit': 1, 'branch': 'master', 'target_type': 'revision', 'origin': 1, 'target': b'\x14\xfc\x1c\x90\xc8\x99\x95_\x97\tPAx:\x1682\xbd\xc6\x8a'}
ERROR:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Loading failure, updating to `partial` status
Traceback (most recent call last):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 217, in process_swh_revisions
    self.config['revision_packet_size']):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-core/swh/core/utils.py", line 40, in grouper
    for _data in itertools.zip_longest(*args, fillvalue=None):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 161, in process_svn_revisions
    for rev, nextrev, commit, new_objects, root_directory in gen_revs:
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/svn.py", line 266, in swh_hash_data_per_revision
    objects = self.swhreplay.compute_hashes(rev)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 333, in compute_hashes
    self.replay(rev)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 318, in replay
    self.conn.replay(rev, rev+1, self.editor)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 131, in close
    filetype, source_link = self.__make_symlink()
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 88, in __make_symlink
    os.symlink(src=src, dst=self.fullpath)
ValueError: symlink: embedded null character in src

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-core/swh/loader/core/loader.py", line 742, in load
    self.store_data()
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 308, in store_data
    self.last_known_swh_revision)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 496, in process_repository
    svnrepo, revision_start, revision_end, revision_parents)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 236, in process_swh_revisions
    'id': _id,
swh.loader.svn.loader.SvnLoaderEventful: symlink: embedded null character in src
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Updating origin_visit for origin 1 with status partial
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done updating origin_visit for origin 1 with status partial
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Sending 789 contents
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done sending 789 contents
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Sending 580 directories
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done sending 580 directories
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Sending 1 occurrences
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done sending 1 occurrences
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Clean up temp directory /tmp/tmp.0ix7byg2.swh.loader.svn for project
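
The innermost ValueError can be reproduced without the loader. The sketch below is only an illustration: the target string is made up, and try_symlink is a hypothetical helper, not part of swh-loader-svn. It shows that os.symlink rejects a source containing a NUL character while converting its arguments, before any link is created.

```python
import os
import tempfile


def try_symlink(src, dst):
    """Return the ValueError message raised by os.symlink, or None on success."""
    try:
        os.symlink(src, dst)
    except ValueError as exc:
        return str(exc)
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Made-up link target with an embedded NUL, as binary file content could have.
    message = try_symlink("some\x00target", os.path.join(tmp, "link"))
    print(message)
```

The exact wording of the message may differ between Python versions; the exception type is what matters here.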

Event Timeline

The latest fix from T839 removes this edge case as well.
The file the loader tried to replay as a symlink was not actually a symlink.
That is consistent with my first, disconcerting analysis.
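
For context, Subversion stores a symlink as a regular file that carries the svn:special property and whose content is "link " followed by the target. A minimal sketch of a guard based on that convention follows. The name parse_svn_link is hypothetical, and this is not necessarily how the T839 fix is implemented.

```python
from typing import Optional

LINK_PREFIX = b"link "


def parse_svn_link(content: bytes) -> Optional[bytes]:
    """Return the link target if content looks like an svn symlink, else None."""
    if not content.startswith(LINK_PREFIX):
        # Not in the "link <target>" form: treat it as a regular file.
        return None
    target = content[len(LINK_PREFIX):]
    # An empty target or one with a NUL byte cannot be a valid path.
    if not target or b"\x00" in target:
        return None
    return target
```

With such a check, content that merely happens to be flagged as special but holds arbitrary bytes would be handled as a plain file instead of being passed to os.symlink.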

There are no symlinks in a checkout of that revision (384):

$ mkdir tmp; cd tmp
$ svnadmin create snakepitcs
$ gzip -dc ../../repos/snakepitcs-repo.svndump.gz | svnadmin load ./snakepitcs
$ svn checkout file:///home/storage/svn/repos/snakepitcs@384
$ cd snakepitcs
$ svn info | grep Revision | grep 384
Revision: 384
$ find -L . -xtype l
$  # nothing is listed here
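
The same check can be scripted. This is a rough Python sketch, not part of the original investigation, and list_symlinks is a hypothetical helper name. It walks a working copy without following links and reports every path that is itself a symbolic link.

```python
import os


def list_symlinks(root):
    """Return the sorted paths under root that are symbolic links."""
    found = []
    # os.walk does not descend into symlinked directories by default,
    # but it still reports them in dirnames.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                found.append(path)
    return sorted(found)
```

An empty list for the revision 384 checkout would match the empty find output above.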