
google code svn import: file exists
Closed, Migrated

Description

Repository to reproduce the error:

  • /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/p/poordecisions/poordecisions-repo.svndump.gz

dump = 'poordecisions-repo.svndump.gz'
origin_url = 'http://halpy.googlecode.com'

import logging
logging.basicConfig(level=logging.DEBUG)

from swh.loader.svn.tasks import MountAndLoadSvnRepositoryTsk

t = MountAndLoadSvnRepositoryTsk()
t.run(archive_path=dump, origin_url=origin_url, visit_date='2016-05-03T15:16:32+00:00')

Stacktrace:

>>> t.run(archive_path=dump, origin_url=origin_url, visit_date='2016-05-03T15:16:32+00:00')
INFO:swh.loader.svn.SvnLoader:Archive to mount and load poordecisions-repo.svndump.gz
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Creating svn origin for http://halpy.googlecode.com
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done creating svn origin for http://halpy.googlecode.com
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Creating origin_visit for origin 2 at time 2016-05-03T15:16:32+00:00
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done Creating origin_visit for origin 2 at time 2016-05-03T15:16:32+00:00
INFO:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:[revision_start-revision_end]: [1-1061]
INFO:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Processing {'local_url': b'/tmp/tmp.ky9bqpop.swh.loader/tmp.r3gixhpg.swh.loader.svn', 'swh-origin': 2, 'remote_url': 'file:///tmp/tmp.r3gixhpg.swh.loader.svn', 'uuid': b'c11c370f-1b26-4b7e-bc3a-6dab230fbc63'}.
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 1, swhrev: f1a401fbe377af5db14b83ae72cb146b4767a884, dir: 75ed58f260bfa4102d0e09657803511f5f0ab372
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 2, swhrev: 7c4b8e1fc64df450b9aefc3bbeb14e8c243f26c3, dir: 75ed58f260bfa4102d0e09657803511f5f0ab372
...
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 365, swhrev: 15dfb2d8f2a1c508eeaa16efa9c68c3b0790c7ba, dir: 08a773348b6240ab4fd17285826b0881383f1583
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 366, swhrev: 042a2f64849d43890e55ee48e3ac413c5f7e2f33, dir: 55f4771c779393abc3b45d58a6b25849fbc47209
ERROR:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Eventful partial visit. Detail: [Errno 17] File exists: b'/tmp/tmp.ky9bqpop.swh.loader/tmp.r3gixhpg.swh.loader.svn/apps/webos/PoorDecisions'
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:occ: {'visit': 5, 'target': b'\xb9\x1f\tP,\x89\xe2\xe7\x1e4\x12\x92v\x97\xe3\xb6\xc3>%p', 'origin': 2, 'target_type': 'revision', 'branch': 'master'}
ERROR:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Loading failure, updating to `partial` status
Traceback (most recent call last):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 217, in process_swh_revisions
    self.config['revision_packet_size']):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-core/swh/core/utils.py", line 40, in grouper
    for _data in itertools.zip_longest(*args, fillvalue=None):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 161, in process_svn_revisions
    for rev, nextrev, commit, new_objects, root_directory in gen_revs:
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/svn.py", line 266, in swh_hash_data_per_revision
    objects = self.swhreplay.compute_hashes(rev)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 343, in compute_hashes
    self.replay(rev)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 328, in replay
    self.conn.replay(rev, rev+1, self.editor)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 277, in add_directory
    os.makedirs(os.path.join(self.rootpath, path), exist_ok=True)
  File "/usr/lib/python3.5/os.py", line 241, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: b'/tmp/tmp.ky9bqpop.swh.loader/tmp.r3gixhpg.swh.loader.svn/apps/webos/PoorDecisions'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-core/swh/loader/core/loader.py", line 732, in load
    self.store_data()
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 308, in store_data
    self.last_known_swh_revision)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 496, in process_repository
    svnrepo, revision_start, revision_end, revision_parents)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 236, in process_swh_revisions
    'id': _id,
swh.loader.svn.loader.SvnLoaderEventful: [Errno 17] File exists: b'/tmp/tmp.ky9bqpop.swh.loader/tmp.r3gixhpg.swh.loader.svn/apps/webos/PoorDecisions'

Event Timeline

Apparently a link already exists at that path, hence the error.

So either:

  • a cleanup is wrongly done (prior to this step)
  • an action sent by the server is not captured and thus not applied...
  • something else entirely that I don't foresee yet

Change in ra.py:

fullpath = os.path.join(self.rootpath, path)
try:
    os.makedirs(fullpath, exist_ok=True)
except OSError:
    # Temporary diagnostics: report what currently sits at the path, then re-raise.
    print('path exists?', os.path.exists(fullpath))
    print('file?', os.path.isfile(fullpath))
    print('dir?', os.path.isdir(fullpath))
    print('link?', os.path.islink(fullpath))
    raise

output:

DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 366, swhrev: 042a2f64849d43890e55ee48e3ac413c5f7e2f33, dir: 55f4771c779393abc3b45d58a6b25849fbc47209
path exists? False
file? False
dir? False
link? True
ERROR:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Eventful partial visit. Detail: [Errno 17] File exists: b'/tmp/tmp.0k6pryou.swh.loader/tmp.2_s8nelu.swh.loader.svn/apps/webos/PoorDecisions'
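
The combination "path exists? False" and "link? True" is what a dangling symlink looks like: os.path.exists follows the link to a missing target, while os.path.islink inspects the link itself. Below is a minimal standalone sketch of that situation, independent of the loader. The temporary directory layout, the "missing-target" name and the makedirs_replacing_link helper are assumptions made for illustration; the helper is only one possible workaround, not the fix applied in ra.py.

```python
import os
import tempfile


def makedirs_replacing_link(fullpath):
    """Hypothetical helper: create a directory, first removing any symlink at that path."""
    if os.path.islink(fullpath):
        # A link (dangling or not) occupies the path, so drop it before mkdir.
        os.remove(fullpath)
    os.makedirs(fullpath, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    fullpath = os.path.join(root, 'apps', 'webos', 'PoorDecisions')
    os.makedirs(os.path.dirname(fullpath))
    # Dangling symlink: its target does not exist (assumed setup, POSIX only).
    os.symlink(os.path.join(root, 'missing-target'), fullpath)

    print('path exists?', os.path.exists(fullpath))
    print('link?', os.path.islink(fullpath))

    try:
        # exist_ok=True only tolerates an existing *directory* at the path.
        os.makedirs(fullpath, exist_ok=True)
        raised = False
    except FileExistsError:
        raised = True
    print('makedirs raised FileExistsError?', raised)

    makedirs_replacing_link(fullpath)
    print('dir?', os.path.isdir(fullpath))
    print('link?', os.path.islink(fullpath))
```

If this matches what happens in the loader, the open question from the list above remains why the link is still on disk when add_directory is called for the same path.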