svn.loader: Uncompress the tarball during the `prepare` call
Closed, Public

Authored by ardumont on Sep 21 2018, 2:01 PM.

Details

Summary

Prior to this commit, the uncompression step happened in the init
call. Now, it happens in the prepare method.

This allows a 'partial' visit to be recorded if something goes wrong
during the uncompression step (as opposed to nothing at the moment).

This is symmetric to what we do with the mercurial loader
(HgArchiveBundle20Loader [1]).

[1] https://forge.softwareheritage.org/source/swh-loader-mercurial/browse/master/swh/loader/mercurial/bundle20_loader.py$506-517
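
To illustrate why the move matters, here is a minimal sketch, not the actual swh.loader.svn code. The names ArchiveLoaderSketch and visit are hypothetical, and the sketch assumes that a driver calls prepare inside the visit's lifetime and records 'partial' when it fails. The constructor only stores the path, so a broken archive can no longer fail before a visit exists.

```python
import gzip
import os
import shutil
import tempfile


class ArchiveLoaderSketch:
    """Hypothetical loader used for illustration only."""

    def __init__(self, archive_path):
        # Cheap and side-effect free: only remember where the archive is.
        self.archive_path = archive_path
        self.temp_dir = None
        self.dump_path = None

    def prepare(self):
        # The risky uncompression step now runs here, during the visit.
        self.temp_dir = tempfile.mkdtemp(prefix='loader-sketch.')
        self.dump_path = os.path.join(self.temp_dir, 'repo.svndump')
        with gzip.open(self.archive_path, 'rb') as src, \
                open(self.dump_path, 'wb') as dst:
            shutil.copyfileobj(src, dst)

    def cleanup(self):
        # Remove the temporary directory, whether prepare succeeded or not.
        if self.temp_dir is not None:
            shutil.rmtree(self.temp_dir, ignore_errors=True)
            self.temp_dir = None


def visit(loader):
    """Hypothetical driver: return the status that would be recorded."""
    status = 'full'
    try:
        loader.prepare()
    except Exception:
        # Any failure while preparing still leaves a trace: a partial visit.
        status = 'partial'
    finally:
        loader.cleanup()
    return status
```

With this shape, visit(ArchiveLoaderSketch(path)) yields a status even for a missing or corrupt archive, whereas uncompressing in the constructor would raise before any visit could be tracked.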

Test Plan

Use toplevel and load a repository dump.
All should be fine.

cf. the README.md adaptations (D434#change-e9wbZ9Ppa6Oy), which demonstrate such a use case ;)

$ python3
>>> repo = '0-512-md'
>>> archive_name = '%s-repo.svndump.gz' % repo
>>> archive_path = '/home/storage/svn/dumps/%s' % archive_name
>>> origin_url = 'http://%s.googlecode.com' % repo
>>> svn_url = 'file://%s' % repo
>>>
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>>
>>> from swh.loader.svn.tasks import MountAndLoadSvnRepository
>>>
>>> t = MountAndLoadSvnRepository()
>>> t.run(archive_path=archive_path,
...       origin_url=origin_url,
...       visit_date='2016-05-03T15:16:32+00:00',
...       start_from_scratch=True)
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Creating svn origin for http://0-512-md.googlecode.com
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Done creating svn origin for http://0-512-md.googlecode.com
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Creating origin_visit for origin 152084 at time 2016-05-03T15:16:32+00:00
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Done Creating origin_visit for origin 152084 at time 2016-05-03T15:16:32+00:00
INFO:swh.scheduler.task.MountAndLoadSvnRepository:Archive to mount and load /home/storage/svn/dumps/0-512-md-repo.svndump.gz
INFO:swh.scheduler.task.MountAndLoadSvnRepository:Processing revisions [1-7] for {'swh-origin': 152084, 'remote_url': 'file:///tmp/swh.loader.svn.i64lzqlk-1721/dumps', 'local_url': b'/tmp/swh.loader.svn.mtmtstvm-1721/dumps', 'uuid': b'ec87962a-a51e-4ed8-98a0-4cb3e98efb5c'}
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:rev: 1, swhrev: 5a91adb1960e0a22e93791ec424ce499758ab341, dir: a9d1fb73fb683bfa494d1fe569136e0b4d644178
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Checking hash computations on revision 1...
DEBUG:root:cleanup /tmp/swh.loader.svn.mtmtstvm-1721/check-revision-1._2lz2vsz
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:rev: 2, swhrev: 8b226ab66df5fa515f73ea27ad4a5ed302fbaeb7, dir: ca600fef50d94d6c007fb578f74c30a4ac67ff5b
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Checking hash computations on revision 2...
DEBUG:root:cleanup /tmp/swh.loader.svn.mtmtstvm-1721/check-revision-2.rpd6r57p
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:rev: 3, swhrev: 408e8d85998348e93785d18084d0c86612dded61, dir: 68314eeabe4bc0344677c6cc531309fc08505c3c
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Checking hash computations on revision 3...
DEBUG:root:cleanup /tmp/swh.loader.svn.mtmtstvm-1721/check-revision-3.qj6eebkt
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:rev: 4, swhrev: 3ae5713da1b985873909df985227927c1e5b6ad3, dir: af4a87d85895a9c97fdb46a2fa937da4b5ed593a
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Checking hash computations on revision 4...
DEBUG:root:cleanup /tmp/swh.loader.svn.mtmtstvm-1721/check-revision-4.e065iufe
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:rev: 5, swhrev: b72db23b204e580e15c0ae7cf12fa4c66bddc5f6, dir: 2f3c4b5232fad459fe8a96ffb619da06b1c7bb59
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Checking hash computations on revision 5...
DEBUG:root:cleanup /tmp/swh.loader.svn.mtmtstvm-1721/check-revision-5._fvzvhc_
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:rev: 6, swhrev: 09183576e2e25e292290ed639bd637cd7414f175, dir: 6eb431da5ba5529f8aa6e900901adcc4206b45a8
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Checking hash computations on revision 6...
DEBUG:root:cleanup /tmp/swh.loader.svn.mtmtstvm-1721/check-revision-6.3znrs8_p
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:rev: 7, swhrev: e43f72e12c88abece79a87b8c9ad232e1b773d18, dir: a645db83f2e17629038ee029eac3d0ca9ae70262
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Checking hash computations on revision 7...
DEBUG:root:cleanup /tmp/swh.loader.svn.mtmtstvm-1721/check-revision-7.mrjbob0s
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:snapshot: {'id': b'\xa1\xa2\x8c\n\xb3\x87\xa8\xf9\xe0a\x8c\xb7\x05\xea\xb8\x1f\xc4H\xf4s', 'branches': {b'master': {'target': b'\xe4?r\xe1,\x88\xab\xec\xe7\x9a\x87\xb8\xc9\xad#.\x1bw=\x18', 'target_type': 'revision'}}}
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Sending 288 contents
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Done sending 288 contents
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Sending 34 directories
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Done sending 34 directories
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Sending 7 revisions
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Done sending 7 revisions
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Updating origin_visit for origin 152084 with status full
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Done updating origin_visit for origin 152084 with status full
DEBUG:root:cleanup /tmp/swh.loader.svn.mtmtstvm-1721
DEBUG:swh.scheduler.task.MountAndLoadSvnRepository:Clean up temporary directory dump /tmp/swh.loader.svn.i64lzqlk-1721 for project dumps
DEBUG:amqp:Start from server, version: 0.9, properties: {'capabilities': {'publisher_confirms': True, 'exchange_exchange_bindings': True, 'basic.nack': True, 'consumer_cancel_notify': True, 'connection.blocked': True, 'consumer_priorities': True, 'authentication_failure_close': True, 'per_consumer_qos': True, 'direct_reply_to': True}, 'cluster_name': 'rabbit@bespin', 'copyright': 'Copyright (C) 2007-2017 Pivotal Software, Inc.', 'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'platform': 'Erlang/OTP', 'product': 'RabbitMQ', 'version': '3.6.10'}, mechanisms: [b'PLAIN', b'AMQPLAIN'], locales: ['en_US']
DEBUG:amqp:using channel_id: 1
DEBUG:amqp:Channel open
{'status': 'eventful'}
>>>

Diff Detail

Repository
rDLDSVN Subversion (SVN) loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

ardumont added a project: SVN Loader.
swh/loader/svn/tasks.py
65 ↗(On Diff #1340)

As discussed orally, we remove that strange tiny bit.

Note:
I did not make the svn_url optional, though, as this could impact the initial SvnLoader.
I lazily don't want to tamper with this right now ;)

ardumont edited the test plan for this revision.
swh/loader/svn/loader.py
551 ↗(On Diff #1340)

How about dropping the svn_url parameter? It is kind of useless now.

559 ↗(On Diff #1340)

Remove that test if the svn_url parameter is dropped.

swh/loader/svn/tasks.py
65 ↗(On Diff #1340)

From my point of view, you can safely drop it.

Let's forget my comments about the svn_url parameter drop and let's land it!

This revision is now accepted and ready to land. Sep 21 2018, 2:20 PM

Let's forget my comments about the svn_url parameter drop

Right.

tl;dr We cannot remove it since we need to respect the base loader class' signature (SvnLoader needs it).

and let's land it!

\m/

Fix blank spaces in readme and rebase

  • README.md: Update sample usage
  • svn.tasks: Clarify tasks' docstring
  • svn.loader: Use correct scheme to prepare the svndump load
This revision was automatically updated to reflect the committed changes.