README
| swh-loader-svn | swh-loader-svn | ||||
| ============== | ============== | ||||
| Documents are in the ./docs folder: | Documents are in the ./docs folder: | ||||
| - Specification: ./docs/swh-loader-svn.txt | - Specification: ./docs/swh-loader-svn.txt | ||||
| - Performance comparison with git-svn: ./docs/comparison-git-svn-swh-svn.org | - Performance comparison with git-svn: ./docs/comparison-git-svn-swh-svn.org | ||||
| # Configuration file | # Configuration file | ||||
| ## Location | ## Location | ||||
| Either: | Either: | ||||
| - /etc/softwareheritage/loader/svn.ini | - /etc/softwareheritage/ | ||||
| - ~/.config/swh/loader/svn.ini | - ~/.config/swh/ | ||||
| - ~/.swh/loader/svn.ini | - ~/.swh/ | ||||
| Note: this location is referred to as $SWH_CONFIG_PATH below. | |||||
| ## Configuration sample | ## Configuration sample | ||||
| $SWH_CONFIG_PATH/loader/svn.yml: | |||||
| ``` | ``` | ||||
| storage: | storage: | ||||
| cls: remote | cls: remote | ||||
| args: | args: | ||||
| url: http://localhost:5002/ | url: http://localhost:5002/ | ||||
| send_contents: true | |||||
| send_directories: true | |||||
| send_revisions: true | |||||
| send_releases: true | |||||
| send_occurrences: true | |||||
| # nb of max contents to send for storage | |||||
| content_packet_size: 10000 | |||||
| # 100 Mib of content data | |||||
| content_packet_block_size_bytes: 104857600 | |||||
| # limit for swh content storage for one blob (beyond that limit, the | |||||
| # content's data is not sent for storage) | |||||
| content_packet_size_bytes: 1073741824 | |||||
| directory_packet_size: 2500 | |||||
| revision_packet_size: 10 | |||||
| release_packet_size: 1000 | |||||
| occurrence_packet_size: 1000 | |||||
| check_revision: 10 | check_revision: 10 | ||||
| ``` | ``` | ||||
| ## configuration content | ## configuration content | ||||
| With at least the following module (swh.loader.svn.tasks) and queue | With at least the following module (swh.loader.svn.tasks) and queue | ||||
| (swh_loader_svn): | (swh_loader_svn): | ||||
| $SWH_CONFIG_PATH/worker.yml: | |||||
| ``` | ``` | ||||
| [main] | task_broker: amqp://guest@localhost// | ||||
| task_broker = amqp://guest@localhost// | task_modules: | ||||
| task_modules = swh.loader.svn.tasks | task_modules: | ||||
| task_queues = swh_loader_svn | - swh.loader.svn.tasks | ||||
| task_queues: | |||||
| - swh_loader_svn | |||||
| task_soft_time_limit = 0 | task_soft_time_limit = 0 | ||||
| ``` | ``` | ||||
| swh.loader.svn.tasks and swh_loader_svn are the important entries here. | `swh.loader.svn.tasks` and `swh_loader_svn` are the important entries here. | ||||
| ## toplevel | |||||
| ``` | |||||
| $ python3 | |||||
| repo = 'pyang-repo-r343-eol-native-mixed-lf-crlf' | |||||
| #repo = 'zipeg-gae' | |||||
| origin_url = 'http://%s.googlecode.com' % repo | |||||
| local_repo_path = '/home/storage/svn/repo' | |||||
| svn_url = 'file://%s/%s' % (local_repo_path, repo) | |||||
| import logging | |||||
| logging.basicConfig(level=logging.DEBUG) | |||||
| from swh.loader.svn.tasks import LoadSWHSvnRepositoryTsk | |||||
| t = LoadSWHSvnRepositoryTsk() | |||||
| t.run(svn_url=svn_url, | |||||
| destination_path='/tmp', | |||||
| origin_url=origin_url, visit_date='2016-05-03T15:16:32+00:00', | |||||
| start_from_scratch=True) | |||||
| ``` | |||||
| ## Production like | |||||
| ## start worker instance | start worker instance | ||||
| To start a current worker instance: | To start a current worker instance: | ||||
| ```sh | ```sh | ||||
| python3 -m celery worker --app=swh.scheduler.celery_backend.config.app \ | python3 -m celery worker --app=swh.scheduler.celery_backend.config.app \ | ||||
| --pool=prefork \ | --pool=prefork \ | ||||
| --concurrency=10 \ | --concurrency=10 \ | ||||
| -Ofair \ | -Ofair \ | ||||