diff --git a/README b/README
deleted file mode 100644
index 90b283b..0000000
--- a/README
+++ /dev/null
@@ -1,96 +0,0 @@
-SWH-loader-dir
-==============
-
-The Software Heritage Directory Loader is a tool and a library to walk a local
-directory and inject into the SWH dataset all unknown contained files.
-
-
-Directory loader
-================
-
-
-### Configuration
-
-This is the loader's (or task's) configuration file.
-
-loader/dir.yml:
-
- storage:
- cls: remote
- args:
- url: http://localhost:5002/
-
- send_contents: True
- send_directories: True
- send_revisions: True
- send_releases: True
- send_occurrences: True
- # nb of max contents to send for storage
- content_packet_size: 100
- # 100 Mib of content data
- content_packet_block_size_bytes: 104857600
- # limit for swh content storage for one blob (beyond that limit, the
- # content's data is not sent for storage)
- content_packet_size_bytes: 1073741824
- directory_packet_size: 250
- revision_packet_size: 100
- release_packet_size: 100
- occurrence_packet_size: 100
-
-Present in possible locations:
-- ~/.config/swh/loader/dir.ini
-- ~/.swh/loader/dir.ini
-- /etc/softwareheritage/loader/dir.ini
-
-
-#### Toplevel
-
-Load directory directly from code or toplevel:
-
- from swh.loader.dir.loader import DirLoader
-
- dir_path = '/path/to/directory
-
- # Fill in those
- origin = {'url': 'some-origin', 'type': 'dir'}
- visit_date = 'Tue, 3 May 2017 17:16:32 +0200'
- release = None
- revision = {}
- occurrence = {}
-
- DirLoader().load(dir_path, origin, visit_date, revision, release, [occurrence])
-
-
-#### Celery
-
-Load directory using celery.
-
-Providing you have a properly configured celery up and running
-
-worker.ini needs to be updated with the following keys:
-
- task_modules = swh.loader.dir.tasks
- task_queues = swh_loader_dir
-
-cf. https://forge.softwareheritage.org/diffusion/DCORE/browse/master/README.md
-for more details
-
-You can send the following message to the task queue:
-
- from swh.loader.dir.tasks import LoadDirRepository
-
- # Fill in those
- origin = {'url': 'some-origin', 'type': 'dir'}
- visit_date = 'Tue, 3 May 2017 17:16:32 +0200'
- release = None
- revision = {}
- occurrence = {}
-
- # Send message to the task queue
- LoaderDirRepository().run(('/path/to/dir, origin, visit_date, revision, release, [occurrence]))
-
-
-Directory producer
-==================
-
-None
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..0047603
--- /dev/null
+++ b/README.md
@@ -0,0 +1,96 @@
+SWH-loader-dir
+===================
+
+The Software Heritage Directory Loader is a tool and a library.
+
+Its sole purpose is to walk a local directory and inject into the SWH
+dataset all unknown contained files from that directory structure.
+
+
+## Configuration
+
+The loader needs a configuration file in *`{/etc/softwareheritage |
+~/.config/swh | ~/.swh}`/loader/dir.yml*.
+
+This file should be similar to this (adapt according to your needs):
+
+``` yaml
+storage:
+ cls: remote
+ args:
+ url: http://localhost:5002/
+
+send_contents: True
+send_directories: True
+send_revisions: True
+send_releases: True
+send_occurrences: True
+# maximum number of contents sent to storage in one packet
+content_packet_size: 100
+# 100 MiB of content data
+content_packet_block_size_bytes: 104857600
+# size limit (1 GiB) for storing one blob in swh (beyond that limit, the
+# content's data is not sent for storage)
+content_packet_size_bytes: 1073741824
+directory_packet_size: 250
+revision_packet_size: 100
+release_packet_size: 100
+occurrence_packet_size: 100
+```
+
+## Run
+
+To run the loader, you can use either:
+
+- python3's toplevel
+- celery
+
+### Toplevel
+
+Load a directory directly from code or from the Python toplevel:
+
+``` Python
+from swh.loader.dir.loader import DirLoader
+
+dir_path = '/path/to/directory'
+
+# Fill in those
+origin = {'url': 'some-origin', 'type': 'dir'}
+visit_date = 'Wed, 3 May 2017 17:16:32 +0200'
+release = None
+revision = {}
+occurrence = {}
+
+DirLoader().load(dir_path, origin, visit_date, revision, release, [occurrence])
+```
+
+### Celery
+
+To use celery, add the following entries to the
+*`{/etc/softwareheritage | ~/.config/swh | ~/.swh}`/worker.yml* file:
+
+``` yaml
+task_modules:
+ - swh.loader.dir.tasks
+task_queues:
+ - swh_loader_dir
+```
+
+cf. [swh-core's documentation](https://forge.softwareheritage.org/diffusion/DCORE/browse/master/README.md) for
+more details.
+
+You can then send the following message to the task queue:
+
+``` Python
+from swh.loader.dir.tasks import LoadDirRepository
+
+# Fill in those
+origin = {'url': 'some-origin', 'type': 'dir'}
+visit_date = 'Wed, 3 May 2017 17:16:32 +0200'
+release = None
+revision = {}
+occurrence = {}
+
+# Send message to the task queue
+LoadDirRepository().run(('/path/to/dir', origin, visit_date, revision, release, [occurrence]))
+```
diff --git a/docs/.gitignore b/docs/.gitignore
index 58a761e..f6b5c55 100644
--- a/docs/.gitignore
+++ b/docs/.gitignore
@@ -1,3 +1,4 @@
_build/
apidoc/
*-stamp
+README.md
diff --git a/docs/Makefile b/docs/Makefile
index c30c50a..ec260d2 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -1 +1,6 @@
include ../../swh-docs/Makefile.sphinx
+
+html: copy_md
+
+copy_md:
+ cp ../README.md README.md
diff --git a/docs/index.rst b/docs/index.rst
index 8b64117..2e88ed8 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,15 +1,15 @@
Software Heritage - Development Documentation
=============================================
.. toctree::
:maxdepth: 2
:caption: Contents:
-
+ README.md
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
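
Note on the configuration paths in the new README.md: the `{/etc/softwareheritage | ~/.config/swh | ~/.swh}` notation stands for three alternative base directories. The sketch below spells out the resulting candidate files for `loader/dir.yml`. It is an illustration only, not the loader's actual lookup code: `CONFIG_DIRS`, `candidate_paths` and `first_existing` are hypothetical names, and the search order shown is an assumption (the README does not say which location takes precedence).

```python
from pathlib import Path

# Base directories listed in the README; the order here is an assumption.
CONFIG_DIRS = ["/etc/softwareheritage", "~/.config/swh", "~/.swh"]


def candidate_paths(relative="loader/dir.yml"):
    """Return every location where the configuration file may live."""
    return [Path(base).expanduser() / relative for base in CONFIG_DIRS]


def first_existing(paths):
    """Return the first path that is an existing file, or None."""
    return next((p for p in paths if p.is_file()), None)


if __name__ == "__main__":
    for path in candidate_paths():
        print(path)
    print("found:", first_existing(candidate_paths()))
```

Passing `"worker.yml"` as `relative` lists the candidate locations for the celery worker configuration in the same way.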
