diff --git a/docs/getting-started.rst b/docs/getting-started.rst index 33f1749b..5e4556b1 100644 --- a/docs/getting-started.rst +++ b/docs/getting-started.rst @@ -1,286 +1,286 @@ Getting Started =============== This is a guide for how to prepare and push a software deposit with the `swh deposit` commands. The API is rooted at https://deposit.softwareheritage.org/1. For more details, see the `main documentation <./index.html>`__. Requirements ------------ You need to be referenced on SWH's client list to have: * credentials (needed for the basic authentication step) - in this document we reference ```` as the client's name and ```` as its associated authentication password. * an associated collection_. .. _collection: https://bitworking.org/projects/atom/rfc5023#rfc.section.8.3.3 `Contact us for more information. `__ Prepare a deposit ----------------- * compress the files in a supported archive format: - zip: common zip archive (no multi-disk zip files). - tar: tar archive without compression or optionally any of the following compression algorithm gzip (`.tar.gz`, `.tgz`), bzip2 (`.tar.bz2`) , or lzma (`.tar.lzma`) * (Optional) prepare a metadata file (more details :ref:`deposit-metadata`): Push deposit ------------ You can push a deposit with: * a single deposit (archive + metadata): The user posts in one query a software source code archive and associated metadata. The deposit is directly marked with status ``deposited``. * a multisteps deposit: 1. Create an incomplete deposit (marked with status ``partial``) 2. Add data to a deposit (in multiple requests if needed) 3. Finalize deposit (the status becomes ``deposited``) Single deposit ^^^^^^^^^^^^^^ Once the files are ready for deposit, we want to do the actual deposit in one shot, sending exactly one POST query: * 1 archive (content-type ``application/zip`` or ``application/x-tar``) * 1 metadata file in atom xml format (``content-type: application/atom+xml;type=entry``) For this, we need to provide the: * arguments: ``--username 'name' --password 'pass'`` as credentials * archive's path (example: ``--archive path/to/archive-name.tgz``) * software's name (optional if a metadata filepath is specified and the artifact's name is included in the metadata file). * author's name (optional if a metadata filepath is specified and the authors are included in the metadata file). This can be specified multiple times in case of multiple authors. * (optionally) metadata file's path ``--metadata path/to/file.metadata.xml``. * (optionally) ``--slug 'your-id'`` argument, a reference to a unique identifier the client uses for the software object. If not provided, A UUID will be generated by SWH. You can do this with the following command: minimal deposit .. code:: shell $ swh deposit upload --username name --password secret \ - --author "some@noone" --author "second@noone" \ + --author "some@nobody" --author "second@nobody" \ --name 'je-suis-gpl' \ --archive je-suis-gpl.tgz with client's external identifier (``slug``) .. code:: shell $ swh deposit upload --username name --password secret \ - --author "some@noone" \ + --author "some@nobody" \ --name 'je-suis-gpl' \ --archive je-suis-gpl.tgz \ --slug je-suis-gpl to a specific client's collection .. code:: shell $ swh deposit upload --username name --password secret \ - --author "some@noone" \ + --author "some@nobody" \ --name 'je-suis-gpl' \ --archive je-suis-gpl.tgz \ --collection 'second-collection' You just posted a deposit to your collection on Software Heritage If everything went well, the successful response will contain the elements below: .. code:: shell { 'deposit_status': 'deposited', 'deposit_id': '7', 'deposit_date': 'Jan. 29, 2018, 12:29 p.m.' } Note: As the deposit is in ``deposited`` status, you can no longer update the deposit after this query. It will be answered with a 403 forbidden answer. If something went wrong, an equivalent response will be given with the `error` and `detail` keys explaining the issue, e.g.: .. code:: shell { 'error': 'Unknown collection name xyz', 'detail': None, 'deposit_status': None, 'deposit_status_detail': None, 'deposit_swh_id': None, 'status': 404 } multisteps deposit ^^^^^^^^^^^^^^^^^^^^^^^^^ The steps to create a multisteps deposit: 1. Create an incomplete deposit ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ First use the ``--partial`` argument to declare there is more to come .. code:: shell $ swh deposit upload --username name --password secret \ --archive foo.tar.gz \ --partial 2. Add content or metadata to the deposit ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Continue the deposit by using the ``--deposit-id`` argument given as a response for the first step. You can continue adding content or metadata while you use the ``--partial`` argument. To only add one new archive to the deposit: .. code:: shell $ swh deposit upload --username name --password secret \ --archive add-foo.tar.gz \ --deposit-id 42 \ --partial To only add metadata to the deposit: .. code:: shell $ swh deposit upload --username name --password secret \ --metadata add-foo.tar.gz.metadata.xml \ --deposit-id 42 \ --partial or: .. code:: shell $ swh deposit upload --username name --password secret \ --name 'add-foo' --author 'someone' \ --deposit-id 42 \ --partial 3. Finalize deposit ~~~~~~~~~~~~~~~~~~~ On your last addition (same command as before), by not declaring it ``--partial``, the deposit will be considered completed. Its status will be changed to ``deposited`` Update deposit ---------------- * replace deposit: - only possible if the deposit status is ``partial`` and ``--deposit-id `` is provided - by using the ``--replace`` flag - ``--metadata-deposit`` replaces associated existing metadata - ``--archive-deposit`` replaces associated archive(s) - by default, with no flag or both, you'll replace associated metadata and archive(s): .. code:: shell $ swh deposit upload --username name --password secret \ --deposit-id 11 \ --archive updated-je-suis-gpl.tgz \ --replace * update a loaded deposit with a new version: - by using the external-id with the ``--slug`` argument, you will link the new deposit with its parent deposit: .. code:: shell $ swh deposit upload --username name --password secret \ --archive je-suis-gpl-v2.tgz \ --slug 'je-suis-gpl' \ Check the deposit's status -------------------------- You can check the status of the deposit by using the ``--deposit-id`` argument: .. code:: shell $ swh deposit status --username name --password secret \ --deposit-id 11 .. code:: json { 'deposit_id': '11', 'deposit_status': 'deposited', 'deposit_swh_id': None, 'deposit_status_detail': 'Deposit is ready for additional checks \ (tarball ok, metadata, etc...)' } The different statuses: - **partial**: multipart deposit is still ongoing - **deposited**: deposit completed - **rejected**: deposit failed the checks - **verified**: content and metadata verified - **loading**: loading in-progress - **done**: loading completed successfully - **failed**: the deposit loading has failed When the deposit has been loaded into the archive, the status will be marked ``done``. In the response, will also be available the , , , . For example: .. code:: json { 'deposit_id': '11', 'deposit_status': 'done', 'deposit_swh_id': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9', 'deposit_swh_id_context': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9;origin=https://forge.softwareheritage.org/source/jesuisgpl/', 'deposit_swh_anchor_id': 'swh:1:rev:e76ea49c9ffbb7f73611087ba6e999b19e5d71eb', 'deposit_swh_anchor_id_context': 'swh:1:rev:e76ea49c9ffbb7f73611087ba6e999b19e5d71eb;origin=https://forge.softwareheritage.org/source/jesuisgpl/', 'deposit_status_detail': 'The deposit has been successfully \ loaded into the Software Heritage archive' } diff --git a/swh/deposit/cli/admin.py b/swh/deposit/cli/admin.py index 364ee32e..100a7ef1 100644 --- a/swh/deposit/cli/admin.py +++ b/swh/deposit/cli/admin.py @@ -1,254 +1,254 @@ # Copyright (C) 2017-2019 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information import click from swh.deposit.config import setup_django_for from swh.deposit.cli import deposit @deposit.group('admin') @click.option('--config-file', '-C', default=None, type=click.Path(exists=True, dir_okay=False,), help="Optional extra configuration file.") @click.option('--platform', default='development', type=click.Choice(['development', 'production']), help='development or production platform') @click.pass_context def admin(ctx, config_file, platform): """Server administration tasks (manipulate user or collections)""" # configuration happens here setup_django_for(platform, config_file=config_file) @admin.group('user') @click.pass_context def user(ctx): """Manipulate user.""" # configuration happens here pass def _create_collection(name): """Create the collection with name if it does not exist. Args: name (str): collection's name Returns: collection (DepositCollection): the existing collection object (created or not) """ # to avoid loading too early django namespaces from swh.deposit.models import DepositCollection try: collection = DepositCollection.objects.get(name=name) click.echo('Collection %s exists, nothing to do.' % name) except DepositCollection.DoesNotExist: click.echo('Create new collection %s' % name) collection = DepositCollection.objects.create(name=name) click.echo('Collection %s created' % name) return collection @user.command('create') @click.option('--username', required=True, help="User's name") @click.option('--password', required=True, help="Desired user's password (plain).") @click.option('--firstname', default='', help="User's first name") @click.option('--lastname', default='', help="User's last name") @click.option('--email', default='', help="User's email") @click.option('--collection', help="User's collection") @click.option('--provider-url', default='', help="Provider URL") @click.option('--domain', default='', help="The domain") @click.pass_context def user_create(ctx, username, password, firstname, lastname, email, collection, provider_url, domain): """Create a user with some needed information (password, collection) If the collection does not exist, the collection is then created alongside. - The password is stored encrypted using django's utilies. + The password is stored encrypted using django's utilities. """ # to avoid loading too early django namespaces from swh.deposit.models import DepositClient # If collection is not provided, fallback to username if not collection: collection = username click.echo('collection: %s' % collection) # create the collection if it does not exist collection = _create_collection(collection) # user create/update try: user = DepositClient.objects.get(username=username) click.echo('User %s exists, updating information.' % user) user.set_password(password) except DepositClient.DoesNotExist: click.echo('Create new user %s' % username) user = DepositClient.objects.create_user( username=username, password=password) user.collections = [collection.id] user.first_name = firstname user.last_name = lastname user.email = email user.is_active = True user.provider_url = provider_url user.domain = domain user.save() click.echo('Information registered for user %s' % user) @user.command('list') @click.pass_context def user_list(ctx): """List existing users. This entrypoint is not paginated yet as there is not a lot of entry. """ # to avoid loading too early django namespaces from swh.deposit.models import DepositClient users = DepositClient.objects.all() if not users: output = 'Empty user list' else: output = '\n'.join((user.username for user in users)) click.echo(output) @user.command('exists') @click.argument('username', required=True) @click.pass_context def user_exists(ctx, username): """Check if user exists. """ # to avoid loading too early django namespaces from swh.deposit.models import DepositClient try: DepositClient.objects.get(username=username) click.echo('User %s exists.' % username) ctx.exit(0) except DepositClient.DoesNotExist: click.echo('User %s does not exist.' % username) ctx.exit(1) @admin.group('collection') @click.pass_context def collection(ctx): """Manipulate collections.""" pass @collection.command('create') @click.option('--name', required=True, help="Collection's name") @click.pass_context def collection_create(ctx, name): _create_collection(name) @collection.command('list') @click.pass_context def collection_list(ctx): """List existing collections. This entrypoint is not paginated yet as there is not a lot of entry. """ # to avoid loading too early django namespaces from swh.deposit.models import DepositCollection collections = DepositCollection.objects.all() if not collections: output = 'Empty collection list' else: output = '\n'.join((col.name for col in collections)) click.echo(output) @admin.group('deposit') @click.pass_context def deposit(ctx): """Manipulate deposit.""" pass @deposit.command('reschedule') @click.option('--deposit-id', required=True, help="Deposit identifier") @click.pass_context def deposit_reschedule(ctx, deposit_id): """Reschedule the deposit loading This will: - check the deposit's status to something reasonable (failed or done). That means that the checks have passed alright but something went wrong during the loading (failed: loading failed, done: loading ok, still for some reasons as in bugs, we need to reschedule it) - reset the deposit's status to 'verified' (prior to any loading but after the checks which are fine) and removes the different archives' identifiers (swh-id, ...) - trigger back the loading task through the scheduler """ # to avoid loading too early django namespaces from datetime import datetime from swh.deposit.models import Deposit from swh.deposit.config import ( DEPOSIT_STATUS_LOAD_SUCCESS, DEPOSIT_STATUS_LOAD_FAILURE, DEPOSIT_STATUS_VERIFIED, SWHDefaultConfig, ) try: deposit = Deposit.objects.get(pk=deposit_id) except Deposit.DoesNotExist: click.echo('Deposit %s does not exist.' % deposit_id) ctx.exit(1) # Check the deposit is in a reasonable state accepted_statuses = [ DEPOSIT_STATUS_LOAD_SUCCESS, DEPOSIT_STATUS_LOAD_FAILURE ] if deposit.status == DEPOSIT_STATUS_VERIFIED: click.echo('Deposit %s\'s status already set for rescheduling.' % ( deposit_id)) ctx.exit(0) if deposit.status not in accepted_statuses: click.echo('Deposit %s\'s status be one of %s.' % ( deposit_id, ', '.join(accepted_statuses))) ctx.exit(1) task_id = deposit.load_task_id if not task_id: click.echo('Deposit %s cannot be rescheduled. It misses the ' 'associated task.' % deposit_id) ctx.exit(1) # Reset the deposit's state deposit.swh_id = None deposit.swh_id_context = None deposit.swh_anchor_id = None deposit.swh_anchor_id_context = None deposit.status = DEPOSIT_STATUS_VERIFIED deposit.save() # Trigger back the deposit scheduler = SWHDefaultConfig().scheduler scheduler.set_status_tasks( [task_id], status='next_run_not_scheduled', next_run=datetime.now())