diff --git a/docs/getting-started.rst b/docs/getting-started.rst
index 33f1749b..5e4556b1 100644
--- a/docs/getting-started.rst
+++ b/docs/getting-started.rst
@@ -1,286 +1,286 @@
Getting Started
===============
This guide explains how to prepare and push a software deposit with
the ``swh deposit`` commands.
The API is rooted at https://deposit.softwareheritage.org/1.
For more details, see the `main documentation <./index.html>`__.
Requirements
------------
You need to be referenced on SWH's client list to have:
* credentials (needed for the basic authentication step)
- in this document we reference ``<name>`` as the client's name and
``<pass>`` as its associated authentication password.
* an associated collection_.
.. _collection: https://bitworking.org/projects/atom/rfc5023#rfc.section.8.3.3
`Contact us for more information.
<https://www.softwareheritage.org/contact/>`__
Prepare a deposit
-----------------
* compress the files in a supported archive format:
- zip: common zip archive (no multi-disk zip files).
- tar: tar archive, either uncompressed or compressed with one of the
following algorithms: gzip (``.tar.gz``, ``.tgz``), bzip2
(``.tar.bz2``), or lzma (``.tar.lzma``).
* (Optional) prepare a metadata file (more details: :ref:`deposit-metadata`).
Push deposit
------------
You can push a deposit with:
* a single deposit (archive + metadata):
The user posts a software source code archive and its associated
metadata in a single query.
The deposit is directly marked with status ``deposited``.
* a multisteps deposit:
1. Create an incomplete deposit (marked with status ``partial``)
2. Add data to a deposit (in multiple requests if needed)
3. Finalize deposit (the status becomes ``deposited``)
Single deposit
^^^^^^^^^^^^^^
Once the files are ready for deposit, we want to do the actual deposit
in one shot, sending exactly one POST query:
* 1 archive (content-type ``application/zip`` or ``application/x-tar``)
* 1 metadata file in atom xml format (``content-type: application/atom+xml;type=entry``)
For this, we need to provide the:
* arguments: ``--username 'name' --password 'pass'`` as credentials
* archive's path (example: ``--archive path/to/archive-name.tgz``)
* software's name (optional if a metadata filepath is specified and the
artifact's name is included in the metadata file).
* author's name (optional if a metadata filepath is specified and the authors
are included in the metadata file). This can be specified multiple times in
case of multiple authors.
* (optionally) metadata file's path ``--metadata
path/to/file.metadata.xml``.
* (optionally) ``--slug 'your-id'`` argument, a reference to a unique identifier
the client uses for the software object. If not provided, a UUID will be
generated by SWH.
You can do this with the following command:
minimal deposit
.. code:: shell
$ swh deposit upload --username name --password secret \
- --author "some@noone" --author "second@noone" \
+ --author "some@nobody" --author "second@nobody" \
--name 'je-suis-gpl' \
--archive je-suis-gpl.tgz
with client's external identifier (``slug``)
.. code:: shell
$ swh deposit upload --username name --password secret \
- --author "some@noone" \
+ --author "some@nobody" \
--name 'je-suis-gpl' \
--archive je-suis-gpl.tgz \
--slug je-suis-gpl
to a specific client's collection
.. code:: shell
$ swh deposit upload --username name --password secret \
- --author "some@noone" \
+ --author "some@nobody" \
--name 'je-suis-gpl' \
--archive je-suis-gpl.tgz \
--collection 'second-collection'
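The three invocations above differ only in their optional flags. As a rough sketch, a script could assemble the argument list as follows. ``build_upload_command`` is a hypothetical helper, not part of ``swh deposit``, and it only uses flags shown in this guide.

```python
def build_upload_command(username, password, archive, name, authors,
                         slug=None, collection=None):
    """Return the argv list for a single-shot `swh deposit upload` call.

    Hypothetical helper: it only assembles flags documented in this guide
    and does not run anything.
    """
    cmd = ["swh", "deposit", "upload",
           "--username", username, "--password", password]
    # --author may be repeated, once per author
    for author in authors:
        cmd += ["--author", author]
    cmd += ["--name", name, "--archive", archive]
    # optional flags are only emitted when a value is given
    if slug is not None:
        cmd += ["--slug", slug]
    if collection is not None:
        cmd += ["--collection", collection]
    return cmd


command = build_upload_command(
    "name", "secret", "je-suis-gpl.tgz", "je-suis-gpl",
    authors=["some@nobody", "second@nobody"], slug="je-suis-gpl")
print(" ".join(command))
```

The resulting list can be passed to ``subprocess.run`` unchanged, which avoids shell quoting issues with names containing spaces.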
You just posted a deposit to your collection on Software Heritage.
If everything went well, the successful response will contain the
elements below:
.. code:: shell
{
'deposit_status': 'deposited',
'deposit_id': '7',
'deposit_date': 'Jan. 29, 2018, 12:29 p.m.'
}
Note: As the deposit is in ``deposited`` status, you can no longer
update the deposit after this query. Any such attempt will be answered
with a 403 Forbidden response.
If something went wrong, an equivalent response will be given, with the
``error`` and ``detail`` keys explaining the issue, e.g.:
.. code:: shell
{
'error': 'Unknown collection name xyz',
'detail': None,
'deposit_status': None,
'deposit_status_detail': None,
'deposit_swh_id': None,
'status': 404
}
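A caller that receives one of these dictionaries can tell the two cases apart by looking at the ``error`` key. The snippet below is a minimal sketch under that assumption. ``summarize_response`` is a hypothetical name, and the two sample dictionaries are copied from the examples above.

```python
def summarize_response(response):
    """Return (ok, message) for a response dictionary.

    Hypothetical helper: a response is treated as a failure when it
    carries a non-empty 'error' key, as in the examples of this guide.
    """
    if response.get("error"):
        message = response["error"]
        detail = response.get("detail")
        if detail:
            message = "%s (%s)" % (message, detail)
        return False, message
    return True, "deposit %s is %s" % (
        response.get("deposit_id"), response.get("deposit_status"))


success = {
    'deposit_status': 'deposited',
    'deposit_id': '7',
    'deposit_date': 'Jan. 29, 2018, 12:29 p.m.',
}
failure = {
    'error': 'Unknown collection name xyz',
    'detail': None,
    'deposit_status': None,
    'status': 404,
}
print(summarize_response(success))
print(summarize_response(failure))
```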
Multisteps deposit
^^^^^^^^^^^^^^^^^^
The steps to create a multisteps deposit:
1. Create an incomplete deposit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
First use the ``--partial`` argument to declare there is more to come:
.. code:: shell
$ swh deposit upload --username name --password secret \
--archive foo.tar.gz \
--partial
2. Add content or metadata to the deposit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Continue the deposit by passing, through the ``--deposit-id`` argument, the
deposit identifier returned in the response to the first step. You can keep
adding content or metadata as long as you pass the ``--partial`` argument.
To only add one new archive to the deposit:
.. code:: shell
$ swh deposit upload --username name --password secret \
--archive add-foo.tar.gz \
--deposit-id 42 \
--partial
To only add metadata to the deposit:
.. code:: shell
$ swh deposit upload --username name --password secret \
--metadata add-foo.tar.gz.metadata.xml \
--deposit-id 42 \
--partial
or:
.. code:: shell
$ swh deposit upload --username name --password secret \
--name 'add-foo' --author 'someone' \
--deposit-id 42 \
--partial
3. Finalize deposit
~~~~~~~~~~~~~~~~~~~
On your last addition (same command as before), omit the ``--partial``
argument: the deposit will then be considered complete and its status will be
changed to ``deposited``.
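Steps 2 and 3 follow one rule: every addition but the last carries ``--partial``. The sketch below illustrates that rule by building the argument lists for a series of additions. ``continuation_commands`` is a hypothetical helper, and the deposit identifier ``42`` stands for whatever the first step returned.

```python
def continuation_commands(username, password, deposit_id, additions):
    """Build argv lists for the additions to an existing partial deposit.

    Hypothetical helper. `additions` is a list of option lists such as
    ["--archive", "add-foo.tar.gz"]. Every command but the last one gets
    --partial, so the last one finalizes the deposit.
    """
    base = ["swh", "deposit", "upload",
            "--username", username, "--password", password]
    commands = []
    for index, extra in enumerate(additions):
        cmd = base + list(extra) + ["--deposit-id", str(deposit_id)]
        if index < len(additions) - 1:
            cmd.append("--partial")
        commands.append(cmd)
    return commands


steps = continuation_commands("name", "secret", 42, [
    ["--archive", "add-foo.tar.gz"],
    ["--metadata", "add-foo.tar.gz.metadata.xml"],
])
for step in steps:
    print(" ".join(step))
```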
Update deposit
----------------
* replace deposit:
- only possible if the deposit status is ``partial`` and
``--deposit-id <id>`` is provided
- by using the ``--replace`` flag
- ``--metadata-deposit`` replaces associated existing metadata
- ``--archive-deposit`` replaces associated archive(s)
- by default, with no flag or both, you'll replace associated
metadata and archive(s):
.. code:: shell
$ swh deposit upload --username name --password secret \
--deposit-id 11 \
--archive updated-je-suis-gpl.tgz \
--replace
* update a loaded deposit with a new version:
- by using the external-id with the ``--slug`` argument, you will
link the new deposit with its parent deposit:
.. code:: shell
$ swh deposit upload --username name --password secret \
--archive je-suis-gpl-v2.tgz \
--slug 'je-suis-gpl'
Check the deposit's status
--------------------------
You can check the status of the deposit by using the ``--deposit-id`` argument:
.. code:: shell
$ swh deposit status --username name --password secret \
--deposit-id 11
.. code:: python
{
'deposit_id': '11',
'deposit_status': 'deposited',
'deposit_swh_id': None,
'deposit_status_detail': 'Deposit is ready for additional checks \
(tarball ok, metadata, etc...)'
}
The different statuses:
- **partial**: multipart deposit is still ongoing
- **deposited**: deposit completed
- **rejected**: deposit failed the checks
- **verified**: content and metadata verified
- **loading**: loading in-progress
- **done**: loading completed successfully
- **failed**: the deposit loading has failed
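One way to read this list is as a small state machine. The transition table below is an assumption inferred from the descriptions in this guide and from the ``reschedule`` admin command, which resets ``done`` or ``failed`` deposits to ``verified``. It is not taken from the server code.

```python
# Assumed transitions, inferred from this guide (not from the server code).
TRANSITIONS = {
    "partial": {"partial", "deposited"},   # more data, or finalize
    "deposited": {"verified", "rejected"}, # outcome of the checks
    "verified": {"loading"},
    "loading": {"done", "failed"},
    "done": {"verified"},                  # admin reschedule
    "failed": {"verified"},                # admin reschedule
    "rejected": set(),
}


def can_move(current, target):
    """Tell whether the assumed table allows going from current to target."""
    return target in TRANSITIONS.get(current, set())


print(can_move("partial", "deposited"))
print(can_move("deposited", "partial"))
```

Under this reading, a ``deposited`` deposit can never return to ``partial``, which matches the note that it can no longer be updated.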
When the deposit has been loaded into the archive, the status will be
marked ``done``. The response will then also contain the
``deposit_swh_id``, ``deposit_swh_id_context``, ``deposit_swh_anchor_id``,
and ``deposit_swh_anchor_id_context`` keys. For example:
.. code:: python
{
'deposit_id': '11',
'deposit_status': 'done',
'deposit_swh_id': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9',
'deposit_swh_id_context': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9;origin=https://forge.softwareheritage.org/source/jesuisgpl/',
'deposit_swh_anchor_id': 'swh:1:rev:e76ea49c9ffbb7f73611087ba6e999b19e5d71eb',
'deposit_swh_anchor_id_context': 'swh:1:rev:e76ea49c9ffbb7f73611087ba6e999b19e5d71eb;origin=https://forge.softwareheritage.org/source/jesuisgpl/',
'deposit_status_detail': 'The deposit has been successfully \
loaded into the Software Heritage archive'
}
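The ``*_context`` values in this example are the plain identifier followed by ``;origin=<url>``. A minimal sketch for splitting such a value is shown below. ``parse_swhid`` is a hypothetical helper that only handles the shape seen in this example, not the full identifier specification.

```python
def parse_swhid(value):
    """Split an identifier such as 'swh:1:dir:<hash>;origin=<url>'.

    Hypothetical helper covering only the shape shown in this guide.
    """
    # everything after the first ';' is a list of key=value qualifiers
    core, _, qualifier_part = value.partition(";")
    scheme, version, object_type, object_id = core.split(":")
    qualifiers = {}
    if qualifier_part:
        for item in qualifier_part.split(";"):
            key, _, val = item.partition("=")
            qualifiers[key] = val
    return {
        "scheme": scheme,
        "version": version,
        "object_type": object_type,
        "object_id": object_id,
        "qualifiers": qualifiers,
    }


parsed = parse_swhid(
    "swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9"
    ";origin=https://forge.softwareheritage.org/source/jesuisgpl/")
print(parsed["object_type"], parsed["qualifiers"]["origin"])
```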
diff --git a/swh/deposit/cli/admin.py b/swh/deposit/cli/admin.py
index 364ee32e..100a7ef1 100644
--- a/swh/deposit/cli/admin.py
+++ b/swh/deposit/cli/admin.py
@@ -1,254 +1,254 @@
# Copyright (C) 2017-2019 The Software Heritage developers
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information
import click
from swh.deposit.config import setup_django_for
from swh.deposit.cli import deposit
@deposit.group('admin')
@click.option('--config-file', '-C', default=None,
type=click.Path(exists=True, dir_okay=False,),
help="Optional extra configuration file.")
@click.option('--platform', default='development',
type=click.Choice(['development', 'production']),
help='development or production platform')
@click.pass_context
def admin(ctx, config_file, platform):
"""Server administration tasks (manipulate user or collections)"""
# configuration happens here
setup_django_for(platform, config_file=config_file)
@admin.group('user')
@click.pass_context
def user(ctx):
"""Manipulate user."""
# configuration happens here
pass
def _create_collection(name):
"""Create the collection with name if it does not exist.
Args:
name (str): collection's name
Returns:
collection (DepositCollection): the existing collection object
(created or not)
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositCollection
try:
collection = DepositCollection.objects.get(name=name)
click.echo('Collection %s exists, nothing to do.' % name)
except DepositCollection.DoesNotExist:
click.echo('Create new collection %s' % name)
collection = DepositCollection.objects.create(name=name)
click.echo('Collection %s created' % name)
return collection
@user.command('create')
@click.option('--username', required=True, help="User's name")
@click.option('--password', required=True,
help="Desired user's password (plain).")
@click.option('--firstname', default='', help="User's first name")
@click.option('--lastname', default='', help="User's last name")
@click.option('--email', default='', help="User's email")
@click.option('--collection', help="User's collection")
@click.option('--provider-url', default='', help="Provider URL")
@click.option('--domain', default='', help="The domain")
@click.pass_context
def user_create(ctx, username, password, firstname, lastname, email,
collection, provider_url, domain):
"""Create a user with some needed information (password, collection)
If the collection does not exist, the collection is then created
alongside.
- The password is stored encrypted using django's utilies.
+ The password is stored encrypted using django's utilities.
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositClient
# If collection is not provided, fallback to username
if not collection:
collection = username
click.echo('collection: %s' % collection)
# create the collection if it does not exist
collection = _create_collection(collection)
# user create/update
try:
user = DepositClient.objects.get(username=username)
click.echo('User %s exists, updating information.' % user)
user.set_password(password)
except DepositClient.DoesNotExist:
click.echo('Create new user %s' % username)
user = DepositClient.objects.create_user(
username=username,
password=password)
user.collections = [collection.id]
user.first_name = firstname
user.last_name = lastname
user.email = email
user.is_active = True
user.provider_url = provider_url
user.domain = domain
user.save()
click.echo('Information registered for user %s' % user)
@user.command('list')
@click.pass_context
def user_list(ctx):
"""List existing users.
This entrypoint is not paginated yet as there are not many
entries.
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositClient
users = DepositClient.objects.all()
if not users:
output = 'Empty user list'
else:
output = '\n'.join((user.username for user in users))
click.echo(output)
@user.command('exists')
@click.argument('username', required=True)
@click.pass_context
def user_exists(ctx, username):
"""Check if user exists.
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositClient
try:
DepositClient.objects.get(username=username)
click.echo('User %s exists.' % username)
ctx.exit(0)
except DepositClient.DoesNotExist:
click.echo('User %s does not exist.' % username)
ctx.exit(1)
@admin.group('collection')
@click.pass_context
def collection(ctx):
"""Manipulate collections."""
pass
@collection.command('create')
@click.option('--name', required=True, help="Collection's name")
@click.pass_context
def collection_create(ctx, name):
_create_collection(name)
@collection.command('list')
@click.pass_context
def collection_list(ctx):
"""List existing collections.
This entrypoint is not paginated yet as there are not many
entries.
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositCollection
collections = DepositCollection.objects.all()
if not collections:
output = 'Empty collection list'
else:
output = '\n'.join((col.name for col in collections))
click.echo(output)
@admin.group('deposit')
@click.pass_context
def deposit(ctx):
"""Manipulate deposit."""
pass
@deposit.command('reschedule')
@click.option('--deposit-id', required=True, help="Deposit identifier")
@click.pass_context
def deposit_reschedule(ctx, deposit_id):
"""Reschedule the deposit loading
This will:
- check that the deposit's status is one that can be rescheduled (failed or
done). Either status means the checks passed and the loading was attempted
(failed: the loading failed; done: the loading succeeded but, e.g. because
of a bug, it needs to be run again)
- reset the deposit's status to 'verified' (the state after the checks have
passed and before any loading) and remove the archive identifiers
(swh-id, ...)
- trigger the loading task again through the scheduler
"""
# to avoid loading too early django namespaces
from datetime import datetime
from swh.deposit.models import Deposit
from swh.deposit.config import (
DEPOSIT_STATUS_LOAD_SUCCESS, DEPOSIT_STATUS_LOAD_FAILURE,
DEPOSIT_STATUS_VERIFIED, SWHDefaultConfig,
)
try:
deposit = Deposit.objects.get(pk=deposit_id)
except Deposit.DoesNotExist:
click.echo('Deposit %s does not exist.' % deposit_id)
ctx.exit(1)
# Check the deposit is in a reasonable state
accepted_statuses = [
DEPOSIT_STATUS_LOAD_SUCCESS, DEPOSIT_STATUS_LOAD_FAILURE
]
if deposit.status == DEPOSIT_STATUS_VERIFIED:
click.echo('Deposit %s\'s status already set for rescheduling.' % (
deposit_id))
ctx.exit(0)
if deposit.status not in accepted_statuses:
click.echo('Deposit %s\'s status must be one of %s.' % (
deposit_id, ', '.join(accepted_statuses)))
ctx.exit(1)
task_id = deposit.load_task_id
if not task_id:
click.echo('Deposit %s cannot be rescheduled. It has no '
'associated task.' % deposit_id)
ctx.exit(1)
# Reset the deposit's state
deposit.swh_id = None
deposit.swh_id_context = None
deposit.swh_anchor_id = None
deposit.swh_anchor_id_context = None
deposit.status = DEPOSIT_STATUS_VERIFIED
deposit.save()
# Trigger back the deposit
scheduler = SWHDefaultConfig().scheduler
scheduler.set_status_tasks(
[task_id], status='next_run_not_scheduled',
next_run=datetime.now())
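The guard clauses of ``deposit_reschedule`` can be restated as a pure function, which is easier to test than the Django-backed command. This is only a sketch: ``reschedule_decision`` is a hypothetical name, and the three status strings are assumed to match the ``DEPOSIT_STATUS_*`` constants, whose values are not shown in this diff.

```python
# Assumed values for the DEPOSIT_STATUS_* constants (not shown in this diff).
STATUS_VERIFIED = "verified"
STATUS_LOAD_SUCCESS = "done"
STATUS_LOAD_FAILURE = "failed"


def reschedule_decision(status, task_id):
    """Return (exit_code, should_reset) mirroring the checks above.

    Hypothetical helper: should_reset is True only when the deposit would
    be reset to 'verified' and its loading task triggered again.
    """
    if status == STATUS_VERIFIED:
        # already in the state rescheduling would produce
        return 0, False
    if status not in (STATUS_LOAD_SUCCESS, STATUS_LOAD_FAILURE):
        return 1, False
    if not task_id:
        # no loading task is associated with the deposit
        return 1, False
    return 0, True


print(reschedule_decision("failed", 12))
print(reschedule_decision("partial", 12))
```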
