diff --git a/docs/getting-started.rst b/docs/getting-started.rst
index 33f1749b..5e4556b1 100644
--- a/docs/getting-started.rst
+++ b/docs/getting-started.rst
@@ -1,286 +1,286 @@
Getting Started
===============
This is a guide for how to prepare and push a software deposit with
the `swh deposit` commands.
The API is rooted at https://deposit.softwareheritage.org/1.
For more details, see the `main documentation <./index.html>`__.
Requirements
------------
You need to be referenced on SWH's client list to have:
* credentials (needed for the basic authentication step)
- in this document we reference ``<name>`` as the client's name and
``<pass>`` as its associated authentication password.
* an associated collection_.
.. _collection: https://bitworking.org/projects/atom/rfc5023#rfc.section.8.3.3
`Contact us for more information.
<https://www.softwareheritage.org/contact/>`__
Prepare a deposit
-----------------
* compress the files in a supported archive format:
- zip: common zip archive (no multi-disk zip files).
- tar: tar archive without compression or optionally any of the
following compression algorithms: gzip (`.tar.gz`, `.tgz`), bzip2
(`.tar.bz2`), or lzma (`.tar.lzma`)
* (Optional) prepare a metadata file (for more details, see :ref:`deposit-metadata`).
Push deposit
------------
You can push a deposit with:
* a single deposit (archive + metadata):
The user posts in one query a software
source code archive and associated metadata.
The deposit is directly marked with status ``deposited``.
* a multisteps deposit:
1. Create an incomplete deposit (marked with status ``partial``)
2. Add data to a deposit (in multiple requests if needed)
3. Finalize deposit (the status becomes ``deposited``)
Single deposit
^^^^^^^^^^^^^^
Once the files are ready for deposit, we want to do the actual deposit
in one shot, sending exactly one POST query:
* 1 archive (content-type ``application/zip`` or ``application/x-tar``)
* 1 metadata file in atom xml format (``content-type: application/atom+xml;type=entry``)
For this, we need to provide the:
* arguments: ``--username 'name' --password 'pass'`` as credentials
* archive's path (example: ``--archive path/to/archive-name.tgz``)
* software's name (optional if a metadata filepath is specified and the
artifact's name is included in the metadata file).
* author's name (optional if a metadata filepath is specified and the authors
are included in the metadata file). This can be specified multiple times in
case of multiple authors.
* (optionally) metadata file's path ``--metadata
path/to/file.metadata.xml``.
* (optionally) ``--slug 'your-id'`` argument, a reference to a unique identifier
the client uses for the software object. If not provided, a UUID will be
generated by SWH.
You can do this with the following command:
minimal deposit
.. code:: shell
$ swh deposit upload --username name --password secret \
- --author "some@noone" --author "second@noone" \
+ --author "some@nobody" --author "second@nobody" \
--name 'je-suis-gpl' \
--archive je-suis-gpl.tgz
with client's external identifier (``slug``)
.. code:: shell
$ swh deposit upload --username name --password secret \
- --author "some@noone" \
+ --author "some@nobody" \
--name 'je-suis-gpl' \
--archive je-suis-gpl.tgz \
--slug je-suis-gpl
to a specific client's collection
.. code:: shell
$ swh deposit upload --username name --password secret \
- --author "some@noone" \
+ --author "some@nobody" \
--name 'je-suis-gpl' \
--archive je-suis-gpl.tgz \
--collection 'second-collection'
You just posted a deposit to your collection on Software Heritage.
If everything went well, the successful response will contain the
elements below:
.. code:: shell
{
'deposit_status': 'deposited',
'deposit_id': '7',
'deposit_date': 'Jan. 29, 2018, 12:29 p.m.'
}
Note: As the deposit is in ``deposited`` status, you can no longer
update the deposit after this query. Any further update request will be
answered with a 403 Forbidden response.
If something went wrong, an equivalent response will be given with the
`error` and `detail` keys explaining the issue, e.g.:
.. code:: shell
{
'error': 'Unknown collection name xyz',
'detail': None,
'deposit_status': None,
'deposit_status_detail': None,
'deposit_swh_id': None,
'status': 404
}
Multisteps deposit
^^^^^^^^^^^^^^^^^^^^^^^^^
The steps to create a multisteps deposit:
1. Create an incomplete deposit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
First, use the ``--partial`` argument to declare that there is more to come:
.. code:: shell
$ swh deposit upload --username name --password secret \
--archive foo.tar.gz \
--partial
2. Add content or metadata to the deposit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Continue the deposit by passing, through the ``--deposit-id`` argument, the
deposit id returned in the response to the first step. You can keep adding
content or metadata as long as you pass the ``--partial`` argument.
To only add one new archive to the deposit:
.. code:: shell
$ swh deposit upload --username name --password secret \
--archive add-foo.tar.gz \
--deposit-id 42 \
--partial
To only add metadata to the deposit:
.. code:: shell
$ swh deposit upload --username name --password secret \
--metadata add-foo.tar.gz.metadata.xml \
--deposit-id 42 \
--partial
or:
.. code:: shell
$ swh deposit upload --username name --password secret \
--name 'add-foo' --author 'someone' \
--deposit-id 42 \
--partial
3. Finalize deposit
~~~~~~~~~~~~~~~~~~~
For your last addition, use the same command as before but omit
``--partial``: the deposit is then considered complete and its status
changes to ``deposited``.
Update deposit
----------------
* replace deposit:
- only possible if the deposit status is ``partial`` and
``--deposit-id <id>`` is provided
- by using the ``--replace`` flag
- ``--metadata-deposit`` replaces associated existing metadata
- ``--archive-deposit`` replaces associated archive(s)
- by default, with no flag or both, you'll replace associated
metadata and archive(s):
.. code:: shell
$ swh deposit upload --username name --password secret \
--deposit-id 11 \
--archive updated-je-suis-gpl.tgz \
--replace
* update a loaded deposit with a new version:
- by using the external-id with the ``--slug`` argument, you will
link the new deposit with its parent deposit:
.. code:: shell
$ swh deposit upload --username name --password secret \
--archive je-suis-gpl-v2.tgz \
--slug 'je-suis-gpl'
Check the deposit's status
--------------------------
You can check the status of the deposit by using the ``--deposit-id`` argument:
.. code:: shell
$ swh deposit status --username name --password secret \
--deposit-id 11
.. code:: json
{
'deposit_id': '11',
'deposit_status': 'deposited',
'deposit_swh_id': None,
'deposit_status_detail': 'Deposit is ready for additional checks \
(tarball ok, metadata, etc...)'
}
The different statuses:
- **partial**: multipart deposit is still ongoing
- **deposited**: deposit completed
- **rejected**: deposit failed the checks
- **verified**: content and metadata verified
- **loading**: loading in-progress
- **done**: loading completed successfully
- **failed**: the deposit loading has failed
When the deposit has been loaded into the archive, the status will be
marked ``done``. The response will then also contain the
``deposit_swh_id``, ``deposit_swh_id_context``, ``deposit_swh_anchor_id``
and ``deposit_swh_anchor_id_context`` keys. For example:
.. code:: json
{
'deposit_id': '11',
'deposit_status': 'done',
'deposit_swh_id': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9',
'deposit_swh_id_context': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9;origin=https://forge.softwareheritage.org/source/jesuisgpl/',
'deposit_swh_anchor_id': 'swh:1:rev:e76ea49c9ffbb7f73611087ba6e999b19e5d71eb',
'deposit_swh_anchor_id_context': 'swh:1:rev:e76ea49c9ffbb7f73611087ba6e999b19e5d71eb;origin=https://forge.softwareheritage.org/source/jesuisgpl/',
'deposit_status_detail': 'The deposit has been successfully \
loaded into the Software Heritage archive'
}
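Reviewer's note on the status list in the documentation above: the statuses can be read as a small state machine, and a client script may want to know when to stop polling. The sketch below is illustrative only. ``describe_status`` and ``is_final`` are hypothetical helper names, not part of ``swh.deposit``, and treating ``rejected``, ``done`` and ``failed`` as final is an assumption drawn from the descriptions in the list (an administrator can still reschedule a ``done`` or ``failed`` deposit).

```python
# Hypothetical helpers, not part of swh.deposit. Status names and
# descriptions are copied from the list in docs/getting-started.rst.
STATUS_DESCRIPTIONS = {
    "partial": "multipart deposit is still ongoing",
    "deposited": "deposit completed",
    "rejected": "deposit failed the checks",
    "verified": "content and metadata verified",
    "loading": "loading in-progress",
    "done": "loading completed successfully",
    "failed": "the deposit loading has failed",
}

# Assumption: no automatic transition follows these statuses.
FINAL_STATUSES = frozenset({"rejected", "done", "failed"})


def describe_status(response):
    """Return 'status: description' for a status response dict."""
    status = response.get("deposit_status")
    if status not in STATUS_DESCRIPTIONS:
        raise ValueError("unknown deposit status: %r" % (status,))
    return "%s: %s" % (status, STATUS_DESCRIPTIONS[status])


def is_final(response):
    """Tell whether polling the status can stop (assumed final states)."""
    return response.get("deposit_status") in FINAL_STATUSES
```

A client would typically call ``is_final`` on each response of ``swh deposit status`` and stop polling once it returns ``True``.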
diff --git a/swh/deposit/cli/admin.py b/swh/deposit/cli/admin.py
index 364ee32e..100a7ef1 100644
--- a/swh/deposit/cli/admin.py
+++ b/swh/deposit/cli/admin.py
@@ -1,254 +1,254 @@
# Copyright (C) 2017-2019 The Software Heritage developers
# See the AUTHORS file at the top-level directory of this distribution
# License: GNU General Public License version 3, or any later version
# See top-level LICENSE file for more information
import click
from swh.deposit.config import setup_django_for
from swh.deposit.cli import deposit
@deposit.group('admin')
@click.option('--config-file', '-C', default=None,
type=click.Path(exists=True, dir_okay=False,),
help="Optional extra configuration file.")
@click.option('--platform', default='development',
type=click.Choice(['development', 'production']),
help='development or production platform')
@click.pass_context
def admin(ctx, config_file, platform):
"""Server administration tasks (manipulate user or collections)"""
# configuration happens here
setup_django_for(platform, config_file=config_file)
@admin.group('user')
@click.pass_context
def user(ctx):
"""Manipulate user."""
# configuration happens here
pass
def _create_collection(name):
"""Create the collection with name if it does not exist.
Args:
name (str): collection's name
Returns:
collection (DepositCollection): the existing collection object
(created or not)
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositCollection
try:
collection = DepositCollection.objects.get(name=name)
click.echo('Collection %s exists, nothing to do.' % name)
except DepositCollection.DoesNotExist:
click.echo('Create new collection %s' % name)
collection = DepositCollection.objects.create(name=name)
click.echo('Collection %s created' % name)
return collection
@user.command('create')
@click.option('--username', required=True, help="User's name")
@click.option('--password', required=True,
help="Desired user's password (plain).")
@click.option('--firstname', default='', help="User's first name")
@click.option('--lastname', default='', help="User's last name")
@click.option('--email', default='', help="User's email")
@click.option('--collection', help="User's collection")
@click.option('--provider-url', default='', help="Provider URL")
@click.option('--domain', default='', help="The domain")
@click.pass_context
def user_create(ctx, username, password, firstname, lastname, email,
collection, provider_url, domain):
"""Create a user with some needed information (password, collection)
If the collection does not exist, the collection is then created
alongside.
- The password is stored encrypted using django's utilies.
+ The password is stored encrypted using django's utilities.
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositClient
# If collection is not provided, fallback to username
if not collection:
collection = username
click.echo('collection: %s' % collection)
# create the collection if it does not exist
collection = _create_collection(collection)
# user create/update
try:
user = DepositClient.objects.get(username=username)
click.echo('User %s exists, updating information.' % user)
user.set_password(password)
except DepositClient.DoesNotExist:
click.echo('Create new user %s' % username)
user = DepositClient.objects.create_user(
username=username,
password=password)
user.collections = [collection.id]
user.first_name = firstname
user.last_name = lastname
user.email = email
user.is_active = True
user.provider_url = provider_url
user.domain = domain
user.save()
click.echo('Information registered for user %s' % user)
@user.command('list')
@click.pass_context
def user_list(ctx):
"""List existing users.
This entrypoint is not paginated yet as there are not many
entries.
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositClient
users = DepositClient.objects.all()
if not users:
output = 'Empty user list'
else:
output = '\n'.join((user.username for user in users))
click.echo(output)
@user.command('exists')
@click.argument('username', required=True)
@click.pass_context
def user_exists(ctx, username):
"""Check if user exists.
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositClient
try:
DepositClient.objects.get(username=username)
click.echo('User %s exists.' % username)
ctx.exit(0)
except DepositClient.DoesNotExist:
click.echo('User %s does not exist.' % username)
ctx.exit(1)
@admin.group('collection')
@click.pass_context
def collection(ctx):
"""Manipulate collections."""
pass
@collection.command('create')
@click.option('--name', required=True, help="Collection's name")
@click.pass_context
def collection_create(ctx, name):
_create_collection(name)
@collection.command('list')
@click.pass_context
def collection_list(ctx):
"""List existing collections.
This entrypoint is not paginated yet as there are not many
entries.
"""
# to avoid loading too early django namespaces
from swh.deposit.models import DepositCollection
collections = DepositCollection.objects.all()
if not collections:
output = 'Empty collection list'
else:
output = '\n'.join((col.name for col in collections))
click.echo(output)
@admin.group('deposit')
@click.pass_context
def deposit(ctx):
"""Manipulate deposit."""
pass
@deposit.command('reschedule')
@click.option('--deposit-id', required=True, help="Deposit identifier")
@click.pass_context
def deposit_reschedule(ctx, deposit_id):
"""Reschedule the deposit loading
This will:
- check that the deposit's status is a reasonable one (failed or done). That
means that the checks passed but something went wrong during the loading
(failed: loading failed; done: loading ok but, for some reason such as a
bug, we need to reschedule it)
- reset the deposit's status to 'verified' (prior to any loading but after
the checks, which passed) and remove the different archives'
identifiers (swh-id, ...)
- trigger the loading task again through the scheduler
"""
# to avoid loading too early django namespaces
from datetime import datetime
from swh.deposit.models import Deposit
from swh.deposit.config import (
DEPOSIT_STATUS_LOAD_SUCCESS, DEPOSIT_STATUS_LOAD_FAILURE,
DEPOSIT_STATUS_VERIFIED, SWHDefaultConfig,
)
try:
deposit = Deposit.objects.get(pk=deposit_id)
except Deposit.DoesNotExist:
click.echo('Deposit %s does not exist.' % deposit_id)
ctx.exit(1)
# Check the deposit is in a reasonable state
accepted_statuses = [
DEPOSIT_STATUS_LOAD_SUCCESS, DEPOSIT_STATUS_LOAD_FAILURE
]
if deposit.status == DEPOSIT_STATUS_VERIFIED:
click.echo('Deposit %s\'s status already set for rescheduling.' % (
deposit_id))
ctx.exit(0)
if deposit.status not in accepted_statuses:
click.echo('Deposit %s\'s status must be one of %s.' % (
deposit_id, ', '.join(accepted_statuses)))
ctx.exit(1)
task_id = deposit.load_task_id
if not task_id:
click.echo('Deposit %s cannot be rescheduled. It misses the '
'associated task.' % deposit_id)
ctx.exit(1)
# Reset the deposit's state
deposit.swh_id = None
deposit.swh_id_context = None
deposit.swh_anchor_id = None
deposit.swh_anchor_id_context = None
deposit.status = DEPOSIT_STATUS_VERIFIED
deposit.save()
# Trigger back the deposit
scheduler = SWHDefaultConfig().scheduler
scheduler.set_status_tasks(
[task_id], status='next_run_not_scheduled',
next_run=datetime.now())
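Reviewer's note on ``deposit_reschedule``: the status checks are easier to unit test when separated from Django and the scheduler. The sketch below restates the decision logic as a pure function. ``reschedule_decision`` is a hypothetical name, and the string values of the three constants are assumptions inferred from the statuses listed in ``docs/getting-started.rst``; the real constants live in ``swh.deposit.config``.

```python
# Assumption: these string values mirror the documented statuses; the
# real constants are defined in swh.deposit.config.
DEPOSIT_STATUS_VERIFIED = "verified"
DEPOSIT_STATUS_LOAD_SUCCESS = "done"
DEPOSIT_STATUS_LOAD_FAILURE = "failed"

ACCEPTED_STATUSES = (DEPOSIT_STATUS_LOAD_SUCCESS, DEPOSIT_STATUS_LOAD_FAILURE)


def reschedule_decision(status, load_task_id):
    """Hypothetical helper mirroring the checks in deposit_reschedule.

    Returns a tuple (exit_code, message, should_reschedule).
    """
    if status == DEPOSIT_STATUS_VERIFIED:
        # Already in the state a reschedule would put it in.
        return 0, "status already set for rescheduling", False
    if status not in ACCEPTED_STATUSES:
        return (1,
                "status must be one of %s" % ", ".join(ACCEPTED_STATUSES),
                False)
    if not load_task_id:
        # Without the loading task there is nothing to trigger again.
        return 1, "cannot be rescheduled, missing associated task", False
    return 0, "rescheduling", True
```

With such a helper, the command body would only have to echo the message, exit with the returned code, and reset the deposit and scheduler task when ``should_reschedule`` is true.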