diff --git a/talks-public/2018-02-04-deposit-vault-walkthrough/deposit-vault-walkthrough.org b/talks-public/2018-02-04-deposit-vault-walkthrough/deposit-vault-walkthrough.org
index cf67411..eae70b4 100644
--- a/talks-public/2018-02-04-deposit-vault-walkthrough/deposit-vault-walkthrough.org
+++ b/talks-public/2018-02-04-deposit-vault-walkthrough/deposit-vault-walkthrough.org
@@ -1,182 +1,254 @@
 #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
 #+LATEX_HEADER_EXTRA: \usepackage{listings}
 #+INCLUDE: "../../common/modules/prelude.org" :minlevel 1
 #+TITLE: Source Code Deposit Walkthrough
 # does not allow short title, so we override it for beamer as follows :
 #+BEAMER_HEADER: \title[Software Heritage]{Software Heritage\\Source Code Deposit Walkthrough}
 #+BEAMER_HEADER: \author{Morane Gruenepter}
 #+BEAMER_HEADER: \date[04/01/2018, Deposit walkthrough]{04 January 2018\\Deposit walkthrough\\Paris, France}
 #+AUTHOR: Morane Gruenpeter
 #+DATE: 04 January 2018
 #+EMAIL: morane@gmail.com
 #+DESCRIPTION: Software Heritage: Source Code Deposit Walkthroug
 #+KEYWORDS: software heritage preservation knowledge deposit technology sword
 #+BEAMER_HEADER: \institute[Software Heritage]{Metadata specialist\\Software Heritage\\\href{mailto:morane@softwareheritage.org}{\tt morane@softwareheritage.org}}
 * Source code deposit: From deposit to vault
 ** Source code deposit: From deposit to vault
 :PROPERTIES:
 :CUSTOM_ID: walkthrough
 :END:
 **** First version of our software deposit prototype \\
 - documentation is available on:\\
- *\url{docs.softwareheritage.org/devel/swh-deposit/}*
+ *\url{https://deposit.softwareheritage.org/}*
 **** Features
 - pushing *deposits* to the Software Heritage archive
 - software source code + metadata
 - full *transparency* of the loading and downloading processes
 - download the deposit by cooking the bundle in the *vault*
 **** SWORD-compliant
 - *SWORD v2* protocol for single and
   multi-part deposits
 - deposit MUST, SHOULD and MAY contain certain metadata attributes
-* Deposit walkthrough
+* Deposit walkthrough
+** Deposit walkthrough
+*** Prepare source code for deposit
+#+BEAMER: \tiny
+#+BEGIN_SRC
+$ tar cvf
+#+END_SRC
+#+BEAMER: \pause
+
+*** Create metadata file with /MUST/ metadata
+#+BEAMER: \tiny
+- *the url* representing the location of the source /MUST/ be provided
+- *the external-identifier* /MUST/ be provided
+- *the name* of the software deposit /MUST/ be provided
+- *the author/s* of the software deposit /MUST/ be provided
+
+
+***
+#+BEAMER: \tiny
+#+BEGIN_SRC
+
+
+ Je suis GPL
+ 1785io25c695
+ origin url
+ description
+
+ GPL 3
+ url spdx
+
+
+ author1
+ Inria
+
+
+
+#+END_SRC
+#+BEAMER: \pause
+
+
 ** Deposit walkthrough
 *** Pushing a single deposit with metadata
 #+BEAMER: \tiny
 #+BEGIN_SRC
-#!/usr/bin/env bash
-
-ARCHIVE=${1-'je-suis-gpl.tar.gz'}
-MD5=$(2-md5sum $ARCHIVE | cut -f 1 -d' ')
-NAME=$(3-basename $ARCHIVE)
-METADATA_ENTRY=${4-'metadata.xml'}
-EXTERNAL_ID=${5-'external-id'}
-COLLECTION =${6-'fsf_collection'}
-
-curl -i -u 'client_name':'client_password' \
- -X POST \
- --data-binary @${ARCHIVE} \
- -H "In-Progress: false" \
- -H "Content-MD5: ${MD5}" \
- -H "Content-Disposition: attachment; filename=${NAME}" \
- -H "Slug: ${EXTERNAL_ID}" \
- -F "atom=@${METADATA_ENTRY};type=application/atom+xml;type=entry" \
- deposit.softwareheritage.org/1/${COLLECTION}
+$ swh-deposit --login 'name' --pass 'secret' \
+ --collection 'fsf-collection' --slug 'ext-id' \
+ --archive '/path/to/je-suis-gpl.tgz' \
+ --metadata '/path/to/je-suis-gpl-metadata.xml'
 #+END_SRC
 #+BEAMER: \pause
-*** response
+*** Response
 #+BEAMER: \tiny
 #+BEGIN_SRC
 11 Jan. 4, 2018, 2:51 p.m. je-suis-gpl.tar.gz ready-for-checks
+
+
+
+
+ href="http://deposit.swh.org/1/fsf-collection/11/status/"/>
 ...
 #+END_SRC
 ** Deposit walkthrough
 *** Multi-part deposit
- - *In-Progress* header is /true/ when creating a multi-part deposit.
- - the deposit will be completed and marked *ready-for-checks* when the header is /false/.
- - use the *DEPOSIT-ID* given on the first deposit.
+#+BEAMER: \footnotesize
+ A Multi-part deposit with *partial* status can be:
+ - completely replaced
+ - updated with new content or metadata
 #+BEAMER: \pause
-*** Updating a multi-part deposit
+*** Updating by replacing archive and metadata
 #+BEAMER: \tiny
 #+BEGIN_SRC
-curl -i -u 'client_name':'client_password' \
- -X PUT \
- --data-binary @${ARCHIVE} \
- -H "In-Progress: true" \
- -H "Content-MD5: ${MD5}" \
- -H "Content-Disposition: attachment; filename=${NAME}" \
- -H "Slug:${EXTERNAL_ID}" \
- -H "Content-type: application/zip" \
- deposit.softwareheritage.org/1/${COLLECTION}/${DEPOSIT_ID}/media/
+$ swh-deposit --login 'name' --pass 'secret' --deposit-id '11' \
+ --collection 'fsf-collection' --slug 'ext-id' --in-progress 'true'\
+ --archive '/path/to/updated-e-suis-gpl.tgz' \
+ --metadata '/path/to/updated-je-suis-gpl-metadata.xml'
 #+END_SRC
+*** Updating by adding content or metadata
+#+BEAMER: \tiny
+#+BEGIN_SRC
+$ swh-deposit --login 'name' --pass 'secret' --deposit-id '11' \
+ --collection 'fsf-collection' --slug 'ext-id' --in-progress 'true'\
+ --archive '/path/to/added-je-suis-gpl.tgz' \
+
+#+END_SRC
+
+***
+#+BEAMER: \footnotesize
+To mark deposit completed *in-progress* /must/ be false
 ** Deposit walkthrough
 *** What's your status?
+#+BEAMER: \footnotesize
  - *partial* : multi-part deposit is still ongoing
  - *ready-for-checks*: deposit completed
  - *ready-for-load*: content and metadata verified
- - *success*: loading completed successfuly
+ - *success*: loading completed successfully
  - *failure*: loading failed
 #+BEAMER: \pause
 *** Checking the deposit's state
 #+BEAMER: \tiny
 #+BEGIN_SRC
-$ curl -i -u 'client_name':'client_password' \
- deposit.softwareheritage.org/1/${COLLECTION}/${DEPOSIT_ID}/status/
+$ curl -i -u 'login':'secret' deposit.softwareheritage.org/1/col/11/status/
 #+END_SRC
 #+BEAMER: \pause
 ***
 #+BEAMER: \tiny
 #+BEGIN_SRC
 HTTP/1.0 200 OK
 Date: Thu, 04 Jan 2018 15:20:12 GMT
 ...
 11 success
- the deposited archive has been
- successfully ingested into the
+ Software Heritage archive
- 608757ea9bd8494d729732cc9a414948c160bd3c
+ swh:1:rev:sha1:608757ea...
 #+END_SRC
 ** The deposit was succesfuly pushed now we want to download the content with the
 #+BEAMER: \huge \centering *Vault*
-* Vault walkthrough
+
 ** Vault walkthrough
-*** Requesting download with swh-id
+*** Software identifier to request download
+#+BEAMER: \footnotesize
+The swh-id *swh:1:rev:sha1:608757ea9bd8494d729732cc9a414948c160bd3c*
+
+is composed of:
+- the context *swh:1:rev:sha1*
+- and the object identifier *608757ea9bd8494d729732cc9a414948c160bd3c*
+We will use the object identifier to create a bundle to download
+*** Reqesting download with swh-id
 #+BEAMER: \tiny
 #+BEGIN_SRC sh
-curl -X POST /api/1/vault/revision/e04b2a7b.../gitfast
+$ curl -X POST /api/1/vault/revision/608757ea.../gitfast
 #+END_SRC
+** Vault walkthrough
 *** Checking progress
 #+BEAMER: \tiny
 #+BEGIN_SRC sh
-curl /api/1/vault/revision/e04b2a7b.../gitfast
+$ curl /api/1/vault/revision/608757ea.../gitfast
 #+END_SRC
+# can we cook objects that aren't revisions?
-*** response
+*** Response
 #+BEAMER: \tiny
 #+BEGIN_SRC json
 {
- 'fetch_url': '/api/1/vault/revision/e04b2a7b.../gitfast/raw/',
+ 'fetch_url': '/api/1/vault/revision/608757ea.../gitfast/raw/',
  'progress_message': None,
- 'status': 'done',
+ 'status': 'pending',
  'id': 4,
- 'obj_id': 'e04b2a7b8a8838da0693e9fd992a10d6fd211b50',
+ 'obj_id': '608757ea9bd8494d729732cc9a414948c160bd3c',
  'obj_type': 'revision_gitfast'
 }
 #+END_SRC
+*** What's your status?
+#+BEAMER: \small
+ - *null* : no requests for the objects
+ - *new*: bundle requested for the object
+ - *pending*: bundle in preparation
+ - *done*: bundle ready for download
+
 ** Vault walkthrough
-*** Download when status is marked /done/
+*** Checking progress
+#+BEAMER: \tiny
+#+BEGIN_SRC sh
+$ curl /api/1/vault/revision/608757ea.../gitfast
+#+END_SRC
+
+*** Response
+#+BEAMER: \tiny
+#+BEGIN_SRC json
+{
+ 'fetch_url': '/api/1/vault/revision/608757ea.../gitfast/raw/',
+ 'progress_message': None,
+ 'status': 'done',
+ 'id': 4,
+ 'obj_id': '608757ea9bd8494d729732cc9a414948c160bd3c',
+ 'obj_type': 'revision_gitfast'
+}
+#+END_SRC
+
+*** Download available when status is marked /done/
 #+BEAMER: \tiny
 #+BEGIN_SRC sh
-$ curl /api/1/vault/revision/e04b2a7b.../gitfast/raw/ \
+$ curl /api/1/vault/revision/608757ea.../gitfast/raw/ \
      -O path/to/revision.gitfast.gz
 $ git init
 $ zcat path/to/revision.gitfast.gz | git fast-import
 $ git checkout HEAD
 #+END_SRC