diff --git a/talks-public/2018-02-04-deposit-vault-walkthorugh/Makefile b/talks-public/2018-02-04-deposit-vault-walkthorugh/Makefile
new file mode 100644
index 0000000..68fbee7
--- /dev/null
+++ b/talks-public/2018-02-04-deposit-vault-walkthorugh/Makefile
@@ -0,0 +1 @@
+include ../Makefile.slides
diff --git a/talks-public/2018-02-04-deposit-vault-walkthorugh/deposit-vault-walkthrough.org b/talks-public/2018-02-04-deposit-vault-walkthorugh/deposit-vault-walkthrough.org
new file mode 100644
index 0000000..15a16b4
--- /dev/null
+++ b/talks-public/2018-02-04-deposit-vault-walkthorugh/deposit-vault-walkthrough.org
@@ -0,0 +1,213 @@
+#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
+#+LATEX_HEADER_EXTRA: \usepackage{listings}
+#+INCLUDE: "../../common/modules/prelude.org" :minlevel 1
+#+TITLE: Source Code Deposit Walkthrough
+# does not allow short title, so we override it for beamer as follows :
+#+BEAMER_HEADER: \title[Software Heritage]{Software Heritage\\Source Code Deposit Walkthrough}
+#+BEAMER_HEADER: \author{Morane Gruenpeter}
+#+BEAMER_HEADER: \date[04/01/2018, Deposit walkthrough]{04 January 2018\\Deposit walkthrough\\Paris, France}
+#+AUTHOR: Morane Gruenpeter
+#+DATE: 04 January 2018
+#+EMAIL: morane@gmail.com
+#+DESCRIPTION: Software Heritage: Source Code Deposit Walkthrough
+#+KEYWORDS: software heritage preservation knowledge deposit technology sword
+
+#+BEAMER_HEADER: \institute[Software Heritage]{Metadata specialist\\Software Heritage\\\href{mailto:morane@softwareheritage.org}{\tt morane@softwareheritage.org}}
+
+
+* Source code deposit: From deposit to vault
+** Source code deposit: From deposit to vault
+  :PROPERTIES:
+  :CUSTOM_ID: walkthrough
+  :END:
+
+****
+  First version of our software deposit prototype \\
+  *\url{https://deposit.softwareheritage.org/}*
+**** Features
+  - pushing *deposits* to the Software Heritage archive
+  - software source code +
metadata
+  - full *transparency* of the loading and downloading processes
+  - download the deposit by cooking the bundle in the *vault*
+
+* Deposit walkthrough
+** Deposit walkthrough
+  :PROPERTIES:
+  :CUSTOM_ID: depositwalkthrough
+  :END:
+*** Request service document
+#+BEAMER: \scriptsize
+#+BEGIN_SRC
+$ curl -i --user "$CREDS" \
+    https://deposit.softwareheritage.org/1/servicedocument/
+#+END_SRC
+
+#+BEAMER: \pause
+*** response
+#+BEAMER: \tiny
+#+BEGIN_SRC
+HTTP/1.0 200 OK
+Server: WSGIServer/0.2 CPython/3.5.3
+Content-Type: application/xml
+
+
+...
+2.0
+209715200
+
+  The Software Heritage (SWH) Archive
+
+  Software Collection
+  application/zip
+  Collection Policy
+  Software Heritage Archive
+  Collect, Preserve, Share
+  ...
+
+
+
+#+END_SRC
+
+
+** Deposit walkthrough
+*** Pushing a single deposit with metadata
+#+BEAMER: \tiny
+#+BEGIN_SRC
+$ curl -i -u "$CREDS" \
+    -X POST \
+    --data-binary @${ARCHIVE} \
+    -H "In-Progress: false" \
+    -H "Content-MD5: ${MD5}" \
+    -H "Content-Disposition: attachment; filename=${NAME}" \
+    -H "Slug: ${EXTERNAL_ID}" \
+    -H "Packaging: http://purl.org/net/sword/package/SimpleZip" \
+    -H "Content-type: application/zip" \
+    -F "atom=@${METADATA_ENTRY};type=application/atom+xml;type=entry" \
+    ${SERVER}/1/${COLLECTION}/
+#+END_SRC
+#+BEAMER: \pause
+
+*** response
+#+BEAMER: \tiny
+#+BEGIN_SRC
+
+  11
+  Jan. 4, 2018, 2:51 p.m.
+  swh-deposit.zip
+  ready-for-checks
+  ...
+
+
+  ...
+
+#+END_SRC
+
+** Deposit walkthrough
+*** Multi-part deposit
+  - To create a multi-part deposit, set the *In-Progress* header to /true/.
+  - The deposit is completed and marked *ready-for-checks* once a request is sent with the header set to /false/.
+
+#+BEAMER: \pause
+
+*** Updating a multi-part deposit
+#+BEAMER: \tiny
+#+BEGIN_SRC
+$ curl -i -u "$CREDS" \
+    -X PUT \
+    --data-binary @${ARCHIVE} \
+    -H "In-Progress: true" \
+    -H "Content-MD5: ${MD5}" \
+    -H "Content-Disposition: attachment; filename=${NAME}" \
+    -H 'Slug: external-id' \
+    -H 'Packaging: http://purl.org/net/sword/package/SimpleZip' \
+    -H 'Content-type: application/zip' \
+    ${SERVER}/1/${COLLECTION}/${DEPOSIT_ID}/media/
+#+END_SRC
+
+
+
+** Deposit walkthrough
+*** What's your status?
+  - *partial*: multi-part deposit is still ongoing
+  - *ready-for-checks*: deposit completed
+  - *ready-for-load*: content and metadata verified
+  - *success*: loading completed successfully
+  - *failure*: loading failed
+#+BEAMER: \pause
+
+*** Checking the deposit's state
+#+BEAMER: \tiny
+#+BEGIN_SRC
+$ curl -i -u "${CREDS}" \
+    ${SERVER}/1/${COLLECTION}/${DEPOSIT_ID}/status/
+#+END_SRC
+#+BEAMER: \pause
+
+***
+#+BEAMER: \tiny
+#+BEGIN_SRC
+HTTP/1.0 200 OK
+Date: Thu, 04 Jan 2018 15:20:12 GMT
+...
+
+  11
+  success
+  Loading is successful
+  608757ea9bd8494d729732cc9a414948c160bd3c
+
+#+END_SRC
+
+**
+The deposit was successfully pushed
+
+Now we want to download the content with the
+
+#+BEAMER: \huge \centering
+*Vault*
+
+
+* Vault walkthrough
+** Vault walkthrough
+*** Requesting download with swh-id
+#+BEAMER: \tiny
+#+BEGIN_SRC python
+from swh.vault.api.client import RemoteVaultClient
+c = RemoteVaultClient('http://orangeriedev.internal.softwareheritage.org:5005')
+c.cook('revision_gitfast', '608757ea9bd8494d729732cc9a414948c160bd3c')
+#+END_SRC
+
+*** Checking progress
+#+BEAMER: \tiny
+#+BEGIN_SRC python
+# Call this as many times as needed to check the cooking progress
+c.progress('revision_gitfast', '608757ea9bd8494d729732cc9a414948c160bd3c')
+#+END_SRC
+
+*** response
+#+BEAMER: \tiny
+#+BEGIN_SRC
+{
+    'fetch_url': '/api/1/vault/revision_gitfast/594617d1cd9d9d6bc0cfbd531bbaa1ed19627e9b/raw/',
+    'progress_message': None,
+    'status': 'done',
+    'id': 4,
+    'obj_id': '608757ea9bd8494d729732cc9a414948c160bd3c',
+    'obj_type': 'revision_gitfast'
+}
+#+END_SRC
+
+** Vault walkthrough
+*** Download when status is marked /done/
+#+BEAMER: \tiny
+#+BEGIN_SRC sh
+$ curl https://archive.softwareheritage.org/api/1/vault/revision_gitfast/swh-id/raw/ \
+    -o path/to/revision.gitfast.gz
+
+$ git init
+$ zcat path/to/revision.gitfast.gz | git fast-import
+$ git checkout HEAD
+#+END_SRC
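
The vault slides above request a bundle with `cook`, call `progress` repeatedly, and download once the status is `done`. That loop could be sketched roughly as follows. This is a minimal sketch and not part of the slides: the helper name `wait_for_bundle`, its parameters, and the `'failed'` status value are assumptions for illustration. Only the `cook` and `progress` calls and the `status`, `fetch_url`, and `progress_message` keys appear in the walkthrough.

```python
import time


def wait_for_bundle(client, obj_type, obj_id, poll_seconds=5.0, max_polls=120):
    """Ask the vault to cook a bundle, then poll until it is ready.

    `client` is any object exposing cook(obj_type, obj_id) and
    progress(obj_type, obj_id), as the client in the slides does.
    Returns the 'fetch_url' reported once the status is 'done'.
    """
    client.cook(obj_type, obj_id)
    for _ in range(max_polls):
        info = client.progress(obj_type, obj_id)
        status = info["status"]
        if status == "done":
            return info["fetch_url"]
        # Assumption: a 'failed' status exists; the slides only show 'done'.
        if status == "failed":
            raise RuntimeError(info.get("progress_message") or "cooking failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"bundle not ready after {max_polls} polls")
```

With the client from the slides, this would be called as `wait_for_bundle(c, 'revision_gitfast', '608757ea9bd8494d729732cc9a414948c160bd3c')`. Bounding the number of polls keeps a stalled cooking task from blocking the caller indefinitely.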