diff --git a/common/modules/swh-pids.org b/common/modules/swh-pids.org
new file mode 100644
index 0000000..2601bd5
--- /dev/null
+++ b/common/modules/swh-pids.org
@@ -0,0 +1,141 @@
+#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
+#
+# Software Heritage PIDs: here we present our rationale for introducing a new identifier schema, and the identifier schema itself
+#
+#+INCLUDE: "prelude.org" :minlevel 1
+#
+# We need tcolorbox here: add the following lines to your main .org document!
+#
+#+LATEX_HEADER: \usepackage{tcolorbox}
+#+BEAMER_HEADER: \usepackage{tcolorbox}
+
+* The quest for a PID
+  :PROPERTIES:
+  :CUSTOM_ID: main
+  :END:
+** Systems of identifiers
+  :PROPERTIES:
+  :CUSTOM_ID: definition
+  :END:
+*** A /system of identifiers/ is
+  - a set of labels (the identifiers)
+  - mechanisms to perform:
+    |------------------------+---------------------------|
+    | /Generation (minting)/ | create a new label        |
+    | /Assignment/           | associate label to object |
+    | /Retrieval/            | get object from a label   |
+    |------------------------+---------------------------|
+  - optionally, mechanisms to perform:
+    |------------------+---------------------------|
+    | /Verification/   | check label and object    |
+    | /Reverse Lookup/ | get label from an object  |
+    | /Description/    | get metadata of an object |
+    |------------------+---------------------------|
+** Mechanisms offered in some systems of identifiers
+  :PROPERTIES:
+  :CUSTOM_ID: survey
+  :END:
+  |--------------------+----------+-------+-------+--------|
+  | *Mech.* / *System* | *Handle* | *DOI* | *ARK* | *PURL* |
+  |--------------------+----------+-------+-------+--------|
+  | Generation         | Yes      | Yes   | Yes   | Yes    |
+  | Assignment         | Yes      | Yes   | Yes   | Yes    |
+  | Retrieval          | Yes      | Yes   | Yes   | Yes    |
+  | Verification       | N.A.     | N.A.  | N.A.  | N.A.   |
+  | Reverse Lookup     | N.A.     | N.A.  | N.A.  | N.A.   |
+  | Description        | Yes      | Yes   | Yes   | N.A.   |
+  |--------------------+----------+-------+-------+--------|
+** Our challenges in the PID landscape
+  :PROPERTIES:
+  :CUSTOM_ID: challenges
+  :END:
+*** Typical properties of systems of identifiers
+  \hfill uniqueness, non-ambiguity, persistence, abstraction (opacity)
+*** Key needed properties from our use cases
+  - gratis :: identifiers are free (billions of objects)
+  - integrity :: the associated object cannot be changed (sw dev, reproducibility)
+  - no middle man :: no central authority is needed (sw dev, reproducibility)
+***
+  \hfill we could not find systems with both *integrity* and *no middle man*!
+** An important distinction: DIOs vs. IDOs
+  :PROPERTIES:
+  :CUSTOM_ID: diovsido
+  :END:
+#+BEGIN_EXPORT latex
+  \begin{quote}
+  The term “Digital Object Identifier” is construed as “digital identifier of an object,” rather than “identifier of a digital object” \hfill Norman Paskin, 2010
+  \end{quote}
+#+END_EXPORT
+*** DIO (Digital Identifier of an Object)
+  digital identifiers for (potentially) *non-digital objects*
+  - epistemic complexity (manifestations, versions, locations, etc.)
+  - need an authority to ensure persistence and uniqueness
+#+BEAMER: \pause
+*** IDO (Identifier of a Digital Object)
+  digital identifiers (only) for *digital objects*
+  - can provide both *integrity* and *no middle man*
+  - broadly used in modern software development (git, etc.)
+***
+  \hfill for the core Software Heritage archive, IDOs are enough
+** IDOs in Software Development: the origins
+  # R. C. Merkle, A digital signature based on a conventional encryption
+  # function, Crypto '87
+  #+BEAMER: \vspace{-3mm}
+***** Merkle tree (R. C. Merkle, Crypto 1979) :B_picblock:
+  :PROPERTIES:
+  :BEAMER_opt: pic=merkle, leftpic=true, width=.5\linewidth
+  :BEAMER_env: picblock
+  :BEAMER_act:
+  :END:
+  Combination of
+  - tree
+  - hash function
+***** Classical cryptographic construction
+  fast, parallel signature of large data structures, built-in deduplication
+#+BEAMER: \pause
+  - satisfies all three criteria
+  - widely used in industry (e.g., Git, nix, blockchains, IPFS, ...)
+
+** IDOs in Software Heritage: a worked example
+  #+LATEX: \centering
+  #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_1}}}
+  #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/contents}}}
+  #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_2_contents}}}
+  #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/directories}}}
+  #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_3_directories}}}
+  #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/revisions}}}
+  #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_4_revisions}}}
+  #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/releases}}}
+  #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_5_releases}}}
+** The Software Heritage IDO schema \hfill (see *\url{http://bit.ly/swhpids}*)
+#+BEGIN_EXPORT latex
+\small
+\begin{tcolorbox}
+\href{https://archive.softwareheritage.org/swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2}
+{swh:1:{\bf cnt}:94a9ed024d3859793618152ea559a168bbcbb5e2} \hfill full text of the GPL3 license
+\end{tcolorbox}
+\pause
+\begin{tcolorbox}
+\href{https://archive.softwareheritage.org/swh:1:dir:d198bc9d7a6bcf6db04f476d29314f157507d505}
+{swh:1:{\bf dir}:d198bc9d7a6bcf6db04f476d29314f157507d505} \hfill Darktable source code
+\end{tcolorbox}
+\pause
+\begin{tcolorbox}
+\href{https://archive.softwareheritage.org/swh:1:rev:309cf2674ee7a0749978cf8265ab91a60aea0f7d}
+{swh:1:{\bf rev}:309cf2674ee7a0749978cf8265ab91a60aea0f7d}
+\end{tcolorbox}
+\hfill a {\bf revision} in the development history of Darktable\\\pause
+\begin{tcolorbox}
+\href{https://archive.softwareheritage.org/swh:1:rel:22ece559cc7cc2364edc5e5593d63ae8bd229f9f}
+{swh:1:{\bf rel}:22ece559cc7cc2364edc5e5593d63ae8bd229f9f}
+\end{tcolorbox}
+\hfill {\bf release} 2.3.0 of Darktable, dated 24 December 2016\\\pause
+\begin{tcolorbox}
+\href{https://archive.softwareheritage.org/swh:1:snp:c7c108084bc0bf3d81436bf980b46e98bd338453}
+{swh:1:{\bf snp}:c7c108084bc0bf3d81436bf980b46e98bd338453}
+\end{tcolorbox}
+\hfill a {\bf snapshot} of the entire Darktable repository (4 May 2017, GitHub)
+#+END_EXPORT
+#+LATEX: \pause
+***
+  *Current resolvers:* \url{archive.softwareheritage.org} and \url{n2t.org}
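
The slides claim that IDOs offer *integrity* with *no middle man*: anyone holding the object can recompute its identifier. A minimal sketch of that check for the `cnt` type, assuming the `swh:1:cnt:` hash is the SHA1 of the file contents prefixed with a Git-style `blob <length>\0` header (the value Git computes for a blob). The function names `parse_swhid`, `content_swhid`, and `verify_content` are hypothetical, chosen for illustration, and are not taken from any Software Heritage library.

```python
import hashlib
import re

# Shape of the identifiers shown on the slide: swh:1:<type>:<40 hex digits>.
SWHID_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):([0-9a-f]{40})$")


def parse_swhid(swhid):
    """Split an identifier into (object type, hex hash); hypothetical helper."""
    match = SWHID_RE.match(swhid)
    if match is None:
        raise ValueError("not a version-1 identifier: %r" % swhid)
    return match.group(1), match.group(2)


def content_swhid(data):
    """Recompute the identifier of a content object from its bytes.

    Assumption: the hash is SHA1 over b"blob <length>\\0" followed by the data,
    as Git does for blobs.
    """
    header = b"blob %d\0" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


def verify_content(swhid, data):
    """Verification mechanism: does this label really denote these bytes?"""
    object_type, _ = parse_swhid(swhid)
    if object_type != "cnt":
        raise ValueError("this sketch only handles content objects")
    return content_swhid(data) == swhid


if __name__ == "__main__":
    label = content_swhid(b"")
    print(label)
    print(verify_content(label, b""))          # same bytes
    print(verify_content(label, b"tampered"))  # different bytes
```

The sketch covers only `cnt`. Directories, revisions, releases, and snapshots are hashed over structured manifests that refer to the identifiers of their children, which is what makes the whole graph a Merkle structure; those manifests are not reproduced here.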