diff --git a/common/modules/swh-ardc.org b/common/modules/swh-ardc.org index a17f7ef..e041d9d 100644 --- a/common/modules/swh-ardc.org +++ b/common/modules/swh-ardc.org @@ -1,168 +1,265 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) # # Software is all around us # #+INCLUDE: "prelude.org" :minlevel 1 #+INCLUDE: "169.org" -* How Software Heritage addresses ARDC +* Source code pillar of Open Science, and how Software Heritage addresses ARDC :PROPERTIES: :CUSTOM_ID: main :END: +** Source code is /special/ (software is /not/ data) + :PROPERTIES: + :CUSTOM_ID: swnotdata + :END: +*** /Executable/ and /human readable/ knowledge \hfill copyright law :noexport: + /“Programs must be written for people to read, and only incidentally for machines to execute.”/\\ + \hfill Harold Abelson +#+BEAMER: \pause +*** Software /evolves/ over time + - projects may last decades + - the /development history/ is key to its /understanding/ +#+BEAMER: \pause +*** Complexity :B_picblock: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_OPT: pic=python3-matplotlib.pdf, width=.6\linewidth + :END: + - /millions/ of lines of code + - large /web of dependencies/ + + easy to break, difficult to maintain + + /research software/ a thin top layer + - sophisticated /developer communities/ +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +#+BEAMER: \pause +*** Precious, endangered /executable/ and /human readable/ knowledge + key people *passing away*, platforms (GoogleCode, Gitorious, etc.) 
closing down ...\\ + \hfill no organised effort to catalog and archive it +** Source code is /special/, cont'd + :PROPERTIES: + :CUSTOM_ID: swnotdatacontd + :END: +*** Versioning, granularity + - Project :: “Inria created OCaml and Scikit-learn”\pause + - Release :: “2D Voronoi Diagrams were introduced in CGAL 3.1.0”\pause + - Precise state of a project :: “This result was produced using commit 0064fbd...”\pause + - Code fragment :: “The core algorithm is in lines 101 to 143 of the file parmap.ml contained in the precise state of the project corresponding to commit 0064fbd....” + #+BEAMER: \pause +*** Authors can have multiple roles: + - Architecture, Management, Development, Documentation, Testing, ... + +** Software Source code: a pillar of Open Science + :PROPERTIES: + :CUSTOM_ID: pillaropenscience + :END: +*** Software is everywhere in modern research :B_picblock: + :PROPERTIES: + :BEAMER_opt: pic=papermountain, leftpic=true, width=.3\linewidth + :BEAMER_env: picblock + :BEAMER_COL: .6 + :END: +#+BEGIN_QUOTE +[...] software [...] essential in their fields. 
+ +\mbox{}\hfill Top 100 papers (Nature, 2014) +#+END_QUOTE +#+BEGIN_QUOTE +Sometimes, if you don't have the software, you don't have the data + +\mbox{}\hfill Christine Borgman, Paris, 2018 +#+END_QUOTE +# http://www.nature.com/news/the-top-100-papers-1.16224 +#+BEAMER: \pause +*** Open Science: three pillars :B_block: + :PROPERTIES: + :BEAMER_COL: .45 + :BEAMER_env: block + :END: +#+latex: \begin{center} +#+ATTR_LATEX: :width \extblockscale{\linewidth} +file:PreservationTriangle.png +#+latex: \end{center} +#+BEAMER: \pause +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +*** Nota bene + \hfill The links in the picture are *essential* +** A plurality of needs + :PROPERTIES: + :CUSTOM_ID: userneeds + :END: +*** Researchers + - *archive* and *reference* software used in articles + - *find* useful software + - get *credit* for developed software + - verify/reproduce/improve results + #+BEAMER: \pause +*** Laboratories/teams + - track software contributions + - produce reports + - maintain web page + #+BEAMER: \pause +*** Research Organization + - know its *software assets* for: technology *transfer*, impact *metrics*, strategy ** What is at stake: ARDC \hfill in increasing order of difficulty :PROPERTIES: :CUSTOM_ID: ardc :END: *** Archive Research software artifacts must be properly *archived*\\ \hfill make sure we can /retrieve/ them (/reproducibility/) #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: \vspace{-.5em} *** Reference Research software artifacts must be properly *referenced*\\ \hfill make sure we can /identify/ them (/reproducibility/) #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: \vspace{-.5em} *** Describe Research software artifacts must be properly *described*\\ \hfill make it easy to /discover/ and /reuse/ them (/visibility/) #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: \vspace{-.5em} *** Cite/Credit Research 
software artifacts must be properly *cited* /(not the same as referenced!)/\\ \hfill to give /credit/ to authors (/evaluation/!) *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: ** Addressing the four ARDC needs (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) :PROPERTIES: :CUSTOM_ID: swh-ardc-short :END: *** Archive (8B+ files, 140M+ projects) :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: #+ATTR_LATEX: :width .8\linewidth file:swh-dataflow-merkle.pdf \vspace{-1em} #+BEAMER: \pause - [[https://save.softwareheritage.org][save.softwareheritage.org]] - [[https://deposit.softwareheritage.org][deposit.softwareheritage.org]] # (HAL, IPOL) #+BEAMER: \pause *** Reference (20 billion SWHIDs) :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: [[https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/][Intrinsic, decentralised, cryptographically strong identifiers, SWHIDs]] \vspace{-1em} #+ATTR_LATEX: :width 1.02\linewidth file:SWHID-v1.4_3.png Now supported [[https://www.softwareheritage.org/2020/05/13/swhid-adoption/][in SPDX 2.2, Wikidata]] etc. 
#+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** Describe :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - /Intrinsic metadata/ from source code - Contributed the [[https://codemeta.github.io/codemeta-generator/][Codemeta generator]] #+BEAMER: \pause *** Cite/Credit :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - Contributed /software citation/ style [[https://www.ctan.org/tex-archive/macros/latex/contrib/biblatex-contrib/biblatex-software][biblatex-software, v 1.2-2 now on CTAN]] ** Addressing the A(rchive) in ARDC (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) :PROPERTIES: :CUSTOM_ID: swh-a :END: -#+latex: \vspace{-1em} +#+latex: \vspace{-0.8em} *** /Universal/ source code archive \hfill not only research \hfill (8B+ files, 140M+ projects) :PROPERTIES: :BEAMER_env: block :END: #+ATTR_LATEX: :width .6\linewidth file:swh-dataflow-merkle.pdf #+latex: \vspace{-1em} - your research software /is likely there already/! 
#+BEAMER: \pause - anyone can trigger archival with [[https://save.softwareheritage.org][save.softwareheritage.org]] #+BEAMER: \pause - selected partners can push to the archive via [[https://deposit.softwareheritage.org][deposit.softwareheritage.org]] # (HAL, IPOL) -#+BEAMER: \pause ** Addressing the R(eference) in ARDC (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) :PROPERTIES: :CUSTOM_ID: swh-r :END: #+latex: \vspace{-0.8em} *** Software Heritage Identifiers (SWHID) \hfill [[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][link to full docs]] :B_block: :PROPERTIES: :BEAMER_env: block :END: 20+B [[https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/][intrinsic, decentralised, cryptographically strong identifiers, SWHIDs]] # #+INCLUDE: "../../common/modules/swh-id-syntax.org::#swh-id-syntax" :only-contents t :minlevel 3 #+LATEX: \centering%\forcebeamerstart #+LATEX: \mode{\only<1>{\includegraphics[width=0.9\linewidth]{SWHID-v1.4_1.png}}} #+LATEX: \mode{\only<2>{\includegraphics[width=0.9\linewidth]{SWHID-v1.4_2.png}}} #+LATEX: \only<3->{\includegraphics[width=0.9\linewidth]{SWHID-v1.4_3.png}} #+LATEX: %\forcebeamerend *** :PROPERTIES: :BEAMER_act: <4-> :BEAMER_env: block :END: Emerging standard : Linux Foundation [[https://spdx.github.io/spdx-spec/appendix-VI-external-repository-identifiers/#persistent-id][SPDX 2.2]]; IANA registered; WikiData [[https://www.wikidata.org/wiki/Property:P6138][P6138]] #+latex: \vspace{-0.5em} *** Full fledged /source code references/ for reproducibility :B_block: :PROPERTIES: :BEAMER_act: <5-> :BEAMER_env: block :END: Examples: [[https://archive.softwareheritage.org/swh:1:cnt:64582b78792cd6c2d67d35da5a11bb80886a6409;origin=https://github.com/virtualagc/virtualagc;lines=245-261/][Apollo 11 AGC excerpt]], 
[[https://archive.softwareheritage.org/swh:1:cnt:bb0faf6919fc60636b2696f32ec9b3c2adb247fe;origin=https://github.com/id-Software/Quake-III-Arena;lines=549-572/][Quake III rsqrt]]; Guidelines available, see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] #+BEAMER: \pause ** Addressing D(escribe) and C(ite) in ARDC (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) :PROPERTIES: :CUSTOM_ID: swh-dc :END: *** Describe :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - Collect /intrinsic metadata/ - Contributed the [[https://codemeta.github.io/codemeta-generator/][Codemeta generator]] #+ATTR_LATEX: :width .8\linewidth file:CodeMetaGenerator.png #+BEAMER: \pause *** Cite/Credit :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - Contributed /software citation/ style [[https://www.ctan.org/tex-archive/macros/latex/contrib/biblatex-contrib/biblatex-software][biblatex-software, v 1.2-2 now on CTAN]] #+ATTR_LATEX: :width .8\linewidth file:BibLaTeX-swh.png