diff --git a/talks-public/2020-02-19-RDA-AMA/2020-02-19-RDA-AMA.org b/talks-public/2020-02-19-RDA-AMA/2020-02-19-RDA-AMA.org index 52859f4..6e12622 100644 --- a/talks-public/2020-02-19-RDA-AMA/2020-02-19-RDA-AMA.org +++ b/talks-public/2020-02-19-RDA-AMA/2020-02-19-RDA-AMA.org @@ -1,591 +1,594 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) #+TITLE: Archiving And Referencing All Software Source Code Using Software Heritage #+SUBTITLE: #+AUTHOR: Roberto Di Cosmo #+EMAIL: roberto@dicosmo.org @rdicosmo @swheritage #+BEAMER_HEADER: \date{February 19th, 2020} #+BEAMER_HEADER: \title[Archiving And Referencing All Software Source Code]{Archiving And Referencing All Software Source Code} #+BEAMER_HEADER: \author[Roberto Di Cosmo \hspace{5em} www.dicosmo.org]{Roberto Di Cosmo\\[2em]} # #+BEAMER_HEADER: \setbeameroption{show notes on second screen} #+BEAMER_HEADER: \setbeameroption{hide notes} #+KEYWORDS: software heritage legacy preservation knowledge mankind technology #+LATEX_HEADER: \usepackage{tcolorbox} #+LATEX_HEADER: \definecolor{links}{HTML}{2A1B81} #+LATEX_HEADER: \hypersetup{colorlinks,linkcolor=,urlcolor=links} # # prelude.org contains all the information needed to export the main beamer latex source # use prelude-toc.org to get the table of contents # #+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 #+INCLUDE: "../../common/modules/169.org" # +LaTeX_CLASS_OPTIONS: [aspectratio=169,handout,xcolor=table] #+LATEX_HEADER: \usepackage{bbding} #+LATEX_HEADER: \DeclareUnicodeCharacter{66D}{\FiveStar} #+LATEX_HEADER: \usepackage[anythingbreaks]{breakurl} # # If you want to change the title logo it's here # # +BEAMER_HEADER: \titlegraphic{\includegraphics[width=0.7\textwidth]{SWH-logo}} # aspect ratio can be changed, but the slides need to be adapted # - compute a "resizing factor" for the images (macro for picblocks?) 
# # set the background image # # https://pacoup.com/2011/06/12/list-of-true-169-resolutions/ # #+BEAMER_HEADER: \pgfdeclareimage[height=90mm,width=160mm]{bgd}{swh-world-169.png} #+BEAMER_HEADER: \setbeamertemplate{background}{\pgfuseimage{bgd}} #+LATEX: \addtocounter{framenumber}{-1} * Introduction #+INCLUDE: "../../common/modules/rdc-bio.org::#main" :only-contents t :minlevel 2 ** The knowledge is in the /source code/ #+INCLUDE: "../../common/modules/source-code-different-short.org::#thesourcecode" :only-contents t :minlevel 3 ** Source code is /special/ *** /Executable/ and /human readable/ knowledge \hfill copyright law /“Programs must be written for people to read, and only incidentally for machines to execute.”/\\ \hfill Harold Abelson #+BEAMER: \pause *** Software /evolves/ over time - projects may last decades - the /development history/ is key to its /understanding/ #+BEAMER: \pause *** Complexity :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=python3-matplotlib.pdf, width=.6\linewidth :END: - /millions/ of lines of code - large /web of dependencies/ + easy to break, difficult to maintain - sophisticated /developer communities/ * Academia's evolving practice ** Software is a pillar of Science ... *** Software is everywhere in modern science :PROPERTIES: :BEAMER_COL: .6 :BEAMER_env: block :END: #+BEGIN_QUOTE [...] the vast majority describe [...] or software that have become essential in their fields. \mbox{}\hfill Top 100 papers (\href{http://www.nature.com/news/the-top-100-papers-1.16224}{Nature, 2014}) #+END_QUOTE #+BEAMER: \pause *** :PROPERTIES: :BEAMER_COL: .45 :END: #+latex: \begin{center} #+ATTR_LATEX: :width \extblockscale{\linewidth} file:papermountain.jpg #+latex: \end{center} #+BEAMER: \pause *** :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEGIN_QUOTE Sometimes, if you don't have the software, you don't have the data \mbox{}\hfill Christine Borgman, Paris, 2018 #+END_QUOTE ** ... 
a /forgotten/ pillar of Open Science *** Lack of recognition :PROPERTIES: :BEAMER_env: block :END: not (yet) a first class citizen - in the EOSC plan # - in the EU copyright reform - in the scholarly world #+BEAMER: \pause *** Lack of consensus on how to :PROPERTIES: :BEAMER_env: block :END: - /archive/ software - /choose/ a license - /cite/ a software project ** Pressure to make the source code available is rising *** Why :PROPERTIES: :BEAMER_col: 0.48 :BEAMER_env: block :END: Necessary to... - /reproduce/ and /verify/, - /modify/ and /evolve/, *building new experiments* from old ones #+BEAMER: \pause *** When and where :PROPERTIES: :BEAMER_col: 0.48 :BEAMER_env: block :END: - debate started at the end of the 2000s (bio, statistics, medicine...) - growing in Computer Science since the [[https://www.artifact-eval.org/about.html][ESEC/FSE 2011 Artifact Evaluation Award]] #+BEAMER: \pause *** A wealth of initiatives... - Policies: ACM [[https://www.acm.org/publications/policies/artifact-review-badging][Artifact Review and Badging]], AEC, ... - Working groups: [[https://www.force11.org/software-citation-principles][FORCE11]], [[https://www.rd-alliance.org/groups/software-source-code-ig][RDA]], [[https://www.ouvrirlascience.fr/logiciels-libres-et-open-source/][SPSO]], ... # - Metrics: [[https://www.ouvrirlascience.fr/about-the-proposal-for-software-indicators-in-open-science-monitor-3/][Open Science Monitor]] (Elsevier!), ... - Journals: [[https://www.ipol.im/][IPOL]], ReScience, InsightJournal, JOSS, eLife, ACM DL, ... - Repositories: FigShare, Zenodo, ... 
- Common infrastructures: [[https://www.softwareheritage.org][Software Heritage]] ** What is at stake \hfill in increasing order of difficulty \vspace{-7pt} *** Archival Research software artifacts must be properly *archived*\\ \hfill make sure we can /retrieve/ them (/reproducibility/) #+BEAMER: \pause *** Identification Research software artifacts must be properly *referenced*\\ \hfill make sure we can /identify/ them (/reproducibility/) #+BEAMER: \pause *** Metadata Research software artifacts must be properly *described*\\ \hfill make it easy to /discover/ them (/visibility/) #+BEAMER: \pause *** Citation Research software artifacts must be properly *cited* /(not the same as referenced!)/\\ \hfill to give /credit/ to authors (/evaluation/!) #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: \vspace{-5pt} \hfill Let's focus on the /first two!/ \hfill\mbox{} * Archiving and referencing /all/ the source code: Software Heritage #+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :minlevel 2 ** Software Heritage: a revolutionary infrastructure \hfill bit.ly/swhpaper :PROPERTIES: :CUSTOM_ID: hilights :END: *** /All/ the software source code #+latex: \centering #+latex: \mbox{}\hfill\includegraphics[width=\extblockscale{.35\linewidth}]{swh-dataflow-merkle.pdf}\hfill\pause #+latex: \includegraphics[width=\extblockscale{.75\linewidth}]{2019-09-archive-growth.png}\hfill\mbox{}\\ #+BEAMER: \pause The largest software source code archive /ever/ *** /Uniform and intrinsic/ identifiers for reproducibility :B_block: :PROPERTIES: :BEAMER_env: block :END: Tracking over /20 billion software artifacts/, and counting... \hfill \url{bit.ly/swhpidpaper} #+BEAMER: \pause *** Adoption highlights - Wikidata https://www.wikidata.org/wiki/Property:P6138 - reference archive for /swmath.org/, HAL, etalab - part of the French National Plan for Open Science - ... 
* Zoom on the SWH-ID ** Modern software development #+INCLUDE: "../../common/modules/vcs-history.org::#timeline" :only-contents t :minlevel 3 ** /Intrinsic/ identifiers for modern software development #+INCLUDE: "../../common/modules/vcs-history.org::#dvcs-to-merkle" :only-contents t :minlevel 3 ** The SWH-ID schema: syntax and semantics #+INCLUDE: "../../common/modules/swh-id-syntax.org::#swh-id-syntax" :only-contents t :minlevel 3 ** Walkthrough the Parmap article *** Danelutto and Di Cosmo, 2012 :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: #+latex: \begin{center} #+ATTR_LATEX: :width \extblockscale{1.45\linewidth} file:parmap-article-conclusion.png #+latex: \end{center} #+latex: \begin{tiny} M. Danelutto and R. Di Cosmo, /A “Minimal Disruption” skeleton experiment: Seamless map & reduce embedding in OCaml,” Procedia CS, vol. 9, pp. 1837–1846, 2012. [Online]. Available: \href{http://dx.doi.org/10.1016/j.procs.2012.04.202}{[DOI: 10.1016/j.procs.2012.04.202]} #+latex: \end{tiny} #+BEAMER: \pause *** :PROPERTIES: :BEAMER_COL: .5 :END: #+latex: \vspace{-10pt} #+latex: \begin{center} #+latex: \begin{tiny} Accessed on the 6th of February 2020 #+latex: \end{tiny} #+latex: \vspace{-15pt} #+ATTR_LATEX: :width \extblockscale{1.5\linewidth} file:parmap-on-gitorious.png #+BEAMER: \pause #+latex: \vspace{-10pt} #+latex: \vspace{-15pt} #+ATTR_LATEX: :width \extblockscale{1.5\linewidth} file:parmap-moved-to-github.png #+latex: \vspace{-15pt} #+latex:\begin{tcolorbox} #+latex: \begin{tiny} \href{https://archive.softwareheritage.org/swh:1:snp:78209702559384ee1b5586df13eca84a5123aa82;origin=https://gitorious.org/parmap/parmap.git/}{swh:1:snp:78209702559384ee1b5586df13eca84a5123aa82} #+latex: \end{tiny} #+latex:\end{tcolorbox}\noindent #+latex: \end{center} #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+latex: \vspace{+5pt} \hfill Only 8 years later ! 
\hfill\mbox{} ** Referencing an algorithm in the source code \vspace{-20pt} *** Figure 1 in [Danelutto and Di Cosmo, 2012] :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: \begin{tiny} Parmap's implementation of the distribution, fork, and recollection phase \end{tiny} #+latex: \begin{center} #+ATTR_LATEX: :width \extblockscale{\linewidth} file:article-parmap-code.png #+latex: \end{center} #+BEAMER: \pause *** :PROPERTIES: :BEAMER_COL: .5 :END: \vspace{+20pt} #+latex: \begin{center} #+latex: \vspace{-20pt} #+ATTR_LATEX: :width \extblockscale{1.5\linewidth} file:parmap-cnt.png #+BEAMER: \pause \begin{tcolorbox} \begin{tiny} \href{https://archive.softwareheritage.org/swh:1:cnt:d5214ff9562a1fe78db51944506ba48c20de3379;origin=https://gitorious.org/parmap/parmap.git;lines=101-143/} {swh:1:cnt:d5214ff9562a1fe78db51944506ba48c20de3379;\\ origin=https://gitorious.org/parmap/parmap.git;lines=101-143} \end{tiny} \end{tcolorbox}\noindent #+latex: \end{center} ** A solution to address the reproducibility crisis *** McBane. 2020 :PROPERTIES: :BEAMER_COL: .48 :BEAMER_env: block :END: #+latex: \begin{center} #+ATTR_LATEX: :width \extblockscale{1.45\linewidth} file:swh-id-example-McBane2020-ReScience.png #+latex: \end{center} #+latex: \begin{tiny} George C. McBane. (2020). [Rp] Reproduction of interaction second virial coefficient calculation for H$_2$--CO interactions [J. Chem. Phys. vol. 112, 4417 (2000)]. Rescience C, 6(1), #1. 
\url{http://doi.org/10.5281/zenodo.3630224} #+latex: \end{tiny} #+BEAMER: \pause *** :PROPERTIES: :BEAMER_COL: .48 :END: \vspace{+20pt} #+latex: \begin{center} #+latex: \vspace{-20pt} #+ATTR_LATEX: :width \extblockscale{1.2\linewidth} file:ReScience-McBane-snp-example.png #+BEAMER: \pause \begin{tcolorbox} \begin{tiny} \href{https://archive.softwareheritage.org/swh:1:snp:85dcb31156194ea3bad60f06c1e7999e7bb1a90c/} {swh:1:snp:85dcb31156194ea3bad60f06c1e7999e7bb1a90c} \end{tiny} \end{tcolorbox}\noindent #+latex: \end{center} ** Zoom on the trust model for identifiers \vspace{-5pt} *** Trust model for usual DOIs :B_block: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: #+ATTR_LATEX: :width \linewidth file:doi-vs-pid-1.pdf #+BEAMER: \pause *** Trust model for DOIs with checksums :B_block: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: #+ATTR_LATEX: :width \linewidth file:doi-vs-pid-2.pdf *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** Trust model for SWH-IDs :PROPERTIES: :END: #+ATTR_LATEX: :width .3\linewidth file:doi-vs-pid-3.pdf ** A few examples :noexport: *** Let's look at some famous excerpts of source code #+BEAMER: \pause *** Apollo 11 source code ([[https://archive.softwareheritage.org/swh:1:cnt:64582b78792cd6c2d67d35da5a11bb80886a6409;origin=https://github.com/virtualagc/virtualagc;lines=245-261/][excerpt]]) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.48 :BEAMER_env: block :END: #+LATEX: \includegraphics[width=\linewidth]{apollo-11-cranksilly.png} # excerpt of routine that asks astronaut to turn around the LEM #+BEAMER: \pause *** Quake III source code ([[https://archive.softwareheritage.org/swh:1:cnt:bb0faf6919fc60636b2696f32ec9b3c2adb247fe;origin=https://github.com/id-Software/Quake-III-Arena;lines=549-572/][excerpt]]) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.45 :BEAMER_env: block :END: #+LATEX: \includegraphics[width=\linewidth]{quake-carmack-sqrt-1.png} # smart efficient implementation of 1/sqrt(x) 
on a CPU without special support #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** It works! we have /intrinsic/ identifiers for all 20+ billion objects in the archive * Practical guidelines for archiving and referencing ** Prepare your software source code \hfill \href{https://www.softwareheritage.org/save-and-reference-research-software/}{complete guidelines} # scientific software (save code now) use-case (one slide) - prepare #+INCLUDE: "../../common/modules/swh-scientific-preservation.org::#prepare" :only-contents t :minlevel 3 ** Submit save request on SWH \hfill \href{https://www.softwareheritage.org/save-and-reference-research-software/}{complete guidelines} # scientific software (save code now) use-case (one slide) #+INCLUDE: "../../common/modules/swh-scientific-preservation.org::#save" :only-contents t :minlevel 3 ** Reference software artifacts in your articles \hfill \href{https://www.softwareheritage.org/save-and-reference-research-software/}{complete guidelines} # scientific software (save code now) use-case (one slide) #+INCLUDE: "../../common/modules/swh-scientific-preservation.org::#reference" :only-contents t :minlevel 3 ** Reference software artifacts in your articles \hfill \href{https://www.softwareheritage.org/save-and-reference-research-software/}{complete guidelines} #+INCLUDE: "../../common/modules/swh-scientific-preservation.org::#referencecontd" :only-contents t :minlevel 3 * What about metadata and citation? ** It's more complex than it seems! *** Software is complex - Structure :: monolithic/composite; self-contained/external dependencies - Lifetime :: one-shot/long term - Community :: one man/one team/distributed community - Authorship :: complex set of roles - Authority :: institutions/organizations/communities/single person #+BEAMER: \pause *** Various granularities - Exact status of the source code :: for reproducibility, e.g. 
#+latex: \emph{``you can find at \href{https://archive.softwareheritage.org/swh:1:cnt:cdf19c4487c43c76f3612557d4dc61f9131790a4;lines=146-187/}{swh:1:cnt:cdf19c4487c43c76f3612557d4dc61f9131790a4;lines=146-187} the core algorithm used in this article''} - (Major) release :: \emph{``This functionality is available in OCaml version 4''} - Project :: \emph{``Inria has created OCaml and Scikit-Learn''}. ** We are not alone :noexport: *** Research Software does not exist in isolation :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=python3-matplotlib.pdf, width=.6\linewidth, leftpic=true :END: large /web of dependencies/ on non-research software #+BEAMER: \pause *** Industry and developers have been here :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - NSRL (NIST) - SPDX (Linux Foundation) - SWH-ID (Software Heritage) - SWID (ISO Standard) - Wikidata Software Properties #+BEAMER: \pause *** We must :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - accept the complexity - avoid reinventing the wheel - connect with existing communities of practice ** Proposals for metadata and citation in the scholarly world *** Refined ontology for contributors :B_block: :PROPERTIES: :BEAMER_COL: .55 :BEAMER_env: block :END: - Design, Architecture, - Coding, Testing, Debugging, - Documentation, Maintenance, Support, - Management # \hfill see also [[https://www.casrai.org/credit.html][CRediT]], [[https://geodynamics.org/cig/metadata/?software=aspect&version=2.1.0][Geodynamics]] #+BEAMER: \pause *** Reference is distinct from citation :B_block: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: - *Reference* is for /reproducibility/\\ \hfill and now we can get it right! - *Citation* is for /credit/\\ \hfill and the jury is still out... 
\hfill They must not be conflated *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** Keep the human in the loop :B_block: :PROPERTIES: :BEAMER_env: block :END: When /credit/ is at stake, automation/crowdsourcing is not enough!\\ \hfill Humans /are needed/ to get /quality information/ +#+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: - Experiments are ongoing on /moderated/ software deposit. +*** Experiments are ongoing on /moderated/ software deposit ... (IDCC 2020) + /Curated Archiving of Research Software Artifacts : lessons learned from the French open archive (HAL)/ + https://hal.archives-ouvertes.fr/hal-02475835v1 ** Conclusion *** Research software :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - pillar of open science # - not just data - finally in the limelight *** Doing it right is not easy :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - /simplistic/ approaches, "just data", ... # - /directives/ are coming - soon part of /research evaluation/ *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** You can help make a change - leverage Software Heritage in conferences and journals for /archival/ and /reference/ - join the conversation on /software citation/ and /software evaluation/ criteria #+BEAMER: \pause *** Where can you participate? - Software Source Code Interest group - \href{https://www.rd-alliance.org/groups/software-source-code-ig}{RDA-SSC IG} - Software Source code Identification Working Group - \href{https://www.rd-alliance.org/groups/software-source-code-identification-wg}{RDA-Force11-SCID WG} - Software Citation Implementation Working Group - \href{https://www.force11.org/group/software-citation-implementation-working-group}{Force11-SCIWG} ** Come in, we're open ! 
#+INCLUDE: "../../common/modules/last-slide-references.org::#references-identifiers" :only-contents t :minlevel 3 #+TODO: include reference Pierre Alliez, Roberto Di Cosmo, Benjamin Guedj, Alain Girault, Mohand-Said Hacid, et al.. Attributing and Referencing (Research) Software: # Best Practices and Outlook from Inria. Computing in Science & Engineering, IEEE, 2019, pp.1-14. # ⟨10.1109/MCSE.2019.2949413⟩. ⟨hal-02135891v2⟩ * Appendix :B_appendix: :PROPERTIES: :BEAMER_env: appendix :END: * Worked example Merkle tree ** A worked example #+LATEX: \centering\forcebeamerstart #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_1.pdf}}} #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/contents.pdf}}} #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_2_contents.pdf}}} #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/directories.pdf}}} #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_3_directories.pdf}}} #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/revisions.pdf}}} #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_4_revisions.pdf}}} #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/releases.pdf}}} #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_5_releases.pdf}}} #+LATEX: \only<10>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/snapshots.pdf}}} #+LATEX: \forcebeamerend * History of VCS :noexport: ** Evolution of software development #+INCLUDE: "../../common/modules/vcs-history.org::#timeline" :only-contents t :minlevel 3 ** Foundations of modern DVCS #+INCLUDE: 
"../../common/modules/vcs-history.org::#dvcs-to-merkle" :only-contents t :minlevel 3 ** In a picture \hfill (from https://github.com/progit/progit2) #+INCLUDE: "../../common/modules/vcs-history.org::#vcs-explained" :only-contents t :minlevel 3 ** A massive adoption #+INCLUDE: "../../common/modules/vcs-history.org::#adoption" :only-contents t :minlevel 3 * Connecting communities :noexport: ** FORCE11 Software Citation Implementation WG *** Spawned from the FORCE11 Software Citation WG (2/2016) led by Daniel Katz, Kyle Niemeyer and Arfon Smith *** Co-chairs Neil Chue Hong, Martin Fenner, Daniel Katz #+TODO:fill in with links ** RDA Software Source Code Interest Group *** Co-chairs Roberto Di Cosmo, Neil Chue Hong, Mingfang Wu, Julia Collins *** Objectives a forum for discussing /software/ inside RDA *** Chronology - RDA 10, Montreal 9/2017 :: motivations, survey of ontologies, metadata use cases - RDA 11, Berlin 3/2018 :: identification of gaps in metadata - RDA 13, Philadelphia 4/2019 :: FAIR for Software Source Code - RDA 15, Melbourne 3/2020 :: Should we create a FAIR4Software WG? 
*** Web page https://www.rd-alliance.org/groups/software-source-code-ig ** RDA WG on Software Source Code Identification *** Joint RDA & FORCE11 WG which spawned from RDA's Software Source Code IG & FORCE11's SCIWG *** Co-chairs Roberto Di Cosmo, Daniel Katz, Martin Fenner *** Objectives - bring together people involved/interested in /software identification/ - produce concrete recommendations for the academic community *** Chronology - FORCE2019, Edinburgh 10/2019 :: Research Software Hackathon - identification track - RDA 15, Melbourne 3/2020 :: Software identification use cases *** https://www.rd-alliance.org/groups/software-source-code-identification-wg ** Inria's Software Citation Working Group *** Members \hfill task force of Inria's scientific council *** Mission - map the landscape - collect best practices - identify potential Inria contributions - make recommendations *** First outcome Position paper available from \hfill https://hal.archives-ouvertes.fr/hal-02135891 diff --git a/talks-public/2020-02-25-SophiaForumNumerica/2020-02-25-SophiaForumNumerica.org b/talks-public/2020-02-25-SophiaForumNumerica/2020-02-25-SophiaForumNumerica.org index b28dbf0..c5cd6f7 100644 --- a/talks-public/2020-02-25-SophiaForumNumerica/2020-02-25-SophiaForumNumerica.org +++ b/talks-public/2020-02-25-SophiaForumNumerica/2020-02-25-SophiaForumNumerica.org @@ -1,598 +1,667 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) #+TITLE: Archiving, referencing and attributing research software #+SUBTITLE: towards software as a first class citizen # #+AUTHOR: Roberto Di Cosmo # #+EMAIL: roberto@dicosmo.org @rdicosmo @swheritage #+BEAMER_HEADER: \date{February 25th, 2020} #+BEAMER_HEADER: \title[(CC-BY 4.0) Research Software]{Archiving, referencing and attributing research software} #+BEAMER_HEADER: \author[Roberto Di Cosmo]{Roberto Di Cosmo\\Inria and Université de Paris} # #+BEAMER_HEADER: 
\setbeameroption{show notes on second screen} #+BEAMER_HEADER: \setbeameroption{hide notes} #+KEYWORDS: software heritage legacy preservation knowledge mankind technology #+LATEX_HEADER: \usepackage{tcolorbox} #+LATEX_HEADER: \definecolor{links}{HTML}{2A1B81} #+LATEX_HEADER: \hypersetup{colorlinks,linkcolor=,urlcolor=links} # # prelude.org contains all the information needed to export the main beamer latex source # use prelude-toc.org to get the table of contents # #+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 #+INCLUDE: "../../common/modules/169.org" # +LaTeX_CLASS_OPTIONS: [aspectratio=169,handout,xcolor=table] #+LATEX_HEADER: \usepackage{bbding} #+LATEX_HEADER: \DeclareUnicodeCharacter{66D}{\FiveStar} # # If you want to change the title logo it's here # # +BEAMER_HEADER: \titlegraphic{\includegraphics[width=0.7\textwidth]{SWH-logo}} # aspect ratio can be changed, but the slides need to be adapted # - compute a "resizing factor" for the images (macro for picblocks?) # # set the background image # # https://pacoup.com/2011/06/12/list-of-true-169-resolutions/ # #+BEAMER_HEADER: \pgfdeclareimage[height=90mm,width=160mm]{bgd}{swh-world-169.png} #+BEAMER_HEADER: \setbeamertemplate{background}{\pgfuseimage{bgd}} #+LATEX: \addtocounter{framenumber}{-1} * Software Source Code: a precious heritage ** Software source code: a precious part of our heritage #+INCLUDE: "../../common/modules/source-code-different-short.org::#softwareisdifferent" :only-contents t :minlevel 3 ** Source code is a /special/ and endangered heritage *** /Executable/ and /human readable/ knowledge \hfill copyright law :noexport: /“Programs must be written for people to read, and only incidentally for machines to execute.”/\\ \hfill Harold Abelson #+BEAMER: \pause *** Software /evolves/ over time - projects may last decades - the /development history/ is key to its /understanding/ #+BEAMER: \pause *** Complexity :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: 
pic=python3-matplotlib.pdf, width=.6\linewidth :END: - /millions/ of lines of code - large /web of dependencies/ + easy to break, difficult to maintain - sophisticated /developer communities/ *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** Precious, endangered /Executable/ and /human readable/ knowledge key people *are passing away* ...\\ \hfill no organised effort to catalog and archive it * Software Source Code: a (forgotten) pillar of Science ** Software Source code: pillar of Open Science *** Software is everywhere in modern research :B_picblock: :PROPERTIES: :BEAMER_opt: pic=papermountain, leftpic=true, width=.3\linewidth :BEAMER_env: picblock :BEAMER_COL: .6 :END: #+BEGIN_QUOTE [...] software [...] essential in their fields. \mbox{}\hfill Top 100 papers (Nature, 2014) #+END_QUOTE #+BEGIN_QUOTE Sometimes, if you don't have the software, you don't have the data \mbox{}\hfill Christine Borgman, Paris, 2018 #+END_QUOTE # http://www.nature.com/news/the-top-100-papers-1.16224 #+BEAMER: \pause *** Open Science: three pillars :B_block: :PROPERTIES: :BEAMER_COL: .45 :BEAMER_env: block :END: #+latex: \begin{center} #+ATTR_LATEX: :width \extblockscale{\linewidth} file:PreservationTriangle.png #+latex: \end{center} #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: -*** Nota bene - \hfill The links in the picture are *essential* +*** Source code is needed to: + - /reproduce/ and /verify/, + - /modify/ and /evolve/, *building new experiments* from old ones + \mbox{}\\ + \hfill N.B.: the /links/ in the picture are *essential* ** The state of the art (in CS!) is far from ideal *** ICSE (Zannier, Melrik, Maurer, 2006) - complete absence of replication studies *** ACM TOSEM 2001 to 2006 \hfill C. 
Ghezzi http://bit.ly/tosemreprod - 60% of all papers have tools: *only 20%* /installable/ *** Collberg's 2015 study \hfill http://reproducibility.cs.arizona.edu/ - 601 mainstream papers: 508 with tools, *only 40%* /installable/ #+BEAMER: \pause *** Main reasons \hfill source code (/or the right version of it/) cannot be found ** Where we stand -*** A wealth of initiatives! +*** A wealth of initiatives! :B_block: + :PROPERTIES: + :BEAMER_env: block + :END: - Policies: ACM [[https://www.acm.org/publications/policies/artifact-review-badging][Artifact Review and Badging]], ... - Working groups: [[https://www.force11.org/software-citation-principles][FORCE11]], [[https://www.rd-alliance.org/groups/software-source-code-ig][RDA]], [[https://www.ouvrirlascience.fr/logiciels-libres-et-open-source/][SPSO]], ... - Metrics: [[https://www.ouvrirlascience.fr/about-the-proposal-for-software-indicators-in-open-science-monitor-3/][Open Science Monitor]] (Elsevier!), ... - Journals: [[https://www.ipol.im/][IPOL]], ReScience, InsightJournal, eLife, ACM DL, ... - Repositories: FigShare, Zenodo, ... #+BEAMER: \pause +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: *** \hfill but ... 
\hfill \mbox{} *** Lack of recognition :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: not (yet) a first class citizen - in the EOSC plan # - in the EU copyright reform - in the scholarly works #+BEAMER: \pause *** Lack of proper guidance on how to :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - /archive/ and /reference/ software - choose a license - /cite/ a software project # #+BEAMER: \pause # *** :B_ignoreheading: # :PROPERTIES: # :BEAMER_env: ignoreheading # :END: # *** Lack of basic prerequisites to reproducibility # See a discussion in \url{annex.softwareheritage.org/talks/2018/2018-09-17-STScI_public.pdf} ** A plurality of needs *** Researcher - archive and reference sw used in articles - get credit for the software they develop - verify/reproduce/improve results #+BEAMER: \pause *** Laboratory/team - track software contributions - produce up-to-date report / web page #+BEAMER: \pause *** University/Research Organization - central view of research software assets - tech transfer - impact metrics ** What is at stake \hfill in increasing order of difficulty +\vspace{-7pt} *** Archival Research software artifacts must be properly *archived*\\ \hfill make sure we can /retrieve/ them (/reproducibility/) #+BEAMER: \pause *** Identification Research software artifacts must be properly *referenced*\\ \hfill make sure we can /identify/ them (/reproducibility/) #+BEAMER: \pause *** Metadata Research software artifacts must be properly *described*\\ \hfill make it easy to /discover/ them (/visibility/) #+BEAMER: \pause *** Citation Research software artifacts must be properly *cited* /(not the same as referenced!)/\\ \hfill to give /credit/ to authors (/evaluation/!) 
+*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +\pause +\vspace{-5pt} +\hfill Let's focus on the /first two!/ \hfill\mbox{} * Meet Software Heritage ** Software Heritage in a nutshell \hfill www.softwareheritage.org #+BEAMER: \transdissolve #+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :only-contents t :minlevel 3 ** An international, non profit initiative\hfill built for the long term :PROPERTIES: :CUSTOM_ID: support :END: *** Sharing the vision :B_block: :PROPERTIES: :CUSTOM_ID: endorsement :BEAMER_COL: .5 :BEAMER_env: block :END: #+LATEX: \begin{center}{\includegraphics[width=\extblockscale{.4\linewidth}]{unesco_logo_en_285}}\end{center} #+LATEX: \vspace{-0.8cm} #+LATEX: \begin{center}\vskip 1em \includegraphics[width=\extblockscale{1.4\linewidth}]{support.pdf}\end{center} #+latex: \small And many more ...\\ #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/testimonials} *** Donors, members, sponsors :B_block: :PROPERTIES: :CUSTOM_ID: sponsors :BEAMER_COL: .5 :BEAMER_env: block :END: #+LATEX: \begin{center}\includegraphics[width=\extblockscale{.4\linewidth}]{inria-logo-new}\end{center} #+LATEX: \begin{center} # #+LATEX: \includegraphics[width=\extblockscale{.2\linewidth}]{sponsors-levels.pdf} #+LATEX: \colorbox{white}{\includegraphics[width=\extblockscale{1.4\linewidth}]{sponsors.pdf}} #+LATEX: \end{center} # - sponsoring / partnership :: \hfill \url{sponsorship.softwareheritage.org} *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** Research collaboration :B_picblock:noexport: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: picblock :BEAMER_OPT: pic=Qwant_Logo, leftpic=true :END: source code search engine *** See more :noexport: \hfill\tiny\url{http://www.softwareheritage.org/support/testimonials} *** Global network :B_picblock:noexport: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: picblock :BEAMER_OPT: pic=fossid, leftpic=true, width=.3\linewidth :END: - first *independent 
mirror* - increased reliability -** The largest software archive, a shared infrastructure +** The largest software archive, a shared infrastructure :noexport: #+latex: \begin{center} #+ATTR_LATEX: :width 0.7\linewidth file:SWH-as-foundation-slim.png #+latex: \end{center} #+BEAMER: \pause #+latex: \centering #+ATTR_LATEX: :width \extblockscale{.9\linewidth} file:2019-09-archive-growth.png +** Largest software archive, principled \hfill \url{http://bit.ly/swhpaper} + #+latex: \begin{center} + #+ATTR_LATEX: :width 0.5\linewidth + file:SWH-as-foundation-slim.png + #+latex: \end{center} + #+BEAMER: \pause + #+latex: \centering + #+ATTR_LATEX: :width \extblockscale{.7\linewidth} + file:2020-02-archive-growth.png + #+BEAMER: \pause +*** Technology + :PROPERTIES: + :BEAMER_col: 0.34 + :BEAMER_env: block + :END: + - transparency and FOSS + - replicas all the way down +*** Content (billions!) + :PROPERTIES: + :BEAMER_col: 0.32 + :BEAMER_env: block + :END: + - *intrinsic identifiers* + - facts and provenance +*** Organization + :PROPERTIES: + :BEAMER_col: 0.33 + :BEAMER_env: block + :END: + - non-profit + - multi-stakeholder + ** A peek under the hood #+BEAMER: \begin{center} #+BEAMER: \mode{\only<1>{\includegraphics[width=\extblockscale{1\textwidth}]{swh-dataflow-merkle-listers.pdf}}} #+BEAMER: \only<2-3>{\includegraphics[width=\extblockscale{1\textwidth}]{swh-dataflow-merkle.pdf}} #+BEAMER: \end{center} #+BEAMER: \pause #+BEAMER: \pause /Global development history/ permanently archived in a /unique/ git-like Merkle DAG - *~400 TB* (uncompressed) blobs, *~20 B* nodes, *~280 B* edges # - *GitHub*, Gitlab.com, Bitbucket, /Gitorious/, /GoogleCode/, GNU, PyPi, Debian, NPM... 
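A minimal sketch of what "git-like" means for the leaves of this Merkle DAG, assuming that a content identifier is the git blob hash of the file bytes prefixed with =swh:1:cnt:=. The function name =swhid_for_content= is hypothetical and is not part of any Software Heritage tooling.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Return an illustrative content identifier for raw file bytes.

    Assumption: the identifier is the git blob hash, i.e. the SHA1 of
    the header "blob <length>\\0" followed by the bytes themselves.
    """
    # Git-style header: object type, a space, the decimal length, a NUL byte.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return "swh:1:cnt:" + digest


if __name__ == "__main__":
    # The same bytes always yield the same identifier, with no registry involved.
    print(swhid_for_content(b"hello world\n"))
```

Because the identifier depends only on the bytes, anyone holding a copy of the file can recompute it and check it against a reference found in an article.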
* Archive and reference /all/ the source code ** Archive and reference *** Software Heritage: a revolutionary infrastructure :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=PreservationTriangle.png,leftpic=true, width=.34\linewidth :END: - *universal archive* of all source code + we archive /all/ software: both research and non research + we /proactively collect software/ in a systematic way - *intrinsic* identifiers for *reproducibility* + identify software artefacts /without any third party/ + cryptographically strong, compatible with git hashes #+BEAMER: \pause +*** Demo + 2012 Parmap paper [[http://www.dicosmo.org/Publications/Parmap2012.html][before]] and [[http://www.dicosmo.org/share/parmap_swh.pdf][after]]; [[https://xavierleroy.org/bibrefs/OCamlP3L-98.html][OCamlP3l paper]] for the [[https://github.com/ReScience/ten-years][Ten year challenge]] +#+BEAMER: \pause *** Full guidelines available! \hfill \tiny https://www.softwareheritage.org/save-and-reference-research-software/ -*** - Save code now ... [[https://archive.softwareheritage.org/save/][in just a few clicks]] -*** Demo - My 2012 Parmap paper [[http://www.dicosmo.org/Publications/Parmap2012.html][before]] and [[http://www.dicosmo.org/share/parmap_swh.pdf][after]]; other links: [[https://www.softwareheritage.org/2019/07/20/archiving-and-referencing-the-apollo-source-code/][Apollo 11]] (and [[https://www.softwareheritage.org/2019/07/20/archiving-and-referencing-the-apollo-source-code/][blog]]), [[https://archive.softwareheritage.org/swh:1:cnt:bb0faf6919fc60636b2696f32ec9b3c2adb247fe;origin=https://github.com/id-Software/Quake-III-Arena;lines=548-572/][Quake III Arena]] +#+BEAMER: \pause +# *** +# Save code now ... 
[[https://archive.softwareheritage.org/save/][in just a few clicks]] +*** + \hfill See also: [[https://www.softwareheritage.org/2019/07/20/archiving-and-referencing-the-apollo-source-code/][Apollo 11]] (and the [[https://www.softwareheritage.org/2019/07/20/archiving-and-referencing-the-apollo-source-code/][blog]] post!), [[https://archive.softwareheritage.org/swh:1:cnt:bb0faf6919fc60636b2696f32ec9b3c2adb247fe;origin=https://github.com/id-Software/Quake-III-Arena;lines=548-572/][Quake III Arena]] ** The SWH-ID schema # TODO: drawing with swh:1:cnt:xxxxxxx "exploded" and explained #+LATEX: \centering\forcebeamerstart #+LATEX: \only<1>{\includegraphics[width=\linewidth]{SWH-ID-1.png}} #+LATEX: \only<2>{\includegraphics[width=\linewidth]{SWH-ID-2.png}} #+LATEX: \only<3>{\includegraphics[width=\linewidth]{SWH-ID-3.png}} #+LATEX: \forcebeamerend ** A worked example #+LATEX: \centering\forcebeamerstart #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_1.pdf}}} #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/contents.pdf}}} #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_2_contents.pdf}}} #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/directories.pdf}}} #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_3_directories.pdf}}} #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/revisions.pdf}}} #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_4_revisions.pdf}}} #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/releases.pdf}}} #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_5_releases.pdf}}} #+LATEX: 
\only<10>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/snapshots.pdf}}} #+LATEX: \forcebeamerend ** Zoom on the trust model for identifiers \vspace{-5pt} *** Trust model for usual DOIs :B_block: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: #+ATTR_LATEX: :width \linewidth file:doi-vs-pid-1.pdf #+BEAMER: \pause *** Trust model for DOIs with checksums :B_block: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: #+ATTR_LATEX: :width \linewidth file:doi-vs-pid-2.pdf *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** Trust model for SWH-IDs :PROPERTIES: :END: #+ATTR_LATEX: :width .3\linewidth file:doi-vs-pid-3.pdf * Describe and cite /research/ source code ** Context *** Many articles/guidelines :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .4 :END: - reproducibility - archival - credit and evaluation #+BEAMER: \pause *** Most common limitations :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .6 :END: - software is 'just data' - citation = reference = DOIs - citation produced by automated tools #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** A few remarkable exceptions - [[https://www.ascl.net][ASCL]] (since 1999): metadata only, carefully curated - [[https://www.geodynamics.org][geodynamics.org]] : source, documentation, metadata - [[https://swmath.org][swmath.org]] : software catalog via articles #+BEAMER: \pause *** Software Citation WG at Inria (since 10/2018) - leverage 50 years of experience, make recommendations - read more https://hal.archives-ouvertes.fr/hal-02135891 ** Why it is not simple *** Software is complex - Structure :: monolithic/composite; self-contained/external dependencies - Lifetime :: one-shot/long term - Community :: one man/one team/distributed community - Authorship :: complex set of roles /(more later)/ - Authority :: institutions/organizations/communities/single person #+BEAMER: \pause *** Various
granularities - Exact status of the source code :: for reproducibility, e.g. #+latex: \emph{``you can find at \href{https://archive.softwareheritage.org/swh:1:cnt:cdf19c4487c43c76f3612557d4dc61f9131790a4;lines=146-187/}{swh:1:cnt:cdf19c4487c43c76f3612557d4dc61f9131790a4;lines=146-187} the core algorithm used in this article''} - (Major) release :: \emph{``This functionality is available in OCaml version 4''} - Project :: \emph{``Inria has created OCaml and Scikit-Learn''}. ** Proposals for the scholarly world *** Refined ontology for contributors :B_block: :PROPERTIES: :BEAMER_COL: .55 :BEAMER_env: block :END: - Design, Architecture, - Coding, Testing, Debugging, - Documentation, Maintenance, Support, - Management \hfill see also [[https://www.casrai.org/credit.html][CRediT]], [[https://geodynamics.org/cig/metadata/?software=aspect&version=2.1.0][Geodynamics]] #+BEAMER: \pause *** Reference is distinct from citation :B_block: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: - *Reference* is for /reproducibility/ - *Citation* is for /credit/ \hfill They must not be conflated - Beware :: of the numbers game: \hfill ... do we really want an /s-index/ ? 
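A reference at the granularity of a code fragment, such as the =swh:1:cnt:...;lines=146-187= example in the granularities block, can be taken apart mechanically. The sketch below is illustrative only: =parse_swhid= is a hypothetical helper, and it assumes the layout =swh:1:<type>:<hex id>= followed by optional =;key=value= qualifiers, with the five object types shown in the worked example (contents, directories, revisions, releases, snapshots).

```python
# Assumed short codes for the five object types of the archive.
OBJECT_TYPES = {
    "cnt": "content",
    "dir": "directory",
    "rev": "revision",
    "rel": "release",
    "snp": "snapshot",
}


def parse_swhid(swhid: str) -> dict:
    """Split an identifier into its core fields and its qualifiers."""
    core, *qualifier_parts = swhid.split(";")
    fields = core.split(":")
    if len(fields) != 4 or fields[0] != "swh" or fields[1] != "1":
        raise ValueError(f"not a version 1 identifier: {swhid!r}")
    _, _, obj_type, obj_id = fields
    if obj_type not in OBJECT_TYPES:
        raise ValueError(f"unknown object type: {obj_type!r}")
    # Qualifiers such as lines=146-187 or origin=<url> come after the core part.
    qualifiers = dict(part.split("=", 1) for part in qualifier_parts if part)
    return {
        "type": OBJECT_TYPES[obj_type],
        "id": obj_id,
        "qualifiers": qualifiers,
    }


if __name__ == "__main__":
    ref = "swh:1:cnt:cdf19c4487c43c76f3612557d4dc61f9131790a4;lines=146-187"
    print(parse_swhid(ref))
```

Only the core part identifies the object; the qualifiers add context for a human reader, which is why a reference can stay precise without carrying any credit information.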
*** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** Keep the human in the loop :B_block: :PROPERTIES: :BEAMER_env: block :END: When /credit/ is at stake, automation/crowdsourcing is not enough!\\ \hfill Humans /are needed/ to get /quality information/ ** First steps with HAL / Software Heritage *** How it works, what is special :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=deposit-communication.png,width=.4\linewidth,leftpic=true :END: \noindent *\hspace{1em}Generic mechanism:* - SWORD based - *review process* - versioning # - /industry chimes in/ (details on demand) #+BEAMER: \pause *Today*: deposit .zip or .tar.gz file ([[http://bit.ly/swhdeposithalen][/guide/]])\\ *Tomorrow*: just provide the /SWH id/ #+BEAMER: \pause *** Deposit/describe research software in HAL - author: https://hal.archives-ouvertes.fr/hal-01872189 - moderator: https://hal.archives-ouvertes.fr/hal-01876705 *** Examples [[https://hal.archives-ouvertes.fr/hal-02130801][LinBox]], [[https://hal.archives-ouvertes.fr/hal-01897934][SLALOM]], [[https://hal.archives-ouvertes.fr/hal-02130729][Givaro]], [[https://hal.archives-ouvertes.fr/hal-02137040][NS2DDV]], [[https://hal.archives-ouvertes.fr/lirmm-02136558][SumGra]], [[https://hal.archives-ouvertes.fr/hal-02155786][Coq proof]], ... ** The swmath.org approach *** Article based citation See for example: - [[https://swmath.org/software/7116][SemiPar on swmath.org]] * The road ahead ** We need to care more about research software \vspace{-5pt} *** Research software :B_block:noexport: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - pillar of open science # - not just data - finally in the limelight *** Doing it right is not easy :B_block:noexport: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - /simplistic/ approaches, "just data", ... 
# - /directives/ are coming - soon part of /research evaluation/ *** :B_ignoreheading:noexport: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** You can help make a change - leverage Software Heritage in conferences, journals, AEC for /archival/ and /reference/ \\ #+LATEX: \hfill {\href{https://www.softwareheritage.org/save-and-reference-research-software/}{https://www.softwareheritage.org/save-and-reference-research-software/} } - join the conversation on /software citation/ and /software evaluation/ criteria - tackle the scientific problems : big code, classification, infrastructure, etc. #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+latex: \hfill {\Large\bf Thank you!} \hfill\mbox{} #+BEGIN_EXPORT latex \begin{thebibliography}{Foo Bar, 1969} \footnotesize \bibitem{Abramatic2018} Jean-François Abramatic, Roberto Di Cosmo, Stefano Zacchiroli\newblock \emph{Building the Universal Archive of Source Code}, CACM, October 2018 \href{https://doi.org/10.1145/3183558}{(10.1145/3183558)} \bibitem{DiCosmo2019} Roberto Di Cosmo, Morane Gruenpeter, Stefano Zacchiroli\newblock \emph{Referencing Source Code Artifacts: a Separate Concern in Software Citation},\newblock CiSE 2020 \href{https://dx.doi.org/10.1109/MCSE.2019.2963148}{(10.1109/MCSE.2019.2963148)} \href{https://hal.archives-ouvertes.fr/hal-02446202}{(hal-02446202)} \bibitem{alliez:hal-02135891} Pierre Alliez, Roberto Di Cosmo, Benjamin Guedj, Alain Girault, Mohand-Said Hacid, Arnaud Legrand and Nicolas Rougier\newblock \emph{Attributing and referencing (research) software: Best practices and outlook from Inria}, \newblock CiSE 2020 \href{https://doi.ieeecomputersociety.org/10.1109/MCSE.2019.2949413}{(10.1109/MCSE.2019.2949413)} \href{https://hal.archives-ouvertes.fr/hal-02135891}{(hal-02135891)} \end{thebibliography} #+END_EXPORT * Appendix :B_appendix: :PROPERTIES: :BEAMER_env: appendix :END: ** \vfill \centerline{\Huge Appendix} \vfill ** Software Heritage 
for Research and Innovation *** Reference platform for /Big Code/ :B_picblock: :PROPERTIES: :BEAMER_opt: pic=universal, leftpic=true, width=.2\linewidth :BEAMER_env: picblock :BEAMER_act: :END: - unique *observatory* of all software development - *big data, machine learning* paradise: classification, trends, coding patterns, code completion... #+BEAMER: \pause *** First datasets are available! - full graph of software development (~20Bn nodes, ~200Bn edges) see Pietri, Spinellis, Zacchiroli, MSR 2019 https://dx.doi.org/10.1109/MSR.2019.00030 - MSR 2020 mining competition see https://2020.msrconf.org/track/msr-2020-mining-challenge#Call-for-Papers ** Raising awareness about Software Source Code *** :B_column:BMCOL: :PROPERTIES: :BEAMER_col: .53 :BEAMER_env: column :END: #+ATTR_LATEX: :width .7\linewidth file:UNESCOParisCallMeeting.png UNESCO, Inria, Software Heritage invite\\ [[https://en.unesco.org/news/experts-call-greater-recognition-software-source-code-heritage-sustainable-development][40 international experts to meet in Paris]] ... #+BEAMER: \pause *** :B_column:BMCOL: :PROPERTIES: :BEAMER_col: .5 :BEAMER_env: column :END: #+ATTR_LATEX: :width .65\linewidth file:paris_call_ssc_cover.jpg [[https://en.unesco.org/foss/paris-call-software-source-code][Their call was published in Feb 2019]] \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** :PROPERTIES: :BEAMER_COL: 1.06 :BEAMER_env: block :END: It's an important /policy tool/, already referenced and used ...
\hfill /yes, you can sign it!/\\ \vspace{10pt} \hfill https://en.unesco.org/foss/paris-call-software-source-code \hfill\mbox{} +* Guidelines detailed and SWH-ID use cases +** Prepare your software source code \hfill \href{https://www.softwareheritage.org/save-and-reference-research-software/}{complete guidelines} +# scientific software (save code now) use-case (one slide)- prepare +#+INCLUDE: "../../common/modules/swh-scientific-preservation.org::#prepare" :only-contents t :minlevel 3 + +** Submit save request on SWH \hfill \href{https://www.softwareheritage.org/save-and-reference-research-software/}{complete guidelines} +# scientific software (save code now) use-case (one slide) +#+INCLUDE: "../../common/modules/swh-scientific-preservation.org::#save" :only-contents t :minlevel 3 + +** Reference software artifacts in your articles \hfill \href{https://www.softwareheritage.org/save-and-reference-research-software/}{complete guidelines} +# scientific software (save code now) use-case (one slide) +#+INCLUDE: "../../common/modules/swh-scientific-preservation.org::#reference" :only-contents t :minlevel 3 +** Reference software artifacts in your articles \hfill \href{https://www.softwareheritage.org/save-and-reference-research-software/}{complete guidelines} +#+INCLUDE: "../../common/modules/swh-scientific-preservation.org::#referencecontd" :only-contents t :minlevel 3 + * News ** Milestones :noexport: #+INCLUDE: "../../common/modules/swh-key-dates.org::#keydates" :minlevel 3 :only-contents t ** News : archiving /public/ code #+latex: \begin{center} #+ATTR_LATEX: :width 0.7\linewidth file:codeetalab.png #+latex: \end{center} #+BEAMER: \pause https://code.etalab.gouv.fr ** News : SWHAP *** Paris Call on Software Source Code “[We call to] support efforts to gather and preserve the artifacts and narratives of the history of computing, while the earlier creators are still alive” #+BEAMER: \pause *** SWHAP : an important step forward - detailed guidelines to *curate*
landmark legacy source code and *archive* it on Software Heritage - intense cooperation with *Università di Pisa* and *UNESCO* - open to all, we'll promote it worldwide *** https://www.softwareheritage.org/swhap ** News : ENEA mirror *** Thomas Jefferson, February 18, 1791 :B_block: :PROPERTIES: :BEAMER_ACT: :BEAMER_env: block :END: #+latex: {\em ...let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident. #+latex: } #+BEAMER: \pause *** Welcoming ENEA :B_block: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=LogoENEAcompletoENG.png, leftpic=true, width=.7\linewidth :END: - first *institutional* mirror - increased resilience - *AI infrastructure* for researchers - stepping stone to \endgraf \hfill a European joint effort * Inria's commitment ** Inria's ongoing contributions *** Software Heritage - universal archive :: (research) software source code [[https://archive.softwareheritage.org/][archived and referenced]] *** Reproducibility - tools :: [[https://www.gnu.org/software/guix/][Guix]] (now [[https://www.softwareheritage.org/2019/04/18/software-heritage-and-gnu-guix-join-forces-to-enable-long-term-reproducibility/][with Software Heritage]]) - training/research :: RR workshops, MOOC *** Research software curation - HAL - SWH bridge :: curation of metadata, and [[https://hal.inria.fr/hal-01872189][deposit in Software Heritage]] * Identifiers are not easy ** URL decay disrupts the /web of reference/ #+INCLUDE: "../../common/modules/urls-decay.org::#rfc" :minlevel 3 :only-contents t #+INCLUDE: "../../common/modules/urls-decay.org::#examples" :minlevel 2 ** DOI limitations #+INCLUDE: "../../common/modules/doi-analysis.org::#doiexplained" :minlevel 3 :only-contents t -* Looking for the right identifiers +* Looking for the right identifiers :noexport: #+INCLUDE:
"../../common/modules/swh-pids.org::#main" :only-contents t