diff --git a/talks-public/2020-02-18-IDCC-15th/2020-02-18-IDCC-15th.org b/talks-public/2020-02-18-IDCC-15th/2020-02-18-IDCC-15th.org new file mode 100644 index 0000000..12271af --- /dev/null +++ b/talks-public/2020-02-18-IDCC-15th/2020-02-18-IDCC-15th.org @@ -0,0 +1,228 @@ +#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) +#+TITLE: Curated Archiving of Research Software Artifacts: +#+SUBTITLE: lessons learned from the French open archive (HAL) +#+AUTHOR: Roberto Di Cosmo, Morane Gruenpeter, Bruno Marmol, Alain Monteil, Laurent Romary, Jozefina Sadowska +#+EMAIL: morane@softwareheritage.org @moraneottilia @swheritage +#+BEAMER_HEADER: \date[February 18th, 2020]{February 18th, 2020\\[-1em]} +#+BEAMER_HEADER: \title[Curated Archiving of Research Software Artifacts]{Curated Archiving of Research Software Artifacts:} +#+BEAMER_HEADER: \author[Di Cosmo, {\bf Gruenpeter}, Marmol, Monteil, Romary, Sadowska]{Roberto Di Cosmo, {\bf Morane Gruenpeter}, Bruno Marmol,\\ Alain Monteil, Laurent Romary, Jozefina Sadowska\\[1em]} +#+KEYWORDS: software heritage legacy preservation knowledge mankind technology +#+LATEX_HEADER: \usepackage{tcolorbox} +#+LATEX_HEADER: \definecolor{links}{HTML}{2A1B81} +#+LATEX_HEADER: \hypersetup{colorlinks,linkcolor=,urlcolor=links} +# +# prelude.org contains all the information needed to export the main beamer latex source +# use prelude-toc.org to get the table of contents +# + +#+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 + + +#+INCLUDE: "../../common/modules/169.org" + +# +LaTeX_CLASS_OPTIONS: [aspectratio=169,handout,xcolor=table] +#+LATEX_HEADER: \usepackage{bbding} +#+LATEX_HEADER: \usepackage{tcolorbox} +#+LATEX_HEADER: \DeclareUnicodeCharacter{66D}{\FiveStar} + + +# +# If you want to change the title logo it's here +# +#+BEAMER_HEADER: \titlegraphic{\includegraphics[width=0.5\textwidth]{Inria-HAL-CCSD-SWH-logo-horizontal.png}} + +# 
aspect ratio can be changed, but the slides need to be adapted +# - compute a "resizing factor" for the images (macro for picblocks?) +# +# set the background image +# +# https://pacoup.com/2011/06/12/list-of-true-169-resolutions/ +# +#+BEAMER_HEADER: \pgfdeclareimage[height=90mm,width=160mm]{bgd}{swh-world-169.png} +#+BEAMER_HEADER: \setbeamertemplate{background}{\pgfuseimage{bgd}} +#+LATEX: \addtocounter{framenumber}{-1} +* Introduction- Software is our heritage +** Source Code: /executable/ and /human readable/ knowledge +#+INCLUDE: "../../common/modules/source-code-different-short.org::#thesourcecode" :only-contents t :minlevel 3 + +*** + Len Shustek, CHM\hfill /“Source code provides a *view* into the mind of the designer.”/ + +** The Paris call: Software Source Code is part of our Heritage + #+INCLUDE: "../../common/modules/paris-call-2019.org::#pariscall2019" :only-contents t :minlevel 3 + + +#+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :minlevel 2 + +* The software deposit- a first class research output +# reproducibility and scientific knowledge pillars (one slide) +#+INCLUDE: "../../common/modules/swh-scientific-reproducibility.org::#main" :only-contents t :minlevel 2 +# deposit-communication-with-PID.png + +** The software deposit workflow +*** Collaboration + - Center for Direct Scientific Communication (*CCSD*) - behind the *HAL* platform + - the French National Institute for computer science and applied mathematics (*Inria*) + - Software Heritage - The largest library of *software source code* +#+BEAMER: \pause +*** A complete workflow with three major steps: + 1. *depositing* software source code on HAL’s platform + 2. *moderating* and curating the deposit by a certified IES-Inria moderator + 3. 
*sharing* the deposit and pushing the deposit to the SWH archive + +# scientific software (deposit) use-case (one slide) +#+INCLUDE: "../../common/modules/swh-scientific-deposit.org::#main" :only-contents t :minlevel 2 + +** Submit your source code \hfill \href{}{deposit guide} +#+latex: \begin{center} +#+ATTR_LATEX: :width \linewidth +file:HAL-form-IDCC.png +#+latex: \end{center} + +** Reference vs. citation +#+latex: \begin{center} +#+ATTR_LATEX: :width 0.7\linewidth +file:citation-format-IDCC.png +#+latex: \end{center} + +** The deposit view +#+latex: \begin{center} +#+ATTR_LATEX: :width 0.7\linewidth +file:HAL_deposit.png +#+latex: \end{center} + +* Keeping the human in the loop- metadata moderation +** Software deposit moderation +*** We need + - quality metadata to describe research software + - correct credit to all authors of the software +#+BEAMER: \pause +*** Main actions the digital archivist performs: + - detecting extraneous or abusive content (illegal or harassing), + - verifying consistency between the metadata and the software source code itself, + - completing or correcting the deposit metadata if needed. +#+BEAMER: \pause +*** Out of scope + - review source code functionality + - compile & run software + - assess reproducibility & accuracy + + + +** The moderation workflow +#+latex: \begin{center} +#+ATTR_LATEX: :width 0.62\linewidth +file:moderation-workflow.png +#+latex: \end{center} + + +** Publishing vs Sharing +*** Publishing + :PROPERTIES: + :BEAMER_col: 0.48 + :BEAMER_env: block + :END: + - an academic publication is a research result that has been + qualified through some form of *peer review* + - *software review* examples: AEC, IPOL, the Journal of Open Source Software +#+BEAMER: \pause +*** Sharing + :PROPERTIES: + :BEAMER_col: 0.48 + :BEAMER_env: block + :END: + - the vast majority of software is developed outside of academia + - code hosting platforms like GitHub, GitLab, and many more + - institutional repositories or archives (HAL, Zenodo, SWH, etc.) 
+#+BEAMER: \pause + +*** + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: + + + \hfill We do not indicate HAL or Software Heritage as a publisher. + + +* Conclusion +** Lessons learned +*** The importance of a software license + - can software be deposited without a license? +#+BEAMER: \pause + \hfill the license became a *mandatory* field on HAL +#+BEAMER: \pause +*** Collective authorship + - can the X project team be the author of software? +#+BEAMER: \pause + \hfill authorship can be established only with a *clear link* between a /person and a deposit/ +#+BEAMER: \pause +*** Legacy software + - should be archived in its original state + - where to put additional information? +#+BEAMER: \pause + create a source code *container* to capture both /original/ and /added information/ as detailed in the + \href{https://www.softwareheritage.org/swhap}{legacy software acquisition process} + + +** Lessons learned (continued) +*** Research experiments + - deposit on HAL or just archive repository on SWH? +#+BEAMER: \pause + \hfill depends on the *life span* of the experiment +#+BEAMER: \pause + +*** Software with large datasets + - include in software deposit or separate? +#+BEAMER: \pause + \hfill depends on *dataset nature* and *reuse possibilities* + +#+BEAMER: \pause + +*** Software collections :B_picblock: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_OPT: pic=python3-matplotlib.pdf, width=.5\linewidth, leftpic=true + :END: + - Research Software does not exist in isolation + - large /web of dependencies/ on non-research software + - single or multiple deposits? +#+BEAMER: \pause + \hfill depends on *reuse possibilities* + +** Next steps +TODO + +** Come in, we're open! +*** + This work is partially supported by the FAIRsFAIR European project. 
+ \url{www.softwareheritage.org} --- learn more \\ + contribute to the [[https://gitlab.inria.fr/gt-sw-citation/bibtex-sw-entry/][@software bibtex proposal]] + \url{www.softwareheritage.org/swhap} --- legacy software acquisition process \\ + + #+BEAMER: \vspace{-1mm} \flushright {\Huge Questions?} \vfill + +*** References :B_block: + :PROPERTIES: + :BEAMER_env: block + :END: + #+BEGIN_EXPORT latex + \begin{thebibliography}{Foo Bar, 1969} + \footnotesize + \bibitem{Abramatic2018} Jean-François Abramatic, Roberto Di Cosmo, Stefano Zacchiroli\newblock + \emph{Building the Universal Archive of Source Code},\\ + Communications of the ACM, October 2018 + \href{https://doi.org/10.1145/3183558}{(10.1145/3183558)} + \bibitem{DiCosmo2019} Roberto Di Cosmo, Morane Gruenpeter, Stefano Zacchiroli\newblock + \emph{Referencing Source Code Artifacts: a Separate Concern in Software Citation},\\ + Computing in Science and Engineering, IEEE, pp.1-9. \href{https://dx.doi.org/10.1109/MCSE.2019.2963148}{(10.1109/MCSE.2019.2963148)} + \href{https://hal.archives-ouvertes.fr/hal-02446202}{(hal-02446202)} \end{thebibliography} + #+END_EXPORT + + + + + + + + diff --git a/talks-public/2020-02-18-IDCC-15th/METADATA b/talks-public/2020-02-18-IDCC-15th/METADATA new file mode 100644 index 0000000..600ef0f --- /dev/null +++ b/talks-public/2020-02-18-IDCC-15th/METADATA @@ -0,0 +1,33 @@ +Title: The swh-id: a digital fingerprint identifying software source code + + + Abstract: + + The Software Heritage universal archive of software source code relies on + well-established techniques used in software development communities to + identify the over 20 billion code artefacts it preserves: + cryptographic hashes in a Merkle DAG data structure. + + In this session we will first explain the motivations of this choice, + recalling Paskin's essential distinction between digital identifiers of + an object (DIOs) and identifiers of digital objects (IDOs). 
+ + Then we will focus on the properties of the Software Heritage Identifiers + (SWH-IDs) that matter most in a reproducibility and long-term archival framework: + intrinsic integrity and independent verifiability. + + Finally, we will show practically how they can be used to improve current + research publication practices. + + How would you run the session to support the spirit of PIDapalooza as a laid-back, + welcoming, energetic and exciting meeting, and ensure at least 10 minutes of + your session are used to interact with the audience? + + We will do a live demonstration of the swh-identify module that can extract + the PID from the digital artefact. + We will also show how to resolve an swh-id on the online archive and how + to find an swh-id of a preserved artefact. + + Finally, we will invite participants who want to preserve their repositories + or important repositories to submit the code with Software Heritage's + "save code now" feature. diff --git a/talks-public/2020-02-18-IDCC-15th/Makefile b/talks-public/2020-02-18-IDCC-15th/Makefile new file mode 100644 index 0000000..68fbee7 --- /dev/null +++ b/talks-public/2020-02-18-IDCC-15th/Makefile @@ -0,0 +1 @@ +include ../Makefile.slides