diff --git a/talks-public/2022-05-31-HalPartners/2022-05-31-HalPartners.org b/talks-public/2022-05-31-HalPartners/2022-05-31-HalPartners.org index cd28ab4..a5cf961 100644 --- a/talks-public/2022-05-31-HalPartners/2022-05-31-HalPartners.org +++ b/talks-public/2022-05-31-HalPartners/2022-05-31-HalPartners.org @@ -1,810 +1,401 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) #+TITLE: Software as a pillar for open science #+SUBTITLE: policy, needs, and how to address them # #+AUTHOR: Roberto Di Cosmo # #+EMAIL: roberto@dicosmo.org @rdicosmo @swheritage #+BEAMER_HEADER: \date{May 2022} #+BEAMER_HEADER: \title[Towards a Software Pillar of Open Science]{The Software Pillar of Open Science} #+BEAMER_HEADER: \author[R. Di Cosmo~~~~ roberto@dicosmo.org ~~~~ (CC-BY 4.0)]{Roberto Di Cosmo} #+BEAMER_HEADER: \institute[Software Heritage]{Director, Software Heritage\\Inria and Universit\'e de Paris Cit\'e} # #+BEAMER_HEADER: \setbeameroption{show notes on second screen} #+BEAMER_HEADER: \setbeameroption{hide notes} #+KEYWORDS: software heritage legacy preservation knowledge mankind technology #+LATEX_HEADER: \usepackage{tcolorbox} #+LATEX_HEADER: \definecolor{links}{HTML}{2A1B81} #+LATEX_HEADER: \definecolor{links}{HTML}{0ADB11} #+LATEX_HEADER: \hypersetup{colorlinks,linkcolor=,urlcolor=links} #+LATEX_HEADER: \hypersetup{colorlinks,linkcolor=,urlcolor=cyan} # # prelude.org contains all the information needed to export the main beamer latex source # use prelude-toc.org to get the table of contents # #+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 #+INCLUDE: "../../common/modules/169.org" # +LaTeX_CLASS_OPTIONS: [aspectratio=169,handout,xcolor=table] #+LATEX_HEADER: \usepackage{bbding} #+LATEX_HEADER: \DeclareUnicodeCharacter{66D}{\FiveStar} # # If you want to change the title logo it's here # # +BEAMER_HEADER: \titlegraphic{\includegraphics[width=0.5\textwidth]{SWH-logo}} 
# aspect ratio can be changed, but the slides need to be adapted # - compute a "resizing factor" for the images (macro for picblocks?) # # set the background image # # https://pacoup.com/2011/06/12/list-of-true-169-resolutions/ # #+BEAMER_HEADER: \pgfdeclareimage[height=90mm,width=160mm]{bgd}{swh-world-169.png} #+BEAMER_HEADER: \setbeamertemplate{background}{\pgfuseimage{bgd}} #+LATEX: \addtocounter{framenumber}{-1} * Introduction :noexport: #+INCLUDE: "../../common/modules/rdc-bio.org::#main" :only-contents t :minlevel 2 * Software and Open Science ** Why Open Science? :noexport: #+BEAMER: \vspace{-.5em} *** Open Science ([[https://www.ouvrirlascience.fr/wp-content/uploads/2021/10/Second_French_Plan-for-Open-Science_web.pdf][Second National Plan for Open Science]], France, 2021) /Unhindered/ dissemination of results, methods and products from scientific research.\\ It draws on /the opportunity provided by recent digital progress/ to develop /open access/ to /publications/ and – as much as possible – /data/, /source code/ and /research methods/. #+BEAMER: \pause #+BEAMER: \vspace{-.3em} *** Jean-Eric Paquet (EU DGRI, [[https://www.eosc.eu/sites/default/files/EOSC-SRIA-V1.0_15Feb2021.pdf][on the objective of Open Science]]) # Preface of EOSC SRIA https://www.eosc.eu/sites/default/files/EOSC-SRIA-V1.0_15Feb2021.pdf “Increase /scientific quality/, the /pace of discovery and technological development/, as well as /societal trust in science/.” #+BEAMER: \pause #+BEAMER: \vspace{-.1em} *** Mariya Gabriel ([[https://www.s4d4c.eu/insights-from-commissioner-mariya-gabriel-towards-the-european-union-science-diplomacy/][EU Commissioner]] for Research) # From the article: https://www.s4d4c.eu/insights-from-commissioner-mariya-gabriel-towards-the-european-union-science-diplomacy/ The COVID-19 crisis has also shown that cooperation at international level in research and innovation is more important than ever, including through /open access to data and results/. 
/No nation, no country can tackle any of these global challenges alone/. #+BEAMER: \pause #+BEAMER: \vspace{-.3em} *** Yuval Noah Harari (on COVID 19) \hfill /“The real antidote [to epidemic] is/ scientific knowledge /and/ global cooperation.” ** Software is a pillar of Open Science #+INCLUDE: "../../common/modules/swh-ardc.org::#pillaropenscience" :only-contents t :minlevel 3 #+BEAMER: \pause *** \hfill Preserving (the history of) source code is necessary for /reproducibility/ ** Software /Source Code/ is Precious Knowledge #+INCLUDE: "../../common/modules/source-code-different-short.org::#softwareisdifferent" :only-contents t :minlevel 3 * Policy framework and growing needs ** The Paris Call on Software Source code (2019, UNESCO) #+INCLUDE: "../../common/modules/policyactions.org::#pariscall2019science" :only-contents t :minlevel 3 ** Second French National plan for Open Science (2021, MESRI) #+INCLUDE: "../../common/modules/policyactions.org::#pnso2" :only-contents t :minlevel 3 ** Research Software is getting recognized #+INCLUDE: "../../common/modules/policyactions.org::#awards2022" :only-contents t :minlevel 3 ** A plurality of needs that we must address #+INCLUDE: "../../common/modules/swh-ardc.org::#userneeds" :only-contents t :minlevel 3 ** Archive, Reference, Describe, Cite and Credit #+INCLUDE: "../../common/modules/swh-ardc.org::#ardc" :only-contents t :minlevel 3 * Can you address these needs? ** A word of warning: forges are /not/ archives! *** 2015: the first big bad news Google Code and Gitorious.org shutdown: ~1M endangered repositories - broken links in the web of knowledge (my papers too) #+BEAMER: \pause *** 2019: big bad news keeps coming in - summer 2019: BitBucket announces Mercurial VCS sunset - july 2020: BitBucket erases /250.000+/ repositories (including research software) #+BEAMER: \pause *** 2021: ... 
in Academia too - october 2021: Inria's old gforge is unplugged + [[https://github.com/ocaml/opam-repository/issues/19757][breaks the build chain]] of the OCaml package manager (Opam) #+BEAMER: \pause *** Bottomline \hfill we need a universal archive of software source code: \pause now we have one! ** Software Heritage in a nutshell \hfill www.softwareheritage.org #+BEAMER: \transdissolve #+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :only-contents t :minlevel 3 ** The largest software archive, a shared infrastructure #+latex: \begin{center} #+ATTR_LATEX: :width 0.7\linewidth file:SWH-as-foundation-slim.png #+latex: \end{center} #+BEAMER: \pause #+latex: \centering #+ATTR_LATEX: :width \extblockscale{.9\linewidth} file:archive-growth.png ** An international, non profit initiative\hfill built for the long term :PROPERTIES: :CUSTOM_ID: support :END: *** Sharing the vision :B_block: :PROPERTIES: :CUSTOM_ID: endorsement :BEAMER_COL: .5 :BEAMER_env: block :END: #+LATEX: \begin{center}{\includegraphics[width=\extblockscale{.4\linewidth}]{unesco_logo_en_285}}\end{center} #+LATEX: \vspace{-0.8cm} #+LATEX: \begin{center}\vskip 1em \includegraphics[width=\extblockscale{1.4\linewidth}]{support.pdf}\end{center} #+latex: \small And many more ...\\ #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/testimonials} #+BEAMER: \pause *** Donors, members, sponsors :B_block: :PROPERTIES: :CUSTOM_ID: sponsors :BEAMER_COL: .5 :BEAMER_env: block :END: #+LATEX: \begin{center}\includegraphics[width=\extblockscale{.4\linewidth}]{inria-logo-new}\end{center} #+LATEX: \begin{center} # #+LATEX: \includegraphics[width=\extblockscale{.2\linewidth}]{sponsors-levels.pdf} #+LATEX: \colorbox{white}{\includegraphics[width=\extblockscale{1.4\linewidth}]{sponsors.pdf}} #+LATEX: \end{center} # - sponsoring / partnership :: \hfill \url{sponsorship.softwareheritage.org} *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** Research collaboration 
:B_picblock:noexport: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: picblock :BEAMER_OPT: pic=Qwant_Logo, leftpic=true :END: source code search engine *** See more :noexport: \hfill\tiny\url{http://www.softwareheritage.org/support/testimonials} *** Global network :B_picblock:noexport: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: picblock :BEAMER_OPT: pic=fossid, leftpic=true, width=.3\linewidth :END: - first *independent mirror* - increased reliability ** Addressing the four needs... \hfill (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) #+INCLUDE: "../../common/modules/swh-ardc.org::#swh-ardc-short" :only-contents t :minlevel 3 ** HAL and Software Heritage: building a curated software catalog #+INCLUDE: "../../common/modules/swh-ardc.org::#halswhoverview" :only-contents t :minlevel 3 * Yes you can! ** An example is worth a thousand words #+INCLUDE: "../../common/modules/swh-ardc.org::#demoswhhal" :only-contents t :minlevel 3 * Call to action ** Best practices for ARDC actionable today ... #+INCLUDE: "../../common/modules/swh-ardc.org::#ardc-best-france" :only-contents t :minlevel 3 ** Bottomline *** HAL+SWH let you address all the needs at once... - /researcher, engineer/: archival, reference, credit, CV etc. /with a little effort from them/ - /labs, organizations/: track and report software production in a simple way - /technology transfer offices/: view the software production - /national level/: a /curated/ catalog of the software production #+BEAMER: \pause *** ... 
with a little effort from your side - Update the Open Science policy to include software - Train on the use of SWH and HAL for software - Join the network of HAL moderators for software #+BEAMER: \pause *** #+LATEX: \centering{\large it's a long road, but together we can make it}\\[1em] #+latex: \centering{\huge\bf Questions?} -* END - \end{document} * Appendix :B_appendix: :PROPERTIES: :BEAMER_env: appendix :END: ** \vfill \centerline{\Huge Appendix} \vfill -* Policy news -** The Paris Call on Software Source code (2019, UNESCO) -#+BEAMER: \vspace{-.8em} -*** :B_column:BMCOL: - :PROPERTIES: - :BEAMER_col: .53 - :BEAMER_env: column - :END: - #+ATTR_LATEX: :width .65\linewidth - file:UNESCOParisCallMeeting.png - UNESCO, Inria, Software Heritage invite\\ - [[https://en.unesco.org/news/experts-call-greater-recognition-software-source-code-heritage-sustainable-development][40 international experts to meet in Paris]] - #+BEAMER: \pause -*** :B_column:BMCOL: - :PROPERTIES: - :BEAMER_col: .5 - :BEAMER_env: column - :END: - #+ATTR_LATEX: :width .6\linewidth - file:paris_call_ssc_cover.jpg - [[https://en.unesco.org/foss/paris-call-software-source-code][The call is published on Feb 2019]]\pause -*** :B_ignoreheading: - :PROPERTIES: - :BEAMER_env: ignoreheading - :END: -# #+BEAMER: \vspace{-.3em} -*** - :PROPERTIES: - :BEAMER_COL: 1.06 - :BEAMER_env: block - :END: - #+LATEX: {\it - “[We call to] promote software development as a valuable research activity, - and research software as a key enabler for Open Science/Open Research, - sharing good practices and recognising in the careers of academics their - contributions to high quality software development, in all their forms” - #+LATEX: } - https://en.unesco.org/foss/paris-call-software-source-code -** The UNESCO recommendations for Open Science, 2018-2021 - #+INCLUDE: "../../common/modules/policyactions.org::#unesco2021" :only-contents t :minlevel 3 -** The EOSC SIRS report: Software Source Code and Open Science, 2020 - 
#+INCLUDE: "../../common/modules/policyactions.org::#eoscsirs2020-expanded" :only-contents t :minlevel 3 -** French National plan for Open Science, 2021-2024 - #+LATEX: \vspace{-.3em} - #+INCLUDE: "../../common/modules/policyactions.org::#PNSO2" :only-contents t :minlevel 3 - #+BEAMER: \pause -*** - \hfill The "Collège Logiciel" of the National Committee on Open Science (CoSO) is now live! -** Software in the EOSC - #+INCLUDE: "../../common/modules/policyactions.org::#eoscswtf2021" :only-contents t :minlevel 3 -** Call to action: let's engage with policy makers (it may be us!) :noexport: -#+BEAMER: \vspace{-.5em} -*** Institutional representation :B_block: - :PROPERTIES: - :BEAMER_COL: .5 - :BEAMER_env: block - :END: - we need an (open source) software VP in - - universities - - ministries - - governments -#+BEAMER: \pause -*** Funding for infrastructures :B_block: - :PROPERTIES: - :BEAMER_env: block - :BEAMER_COL: .5 - :END: - push for funding instruments adapted to digital infrastructures (e.g. ESFRI): - + cost of human resources is /predominant/ - + /much shorter/ time frame -#+BEAMER: \pause -*** :B_ignoreheading: - :PROPERTIES: - :BEAMER_env: ignoreheading - :END: -*** Set the default to open: pass the message - /publicly funded research software should be open source/ \\ - \hfill exceptions must be justified -#+BEAMER: \pause -*** Career evaluation and incentives - - recognize /quality/ software development - + see e.g. [[https://hal-lara.archives-ouvertes.fr/hal-03110723][the 2021 Inria guidelines]] (in french) - and [[https://hal.archives-ouvertes.fr/hal-02135891][this CiSE 2020 article]] (in english) - - keep the human in the loop, avoid number games -** The floor is yours -*** -#+LATEX: \centering{\large it's a long road, but together we can make it}\\[1em] - - #+latex: \centering{\huge\bf Questions?} -*** References - #+BEGIN_EXPORT latex - \begin{thebibliography}{Foo Bar, 1969} - \footnotesize -% \bibitem{SwForumEu2021} R. 
Di Cosmo, \emph{A revolutionary infrastructure for Open Source}, 2021, EU Software Forum \href{https://annex.softwareheritage.org/public/talks/2021/2021-03-24-SwForum.pdf}{(slides)} \href{https://youtu.be/AwY527kDMfM?t=178}{(video)} - \bibitem{UNESCOOS} UNESCO, \emph{Draft recommendations on Open Science} - \newblock 2021, \href{https://unesdoc.unesco.org/ark:/48223/pf0000378381.locale=en}{(online)} - \bibitem{PNSO2} French Ministry of Research, \emph{Second National Plan for Open Science} - \newblock 2021, \href{https://www.enseignementsup-recherche.gouv.fr/cid159131/le-plan-national-pour-la-science-ouverte-2021-2024-vers-une-generalisation-de-la-science-ouverte-en-france.html}{(online)} - \bibitem{EOSCSirs2020} EOSC SIRS Task Force, \emph{Scholarly Infrastructures for Research Software} - \newblock 2020, Publications office of the European Commission, \href{https://doi.org/10.2777/28598}{(10.2777/28598)} - \bibitem{DiCosmo2020d} R. Di Cosmo, \emph{Archiving and Referencing Source Code with Software Heritage} - \newblock International Conference on Mathematical Software 2020 \href{https://dx.doi.org/10.1007/978-3-030-52200-1_36}{(10.1007/978-3-030-52200-1\_36)} -% \bibitem{DiCosmo2019} R. Di Cosmo, M. Gruenpeter, S. Zacchiroli\newblock -% \emph{Referencing Source Code Artifacts: a Separate Concern in Software Citation},\newblock -% CiSE 2020 \href{https://dx.doi.org/10.1109/MCSE.2019.2963148}{(10.1109/MCSE.2019.2963148)} -% \href{https://hal.archives-ouvertes.fr/hal-02446202}{(hal-02446202)} -% \bibitem{alliez:hal-02135891} P. Alliez, R. Di Cosmo, B. Guedj, A. Girault, M.-S. Hacid, A. Legrand and N. Rougier\newblock -% \emph{Attributing and referencing (research) software: Best practices and outlook from Inria}, \newblock -% CiSE 2020 \href{https://doi.ieeecomputersociety.org/10.1109/MCSE.2019.2949413}{(10.1109/MCSE.2019.2949413)} -% \href{https://hal.archives-ouvertes.fr/hal-02135891}{(hal-02135891)} - \bibitem{Abramatic2018} J.F. Abramatic, R. Di Cosmo, S. 
Zacchiroli, - \emph{Building the Universal Archive of Source Code} - \newblock CACM, October 2018 \href{https://doi.org/10.1145/3183558}{(10.1145/3183558)} - \end{thebibliography} - #+END_EXPORT - -* Introduction +* Software source code ** Source code is /special/ (software is /not/ data) # Was: #+INCLUDE: "../../common/modules/swh-ardc.org::#swnotdata" :only-contents t :minlevel 3 *** /Executable/ and /human readable/ knowledge \hfill copyright law :noexport: /“Programs must be written for people to read, and only incidentally for machines to execute.”/\\ \hfill Harold Abelson #+BEAMER: \pause *** Software /evolves/ over time - projects may last decades - the /development history/ is key to its /understanding/ #+BEAMER: \pause *** Complexity :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=python3-matplotlib.pdf, width=.6\linewidth :END: - /millions/ of lines of code - large /web of dependencies/ + easy to break, difficult to maintain + /research software/ a thin top layer - sophisticated /developer communities/ *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** The human side design, algorithm, code, test, documentation, community, funding\\ \hfill and so many more facets ... ** Where is the source code? *** Collaborative development platforms (aka "forges") - BitBucket, GitLab(.com), GitHub, etc. - support for version control, issues, etc. - example: + https://github.com/rdicosmo/parmap + https://gitlab.inria.fr/gt-sw-citation/bibtex-sw-entry/ #+BEAMER: \pause *** Distribution platforms - CTAN, CRAN, PyPi, Debian, etc. 
- example: https://ctan.org/pkg/biblatex-software #+BEAMER: \pause *** Archives - Software Heritage - example: [[https://archive.softwareheritage.org/swh:1:dir:92a6d0b9953aa3645ffac6bb4fb30a02932872eb;origin=https://gitlab.inria.fr/gt-sw-citation/bibtex-sw-entry;visit=swh:1:snp:05753fe748b7b85cbd0a9e2bea89aac5268b06c6;anchor=swh:1:rev:7c621448de21b0950cdff2dda37834cd4b389bfa][archived version of biblatex-software]] -* Adoption -** Growing adoption of SWH in Academia (selection) - #+INCLUDE: "../../common/modules/swh-adoption-academic.org::#adoption" :only-contents t :minlevel 3 -* Demo time! -** Overview of the Software Heritage / HAL synergy - #+ATTR_LATEX: :width \linewidth - file:hal-swh-overview.png -** A walkthrough - #+INCLUDE: "../../common/modules/swh-ardc.org::#demoswhhal" :only-contents t :minlevel 3 -** Growing adoption of SWH in Academia (selection) - #+INCLUDE: "../../common/modules/swh-adoption-academic.org::#adoption" :only-contents t :minlevel 3 -** An international, non profit initiative\hfill built for the long term - :PROPERTIES: - :CUSTOM_ID: support - :END: -*** Sharing the vision :B_block: - :PROPERTIES: - :CUSTOM_ID: endorsement - :BEAMER_COL: .5 - :BEAMER_env: block - :END: - #+LATEX: \begin{center}{\includegraphics[width=\extblockscale{.4\linewidth}]{unesco_logo_en_285}}\end{center} - #+LATEX: \vspace{-0.8cm} - #+LATEX: \begin{center}\vskip 1em \includegraphics[width=\extblockscale{1.4\linewidth}]{support.pdf}\end{center} - #+latex: \small And many more ...\\ - #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/testimonials} -#+BEAMER: \pause -*** Donors, members, sponsors :B_block: - :PROPERTIES: - :CUSTOM_ID: sponsors - :BEAMER_COL: .5 - :BEAMER_env: block - :END: - #+LATEX: \begin{center}\includegraphics[width=\extblockscale{.4\linewidth}]{inria-logo-new}\end{center} - #+LATEX: \begin{center} - # #+LATEX: \includegraphics[width=\extblockscale{.2\linewidth}]{sponsors-levels.pdf} - #+LATEX: 
\colorbox{white}{\includegraphics[width=\extblockscale{1.4\linewidth}]{sponsors.pdf}} - #+LATEX: \end{center} -# - sponsoring / partnership :: \hfill \url{sponsorship.softwareheritage.org} -*** :B_ignoreheading: - :PROPERTIES: - :BEAMER_env: ignoreheading - :END: -*** Research collaboration :B_picblock:noexport: - :PROPERTIES: - :BEAMER_COL: .5 - :BEAMER_env: picblock - :BEAMER_OPT: pic=Qwant_Logo, leftpic=true - :END: - source code search engine -*** See more :noexport: - \hfill\tiny\url{http:://www.softwareheritage.org/support/testimonials} -*** Global network :B_picblock:noexport: - :PROPERTIES: - :BEAMER_COL: .5 - :BEAMER_env: picblock - :BEAMER_OPT: pic=fossid, leftpic=true, width=.3\linewidth - :END: - - first *independent mirror* - - increased reliability - -** Call to action on ARDC: let's foster adoption! -*** Train students and colleagues to [[https://www.softwareheritage.org/save-and-reference-research-software/][archive and reference relevant source code]] -#+BEAMER: \vspace{-.5em} -***** :B_column:BMCOL: - :PROPERTIES: - :BEAMER_col: .05 - :BEAMER_env: column - :END: -***** :B_column:BMCOL: - :PROPERTIES: - :BEAMER_col: .45 - :BEAMER_env: column - :END: - + full details in the [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] article -***** :B_column:BMCOL: - :PROPERTIES: - :BEAMER_col: .5 - :BEAMER_env: column - :END: - + short operational [[https://www.softwareheritage.org/save-and-reference-research-software/][HOWTO online]]\pause -***** :B_ignoreheading: - :PROPERTIES: - :BEAMER_env: ignoreheading - :END: -*** Engage conferences, journals, learned societies to use Software Heritage and SWHIDs - APIs for [[https://save.softwareheritage.org][save code now]] and [[https://hal.inria.fr/hal-01872189][deposit]] are available to integrate with -#+BEAMER: \vspace{-.5em} -***** :B_column:BMCOL: - :PROPERTIES: - :BEAMER_col: .25 - :BEAMER_env: column - :END: - + Research Articles -***** :B_column:BMCOL: - :PROPERTIES: - :BEAMER_col: .42 - 
:BEAMER_env: column - :END: - + Artifact Evaluation Committees -***** :B_column:BMCOL: - :PROPERTIES: - :BEAMER_col: .3 - :BEAMER_env: column - :END: - + Badging initiatives -***** :B_ignoreheading: - :PROPERTIES: - :BEAMER_env: ignoreheading - :END: - #+BEAMER: \pause -*** Help grow and structure the community - - Promote the [[https://www.softwareheritage.org/ambassadors/][ambassador program]] - - Encourage our institutions to - + include Software Heritage in their Open Science policy - + become [[https://www.softwareheritage.org/support/sponsors/][member/sponsor]] - + build a Software Heritage mirror (see ENEA) -* What is at stake before and beyond ARDC -** What is at stake: before ARDC - #+INCLUDE: "../../common/modules/swh-ardc.org::#beforeardc" :only-contents t :minlevel 3 -** What is at stake: beyond ARDC - #+INCLUDE: "../../common/modules/swh-ardc.org::#beyondardc-evaluation" :only-contents t :minlevel 3 -* State of affairs in CS -** The state of the art (in CS!) is far from ideal -#+BEAMER: \vspace{-.6em} -*** ICSE (Zannier, Melrik, Maurer, 2006) :B_block: - :PROPERTIES: - :BEAMER_COL: .45 - :BEAMER_env: block - :END: - absence of replication studies -*** [[https://fr.slideshare.net/carloghezzi18/icse-2009-keynote-15919951][ACM TOSEM 2001 to 2006]] \hfill C. 
Ghezzi :B_block: - :PROPERTIES: - :BEAMER_COL: .55 - :BEAMER_env: block - :END: - 60% papers with tools: *only 20%* /installable/ -*** :B_ignoreheading: - :PROPERTIES: - :BEAMER_env: ignoreheading - :END: -*** Collberg's [[http://reproducibility.cs.arizona.edu/][2015 reproducibility study]] :B_picblock: - :PROPERTIES: - :BEAMER_env: picblock - :BEAMER_opt: pic=collberg-outcome-new.png, width=.6\linewidth, leftpic=true - :END: - 601 mainstream papers - - 508 with tools - - *only 40%* /installable/ -#+BEAMER: \pause -#+BEAMER: \vspace{-.6em} -*** Main reasons: source code (/or the right version of it/) cannot be found - - *policy issue*: opening up the code of research software - - *infrastructures*: archive and reference it\hfill *let's start here* - -* Detailed addressing ARDC -** Addressing the A(rchive) - #+INCLUDE: "../../common/modules/swh-ardc.org::#swh-a" :only-contents t :minlevel 3 -** Recent preservation news -*** Saving 250.000 endangered repositories... - - summer 2019: BitBucket announce Mercurial VCS phase out - - fall 2019: Software Heritage teams up with Octobus (funded by NLNet, thanks!) - - july 2020: BitBucket erases /250.000/ repositories - - august 2020: [[https://bitbucket-archive.softwareheritage.org][bitbucket-archive.softwareheritage.org]] is live -#+BEAMER: \pause -*** ... preserving the web of knowledge \hfill (original tweet [[https://twitter.com/gabrielaltay/status/1300218789762662401][is here]] ) :B_picblock: - :PROPERTIES: - :BEAMER_env: picblock - :BEAMER_OPT: pic=bitbucket_swh_praise.png, width=.6\linewidth, leftpic=true - :END: - -\\ - *Bottomline*\\ - /explicit deposit/ is important, ...\\ - \mbox{}\hfill ... and we must promote it...\hfill\mbox{}\\ - \mbox{}\hfill ... 
but will never be enough.\\ -\mbox{}\\ -\mbox{}\hfill /(think also of all software dependencies!)/ -** R(eference): granularity and identifiers \hfill [[http://doi.org/10.15497/RDA00053][10.15497/RDA00053]] - #+INCLUDE: "../../common/modules/swh-ardc.org::#swh-r" :only-contents t :minlevel 3 -** Addressing D(escribe) and C(ite) in ARDC (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) - #+INCLUDE: "../../common/modules/swh-ardc.org::#swh-dc" :only-contents t :minlevel 3 - +* All the source code +** All the source code! + #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{\linewidth}]{swh-collect-axes}\end{center} +** All the source: strategies + #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{\linewidth}]{swh-collect-strategies}\end{center} * FAIR ** What about FAIR? (Findable, Accessible, Interoperable, Reusable) *** FAIR data principles /for data/ *in a nutshell:* metadata, metadata, metadata all over the place (makes sense for data)\pause *** But software is /not data/ ... 
the terms /interoperability/ and /reusability/ have precise technical meaning for software, and /differ significantly/ from what is intended by the I and R of FAIR; - see the entries for [[https://en.wikipedia.org/wiki/Interoperability#Software][software interoperability]] and [[https://en.wikipedia.org/wiki/Reusability][software reusability]] - it is /very difficult/ to achieve these properties even for commercial software developed by multi billion dollars corporations\pause *** FAIR for software is a distraction \hfill let's focus on the real issues at stake: ARDC a good starting point -* SWHIDs -** R(eference): granularity and identifiers \hfill [[http://doi.org/10.15497/RDA00053][10.15497/RDA00053]] - #+LATEX: \centering\forcebeamerstart - #+LATEX: \only<1>{\includegraphics[width=0.8\linewidth]{Granularity-Level-animated-0.png}} - #+LATEX: \only<2>{\includegraphics[width=0.8\linewidth]{Granularity-Level-animated-1.png}} - #+LATEX: \only<3>{\includegraphics[width=0.8\linewidth]{Granularity-Level-animated-2.png}} - #+LATEX: \only<4>{\includegraphics[width=0.8\linewidth]{Granularity-Level-animated-3.png}} - #+LATEX: \forcebeamerend - #+LATEX: \only<1>{\begin{block}{}\centering Top concept layers vs. bottom artifact layers\end{block}} - #+LATEX: \only<2>{\begin{block}{}\centering Extrinsic identifiers are key for the concept layers\end{block}} - #+LATEX: \only<3>{\begin{block}{}\centering Intrinsic identifiers are key for the artifact layers\end{block}} - #+LATEX: \only<4>{\begin{block}{}\centering In some cases, extrinsic identifiers can be added too\end{block}} - -** Extrinsic and Intrinsic identifiers in a nutshell -*** Extrinsic identifiers: no /per se/ relation with the designated Object - A /register/ keeps the correspondence between the identifier and the object - - pre-internet era :: passport number, social security number, ISBN, ISSN, etc. 
- - internet era :: DOI, Handle, Ark, PURLs, RRID, etc.\pause -*** Intrinsic identifiers: derived from the designated Object - /No register/ needed to keep the correspondence between the identifier and the object - - pre-internet era :: musical notation, chemical notation (/NaCl/ is table salt)\pause - - internet era :: cryptographic hashes for distributed software development, Bitcoin\pause -*** - \hfill more in [[https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/][this dedicated blog post]] (with pointers to literature) -** Meet the Software Heritage Identifiers (SWHIDs) \hfill [[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][(full spec)]] :noexport: - #+INCLUDE: "../../common/modules/swhid.org::#oneslide" :only-contents t - for *20+ billions* software artifacts! -** Meet the SWHID intrinsic identifiers - #+INCLUDE: "../../common/modules/swh-ardc.org::#swh-r" :only-contents t :minlevel 3 -* Revolutionary infrastructure, scientific challenges -** Summing up: a revolutionary infrastructure /designed for source code/ :B_picblock: -#+latex:\vspace{-0.2em} -#+BEGIN_EXPORT latex - \begin{center} - \includegraphics[width=.4\linewidth]{SWH-logo.pdf} - { \url{www.softwareheritage.org}} - \end{center} -#+END_EXPORT -#+latex:\vspace{-0.4em} -*** /global/ source code /archive/ \hfill /Library of Alexandria of source code/ - :PROPERTIES: - :BEAMER_env: picblock - :BEAMER_OPT: pic=clock-spring-forward.png,width=.15\linewidth,leftpic=true - :END: - + harvest /all/ software source code - + /on demand harvesting/ and /curated deposit/ -#+latex:\vspace{-0.4em} -*** /universal/ intrinsic identifiers - \mbox{}\hfill SWHID standard is independent of version control systems -#+latex:\vspace{-0.4em} -*** /uniform/ data model, /full graph/ of development history - \mbox{}\hfill enables large scale, big code research -#+latex:\vspace{-0.4em} -*** /infrastructure/ for Open Science - \mbox{} \hfill /base layer/ for software 
source code in the /Open Science architecture/ - -** A revolutionary infrastructure /designed for software source code/ -#+INCLUDE: "../../common/modules/swh-as-infrastructure.org::#oneslide" :only-contents t :minlevel 3 -** A challenging scientific and technical undertaking -*** A novel, large infrastructure - - object storage [[https://www.softwareheritage.org/2021/03/11/towards-a-next-generation-object-storage-for-software-heritage/][with peculiar workload]] - - gigantic Merkle graph - - counting tens of billions of objects ([[https://www.softwareheritage.org/2021/05/11/next-generation-counters/][reuse P. Flajolet's seminal work]]) - - and much more: see [[https://www.softwareheritage.org/2021/04/08/swh-2021-technical-roadmap/][the 2021 technical roadmap]] -#+BEAMER: \pause -*** First datasets are available for Big Code analysis - - full graph of software development (~20Bn nodes, ~200Bn edges) - see Pietri, Spinellis, Zacchiroli, MSR 2019 https://dx.doi.org/10.1109/MSR.2019.00030 - - MSR 2020 mining competition - see https://2020.msrconf.org/track/msr-2020-mining-challenge#Call-for-Papers -* All the source code -** All the source code! - #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{\linewidth}]{swh-collect-axes}\end{center} -** All the source: strategies - #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{\linewidth}]{swh-collect-strategies}\end{center} -* Milestones, policy -** Milestones :B_block: - #+INCLUDE: "../../common/modules/swh-key-dates.org::#keydates" :minlevel 3 :only-contents t -** Policy highlight: the EU Copyright reform\hfill adopted March 28 2019 -*** "Upload filters": a threat to /all modern software development/ - - developing platforms (GitHub, GitLab, Bitbucket, etc.) 
- - *distribution platforms (Maven, Pypi, CRAN, CTAN, etc.)* - - *archives (Software Heritage)* -#+BEAMER: \pause -*** We got an exclusion for - \hfill /\sout{non for profit} open source software developing *and sharing* platforms/ -#+BEAMER: \pause -*** Key role of Software Heritage - \hfill policy-maker awareness, essential insights for NGOs, government contacts -* SWHIDs by the example :noexport: -** A word on the trust model for systems of identifiers - \vspace{-5pt} -*** Two general classes of systems of identifiers - - intrinsic :: /computed/ from the object /(no registry required, fully decentralised)/\\ - /(e.g.: chemical notation, music notation, hashes, SWHIDs)/\pause - - extrinsic :: /assigned/ by an authority /(need a registry)/\\ - /(e.g.: passport number, DOI, ARK, RRID, etc.)/\pause - \mbox{}\hfill See [[https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/][the dedicated blog post]] for more details -#+BEAMER: \pause -*** Trust model, extrinsic (e.g. DOIs) :B_block: - :PROPERTIES: - :BEAMER_COL: .5 - :BEAMER_env: block - :END: -#+ATTR_LATEX: :width \linewidth -file:doi-vs-pid-1.pdf -#+BEAMER: \pause -*** Trust model, intrinsic (e.g. 
SWHIDs) :B_block: - :PROPERTIES: - :BEAMER_env: block - :BEAMER_COL: .45 - :END: -#+ATTR_LATEX: :width .8\linewidth -file:doi-vs-pid-3.pdf -*** Trust model for DOIs with checksums :B_block:noexport: - :PROPERTIES: - :BEAMER_COL: .5 - :BEAMER_env: block - :END: -#+ATTR_LATEX: :width \linewidth -file:doi-vs-pid-2.pdf -*** :B_ignoreheading:noexport: - :PROPERTIES: - :BEAMER_env: ignoreheading - :END: -** A worked example - #+LATEX: \centering\forcebeamerstart - #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_1.pdf}}} - #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/contents.pdf}}} - #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_2_contents.pdf}}} - #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/directories.pdf}}} - #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_3_directories.pdf}}} - #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/revisions.pdf}}} - #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_4_revisions.pdf}}} - #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/releases.pdf}}} - #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_5_releases.pdf}}} - #+LATEX: \only<10>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/snapshots.pdf}}} - #+LATEX: \forcebeamerend -* Big code :noexport: -** Software Heritage for Research and Innovation -*** Reference platform for /Big Code/ :B_picblock: - :PROPERTIES: - :BEAMER_opt: pic=universal, leftpic=true, width=.2\linewidth - :BEAMER_env: picblock - :BEAMER_act: - :END: - - unique *observatory* of all software development - - *big 
data, machine learning* paradise: classification, trends, coding patterns, code completion... -#+BEAMER: \pause -*** First datasets are available! - - full graph of software development (~20Bn nodes, ~200Bn edges) - see Pietri, Spinellis, Zacchiroli, MSR 2019 https://dx.doi.org/10.1109/MSR.2019.00030 - - MSR 2020 mining competition - see https://2020.msrconf.org/track/msr-2020-mining-challenge#Call-for-Papers - +* What is at stake before and beyond ARDC +** What is at stake: before ARDC + #+INCLUDE: "../../common/modules/swh-ardc.org::#beforeardc" :only-contents t :minlevel 3 +** What is at stake: beyond ARDC + #+INCLUDE: "../../common/modules/swh-ardc.org::#beyondardc-evaluation" :only-contents t :minlevel 3 +* Policy news +** The UNESCO recommendations for Open Science, 2018-2021 + #+INCLUDE: "../../common/modules/policyactions.org::#unesco2021" :only-contents t :minlevel 3 +** The EOSC SIRS report: Software Source Code and Open Science, 2020 + #+INCLUDE: "../../common/modules/policyactions.org::#eoscsirs2020-expanded" :only-contents t :minlevel 3 +** Software in the EOSC + #+INCLUDE: "../../common/modules/policyactions.org::#eoscswtf2021" :only-contents t :minlevel 3 +* Adoption +** Growing adoption of SWH in Academia (selection) + #+INCLUDE: "../../common/modules/swh-adoption-academic.org::#adoption" :only-contents t :minlevel 3 * Public code, mirrors :B_block: ** News : archiving /public/ code #+latex: \begin{center} #+ATTR_LATEX: :width 0.7\linewidth file:codeetalab.png #+latex: \end{center} #+BEAMER: \pause https://code.etalab.gouv.fr ** News : ENEA mirror *** Thomas Jefferson, February 18, 1791 :B_block: :PROPERTIES: :BEAMER_ACT: :BEAMER_env: block :END: #+latex: {\em ...let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident. 
#+latex: } #+BEAMER: \pause *** Welcoming ENEA :B_block: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=LogoENEAcompletoENG.png, leftpic=true, width=.7\linewidth :END: - first *institutional* mirror - increased resilience - *AI infrastructure* for researchers - stepping stone to \endgraf \hfill a European joint effort ** Calling for preservation: Donald Knuth and Len Shustek *** Communications of the ACM, February 2021 :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=KnuthHistory.jpg, leftpic=true, width=.3\textwidth :END: /"Telling historical stories is the best way to teach. It's much easier to understand something if you know the threads it is connected to."/ \mbox{}\\ \mbox{}\\ \mbox{}\hfill /Let's Not Dumb Down the History of Computer Science/\\ \mbox{}\hfill Donald E. Knuth, Len Shustek\\ \mbox{}\hfill https://doi.org/10.1145/3442377 #+BEAMER: \pause *** A unique opportunity most of the creators are still here: we can talk to them!\\ \hfill but the clock is ticking... # - Software Heritage provides a key infrastructure for software historians ** Source code history for Security and Transparency #+LATEX: \vspace{-.5em} *** Where does reused software come from? :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: #+BEGIN_EXPORT latex \begin{center} \includegraphics[width=.7\linewidth]{myriadsources} \end{center} #+END_EXPORT #+BEAMER: \pause *** Do /you/ know where it comes from? :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .4 :END: - the software you ship - the software you use - the software you acquire - the software that + has that bug + has that vulnerability *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** KYSW: Know Your SoftWare :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=executiveorder.jpg,width=.4\linewidth,leftpic=true :END: Like KYC in banking, KYSW is now essential all over IT...\\ \mbox{}\\ *Sec. 4. 
Enhancing Software Supply Chain Security* \\ \hfill /ensuring and attesting, to the extent practicable, to the integrity and provenance of open source software/\\ \mbox{}\hfill [[https://www.whitehouse.gov/briefing-room/presidential-actions/2021/05/12/executive-order-on-improving-the-nations-cybersecurity/][May 2021 POTUS Executive Order]] - - +* SWHIDs +** R(eference): granularity and identifiers \hfill [[http://doi.org/10.15497/RDA00053][10.15497/RDA00053]] + #+LATEX: \centering\forcebeamerstart + #+LATEX: \only<1>{\includegraphics[width=0.8\linewidth]{Granularity-Level-animated-0.png}} + #+LATEX: \only<2>{\includegraphics[width=0.8\linewidth]{Granularity-Level-animated-1.png}} + #+LATEX: \only<3>{\includegraphics[width=0.8\linewidth]{Granularity-Level-animated-2.png}} + #+LATEX: \only<4>{\includegraphics[width=0.8\linewidth]{Granularity-Level-animated-3.png}} + #+LATEX: \forcebeamerend + #+LATEX: \only<1>{\begin{block}{}\centering Top concept layers vs. bottom artifact layers\end{block}} + #+LATEX: \only<2>{\begin{block}{}\centering Extrinsic identifiers are key for the concept layers\end{block}} + #+LATEX: \only<3>{\begin{block}{}\centering Intrinsic identifiers are key for the artifact layers\end{block}} + #+LATEX: \only<4>{\begin{block}{}\centering In some cases, extrinsic identifiers can be added too\end{block}} + +** Extrinsic and Intrinsic identifiers in a nutshell +*** Extrinsic identifiers: no /per se/ relation with the designated Object + A /register/ keeps the correspondence between the identifier and the object + - pre-internet era :: passport number, social security number, ISBN, ISSN, etc. 
+ - internet era :: DOI, Handle, ARK, PURLs, RRID, etc.\pause +*** Intrinsic identifiers: derived from the designated Object + /No register/ needed to keep the correspondence between the identifier and the object + - pre-internet era :: musical notation, chemical notation (/NaCl/ is table salt)\pause + - internet era :: cryptographic hashes for distributed software development, Bitcoin\pause +*** + \hfill more in [[https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/][this dedicated blog post]] (with pointers to literature) +** Meet the Software Heritage Identifiers (SWHIDs) \hfill [[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][(full spec)]] :noexport: + #+INCLUDE: "../../common/modules/swhid.org::#oneslide" :only-contents t + for *20+ billion* software artifacts! +** Meet the SWHID intrinsic identifiers + #+INCLUDE: "../../common/modules/swh-ardc.org::#swh-r" :only-contents t :minlevel 3
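To make "intrinsic" concrete, here is a minimal sketch of how an identifier can be computed from the object alone, with no registry involved. It assumes the content-level SWHID is the Git-compatible blob hash (SHA-1 over a =blob <length>\0= header followed by the bytes) prefixed with =swh:1:cnt:=; the function name =swhid_for_content= is hypothetical and not part of any Software Heritage library. The full spec linked above remains the authoritative definition.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Compute a content-level identifier from the bytes alone.

    Assumption: the identifier is the Git blob hash of the content,
    i.e. SHA-1 over b"blob <length>\\0" followed by the raw bytes,
    rendered as "swh:1:cnt:<40 hex digits>".
    """
    # Git-style header: object type, a space, the byte length, a NUL byte.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return "swh:1:cnt:" + digest


if __name__ == "__main__":
    # Anyone holding the same bytes obtains the same identifier.
    print(swhid_for_content(b"hello world\n"))
```

Because the identifier is derived from the content, two parties can check that they hold the same artifact by recomputing it independently, which is the trust model that the intrinsic side of the slide refers to.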