diff --git a/talks-public/2021-06-15-RESAW/2021-06-15-RESAW.org b/talks-public/2021-06-15-RESAW/2021-06-15-RESAW.org index 48d3cb4..1028b9d 100644 --- a/talks-public/2021-06-15-RESAW/2021-06-15-RESAW.org +++ b/talks-public/2021-06-15-RESAW/2021-06-15-RESAW.org @@ -1,431 +1,431 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) #+TITLE: Software Heritage #+SUBTITLE: preserving our software commons # #+AUTHOR: Roberto Di Cosmo # #+EMAIL: roberto@dicosmo.org @rdicosmo @swheritage -#+BEAMER_HEADER: \date[15/06/2021]{15 June 2021\\RESAW 2021 Keynote} +#+BEAMER_HEADER: \date[18/06/2021]{18 June 2021\\RESAW 2021 Keynote} #+BEAMER_HEADER: \title[Preserving software commons~~~~ www.softwareheritage.org]{Software Heritage} #+BEAMER_HEADER: \author[R. Di Cosmo~~~~ roberto@dicosmo.org ~~~~ (CC-BY 4.0)]{Roberto Di Cosmo\\Inria and Universit\'e de Paris\vspace{-2em}} # #+BEAMER_HEADER: \setbeameroption{show notes on second screen} #+BEAMER_HEADER: \setbeameroption{hide notes} #+KEYWORDS: software heritage legacy preservation knowledge mankind technology #+LATEX_HEADER: \usepackage{tcolorbox} #+LATEX_HEADER: \definecolor{links}{HTML}{2A1B81} #+LATEX_HEADER: \hypersetup{colorlinks,linkcolor=,urlcolor=links} # # prelude.org contains all the information needed to export the main beamer latex source # use prelude-toc.org to get the table of contents # #+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 #+INCLUDE: "../../common/modules/169.org" # +LaTeX_CLASS_OPTIONS: [aspectratio=169,handout,xcolor=table] #+LATEX_HEADER: \usepackage{bbding} #+LATEX_HEADER: \DeclareUnicodeCharacter{66D}{\FiveStar} # # If you want to change the title logo it's here # # +BEAMER_HEADER: \titlegraphic{\includegraphics[width=0.5\textwidth]{SWH-logo}} # aspect ratio can be changed, but the slides need to be adapted # - compute a "resizing factor" for the images (macro for picblocks?) 
# # set the background image # # https://pacoup.com/2011/06/12/list-of-true-169-resolutions/ # #+BEAMER_HEADER: \pgfdeclareimage[height=90mm,width=160mm]{bgd}{swh-world-169.png} #+BEAMER_HEADER: \setbeamertemplate{background}{\pgfuseimage{bgd}} #+LATEX: \addtocounter{framenumber}{-1} * Introduction #+INCLUDE: "../../common/modules/rdc-bio.org::#main" :only-contents t :minlevel 2 * Software Source Code is Precious Knowledge ** What is /Source Code/? #+INCLUDE: "../../common/modules/source-code-different-short.org::#thesourcecode" :only-contents t :minlevel 3 ** Software /Source Code/ matters #+INCLUDE: "../../common/modules/software-all-around-us.org::#softwareisknowledge" :only-contents t :minlevel 3 ** Software /Source Code/ is part of our cultural heritage #+INCLUDE: "../../common/modules/source-code-different-short.org::#softwareisdifferent" :only-contents t :minlevel 3 ** /Source Code/ and the commons #+INCLUDE: "../../common/modules/foss-commons.org::#commonsdef" :only-contents t :minlevel 3 * An Endangered Knowledge ** Spread all around #+INCLUDE: "../../common/modules/swh-motivations.org::#spread" :only-contents t :minlevel 3 ** Fragile \vspace{-.5em} #+INCLUDE: "../../common/modules/swh-motivations.org::#fragile" :only-contents t :minlevel 3 \mbox{}\\ \hfill and what about all the landmark legacy source code that is rotting away? 
** We are at a turning point #+INCLUDE: "../../common/modules/turningpoint.org::#turningpoint" :only-contents t :minlevel 3 * Meet Software Heritage ** Software Heritage in a nutshell \hfill www.softwareheritage.org #+BEAMER: \transdissolve #+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :only-contents t :minlevel 3 ** An international, non profit initiative\hfill built for the long term :PROPERTIES: :CUSTOM_ID: support :END: *** Sharing the vision :B_block: :PROPERTIES: :CUSTOM_ID: endorsement :BEAMER_COL: .5 :BEAMER_env: block :END: #+LATEX: \begin{center}{\includegraphics[width=\extblockscale{.4\linewidth}]{unesco_logo_en_285}}\end{center} #+LATEX: \vspace{-0.8cm} #+LATEX: \begin{center}\vskip 1em \includegraphics[width=\extblockscale{1.4\linewidth}]{support.pdf}\end{center} #+latex: \small And many more ...\\ #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/testimonials} #+BEAMER: \pause *** Donors, members, sponsors :B_block: :PROPERTIES: :CUSTOM_ID: sponsors :BEAMER_COL: .5 :BEAMER_env: block :END: #+LATEX: \begin{center}\includegraphics[width=\extblockscale{.4\linewidth}]{inria-logo-new}\end{center} #+LATEX: \begin{center} # #+LATEX: \includegraphics[width=\extblockscale{.2\linewidth}]{sponsors-levels.pdf} #+LATEX: \colorbox{white}{\includegraphics[width=\extblockscale{1.4\linewidth}]{sponsors.pdf}} #+LATEX: \end{center} # - sponsoring / partnership :: \hfill \url{sponsorship.softwareheritage.org} *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** Research collaboration :B_picblock:noexport: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: picblock :BEAMER_OPT: pic=Qwant_Logo, leftpic=true :END: source code search engine *** See more :noexport: \hfill\tiny\url{http://www.softwareheritage.org/support/testimonials} *** Global network :B_picblock:noexport: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: picblock :BEAMER_OPT: pic=fossid, leftpic=true, width=.3\linewidth :END: - first *independent mirror* -
increased reliability ** The largest software archive, a shared infrastructure #+latex: \begin{center} #+ATTR_LATEX: :width 0.7\linewidth file:SWH-as-foundation-slim.png #+latex: \end{center} #+BEAMER: \pause #+latex: \centering #+ATTR_LATEX: :width \extblockscale{.9\linewidth} file:2021-05-archive-growth.png ** Growing adoption in Academia (selection) #+INCLUDE: "../../common/modules/swh-adoption-academic.org::#adoption" :only-contents t :minlevel 3 ** Intrinsic identifiers (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) :noexport: #+INCLUDE: "../../common/modules/swh-ardc.org::#swh-r" :only-contents t :minlevel 3 ** Software Heritage Identifiers (SWHID) \hfill [[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][link to full docs]] # #+INCLUDE: "../../common/modules/swh-id-syntax.org::#swh-id-syntax" :only-contents t :minlevel 3 #+LATEX: \centering%\forcebeamerstart #+LATEX: \mode{\only<1>{\includegraphics[width=\linewidth]{SWHID-v1.4_1.png}}} #+LATEX: \mode{\only<2>{\includegraphics[width=\linewidth]{SWHID-v1.4_2.png}}} #+LATEX: \only<3->{\includegraphics[width=\linewidth]{SWHID-v1.4_3.png}} #+LATEX: %\forcebeamerend *** An emerging standard :B_block: :PROPERTIES: :BEAMER_act: <4-> :BEAMER_COL: .6 :BEAMER_env: block :END: - in Linux Foundation's [[https://spdx.github.io/spdx-spec/appendix-VI-external-repository-identifiers/#persistent-id][SPDX 2.2]] - IANA registered, WikiData property [[https://www.wikidata.org/wiki/Property:P6138][P6138]] *** Examples: :B_block: :PROPERTIES: :BEAMER_act: <5-> :BEAMER_COL: .4 :BEAMER_env: block :END: - [[https://archive.softwareheritage.org/swh:1:cnt:64582b78792cd6c2d67d35da5a11bb80886a6409;origin=https://github.com/virtualagc/virtualagc;lines=245-261/][Apollo 11 AGC excerpt]], - [[https://archive.softwareheritage.org/swh:1:cnt:bb0faf6919fc60636b2696f32ec9b3c2adb247fe;origin=https://github.com/id-Software/Quake-III-Arena;lines=549-572/][Quake III rsqrt]] * Demo time! 
** Focus on Archive and Reference - The [[https://www.softwareheritage.org/2019/07/20/archiving-and-referencing-the-apollo-source-code/][Apollo 11 AGC source code example]] - Browse [[https://archive.softwareheritage.org][the archive]] - [[https://save.softwareheritage.org][Trigger archival]] of your preferred software in a breeze - Get and use SWHIDs ([[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][full specification available online]]) - Example use in a research article: compare Fig. 1 and conclusions - in [[http://www.dicosmo.org/Articles/2012-DaneluttoDiCosmo-Pcs.pdf][the 2012 version]] - in [[https://www.dicosmo.org/share/parmap_swh.pdf][the updated version]] using SWHIDs and Software Heritage - Cite software [[https://www.softwareheritage.org/2020/05/26/citing-software-with-style/][with the biblatex-software style]] from CTAN # - Example use in a research article: extensive use of SWHIDs in [[https://www.dicosmo.org/Articles/2020-ReScienceC.pdf][a replication experiment]] - Example in a real journal: [[http://www.ipol.im/pub/art/2020/300/][an article from IPOL]] # - Supporting reproducible builds: [[https://www.softwareheritage.org/2019/04/18/software-heritage-and-gnu-guix-join-forces-to-enable-long-term-reproducibility/][Guix]] and [[https://www.softwareheritage.org/2019/04/18/software-heritage-and-gnu-guix-join-forces-to-enable-long-term-reproducibility/][Nix]] ** Recent news, and a lesson to be learned *** Saving 250.000 endangered repositories... - summer 2019: BitBucket announces Mercurial VCS phase-out - fall 2019: Software Heritage teams up with Octobus (funded by NLNet, thanks!) - July 2020: BitBucket erases /250.000/ repositories - August 2020: [[https://bitbucket-archive.softwareheritage.org][bitbucket-archive.softwareheritage.org]] is live #+BEAMER: \pause *** ...
preserving the web of knowledge \hfill (Tweet [[https://twitter.com/gabrielaltay/status/1300218789762662401][is here]] ) :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=bitbucket_swh_praise.png, width=.6\linewidth, leftpic=true :END: \\ *Bottom line*\\ /explicit deposit/ is important, ...\\ \mbox{}\hfill ... and we must promote it...\hfill\mbox{}\\ \mbox{}\hfill ... but it will never be enough.\\ \mbox{}\\ \mbox{}\hfill /(think also of all software dependencies!)/ ** Summing up: a revolutionary infrastructure /designed for source code/ #+latex:\vspace{-0.2em} #+BEGIN_EXPORT latex \begin{center} \includegraphics[width=.4\linewidth]{SWH-logo.pdf} { \url{www.softwareheritage.org}} \end{center} #+END_EXPORT #+latex:\vspace{-0.4em} *** /global/ source code /archive/ \hfill /Library of Alexandria of source code/ :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=clock-spring-forward.png,width=.15\linewidth,leftpic=true :END: + harvest /all/ software source code + /on demand harvesting/ and /curated deposit/ #+latex:\vspace{-0.4em} *** /universal/ intrinsic identifiers \mbox{}\hfill SWHID standard is independent of version control systems #+latex:\vspace{-0.4em} *** /uniform/ data model, /full graph/ of development history \mbox{}\hfill enables large scale, big code research #+BEAMER: \pause #+latex:\vspace{-0.4em} *** /infrastructure/ for Open Science \mbox{} \hfill /base layer/ for software source code in the /Open Science architecture/ * Preserving the history of our software commons ** All the source code!
#+BEAMER: \begin{center}\includegraphics[width=\extblockscale{\linewidth}]{swh-collect-axes}\end{center} ** All the source code: strategy #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{\linewidth}]{swh-collect-strategies}\end{center} ** The \hfill /\href{https://unesdoc.unesco.org/ark:/48223/pf0000371017}{SWHAP}/ process, with UNESCO and University of Pisa # variant of #+INCLUDE: "../../common/modules/swh-acquisition-process.org::#swhap" :only-contents t :minlevel 3 *** Paris Call on Software Source Code “[We call to] support efforts to gather and preserve the artifacts and narratives of the history of computing, while the earlier creators are still alive” #+BEAMER: \pause *** :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.3 :END: #+BEGIN_EXPORT latex \begin{center} \includegraphics[width=\extblockscale{1.1\linewidth}]{SWHAP-cover.pdf} \end{center} #+END_EXPORT *** :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.7 :END: - **Rescue** Legacy Software from different media\\ \mbox{}\\ - **Curate** the code - reconstruct the development history + /Software Heritage - GitHub/ work on fixing git - collect the metadata\\ \mbox{}\\ - **Archive** in Software Heritage *** *UNESCO, UniPi and Software Heritage* collaboration \hfill worldwide scope ** An example: TAUmus, from Pisa (70's) #+INCLUDE: "../../common/modules/swh-acquisition-process.org::#swhaptaumus" :only-contents t :minlevel 3 * The road ahead ** Calling for more History of Computer Science *** Communications of the ACM, February 2021 :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=KnuthHistory.jpg, leftpic=true, width=.3\textwidth :END: /"Telling historical stories is the best way to teach. It's much easier to understand something if you know the threads it is connected to."/ \mbox{}\\ \mbox{}\\ \mbox{}\hfill /Let's Not Dumb Down the History of Computer Science/\\ \mbox{}\hfill Donald E. 
Knuth, Len Shustek\\ \mbox{}\hfill https://doi.org/10.1145/3442377 #+BEAMER: \pause *** A unique opportunity - most of the creators are still here: we can talk to them! - Software Heritage provides a key infrastructure for software historians #+BEAMER: \pause *** Stay tuned \hfill /great news for software stories in a few months/... ** You may help! *** Foster adoption and best practices + [[https://www.softwareheritage.org/save-and-reference-research-software/][archive and reference relevant source code]] (save code now, and [[https://hal.inria.fr/hal-01872189][deposit]]), + [[https://www.softwareheritage.org/swhap/][rescue and preserve landmark legacy source code]] with SWHAP, + use Software Heritage in articles, journals, and books #+BEAMER: \pause *** Engage with Software Heritage as an individual - the [[https://www.softwareheritage.org/ambassadors/][ambassador program]] is open: + learn directly from the core team, share with your community #+BEAMER: \pause *** Engage with Software Heritage as an organization - join [[https://www.softwareheritage.org/support/sponsors/][as a member/sponsor]] - contribute to the preservation mission ** Thank you! #+latex: \centering{\huge\bf Questions?} \vspace{-.5em} *** References #+BEGIN_EXPORT latex \begin{thebibliography}{Foo Bar, 1969} \footnotesize \bibitem{SwForumEu2021} R. Di Cosmo, \emph{A revolutionary infrastructure for Open Source}, 2021, EU Software Forum \href{https://annex.softwareheritage.org/public/talks/2021/2021-03-24-SwForum.pdf}{(slides)} \href{https://youtu.be/AwY527kDMfM?t=178}{(video)} \bibitem{EOSCSirs2020} EOSC SIRS Task Force, \emph{Scholarly Infrastructures for Research Software} \newblock 2020, European Commission, \href{https://doi.org/10.2777/28598}{(10.2777/28598)} \bibitem{DiCosmo2020d} R. 
Di Cosmo, \emph{Archiving and Referencing Source Code with Software Heritage} \newblock ICMS 2020 \href{https://dx.doi.org/10.1007/978-3-030-52200-1_36}{(10.1007/978-3-030-52200-1\_36)} \bibitem{DiCosmo2019} R. Di Cosmo, M. Gruenpeter, S. Zacchiroli\newblock \emph{Referencing Source Code Artifacts: a Separate Concern in Software Citation},\newblock CiSE 2020 \href{https://dx.doi.org/10.1109/MCSE.2019.2963148}{(10.1109/MCSE.2019.2963148)} \href{https://hal.archives-ouvertes.fr/hal-02446202}{(hal-02446202)} \bibitem{alliez:hal-02135891} P. Alliez, R. Di Cosmo, B. Guedj, A. Girault, M.-S. Hacid, A. Legrand and N. Rougier\newblock \emph{Attributing and referencing (research) software: Best practices and outlook from Inria}, \newblock CiSE 2020 \href{https://doi.ieeecomputersociety.org/10.1109/MCSE.2019.2949413}{(10.1109/MCSE.2019.2949413)} \href{https://hal.archives-ouvertes.fr/hal-02135891}{(hal-02135891)} \bibitem{Abramatic2018} J.F. Abramatic, R. Di Cosmo, S. Zacchiroli, \emph{Building the Universal Archive of Source Code}, \newblock CACM, October 2018 \href{https://doi.org/10.1145/3183558}{(10.1145/3183558)} \end{thebibliography} #+END_EXPORT * Appendix :B_appendix:noexport: :PROPERTIES: :BEAMER_env: appendix :END: ** \vfill \centerline{\Huge Appendix} \vfill * SWHIDs by the example :noexport: ** A word on the trust model for systems of identifiers \vspace{-5pt} *** Two general classes of systems of identifiers - intrinsic :: /computed/ from the object /(no registry required, fully decentralised)/\\ /(e.g.: chemical notation, music notation, hashes, SWHIDs)/\pause - extrinsic :: /assigned/ by an authority /(need a registry)/\\ /(e.g.: passport number, DOI, ARK, RRID, etc.)/\pause \mbox{}\hfill See [[https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/][the dedicated blog post]] for more details #+BEAMER: \pause *** Trust model, extrinsic (e.g. 
DOIs) :B_block: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: #+ATTR_LATEX: :width \linewidth file:doi-vs-pid-1.pdf #+BEAMER: \pause *** Trust model, intrinsic (e.g. SWHIDs) :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .45 :END: #+ATTR_LATEX: :width .8\linewidth file:doi-vs-pid-3.pdf *** Trust model for DOIs with checksums :B_block:noexport: :PROPERTIES: :BEAMER_COL: .5 :BEAMER_env: block :END: #+ATTR_LATEX: :width \linewidth file:doi-vs-pid-2.pdf *** :B_ignoreheading:noexport: :PROPERTIES: :BEAMER_env: ignoreheading :END: ** A worked example #+LATEX: \centering\forcebeamerstart #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_1.pdf}}} #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/contents.pdf}}} #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_2_contents.pdf}}} #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/directories.pdf}}} #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_3_directories.pdf}}} #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/revisions.pdf}}} #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_4_revisions.pdf}}} #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/releases.pdf}}} #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_5_releases.pdf}}} #+LATEX: \only<10>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/snapshots.pdf}}} #+LATEX: \forcebeamerend * Big code :noexport: ** Software Heritage for Research and Innovation *** Reference platform for /Big Code/ :B_picblock: :PROPERTIES: :BEAMER_opt: pic=universal, leftpic=true, 
width=.2\linewidth :BEAMER_env: picblock :BEAMER_act: :END: - unique *observatory* of all software development - *big data, machine learning* paradise: classification, trends, coding patterns, code completion... #+BEAMER: \pause *** First datasets are available! - full graph of software development (~20Bn nodes, ~200Bn edges) see Pietri, Spinellis, Zacchiroli, MSR 2019 https://dx.doi.org/10.1109/MSR.2019.00030 - MSR 2020 mining competition see https://2020.msrconf.org/track/msr-2020-mining-challenge#Call-for-Papers * News :noexport: ** Milestones :noexport: #+INCLUDE: "../../common/modules/swh-key-dates.org::#keydates" :minlevel 3 :only-contents t ** News : SWHAP *** Paris Call on Software Source Code “[We call to] support efforts to gather and preserve the artifacts and narratives of the history of computing, while the earlier creators are still alive” #+BEAMER: \pause *** SWHAP : an important step forward - detailed guidelines to *curate* landmark legacy source code and *archive* it on Software Heritage - intense cooperation with *Università di Pisa* and *UNESCO* - open to all, we'll promote it worldwide *** https://www.softwareheritage.org/swhap ** News : archiving /public/ code #+latex: \begin{center} #+ATTR_LATEX: :width 0.7\linewidth file:codeetalab.png #+latex: \end{center} #+BEAMER: \pause https://code.etalab.gouv.fr ** News : ENEA mirror *** Thomas Jefferson, February 18, 1791 :B_block: :PROPERTIES: :BEAMER_ACT: :BEAMER_env: block :END: #+latex: {\em ...let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident. 
#+latex: } #+BEAMER: \pause *** Welcoming ENEA :B_block: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=LogoENEAcompletoENG.png, leftpic=true, width=.7\linewidth :END: - first *institutional* mirror - increased resilience - *AI infrastructure* for researchers - stepping stone to \endgraf \hfill a European joint effort