diff --git a/talks-public/2021-10-19-telecom-idia/2021-10-19-telecom-idia.org b/talks-public/2021-10-19-telecom-idia/2021-10-19-telecom-idia.org new file mode 100644 index 0000000..37fca92 --- /dev/null +++ b/talks-public/2021-10-19-telecom-idia/2021-10-19-telecom-idia.org @@ -0,0 +1,331 @@ +#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) +#+TITLE: Software Heritage +#+SUBTITLE: A research platform for large-scale source code archival +#+BEAMER_HEADER: \date[2021-10-19, IDIA]{19 Oct 2021\\IDIA Prototype/Software/Platform Day\\Télécom Paris} +#+AUTHOR: Stefano Zacchiroli +#+DATE: 19 Oct 2021 +#+EMAIL: zack@upsilon.cc + +#+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 +#+INCLUDE: "../../common/modules/169.org" +#+BEAMER_HEADER: \institute[Télécom Paris]{Télécom Paris --- {\tt zack@upsilon.cc, @zacchiro}} +#+BEAMER_HEADER: \author{Stefano Zacchiroli} + +* About me :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: + #+INCLUDE: "this/zack.org" :minlevel 2 +* Why we must preserve the history of software source code +** Software /source code/ is precious knowledge + #+INCLUDE: "../../common/modules/source-code-different-short.org::#softwareisdifferent" :only-contents t :minlevel 3 +** Calling for source code preservation: UNESCO +*** :B_column:BMCOL: + :PROPERTIES: + :BEAMER_col: .53 + :BEAMER_env: column + :END: + #+ATTR_LATEX: :width .7\linewidth + file:UNESCOParisCallMeeting.png + UNESCO, Inria, Software Heritage invite\\ + [[https://en.unesco.org/news/experts-call-greater-recognition-software-source-code-heritage-sustainable-development][40 international experts meet in Paris]] ... 
+ #+BEAMER: \pause +*** :B_column:BMCOL: + :PROPERTIES: + :BEAMER_col: .5 + :BEAMER_env: column + :END: + #+ATTR_LATEX: :width .65\linewidth + file:paris_call_ssc_cover.jpg + [[https://en.unesco.org/foss/paris-call-software-source-code][The call was published in Feb 2019]]\pause +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +*** + :PROPERTIES: + :BEAMER_COL: 1.06 + :BEAMER_env: block + :END: + “[We call to] support efforts to gather and preserve the artifacts and + narratives of the history of computing, while the earlier creators are still + alive” + + https://en.unesco.org/foss/paris-call-software-source-code + +** Source code history --- for open science + #+INCLUDE: "../../common/modules/swh-ardc.org::#pillaropenscience" :only-contents t :minlevel 3 +*** + \hfill Preserving the history of source code is important for /reproducibility/ +** Source code history --- for security and transparency +#+LATEX: \vspace{-.5em} +*** Where does reused software come from? :B_block: + :PROPERTIES: + :BEAMER_env: block + :BEAMER_COL: .5 + :END: +#+BEGIN_EXPORT latex +\begin{center} +\includegraphics[width=.7\linewidth]{myriadsources} +\end{center} +#+END_EXPORT +#+BEAMER: \pause +*** Do /you/ know where it comes from? :B_block: + :PROPERTIES: + :BEAMER_env: block + :BEAMER_COL: .4 + :END: + - the software you ship + - the software you use + - the software you acquire + - the software that + + has that bug + + has that vulnerability +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +#+BEAMER: \pause +*** KYSW: Know Your SoftWare :B_picblock: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_OPT: pic=executiveorder.jpg,width=.4\linewidth,leftpic=true + :END: + Like KYC in banking, KYSW is now essential all over IT...\\ + \mbox{}\\ + *Sec. 4. 
Enhancing Software Supply Chain Security* \\ + \hfill /ensuring and attesting, to the extent practicable, to the integrity and provenance of open source software/\\ + \mbox{}\hfill [[https://www.whitehouse.gov/briefing-room/presidential-actions/2021/05/12/executive-order-on-improving-the-nations-cybersecurity/][May 2021 POTUS Executive Order]] + +** Fragile + #+INCLUDE: "../../common/modules/swh-motivations.org::#fragile" :only-contents t :minlevel 3 +* How we can preserve our software heritage +** Software Heritage in a nutshell \hfill www.softwareheritage.org + #+BEAMER: \transdissolve + #+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :only-contents t :minlevel 3 +** The largest public source code archive, principled \hfill \small \url{bit.ly/swhpaper} +*** + :PROPERTIES: + :BEAMER_env: block + :BEAMER_col: 0.5 + :END: + #+latex: \centering + #+ATTR_LATEX: :width \linewidth + file:SWH-as-foundation-slim.png +*** + :PROPERTIES: + :BEAMER_env: block + :BEAMER_col: 0.5 + :END: + #+latex: \centering + #+ATTR_LATEX: :width \linewidth + file:2021-09-archive-growth.png\\ + [[https://archive.softwareheritage.org][archive.softwareheritage.org]] +*** linebreak :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +#+BEAMER: \pause +*** Technology + :PROPERTIES: + :BEAMER_env: block + :BEAMER_col: 0.34 + :END: + - transparency and FOSS + - replicas all the way down +*** Content (billions!) 
+ :PROPERTIES: + :BEAMER_env: block + :BEAMER_col: 0.32 + :END: + - intrinsic identifiers + - facts and provenance +*** Organization + :PROPERTIES: + :BEAMER_env: block + :BEAMER_col: 0.33 + :END: + - non-profit + - multi-stakeholder +** A peek under the hood: a global view on the software commons + #+BEAMER: \begin{center} + #+BEAMER: \mode{\only<1>{\includegraphics[width=\extblockscale{.9\textwidth}]{swh-dataflow-merkle-listers.pdf}}} + #+BEAMER: \only<2>{\includegraphics[width=\extblockscale{.9\textwidth}]{swh-dataflow-merkle.pdf}} + #+BEAMER: \end{center} + #+BEAMER: \pause +*** + A *global graph* linking together fully *deduplicated* source code artifacts + (files, commits, directories, releases, etc.) to the places that distribute + them (e.g., Git repositories), providing a *unified view* on the entire + */Software Commons/*. + + Size: *~20 B* nodes, *~200 B* edges, *~600 TB* (uncompressed) blobs + # - *GitHub*, Gitlab.com, Bitbucket, /Gitorious/, /GoogleCode/, GNU, PyPi, Debian, NPM... 
+** Software Heritage /intrinsic/ Identifiers (SWHID) \hfill [[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][(full spec)]] + #+LATEX: \centering%\forcebeamerstart + #+LATEX: \mode{\only<1>{\includegraphics[width=\linewidth]{SWHID-v1.4_1.png}}} + #+LATEX: \mode{\only<2>{\includegraphics[width=\linewidth]{SWHID-v1.4_2.png}}} + #+LATEX: \only<3->{\includegraphics[width=\linewidth]{SWHID-v1.4_3.png}} + #+LATEX: %\forcebeamerend +*** An emerging standard :B_block: + :PROPERTIES: + :BEAMER_act: <4-> + :BEAMER_COL: .6 + :BEAMER_env: block + :END: + - in Linux Foundation's [[https://spdx.github.io/spdx-spec/appendix-VI-external-repository-identifiers/#persistent-id][SPDX 2.2]] + - IANA registered, WikiData property [[https://www.wikidata.org/wiki/Property:P6138][P6138]] +*** Examples: :B_block: + :PROPERTIES: + :BEAMER_act: <5-> + :BEAMER_COL: .4 + :BEAMER_env: block + :END: + - [[https://archive.softwareheritage.org/swh:1:cnt:64582b78792cd6c2d67d35da5a11bb80886a6409;origin=https://github.com/virtualagc/virtualagc;lines=245-261/][Apollo 11 AGC excerpt]] + - [[https://archive.softwareheritage.org/swh:1:cnt:bb0faf6919fc60636b2696f32ec9b3c2adb247fe;origin=https://github.com/id-Software/Quake-III-Arena;lines=549-572/][Quake III rsqrt]] +* Demo time! +** A walkthrough + - Browse [[https://archive.softwareheritage.org][the archive]] + - [[https://save.softwareheritage.org][Trigger archival]] of your preferred software in a breeze + - Get and use SWHIDs ([[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][full spec available online]]) + - The [[https://www.softwareheritage.org/2019/07/20/archiving-and-referencing-the-apollo-source-code/][Apollo 11 AGC source code example]] + - Cite software [[https://www.softwareheritage.org/2020/05/26/citing-software-with-style/][with the biblatex-software style]] from CTAN + - Example use in a research article: compare Fig. 
1 and conclusions + - in [[http://www.dicosmo.org/Articles/2012-DaneluttoDiCosmo-Pcs.pdf][the 2012 version]] + - in [[https://www.dicosmo.org/share/parmap_swh.pdf][the updated version]] using SWHIDs and Software Heritage +# - Example use in a research article: extensive use of SWHIDs in [[https://www.dicosmo.org/Articles/2020-ReScienceC.pdf][a replication experiment]] + - Example in a journal: [[http://www.ipol.im/pub/art/2020/300/][an article from IPOL]] + - [[https://doc.archives-ouvertes.fr/en/deposit/deposit-software-source-code/][Curated deposit in SWH via HAL]], see for example: + [[https://hal.archives-ouvertes.fr/hal-02130801][LinBox]], [[https://hal.archives-ouvertes.fr/hal-01897934][SLALOM]], [[https://hal.archives-ouvertes.fr/hal-02130729][Givaro]], [[https://hal.archives-ouvertes.fr/hal-02137040][NS2DDV]], [[https://hal.archives-ouvertes.fr/lirmm-02136558][SumGra]], [[https://hal.archives-ouvertes.fr/hal-02155786][Coq proof]], ... + - Rescue landmark legacy software, see the [[https://www.softwareheritage.org/swhap/][SWHAP process with UNESCO]] +* Scientific challenges +** A revolutionary research infrastructure designed for source code + #+INCLUDE: "../../common/modules/swh-as-infrastructure.org::#oneslide" :only-contents t :minlevel 3 +** A challenging scientific and technical undertaking +*** A novel, large infrastructure + - gigantic Merkle graph + - object storage [[https://www.softwareheritage.org/2021/03/11/towards-a-next-generation-object-storage-for-software-heritage/][with peculiar workload]] + - simple problems become hard, e.g., counting tens of billions of objects, + or sorting all possible origins of a node + # - and much more: see [[https://www.softwareheritage.org/2021/04/08/swh-2021-technical-roadmap/][the 2021 technical roadmap]] + #+BEAMER: \pause \vspace{-1mm} +*** First dataset available as open data + #+BEGIN_EXPORT latex + \begin{thebibliography}{Foo Bar, 1969} + \footnotesize + \bibitem{Pietri2019} Antoine Pietri, Diomidis 
Spinellis, Stefano Zacchiroli\newblock + The Software Heritage Graph Dataset: Public software development under one roof\newblock + MSR 2019: 16th Intl. Conf. on Mining Software Repositories. IEEE\newblock + preprint: \url{http://deb.li/swhmsr19} + \end{thebibliography} + \vspace{-1mm} + #+END_EXPORT + - used as topic for the MSR 2020 mining competition + #+BEAMER: \pause \vspace{-1mm} +*** + → for more, see last week's talk with the DIG team: *Analyzing the Global + Graph of Public Software Development* + [[https://upsilon.cc/~zack/talks/2021/2021-10-14-telecom-dig.pdf][upsilon.cc/~zack/talks/2021/2021-10-14-telecom-dig.pdf]] + +* Preserving our software commons: the present and the future +** Focus on Academia: growing adoption (selection) + #+INCLUDE: "../../common/modules/swh-adoption-academic.org::#adoption" :only-contents t :minlevel 3 +** An international, non profit initiative\hfill built for the long term + :PROPERTIES: + :CUSTOM_ID: support + :END: +*** Sharing the vision :B_block: + :PROPERTIES: + :CUSTOM_ID: endorsement + :BEAMER_COL: .5 + :BEAMER_env: block + :END: + #+LATEX: \begin{center}{\includegraphics[width=\extblockscale{.4\linewidth}]{unesco_logo_en_285}}\end{center} + #+LATEX: \vspace{-0.8cm} + #+LATEX: \begin{center}\vskip 1em \includegraphics[width=\extblockscale{1.4\linewidth}]{support.pdf}\end{center} + #+latex: \small And many more ...\\ + #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/testimonials} +#+BEAMER: \pause +*** Donors, members, sponsors :B_block: + :PROPERTIES: + :CUSTOM_ID: sponsors + :BEAMER_COL: .5 + :BEAMER_env: block + :END: + #+LATEX: \begin{center}\includegraphics[width=\extblockscale{.4\linewidth}]{inria-logo-new}\end{center} + #+LATEX: \begin{center} + # #+LATEX: \includegraphics[width=\extblockscale{.2\linewidth}]{sponsors-levels.pdf} + #+LATEX: \colorbox{white}{\includegraphics[width=\extblockscale{1.4\linewidth}]{sponsors.pdf}} + #+LATEX: \end{center} +# - sponsoring / partnership :: \hfill 
\url{sponsorship.softwareheritage.org} +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +*** Research collaboration :B_picblock:noexport: + :PROPERTIES: + :BEAMER_COL: .5 + :BEAMER_env: picblock + :BEAMER_OPT: pic=Qwant_Logo, leftpic=true + :END: + source code search engine +*** See more :noexport: + \hfill\tiny\url{http://www.softwareheritage.org/support/testimonials} +*** Global network :B_picblock:noexport: + :PROPERTIES: + :BEAMER_COL: .5 + :BEAMER_env: picblock + :BEAMER_OPT: pic=fossid, leftpic=true, width=.3\linewidth + :END: + - first *independent mirror* + - increased reliability +** You may help! +*** Foster the adoption of research best practices + - [[https://www.softwareheritage.org/save-and-reference-research-software/][archive and reference relevant source code]] (save code now, and [[https://hal.inria.fr/hal-01872189][deposit]]) + - use Software Heritage and [[https://www.softwareheritage.org/2020/05/26/citing-software-with-style/][biblatex-software]] in articles, journals, and books + - [[https://www.softwareheritage.org/swhap/][rescue and preserve landmark legacy source code]] with SWHAP +#+BEAMER: \pause +*** Engage with Software Heritage as a researcher + - use the archive for your own software-related experiments + - work with us to tackle open technical and research problems + #+BEAMER: \pause +*** Engage with Software Heritage as an organization + - become [[https://www.softwareheritage.org/support/sponsors/][a member/sponsor]] + - build a Software Heritage mirror + - contribute to the preservation mission +** Thank you! 
+ #+BEAMER: \vspace{-1mm} +*** Resources + - archive :: [[https://archive.softwareheritage.org/][archive.softwareheritage.org]] + - stay posted :: [[https://www.softwareheritage.org/newsletter/][softwareheritage.org/newsletter]] + - blog :: [[https://www.softwareheritage.org/blog/][softwareheritage.org/blog]] + #+BEAMER: \vspace{-2mm} +*** References (selected; full list at [[https://www.softwareheritage.org/publications][softwareheritage.org/publications]]) + #+BEGIN_EXPORT latex + \begin{thebibliography}{Foo Bar, 1969} + \scriptsize \vspace{-2mm} + % \bibitem{DiCosmo2017} Roberto Di Cosmo, Stefano Zacchiroli + % \newblock Software Heritage: Why and How to Preserve Software Source Code + % \newblock iPRES 2017: Intl. Conf. on Digital Preservation + \bibitem{Abramatic2018} Jean-François Abramatic, Roberto Di Cosmo, Stefano Zacchiroli + \newblock Building the Universal Archive of Source Code + \newblock Communications of the ACM, October 2018 + \bibitem{Pietri2020c} Antoine Pietri, Diomidis Spinellis, Stefano Zacchiroli + \newblock The Software Heritage Graph Dataset: Large-scale Analysis of Public Software Development History + \newblock MSR 2020: 17th Intl. Conf. on Mining Software Repositories. 
IEEE + \bibitem{DiCosmo2020d} Roberto Di Cosmo + \newblock Archiving and Referencing Source Code with Software Heritage + \newblock International Congress on Mathematical Software (ICMS), 2020 + \bibitem{PNSO2} MESRI + \newblock Second French Plan for Open Science + \newblock \href{https://www.ouvrirlascience.fr/second-national-plan-for-open-science}{www.ouvrirlascience.fr/second-national-plan-for-open-science}, 2021 + \end{thebibliography} + #+END_EXPORT +* Appendix :B_appendix:noexport: + :PROPERTIES: + :BEAMER_env: appendix + :END: +** + \vfill + \centerline{\Huge Appendix} + \vfill diff --git a/talks-public/2021-10-19-telecom-idia/Makefile b/talks-public/2021-10-19-telecom-idia/Makefile new file mode 100644 index 0000000..68fbee7 --- /dev/null +++ b/talks-public/2021-10-19-telecom-idia/Makefile @@ -0,0 +1 @@ +include ../Makefile.slides diff --git a/talks-public/2021-10-19-telecom-idia/this/zack.org b/talks-public/2021-10-19-telecom-idia/this/zack.org new file mode 100644 index 0000000..2a01ae4 --- /dev/null +++ b/talks-public/2021-10-19-telecom-idia/this/zack.org @@ -0,0 +1,13 @@ + +** About me + - Professor, Télécom Paris, ACES team (newbie!) + - Free/Open Source Software activist (20+ years) + - Debian Developer & Former 3x Debian Project Leader + - Former Open Source Initiative (OSI) director + - Software Heritage co-founder & CTO + +*** Research interests + - Software engineering, Free/Open Source Software (FOSS) + - Digital commons + - Computer security + - Software supply chain