diff --git a/talks-public/2022-03-16-Bologna/2022-03-16-Bologna.org b/talks-public/2022-03-16-Bologna/2022-03-16-Bologna.org new file mode 100644 index 0000000..62b9ca4 --- /dev/null +++ b/talks-public/2022-03-16-Bologna/2022-03-16-Bologna.org @@ -0,0 +1,598 @@ +#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) +#+TITLE: Why we must preserve the world's software history, and how we can do it. +# #+AUTHOR: Roberto Di Cosmo +# #+EMAIL: roberto@dicosmo.org @rdicosmo @swheritage +#+BEAMER_HEADER: \date[16/03/2022]{16/03/2022\\Convegno sul Software, Bologna 2022} +#+BEAMER_HEADER: \title[Preserving software history~~~~ www.softwareheritage.org]{Why we must preserve the world's software history, and how we can do it.} +#+BEAMER_HEADER: \author[R. Di Cosmo~~~~ roberto@dicosmo.org ~~~~ (CC-BY 4.0)]{Roberto Di Cosmo} +#+BEAMER_HEADER: \institute[Software Heritage]{Director, Software Heritage\\Inria and Universit\'e de Paris} +# #+BEAMER_HEADER: \setbeameroption{show notes on second screen} +#+BEAMER_HEADER: \setbeameroption{hide notes} +#+KEYWORDS: software heritage legacy preservation knowledge mankind technology +#+LATEX_HEADER: \usepackage{tcolorbox} +#+LATEX_HEADER: \definecolor{links}{HTML}{2A1B81} +#+LATEX_HEADER: \hypersetup{colorlinks,linkcolor=,urlcolor=links} + +# +# prelude.org contains all the information needed to export the main beamer latex source +# use prelude-toc.org to get the table of contents +# + +#+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 + + +#+INCLUDE: "../../common/modules/169.org" + +# +LaTeX_CLASS_OPTIONS: [aspectratio=169,handout,xcolor=table] + +#+LATEX_HEADER: \usepackage{bbding} +#+LATEX_HEADER: \DeclareUnicodeCharacter{66D}{\FiveStar} + +# +# If you want to change the title logo it's here +# +# +BEAMER_HEADER: \titlegraphic{\includegraphics[width=0.5\textwidth]{SWH-logo}} + +# aspect ratio can be changed, but the slides need to be 
adapted +# - compute a "resizing factor" for the images (macro for picblocks?) +# +# set the background image +# +# https://pacoup.com/2011/06/12/list-of-true-169-resolutions/ +# +#+BEAMER_HEADER: \pgfdeclareimage[height=90mm,width=160mm]{bgd}{swh-world-169.png} +#+BEAMER_HEADER: \setbeamertemplate{background}{\pgfuseimage{bgd}} +#+LATEX: \addtocounter{framenumber}{-1} +* Introduction +#+INCLUDE: "../../common/modules/rdc-bio.org::#main" :only-contents t :minlevel 2 +* Why we must preserve the history of Software Source Code +** Software is all around us :noexport: + #+INCLUDE: "../../common/modules/software-all-around-us.org::#softwareispervasive" :only-contents t :minlevel 3 +** Focus on the /Source Code/ :noexport: + #+INCLUDE: "../../common/modules/source-code-different-short.org::#thesourcecode" :only-contents t :minlevel 3 +** Software /Source Code/ is Precious Knowledge + #+INCLUDE: "../../common/modules/source-code-different-short.org::#softwareisdifferent" :only-contents t :minlevel 3 +** Calling for preservation: UNESCO +*** :B_column:BMCOL: + :PROPERTIES: + :BEAMER_col: .53 + :BEAMER_env: column + :END: + #+ATTR_LATEX: :width .7\linewidth + file:UNESCOParisCallMeeting.png + UNESCO, Inria, Software Heritage invite\\ + [[https://en.unesco.org/news/experts-call-greater-recognition-software-source-code-heritage-sustainable-development][40 international experts meet in Paris]] ... 
+ #+BEAMER: \pause +*** :B_column:BMCOL: + :PROPERTIES: + :BEAMER_col: .5 + :BEAMER_env: column + :END: + #+ATTR_LATEX: :width .65\linewidth + file:paris_call_ssc_cover.jpg + [[https://en.unesco.org/foss/paris-call-software-source-code][The call was published in Feb 2019]]\pause +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +*** + :PROPERTIES: + :BEAMER_COL: 1.06 + :BEAMER_env: block + :END: + “[We call to] support efforts to gather and preserve the artifacts and + narratives of the history of computing, while the earlier creators are still + alive” + \vspace{10pt} + \hfill https://en.unesco.org/foss/paris-call-software-source-code \hfill\mbox{} + +** Calling for preservation: Donald Knuth and Len Shustek +*** Communications of the ACM, February 2021 :B_picblock: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_OPT: pic=KnuthHistory.jpg, leftpic=true, width=.3\textwidth + :END: + /"Telling historical stories is the best way to teach. It's much easier to understand something if you know the threads it is connected to."/ + \mbox{}\\ + \mbox{}\\ + \mbox{}\hfill /Let's Not Dumb Down the History of Computer Science/\\ + \mbox{}\hfill Donald E. Knuth, Len Shustek\\ + \mbox{}\hfill https://doi.org/10.1145/3442377 +#+BEAMER: \pause +*** A unique opportunity + most of the creators are still here: we can talk to them!\\ + \hfill but the clock is ticking... 
+# - Software Heritage provides a key infrastructure for software historians +** /Source Code/ and the commons :noexport: + #+INCLUDE: "../../common/modules/foss-commons.org::#commonsdef" :only-contents t :minlevel 3 +** Source code history for Open Science + #+INCLUDE: "../../common/modules/swh-ardc.org::#pillaropenscience" :only-contents t :minlevel 3 +*** + \hfill Preserving the history of source code is important for /reproducibility/ +** Source code is /special/ (software is /not/ data) :noexport: + # Was: #+INCLUDE: "../../common/modules/swh-ardc.org::#swnotdata" :only-contents t :minlevel 3 +*** /Executable/ and /human readable/ knowledge \hfill copyright law :noexport: + /“Programs must be written for people to read, and only incidentally for machines to execute.”/\\ + \hfill Harold Abelson +#+BEAMER: \pause +*** Software /evolves/ over time + - projects may last decades + - the /development history/ is key to its /understanding/ +#+BEAMER: \pause +*** Complexity :B_picblock: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_OPT: pic=python3-matplotlib.pdf, width=.6\linewidth + :END: + - /millions/ of lines of code + - large /web of dependencies/ + + easy to break, difficult to maintain + + /research software/ a thin top layer + - sophisticated /developer communities/ +#+BEAMER: \pause +*** The human side + design, algorithm, code, test, documentation, community, funding, and so many more facets ... +** Source code history for Security and Transparency +#+LATEX: \vspace{-.5em} +*** Where does reused software come from? :B_block: + :PROPERTIES: + :BEAMER_env: block + :BEAMER_COL: .5 + :END: +#+BEGIN_EXPORT latex +\begin{center} +\includegraphics[width=.7\linewidth]{myriadsources} +\end{center} +#+END_EXPORT +#+BEAMER: \pause +*** Do /you/ know where it comes from? 
:B_block: + :PROPERTIES: + :BEAMER_env: block + :BEAMER_COL: .4 + :END: + - the software you ship + - the software you use + - the software you acquire + - the software that + + has that bug + + has that vulnerability +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +#+BEAMER: \pause +*** KYSW: Know Your SoftWare :B_picblock: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_OPT: pic=executiveorder.jpg,width=.4\linewidth,leftpic=true + :END: + Like KYC in banking, KYSW is now essential all over IT...\\ + \mbox{}\\ + *Sec. 4. Enhancing Software Supply Chain Security* \\ + \hfill /ensuring and attesting, to the extent practicable, to the integrity and provenance of open source software/\\ + \mbox{}\hfill [[https://www.whitehouse.gov/briefing-room/presidential-actions/2021/05/12/executive-order-on-improving-the-nations-cybersecurity/][May 2021 POTUS Executive Order]] + +* An Endangered Knowledge +** Spread all around :noexport: + #+INCLUDE: "../../common/modules/swh-motivations.org::#spread" :only-contents t :minlevel 3 +** Fragile + \vspace{-.5em} + #+INCLUDE: "../../common/modules/swh-motivations.org::#fragile" :only-contents t :minlevel 3 + \mbox{}\\ + \hfill and what about all the landmark legacy source code that is rotting away? 
+** We are at a turning point + #+INCLUDE: "../../common/modules/turningpoint.org::#turningpoint" :only-contents t :minlevel 3 +* How we can preserve our Software Heritage +** Software Heritage in a nutshell \hfill www.softwareheritage.org +#+BEAMER: \transdissolve +#+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :only-contents t :minlevel 3 +** Largest software archive, principled \hfill \url{http://bit.ly/swhpaper} + #+latex: \begin{center} + #+ATTR_LATEX: :width 0.5\linewidth + file:SWH-as-foundation-slim.png + #+latex: \end{center} + #+BEAMER: \pause + #+latex: \centering + #+ATTR_LATEX: :width \extblockscale{.6\linewidth} + file:archive-growth.png + #+BEAMER: \pause +*** Technology + :PROPERTIES: + :BEAMER_col: 0.34 + :BEAMER_env: block + :END: + - transparency and FOSS + - replicas all the way down +*** Content (billions!) + :PROPERTIES: + :BEAMER_col: 0.32 + :BEAMER_env: block + :END: + - *intrinsic identifiers* + - facts and provenance +*** Organization + :PROPERTIES: + :BEAMER_col: 0.33 + :BEAMER_env: block + :END: + - non-profit + - multi-stakeholder +** A peek under the hood: a universal archive + #+BEAMER: \begin{center} + #+BEAMER: \mode{\only<1>{\includegraphics[width=\extblockscale{1\textwidth}]{swh-dataflow-merkle-listers.pdf}}} + #+BEAMER: \only<2-3>{\includegraphics[width=\extblockscale{1\textwidth}]{swh-dataflow-merkle.pdf}} + #+BEAMER: \end{center} +#+BEAMER: \pause +#+BEAMER: \pause + /Global development history/ permanently archived in a /unique/ git-like Merkle DAG + - *~400 TB* (uncompressed) blobs, *~20 B* nodes, *~300 B* edges + # - *GitHub*, Gitlab.com, Bitbucket, /Gitorious/, /GoogleCode/, GNU, PyPi, Debian, NPM... 
+** Software Heritage /intrinsic/ Identifiers (SWHID) \hfill [[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][link to full docs]] +# #+INCLUDE: "../../common/modules/swh-id-syntax.org::#swh-id-syntax" :only-contents t :minlevel 3 + #+LATEX: \centering%\forcebeamerstart + #+LATEX: \mode{\only<1>{\includegraphics[width=\linewidth]{SWHID-v1.4_1.png}}} + #+LATEX: \mode{\only<2>{\includegraphics[width=\linewidth]{SWHID-v1.4_2.png}}} + #+LATEX: \only<3->{\includegraphics[width=\linewidth]{SWHID-v1.4_3.png}} + #+LATEX: %\forcebeamerend +*** An emerging standard :B_block: + :PROPERTIES: + :BEAMER_act: <4-> + :BEAMER_COL: .6 + :BEAMER_env: block + :END: + - in Linux Foundation's [[https://spdx.github.io/spdx-spec/appendix-VI-external-repository-identifiers/#persistent-id][SPDX 2.2]] + - IANA registered, WikiData property [[https://www.wikidata.org/wiki/Property:P6138][P6138]] +*** Examples: :B_block: + :PROPERTIES: + :BEAMER_act: <5-> + :BEAMER_COL: .4 + :BEAMER_env: block + :END: + - [[https://archive.softwareheritage.org/swh:1:cnt:64582b78792cd6c2d67d35da5a11bb80886a6409;origin=https://github.com/virtualagc/virtualagc;lines=245-261/][Apollo 11 AGC excerpt]], + - [[https://archive.softwareheritage.org/swh:1:cnt:bb0faf6919fc60636b2696f32ec9b3c2adb247fe;origin=https://github.com/id-Software/Quake-III-Arena;lines=549-572/][Quake III rsqrt]] +* Demo time! 
+** A walkthrough + - Browse [[https://archive.softwareheritage.org][the archive]] + - [[https://save.softwareheritage.org][Trigger archival]] of your preferred software in a breeze + - Get and use SWHIDs ([[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][full specification available online]]) + - The [[https://www.softwareheritage.org/2019/07/20/archiving-and-referencing-the-apollo-source-code/][Apollo 11 AGC source code example]] + - Cite software [[https://www.softwareheritage.org/2020/05/26/citing-software-with-style/][with the biblatex-software style]] from CTAN + - Example use in a research article: compare Fig. 1 and conclusions + - in [[http://www.dicosmo.org/Articles/2012-DaneluttoDiCosmo-Pcs.pdf][the 2012 version]] + - in [[https://www.dicosmo.org/share/parmap_swh.pdf][the updated version]] using SWHIDs and Software Heritage +# - Example use in a research article: extensive use of SWHIDs in [[https://www.dicosmo.org/Articles/2020-ReScienceC.pdf][a replication experiment]] + - Example in a journal: [[http://www.ipol.im/pub/art/2020/300/][an article from IPOL]] + - [[https://doc.archives-ouvertes.fr/en/deposit/deposit-software-source-code/][Curated deposit in SWH via HAL]], see for example: + [[https://hal.archives-ouvertes.fr/hal-02130801][LinBox]], [[https://hal.archives-ouvertes.fr/hal-01897934][SLALOM]], [[https://hal.archives-ouvertes.fr/hal-02130729][Givaro]], [[https://hal.archives-ouvertes.fr/hal-02137040][NS2DDV]], [[https://hal.archives-ouvertes.fr/lirmm-02136558][SumGra]], [[https://hal.archives-ouvertes.fr/hal-02155786][Coq proof]], ... 
+ - Rescue landmark legacy software, see the [[https://www.softwareheritage.org/swhap/][SWHAP process with UNESCO]] +** Summing up: a revolutionary infrastructure /designed for source code/ :noexport: +#+latex:\vspace{-0.2em} +#+BEGIN_EXPORT latex + \begin{center} + \includegraphics[width=.4\linewidth]{SWH-logo.pdf} + { \url{www.softwareheritage.org}} + \end{center} +#+END_EXPORT +#+latex:\vspace{-0.4em} +*** /global/ source code /archive/ \hfill /Library of Alexandria of source code/ + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_OPT: pic=clock-spring-forward.png,width=.15\linewidth,leftpic=true + :END: + + harvest /all/ software source code + + /on demand harvesting/ and /curated deposit/ +#+latex:\vspace{-0.4em} +*** /universal/ intrinsic identifiers + \mbox{}\hfill SWHID standard is independent of version control systems +#+latex:\vspace{-0.4em} +*** /uniform/ data model, /full graph/ of development history + \mbox{}\hfill enables large scale, big code research +#+latex:\vspace{-0.4em} +*** /infrastructure/ for Open Science + \mbox{} \hfill /base layer/ for software source code in the /Open Science architecture/ +* Preserving our software commons: the past +** The \hfill /\href{https://unesdoc.unesco.org/ark:/48223/pf0000371017}{SWHAP}/ process, with UNESCO and University of Pisa + # variant of #+INCLUDE: "../../common/modules/swh-acquisition-process.org::#swhap" :only-contents t :minlevel 3 +*** Paris Call on Software Source Code + “[We call to] support efforts to gather and preserve the artifacts and + narratives of the history of computing, while the earlier creators are still + alive” + #+BEAMER: \pause +*** :B_block:BMCOL: + :PROPERTIES: + :BEAMER_col: 0.3 + :END: +#+BEGIN_EXPORT latex +\begin{center} +\includegraphics[width=\extblockscale{1.1\linewidth}]{SWHAP-cover.pdf} +\end{center} +#+END_EXPORT +*** :B_block:BMCOL: + :PROPERTIES: + :BEAMER_col: 0.7 + :END: + - **Rescue** Legacy Software from different media\\ + \mbox{}\\ + - **Curate** the code + - 
reconstruct the development history + + /Software Heritage - GitHub/ work on fixing git + - collect the metadata\\ + \mbox{}\\ + - **Archive** in Software Heritage +*** + *UNESCO, UniPi and Software Heritage* collaboration \hfill worldwide scope + +** An example: TAUmus, from Pisa (70's) + #+INCLUDE: "../../common/modules/swh-acquisition-process.org::#swhaptaumus" :only-contents t :minlevel 3 +#+BEAMER: \pause +\vspace{-4em} +*** Stay tuned + \hfill /great news for software stories in a few months/... +* Preserving our software commons: the present and the future +** Focus on Academia: growing adoption (selection) + #+INCLUDE: "../../common/modules/swh-adoption-academic.org::#adoption" :only-contents t :minlevel 3 +** Recent preservation news +*** Saving 250.000 endangered repositories... + - summer 2019: BitBucket announces Mercurial VCS phase-out + - fall 2019: Software Heritage teams up with Octobus (funded by NLNet, thanks!) + - July 2020: BitBucket erases /250.000/ repositories + - August 2020: [[https://bitbucket-archive.softwareheritage.org][bitbucket-archive.softwareheritage.org]] is live +#+BEAMER: \pause +*** ... preserving the web of knowledge \hfill (original tweet [[https://twitter.com/gabrielaltay/status/1300218789762662401][is here]] ) :B_picblock: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_OPT: pic=bitbucket_swh_praise.png, width=.6\linewidth, leftpic=true + :END: + +\\ + *Bottom line*\\ + /explicit deposit/ is important, ...\\ + \mbox{}\hfill ... and we must promote it...\hfill\mbox{}\\ + \mbox{}\hfill ... but it will never be enough.\\ +\mbox{}\\ +\mbox{}\hfill /(think also of all software dependencies!)/ +* The road ahead +** A committed core team :B_block: + #+ATTR_LATEX: :width .6\linewidth + file:team-2020.png +# *** Team +# - 1.5 / 3 management - 6 / 15 dev/ops; - 1 / 4 product management +# - 0.5 / 4 outreach - 0.2 / 3 partnerships - 0 / support +# #+BEAMER: \pause +*** + plus Jean-François Abramatic ...\\ + \hfill ... 
and many more who are supporting our daily work (*/thanks Inria!/*) + +** An international, non-profit initiative\hfill built for the long term + :PROPERTIES: + :CUSTOM_ID: support + :END: +*** Sharing the vision :B_block: + :PROPERTIES: + :CUSTOM_ID: endorsement + :BEAMER_COL: .5 + :BEAMER_env: block + :END: + #+LATEX: \begin{center}{\includegraphics[width=\extblockscale{.4\linewidth}]{unesco_logo_en_285}}\end{center} + #+LATEX: \vspace{-0.8cm} + #+LATEX: \begin{center}\vskip 1em \includegraphics[width=\extblockscale{1.4\linewidth}]{support.pdf}\end{center} + #+latex: \small And many more ...\\ + #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/testimonials} +#+BEAMER: \pause +*** Donors, members, sponsors :B_block: + :PROPERTIES: + :CUSTOM_ID: sponsors + :BEAMER_COL: .5 + :BEAMER_env: block + :END: + #+LATEX: \begin{center}\includegraphics[width=\extblockscale{.4\linewidth}]{inria-logo-new}\end{center} + #+LATEX: \begin{center} + # #+LATEX: \includegraphics[width=\extblockscale{.2\linewidth}]{sponsors-levels.pdf} + #+LATEX: \colorbox{white}{\includegraphics[width=\extblockscale{1.4\linewidth}]{sponsors.pdf}} + #+LATEX: \end{center} +# - sponsoring / partnership :: \hfill \url{sponsorship.softwareheritage.org} +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +*** Research collaboration :B_picblock:noexport: + :PROPERTIES: + :BEAMER_COL: .5 + :BEAMER_env: picblock + :BEAMER_OPT: pic=Qwant_Logo, leftpic=true + :END: + source code search engine +*** See more :noexport: + \hfill\tiny\url{http://www.softwareheritage.org/support/testimonials} +*** Global network :B_picblock:noexport: + :PROPERTIES: + :BEAMER_COL: .5 + :BEAMER_env: picblock + :BEAMER_OPT: pic=fossid, leftpic=true, width=.3\linewidth + :END: + - first *independent mirror* + - increased reliability +** You may help! 
+*** Foster adoption and best practices + + [[https://www.softwareheritage.org/save-and-reference-research-software/][archive and reference relevant source code]] (save code now, and [[https://hal.inria.fr/hal-01872189][deposit]]) + + use Software Heritage in research articles, journals, and books + + [[https://www.softwareheritage.org/swhap/][rescue and preserve landmark legacy source code]] with SWHAP +#+BEAMER: \pause +*** Engage with Software Heritage as an individual + - join the [[https://www.softwareheritage.org/ambassadors/][ambassador program]], help raise awareness + - contribute to technical and scientific development +#+BEAMER: \pause +*** Engage with Software Heritage as an organization + - become [[https://www.softwareheritage.org/support/sponsors/][a member/sponsor]] + - build a Software Heritage mirror (like ENEA is doing) + - contribute to the preservation mission +** Thank you! + #+latex: \centering{\huge\bf Questions?} + \vspace{-.5em} +*** Resources + - newsletter :: https://www.softwareheritage.org/newsletter/ + - blog :: https://www.softwareheritage.org/blog/ + - archive :: https://archive.softwareheritage.org/ + - media, press, etc. :: https://annex.softwareheritage.org/ + +*** References (see https://www.softwareheritage.org/publications) + #+BEGIN_EXPORT latex + \begin{thebibliography}{Foo Bar, 1969} + \footnotesize +% \bibitem{SwForumEu2021} R. Di Cosmo, \emph{A revolutionary infrastructure for Open Source}, 2021, EU Software Forum \href{https://annex.softwareheritage.org/public/talks/2021/2021-03-24-SwForum.pdf}{(slides)} \href{https://youtu.be/AwY527kDMfM?t=178}{(video)} + \bibitem{EOSCSirs2020} EOSC SIRS Task Force, \emph{Scholarly Infrastructures for Research Software} + \newblock 2020, European Commission, \href{https://doi.org/10.2777/28598}{(10.2777/28598)} + \bibitem{DiCosmo2020d} R. 
Di Cosmo, \emph{Archiving and Referencing Source Code with Software Heritage} + \newblock ICMS 2020 \href{https://dx.doi.org/10.1007/978-3-030-52200-1_36}{(10.1007/978-3-030-52200-1\_36)} +% \bibitem{DiCosmo2019} R. Di Cosmo, M. Gruenpeter, S. Zacchiroli\newblock +% \emph{Referencing Source Code Artifacts: a Separate Concern in Software Citation},\newblock +% CiSE 2020 \href{https://dx.doi.org/10.1109/MCSE.2019.2963148}{(10.1109/MCSE.2019.2963148)} +% \href{https://hal.archives-ouvertes.fr/hal-02446202}{(hal-02446202)} +% \bibitem{alliez:hal-02135891} P. Alliez, R. Di Cosmo, B. Guedj, A. Girault, M.-S. Hacid, A. Legrand and N. Rougier\newblock +% \emph{Attributing and referencing (research) software: Best practices and outlook from Inria}, \newblock +% CiSE 2020 \href{https://doi.ieeecomputersociety.org/10.1109/MCSE.2019.2949413}{(10.1109/MCSE.2019.2949413)} +% \href{https://hal.archives-ouvertes.fr/hal-02135891}{(hal-02135891)} + \bibitem{Abramatic2018} J.F. Abramatic, R. Di Cosmo, S. Zacchiroli, + \emph{Building the Universal Archive of Source Code}, + \newblock CACM, October 2018 \href{https://doi.org/10.1145/3183558}{(10.1145/3183558)} + \end{thebibliography} + #+END_EXPORT +* Appendix :B_appendix: + :PROPERTIES: + :BEAMER_env: appendix + :END: +** + \vfill + \centerline{\Huge Appendix} + \vfill +* Revolutionary infrastructure, scientific challenges +** A revolutionary infrastructure /designed for software source code/ +#+INCLUDE: "../../common/modules/swh-as-infrastructure.org::#oneslide" :only-contents t :minlevel 3 +** A challenging scientific and technical undertaking +*** A novel, large infrastructure + - object storage [[https://www.softwareheritage.org/2021/03/11/towards-a-next-generation-object-storage-for-software-heritage/][with peculiar workload]] + - gigantic Merkle graph + - counting tens of billions of objects ([[https://www.softwareheritage.org/2021/05/11/next-generation-counters/][reuse P. 
Flajolet's seminal work]]) + - and much more: see [[https://www.softwareheritage.org/2021/04/08/swh-2021-technical-roadmap/][the 2021 technical roadmap]] +#+BEAMER: \pause +*** First datasets are available for Big Code analysis + - full graph of software development (~20Bn nodes, ~200Bn edges) + see Pietri, Spinellis, Zacchiroli, MSR 2019 https://dx.doi.org/10.1109/MSR.2019.00030 + - MSR 2020 mining competition + see https://2020.msrconf.org/track/msr-2020-mining-challenge#Call-for-Papers +* All the source code +** All the source code! + #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{\linewidth}]{swh-collect-axes}\end{center} +** All the source: strategies + #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{\linewidth}]{swh-collect-strategies}\end{center} +* Milestones, policy +** Milestones :B_block: + #+INCLUDE: "../../common/modules/swh-key-dates.org::#keydates" :minlevel 3 :only-contents t +** Policy highlight: the EU Copyright reform\hfill adopted March 28 2019 +*** "Upload filters": a threat to /all modern software development/ + - developing platforms (GitHub, GitLab, Bitbucket, etc.) 
+ - *distribution platforms (Maven, Pypi, CRAN, CTAN, etc.)* + - *archives (Software Heritage)* +#+BEAMER: \pause +*** We got an exclusion for + \hfill /\sout{non for profit} open source software developing *and sharing* platforms/ +#+BEAMER: \pause +*** Key role of Software Heritage + \hfill policy-maker awareness, essential insights for NGOs, government contacts +* SWHIDs by the example :noexport: +** A word on the trust model for systems of identifiers + \vspace{-5pt} +*** Two general classes of systems of identifiers + - intrinsic :: /computed/ from the object /(no registry required, fully decentralised)/\\ + /(e.g.: chemical notation, music notation, hashes, SWHIDs)/\pause + - extrinsic :: /assigned/ by an authority /(need a registry)/\\ + /(e.g.: passport number, DOI, ARK, RRID, etc.)/\pause + \mbox{}\hfill See [[https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/][the dedicated blog post]] for more details +#+BEAMER: \pause +*** Trust model, extrinsic (e.g. DOIs) :B_block: + :PROPERTIES: + :BEAMER_COL: .5 + :BEAMER_env: block + :END: +#+ATTR_LATEX: :width \linewidth +file:doi-vs-pid-1.pdf +#+BEAMER: \pause +*** Trust model, intrinsic (e.g. 
SWHIDs) :B_block: + :PROPERTIES: + :BEAMER_env: block + :BEAMER_COL: .45 + :END: +#+ATTR_LATEX: :width .8\linewidth +file:doi-vs-pid-3.pdf +*** Trust model for DOIs with checksums :B_block:noexport: + :PROPERTIES: + :BEAMER_COL: .5 + :BEAMER_env: block + :END: +#+ATTR_LATEX: :width \linewidth +file:doi-vs-pid-2.pdf +*** :B_ignoreheading:noexport: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +** A worked example + #+LATEX: \centering\forcebeamerstart + #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_1.pdf}}} + #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/contents.pdf}}} + #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_2_contents.pdf}}} + #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/directories.pdf}}} + #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_3_directories.pdf}}} + #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/revisions.pdf}}} + #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_4_revisions.pdf}}} + #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/releases.pdf}}} + #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/merkle_5_releases.pdf}}} + #+LATEX: \only<10>{\colorbox{white}{\includegraphics[width=\extblockscale{\linewidth}]{git-merkle/snapshots.pdf}}} + #+LATEX: \forcebeamerend +* Big code :noexport: +** Software Heritage for Research and Innovation +*** Reference platform for /Big Code/ :B_picblock: + :PROPERTIES: + :BEAMER_opt: pic=universal, leftpic=true, width=.2\linewidth + :BEAMER_env: picblock + :BEAMER_act: + :END: + - unique *observatory* of all software development + - *big 
data, machine learning* paradise: classification, trends, coding patterns, code completion... +#+BEAMER: \pause +*** First datasets are available! + - full graph of software development (~20Bn nodes, ~200Bn edges) + see Pietri, Spinellis, Zacchiroli, MSR 2019 https://dx.doi.org/10.1109/MSR.2019.00030 + - MSR 2020 mining competition + see https://2020.msrconf.org/track/msr-2020-mining-challenge#Call-for-Papers + +* Public code, mirrors :B_block: +** News : archiving /public/ code + #+latex: \begin{center} + #+ATTR_LATEX: :width 0.7\linewidth + file:codeetalab.png + #+latex: \end{center} +#+BEAMER: \pause + https://code.etalab.gouv.fr +** News : ENEA mirror +*** Thomas Jefferson, February 18, 1791 :B_block: + :PROPERTIES: + :BEAMER_ACT: + :BEAMER_env: block + :END: +#+latex: {\em + ...let us save what remains: not by vaults and locks which fence them + from the public eye and use in consigning them to the waste of time, + but by such a multiplication of copies, as shall place them beyond + the reach of accident. +#+latex: } + #+BEAMER: \pause +*** Welcoming ENEA :B_block: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_OPT: pic=LogoENEAcompletoENG.png, leftpic=true, width=.7\linewidth + :END: + - first *institutional* mirror + - increased resilience + - *AI infrastructure* for researchers + - stepping stone to \endgraf + \hfill a European joint effort diff --git a/talks-public/2022-03-16-Bologna/METADATA b/talks-public/2022-03-16-Bologna/METADATA new file mode 100644 index 0000000..094b3d2 --- /dev/null +++ b/talks-public/2022-03-16-Bologna/METADATA @@ -0,0 +1,58 @@ +Dear colleague, + +We are pleased to invite you to our last event before our summer break (July and August): + +Date and time: Tuesday, June 29, 2021 at 5:00 p.m. (17:00) Central European Summer Time (UTC+2) + +Topic: “Should we preserve the world's software history, and can we?” +(scroll down for abstract and CV) + +Speaker: Roberto Di Cosmo (INRIA, France) +Moderator & Respondent: Edward A. 
Lee (UC Berkeley, USA) + +To participate via Zoom go to: https://tuwien.zoom.us/j/96389928143?pwd=UU5YRkNuRmdoWHV4MFBwMWRCcUErdz09 +(Password: 0dzqxqiy) + +The talk will be live streamed and recorded on our YouTube channel: +https://www.youtube.com/digitalhumanism + +For further announcements and information about the speakers in the Lecture Series, see https://dighum.ec.tuwien.ac.at/news-events +Also, please note that you can access the slides and recordings of all our past events via that link. + +Have a good summer, we are looking forward to seeing you again at our next events in September! + +Stay safe! +Hannes Werthner + +----- +Sign the Vienna Manifesto: https://dighum.ec.tuwien.ac.at +Follow us on Twitter @DigHumTUWien +----- + + +ABSTRACT: + +Cultural heritage is the legacy of physical artifacts and intangible attributes of a group or society that are inherited from past generations, maintained in the present and bestowed for the benefit of future generations. What role does software play in it? + +We claim that software source code is an important product of human creativity, and embodies a growing part of our scientific, organisational and technological knowledge: it is a part of our cultural heritage, and it is our collective responsibility to ensure that it is not lost. +Preserving the history of software is also a key enabler for reproducibility of research, and a means to foster better and more secure software for society. + +This is the mission of Software Heritage, a non-profit organization dedicated to building the universal archive of software source code, catering to the needs of science, industry and culture, for the benefit of society as a whole. In this presentation we will survey the principles and key technology used in the archive, which contains over 10 billion unique source code files from some 160 million projects worldwide. 
+ +Riassunto: + +Il patrimonio culturale è l'eredità di artefatti fisici e attributi intangibili di un gruppo o società che sono ereditati dalle generazioni passate, mantenuti nel presente e donati a beneficio delle generazioni future. Che ruolo ha il software in tutto ciò? + +Il codice sorgente del software è un importante prodotto della creatività umana, e incarna una parte crescente della nostra conoscenza scientifica, organizzativa e tecnologica: è una parte del nostro patrimonio culturale, ed è nostra responsabilità collettiva assicurarci che non vada perso. +Preservare la storia del software è anche un fattore chiave per la riproducibilità della ricerca, e come mezzo per promuovere un software migliore e più sicuro per la società. + +Questa è la missione di Software Heritage, un'organizzazione no-profit dedicata alla costruzione dell'archivio universale del codice sorgente del software, che soddisfa i bisogni della scienza, dell'industria e della cultura, a beneficio di tutta la società. In questa presentazione esamineremo i principi e la tecnologia chiave utilizzati nell'archivio che contiene oltre 10 miliardi di file di codice sorgente unici da circa 160 milioni di progetti in tutto il mondo. + + +Short Bio of Roberto Di Cosmo: + +An alumnus of the Scuola Normale Superiore di Pisa, with a PhD in Computer Science from the University of Pisa, Roberto Di Cosmo was associate professor for almost a decade at Ecole Normale Supérieure in Paris. In 1999, he became a Computer Science full professor at University Paris Diderot, where he was head of doctoral studies for Computer Science from 2004 to 2009. A trustee of the IMDEA Software institute, and member of the national committee for Open Science in France, he is currently on leave at Inria. 
+His research activity spans theoretical computing, functional programming, parallel and distributed programming, the semantics of programming languages, type systems, rewriting and linear logic, and, more recently, the new scientific problems posed by the general adoption of Free Software, with a particular focus on static analysis of large software collections. He has published over 20 international journal articles and 50 international conference articles. + +After creating the Free Software thematic group of Systematic, which helped fund over 50 Open Source research and development collaborative projects, and IRILL, a research structure dedicated to Free and Open Source Software quality, he got support from Inria to create Software Heritage, with the mission to build the universal archive of all the source code publicly available, in partnership with UNESCO. diff --git a/talks-public/2022-03-16-Bologna/Makefile b/talks-public/2022-03-16-Bologna/Makefile new file mode 100644 index 0000000..68fbee7 --- /dev/null +++ b/talks-public/2022-03-16-Bologna/Makefile @@ -0,0 +1 @@ +include ../Makefile.slides