diff --git a/common/images/2016-11-archive-growth.png b/common/images/2016-11-archive-growth.png index 145f1f3..2eb8e15 100644 Binary files a/common/images/2016-11-archive-growth.png and b/common/images/2016-11-archive-growth.png differ diff --git a/common/modules/source-code-different-short.org b/common/modules/source-code-different-short.org index 785ca7e..2ea71e1 100644 --- a/common/modules/source-code-different-short.org +++ b/common/modules/source-code-different-short.org @@ -1,98 +1,98 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) # # Software is all around us # #+INCLUDE: "prelude.org" :minlevel 1 * Software source code :PROPERTIES: :CUSTOM_ID: main :END: ** The source code matters! :PROPERTIES: :CUSTOM_ID: thesourcecode :END: #+LATEX: \includegraphics[width=.10\linewidth]{software.png} #+BEGIN_QUOTE “The source code for a work means the preferred form of the work for making modifications to it." 
\hfill GPL Licence #+END_QUOTE #+Beamer: \pause *** :PROPERTIES: :BEAMER_env: block :BEAMER_act: +- :END: #+latex: \begin{center} Hello World \end{center} *** Program (excerpt of binary) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.5 :BEAMER_env: block :BEAMER_act: +- :END: #+begin_src hex :exports code 4004e6: 55 4004e7: 48 89 e5 4004ea: bf 84 05 40 00 4004ef: b8 00 00 00 00 4004f4: e8 c7 fe ff ff 4004f9: 90 4004fa: 5d 4004fb: c3 #+end_src *** Program (source code) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.55 :BEAMER_env: block :BEAMER_act: +- :END: #+begin_src c :exports code /* Hello World program */ #include <stdio.h> void main() { printf("Hello World"); } #+end_src ** Software Source Code is /special/ :PROPERTIES: :CUSTOM_ID: softwareisdifferent :END: *** Harold Abelson, Structure and Interpretation of Computer Programs /“Programs must be written for people to read, and only incidentally for machines to execute.”/ *** Quake 2 source code (excerpt) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.45 :BEAMER_env: block :END: #+LATEX: \includegraphics[width=\linewidth]{quake-carmack-sqrt-1.png} # smart efficient implementation of 1/sqrt(x) on a CPU without special support -*** Network queue in Linux (excerpt) :B_block:BMCOL: +*** Net. queue in Linux (excerpt) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.45 :BEAMER_env: block :END: #+LATEX: \includegraphics[width=\linewidth]{juliusz-sfb-short.png} # Juliusz implementation of stochastic fair blue in the Linux Kernel linux/net/sched/sch_sfb.c *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** Len Shustek, Computer History Museum \hfill /“Source code provides a view into the mind of the designer.”/ *** Distinguishing features :noexport: - /executable/ and /human readable/ knowledge (an /all time new/) + even hardware is... software! (VHDL, FPGA, ...) 
+ /text files are forever/ - naturally /evolves/ over time + the /development history/ is key to its /understanding/ - complex: large /web of dependencies/, millions of SLOCs *** In a word :noexport: - software /is not just another/ sequence of bits - a software archive /is not just another/ digital archive diff --git a/common/modules/status-extended.org b/common/modules/status-extended.org index ddb1433..be0ed00 100644 --- a/common/modules/status-extended.org +++ b/common/modules/status-extended.org @@ -1,216 +1,239 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) #+INCLUDE: "prelude.org" :minlevel 1 * Status :PROPERTIES: :CUSTOM_ID: main :END: ** The people :PROPERTIES: :CUSTOM_ID: people :END: *** The core team :B_picblock: :PROPERTIES: + :CUSTOM_ID: coreteam :BEAMER_env: picblock :BEAMER_opt: pic=team,width=.4\linewidth :END: - Roberto Di Cosmo - Stefano Zacchiroli - Nicolas Dandrimont (Engineer) - Antoine Dumont (Engineer) - and /Jordi, Quentin and Guillaume/ *** Scientific advisors - Serge Abiteboul (French Science Academy) - Jean-François Abramatic (former W3C director) - Gerard Berry (CNRS Gold Medal, French Science Academy) - Julia Lawall (Coccinelle, Linux Kernel, Outreachy) ** Archive coverage :PROPERTIES: :CUSTOM_ID: archive :END: *** Our sources :PROPERTIES: :BEAMER_act: +- :END: - GitHub --- full, up-to-date mirror - Debian --- daily snapshots of all suites since 2005--2015 - GNU --- all releases as of August 2015 - Gitorious, Google Code --- local copy (Archive Team & Google) *** Some numbers :PROPERTIES: :BEAMER_act: +- :END: #+latex: \centering #+ATTR_LATEX: :width \extblockscale{.8\linewidth} file:growth.png #+latex: \footnotesize\vspace{-3mm} - 150 TB blobs, 5 TB database (as a graph: 4 B nodes + 40 B edges) + 150 TB blobs, 6 TB database (as a graph: 5 B nodes + 50 B edges) *** :PROPERTIES: :BEAMER_act: +- :END: \hfill The /richest/ source code 
archive already, ... and growing daily! ** The structure of the archive :noexport: *** On-disk storage - flat file storage for contents - postgres database for the metadata *** Data model: /one/ big Merkle DAG, inspired by the git model - Origins (= repositories) - Occurrences (= branches) - Releases (= tags) - Revisions (= commits) - Directories (= trees) - Contents (= blobs) ** Architecture :noexport: :PROPERTIES: :CUSTOM_ID: architecture :END: *** Data flow :PROPERTIES: :CUSTOM_ID: dataflow :END: #+BEAMER: \hspace*{-0.7cm}\includegraphics[width=1.15\textwidth]{swh-dataflow.pdf} ** Data model :noexport: *** General schema - VCS-independent - fully deduplicated + files, directories and commits are /shared/ - biggest git-like /graph/ in the world *** \begin{center} \url{http://deb.li/swhdm} \end{center} *** full hash index (sha1, sha256, ...) Some funny facts: - the GPL2 licence appears under more than 500 names + including /aa.css.txt/ and /FullSync.txt/ ~ :-) -** Merkle structure :noexport: - :PROPERTIES: - :CUSTOM_ID: merkle - :END: -*** Merkle trees - :PROPERTIES: - :CUSTOM_ID: merkletree - :END: - # R. C. Merkle, A digital signature based on a conventional encryption - # function, Crypto '87 - #+BEAMER: \vspace{-3mm} -**** Merkle tree (R. C. Merkle, Crypto 1979) :B_picblock: +** Merkle DAG +*** Merkle structure + :PROPERTIES: + :CUSTOM_ID: merkle + :END: +**** Merkle trees :PROPERTIES: - :BEAMER_opt: pic=merkle, leftpic=true, width=.7\linewidth - :BEAMER_env: picblock - :BEAMER_act: + :CUSTOM_ID: merkletree :END: - Combination of - - tree - - hash function - #+BEAMER: \pause -**** Classical cryptographic construction - - fast, parallel signature of large data structures - - widely used (e.g., Git, Bitcoin, IPFS, ...) 
- - built-in deduplication -*** The archive in a few pictures - :PROPERTIES: - :CUSTOM_ID: merkledemo - :END: -**** A giant (extended) Merkle DAG - #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_1.pdf}}} - #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/contents.pdf}}} - #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_2_contents.pdf}}} - #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/directories.pdf}}} - #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_3_directories.pdf}}} - #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/revisions.pdf}}} - #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_4_revisions.pdf}}} - #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/releases.pdf}}} - #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_5_releases.pdf}}} - # #+LATEX: {\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_1.pdf}}} -** Merkle structure (short) :noexport: - :PROPERTIES: - :CUSTOM_ID: giantdag - :END: -*** The archive: a (giant) Merkle DAG - # Using an empty frame because the image is difficult to read on swh bg. - # Finding a way to override image bg for just this frame would be better. -**** - #+BEAMER: \includegraphics[width=\textwidth]{git-merkle/merkle_5_releases} + # R. C. Merkle, A digital signature based on a conventional encryption + # function, Crypto '87 + #+BEAMER: \vspace{-3mm} +***** Merkle tree (R. C. 
Merkle, Crypto 1979) :B_picblock: + :PROPERTIES: + :BEAMER_opt: pic=merkle, leftpic=true, width=.7\linewidth + :BEAMER_env: picblock + :BEAMER_act: + :END: + Combination of + - tree + - hash function + #+BEAMER: \pause +***** Classical cryptographic construction + - fast, parallel signature of large data structures + - widely used (e.g., Git, Bitcoin, IPFS, ...) + - built-in deduplication +**** The archive in a few pictures + :PROPERTIES: + :CUSTOM_ID: merkledemo + :END: +***** A giant (extended) Merkle DAG + #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_1.pdf}}} + #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/contents.pdf}}} + #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_2_contents.pdf}}} + #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/directories.pdf}}} + #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_3_directories.pdf}}} + #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/revisions.pdf}}} + #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_4_revisions.pdf}}} + #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/releases.pdf}}} + #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_5_releases.pdf}}} + # #+LATEX: {\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_1.pdf}}} +*** A revision node + :PROPERTIES: + :CUSTOM_ID: merklerevision + :END: +**** Example: a Software Heritage revision +***** + #+BEAMER: \includegraphics[width=0.9\textwidth]{git-merkle/revisions} +*** Giant DAG + :PROPERTIES: + :CUSTOM_ID: giantdag + :END: +**** The archive: a 
(giant) Merkle DAG + # Using an empty frame because the image is difficult to read on swh bg. + # Finding a way to override image bg for just this frame would be better. +***** + #+BEAMER: \includegraphics[width=\textwidth]{git-merkle/merkle_5_releases} ** Technology :noexport: :PROPERTIES: :CUSTOM_ID: technology :END: *** Hardware **** hosted by Inria - Hypervisor with a dozen virtual machines - High density storage array (60 * 6TB => 300TB usable) - Copy in another server room; logical leader/follower mirroring - Soon to enable a mirror network to duplicate our contents **** Azure cloud (work in progress prototype) - full mirror using distributed object storage - workers for batch analyses and crawling *** Software **** 3rd party FOSS - Debian distribution, orchestrated with Puppet - PostgreSQL for metadata storage - RabbitMQ for task scheduling - Python3 and psycopg2 for the backend - Flask and Bootstrap for the web apps #+BEAMER: \\ $\to$ \alert{\footnotesize \url{https://www.softwareheritage.org/jobs/}} - Phabricator forge **** in-house FOSS - ~50 Git repositories (~20 Python packages, ~10 Puppet modules) - ~20 kSLOC Python / ~10 kSLOC SQL / ~1 kSLOC Puppet - licence choice: GPLv3 (backend) / AGPLv3 (frontend) - https://forge.softwareheritage.org/ *** Software architecture **** Module dependencies (internal + external) :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_opt: pic=swh-modules-deps-all,width=\linewidth :END: **** let's zoom in: http://deb.li/swhdeps ** Software development :noexport: :PROPERTIES: :CUSTOM_ID: development :END: *** Software development **** classic FOSS development - language: English - development mailing list #+BEAMER: \\{\small \url{https://sympa.inria.fr/sympa/info/swh-devel}} - IRC #+BEAMER: \\ #swh-devel / FreeNode - Forge #+BEAMER: \\{\small \url{https://forge.softwareheritage.org}} - Git, tasks, code review, etc. 
**** for more information #+BEAMER: \scriptsize https://www.softwareheritage.org/community/developers/ -** The road ahead +** Roadmap :PROPERTIES: :CUSTOM_ID: features :END: -*** Planned features... - - /lookup/ by content hash (done) - - /download/: wget and git clone from Software Heritage - - /provenance information/ for all archived code and metadata - - /browsing/: wayback machine for archived code and its history - - /full-text search/ on all archived source code files +*** Features... + - (done) *lookup* by content hash + - *browsing*: "wayback machine" for archived code + - (done) Web API + - (todo) Web UI + - (todo) *download*: =wget=/=git clone= from the archive + - (todo) *provenance information* for all archived content + - (todo) *full-text search* on all archived source code files #+BEAMER: \pause *** ... and much more than one could possibly imagine all the world's software development history in a single graph! - # \hfill /that makes a 150TB archive / 5TB database already.../ +** Web API :noexport: + :PROPERTIES: + :CUSTOM_ID: api + :END: +*** Web API (FOSDEM'17 release) +**** empty placeholder + TODO +*** Web API --- endpoints +**** empty placeholder + TODO +*** Web API --- limitations +**** empty placeholder + TODO ** Some technical challenges :PROPERTIES: :CUSTOM_ID: techchallenges :END: *** Expanding the archive - discover and classify /all/ the software sources - importers for other VCSs (SVN, Hg, ...) 
\hfill /We need your help!/ *** Staying current get new repositories and commits ASAP\\ \hfill /We need reliable, standardised event feeds./ *** Handling the backlog ingesting all the pre-existing data\\ \hfill /Decades of software development are waiting!/ diff --git a/common/modules/swh-motivations.org b/common/modules/swh-motivations.org index 8d60844..799084d 100644 --- a/common/modules/swh-motivations.org +++ b/common/modules/swh-motivations.org @@ -1,69 +1,69 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) # # Software is all around us # #+INCLUDE: "prelude.org" :minlevel 1 * What we are missing today :PROPERTIES: :CUSTOM_ID: main :END: ** Software is spread all around :PROPERTIES: :CUSTOM_ID: spread :END: #+latex: \begin{flushleft} #+ATTR_LATEX: :width \extblockscale{.5\linewidth} file:myriadsources.png #+latex: \end{flushleft} *** Fashion victims - many disparate development platforms - a myriad places where distribution may happen - - projects tend to migrate from one place to the other over time + - projects tend to migrate from one place to another over time #+BEAMER: \pause *** One place... :B_block: :PROPERTIES: :BEAMER_env: block :END: \hfill ... where can we find, track and search /all/ source code? ** Software is fragile :PROPERTIES: :CUSTOM_ID: fragile :END: #+latex: \begin{flushleft} #+ATTR_LATEX: :width \extblockscale{.5\linewidth} file:fragilecloud.png #+latex: \end{flushleft} *** Like all digital information, FOSS is fragile - inconsiderate and/or malicious code loss (e.g., Code Spaces) - business-driven code loss (e.g., Gitorious, Google Code) - for obsolete code: physical media decay (data rot) #+BEAMER: \pause *** If a website disappears you go to the Internet Archive... :B_block: :PROPERTIES: :BEAMER_env: block :END: \hfill ... where do you go if (a repository on) GitHub goes away? 
** Software is missing its own Research Infrastructure :PROPERTIES: :CUSTOM_ID: research :END: #+latex: \begin{flushleft} #+ATTR_LATEX: :width \extblockscale{.4\linewidth} file:atacama-telescope.jpg #+latex: \mbox{}\\{\tiny Photo: ALMA(ESO/NAOJ/NRAO), R. Hills} #+latex: \end{flushleft} *** A wealth of software research on crucial issues... - safety, security; test, verification, proof; - software engineering, software evolution; - empirical and big data studies; #+BEAMER: \pause *** If you study the stars, you go to Atacama... :B_block: :PROPERTIES: :BEAMER_env: block :END: \hfill ... where is the /very large telescope/ of source code? diff --git a/talks-public/2017-02-04-FOSDEM/2017-02-04-FOSDEM.org b/talks-public/2017-02-04-FOSDEM/2017-02-04-FOSDEM.org index e19feaa..de77433 100644 --- a/talks-public/2017-02-04-FOSDEM/2017-02-04-FOSDEM.org +++ b/talks-public/2017-02-04-FOSDEM/2017-02-04-FOSDEM.org @@ -1,217 +1,217 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) #+TITLE: Software Heritage #+SUBTITLE: Preserving the Free Software Commons -#+AUTHOR: Roberto Di Cosmo -#+DATE: February 2017 +#+AUTHOR: Roberto Di Cosmo and Stefano Zacchiroli +#+DATE: 4 February 2017 #+DESCRIPTION: Preserving the Free Software Commons #+KEYWORDS: software heritage legacy preservation knowledge mankind technology -#+BEAMER_HEADER: \date[February 4th 2017]{February 4th 2017\\ FOSDEM} -#+BEAMER_HEADER: \author[R. Di Cosmo, S. Zacchiroli ]{Roberto Di Cosmo and Stefano Zacchiroli} +#+BEAMER_HEADER: \date[FOSDEM'17]{4 February 2017\\ FOSDEM'17\\ Brussels, Belgium} +#+BEAMER_HEADER: \author[R. Di Cosmo, S. 
Zacchiroli]{Roberto Di Cosmo and Stefano Zacchiroli} # # prelude.org contains all the information needed to export the main beamer latex source # use prelude-toc.org to get the table of contents # -#+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 +#+INCLUDE: "../../common/modules/prelude.org" :minlevel 1 #+BEAMER_HEADER: \institute[Irill/INRIA/UPD]{\url{roberto@dicosmo.org, zack@upsilon.cc}} # #+LATEX_HEADER: \usepackage{enumitem} # # Part I: vision # * Software, Source Code, and the Software Commons ** Free Software is everywhere :PROPERTIES: :CUSTOM_ID: softwareispervasive :END: #+latex: \begin{center} #+ATTR_LATEX: :width .9\linewidth file:software-center.pdf #+latex: \end{center} # # Source code # #+INCLUDE: "../../common/modules/source-code-different-short.org::#softwareisdifferent" :minlevel 2 ** Our Software Commons - # taken from #+INCLUDE: "../../common/modules/foss-commons.org::#commonsdef" :only-contents t - :PROPERTIES: - :CUSTOM_ID: commonsdef - :END: -*** Definition (Commons) - The *commons* is the cultural and natural resources accessible to all - members of a society, including natural materials such as air, water, and a - habitable earth. These resources are held in common, not owned privately. - #+BEAMER: {\tiny\url{https://en.wikipedia.org/wiki/Commons}} -*** Definition (Software Commons) - The *software commons* consists of all computer software which is available - at little or no cost and which can be altered and reused with few - restrictions. Thus /all open source software and all free software are part - of the [software] commons/. [...] - #+BEAMER: {\tiny\url{https://en.wikipedia.org/wiki/Software_Commons}} - #+BEAMER: \pause + #+INCLUDE: "../../common/modules/foss-commons.org::#commonsdef" :only-contents t + #+BEAMER: \pause *** Source code is /a precious part/ of our commons \hfill we need to take care of it! 
# # Negative presentation (what we are missing) # # *** Our source code is /precious knowledge/ # \hfill are we taking care of it? # #+INCLUDE: "../../common/modules/swh-motivations.org::#main" :only-contents t :minlevel 2 # # The project # #+INCLUDE: "../../common/modules/swh-overview-sourcecode.org::#mission" :minlevel 2 # # Positive presentation (what we are building) # #+INCLUDE: "../../common/modules/swh-goals.org::#main" :only-contents t :minlevel 2 +** Our principles + #+latex: \begin{center} + #+ATTR_LATEX: :width .9\linewidth + file:SWH-as-foundation-slim.png + #+latex: \end{center} +*** Open approach :B_block:BMCOL: + :PROPERTIES: + :BEAMER_col: 0.4 + :BEAMER_env: block + :END: + - 100% FOSS + - transparency +*** In for the long haul :B_block:BMCOL: + :PROPERTIES: + :BEAMER_col: 0.4 + :BEAMER_env: block + :END: + - replication + - non profit + +** Archiving goals + Targets: VCS repositories & source code releases (e.g., tarballs) +*** We DO archive + - file *content* (= blobs) + - *revisions* (= commits), with full metadata + - *releases* (= tags), ditto + - (project metadata) + - where (*origin*) & when (*visit*) we found any of the above + # - time-indexed repo *snapshots* (i.e., we never delete anything) + … in a VCS-/archive-agnostic *canonical data model* +*** We DON'T archive (UNIX philosophy) + # - diffs → derived data from related contents + - homepages, wikis → collaboration with the Internet Archive + - BTS/issues/code reviews/etc. 
+ - mailing lists + Long term vision: play our part in a /"semantic wikipedia of software"/ # # Part II: roadmap # * Where we are today: technical overview #+INCLUDE: "../../common/modules/status-extended.org::#architecture" :only-contents t #+INCLUDE: "../../common/modules/status-extended.org::#merkletree" :minlevel 2 + #+INCLUDE: "../../common/modules/status-extended.org::#merklerevision" :only-contents t #+INCLUDE: "../../common/modules/status-extended.org::#giantdag" :only-contents t -** SHA1 collisions considered harmful - #+BEAMER: \lstinputlisting[language=SQL,basicstyle=\small]{../../common/source/swh-content.sql} #+INCLUDE: "../../common/modules/status-extended.org::#archive" :minlevel 2 + #+INCLUDE: "../../common/modules/status-extended.org::#api" :only-contents t #+INCLUDE: "../../common/modules/status-extended.org::#features" :minlevel 2 # # Part III: # * Come in, we're open! -** An ambitious, worldwide initiative -*** Inria as initiator :B_picblock: - :PROPERTIES: - :BEAMER_env: picblock - :BEAMER_opt: pic=inria-logo-new,leftpic=true,width=\extblockscale{.4\linewidth} - :END: - - .fr national CS research institution - - strong FOSS culture, W3C founding partner - # - creating a non profit, international organisation - #+BEAMER: \pause -*** Benefits society as a whole :B_picblock: - :PROPERTIES: - :BEAMER_env: picblock - :BEAMER_opt: pic=unesco,leftpic=true - :END: - + preserve knowledge embedded in software - + access knowledge embedded in software -*** Supporters and /first partners/ - *Société Générale, Microsoft, Huawei, Nokia Bell Labs, DANS,* - ACM, Adullact, Creative Commons, Eclipse, FSF, OSI, GitHub, GitLab, IEEE, - OIN, OW2, SFC, SFLC, The Document Foundation, The Linux Foundation, ... 
- #+BEAMER: \pause -** Our principles -#+latex: \begin{center} -#+ATTR_LATEX: :width \extblockscale{.7\linewidth} -file:SWH-as-foundation-slim.png -#+latex: \end{center} -#+BEAMER: \pause -*** Open approach :B_block:BMCOL: - :PROPERTIES: - :BEAMER_COL: 0.4 - :BEAMER_env: block - :END: - open source, transparency -*** Unix philosophy :B_block:BMCOL:noexport:noexport: - :PROPERTIES: - :BEAMER_opt: - :BEAMER_env: block - :BEAMER_col: 0.3 - :END: - - do /one/ thing - - do it /well/ -*** In for the long haul :B_block:BMCOL: - :PROPERTIES: - :BEAMER_COL: 0.4 - :BEAMER_env: block - :END: - non profit, replication - -#+BEAMER: \pause -*** Thomas Jefferson, February 18, 1791 - :PROPERTIES: - :BEAMER_act: +- - :END: -# #+latex: \begin{quote} - ...let us save what remains: not by vaults and locks which fence them - from the public eye and use in consigning them to the waste of time, - but by such a multiplication of copies, as shall place them beyond - the reach of accident. -# #+latex: \end{quote} -#+BEAMER: \pause -*** Building an /open, multistakeholder, nonprofit/ global organisation -# -# explain what we need and plan to do -# -** There is a whole lot to do! +** There is a whole lot to do! (WIP slide) #+latex: \begin{center} #+ATTR_LATEX: :width \extblockscale{\textwidth} file:SWH-as-foundation-block.png #+latex: \end{center} #+BEAMER: \pause *** Collect :B_exampleblock: :PROPERTIES: :BEAMER_env: exampleblock :BEAMER_COL: .3 :END: - discover + sources - harvest + protocols - ingest + VCS + data models *** Organise and Preserve :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .4 :END: - enrich + metadata - analyze + traits - replicate + locations + technologies + stakeholders *** Share :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .3 :END: - download - browse + wayback machine - search + facets - watch + trends - - #+BEAMER: \pause *** \hfill we need *your* help! 
+ +** Come and join us + #+BEAMER: \begin{center} + #+BEAMER: \includegraphics[width=.6\linewidth]{team} + #+BEAMER: \end{center} + - \url{www.softwareheritage.org/jobs} --- *job openings* + - \url{wiki.softwareheritage.org} --- *internships* +** An ambitious, worldwide initiative +*** Inria as initiator :B_picblock: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_opt: pic=inria-logo-new,leftpic=true,width=\extblockscale{.4\linewidth} + :END: + - .fr national CS research institution + - strong FOSS culture, W3C founding partner + # - creating a non profit, international organisation + #+BEAMER: \pause +*** Benefits society as a whole :B_picblock: + :PROPERTIES: + :BEAMER_env: picblock + :BEAMER_opt: pic=unesco,leftpic=true + :END: + + preserve knowledge embedded in software + + access knowledge embedded in software +*** Supporters and /first partners/ + *Société Générale, Microsoft, Huawei, Nokia Bell Labs, DANS,* ACM, + Adullact, Creative Commons, Eclipse, Free Software Foundation, Open Source + Initiative, GitHub, IEEE, OIN, OW2, Software Freedom Conservancy, SFLC, The + Document Foundation, ... + * Conclusion ** Conclusion *** Software Heritage is - - a revolutionary /reference archive/ of /all/ FOSS ever written + - a /reference archive/ of /all/ FOSS ever written # - a fantastic new tool for /research/ software - a unique /complement/ for /development platforms/ - an international, open, nonprofit, /mutualized infrastructure/ - - at the service of our community, at the service of society as a whole! + - at the service of our community, at the service of society *** Come in, we're open! \url{www.softwareheritage.org} --- /sponsoring/, /*job openings*/ \\ \url{wiki.softwareheritage.org} --- /*internships*/, /leads/ \\ \url{forge.softwareheritage.org} --- /*our own code*/ #+BEAMER: \vfill \flushright {\Huge Questions?} \vfill +* FAQ :B_appendix: + :PROPERTIES: + :BEAMER_env: appendix + :END: +** Q: how about SHA1 collisions? 
+ #+BEAMER: \lstinputlisting[language=SQL,basicstyle=\small]{../../common/source/swh-content.sql} +** Q: do you archive /only/ Free Software? + - We only crawl origins /meant/ to host source code (e.g., forges) + - Most (~90%) of what we /actually/ retrieve is textual content + #+BEAMER: \vfill + - Our goal: archive /the entire Free Software commons/ + #+BEAMER: \vfill + - Large parts of what we retrieve is /already/ Free Software, today + - Most of the rest /will become/ Free Software in the long term + - e.g., at copyright © expiration