diff --git a/talks-public/2018-01-25-rocq-sesi/2018-01-25-rocq-sesi.org b/talks-public/2018-01-25-rocq-sesi/2018-01-25-rocq-sesi.org
index 947dbb9..39c7ec6 100644
--- a/talks-public/2018-01-25-rocq-sesi/2018-01-25-rocq-sesi.org
+++ b/talks-public/2018-01-25-rocq-sesi/2018-01-25-rocq-sesi.org
@@ -1,140 +1,157 @@
#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
#+TITLE: Software Heritage
#+SUBTITLE: Technical challenges when archiving the entire Software Commons
#+BEAMER_HEADER: \date[Inria Rocquencourt]{25 January 2018\\Inria Rocquencourt}
#+AUTHOR: Stefano Zacchiroli
#+DATE: 25 January 2018
#+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1
#+INCLUDE: "../../common/modules/169.org"
#+BEAMER_HEADER: \institute{Inria, Software Heritage}
#+LATEX_HEADER: \usepackage{bbding}
#+LATEX_HEADER: \DeclareUnicodeCharacter{66D}{\FiveStar}

* The Software Commons
#+INCLUDE: "../../common/modules/source-code-different-short.org::#softwareisdifferent" :minlevel 2

** Our Software Commons
#+INCLUDE: "../../common/modules/foss-commons.org::#commonsdef" :only-contents t
#+BEAMER: \pause
*** Source code is /a precious part/ of our commons \hfill are we taking care of it?

# #+INCLUDE: "../../common/modules/swh-motivations-foss.org::#main" :only-contents t :minlevel 2
#+INCLUDE: "../../common/modules/swh-motivations-foss.org::#fragile" :minlevel 2
#+INCLUDE: "../../common/modules/swh-motivations-foss.org::#research" :minlevel 2

* Software Heritage
#+INCLUDE: "../../common/modules/swh-overview-sourcecode.org::#mission" :minlevel 2
#+INCLUDE: "../../common/modules/principles-compact.org::#principles" :minlevel 2

* Architecture
#+INCLUDE: "../../common/modules/status-extended.org::#archivinggoals" :minlevel 2
#+INCLUDE: "../../common/modules/status-extended.org::#architecture" :only-contents t
#+INCLUDE: "../../common/modules/status-extended.org::#merkletree" :minlevel 2
#+INCLUDE: "../../common/modules/status-extended.org::#merklerevision" :only-contents t
#+INCLUDE: "../../common/modules/status-extended.org::#giantdag" :only-contents t
#+INCLUDE: "../../common/modules/status-extended.org::#archive" :minlevel 2
#+INCLUDE: "../../common/modules/status-extended.org::#apiintro" :minlevel 2
#+INCLUDE: "../../common/modules/status-extended.org::#features" :minlevel 2

* Technical challenges
** Technology: how do you store the SWH DAG?
*** Problem statement
    - How would you store and query a graph with 10 billion nodes and
      60 billion edges?
    - How would you store the contents of more than 3 billion files,
      300TB of raw data?
    - ...
      on a limited budget (100 000 € of hardware overall)

#+BEAMER: \pause

*** Our hardware stack
    - two hypervisors with 512GB RAM and 20TB SSD each, sharing access
      to a storage array (60 x 6TB spinning rust)
    - one backup server with 48GB RAM and another storage array
*** Our software stack
    - An RDBMS (PostgreSQL, what else?), for storage of the graph nodes
      and edges
    - filesystems for storing the actual file contents

** Technology: archive storage components
*** Metadata storage
    - Python module *swh.storage*
    - thin Python API over a pile of PostgreSQL functions
    - motivation: keeping relational integrity at the lowest layer
*** Content ("object") storage
    - Python module *swh.objstorage*
    - very thin object storage abstraction layer (PUT, APPEND and GET)
      over regular storage technologies
    - separate layer for asynchronous replication and integrity
      management (*swh.archiver*)
    - motivation: stay as technology-neutral as possible for future
      mirrors

** Technology: object storage
*** Primary deployment
    - Storage on 16 sharded XFS filesystems; key = /sha1/ (content),
      value = /gzip/ (content)
    - if sha1 = *abcdef01234...*, file path =
      / srv / storage / *a* / *ab* / *cd* / *ef* / *abcdef01234...*
    - 3 directory levels deep, each level 256-wide =
      16 777 216 directories (1 048 576 per partition)
*** Secondary deployment
    - Storage on Azure blob storage
    - 16 storage containers, objects stored in a flat structure there

** Technology: object storage review
*** Generic model is fine
    The abstraction layer is fairly simple and generic, and the
    implementation of the upper layers (replication, integrity
    checking) was a breeze.
*** Filesystem implementation is bad
    Slow spinning storage + little RAM (48GB) + 16 million dentries =
    (very) bad performance

** Technology: metadata storage
*** Current deployment
    - PostgreSQL deployed in primary/replica mode, using
      pg\under{}logical for replication: different indexes on the
      primary (tuned for writes) and the replicas (tuned for reads)
    - most logic done in SQL
    - thin Pythonic API over the SQL functions
*** End goals
    - proper handling of relations between objects at the lowest level
    - fast recursive queries on the graph (e.g., find the provenance
      info for a content, walking up the whole graph, with a single
      query)

** Technology: metadata storage review
*** Limited resources
    PostgreSQL works really well
    #+BEAMER: \pause
    ... until your indexes don't fit in RAM
    #+BEAMER: \pause
*** Our recursive queries
    jump between different object types, and between evenly distributed
    hashes. Data locality doesn't exist. Caches break down.
    #+BEAMER: \pause
*** Massive deduplication = efficient storage
    #+BEAMER: \pause
    *but* Massive deduplication = exponential width for recursive
    queries
    #+BEAMER: \pause
*** Reality check
    Referential integrity?
    #+BEAMER: \pause
    Real repositories downloaded from the internet are all kinds of
    broken.

** Technology: outlook
*** Object storage
    Our Azure prototype shows that using a scale-out "cloudy"
    technology for our object storage works really well. Plain
    filesystems on spinning rust, not so much.
    #+BEAMER: \pause
    We are now experimenting with scale-out object storages (in
    particular Ceph) for the main copy of the archive.
    #+BEAMER: \pause
*** Metadata storage
    Our initial assumption that we wanted referential integrity and
    built-in recursive queries was wrong.
    #+BEAMER: \pause
    We could probably migrate to "dumb" object storages for each type
    of object, with another layer to check metadata integrity
    regularly.
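To make the object storage layout above concrete, here is a minimal sketch of a filesystem backend in the spirit of *swh.objstorage* (illustration only, not the actual module: =ROOT=, =object_path=, =put= and =get= are made-up names): it maps a content's sha1 onto the sharded, 3-level directory layout described in the "Technology: object storage" slide and stores the gzip-compressed bytes there.

#+BEGIN_SRC python
import gzip
import hashlib
import os

# Hypothetical root; in the real deployment each first-level directory
# would be one of the 16 sharded XFS filesystems.
ROOT = "/srv/storage"


def object_path(hexsha1: str) -> str:
    """Map a hex sha1 to /srv/storage/a/ab/cd/ef/abcdef...

    The first hex digit picks one of the 16 shards; the next three pairs
    of digits give the 3 directory levels inside that shard.
    """
    return os.path.join(ROOT, hexsha1[0], hexsha1[0:2], hexsha1[2:4],
                        hexsha1[4:6], hexsha1)


def put(content: bytes) -> str:
    """Store a content gzip-compressed under its sha1; return the sha1."""
    hexsha1 = hashlib.sha1(content).hexdigest()
    path = object_path(hexsha1)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with gzip.open(path, "wb") as f:
        f.write(content)
    return hexsha1


def get(hexsha1: str) -> bytes:
    """Read back and decompress a content by its sha1."""
    with gzip.open(object_path(hexsha1), "rb") as f:
        return f.read()
#+END_SRC

For example, =put(b"hello world")= returns the hex sha1 under which the same bytes can later be fetched with =get=; everything above that thin PUT/GET surface (replication, integrity checking) lives in separate layers.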
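The recursive queries mentioned in the metadata storage slides can be illustrated with a sketch along the following lines; the single =edge(parent_id, child_id)= table and the database name are hypothetical simplifications, not the real Software Heritage schema, but the =WITH RECURSIVE= walk is the kind of "provenance with a single query" operation described above.

#+BEGIN_SRC python
# Hypothetical, simplified schema for illustration only:
#   edge(parent_id bytea, child_id bytea)  -- one row per DAG edge
import psycopg2

PROVENANCE_QUERY = """
WITH RECURSIVE ancestors(id) AS (
    SELECT %(start)s::bytea
  UNION                      -- UNION (not UNION ALL) deduplicates paths
    SELECT e.parent_id
      FROM edge e
      JOIN ancestors a ON e.child_id = a.id
)
SELECT id FROM ancestors;
"""


def provenance(conn, content_sha1: bytes):
    """Walk up from a content to everything that transitively contains it."""
    with conn.cursor() as cur:
        cur.execute(PROVENANCE_QUERY, {"start": content_sha1})
        return [row[0] for row in cur.fetchall()]


# conn = psycopg2.connect("dbname=swh-sketch")   # hypothetical database
# provenance(conn, bytes.fromhex("abcd" * 10))   # a 20-byte sha1
#+END_SRC

Because the identifiers are cryptographic hashes, every hop of the recursion lands on an essentially random spot in the indexes, which is exactly the loss of data locality the "metadata storage review" slide points out.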
* Community
+*** Sponsors
+    :PROPERTIES:
+    :BEAMER_col: 0.4
+    :BEAMER_env: block
+    :END:
+    #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{0.6\linewidth}]{inria-logo-new}\end{center}
+    #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{1.2\linewidth}]{sponsors.pdf}\end{center}
+*** Testimonials
+    :PROPERTIES:
+    :BEAMER_col: 0.6
+    :BEAMER_env: block
+    :END:
+    #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{1.2\linewidth}]{support.pdf}\end{center}
+*** UNESCO/Inria agreement (April 3rd, 2017)
+    #+BEAMER: \centering
+    #+BEAMER: \includegraphics[width=\extblockscale{.2\linewidth}]{unesco}
+    #+BEAMER: \includegraphics[width=\extblockscale{.35\linewidth}]{unesco-accord}

** You can help!
*** Coding
    - \url{www.softwareheritage.org/community/developers/}
    - \url{forge.softwareheritage.org} --- *our own code*
*** Current development priorities
    | ٭٭٭ | listers for unsupported forges, distros, pkg. managers |
    | ٭٭٭ | loaders for unsupported VCS, source package formats    |
    | ٭٭  | Web UI: eye candy wrapper around the Web API           |
    | ٭   | content indexing and search                            |

    … /all/ contributions equally welcome!

#+INCLUDE: "../../common/modules/swh-backmatter.org::#conclusion" :minlevel 2