diff --git a/common/modules/graph-compression.org b/common/modules/graph-compression.org
index 8c2d452..64598b2 100644
--- a/common/modules/graph-compression.org
+++ b/common/modules/graph-compression.org
@@ -1,306 +1,306 @@
 #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
 #+INCLUDE: "prelude.org" :minlevel 1
 
 # Depends: \usepackage{pdfpages}
 
 * Graph Compression
   :PROPERTIES:
   :CUSTOM_ID: main
   :END:
 ** Graph compression on the Software Heritage archive
    :PROPERTIES:
    :CUSTOM_ID: intro
    :END:
 *** 
      #+BEGIN_EXPORT latex
      \vspace{-3mm}
      \begin{thebibliography}{Foo Bar, 1969}
      \bibitem{Boldi2020} Paolo Boldi, Antoine Pietri, Sebastiano Vigna, Stefano Zacchiroli
      \newblock Ultra-Large-Scale Repository Analysis via Graph Compression
      \newblock SANER 2020, 27th Intl. Conf. on Software Analysis, Evolution and Reengineering. IEEE
      \end{thebibliography}
      #+END_EXPORT
 *** Research question
     Is it possible to efficiently perform software development history analyses
     at ultra large scale (= the scale of Software Heritage archive or more), on
     a single, relatively cheap machine?
 *** Idea
     Apply state-of-the-art graph compression techniques from the field of Web
     graph / social network analysis.
 
 ** Background --- (Web) graph compression
    :PROPERTIES:
    :CUSTOM_ID: background1
    :END:
 *** The graph of the Web
     :PROPERTIES:
     :BEAMER_env: definition
     :END:
     Directed graph that has Web pages as nodes and hyperlinks between them as
     edges.
 *** Properties (1)
-    - **Locality:** pages links to pages whose URLs are lexicographically
+    - **Locality:** pages link to pages whose URLs are lexicographically
       similar. URLs share long common prefixes.
 
     → use *D-gap compression*
 *** Adjacency lists
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.51
     :END:
     #+BEAMER: \scriptsize
     | *Node* | *Outdegree* | *Successors*                         |
     |--------+-------------+--------------------------------------|
     |    ... |         ... | ...                                  |
     |     15 |          11 | 13,15,16,17,18,19,23,24,203,315,1034 |
     |     16 |          10 | 15,16,17,22,23,24,315,316,317,3041   |
     |     17 |           0 |                                      |
     |     18 |           5 | 13,15,16,17,50                       |
     |    ... |         ... | ...                                  |
 *** D-gapped adjacency lists
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.48
     :END:
     #+BEAMER: \scriptsize
     | *Node* | *Outdegree* | *Successors*                |
     |--------+-------------+-----------------------------|
     |    ... |         ... | ...                         |
     |     15 |          11 | 3,1,0,0,0,0,3,0,178,111,718 |
     |     16 |          10 | 1,0,0,4,0,0,290,0,0,2723    |
     |     17 |           0 |                             |
     |     18 |           5 | 9,1,0,0,32                  |
     |    ... |         ... | ...                         |
 
 ** Background --- (Web) graph compression (cont.)
    :PROPERTIES:
    :CUSTOM_ID: background2
    :END:
 *** The graph of the Web
     :PROPERTIES:
     :BEAMER_env: definition
     :END:
     Directed graph that has Web pages as nodes and hyperlinks between them as
     edges.
 *** Properties (2)
     - **Similarity:** pages that are close together in lexicographic order tend
       to have many common successors.
     → use *reference compression*
 *** Adjacency lists
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.47
     :END:
     #+BEAMER: \scriptsize
     | *Node* | *Outd.* | *Successors*                         |
     |--------+---------+--------------------------------------|
     |    ... |     ... | ...                                  |
     |     15 |      11 | 13,15,16,17,18,19,23,24,203,315,1034 |
     |     16 |      10 | 15,16,17,22,23,24,315,316,317,3041   |
     |     17 |       0 |                                      |
     |     18 |       5 | 13,15,16,17,50                       |
     |    ... |     ... | ...                                  |
 *** Copy lists
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.60
     :END:
     #+BEAMER: \scriptsize
     | *Node* | *Ref.* | *Copy list* | *Extra nodes*                        |
     |--------+--------+-------------+--------------------------------------|
     |    ... |    ... | ...         | ...                                  |
     |     15 |      0 |             | 13,15,16,17,18,19,23,24,203,315,1034 |
     |     16 |      1 | 01110011010 | 22,316,317,3041                      |
     |     17 |        |             |                                      |
     |     18 |      3 | 11110000000 | 50                                   |
     |    ... |    ... | ...         |                                      |
 
 ** Background --- Web graph compression (OLD)                      :noexport:
    Borrowing (great!) slides from:
    #+BEGIN_EXPORT latex
    \begin{thebibliography}{}
    \bibitem{Pibiri2018} Giulio Ermanno Pibiri
    \newblock Effective Web Graph Representations, 2018
    \newblock \url{http://pages.di.unipi.it/pibiri/slides/webgraphs\_compression.pdf}
    \end{thebibliography}
    #+END_EXPORT
 
 ** Background -- Web graph compression (imported slides) (OLD)     :noexport:
    :PROPERTIES:
    :BEAMER_env: ignoreheading
    :END:
    #+BEGIN_EXPORT latex
    {
    \setbeamercolor{background canvas}{bg=}
    \setbeamertemplate{background}{}
    \includepdf[pages={4,11,12,13}]{webgraphs_compression.pdf}
    \addtocounter{framenumber}{4}
    }
    #+END_EXPORT
 
 ** Corpus
    :PROPERTIES:
    :CUSTOM_ID: corpus
    :END:
 *** Nodes
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.5
     :END:
     | *Node type* | *N. of nodes* |
     |-------------+---------------|
     | origins     | 88 M          |
     | snapshots   | 57 M          |
     | releases    | 9.9 M         |
     | revisions   | 1.1 B         |
     | directories | 4.9 B         |
     | contents    | 5.5 B         |
     |-------------+---------------|
     | Total nodes | 12 B          |
 *** Edges
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.5
     :END:
     | *Edge type*           | *N. of edges* |
     |-----------------------+---------------|
     | origin → snapshot     | 195 M         |
     | snapshot → revision   | 616 M         |
     | snapshot → release    | 215 M         |
     | release → revision    | 9.9 M         |
     | revision → revision   | 1.2 B         |
     | revision → directory  | 1.1 B         |
     | directory → directory | 48 B          |
     | directory → revisiony | 482 M         |
     | directory → content   | 112 B         |
     |-----------------------+---------------|
     | Total edges           | 165 B         |
 *** 
     :PROPERTIES:
     :BEAMER_env: ignoreheading
     :END:
     Archive snapshot 2018-09-25, from the Software Heritage graph dataset.\\
-    Growth rate: exponential, doubling every 22-30 months (Rousseau, Di Cosmo,
-    Zacchiroli; ESE 2020, to appear).
+    Growth rate: exponential, doubling every 22-30 months [Rousseau, Di Cosmo,
+    Zacchiroli, Empirical Software Engineering 2020].
 
 ** Graph compression pipeline
    :PROPERTIES:
    :CUSTOM_ID: pipeline
    :END:
     #+BEAMER: \hspace*{-0.1\linewidth} \includegraphics[width=1.2\linewidth]{compression/compression-steps}
     #+BEAMER: \vspace{-1cm}
 *** 
     - *MPH*: minimal perfect hash, mapping Merkle IDs to 0..N-1 integers
     - *BV* compress: Boldi-Vigna compression (based on MPH order)
     - *BFS*: breadth-first visit to renumber
     - *Permute*: update BV compression according to BFS order
 *** (Re)establishing locality
     - key for good compression is a node ordering that ensures locality and
       similarity
     - which is very much /not/ the case with Merkle IDs, ...but is the case
       /again/ after BFS reordering
 
 ** Compression experiment
    :PROPERTIES:
    :CUSTOM_ID: compexp
    :END:
    | *Step*      | *Wall time* (hours) |
    |-------------+---------------------|
    | MPH         |                   2 |
    | BV Compress |                  84 |
    | BFS         |                  19 |
    | Permute     |                  18 |
    | Transpose   |                  15 |
    |-------------+---------------------|
    | Total       |        138 (6 days) |
 
    - server equipped with 24 CPUs and 750 GB of RAM
    - RAM mostly used as I/O cache for the BFS step
    - /minimum/ memory requirements are close to the RAM needed to load the
      final compressed graph in memory
 
 ** Compression efficiency (space)
    :PROPERTIES:
    :CUSTOM_ID: spaceefficiency
    :END:
 *** 
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.4
     :END:
     | *Forward  graph*  |        |
     | total size        | 91 GiB |
     | bits per edge     |   4.91 |
     | compression ratio |  15.8% |
 *** 
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.4
     :END:
     | *Backward  graph* |        |
     | total size        | 83 GiB |
     | bits per edge     |   4.49 |
     | compression ratio |  14.4% |
 *** Operating cost
     The structure of a full bidirectional archive graph fits in less than 200
     GiB of RAM, for a hardware cost of ~300 USD.
 
 ** Compression efficiency (time)
    :PROPERTIES:
    :CUSTOM_ID: timeefficiency
    :END:
 *** Benchmark --- Full BFS visit (single thread)
     #+BEAMER: \begin{columns}\begin{column}{0.45\textwidth}
     | *Forward  graph* |                |
     |------------------+----------------|
     | wall time        | 1h48m          |
     | throughput       | 1.81 M nodes/s |
     |                  | (553 ns/node)  |
     #+BEAMER: \end{column}\begin{column}{0.45\textwidth}
     | *Backward graph* |                |
     |------------------+----------------|
     | wall time        | 3h17m          |
     | throughput       | 988 M nodes/s  |
     |                  | (1.01 µs/node) |
     #+BEAMER: \end{column}\end{columns}
 *** Benchmark --- Edge lookup
     random sample: 1 B nodes (8.3% of entire graph); then enumeration of all
     successors
     #+BEAMER: \begin{columns}\begin{column}{0.45\textwidth}
     | *Forward  graph* |                |
     |------------------+----------------|
     | visited edges    | 13.6 B         |
     | throughput       | 12.0 M edges/s |
     |                  | (83 ns/edge)   |
     #+BEAMER: \end{column}\begin{column}{0.45\textwidth}
     | *Backward graph* |                |
     |------------------+----------------|
     | visited edges    | 13.6 B         |
     | throughput       | 9.45 M edges/s |
     |                  | (106 ns/edge)  |
     #+BEAMER: \end{column}\end{columns}
 *** 
     :PROPERTIES:
     :BEAMER_env: ignoreheading
     :END:
     Note how edge lookup time is close to DRAM random access time (50-60 ns).
 
 ** Discussion
    :PROPERTIES:
    :CUSTOM_ID: discussion
    :END:
 *** Incrementality
     compression is *not incremental*, due to the use of contiguous integer
     ranges
     - but the graph is append-only, so...
     - ...based on expected graph growth rate it should be possible to
       pre-allocate enough free space in the integer ranges to support
       *amortized incrementality* (future work)
   #+BEAMER: \pause
 *** In-memory v. on-disk
     the compressed in-memory graph structure has *no attributes*
     - usual design is to exploit the 0..N-1 integer ranges to *memory map node
       attributes* to disk for efficient access
     - works well for queries that does graph traversal first and "join" node
       attributes last; ping-pong between the two is expensive
     - edge attributes are more problematic
diff --git a/common/modules/status-extended.org b/common/modules/status-extended.org
index 78440f5..c555370 100644
--- a/common/modules/status-extended.org
+++ b/common/modules/status-extended.org
@@ -1,505 +1,504 @@
 #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
 #+INCLUDE: "prelude.org" :minlevel 1
 
 # not to be included as a whole, just pick individual slides as you see fit
 
 * Status
   :PROPERTIES:
   :CUSTOM_ID: main
   :END:
 ** The people
   :PROPERTIES:
   :CUSTOM_ID: people
   :END:
 *** The core team 						 :B_picblock:
     :PROPERTIES:
     :CUSTOM_ID: core-team-formal
     :BEAMER_env: picblock
     :BEAMER_opt: pic=team,width=.4\linewidth
     :END:
     - Roberto Di Cosmo 
     - Stefano Zacchiroli 
     - Nicolas Dandrimont (Engineer)
     - Antoine Dumont (Engineer)
 #    - and /Jordi, Quentin and Guillaume/
 *** Scientific advisors
     - Serge Abiteboul (French Science Academy)
     - Jean-François Abramatic (former W3C director)
     - Gerard Berry (CNRS Gold Medal, French Science Academy)
     - Julia Lawall (Coccinelle, Linux Kernel, Outreachy)
 ** Archive coverage --- archive.softwareheritage.org
   :PROPERTIES:
   :CUSTOM_ID: archive
   :END:
   #+BEAMER: \vspace{-1mm}
   #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{1\linewidth}]{archive-growth.png}\end{center}
   #+BEAMER: \vspace{-2mm}
 *** 
     #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{1\linewidth}]{archive-coverage.png}\end{center}
     # #+BEAMER: \includegraphics[width=0.12\linewidth]{coverage/github}
     # #+BEAMER: \hfill
     # #+BEAMER: \includegraphics[width=0.13\linewidth]{coverage/gitlab}
     # #+BEAMER: \hfill
     # #+BEAMER: \raisebox{2mm}{\includegraphics[width=0.14\linewidth]{coverage/bitbucket}}
     # #+BEAMER: \hfill
     # #+BEAMER: \includegraphics[width=0.14\linewidth]{coverage/googlecode}
     # #+BEAMER: \hfill
     # #+BEAMER: \includegraphics[width=0.14\linewidth]{coverage/gitorious}
     # #+BEAMER: \hfill
     # #+BEAMER: \includegraphics[width=0.12\linewidth]{coverage/framagit}
     # #+BEAMER: \\
     # #+BEAMER: \includegraphics[width=0.10\linewidth]{coverage/hal}
     # #+BEAMER: \hfill
     # #+BEAMER: \raisebox{2mm}{\includegraphics[width=0.12\linewidth]{coverage/debian}}
     # #+BEAMER: \hfill
     # #+BEAMER: \raisebox{1mm}{\includegraphics[width=0.11\linewidth]{coverage/npm}}
     # #+BEAMER: \hfill
     # #+BEAMER: \includegraphics[width=0.06\linewidth]{coverage/cran}
     # #+BEAMER: \hfill
     # #+BEAMER: \includegraphics[width=0.12\linewidth]{coverage/gnu}
     # #+BEAMER: \hfill
     # #+BEAMER: \includegraphics[width=0.12\linewidth]{coverage/inria}
     # #+BEAMER: \hfill
     # #+BEAMER: \raisebox{-1mm}{\includegraphics[width=0.11\linewidth]{coverage/pypi}}
   #+BEAMER: \pause
 *** 
     - on disk: ~700 TB (uncompressed); as a graph ~20 B nodes, ~200 B edges
     - the largest public source code archive in the world (and growing!)
 ** The structure of the archive					   :noexport:
 *** On-disk storage
     - flat file storage for contents
     - postgres database for the metadata
 *** Data model: /one/ big Merkle DAG, inspired by the git model
     - Origins (= repositories)
     - Occurrences (= branches)
     - Releases (= tags)
     - Revisions (= commits)
     - Directories (= trees)
     - Contents (= blobs)
 ** Archiving goals
    :PROPERTIES:
    :CUSTOM_ID: archivinggoals
    :END:
    Targets: VCS repositories & source code releases (e.g., tarballs, packages)
 *** We DO archive
     - file *content* (= blobs)
     - *revisions* (= commits), with full metadata
     - *releases* (= tags), ditto
     - where (*origin*) & when (*visit*) we found any of the above
     # - time-indexed repo *snapshots* (i.e., we never delete anything)
     … in a VCS-/archive-agnostic *canonical data model*
 *** We DON'T archive (yet)
     # - diffs → derived data from related contents
     - homepages, wikis
     - BTS/issues/code reviews/etc.
     - mailing lists
     Long term vision: play our part in a /"semantic wikipedia of software"/
 ** Architecture
    :PROPERTIES:
    :CUSTOM_ID: architecture
    :END:
 *** Data flow
     :PROPERTIES:
     :CUSTOM_ID: dataflow
     :END:
     # 
     #+BEAMER: \begin{center}\includegraphics[width=\extblockscale{1.4\textwidth}]{swh-dataflow.pdf}\end{center}
 ** Data model							   :noexport:
 *** General schema
     - VCS-independent
     - fully deduplicated
       + files, directories and commits are /shared/
     - biggest git-like /graph/ in the world
 *** 
     \begin{center}
       \url{http://deb.li/swhdm}
     \end{center}
 *** full hash index (sha1, sha256, ...)
     Some funny facts:
      - the GPL2 licence appears under more than 500 names
        + including /aa.css.txt/ and /FullSync.txt/ ~ :-)
 ** Merkle DAG
 *** Merkle structure
     :PROPERTIES:
     :CUSTOM_ID: merkle
     :END:
 **** Merkle trees
      :PROPERTIES:
      :CUSTOM_ID: merkletree
      :END:
    # R. C. Merkle, A digital signature based on a conventional encryption
    # function, Crypto '87
-   #+BEAMER: \vspace{-3mm}
+   #+BEAMER: \vspace{-2mm}
 ***** Merkle tree (R. C. Merkle, CRYPTO 1987)                    :B_picblock:
       :PROPERTIES:
       :BEAMER_opt: pic=merkle, leftpic=true, width=.7\linewidth
       :BEAMER_env: picblock
       :BEAMER_act:
       :END:
       Combination of 
        - tree
        - hash function
    #+BEAMER: \pause
    #+BEAMER: \footnotesize
 ***** Classical cryptographic construction
       - fast, parallel signature of large data structures
       - widely used (e.g., Git, blockchains, IPFS, ...)
       - built-in deduplication
       #+BEAMER: \vspace{-1mm}
 **** Data Model
     :PROPERTIES:
     :CUSTOM_ID: datamodel
     :END:
 ***** The archive: a (giant) Merkle DAG
-     #+BEAMER: \vspace{-3mm}
      #+BEAMER: \centering \includegraphics[width=\textwidth]{swh-data-model-h}
 
 **** The archive in a few pictures
      :PROPERTIES:
      :CUSTOM_ID: merkledemo
      :END:
 ***** A giant (extended) Merkle DAG
        #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_1.pdf}}}
        #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/contents.pdf}}}
        #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_2_contents.pdf}}}
        #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/directories.pdf}}}
        #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_3_directories.pdf}}}
        #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/revisions.pdf}}}
        #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_4_revisions.pdf}}}
        #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/releases.pdf}}}
        #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_5_releases.pdf}}}
   #    #+LATEX: {\colorbox{white}{\includegraphics[width=\extblockscale{.9\linewidth}]{git-merkle/merkle_1.pdf}}}
 *** A revision node
     :PROPERTIES:
     :CUSTOM_ID: merklerevision
     :END:
 **** Example: a Software Heritage revision
 ***** 
       #+BEAMER: \vspace{-.5cm}\centering\includegraphics[width=0.9\textwidth]{git-merkle/revisions}
 ***** 
       Note: most object kinds currently have Git-compatible identifiers
 *** Giant DAG
     :PROPERTIES:
     :CUSTOM_ID: giantdag
     :END:
 **** The archive: a (giant) Merkle DAG
      # Using an empty frame because the image is difficult to read on swh bg.
      # Finding a way to override image bg for just this frame would be better.
 ***** 
       #+BEAMER: \centering \includegraphics[width=\extblockscale{\textwidth}]{git-merkle/merkle_5_releases}
 *** Giant DAG (single slide)
     :PROPERTIES:
     :CUSTOM_ID: giantdag1slide
     :END:
 **** The Software Heritage archive: a gigantic Merkle DAG
      #+LATEX: \centering\forcebeamerstart{}
      #+LATEX: \only<1>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_1}}}
      #+LATEX: \only<2>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/contents}}}
      #+LATEX: \only<3>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_2_contents}}}
      #+LATEX: \only<4>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/directories}}}
      #+LATEX: \only<5>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_3_directories}}}
      #+LATEX: \only<6>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/revisions}}}
      #+LATEX: \only<7>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_4_revisions}}}
      #+LATEX: \only<8>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/releases}}}
      #+LATEX: \only<9>{\colorbox{white}{\includegraphics[width=.75\linewidth]{git-merkle/merkle_5_releases}}}
      #+LATEX: \forcebeamerend{}
 *** Giant DAG (detailed)
     :PROPERTIES:
     :CUSTOM_ID: dagdetail
     :END:
 **** The archive: a (giant) Merkle DAG
      #+BEAMER: \vspace{-3mm}
      #+BEAMER: \centering \includegraphics[width=\textwidth]{swh-merkle-dag-wide}
 *** Giant DAG (detailed)
     :PROPERTIES:
     :CUSTOM_ID: dagdetailsmall
     :END:
 **** The archive: a (giant) Merkle DAG
      #+BEAMER: \centering
      #+BEAMER: \only<1>{\includegraphics[width=\textwidth]{swh-merkle-dag-small-visit1}}
      #+BEAMER: \only<2>{\includegraphics[width=\textwidth]{swh-merkle-dag-small-visit2}}
      #+BEAMER: \only<3>{\includegraphics[width=\textwidth]{swh-merkle-dag-small}}
 ** Technology							   :noexport:
   :PROPERTIES:
   :CUSTOM_ID: technology
   :END:
 *** Software stack
     :PROPERTIES:
     :CUSTOM_ID: swstack
     :END:
 **** 3rd party
      - Debian, Proxmox, ZFS on Linux, Puppet
      - PostgreSQL for metadata storage, with barman & pglogical
      - Celery (RabbitMQ backend) for task scheduling
      - Python3 and psycopg2 for the backend
      - Django, Bootstrap, D3.js for Web stuff
 **** in house
      - /ad hoc/ object storage (to avoid imposing tech to mirrors)
      - data model implementation, listers, loaders, scheduler
      - ~60 Git repositories (~20 Python packages, ~30 Puppet modules)
      - ~60 kSLOC Python / ~12 kSLOC SQL / ~4 kSLOC Puppet
      - licence choice: GPLv3 (backend) / AGPLv3 (frontend)
 *** Deployment architecture
      #+BEAMER: \vspace{1mm}
      #+BEAMER: \centering \includegraphics[height=.9\textheight]{general-architecture}
 *** Hardware stack
     :PROPERTIES:
     :CUSTOM_ID: hwstack
     :END:
 **** in house
      - 3x hypervisors with ~40 VMs
      - 1x high performance database server; read-only replica on a container
      - 2x dedicated storage servers, one of them using ZFS.
      - 3x high density storage array (2 x 60 x 6TB; 1 x 60 x 10TB)
      - 3x nodes for a kafka+elasticsearch cluster
 **** on Azure
      - full object storage mirror
      - full mirror of the database containing the graph
      - workers for content indexing
      - workers for download bundle preparation
 
 *** Software architecture                                          :noexport:
 **** Module dependencies (internal + external) 			 :B_picblock:
      :PROPERTIES:
      :BEAMER_env: picblock
      :BEAMER_opt: pic=swh-modules-deps-all,width=\linewidth
      :END:
 **** 
      let's zoom in: http://deb.li/swhdeps
 ** Technology							   :noexport:
   :PROPERTIES:
   :CUSTOM_ID: technology-short
   :END:
 *** Deployment and resource usage
 **** Software
      - around 60k SLOC of custom Python code, running on Debian Stable
      - PostgreSQL database for the metadata storage
      - Full docker-compose development environment
      - Work in progress: scale-out metadata storage (Cassandra?)
      - Work in progress: mirroring infrastructure (Kafka)
 **** Hardware
      - 12 servers (hypervisors, database, storage, staging and testing infrastructure) / 40 virtual machines with mass storage and a backup server at Inria
      - In-kind sponsorship of cloud and storage resources (Microsoft, University of Bologna)
 ** Software development						   :noexport:
   :PROPERTIES:
   :CUSTOM_ID: development
   :END:
 *** Software development
 **** classic FOSS development
      - language: English
      - development mailing list
        #+BEAMER: \\{\small \url{https://sympa.inria.fr/sympa/info/swh-devel}}
      - IRC
        #+BEAMER: \\
        #swh-devel / FreeNode
      - Forge
        #+BEAMER: \\{\small \url{https://forge.softwareheritage.org}}
      - Git, tasks, code review, etc.
 **** for more information
      #+BEAMER: \scriptsize
      https://www.softwareheritage.org/community/developers/
 ** Roadmap
   :PROPERTIES:
   :CUSTOM_ID: features
   :END:
 *** Features...
      - (done) *lookup* by content hash
      - (done) *browsing*: "wayback machine" for source code (API + UI)
      - (early access) *deposit* of source code bundles directly to the archive
      - (early access) *save code now*, on-demand archive
      - (done) *download*: =wget= / =git clone= from the archive
      - (todo) *provenance* lookup for all archived content
      - (todo) *full-text search* on all archived source code files
 #+BEAMER: \pause
 *** ... and much more than one could possibly imagine
     all the world's software development history at hand's reach!
 ** Web API                                                         :noexport:
   :PROPERTIES:
   :CUSTOM_ID: api
   :END:
 *** Web API
     :PROPERTIES:
     :CUSTOM_ID: apiintro
     :END:
 **** 
      RESTful API to programmatically access the Software Heritage archive \\
      *\url{https://archive.softwareheritage.org/api/}*
 **** Features
      - pointwise *browsing* of the archive
        - … snapshots → revisions → directories → contents …
      - full access to the *metadata* of archived objects
      - *crawling* information
        - /when have you last visited this Git repository I care about?/
        - /where were its branches/tags pointing to at the time?/
      # - derived information about archived contents (WIP)
      #   - MIME type, programming language, license, etc.
 **** Endpoint index
      \url{https://archive.softwareheritage.org/api/1/}
 *** A tour of the Web API --- origins & visits
     :PROPERTIES:
     :CUSTOM_ID: apitourvisits
     :END:
     #+BEAMER: \footnotesize
 #+BEGIN_SRC
 GET https://archive.softwareheritage.org/api/1/origin/ \
       git/url/https://github.com/hylang/hy
 { "id": 1,
   "origin_visits_url": "/api/1/origin/1/visits/",
   "type": "git",
   "url": "https://github.com/hylang/hy"
 }
 #+END_SRC
 #+BEAMER: \vfill
 #+BEGIN_SRC
 GET https://archive.softwareheritage.org/api/1/origin/ \
       1/visits/
 [ ...,
   { "date": "2016-09-14T11:04:26.769266+00:00",
     "origin": 1,
     "origin_visit_url": "/api/1/origin/1/visit/13/",
     "status": "full",
     "visit": 13
   }, ...
 ]
 #+END_SRC
 *** A tour of the Web API --- snapshots
     :PROPERTIES:
     :CUSTOM_ID: apitoursnapshots
     :END:
     #+BEAMER: \footnotesize
 #+BEGIN_SRC
 GET https://archive.softwareheritage.org/api/1/origin/ \
       1/visit/13/
 { ...,
   "occurrences": { ...,
     "refs/heads/master": {
       "target": "b94211251...",
       "target_type": "revision",
       "target_url": "/api/1/revision/b94211251.../"
     },
     "refs/tags/0.10.0": {
       "target": "7045404f3...",
       "target_type": "release",
       "target_url": "/api/1/release/7045404f3.../"
     }, ...
   },
   "origin": 1,
   "origin_url": "/api/1/origin/1/",
   "status": "full",
   "visit": 13
 }
 #+END_SRC
 *** A tour of the Web API --- releases                             :noexport:
     :PROPERTIES:
     :CUSTOM_ID: apitourreleases
     :END:
 #+BEAMER: \footnotesize
 #+BEGIN_SRC
 GET https://archive.softwareheritage.org/api/1/release/ \
       7045404f3d1c54e6473c71bbb716529fbad4be24/
 {
   "author": {
     "email": "tag@pault.ag",
     "fullname": "Paul Tagliamonte <tag@pault.ag>",
     "id": 96,
     "name": "Paul Tagliamonte"
   },
   "date": "2014-04-10T23:01:28-04:00",
   "message": "0.10: The Oh f*ck it's PyCon release",
   "name": "0.10.0",
   "synthetic": false,
   "target": "6072557b6...",
   "target_type": "revision",
   "target_url": "/api/1/revision/6072557b6.../",
   ...
 }
 #+END_SRC
 *** A tour of the Web API --- revisions
     :PROPERTIES:
     :CUSTOM_ID: apitourrevisions
     :END:
 #+BEAMER: \footnotesize
 #+BEGIN_SRC
 GET https://archive.softwareheritage.org/api/1/revision/ \
       6072557b6c10cd9a21145781e26ad1f978ed14b9/
 {
   "author": {
     "email": "tag@pault.ag",
     "fullname": "Paul Tagliamonte <tag@pault.ag>",
     "id": 96,
     "name": "Paul Tagliamonte"
   },
   "committer": { ... },
   "date": "2014-04-10T23:01:11-04:00",
   "committer_date": "2014-04-10T23:01:11-04:00",
   "directory": "2df4cd84e...",
   "directory_url": "/api/1/directory/2df4cd84e.../",
   "history_url": "/api/1/revision/6072557b6.../log/",
   "merge": false,
   "message": "0.10: The Oh f*ck it's PyCon release",
   "parents": [ {
      "id": "10149f66e...",
      "url": "/api/1/revision/10149f66e.../"
   } ],
   ...
 }
 #+END_SRC
 *** A tour of the Web API --- contents
     :PROPERTIES:
     :CUSTOM_ID: apitourcontents
     :END:
 #+BEAMER: \footnotesize
 #+BEGIN_SRC
 GET https://archive.softwareheritage.org/api/1/content/ \
       adc83b19e793491b1c6ea0fd8b46cd9f32e592fc/
 {
   "data_url": "/api/1/content/sha1:adc83b19e.../raw/",
   "filetype_url": "/api/1/content/sha1:.../filetype/",
   "language_url": "/api/1/content/sha1:.../language/",
   "length": 1,
   "license_url": "/api/1/content/sha1:.../license/",
   "sha1": "adc83b19e...",
   "sha1_git": "8b1378917...",
   "sha256": "01ba4719c...",
   "status": "visible"
 }
 #+END_SRC
 #+BEAMER: \normalsize \vfill \pause
 **** Caveats
      - rate limits apply throughout the API
      - raw download available for textual contents
 ** Accessing the archive                                           :noexport:
   :PROPERTIES:
   :CUSTOM_ID: accessing-short
   :END:
 *** Browse                                                    :B_block:BMCOL:
     :PROPERTIES:
     :BEAMER_col: 0.4
     :BEAMER_env: block
     :END:
     #+BEAMER: \begin{center}\includegraphics[width=0.5\textwidth]{archive-browse}\end{center}
     - https://archive.softwareheritage.org/browse
     - way back machine for software source code
   #+BEAMER: \pause
 *** Web API                                                   :B_block:BMCOL:
     :PROPERTIES:
     :BEAMER_col: 0.4
     :BEAMER_env: block
     :END:
     #+BEAMER: \begin{center}\includegraphics[width=0.5\textwidth]{archive-webapi}\end{center}
     - https://archive.softwareheritage.org/api
     - point-wise navigation of the archive as a graph
 ** Some technical challenges
   :PROPERTIES:
   :CUSTOM_ID: techchallenges
   :END:
 *** Expanding the archive
     - discover and classify /all/ the software sources
     - importers for other VCSs (SVN, Hg, ...)
    \hfill /We need your help!/
 *** Staying current 
     get new repositories and commits ASAP\\
    \hfill /We need reliable, standardised event feeds./
 *** Handling the backlog
     ingesting all the pre-existing data\\
     \hfill /Decades of software development are waiting!/
diff --git a/talks-public/2021-03-30-cergy/2020-03-30-cergy.org b/talks-public/2021-03-30-cergy/2020-03-30-cergy.org
new file mode 100644
index 0000000..e5b67bc
--- /dev/null
+++ b/talks-public/2021-03-30-cergy/2020-03-30-cergy.org
@@ -0,0 +1,185 @@
+#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
+#+TITLE: Software Heritage
+#+SUBTITLE: Analyzing the Global Graph of Public Software Development
+#+BEAMER_HEADER: \date[2021-03-30, CYU]{30 Mars 2021\\Lab. ETIS --- CY Cergy Paris Université\\ (online)\\[-2ex]}
+#+AUTHOR: Stefano Zacchiroli
+#+DATE: 30 March 2021
+#+EMAIL: zack@upsilon.cc
+
+#+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1
+#+INCLUDE: "../../common/modules/169.org"
+#+BEAMER_HEADER: \institute[UParis \& Inria]{Université de Paris \& Inria --- {\tt zack@upsilon.cc, @zacchiro}}
+#+BEAMER_HEADER: \author{Stefano Zacchiroli}
+
+# Required by graph-compression.org module
+#+LATEX_HEADER_EXTRA: \usepackage{pdfpages}
+
+# Syntax highlighting setup
+#+LATEX_HEADER_EXTRA: \usepackage{minted}
+#+LaTeX_HEADER_EXTRA: \usemintedstyle{tango}
+#+LaTeX_HEADER_EXTRA: \newminted{sql}{fontsize=\scriptsize}
+#+name: setup-minted
+#+begin_src emacs-lisp :exports results :results silent
+   (setq org-latex-listings 'minted)
+   (setq org-latex-minted-options
+         '(("fontsize" "\\scriptsize")
+           ("linenos" "")))
+   (setq org-latex-to-pdf-process
+         '("pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"
+           "pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"
+           "pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"))
+#+end_src
+# End syntax highlighting setup
+
+* About me                                                  :B_ignoreheading:
+  :PROPERTIES:
+  :BEAMER_env: ignoreheading
+  :END:
+  #+INCLUDE: "this/zack.org" :minlevel 2
+* Software Heritage
+#+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :minlevel 2
+** An international, non profit initiative
+  :PROPERTIES:
+  :CUSTOM_ID: support
+  :END:
+*** Sharing the vision                                              :B_block:
+  :PROPERTIES:
+  :CUSTOM_ID: endorsement
+  :BEAMER_COL: .5
+  :BEAMER_env: block
+  :END:
+   #+LATEX: \begin{center}{\includegraphics[width=\extblockscale{.4\linewidth}]{unesco_logo_en_285}}\end{center}
+   #+LATEX: \vspace{-0.8cm}
+   #+LATEX: \begin{center}\vskip 1em \includegraphics[width=\extblockscale{1.4\linewidth}]{support.pdf}\end{center}
+   #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/testimonials}
+*** Donors, members, sponsors                                       :B_block:
+    :PROPERTIES:
+    :CUSTOM_ID: sponsors
+    :BEAMER_COL: .5
+    :BEAMER_env: block
+    :END:
+   #+LATEX: \begin{center}\includegraphics[width=\extblockscale{.4\linewidth}]{inria-logo-new}\end{center}
+   #+LATEX: \begin{center}
+   #+LATEX: \colorbox{white}{\includegraphics[width=\extblockscale{1.4\linewidth}]{sponsors.pdf}}
+   #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/sponsors}
+   #+LATEX: \end{center}
+** Status                                                   :B_ignoreheading:
+   :PROPERTIES:
+   :BEAMER_env: ignoreheading
+   :END:
+#+INCLUDE: "../../common/modules/status-extended.org::#archivinggoals" :minlevel 2
+#+INCLUDE: "../../common/modules/status-extended.org::#architecture" :minlevel 2 :only-contents t
+#+INCLUDE: "../../common/modules/status-extended.org::#merkletree" :minlevel 2
+#+INCLUDE: "../../common/modules/status-extended.org::#datamodel" :minlevel 2 :only-contents t
+#+INCLUDE: "../../common/modules/status-extended.org::#dagdetailsmall" :minlevel 2 :only-contents t
+#+INCLUDE: "../../common/modules/status-extended.org::#archive" :minlevel 2
+* Querying the archive
+** Exploitation
+   #+BEAMER: \LARGE \centering
+   How do you query the Software Heritage archive?
+   #+BEAMER: \Large \\
+   (on a budget)
+
+** Use cases --- product needs
+   e.g., for https://archive.softwareheritage.org
+*** Browsing
+    - =ls=
+    - =git log= (Linux kernel: 800K+ commits)
+*** Wayback machine
+    - tarball
+    - =git bundle= (Linux kernel: 7M+ nodes)
+*** Provenance tracking
+    - commit provenance (one/all contexts) (note: requires backtracking)
+    - origin provenance (one/all contexts)
+*** Note                                                    :B_ignoreheading:
+    :PROPERTIES:
+    :BEAMER_env: ignoreheading
+    :END:
+    Note: we therefore need both the direct Merkle DAG graph and its
+    *transposed*
+
+** Use cases --- research questions
+*** For the sake of it
+    - local graph topology
+    - connected component size
+      - enabling question to identify the best approach (e.g., scale-up
+        v. scale-out) to conduct large-scale analyses
+    - any other emerging property
+*** Software Engineering topics
+    - software provenance analysis at this scale is pretty much unexplored yet
+    - industry frontier: increase granularity down to the individual line of
+      code
+    - replicate at this scale (famous) studies that have generally been
+      conducted on (much) smaller version control system samples to
+      confirm/refute their findings
+    - ...
+
+ #+INCLUDE: "../../common/modules/dataset.org::#main" :minlevel 2 :only-contents t
+ #+INCLUDE: "../../common/modules/dataset.org::#morequery" :minlevel 2 :only-contents t
+** Sample study --- 50 years of gender differnces in code contributions
+   - start from the Software Heritage graph dataset
+   - detect gender of author names using standard tooling (=gender-guesser=)
+   # - caveat: how to identify /first/ name?
+   - analyze both authors and commits over time, bucketing by commit timestamp
+   #+BEAMER: \begin{center} \includegraphics[height=0.45\textheight]{this/commits-pie.pdf} \includegraphics[height=0.45\textheight]{this/ratio-female-authors.pdf} \\ \scriptsize total commits by author gender (left), ratio of active female commiters over time (right)\end{center}
+*** 
+   #+BEGIN_EXPORT latex
+   \vspace{-1mm}
+   \begin{thebibliography}{} \footnotesize
+   \bibitem{Zacchiroli2021} Stefano Zacchiroli
+   \newblock Gender Differences in Public Code Contributions: a 50-year Perspective
+   \newblock IEEE Software, March/April 2021
+   \end{thebibliography}
+   #+END_EXPORT
+
+** Discussion
+   - one /can/ query such a corpus SQL-style
+   - but relational representation shows its limits at this scale
+     - ...at least as deployed on commercial SQL offerings such as Athena
+   - note: (naive) sharding is ineffective, due to the pseudo-random
+     distribution of node identifiers
+   - experiments with Google BigQuery are ongoing
+     - (we broke it at the first import attempt..., due to very large arrays in
+       directory entry tables)
+
+* Graph compression
+ #+INCLUDE: "../../common/modules/graph-compression.org::#main" :minlevel 2 :only-contents t
+
+* Conclusion
+  #+INCLUDE: "this/roadmap.org" :minlevel 2
+** Wrapping up
+  #+latex: \vspace{-2mm}
+*** 
+    - Software Heritage archives all public source code as a huge Merkle DAG
+    - Querying and analyzing it at scale (20/200 B nodes/edges) is an open
+      problem
+    - Gold mine of research leads in sw. eng., complex networks, big code,
+      reproducibility
+  #+latex: \vspace{-2mm}
+*** References
+  #+latex: \vspace{-1mm}
+  #+BEGIN_EXPORT latex
+  \begin{thebibliography}{}
+  \scriptsize
+
+  \bibitem{Abramatic2018} Jean-François Abramatic, Roberto Di Cosmo, Stefano Zacchiroli
+  \newblock Building the Universal Archive of Source Code
+  \newblock Communications of the ACM, October 2018
+
+  \bibitem{Pietri2019} Antoine Pietri, Diomidis Spinellis, Stefano Zacchiroli
+  \newblock The Software Heritage graph dataset: public software development under one roof
+  \newblock MSR 2019: 16th Intl. Conf. on Mining Software Repositories. IEEE
+
+  \bibitem{Boldi2020} Paolo Boldi, Antoine Pietri, Sebastiano Vigna, Stefano Zacchiroli
+  \newblock Ultra-Large-Scale Repository Analysis via Graph Compression
+  \newblock SANER 2020, 27th Intl. Conf. on Software Analysis, Evolution and Reengineering. IEEE
+
+  \end{thebibliography}
+  #+END_EXPORT
+*** Contacts
+    Stefano Zacchiroli / [[https://upsilon.cc/~zack/][upsilon.cc]] / [[mailto:zack@upsilon.cc][zack@upsilon.cc]] / [[https://twitter.com/zacchiro][@zacchiro]]
+
+* Appendix                                                       :B_appendix:
+  :PROPERTIES:
+  :BEAMER_env: appendix
+  :END:
diff --git a/talks-public/2021-03-30-cergy/Makefile b/talks-public/2021-03-30-cergy/Makefile
new file mode 100644
index 0000000..68fbee7
--- /dev/null
+++ b/talks-public/2021-03-30-cergy/Makefile
@@ -0,0 +1 @@
+include ../Makefile.slides
diff --git a/talks-public/2021-03-30-cergy/this/commits-pie.pdf b/talks-public/2021-03-30-cergy/this/commits-pie.pdf
new file mode 100644
index 0000000..8b0e856
Binary files /dev/null and b/talks-public/2021-03-30-cergy/this/commits-pie.pdf differ
diff --git a/talks-public/2021-03-30-cergy/this/ratio-female-authors.pdf b/talks-public/2021-03-30-cergy/this/ratio-female-authors.pdf
new file mode 100644
index 0000000..bcf325a
Binary files /dev/null and b/talks-public/2021-03-30-cergy/this/ratio-female-authors.pdf differ
diff --git a/talks-public/2021-03-30-cergy/this/ratio-female-commits.pdf b/talks-public/2021-03-30-cergy/this/ratio-female-commits.pdf
new file mode 100644
index 0000000..baab5bf
Binary files /dev/null and b/talks-public/2021-03-30-cergy/this/ratio-female-commits.pdf differ
diff --git a/talks-public/2021-03-30-cergy/this/roadmap.org b/talks-public/2021-03-30-cergy/this/roadmap.org
new file mode 100644
index 0000000..0559610
--- /dev/null
+++ b/talks-public/2021-03-30-cergy/this/roadmap.org
@@ -0,0 +1,77 @@
+** A (brief) research roadmap --- 1
+*** Graph compression
+    - incremental, amortized compression → ongoing UniMi collaboration
+    - graph query languages on top of the compressed representation → LIRIS
+      collaboration (early stages)
+  #+BEAMER: \vfill \pause
+*** Complex networks
+    - local topology of the global VCS graph
+    - emergent properties (the "classics": scale-free, small world, etc.)
+    - dynamic modeling of graph evolution over time → collab. with physics @
+      UParis
+*** 
+    #+BEGIN_EXPORT latex
+    \vspace{-2mm}
+    \begin{thebibliography}{}
+    \small
+    \bibitem{Pietri2019}  Antoine Pietri, Guillaume Rousseau, Stefano Zacchiroli\newblock
+    Determining the Intrinsic Structure of Public Software Development History\newblock
+    MSR 2020: 17th Intl. Conf. on Mining Software Repositories. IEEE\newblock
+    registered study protocol
+    \end{thebibliography}
+    #+END_EXPORT
+** A (brief) research roadmap --- 2
+  #+BEAMER: \vspace{-2mm}
+*** Very-large-scale "big code"
+    - /big code/ = apply ML/DL to source code and other development byproducts
+    - current results are language-specific and limited in scale; even the
+      simplest problems become challenging at this scale and heterogeneity
+      - lead: scalable language detection → collaboration with UniBo
+      - lead: project classification → collaboration with CELI
+    - the VCS graph remains largely unexplored in big code
+      - lead: apply GNN (Graph Neural Network) for VCS node classification
+    #+BEAMER: \vspace{-1mm}
+***                                                         :B_ignoreheading:
+    :PROPERTIES:
+    :BEAMER_env: ignoreheading
+    :END:
+    #+BEAMER: \vspace{-1mm} \pause
+*** Very-large-scale source code indexing
+    - common AST-based approaches for code indexing are not viable here due do
+      maximum heterogeneity
+    - alternative: treat code as text and full-text index it
+      - previous exp.: 3-gram based indexing in Debsources, supporting regexp
+        matching
+    - goal: find a sweet spot between the two
+    #+BEAMER: \vspace{-1mm}
+** Recent results on (visual) programming language identification
+   #+BEGIN_EXPORT latex
+   \begin{thebibliography}{}
+   \small
+
+   \bibitem{DelBonifro2021a} Francesca Del Bonifro, Maurizio Gabbrielli, Stefano Zacchiroli
+   \newblock Content-Based Textual File Type Detection at Scale
+   \newblock ICMLC 2021: The 13th International Conference on Machine Learning and Computing. ACM, 2021 
+
+   \bibitem{DelBonifro2021b} Francesca Del Bonifro, Maurizio Gabbrielli, Antonio Lategano, Stefano Zacchiroli
+   \newblock Image-based many-language programming language identification
+   \newblock (under review)
+
+   \end{thebibliography}
+   #+END_EXPORT
+** A (brief) research roadmap --- 3
+*** Very-large-scale reproducibility in software engineering
+    - most results in empirical software engineering are determined on corpuses
+      significantly smaller than Software Heritage
+
+      → external validity threat; do results generalize to the full body of
+      public code?
+    - 2-year research plan
+      1) identify impactful sw. eng. studies that can be reproduced using
+         Software Heritage
+	 - selected topics (tentative): code reuse, code quality, project
+           classification, technical debt, developer productivity
+      2) reproduce selected studies one-by-one, at Software Heritage scale
+      3) document findings, e.g., via RENE (REproducibility Studies and
+         NEgative Results) scientific initiatives
+    - collaboration with Microsoft Research (early stages)
diff --git a/talks-public/2021-03-30-cergy/this/zack.org b/talks-public/2021-03-30-cergy/this/zack.org
new file mode 100644
index 0000000..25b290b
--- /dev/null
+++ b/talks-public/2021-03-30-cergy/this/zack.org
@@ -0,0 +1,25 @@
+
+** About me
+   - Associate Professor (/Maître de conférences/), Université de Paris
+   - on leave (/délégation/) at Inria
+   - Free/Open Source Software activist (20+ years)
+   - Debian Developer & Former 3x Debian Project Leader
+   - Former Open Source Initiative (OSI) director
+   - Software Heritage co-founder & CTO
+
+  #+BEAMER: \vfill \pause
+
+*** Research path
+    1) Formal methods for ensuring the quality of software upgrades (Mancoosi
+       project)
+
+       Industry adoption: Debian, OPAM, Eclipse P2
+    2) Formal methods for automated upgrade planning in the cloud (Aeolus
+       project)
+
+       Industry adoption: Mandriva, Kyriba
+    3) Large-scale software evolution analysis (Debsources platform)
+    4) Very-large-scale source code analysis and preservation (Software
+       Heritage)
+
+       → this talk