diff --git a/talks-public/2021-05-19-telecom-paris/2021-05-19-telecom-paris.org b/talks-public/2021-05-19-telecom-paris/2021-05-19-telecom-paris.org
index 834f69d..e0e6181 100644
--- a/talks-public/2021-05-19-telecom-paris/2021-05-19-telecom-paris.org
+++ b/talks-public/2021-05-19-telecom-paris/2021-05-19-telecom-paris.org
@@ -1,188 +1,191 @@
 #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
 #+TITLE: Software Heritage
 #+SUBTITLE: Analyzing the Global Graph of Public Software Development
 #+BEAMER_HEADER: \date[2021-05-19, ACES]{19 May 2021\\Team ACES --- Télécom Paris\\ (online)\\[-2ex]}
 #+AUTHOR: Stefano Zacchiroli
 #+DATE: 19 May 2021
 #+EMAIL: zack@upsilon.cc
 
 #+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1
 #+INCLUDE: "../../common/modules/169.org"
 #+BEAMER_HEADER: \institute[UParis \& Inria]{Université de Paris \& Inria --- {\tt zack@upsilon.cc, @zacchiro}}
 #+BEAMER_HEADER: \author{Stefano Zacchiroli}
 
 # Syntax highlighting setup
 #+LATEX_HEADER_EXTRA: \usepackage{minted}
 #+LaTeX_HEADER_EXTRA: \usemintedstyle{tango}
 #+LaTeX_HEADER_EXTRA: \newminted{sql}{fontsize=\scriptsize}
 #+name: setup-minted
 #+begin_src emacs-lisp :exports results :results silent
    (setq org-latex-listings 'minted)
    (setq org-latex-minted-options
          '(("fontsize" "\\scriptsize")
            ("linenos" "")))
    (setq org-latex-to-pdf-process
          '("pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"
            "pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"
            "pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"))
 #+end_src
 # End syntax highlighting setup
 
 * About me                                                  :B_ignoreheading:
   :PROPERTIES:
   :BEAMER_env: ignoreheading
   :END:
   #+INCLUDE: "this/zack.org" :minlevel 2
 * Software Heritage
 #+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :minlevel 2
 ** An international, non profit initiative
   :PROPERTIES:
   :CUSTOM_ID: support
   :END:
 *** Sharing the vision                                              :B_block:
   :PROPERTIES:
   :CUSTOM_ID: endorsement
   :BEAMER_COL: .5
   :BEAMER_env: block
   :END:
    #+LATEX: \begin{center}{\includegraphics[width=\extblockscale{.4\linewidth}]{unesco_logo_en_285}}\end{center}
    #+LATEX: \vspace{-0.8cm}
    #+LATEX: \begin{center}\vskip 1em \includegraphics[width=\extblockscale{1.4\linewidth}]{support.pdf}\end{center}
    #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/testimonials}
 *** Donors, members, sponsors                                       :B_block:
     :PROPERTIES:
     :CUSTOM_ID: sponsors
     :BEAMER_COL: .5
     :BEAMER_env: block
     :END:
    #+LATEX: \begin{center}\includegraphics[width=\extblockscale{.4\linewidth}]{inria-logo-new}\end{center}
    #+LATEX: \begin{center}
    #+LATEX: \colorbox{white}{\includegraphics[width=\extblockscale{1.4\linewidth}]{sponsors.pdf}}
    #+latex:\mbox{}~~~~~~~\tiny\url{www.softwareheritage.org/support/sponsors}
    #+LATEX: \end{center}
 ** Status                                                   :B_ignoreheading:
    :PROPERTIES:
    :BEAMER_env: ignoreheading
    :END:
 #+INCLUDE: "../../common/modules/status-extended.org::#archivinggoals" :minlevel 2
 #+INCLUDE: "../../common/modules/status-extended.org::#architecture" :minlevel 2 :only-contents t
 #+INCLUDE: "../../common/modules/status-extended.org::#merkletree" :minlevel 2
 #+INCLUDE: "../../common/modules/data-model.org::#merklestruct" :minlevel 2
 #+INCLUDE: "../../common/modules/status-extended.org::#dagdetailsmall" :minlevel 2 :only-contents t
 #+INCLUDE: "../../common/modules/status-extended.org::#archive" :minlevel 2
 * Querying the archive
 ** Use cases --- product needs
    e.g., for https://archive.softwareheritage.org
 *** Browsing
     - =ls=
     - =git log= (Linux kernel: 800K+ commits)
 *** Wayback machine
     - tarball
     - =git bundle= (Linux kernel: 7M+ nodes)
 *** Provenance tracking
     - commit provenance (one/all contexts) \hfill note: requires backtracking
     - origin provenance (one/all contexts)
 *** Note                                                    :B_ignoreheading:
     :PROPERTIES:
     :BEAMER_env: ignoreheading
     :END:
     Note: we therefore need both the direct Merkle DAG and its
     *transpose*
 
 ** Use cases --- research questions
 *** For the sake of it
     - local graph topology
     - connected component size
       - enabling question to identify the best approach (e.g., scale-up
         v. scale-out) to conduct large-scale analyses
     - any other emerging property
 *** Software Engineering topics
      - software provenance analysis at this scale is still largely unexplored
     - industry frontier: increase granularity down to the individual line of
       code
     - replicate at this scale (famous) studies that have generally been
       conducted on (much) smaller version control system samples to
       confirm/refute their findings
     - ...
 ** Exploitation
    #+BEAMER: \LARGE \centering
    How do you query the Software Heritage archive?
    #+BEAMER: \Large \\
    (on a budget)
 
 ** The Software Heritage Graph Dataset                      :B_ignoreheading:
    :PROPERTIES:
    :BEAMER_env: ignoreheading
    :END:
    #+INCLUDE: "../../common/modules/dataset.org::#main" :minlevel 2 :only-contents t
    #+INCLUDE: "../../common/modules/dataset.org::#morequery" :minlevel 2 :only-contents t
 
 ** Sample study --- 50 years of gender differences in code contributions
    - start from the Software Heritage graph dataset
    - detect gender of author names using standard tooling (=gender-guesser=)
    # - caveat: how to identify /first/ name?
    - analyze both authors and commits over time, bucketing by commit timestamp
    #+BEAMER: \begin{center} \includegraphics[height=0.45\textheight]{this/commits-pie.pdf} \includegraphics[height=0.45\textheight]{this/ratio-female-authors.pdf} \\ \scriptsize total commits by author gender (left), ratio of active female committers over time (right)\end{center}
 *** 
    #+BEGIN_EXPORT latex
    \vspace{-1mm}
    \begin{thebibliography}{} \footnotesize
    \bibitem{Zacchiroli2021} Stefano Zacchiroli
    \newblock Gender Differences in Public Code Contributions: a 50-year Perspective
    \newblock IEEE Softw. 38(2): 45-50 (2021)
    \end{thebibliography}
    #+END_EXPORT
 
 ** Discussion
    - one /can/ query such a corpus SQL-style
    - but relational representation shows its limits at this scale
      - ...at least as deployed on commercial SQL offerings such as Athena
    - note: (naive) sharding is ineffective, due to the pseudo-random
      distribution of node identifiers
    - experiments with Google BigQuery are ongoing
      - (we broke it at the first import attempt..., due to very large arrays in
        directory entry tables)
 
 * Graph compression
  #+INCLUDE: "../../common/modules/graph-compression.org::#main" :minlevel 2 :only-contents t
 
 * Security synergies and outlook
   #+INCLUDE: "this/security.org" :minlevel 2
   #+INCLUDE: "this/roadmap.org" :minlevel 2
 
 ** Wrapping up
   #+latex: \vspace{-1mm}
 *** 
     - Software Heritage archives all public source code as a huge Merkle DAG
     - Querying and analyzing it at scale (20/200 B nodes/edges) is an open
       problem
     - Gold mine of research leads in sw. eng., big code, reproducibility,
       security
   #+latex: \vspace{-2mm}
 *** References (selected)
   #+latex: \vspace{-1mm}
   #+BEGIN_EXPORT latex
   \begin{thebibliography}{}
   \scriptsize
 
   \bibitem{Abramatic2018} Jean-François Abramatic, Roberto Di Cosmo, Stefano Zacchiroli
   \newblock Building the Universal Archive of Source Code
   \newblock Communications of the ACM, October 2018
 
   \bibitem{Pietri2019} Antoine Pietri, Diomidis Spinellis, Stefano Zacchiroli
   \newblock The Software Heritage graph dataset: public software development under one roof
   \newblock MSR 2019: 16th Intl. Conf. on Mining Software Repositories. IEEE
 
   \bibitem{Boldi2020} Paolo Boldi, Antoine Pietri, Sebastiano Vigna, Stefano Zacchiroli
   \newblock Ultra-Large-Scale Repository Analysis via Graph Compression
   \newblock SANER 2020, 27th Intl. Conf. on Software Analysis, Evolution and Reengineering. IEEE
 
   \end{thebibliography}
   #+END_EXPORT
 *** Contacts
     Stefano Zacchiroli / [[https://upsilon.cc/~zack/][upsilon.cc]] / [[mailto:zack@upsilon.cc][zack@upsilon.cc]] / [[https://twitter.com/zacchiro][@zacchiro]]
 
 * Appendix                                                       :B_appendix:
   :PROPERTIES:
   :BEAMER_env: appendix
   :END:
+
+** Meet the Software Heritage Identifiers (SWHIDs) \hfill [[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][(full spec)]]
+  #+INCLUDE: "../../common/modules/swhid.org::#oneslide" :only-contents t
diff --git a/talks-public/2021-05-19-telecom-paris/this/security.org b/talks-public/2021-05-19-telecom-paris/this/security.org
index 370729c..d394279 100644
--- a/talks-public/2021-05-19-telecom-paris/this/security.org
+++ b/talks-public/2021-05-19-telecom-paris/this/security.org
@@ -1,120 +1,122 @@
 ** Securing the open source supply chain
 
    *Software supply chain attacks* are becoming more and more frequent and
    rising in profile. → Cf. /SolarWinds attacks/ (2021), breaching several US
    govt. branches
 
 *** Definition --- Reproducible Builds (R-B)
     The build process of a software product is *reproducible* if, after
     designating a specific version of its source code and all of its build
     dependencies, every build produces *bit-for-bit identical artifacts*, no
     matter the environment in which the build is performed.
 
 *** 
     - R-B helps *increase trust in binary executables* built from trusted
       (open source) code by untrusted 3rd-party software vendors (e.g., app
       stores, distros)
 
     - The *[[https://reproducible-builds.org/][reproducible-builds.org project]]* has popularized the notion, is
       backed by major open source industry players, and has made large open
       source software collections reproducible (e.g., 95% of Debian packages)
 
 *** References                                              :B_ignoreheading:
     :PROPERTIES:
     :BEAMER_env: ignoreheading
     :END:
     #+BEGIN_EXPORT latex
     \begin{thebibliography}{}
     \footnotesize
     \bibitem{Lamb2021RB} Chris Lamb, Stefano Zacchiroli 
     \newblock Reproducible Builds: Increasing the Integrity of Software Supply Chains
     \newblock IEEE Software 2021 (to appear, DOI 10.1109/MS.2021.3073045)
     \end{thebibliography}
     #+END_EXPORT
 
 ** Securing the open source supply chain (cont.)
    #+BEAMER: \begin{center}\includegraphics[width=\textwidth]{this/r-b-approach}\end{center}
 
 ** Securing the open source supply chain (cont.)
 *** 
     - Software Heritage provides key ingredients for R-B pipelines: on-demand
       archival (e.g., of VCS commits referenced by build recipes) + long-term
       availability
     - We have implemented this by integrating the GNU Guix package manager with
       Software Heritage
 
 ***                                                         :B_ignoreheading:
     :PROPERTIES:
     :BEAMER_env: ignoreheading
     :END:
     #+BEAMER: \begin{center}\hfill\includegraphics[height=0.4\textheight]{swh-guix-1}\hfill\includegraphics[height=0.4\textheight]{swh-guix-2}\hfill~\end{center}
     #+BEAMER: \scriptsize
     - \url{https://www.softwareheritage.org/2019/04/18/software-heritage-and-gnu-guix-join-forces-to-enable-long-term-reproducibility/}
     - \url{https://guix.gnu.org/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/}
 
 ** Tracking of vulnerable source code artifacts
 
 *** 
     Software Heritage provides a unique observatory on the (best approximation
     of) the entire /Software Commons/, i.e., all software published in source
     code form
 
 *** Software provenance tracking at the scale of the world
     - by following the /transposed/ Software Heritage graph we can locate *all
       known public occurrences* of source code artifacts (individual source
       files, entire source trees, commits) in other commits or repositories
 
     - we have developed two approaches to do that:
 
       1. database-based (Rousseau et al. EMSE 2020): incremental, answers a
          fixed set of queries, requires significant disk space
 
-      2. compressed-graph-base (Boldi et al. SANER 2020): non-incremental,
+      2. compressed-graph-based (Boldi et al. SANER 2020): non-incremental,
          flexible graph-based querying, fits in RAM
 
     - current applications: "intellectual property"/prior art, open source
-      license compliance, software composition analysis (SCA)
+      license compliance, software composition analysis (SCA) → collab. with
+      CAST
 
 ** Tracking of vulnerable source code artifacts (cont.)
 
 *** Adding in-memory commit timestamps (experimental)
     Idea: in-memory timestamp array (microsecond precision, 8 bytes each),
     indexed by revision node id. This makes it possible to exploit timestamp
     information efficiently during graph visits.
 
 *** Finding the /earliest/ commit referencing a source file/dir
     Early experiment: finding the earliest revision containing a given file
     using in-memory commit timestamps, on 10 M randomly selected blobs.
 
     Mean lookup time: 4.1 ms (avg on 95% percentile: 2.2 ms)
 
 *** Tracking vulnerable source code files/trees
     Given a source file/tree affected by a known vulnerability (e.g.,
     identified by a CVE) we can efficiently identify /all/ commits (and
     repositories, extending the traversals) that reference it, triggering
     further inspection. Furthermore, we can efficiently select which commits to
-    filter out during visits, based on commit timestamps of other attributes
-    that can be made to fit in memory (or memory mapped to disk).
+    filter out during visits (e.g., "recent" ones, only in selected repos,
+    etc.), based on timestamps or other attributes (that fit in memory or are
+    mmap()-ed to disk).
 
 ** Tracking of vulnerable source code artifacts (cont.)
 
 *** v. State-of-the-art industry offerings
     Similar to what GitHub/GitLab offer as a service, but:
 
     - without having to rely on repository scanning, because the "big picture"
       is already present in the Software Heritage archive by design
 
     - independent from the development platform vendor (e.g., a "vulnerable
       file" primarily hosted on GitHub can be spotted in GitLab repositories
       and vice-versa)
 
     - complementary and synergistic with analyses of vulnerable dependency
       information (which are also available in Software Heritage via metadata
       mining)
 
 *** Caveats
 
     - current granularity stops at the file level and traceability breaks with
       even just whitespace changes. Increasing tracking granularity to the
       snippet/line of code level is possible, but not yet tested at this scale
       (cf. research roadmap)