diff --git a/common/images/commit-time-distro.png b/common/images/commit-time-distro.png
new file mode 100644
index 0000000..3f7895a
Binary files /dev/null and b/common/images/commit-time-distro.png differ
diff --git a/common/modules/graph-compression.org b/common/modules/graph-compression.org
index e42d9ae..4a3a687 100644
--- a/common/modules/graph-compression.org
+++ b/common/modules/graph-compression.org
@@ -1,345 +1,352 @@
 #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
 #+INCLUDE: "prelude.org" :minlevel 1
 
 # Depends: \usepackage{pdfpages}
 
 * Graph Compression
   :PROPERTIES:
   :CUSTOM_ID: main
   :END:
 ** Graph compression on the Software Heritage archive              :noexport:
    :PROPERTIES:
    :CUSTOM_ID: oneslide
    :END:
 *** 
      #+BEGIN_EXPORT latex
      \vspace{-3mm}
      \footnotesize
      \begin{thebibliography}{Foo Bar, 1969}
      \bibitem{Boldi2020} Paolo Boldi, Antoine Pietri, Sebastiano Vigna, Stefano Zacchiroli
      \newblock Ultra-Large-Scale Repository Analysis via Graph Compression
      \newblock SANER 2020, 27th Intl. Conf. on Software Analysis, Evolution and Reengineering. IEEE
      \end{thebibliography}
      #+END_EXPORT
   #+BEAMER: \vspace{-1mm}
 *** Research question
     Is it possible to efficiently perform software development history analyses
     at the scale of Software Heritage archive on a single, relatively cheap
     machine?
   #+BEAMER: \vspace{-1mm}
 *** Idea
     Apply state-of-the-art graph compression techniques from the field of Web
     graph / social network analysis.
   #+BEAMER: \vspace{-1mm}
 *** Results
     The entire archive graph (25 B nodes, 350 B edges) can be loaded in 200 GiB
-    and then traversed at the cost of tens of nanoseconds per edge (= a few
-    hours for a complete single-thread traversal of the archive).
+    and then traversed at the cost of tens of ns per edge (= a few hours for a
+    full single-thread visit).
+*** 
+   :PROPERTIES:
+   :BEAMER_env: ignoreheading
+   :END:
+   Java and gRPC APIs available:
+   #+BEAMER: \small
+   [[https://docs.softwareheritage.org/devel/swh-graph/grpc-api.html][docs.softwareheritage.org/devel/swh-graph/grpc-api.html]]
 
 ** Graph compression on the Software Heritage archive
    :PROPERTIES:
    :CUSTOM_ID: intro
    :END:
 *** 
      #+BEGIN_EXPORT latex
      \vspace{-3mm}
      \begin{thebibliography}{Foo Bar, 1969}
      \bibitem{Boldi2020} Paolo Boldi, Antoine Pietri, Sebastiano Vigna, Stefano Zacchiroli
      \newblock Ultra-Large-Scale Repository Analysis via Graph Compression
      \newblock SANER 2020, 27th Intl. Conf. on Software Analysis, Evolution and Reengineering. IEEE
      \end{thebibliography}
      #+END_EXPORT
 *** Research question
     Is it possible to efficiently perform software development history analyses
     at ultra large scale (= the scale of Software Heritage archive or more), on
     a single, relatively cheap machine?
 *** Idea
     Apply state-of-the-art graph compression techniques from the field of Web
     graph / social network analysis.
 
 ** Background --- (Web) graph compression
    :PROPERTIES:
    :CUSTOM_ID: background1
    :END:
 *** The graph of the Web
     :PROPERTIES:
     :BEAMER_env: definition
     :END:
     Directed graph that has Web pages as nodes and hyperlinks between them as
     edges.
 *** Properties (1)
     - **Locality:** pages link to pages whose URLs are lexicographically
       similar. URLs share long common prefixes.
 
     → use *D-gap compression*
 *** Adjacency lists
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.51
     :END:
     #+BEAMER: \scriptsize
     | *Node* | *Outdegree* | *Successors*                         |
     |--------+-------------+--------------------------------------|
     |    ... |         ... | ...                                  |
     |     15 |          11 | 13,15,16,17,18,19,23,24,203,315,1034 |
     |     16 |          10 | 15,16,17,22,23,24,315,316,317,3041   |
     |     17 |           0 |                                      |
     |     18 |           5 | 13,15,16,17,50                       |
     |    ... |         ... | ...                                  |
 *** D-gapped adjacency lists
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.48
     :END:
     #+BEAMER: \scriptsize
     | *Node* | *Outdegree* | *Successors*                |
     |--------+-------------+-----------------------------|
     |    ... |         ... | ...                         |
     |     15 |          11 | 3,1,0,0,0,0,3,0,178,111,718 |
     |     16 |          10 | 1,0,0,4,0,0,290,0,0,2723    |
     |     17 |           0 |                             |
     |     18 |           5 | 9,1,0,0,32                  |
     |    ... |         ... | ...                         |
 
 ** Background --- (Web) graph compression (cont.)
    :PROPERTIES:
    :CUSTOM_ID: background2
    :END:
 *** The graph of the Web
     :PROPERTIES:
     :BEAMER_env: definition
     :END:
     Directed graph that has Web pages as nodes and hyperlinks between them as
     edges.
 *** Properties (2)
     - **Similarity:** pages that are close together in lexicographic order tend
       to have many common successors.
     → use *reference compression*
 *** Adjacency lists
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.47
     :END:
     #+BEAMER: \scriptsize
     | *Node* | *Outd.* | *Successors*                         |
     |--------+---------+--------------------------------------|
     |    ... |     ... | ...                                  |
     |     15 |      11 | 13,15,16,17,18,19,23,24,203,315,1034 |
     |     16 |      10 | 15,16,17,22,23,24,315,316,317,3041   |
     |     17 |       0 |                                      |
     |     18 |       5 | 13,15,16,17,50                       |
     |    ... |     ... | ...                                  |
 *** Copy lists
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.60
     :END:
     #+BEAMER: \scriptsize
     | *Node* | *Ref.* | *Copy list* | *Extra nodes*                        |
     |--------+--------+-------------+--------------------------------------|
     |    ... |    ... | ...         | ...                                  |
     |     15 |      0 |             | 13,15,16,17,18,19,23,24,203,315,1034 |
     |     16 |      1 | 01110011010 | 22,316,317,3041                      |
     |     17 |        |             |                                      |
     |     18 |      3 | 11110000000 | 50                                   |
     |    ... |    ... | ...         |                                      |
 
 ** Background --- Web graph compression (OLD)                      :noexport:
    Borrowing (great!) slides from:
    #+BEGIN_EXPORT latex
    \begin{thebibliography}{}
    \bibitem{Pibiri2018} Giulio Ermanno Pibiri
    \newblock Effective Web Graph Representations, 2018
    \newblock \url{http://pages.di.unipi.it/pibiri/slides/webgraphs\_compression.pdf}
    \end{thebibliography}
    #+END_EXPORT
 
 ** Background -- Web graph compression (imported slides) (OLD)     :noexport:
    :PROPERTIES:
    :BEAMER_env: ignoreheading
    :END:
    #+BEGIN_EXPORT latex
    {
    \setbeamercolor{background canvas}{bg=}
    \setbeamertemplate{background}{}
    \includepdf[pages={4,11,12,13}]{webgraphs_compression.pdf}
    \addtocounter{framenumber}{4}
    }
    #+END_EXPORT
 
 ** Corpus
    :PROPERTIES:
    :CUSTOM_ID: corpus
    :END:
 *** Nodes
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.5
     :END:
     #+BEAMER: \small
     | *Node type* | *N. of nodes* |
     |-------------+---------------|
     | origins     | 88 M          |
     | snapshots   | 57 M          |
     | releases    | 9.9 M         |
     | revisions   | 1.1 B         |
     | directories | 4.9 B         |
     | contents    | 5.5 B         |
     |-------------+---------------|
     | Total nodes | 12 B          |
 *** Edges
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.5
     :END:
     #+BEAMER: \footnotesize
     | *Edge type*           | *N. of edges* |
     |-----------------------+---------------|
     | origin → snapshot     | 195 M         |
     | snapshot → revision   | 616 M         |
     | snapshot → release    | 215 M         |
     | release → revision    | 9.9 M         |
     | revision → revision   | 1.2 B         |
     | revision → directory  | 1.1 B         |
     | directory → directory | 48 B          |
     | directory → revisiony | 482 M         |
     | directory → content   | 112 B         |
     |-----------------------+---------------|
     | Total edges           | 165 B         |
 *** 
     :PROPERTIES:
     :BEAMER_env: ignoreheading
     :END:
     #+BEAMER: \vspace{1mm}
     Stats for archive snapshot 2018-09-25, from the Software Heritage graph
     dataset. Growth rate: exponential, doubling every 22-30 months, cf.:
     #+BEGIN_EXPORT latex
     \begin{thebibliography}{Foo Bar, 1969}
     \footnotesize
     \bibitem{Rousseau2020} Roberto Di Cosmo, Guillaume Rousseau, Stefano Zacchiroli
     \newblock Software Provenance Tracking at the Scale of Public Source Code
     \newblock Empirical Software Engineering 25(4): 2930-2959 (2020)
     \end{thebibliography}
     #+END_EXPORT
 
 ** Graph compression pipeline
    :PROPERTIES:
    :CUSTOM_ID: pipeline
    :END:
     #+BEAMER: \hspace*{-0.1\linewidth} \includegraphics[width=1.2\linewidth]{compression/compression-steps}
     #+BEAMER: \vspace{-1cm}
 *** 
     - *MPH*: minimal perfect hash, mapping Merkle IDs to 0..N-1 integers
     - *BV* compress: Boldi-Vigna compression (based on MPH order)
     - *BFS*: breadth-first visit to renumber
     - *Permute*: update BV compression according to BFS order
 *** (Re)establishing locality
     - key for good compression is a node ordering that ensures locality and
       similarity
     - which is very much /not/ the case with Merkle IDs, ...but is the case
       /again/ after BFS reordering
 
 ** Compression experiment
    :PROPERTIES:
    :CUSTOM_ID: compexp
    :END:
    | *Step*      | *Wall time* (hours) |
    |-------------+---------------------|
    | MPH         |                   2 |
    | BV Compress |                  84 |
    | BFS         |                  19 |
    | Permute     |                  18 |
    | Transpose   |                  15 |
    |-------------+---------------------|
    | Total       |        138 (6 days) |
 
    - server equipped with 24 CPUs and 750 GB of RAM
    - RAM mostly used as I/O cache for the BFS step
    - /minimum/ memory requirements are close to the RAM needed to load the
      final compressed graph in memory
 
 ** Compression efficiency (space)
    :PROPERTIES:
    :CUSTOM_ID: spaceefficiency
    :END:
 *** 
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.4
     :END:
     | *Forward  graph*  |        |
     | total size        | 91 GiB |
     | bits per edge     |   4.91 |
     | compression ratio |  15.8% |
 *** 
     :PROPERTIES:
     :BEAMER_env: block
     :BEAMER_col: 0.4
     :END:
     | *Backward  graph* |        |
     | total size        | 83 GiB |
     | bits per edge     |   4.49 |
     | compression ratio |  14.4% |
 *** Operating cost
     The structure of a full bidirectional archive graph fits in less than 200
     GiB of RAM, for a hardware cost of ~300 USD.
 
 ** Compression efficiency (time)
    :PROPERTIES:
    :CUSTOM_ID: timeefficiency
    :END:
 *** Benchmark --- Full BFS visit (single thread)
     #+BEAMER: \begin{columns}\begin{column}{0.45\textwidth}
     | *Forward  graph* |                |
     |------------------+----------------|
     | wall time        | 1h48m          |
     | throughput       | 1.81 M nodes/s |
     |                  | (553 ns/node)  |
     #+BEAMER: \end{column}\begin{column}{0.45\textwidth}
     | *Backward graph* |                |
     |------------------+----------------|
     | wall time        | 3h17m          |
     | throughput       | 988 M nodes/s  |
     |                  | (1.01 µs/node) |
     #+BEAMER: \end{column}\end{columns}
 *** Benchmark --- Edge lookup
     random sample: 1 B nodes (8.3% of entire graph); then enumeration of all
     successors
     #+BEAMER: \begin{columns}\begin{column}{0.45\textwidth}
     | *Forward  graph* |                |
     |------------------+----------------|
     | visited edges    | 13.6 B         |
     | throughput       | 12.0 M edges/s |
     |                  | (83 ns/edge)   |
     #+BEAMER: \end{column}\begin{column}{0.45\textwidth}
     | *Backward graph* |                |
     |------------------+----------------|
     | visited edges    | 13.6 B         |
     | throughput       | 9.45 M edges/s |
     |                  | (106 ns/edge)  |
     #+BEAMER: \end{column}\end{columns}
 *** 
     :PROPERTIES:
     :BEAMER_env: ignoreheading
     :END:
     Note how edge lookup time is close to DRAM random access time (50-60 ns).
 
 ** Discussion
    :PROPERTIES:
    :CUSTOM_ID: discussion
    :END:
 *** Incrementality
     compression is *not incremental*, due to the use of contiguous integer
     ranges
     - but the graph is append-only, so...
     - ...based on expected graph growth rate it should be possible to
       pre-allocate enough free space in the integer ranges to support
       *amortized incrementality* (future work)
   #+BEAMER: \pause
 *** In-memory v. on-disk
     the compressed in-memory graph structure has *no attributes*
     - usual design is to exploit the 0..N-1 integer ranges to *memory map node
       attributes* to disk for efficient access
     - works well for queries that does graph traversal first and "join" node
       attributes last; ping-pong between the two is expensive
     - edge attributes are more problematic (work in progress)
diff --git a/common/modules/swh-fuse.org b/common/modules/swh-fuse.org
index 74f6bbf..3597a0b 100644
--- a/common/modules/swh-fuse.org
+++ b/common/modules/swh-fuse.org
@@ -1,139 +1,139 @@
 #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
 #+INCLUDE: "prelude.org" :minlevel 1
 
 ** Software Heritage Filesystem (SwhFS)
   :PROPERTIES:
   :CUSTOM_ID: oneslide
   :END:
 *** 
     The *Software Heritage Filesystem (SwhFS)* is a user-space POSIX filesystem
     that enables browsing parts of the Software Heritage archive as if it were
     locally available.
 
 *** 
-    - code: https://forge.softwareheritage.org/source/swh-fuse/
-    - documentation: https://docs.softwareheritage.org/devel/swh-fuse/
+    - code: [[https://forge.softwareheritage.org/source/swh-fuse/][forge.softwareheritage.org/source/swh-fuse]]
+    - documentation: [[https://docs.softwareheritage.org/devel/swh-fuse/][docs.softwareheritage.org/devel/swh-fuse]]
 *** 
     #+BEGIN_EXPORT latex
     \begin{thebibliography}{Foo Bar, 1969}
     \bibitem{Allancon2021} Thibault Allançon, Antoine Pietri, Stefano Zacchiroli
     \newblock The Software Heritage Filesystem (SwhFS): Integrating Source Code Archival with Development
     \newblock ICSE 2021: The 43rd International Conference on Software Engineering
     \newblock \url{https://arxiv.org/abs/2102.06390}
     \end{thebibliography}
     #+END_EXPORT
 
 ** Software Heritage Filesystem (SwhFS) --- Example                :noexport:
   :PROPERTIES:
   :CUSTOM_ID: examplemini
   :END:
 *** 
 #+BEAMER: \footnotesize
 #+BEGIN_SRC
 $ mkdir swhfs
 $ swh fs mount swhfs/  # mount the archive
 $ cd swhfs/
 
 $ cat archive/swh:1:cnt:c839dea9e8e6f0528b468214348fee8669b305b2
 #include <stdio.h>
 
 int main(void) {
     printf("Hello, World!\n");
 }
 
 $ cd archive/swh:1:dir:1fee702c7e6d14395bbf5ac3598e73bcbf97b030
 $ ls | wc -l
 127
 $ grep -i antenna THE_LUNAR_LANDING.s | cut -f 5
 # IS THE LR ANTENNA IN POSITION 1 YET
 # BRANCH IF ANTENNA ALREADY IN POSITION 1
 #+END_SRC
 * Software Heritage Filesystem (SwhFS) --- Tutorial
   :PROPERTIES:
   :CUSTOM_ID: tutorial
   :END:
 ** Software Heritage Filesystem (SwhFS) --- Tutorial
 *** 
   #+BEGIN_SRC
 $ pip install swh.fuse  # install SwhFS
 
 $ mkdir swhfs
 $ swh fs mount swhfs/  # mount the archive
 
 $ ls -1F swhfs/  # list entry points
 archive/  # <- start browsing from here
 cache/
 origin/
 README
 #+END_SRC
 
 ** Software Heritage Filesystem (SwhFS) --- Tutorial (cont.)
 *** 
   #+BEAMER: \footnotesize
   #+BEGIN_SRC
 $ cd swhfs/
 
 $ cat archive/swh:1:cnt:c839dea9e8e6f0528b468214348fee8669b305b2
 #include <stdio.h>
 
 int main(void) {
     printf("Hello, World!\n");
 }
 #+END_SRC
 
 ** Software Heritage Filesystem (SwhFS) --- Tutorial (cont.)
 *** 
   #+BEAMER: \footnotesize
   #+BEGIN_SRC
 $ cd archive/swh:1:dir:1fee702c7e6d14395bbf5ac3598e73bcbf97b030
 
 $ ls | wc -l
 127
 
 $ grep -i antenna THE_LUNAR_LANDING.s | cut -f 5
 # IS THE LR ANTENNA IN POSITION 1 YET
 # BRANCH IF ANTENNA ALREADY IN POSITION 1
 #+END_SRC
 
 ** Software Heritage Filesystem (SwhFS) --- Tutorial (cont.)
 *** 
   #+BEAMER: \footnotesize
   #+BEGIN_SRC
 $ cd archive/swh:1:rev:9d76c0b163675505d1a901e5fe5249a2c55609bc
 
 $ ls -F
 history/  meta.json@  parent@  parents/  root@
 
 $ jq '.author.name, .date, .message' meta.json
 "Michal Golebiowski-Owczarek"
 "2020-03-02T23:02:42+01:00"
 "Data:Event:Manipulation: Prevent collisions with Object.prototype ..."
 
 $ find root/src/ -type f -name '*.js' | xargs cat | wc -l
 10136
 #+END_SRC
 
 ** Software Heritage Filesystem (SwhFS) --- Tutorial (cont.)
 *** 
   #+BEAMER: \footnotesize
   #+BEGIN_SRC
 $ swh web search git-annex --limit 1
 ...
 git://git.joeyh.name/git-annex.git \
   https://archive.softwareheritage.org/api/1/origin/git://git.joeyh.name/git-annex.git/visits/
 ...
 
 $ swh web search git-annex --url-encode | cut -f 1
 git%3A%2F%2Fgit.joeyh.name%2Fgit-annex.git
 
 $ cd origin/git%3A%2F%2Fgit.joeyh.name%2Fgit-annex.git
 $ ls -F
 2020-12-19/
 
 $ ls 2020-12-19/snapshot/refs/heads/master/root/
 Annex/                COPYRIGHT           NEWS
 Annex.hs              Creds.hs            P2P/
 Assistant/            Crypto.hs           README
 Assistant.hs          Database/           Remote/
 Backend/              debian/             RemoteDaemon/
 #+END_SRC
diff --git a/talks-public/2022-09-28-ese-research/2022-09-28-ese-research.org b/talks-public/2022-09-28-ese-research/2022-09-28-ese-research.org
index c5ab101..4c39892 100644
--- a/talks-public/2022-09-28-ese-research/2022-09-28-ese-research.org
+++ b/talks-public/2022-09-28-ese-research/2022-09-28-ese-research.org
@@ -1,67 +1,115 @@
 #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
 #+TITLE: Empirical Software Engineering Research with Software Heritage
 #+BEAMER_HEADER: \date[2022-09-28]{28 September 2022}
 #+BEAMER_HEADER: \title[Empirical Software Eng. Research with Software Heritage]{Empirical Software Engineering Research with Software Heritage}
 #+AUTHOR: Stefano Zacchiroli
 #+DATE: 28 September 2022
 #+EMAIL: stefano.zacchiroli@telecom-paris.fr
 
 #+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1
 #+INCLUDE: "../../common/modules/169.org"
 #+BEAMER_HEADER: \institute[Télécom Paris]{Télécom Paris, Polytechnic Institute of Paris\\ {\tt stefano.zacchiroli@telecom-paris.fr}}
 #+BEAMER_HEADER: \author{Stefano Zacchiroli}
 
 * Datasets
 ** Graph dataset
 #+INCLUDE: "../../common/modules/dataset.org::#graphdataset" :only-contents t
 ** Graph dataset --- example
 #+INCLUDE: "../../common/modules/dataset.org::#graphquery1" :only-contents t
 ** License dataset
 #+INCLUDE: "../../common/modules/dataset.org::#licensedataset" :only-contents t
 * Accessing source code artifacts
 ** The Software Heritage Filesystem (SwhFS)
 #+INCLUDE: "../../common/modules/swh-fuse.org::#oneslide" :only-contents t
 ** The Software Heritage Filesystem (SwhFS) --- example
 #+INCLUDE: "../../common/modules/swh-fuse.org::#examplemini" :only-contents t
 ** Graph compression
 #+INCLUDE: "../../common/modules/graph-compression.org::#oneslide" :only-contents t
 * Software provenance and evolution
 ** Software provenance and evolution
-   TODO
+#+BEAMER: \begin{center} \includegraphics[width=0.7\textwidth]{commit-time-distro} \end{center} \vspace{-2mm}
+*** Key findings
+    - The amount of original commits in public code doubles every ~30 months
+      and has been doing so for 20+ years; original source code files double
+      every ~22 months
+    - It is possible to trace the provenance of source code artifacts at this
+      scale in a compact relational model via the notion of isochrone graphs.
+
+  #+BEAMER: \vspace{-2mm}
+*** 
+     #+BEGIN_EXPORT latex
+     \vspace{-2mm}
+     \footnotesize
+     \begin{thebibliography}{Foo Bar, 1969}
+     \bibitem{Rousseau2020} Rousseau, Di Cosmo, Zacchiroli\newblock
+     Software Provenance Tracking at the Scale of Public Source Code\newblock
+     In Empirical Software Engineering, 2020
+     \end{thebibliography}
+     #+END_EXPORT
 * Software forks
 ** Software forks
-   TODO
+*** Idea
+- Forks can be detected via either platform metadata (e.g., GitHub keeping
+  track of who clicked "fork" on what repo; the most common approach), or via
+  shared version control system history.
+- Thanks to deduplication and platform agnosticity, Software Heritage provide a
+  privileged observation point on the global fork ecosystem in public code.
+*** Research questions
+- What is the right definition of "being a fork"? (methodology)
+- How many forks could we miss by looking only at platform metadata?
+- How many "cross-platform" forks (e.g., GitHub → GitLab) exist in the wild?
+** Software forks (cont.)
+*** Findings
+- Forks classification: based on platform metadata (“type 1” forks), sharing at
+  least one commit (“type 2”), sharing a common root directory at some point in
+  VCS history (“type 3”).
+- Up to 16% forks could be overlooked by considering only GitHub type 1 forks
+  (a potentially significant threat to validity!).
+- Relevant independent development activity can happen on GitLab.com for
+  projects initially just mirrored from GitHub.
+*** 
+     #+BEGIN_EXPORT latex
+     \vspace{-3mm} \footnotesize
+     \begin{thebibliography}{Foo Bar, 1969}
+     \bibitem{Pietri2020} Pietri, Rousseau, Zacchiroli.\newblock
+     Forking Without Clicking: on How to Identify Software Repository Forks.\newblock
+     MSR 2020
+     \bibitem{Bhattacharjee2020} Bhattacharjee et al.\newblock
+     An exploratory study to find motives behind cross-platform forks from Software Heritage dataset.\newblock
+     MSR 2020
+     \end{thebibliography}
+     #+END_EXPORT
 * Diversity, equity, and inclusion
 ** Diversity, equity, and inclusion
 *** Idea
     Archived commit metadata contains public information that can be mined to
     study long-term trends of diversity, equity, and inclusion (DEI) traits of
     the global population of public code contributors.
 
 *** Key findings on the gender gap
     - Male authors contributed 92% of public code commits up to 2019.
     - The ratio of female authors (and their contributions) has grown stably
       for 15 years reaching for the first time 10% of yearly contributions
       in 2019.
     - The COVID-19 pandemic has reversed the trend.
 
 ** Diversity, equity, and inclusion (cont.)
 
 *** Key findings on the geographic gap
     - The early decades of public code were dominated by contributions from
       North America, followed by a period of alternating dominance between
       North America and Europe.
     - Since then geographic diversity has increased constantly, with raising
       importance of contributions from Central and South America.
     - The trend of increased female contributions is almost worlwide, with the
       notable exception of specific regions of Asia were it is either slower or
       flat.
 
 *** References
     #+BEAMER: \footnotesize
     - Zacchiroli. /Gender differences in public code contributions: a 50-year
       perspective/. IEEE Software, 2021
     - Rossi and Zacchiroli. /Worldwide gender differences in public code
       contributions/. ICSE SEIS, 2022
     - Rossi and Zacchiroli. /Geographic diversity in public code
       contributions/. MSR 2022