diff --git a/common/modules/r+d-challenges.org b/common/modules/r+d-challenges.org
new file mode 100644
index 0000000..551dafb
--- /dev/null
+++ b/common/modules/r+d-challenges.org
@@ -0,0 +1,77 @@
+#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt)
+
+# R&D challenges
+
+#+INCLUDE: "prelude.org" :minlevel 1
+* R&D challenges
+  :PROPERTIES:
+  :CUSTOM_ID: main
+  :END:
+** Data model
+*** The real world sucks
+    - corrupted repositories
+    - takedown notices
+    - partial, irrecoverable data losses
+    #+BEAMER: \pause
+*** /Incomplete/ Merkle DAGs
+    - nodes can go missing at archival time, or disappear later on
+    - top-level hash(es) no longer capture the full state of the archive
+    #+BEAMER: \pause
+*** Open questions
+    - how do you capture such full state, then?
+    - how do you efficiently check whether something is to be re-archived?
+      (see the sketch in the appendix)
+    - ultimately, what's your notion of having "fully archived" something?
+** Storage
+*** Archive stats
+    - as a graph: ~10 B nodes, ~100 B edges
+    - node breakdown: ~40% contents, ~40% directories, ~10% commits
+    - content size: ~400 TB (raw), ~200 TB compressed (content by content)
+      - median compressed size: 3 KB
+      - i.e., *a lot of very small files*
+    #+BEAMER: \pause
+*** Current storage solution (unsatisfactory)
+    - contents: ad hoc object storage with multiple backends
+      - file-system, Azure, AWS, etc.
+    - rest of the graph: Postgres (~6 TB)
+      - rationale: recursive queries to traverse the graph
+      - (no, it doesn't work at this scale)
+** Storage (cont.)
+*** Requirements
+    - long-term storage
+    - suitable for distribution/replication
+    - suitable for scale-out processing
+    #+BEAMER: \pause
+*** Graph
+    - early experiments with Ceph (RADOS)
+      - not a good fit out of the box
+      - 7x size increase at our target retention policy, due to the large
+        minimum chunk size (64 KB)
+      - ad-hoc object packing (?)
+      - .oO( do we really have to re-invent a file-system? )
+** Storage (cont.)
+*** Contents --- size considerations
+    - a few hundred TB is not /that/ big, but it cuts off volunteer mirrors
+    #+BEAMER: \pause
+*** Content compression
+    - low compression ratio (2x) with 1-by-1 compression
+    - typical Git/VCS packing heuristics do not work here, because contents
+      occur in many different contexts
+    - early experiments with Rabin-style compression & co. were unsatisfactory
+    #+BEAMER: \pause
+*** Distributed archival
+    - massively distributed archival (e.g., P2P) would be nice
+    - but most P2P techs are more like CDNs than archives and do not offer
+      retention policy guarantees (e.g., self-healing)
+** Efficient graph processing
+*** Use cases
+    - Vault: recursive visits to collect archived objects
+    - Provenance: single-destination shortest path (see the sketch in the
+      appendix)
+    #+BEAMER: \pause
+*** Technology
+    - beyond the capabilities of off-the-shelf graph DBs
+    - graph topology: scale-free, but not small world
+    - /probably/ a bad fit for Pregel/Chaos/etc.
+    - are web-graph-style compression techniques suitable for storing and
+      processing the Merkle DAG in memory? (unclear)
+** Provenance tracking                                            :noexport:
+   - TODO
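+** Appendix: traversal sketches                                   :noexport:
+*** Completeness check (sketch)
+    The "Open questions" slide asks how to check efficiently whether
+    something is to be re-archived. A minimal sketch of one possible
+    answer, assuming a hypothetical ~fetch(node_id)~ helper (not the
+    actual archive API) that returns the child ids of an archived node,
+    or ~None~ when the node is missing: walk the Merkle DAG from a
+    top-level hash and report the holes, i.e., the re-archival candidates.
+    #+BEGIN_SRC python
+      from collections import deque
+      from typing import Callable, Iterable, Optional, Set
+
+      def missing_nodes(root: str,
+                        fetch: Callable[[str], Optional[Iterable[str]]]) -> Set[str]:
+          """Walk the DAG from `root`, returning the ids of missing nodes,
+          i.e., the candidates for re-archival. `fetch` is a hypothetical
+          accessor: child ids of an archived node, or None if missing."""
+          missing: Set[str] = set()
+          seen: Set[str] = {root}
+          todo = deque([root])
+          while todo:
+              node = todo.popleft()
+              children = fetch(node)
+              if children is None:       # hole in the DAG
+                  missing.add(node)
+                  continue
+              for child in children:
+                  if child not in seen:  # visit each node at most once
+                      seen.add(child)
+                      todo.append(child)
+          return missing
+    #+END_SRC
+    Under this reading, "fully archived" means ~missing_nodes(root, fetch)~
+    is empty; caching per-subtree completeness bits is one way to amortize
+    repeated checks.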
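+*** Provenance path (sketch)
+    The "Use cases" slide frames provenance as single-destination shortest
+    path. A minimal sketch under the same caveat (hypothetical ~parents~
+    and ~is_origin~ helpers, not the actual archive API): BFS over
+    /reverse/ edges from a content id until an origin-layer node is hit,
+    which yields a shortest content-to-origin path.
+    #+BEGIN_SRC python
+      from collections import deque
+      from typing import Callable, Iterable, List, Optional
+
+      def provenance_path(content: str,
+                          parents: Callable[[str], Iterable[str]],
+                          is_origin: Callable[[str], bool]) -> Optional[List[str]]:
+          """Shortest content -> ... -> origin path, or None if unreachable.
+          BFS guarantees minimality in number of edges."""
+          prev = {content: None}         # predecessor map for path rebuild
+          todo = deque([content])
+          while todo:
+              node = todo.popleft()
+              if is_origin(node):
+                  path = []
+                  while node is not None:
+                      path.append(node)
+                      node = prev[node]
+                  return list(reversed(path))
+              for parent in parents(node):
+                  if parent not in prev:
+                      prev[parent] = node
+                      todo.append(parent)
+          return None
+    #+END_SRC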