diff --git a/reports/node_mapping/node_mapping.tex b/reports/node_mapping/node_mapping.tex
index ab831e8..cd7e81c 100644
--- a/reports/node_mapping/node_mapping.tex
+++ b/reports/node_mapping/node_mapping.tex
@@ -1,155 +1,203 @@
\documentclass[11pt,a4paper]{article}
\usepackage[a4paper,left=2cm,right=2cm,top=2.5cm,bottom=2.5cm]{geometry}
\usepackage[english]{babel}
\usepackage{booktabs}
\usepackage{graphicx}
\usepackage{hyperref}
\usepackage{minted}
\usepackage{parskip}
\usepackage{siunitx}
\usepackage{tikz}

\title{Google Summer of Code 2019}
\author{Thibault Allançon}
\date{June 6, 2019}

\newcommand{\mmap}{\mintinline{text}{mmap}}

\begin{document}

\maketitle

Software Heritage refers to its graph nodes using persistent string
identifiers\footnote{\url{https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html}}.
During the compression, WebGraph maps those labels to long identifiers. We need
an efficient bidirectional map between these labels and the internal node ids.

\section{Additional information}

Example of a SWH string identifier:

\begin{minted}{text}
swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2
\end{minted}

The first step of the compression process is to map the input strings to long
identifiers using the \mintinline{text}{GOVMinimalPerfectHashFunction} from the
Sux4J\footnote{\url{http://sux4j.di.unimi.it/}} framework. The generated
\mintinline{text}{.mph} function is a \textbf{minimal perfect hash function}
(mapping $n$ keys to $n$ consecutive integers with no collisions).

Later during the compression process, the nodes are re-ordered (to be precise:
in the order of a BFS traversal), so the MPH mapping is shuffled.

\begin{figure}[H]
    \centering
    \begin{tikzpicture}[auto, node distance=6cm]
        \node[label=below:{(string)}] (input) {\mintinline{text}{.nodes.csv}};
        \node[right of=input, label=below:{(long)}] (mph) {\mintinline{text}{.mph}};
        \node[right of=mph, label=below:{(long)}] (bfs) {\mintinline{text}{.order}};
        \draw[->] (input) edge node{MPH function} (mph);
        \draw[->] (mph) edge node{BFS ordering} (bfs);
    \end{tikzpicture}
    \caption{End-to-end node mapping}
\end{figure}

We want to create a bidirectional map to efficiently retrieve the corresponding
strings or longs. Using naive hash maps is not possible due to memory usage (we
-have around 20B nodes), hence we must use a disk-based solution.
+have around 10B nodes), hence we must use a disk-based solution.

-\section{Solution 1: \mmap{}}
+\newpage
+
+\section{Solution 1: Simple file}

Idea: dump the two mappings into huge files and load them using the \mmap{}
system call to avoid storing them in memory.

The map from long to string only needs to store the SWH id, because the MPH is
a perfect mapping: we can store on line $i$ the SWH persistent identifier of
node $i$, and simply read the $i$-th line to retrieve it.

The map from string to long needs to store both the SWH id and the node id on
the same line. To efficiently get the corresponding node id, we can store the
lines ordered by SWH id and binary search to find the mapping.
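+
+As an illustration, here is a minimal sketch of the two lookups. This is not
+the actual implementation: the class name, file layout and record encoding are
+made up for the example. It assumes the long to string file stores one
+50-character SWH id per line (line $i$ for node $i$), and the string to long
+file stores fixed-width lines of the form ``SWH id, space, zero-padded node
+id'', sorted by SWH id.
+
+\begin{small}
+\begin{minted}{java}
+import java.io.IOException;
+import java.io.RandomAccessFile;
+import java.nio.charset.StandardCharsets;
+
+public class NodeIdMapSketch {
+    private static final int SWH_ID_LENGTH = 50;   // "swh:1:cnt:" + 40 hex chars
+    private static final int NODE_ID_LENGTH = 20;  // zero-padded decimal long
+    // "<swh id> <node id>\n"
+    private static final int RECORD_SIZE = SWH_ID_LENGTH + 1 + NODE_ID_LENGTH + 1;
+
+    // long -> string: node i is simply the i-th fixed-width line.
+    public static String getSwhId(RandomAccessFile longToString, long node)
+            throws IOException {
+        byte[] buf = new byte[SWH_ID_LENGTH];
+        longToString.seek(node * (SWH_ID_LENGTH + 1));
+        longToString.readFully(buf);
+        return new String(buf, StandardCharsets.US_ASCII);
+    }
+
+    // string -> long: binary search in the file sorted by SWH id.
+    public static long getNodeId(RandomAccessFile stringToLong, String swhId)
+            throws IOException {
+        byte[] buf = new byte[RECORD_SIZE];
+        long low = 0;
+        long high = stringToLong.length() / RECORD_SIZE - 1;
+        while (low <= high) {
+            long mid = (low + high) >>> 1;
+            stringToLong.seek(mid * RECORD_SIZE);
+            stringToLong.readFully(buf);
+            String line = new String(buf, StandardCharsets.US_ASCII);
+            int cmp = line.substring(0, SWH_ID_LENGTH).compareTo(swhId);
+            if (cmp == 0) {
+                return Long.parseLong(
+                        line.substring(SWH_ID_LENGTH + 1, RECORD_SIZE - 1));
+            } else if (cmp < 0) {
+                low = mid + 1;
+            } else {
+                high = mid - 1;
+            }
+        }
+        return -1; // unknown SWH id
+    }
+}
+\end{minted}
+\end{small}
+
+With fixed-width records, neither direction needs an in-memory index: the long
+to string lookup is a single seek, and the string to long lookup does
+$O(\log n)$ seeks.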

-The main problem with this solution is technical since Java has a maximum size
-on \mmap{}-ed files of ~2GB. To overcome the limitation and \mmap{} very large
-files, there is the \mintinline{text}{ByteBufferInputStream} class from the
+
+\subsection{Java I/O API}
+
+Java offers many ways to write to a file; three were considered and timed:
+
+\begin{small}
+\begin{minted}{java}
+// Version 1
+try (Writer writer = new BufferedWriter(new OutputStreamWriter(
+        new FileOutputStream("output.txt"), "utf-8"))) {
+    writer.write("test");
+}
+
+// Version 2
+try (Writer writer = new BufferedWriter(new FileWriter("output.txt"))) {
+    writer.write("test");
+}
+
+// Version 3
+try (FileWriter writer = new FileWriter("output.txt")) {
+    writer.write("test");
+}
+\end{minted}
+\end{small}
+
+String ids are sorted in alphabetical order, so when reading them the long to
+string map needs to write data at random positions in the output file. One way
+to turn this into a sequential write is to accumulate the data in an array and
+dump the array sequentially into the file once the reading process is done.
+
+\subsection{\mmap{}}
+
+Java has a maximum size of 2GB on \mmap{}-ed files. To overcome this limitation
+and \mmap{} very large files, there is the
+\mintinline{text}{ByteBufferInputStream} class from the
dsiutils\footnote{\url{http://dsiutils.di.unimi.it/}} package. But this only
applies to reading from files, not writing to them. I had to create the
-\mintinline{text}{ByteBufferOutputStream} to allow writing on huge \mmap{}-ed
-files.
+\mintinline{text}{ByteBufferOutputStream} class to allow writing on huge
+\mmap{}-ed files.
+
+\subsection{Timings}
+
+Experiments on the simple file solution were done on a VM with 1TB of RAM and
+40 vCPUs using the \mintinline{text}{rev_to_rev} dataset (1B nodes and 1B
+edges).
+
+\begin{center}
+    \begin{tabular}{@{} l *5r @{}}
+        \toprule
+        \multicolumn{1}{c}{} &
+        \textbf{I/O v1} & \textbf{I/O v2} &
+        \textbf{I/O v3} &
+        \textbf{random \mmap{}} & \textbf{sequential \mmap{}} \\
+        \midrule
+        \texttt{Dump time}
+            & 4990s & \textbf{4890s} & 5350s & 9290s & 5240s \\
+        \bottomrule
+    \end{tabular}
+\end{center}

-Note: I first started with the \mmap{} solution and got early results that
-allowed me to compare with the DB-based solution, before going full in with the
-\mmap{} one and coding the \mintinline{text}{ByteBufferOutputStream}.

\section{Solution 2: Databases}

Another approach would be to simply use a disk-based database; two were
considered in my tests: RocksDB and HaloDB.

In both cases, I tried optimizing the database options to fit our needs and get
the best timings.

On the technical side, the code needed is simpler and more generic than the
-custom \mmap{} solution. However, this can only be a feasible solution if map
+custom file solution. However, this can only be a feasible solution if map
retrieval/building time overheads are small enough to scale to the entire
graph.

\subsection{RocksDB}

RocksDB is a popular and mature persistent key-value database developed by
Facebook. It is coded in C++ but provides a Java API.

The code used to dump and load the mappings can be found in the
\mintinline{text}{NodeIdMapRocksDB.java} file.

RocksDB has \textbf{many} parameters to tune (see their tuning
guide\footnote{\url{https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide}}).
I tried the most obvious ones, but being completely new to this software, the
benchmarks are probably not the best possible.
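+
+To give an idea of the code involved, here is a minimal sketch of a
+RocksDB-based string to long map. This is not the actual
+\mintinline{text}{NodeIdMapRocksDB.java}: the database path and the value
+encoding are made up for the example, and the options are left close to their
+defaults.
+
+\begin{small}
+\begin{minted}{java}
+import java.nio.charset.StandardCharsets;
+import org.rocksdb.Options;
+import org.rocksdb.RocksDB;
+import org.rocksdb.RocksDBException;
+
+public class RocksDbMapSketch {
+    public static void main(String[] args) throws RocksDBException {
+        RocksDB.loadLibrary();
+        try (Options options = new Options().setCreateIfMissing(true);
+             RocksDB db = RocksDB.open(options, "string_to_long.db")) {
+            byte[] key = "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2"
+                    .getBytes(StandardCharsets.US_ASCII);
+
+            // dump: one (SWH id -> node id) entry per node
+            db.put(key, Long.toString(42).getBytes(StandardCharsets.US_ASCII));
+
+            // load/get: retrieve the node id of a given SWH id
+            byte[] value = db.get(key);
+            long nodeId = Long.parseLong(
+                    new String(value, StandardCharsets.US_ASCII));
+            System.out.println(nodeId);
+        }
+    }
+}
+\end{minted}
+\end{small}
+
+The reverse map (node id to SWH id) can be stored in the same way, either in a
+second database or under a key prefix in the same one.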

\subsection{HaloDB}

HaloDB is a relatively new key-value store developed by Yahoo. It caught my
attention because it is written in Java and leaves out features of RocksDB that
we don't need (like range scans) in order to get better performance.

The code used to dump and load the mappings can be found in the
\mintinline{text}{NodeIdMapHaloDB.java} file.

-\newpage
-
\section{Results}

Experiments were done on a VM with 700GB of RAM and 72 vCPUs using the
\mintinline{text}{rev_to_rev} dataset (1B nodes and 1B edges).

\begin{center}
    \begin{tabular}{@{} l *3r @{}}
        \toprule
        \multicolumn{1}{c}{} &
-        \textbf{\mmap{}} & \textbf{RocksDB} &
+        \textbf{file solution} & \textbf{RocksDB} &
        \textbf{HaloDB} \\
        \midrule
        \texttt{dump}
-            & 1h24min & estimated 4h & estimated 8h \\
+            & $<$ 1h & estimated 4h & estimated 8h \\
        \texttt{load}
            & 0.003s & & \\
        \texttt{get node id}
            & $\sim$900µs & & \\
        \texttt{get swh id}
            & $\sim$50µs & & \\
        \texttt{RAM usage (dump)}
            & 130GB & & \\
        \texttt{RAM usage (load)}
            & 5GB & & \\
        \texttt{Disk space}
            & 108GB & & \\
        \bottomrule
    \end{tabular}
\end{center}

Note: \texttt{RAM usage} includes the in-memory compressed graph.

\end{document}