diff --git a/docs/graph/_images/db-schema.svg b/docs/graph/_images/db-schema.svg
new file mode 100644
index 0000000..2f8dff7
--- /dev/null
+++ b/docs/graph/_images/db-schema.svg
@@ -0,0 +1,522 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
+ "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by graphviz version 2.38.0 (20140413.2041)
+ -->
+<!-- Title: g Pages: 1 -->
+<svg width="1156pt" height="611pt"
+ viewBox="0.00 0.00 1156.00 611.07" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 607.071)">
+<title>g</title>
+<polygon fill="white" stroke="none" points="-4,4 -4,-607.071 1152,-607.071 1152,4 -4,4"/>
+<g id="clust2" class="cluster"><title>cluster_content</title>
+<polygon fill="#f2f2f2" stroke="gray" points="986,-8 986,-209 1140,-209 1140,-8 986,-8"/>
+<text text-anchor="start" x="1041.5" y="-194.8" font-family="Times,serif" font-weight="bold" font-size="14.00">content</text>
+</g>
+<g id="clust3" class="cluster"><title>cluster_directory</title>
+<polygon fill="#f2f2f2" stroke="gray" points="630,-19 630,-356 966,-356 966,-19 630,-19"/>
+<text text-anchor="start" x="767" y="-341.8" font-family="Times,serif" font-weight="bold" font-size="14.00">directories</text>
+</g>
+<g id="clust4" class="cluster"><title>cluster_revision</title>
+<polygon fill="#f2f2f2" stroke="gray" points="218,-364 218,-535 776,-535 776,-364 218,-364"/>
+<text text-anchor="start" x="471" y="-520.8" font-family="Times,serif" font-weight="bold" font-size="14.00">revisions</text>
+</g>
+<g id="clust5" class="cluster"><title>cluster_snapshots</title>
+<polygon fill="#f2f2f2" stroke="gray" points="16.5,-154 16.5,-356 400,-356 400,-154 16.5,-154"/>
+<text text-anchor="start" x="179.75" y="-341.8" font-family="Times,serif" font-weight="bold" font-size="14.00">snapshots</text>
+</g>
+<g id="clust6" class="cluster"><title>cluster_origins</title>
+<polygon fill="#f2f2f2" stroke="gray" points="8,-19 8,-146 370.5,-146 370.5,-19 8,-19"/>
+<text text-anchor="start" x="169.25" y="-131.8" font-family="Times,serif" font-weight="bold" font-size="14.00">origins</text>
+</g>
+<g id="clust7" class="cluster"><title>cluster_release</title>
+<polygon fill="#f2f2f2" stroke="gray" points="14,-371 14,-528 178,-528 178,-371 14,-371"/>
+<text text-anchor="start" x="73" y="-513.8" font-family="Times,serif" font-weight="bold" font-size="14.00">releases</text>
+</g>
+<!-- content -->
+<g id="node1" class="node"><title>content</title>
+<text text-anchor="start" x="1005" y="-162" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="1010,-156 1010,-173 1116,-173 1116,-156 1010,-156"/>
+<polygon fill="none" stroke="black" points="1010,-156 1010,-173 1116,-173 1116,-156 1010,-156"/>
+<text text-anchor="start" x="1045" y="-162" font-family="Times,serif" font-size="10.00"> content </text>
+<text text-anchor="start" x="1012" y="-146" font-family="Times,serif" font-size="10.00"> sha1 </text>
+<text text-anchor="start" x="1056" y="-146" font-family="Times,serif" font-size="10.00"> sha1 </text>
+<text text-anchor="start" x="1100" y="-146" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1109" y="-146" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1118" y="-146" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="1012" y="-131" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="1056" y="-131" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="1100" y="-131" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1109" y="-131" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1118" y="-131" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="1012" y="-116" font-family="Times,serif" font-size="10.00"> length </text>
+<text text-anchor="start" x="1056" y="-116" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="1100" y="-116" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1109" y="-116" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1118" y="-116" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="1002,-110 1002,-174 1124,-174 1124,-110 1002,-110"/>
+</g>
+<!-- directory -->
+<g id="node2" class="node"><title>directory</title>
+<text text-anchor="start" x="649.5" y="-219" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="654.5,-213 654.5,-230 771.5,-230 771.5,-213 654.5,-213"/>
+<polygon fill="none" stroke="black" points="654.5,-213 654.5,-230 771.5,-230 771.5,-213 654.5,-213"/>
+<text text-anchor="start" x="692" y="-219" font-family="Times,serif" font-size="10.00"> directory </text>
+<text text-anchor="start" x="656.5" y="-203" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="711.5" y="-203" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="755.5" y="-203" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="764.5" y="-203" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="773.5" y="-203" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="656.5" y="-188" font-family="Times,serif" font-size="10.00"> dir_entries </text>
+<text text-anchor="start" x="711.5" y="-188" font-family="Times,serif" font-size="10.00"> bigint[] </text>
+<text text-anchor="start" x="755.5" y="-188" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="764.5" y="-188" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="773.5" y="-188" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="656.5" y="-173" font-family="Times,serif" font-size="10.00"> file_entries </text>
+<text text-anchor="start" x="711.5" y="-173" font-family="Times,serif" font-size="10.00"> bigint[] </text>
+<text text-anchor="start" x="755.5" y="-173" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="764.5" y="-173" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="773.5" y="-173" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="656.5" y="-158" font-family="Times,serif" font-size="10.00"> rev_entries </text>
+<text text-anchor="start" x="711.5" y="-158" font-family="Times,serif" font-size="10.00"> bigint[] </text>
+<text text-anchor="start" x="755.5" y="-158" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="764.5" y="-158" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="773.5" y="-158" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="646,-152.5 646,-231.5 779,-231.5 779,-152.5 646,-152.5"/>
+</g>
+<!-- directory_entry_dir -->
+<g id="node3" class="node"><title>directory_entry_dir</title>
+<text text-anchor="start" x="834.5" y="-203" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="839.5,-197 839.5,-214 942.5,-214 942.5,-197 839.5,-197"/>
+<polygon fill="none" stroke="black" points="839.5,-197 839.5,-214 942.5,-214 942.5,-197 839.5,-197"/>
+<text text-anchor="start" x="848.5" y="-203" font-family="Times,serif" font-size="10.00"> directory_entry_dir </text>
+<text text-anchor="start" x="841.5" y="-187" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="874.5" y="-187" font-family="Times,serif" font-size="10.00"> bigserial </text>
+<text text-anchor="start" x="926.5" y="-187" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-187" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-187" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="841.5" y="-172" font-family="Times,serif" font-size="10.00"> target </text>
+<text text-anchor="start" x="874.5" y="-172" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="926.5" y="-172" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-172" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-172" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="841.5" y="-157" font-family="Times,serif" font-size="10.00"> name </text>
+<text text-anchor="start" x="874.5" y="-157" font-family="Times,serif" font-size="10.00"> unix_path </text>
+<text text-anchor="start" x="926.5" y="-157" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-157" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-157" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="841.5" y="-142" font-family="Times,serif" font-size="10.00"> perms </text>
+<text text-anchor="start" x="874.5" y="-142" font-family="Times,serif" font-size="10.00"> file_perms </text>
+<text text-anchor="start" x="926.5" y="-142" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-142" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-142" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="831,-136.5 831,-215.5 950,-215.5 950,-136.5 831,-136.5"/>
+</g>
+<!-- directory&#45;&gt;directory_entry_dir -->
+<g id="edge14" class="edge"><title>directory:rtcol2&#45;&gt;directory_entry_dir:ltcol1</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M779.5,-190C798.819,-190 806.027,-190 821.372,-190"/>
+<polygon fill="black" stroke="black" points="821.5,-193.5 831.5,-190 821.5,-186.5 821.5,-193.5"/>
+</g>
+<!-- directory_entry_file -->
+<g id="node4" class="node"><title>directory_entry_file</title>
+<text text-anchor="start" x="834.5" y="-98" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="839.5,-92 839.5,-109 942.5,-109 942.5,-92 839.5,-92"/>
+<polygon fill="none" stroke="black" points="839.5,-92 839.5,-109 942.5,-109 942.5,-92 839.5,-92"/>
+<text text-anchor="start" x="847.5" y="-98" font-family="Times,serif" font-size="10.00"> directory_entry_file </text>
+<text text-anchor="start" x="841.5" y="-82" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="874.5" y="-82" font-family="Times,serif" font-size="10.00"> bigserial </text>
+<text text-anchor="start" x="926.5" y="-82" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-82" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-82" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="841.5" y="-67" font-family="Times,serif" font-size="10.00"> target </text>
+<text text-anchor="start" x="874.5" y="-67" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="926.5" y="-67" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-67" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-67" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="841.5" y="-52" font-family="Times,serif" font-size="10.00"> name </text>
+<text text-anchor="start" x="874.5" y="-52" font-family="Times,serif" font-size="10.00"> unix_path </text>
+<text text-anchor="start" x="926.5" y="-52" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-52" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-52" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="841.5" y="-37" font-family="Times,serif" font-size="10.00"> perms </text>
+<text text-anchor="start" x="874.5" y="-37" font-family="Times,serif" font-size="10.00"> file_perms </text>
+<text text-anchor="start" x="926.5" y="-37" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-37" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-37" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="831,-31.5 831,-110.5 950,-110.5 950,-31.5 831,-31.5"/>
+</g>
+<!-- directory&#45;&gt;directory_entry_file -->
+<g id="edge15" class="edge"><title>directory:rtcol3&#45;&gt;directory_entry_file:ltcol1</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M779.5,-175C821.727,-175 791.607,-99.8041 821.398,-86.8798"/>
+<polygon fill="black" stroke="black" points="822.309,-90.2705 831.5,-85 821.028,-83.3886 822.309,-90.2705"/>
+</g>
+<!-- directory_entry_rev -->
+<g id="node5" class="node"><title>directory_entry_rev</title>
+<text text-anchor="start" x="834.5" y="-308" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="839.5,-302 839.5,-319 942.5,-319 942.5,-302 839.5,-302"/>
+<polygon fill="none" stroke="black" points="839.5,-302 839.5,-319 942.5,-319 942.5,-302 839.5,-302"/>
+<text text-anchor="start" x="848" y="-308" font-family="Times,serif" font-size="10.00"> directory_entry_rev </text>
+<text text-anchor="start" x="841.5" y="-292" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="874.5" y="-292" font-family="Times,serif" font-size="10.00"> bigserial </text>
+<text text-anchor="start" x="926.5" y="-292" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-292" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-292" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="841.5" y="-277" font-family="Times,serif" font-size="10.00"> target </text>
+<text text-anchor="start" x="874.5" y="-277" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="926.5" y="-277" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-277" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-277" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="841.5" y="-262" font-family="Times,serif" font-size="10.00"> name </text>
+<text text-anchor="start" x="874.5" y="-262" font-family="Times,serif" font-size="10.00"> unix_path </text>
+<text text-anchor="start" x="926.5" y="-262" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-262" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-262" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="841.5" y="-247" font-family="Times,serif" font-size="10.00"> perms </text>
+<text text-anchor="start" x="874.5" y="-247" font-family="Times,serif" font-size="10.00"> file_perms </text>
+<text text-anchor="start" x="926.5" y="-247" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="935.5" y="-247" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="944.5" y="-247" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="831,-241.5 831,-320.5 950,-320.5 950,-241.5 831,-241.5"/>
+</g>
+<!-- directory&#45;&gt;directory_entry_rev -->
+<g id="edge16" class="edge"><title>directory:rtcol4&#45;&gt;directory_entry_rev:ltcol1</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M779.5,-160C839.904,-160 775.961,-279.147 821.308,-293.575"/>
+<polygon fill="black" stroke="black" points="821.112,-297.082 831.5,-295 822.081,-290.149 821.112,-297.082"/>
+</g>
+<!-- directory_entry_dir&#45;&gt;directory -->
+<g id="edge10" class="edge"><title>directory_entry_dir:rtcol2&#45;&gt;directory:ltcol1</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M950.5,-174C968.95,-174 962.702,-145.388 949.5,-132.5 939.442,-122.682 834.462,-124.364 823,-132.5 783.456,-160.569 826.544,-207.431 787,-235.5 736.442,-271.387 692.077,-278.592 647.5,-235.5 641.458,-229.659 637.363,-219.195 638.649,-212.428"/>
+<polygon fill="black" stroke="black" points="640.98,-215.043 646.5,-206 636.545,-209.627 640.98,-215.043"/>
+</g>
+<!-- directory_entry_file&#45;&gt;content -->
+<g id="edge11" class="edge"><title>directory_entry_file:rtcol2&#45;&gt;content:ltcol2</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M950.5,-69C983.017,-69 969.503,-119.766 991.951,-130.871"/>
+<polygon fill="black" stroke="black" points="991.492,-134.351 1002,-133 992.943,-127.503 991.492,-134.351"/>
+</g>
+<!-- skipped_content -->
+<g id="node12" class="node"><title>skipped_content</title>
+<text text-anchor="start" x="1005" y="-72" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="1010,-66 1010,-83 1116,-83 1116,-66 1010,-66"/>
+<polygon fill="none" stroke="black" points="1010,-66 1010,-83 1116,-83 1116,-66 1010,-66"/>
+<text text-anchor="start" x="1026.5" y="-72" font-family="Times,serif" font-size="10.00"> skipped_content </text>
+<text text-anchor="start" x="1012" y="-56" font-family="Times,serif" font-size="10.00"> sha1 </text>
+<text text-anchor="start" x="1056" y="-56" font-family="Times,serif" font-size="10.00"> sha1 </text>
+<text text-anchor="start" x="1100" y="-56" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1109" y="-56" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1118" y="-56" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="1012" y="-41" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="1056" y="-41" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="1100" y="-41" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1109" y="-41" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1118" y="-41" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="1012" y="-26" font-family="Times,serif" font-size="10.00"> length </text>
+<text text-anchor="start" x="1056" y="-26" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="1100" y="-26" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1109" y="-26" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="1118" y="-26" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="1002,-20 1002,-84 1124,-84 1124,-20 1002,-20"/>
+</g>
+<!-- directory_entry_file&#45;&gt;skipped_content -->
+<g id="edge12" class="edge"><title>directory_entry_file:rtcol2&#45;&gt;skipped_content:ltcol2</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M950.5,-69C972.334,-69 975.734,-50.1459 992.096,-44.5486"/>
+<polygon fill="black" stroke="black" points="992.661,-48.0029 1002,-43 991.579,-41.0869 992.661,-48.0029"/>
+</g>
+<!-- revision -->
+<g id="node10" class="node"><title>revision</title>
+<text text-anchor="start" x="439" y="-488" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="444,-482 444,-499 586,-499 586,-482 444,-482"/>
+<polygon fill="none" stroke="black" points="444,-482 444,-499 586,-499 586,-482 444,-482"/>
+<text text-anchor="start" x="496" y="-488" font-family="Times,serif" font-size="10.00"> revision </text>
+<text text-anchor="start" x="446" y="-472" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="519" y="-472" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="570" y="-472" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="579" y="-472" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="588" y="-472" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="446" y="-457" font-family="Times,serif" font-size="10.00"> date </text>
+<text text-anchor="start" x="519" y="-457" font-family="Times,serif" font-size="10.00"> timestamp </text>
+<text text-anchor="start" x="570" y="-457" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="579" y="-457" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="588" y="-457" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="446" y="-442" font-family="Times,serif" font-size="10.00"> committer_date </text>
+<text text-anchor="start" x="519" y="-442" font-family="Times,serif" font-size="10.00"> timestamp </text>
+<text text-anchor="start" x="570" y="-442" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="579" y="-442" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="588" y="-442" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="446" y="-427" font-family="Times,serif" font-size="10.00"> directory </text>
+<text text-anchor="start" x="519" y="-427" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="570" y="-427" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="579" y="-427" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="588" y="-427" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="446" y="-412" font-family="Times,serif" font-size="10.00"> message </text>
+<text text-anchor="start" x="519" y="-412" font-family="Times,serif" font-size="10.00"> bytea </text>
+<text text-anchor="start" x="570" y="-412" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="579" y="-412" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="588" y="-412" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="446" y="-397" font-family="Times,serif" font-size="10.00"> author </text>
+<text text-anchor="start" x="519" y="-397" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="570" y="-397" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="579" y="-397" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="588" y="-397" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="446" y="-382" font-family="Times,serif" font-size="10.00"> committer </text>
+<text text-anchor="start" x="519" y="-382" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="570" y="-382" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="579" y="-382" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="588" y="-382" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="436,-376 436,-500 594,-500 594,-376 436,-376"/>
+</g>
+<!-- directory_entry_rev&#45;&gt;revision -->
+<g id="edge13" class="edge"><title>directory_entry_rev:rtcol2&#45;&gt;revision:ltcol1</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M950.5,-279C968.95,-279 962.702,-250.388 949.5,-237.5 939.442,-227.682 836.424,-233.335 823,-237.5 723.625,-268.329 687.678,-283.403 630,-370 596.272,-420.638 649.217,-465.63 602,-504 573.544,-527.124 463.37,-529.477 437,-504 431.203,-498.4 427.224,-488.472 428.203,-481.813"/>
+<polygon fill="black" stroke="black" points="430.773,-484.216 436,-475 426.167,-478.945 430.773,-484.216"/>
+</g>
+<!-- origin -->
+<g id="node6" class="node"><title>origin</title>
+<text text-anchor="start" x="253" y="-98" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="258,-92 258,-109 347,-109 347,-92 258,-92"/>
+<polygon fill="none" stroke="black" points="258,-92 258,-109 347,-109 347,-92 258,-92"/>
+<text text-anchor="start" x="287.5" y="-98" font-family="Times,serif" font-size="10.00"> origin </text>
+<text text-anchor="start" x="260" y="-82" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="287" y="-82" font-family="Times,serif" font-size="10.00"> bigserial </text>
+<text text-anchor="start" x="331" y="-82" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="340" y="-82" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="349" y="-82" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="260" y="-67" font-family="Times,serif" font-size="10.00"> type </text>
+<text text-anchor="start" x="287" y="-67" font-family="Times,serif" font-size="10.00"> text </text>
+<text text-anchor="start" x="331" y="-67" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="340" y="-67" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="349" y="-67" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="260" y="-52" font-family="Times,serif" font-size="10.00"> url </text>
+<text text-anchor="start" x="287" y="-52" font-family="Times,serif" font-size="10.00"> text </text>
+<text text-anchor="start" x="331" y="-52" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="340" y="-52" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="349" y="-52" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="249.5,-46 249.5,-110 354.5,-110 354.5,-46 249.5,-46"/>
+</g>
+<!-- origin_visit -->
+<g id="node7" class="node"><title>origin_visit</title>
+<text text-anchor="start" x="27" y="-98" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="32,-92 32,-109 160,-109 160,-92 32,-92"/>
+<polygon fill="none" stroke="black" points="32,-92 32,-109 160,-109 160,-92 32,-92"/>
+<text text-anchor="start" x="69.5" y="-98" font-family="Times,serif" font-size="10.00"> origin_visit </text>
+<text text-anchor="start" x="34" y="-82" font-family="Times,serif" font-size="10.00"> origin </text>
+<text text-anchor="start" x="93" y="-82" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="144" y="-82" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="153" y="-82" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="162" y="-82" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="34" y="-67" font-family="Times,serif" font-size="10.00"> visit </text>
+<text text-anchor="start" x="93" y="-67" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="144" y="-67" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="153" y="-67" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="162" y="-67" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="34" y="-52" font-family="Times,serif" font-size="10.00"> date </text>
+<text text-anchor="start" x="93" y="-52" font-family="Times,serif" font-size="10.00"> timestamp </text>
+<text text-anchor="start" x="144" y="-52" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="153" y="-52" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="162" y="-52" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="34" y="-37" font-family="Times,serif" font-size="10.00"> snapshot_id </text>
+<text text-anchor="start" x="93" y="-37" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="144" y="-37" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="153" y="-37" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="162" y="-37" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="24,-31.5 24,-110.5 168,-110.5 168,-31.5 24,-31.5"/>
+</g>
+<!-- origin_visit&#45;&gt;origin -->
+<g id="edge1" class="edge"><title>origin_visit:rtcol1&#45;&gt;origin:ltcol1</title>
+<path fill="none" stroke="black" d="M168,-85C200.743,-85 211.423,-85 239.684,-85"/>
+<polygon fill="black" stroke="black" points="240,-88.5001 250,-85 240,-81.5001 240,-88.5001"/>
+</g>
+<!-- snapshot -->
+<g id="node13" class="node"><title>snapshot</title>
+<text text-anchor="start" x="242" y="-203" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="247,-197 247,-214 357,-214 357,-197 247,-197"/>
+<polygon fill="none" stroke="black" points="247,-197 247,-214 357,-214 357,-197 247,-197"/>
+<text text-anchor="start" x="281.5" y="-203" font-family="Times,serif" font-size="10.00"> snapshot </text>
+<text text-anchor="start" x="249" y="-187" font-family="Times,serif" font-size="10.00"> object_id </text>
+<text text-anchor="start" x="297" y="-187" font-family="Times,serif" font-size="10.00"> bigserial </text>
+<text text-anchor="start" x="341" y="-187" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="350" y="-187" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="359" y="-187" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="249" y="-172" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="297" y="-172" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="341" y="-172" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="350" y="-172" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="359" y="-172" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="239,-166.5 239,-215.5 365,-215.5 365,-166.5 239,-166.5"/>
+</g>
+<!-- origin_visit&#45;&gt;snapshot -->
+<g id="edge2" class="edge"><title>origin_visit:rtcol6&#45;&gt;snapshot:ltcol1</title>
+<path fill="none" stroke="black" d="M168,-39C238.012,-39 172.626,-174.152 228.887,-187.879"/>
+<polygon fill="black" stroke="black" points="228.675,-191.377 239,-189 229.447,-184.419 228.675,-191.377"/>
+</g>
+<!-- person -->
+<g id="node8" class="node"><title>person</title>
+<text text-anchor="start" x="668.5" y="-405" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="673.5,-399 673.5,-416 752.5,-416 752.5,-399 673.5,-399"/>
+<polygon fill="none" stroke="black" points="673.5,-399 673.5,-416 752.5,-416 752.5,-399 673.5,-399"/>
+<text text-anchor="start" x="697" y="-405" font-family="Times,serif" font-size="10.00"> person </text>
+<text text-anchor="start" x="675.5" y="-389" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="692.5" y="-389" font-family="Times,serif" font-size="10.00"> bigserial </text>
+<text text-anchor="start" x="736.5" y="-389" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="745.5" y="-389" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="754.5" y="-389" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="665,-383 665,-417 760,-417 760,-383 665,-383"/>
+</g>
+<!-- release -->
+<g id="node9" class="node"><title>release</title>
+<text text-anchor="start" x="33" y="-480" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="38,-474 38,-491 154,-491 154,-474 38,-474"/>
+<polygon fill="none" stroke="black" points="38,-474 38,-491 154,-491 154,-474 38,-474"/>
+<text text-anchor="start" x="79.5" y="-480" font-family="Times,serif" font-size="10.00"> release </text>
+<text text-anchor="start" x="40" y="-464" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="87" y="-464" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="138" y="-464" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="147" y="-464" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="156" y="-464" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="40" y="-449" font-family="Times,serif" font-size="10.00"> target </text>
+<text text-anchor="start" x="87" y="-449" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="138" y="-449" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="147" y="-449" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="156" y="-449" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="40" y="-434" font-family="Times,serif" font-size="10.00"> date </text>
+<text text-anchor="start" x="87" y="-434" font-family="Times,serif" font-size="10.00"> timestamp </text>
+<text text-anchor="start" x="138" y="-434" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="147" y="-434" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="156" y="-434" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="40" y="-419" font-family="Times,serif" font-size="10.00"> name </text>
+<text text-anchor="start" x="87" y="-419" font-family="Times,serif" font-size="10.00"> bytea </text>
+<text text-anchor="start" x="138" y="-419" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="147" y="-419" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="156" y="-419" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="40" y="-404" font-family="Times,serif" font-size="10.00"> comment </text>
+<text text-anchor="start" x="87" y="-404" font-family="Times,serif" font-size="10.00"> bytea </text>
+<text text-anchor="start" x="138" y="-404" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="147" y="-404" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="156" y="-404" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="40" y="-389" font-family="Times,serif" font-size="10.00"> author </text>
+<text text-anchor="start" x="87" y="-389" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="138" y="-389" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="147" y="-389" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="156" y="-389" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="30,-383.5 30,-492.5 162,-492.5 162,-383.5 30,-383.5"/>
+</g>
+<!-- release&#45;&gt;person -->
+<g id="edge3" class="edge"><title>release:rtcol7&#45;&gt;person:ltcol1</title>
+<path fill="none" stroke="black" d="M162,-391C231.43,-391 155.593,-498.518 212,-539 353.133,-640.289 459.558,-612.439 602,-513 648.989,-480.197 609.959,-401.673 655.264,-391.986"/>
+<polygon fill="black" stroke="black" points="655.882,-395.443 665.5,-391 655.211,-388.475 655.882,-395.443"/>
+</g>
+<!-- release&#45;&gt;revision -->
+<g id="edge17" class="edge"><title>release:rtcol2&#45;&gt;revision:ltcol1</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M162,-452C197.055,-452 180.315,-498.003 212,-513 287.523,-548.747 323.244,-546.015 400,-513 417.531,-505.459 414.87,-484.719 426.19,-477.509"/>
+<polygon fill="black" stroke="black" points="427.179,-480.869 436,-475 425.445,-474.087 427.179,-480.869"/>
+</g>
+<!-- revision&#45;&gt;directory -->
+<g id="edge18" class="edge"><title>revision:rtcol7&#45;&gt;directory:ltcol1</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M594,-429C641.132,-429 602.359,-237.873 636.739,-209.503"/>
+<polygon fill="black" stroke="black" points="638.27,-212.672 646.5,-206 635.905,-206.083 638.27,-212.672"/>
+</g>
+<!-- revision&#45;&gt;person -->
+<g id="edge4" class="edge"><title>revision:rtcol9&#45;&gt;person:ltcol1</title>
+<path fill="none" stroke="black" d="M594,-399C622.354,-399 631.5,-392.71 655.503,-391.285"/>
+<polygon fill="black" stroke="black" points="655.604,-394.783 665.5,-391 655.404,-387.786 655.604,-394.783"/>
+</g>
+<!-- revision&#45;&gt;person -->
+<g id="edge5" class="edge"><title>revision:rtcol10&#45;&gt;person:ltcol1</title>
+<path fill="none" stroke="black" d="M594,-384C622.188,-384 631.446,-389.456 655.197,-390.734"/>
+<polygon fill="black" stroke="black" points="655.413,-394.241 665.5,-391 655.594,-387.243 655.413,-394.241"/>
+</g>
+<!-- revision_history -->
+<g id="node11" class="node"><title>revision_history</title>
+<text text-anchor="start" x="237" y="-488" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="242,-482 242,-499 362,-499 362,-482 242,-482"/>
+<polygon fill="none" stroke="black" points="242,-482 242,-499 362,-499 362,-482 242,-482"/>
+<text text-anchor="start" x="266" y="-488" font-family="Times,serif" font-size="10.00"> revision_history </text>
+<text text-anchor="start" x="244" y="-472" font-family="Times,serif" font-size="10.00"> id </text>
+<text text-anchor="start" x="302" y="-472" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="346" y="-472" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="355" y="-472" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="364" y="-472" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="244" y="-457" font-family="Times,serif" font-size="10.00"> parent_id </text>
+<text text-anchor="start" x="302" y="-457" font-family="Times,serif" font-size="10.00"> sha1_git </text>
+<text text-anchor="start" x="346" y="-457" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="355" y="-457" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="364" y="-457" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="244" y="-442" font-family="Times,serif" font-size="10.00"> parent_rank </text>
+<text text-anchor="start" x="302" y="-442" font-family="Times,serif" font-size="10.00"> integer </text>
+<text text-anchor="start" x="346" y="-442" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="355" y="-442" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="364" y="-442" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="234,-436 234,-500 370,-500 370,-436 234,-436"/>
+</g>
+<!-- revision_history&#45;&gt;revision -->
+<g id="edge6" class="edge"><title>revision_history:rtcol1&#45;&gt;revision:ltcol1</title>
+<path fill="none" stroke="black" d="M370,-475C395.667,-475 404.49,-475 425.945,-475"/>
+<polygon fill="black" stroke="black" points="426,-478.5 436,-475 426,-471.5 426,-478.5"/>
+</g>
+<!-- revision_history&#45;&gt;revision -->
+<g id="edge19" class="edge"><title>revision_history:rtcol2&#45;&gt;revision:ltcol1</title>
+<path fill="none" stroke="black" stroke-dasharray="5,2" d="M370,-459C396.41,-459 404.025,-471.25 425.736,-474.312"/>
+<polygon fill="black" stroke="black" points="425.788,-477.824 436,-475 426.256,-470.839 425.788,-477.824"/>
+</g>
+<!-- snapshot_branch -->
+<g id="node14" class="node"><title>snapshot_branch</title>
+<text text-anchor="start" x="223" y="-308" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="228,-302 228,-319 376,-319 376,-302 228,-302"/>
+<polygon fill="none" stroke="black" points="228,-302 228,-319 376,-319 376,-302 228,-302"/>
+<text text-anchor="start" x="265" y="-308" font-family="Times,serif" font-size="10.00"> snapshot_branch </text>
+<text text-anchor="start" x="230" y="-292" font-family="Times,serif" font-size="10.00"> object_id </text>
+<text text-anchor="start" x="286" y="-292" font-family="Times,serif" font-size="10.00"> bigserial </text>
+<text text-anchor="start" x="360" y="-292" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="369" y="-292" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="378" y="-292" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="230" y="-277" font-family="Times,serif" font-size="10.00"> name </text>
+<text text-anchor="start" x="286" y="-277" font-family="Times,serif" font-size="10.00"> bytea </text>
+<text text-anchor="start" x="360" y="-277" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="369" y="-277" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="378" y="-277" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="230" y="-262" font-family="Times,serif" font-size="10.00"> target </text>
+<text text-anchor="start" x="286" y="-262" font-family="Times,serif" font-size="10.00"> bytea </text>
+<text text-anchor="start" x="360" y="-262" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="369" y="-262" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="378" y="-262" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="230" y="-247" font-family="Times,serif" font-size="10.00"> target_type </text>
+<text text-anchor="start" x="286" y="-247" font-family="Times,serif" font-size="10.00"> snapshot_target </text>
+<text text-anchor="start" x="360" y="-247" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="369" y="-247" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="378" y="-247" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="220,-241.5 220,-320.5 384,-320.5 384,-241.5 220,-241.5"/>
+</g>
+<!-- snapshot_branch&#45;&gt;revision -->
+<g id="edge9" class="edge"><title>snapshot_branch:rtcol3&#45;&gt;revision:ltcol1</title>
+<path fill="none" stroke="black" d="M384,-264C476.811,-264 350.082,-458.838 425.742,-474.059"/>
+<polygon fill="black" stroke="black" points="425.722,-477.572 436,-475 426.361,-470.601 425.722,-477.572"/>
+</g>
+<!-- snapshot_branches -->
+<g id="node15" class="node"><title>snapshot_branches</title>
+<text text-anchor="start" x="36" y="-255" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="#e5e5e5" stroke="none" points="41,-249 41,-266 152,-266 152,-249 41,-249"/>
+<polygon fill="none" stroke="black" points="41,-249 41,-266 152,-266 152,-249 41,-249"/>
+<text text-anchor="start" x="55.5" y="-255" font-family="Times,serif" font-size="10.00"> snapshot_branches </text>
+<text text-anchor="start" x="43" y="-239" font-family="Times,serif" font-size="10.00"> snapshot_id </text>
+<text text-anchor="start" x="102" y="-239" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="136" y="-239" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="145" y="-239" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="154" y="-239" font-family="Times,serif" font-size="10.00"> </text>
+<text text-anchor="start" x="43" y="-224" font-family="Times,serif" font-size="10.00"> branch_id </text>
+<text text-anchor="start" x="102" y="-224" font-family="Times,serif" font-size="10.00"> bigint </text>
+<text text-anchor="start" x="136" y="-224" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="145" y="-224" font-family="Times,serif" font-size="10.00"> &#160;</text>
+<text text-anchor="start" x="154" y="-224" font-family="Times,serif" font-size="10.00"> </text>
+<polygon fill="none" stroke="black" points="32.5,-218.5 32.5,-267.5 159.5,-267.5 159.5,-218.5 32.5,-218.5"/>
+</g>
+<!-- snapshot_branches&#45;&gt;snapshot -->
+<g id="edge7" class="edge"><title>snapshot_branches:rtcol1&#45;&gt;snapshot:ltcol1</title>
+<path fill="none" stroke="black" d="M160,-241C198.258,-241 197.496,-197.924 228.743,-190.184"/>
+<polygon fill="black" stroke="black" points="229.467,-193.623 239,-189 228.665,-186.67 229.467,-193.623"/>
+</g>
+<!-- snapshot_branches&#45;&gt;snapshot_branch -->
+<g id="edge8" class="edge"><title>snapshot_branches:rtcol2&#45;&gt;snapshot_branch:ltcol1</title>
+<path fill="none" stroke="black" d="M160,-226C196.829,-226 182.806,-282.669 210.101,-293.294"/>
+<polygon fill="black" stroke="black" points="209.551,-296.751 220,-295 210.74,-289.853 209.551,-296.751"/>
+</g>
+</g>
+</svg>
diff --git a/docs/graph/dataset.rst b/docs/graph/dataset.rst
index 3fbe822..d27822d 100644
--- a/docs/graph/dataset.rst
+++ b/docs/graph/dataset.rst
@@ -1,86 +1,89 @@
Dataset
=======
We provide the full graph dataset along with two "teaser" datasets that can be
used for trying out smaller-scale experiments before using the full graph.
All the main URLs are relative to our dataset prefix:
`https://annex.softwareheritage.org/public/dataset/ <https://annex.softwareheritage.org/public/dataset/>`__.
The Software Heritage Graph Dataset contains a table representation of the full
Software Heritage Graph. It is available in the following formats:
- **PostgreSQL (compressed)**:
+ - **Total size**: 1.2 TiB
- **URL**: `/graph/latest/sql/
<https://annex.softwareheritage.org/public/dataset/graph/latest/sql/>`_
- - **Total size**: 1.2 TiB
- **Apache Parquet**:
+ - **Total size**: 1.2 TiB
- **URL**: `/graph/latest/parquet/
<https://annex.softwareheritage.org/public/dataset/graph/latest/parquet/>`_
- - **Total size**: 1.2 TiB
+ - **S3**: ``s3://softwareheritage/graph``
Teaser datasets
---------------
If the above dataset is too big, we also provide the following "teaser"
datasets, which can get you started and have a smaller footprint.
popular-4k
~~~~~~~~~~
The ``popular-4k`` teaser contains a subset of 4000 popular
repositories from GitHub, GitLab, PyPI and Debian. The selection criteria used to
pick the software origins were the following:
- The 1000 most popular GitHub projects (by number of stars)
- The 1000 most popular GitLab projects (by number of stars)
- The 1000 most popular PyPI projects (by usage statistics, according to the
`Top PyPI Packages <https://hugovk.github.io/top-pypi-packages/>`_ database),
- The 1000 most popular Debian packages (by "votes" according to the `Debian
Popularity Contest <https://popcon.debian.org/>`_ database)
This teaser is available in the following formats:
- **PostgreSQL (compressed)**:
+ - **Total size**: 23 GiB
- **URL**: `/graph/latest/popular-4k/sql/
<https://annex.softwareheritage.org/public/dataset/graph/latest/popular-4k/sql/>`_
- - **Total size**: TODO
- **Apache Parquet**:
+ - **Total size**: 27 GiB
- **URL**: `/graph/latest/popular-4k/parquet/
<https://annex.softwareheritage.org/public/dataset/graph/latest/popular-4k/parquet/>`_
- - **Total size**: TODO
+ - **S3**: ``s3://softwareheritage/teasers/popular-4k``
popular-3k-python
~~~~~~~~~~~~~~~~~
The ``popular-3k-python`` teaser contains a subset of 3052 popular
repositories **tagged as being written in the Python language**, from GitHub,
GitLab, PyPI and Debian. The selection criteria used to pick the software origins
were the following, similar to ``popular-4k``:
- the 1000 most popular GitHub projects written in Python (by number of stars),
- the 131 GitLab projects written in Python that have 2 stars or more,
- the 1000 most popular PyPI projects (by usage statistics, according to the
`Top PyPI Packages <https://hugovk.github.io/top-pypi-packages/>`_ database),
- the 1000 most popular Debian packages with the
`debtag <https://debtags.debian.org/>`_ ``implemented-in::python`` (by
"votes" according to the `Debian Popularity Contest
<https://popcon.debian.org/>`_ database).
- **PostgreSQL (compressed)**:
+ - **Total size**: 4.7 GiB
- **URL**: `/graph/latest/popular-3k-python/sql/
<https://annex.softwareheritage.org/public/dataset/graph/latest/popular-3k-python/sql/>`_
- - **Total size**: TODO
- **Apache Parquet**:
+ - **Total size**: 5.3 GiB
- **URL**: `/graph/latest/popular-3k-python/parquet/
<https://annex.softwareheritage.org/public/dataset/graph/latest/popular-3k-python/parquet/>`_
- - **Total size**: TODO
+ - **S3**: ``s3://softwareheritage/teasers/popular-3k-python``
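The dataset page gives every URL relative to the dataset prefix. As a minimal sketch (the helper name ``dataset_url`` is hypothetical, not part of any Software Heritage tooling), the full URL can be rebuilt as follows. Note that the relative paths are written with a leading ``/``, which has to be stripped before joining, otherwise the path would be resolved against the host root.

```python
from urllib.parse import urljoin

# Dataset prefix quoted on the dataset page.
DATASET_PREFIX = "https://annex.softwareheritage.org/public/dataset/"


def dataset_url(relative_path: str) -> str:
    """Resolve a path such as "/graph/latest/sql/" against the dataset prefix.

    Hypothetical helper for illustration only.
    """
    # A leading "/" would make urljoin drop "public/dataset/", so remove it.
    return urljoin(DATASET_PREFIX, relative_path.lstrip("/"))


if __name__ == "__main__":
    for path in ("/graph/latest/sql/", "/graph/latest/popular-4k/parquet/"):
        print(dataset_url(path))
```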
diff --git a/docs/graph/index.rst b/docs/graph/index.rst
index 26fea8b..71707b7 100644
--- a/docs/graph/index.rst
+++ b/docs/graph/index.rst
@@ -1,54 +1,55 @@
.. _swh-graph-dataset:
Software Heritage Graph Dataset
===============================
This is the Software Heritage graph dataset: a fully-deduplicated Merkle
DAG representation of the Software Heritage archive. The dataset links
together file content identifiers, source code directories, Version
Control System (VCS) commits tracking evolution over time, up to the
full states of VCS repositories as observed by Software Heritage during
periodic crawls. The dataset’s contents come from major development
forges (including `GitHub <https://github.com/>`__ and
`GitLab <https://gitlab.com>`__), FOSS distributions (e.g.,
`Debian <https://www.debian.org/>`__), and language-specific package managers (e.g.,
`PyPI <https://pypi.org/>`__). Crawling information is also included,
providing timestamps about when and where all archived source code
artifacts have been observed in the wild.
The Software Heritage graph dataset is available in multiple formats,
including downloadable CSV dumps and Apache Parquet files for local use,
as well as a public instance on Amazon Athena interactive query service
for ready-to-use powerful analytical processing.
By accessing the dataset, you agree with the Software Heritage `Ethical
Charter for using the archive
data <https://www.softwareheritage.org/legal/users-ethical-charter/>`__,
and the `terms of use for bulk
access <https://www.softwareheritage.org/legal/bulk-access-terms-of-use/>`__.
If you use this dataset for research purposes, please cite the following paper:
*
| Antoine Pietri, Diomidis Spinellis, Stefano Zacchiroli.
| *The Software Heritage Graph Dataset: Public software development under one roof.*
| In proceedings of `MSR 2019 <http://2019.msrconf.org/>`_: The 16th International Conference on Mining Software Repositories, May 2019, Montreal, Canada. Co-located with `ICSE 2019 <https://2019.icse-conferences.org/>`_.
| `preprint <https://upsilon.cc/~zack/research/publications/msr-2019-swh.pdf>`_, `bibtex <https://upsilon.cc/~zack/research/publications/msr-2019-swh.bib>`_
.. toctree::
:maxdepth: 2
:caption: Contents:
dataset
+ schema
postgresql
athena
databricks
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
diff --git a/docs/graph/postgresql.rst b/docs/graph/postgresql.rst
index 02ab33f..0cb0c38 100644
--- a/docs/graph/postgresql.rst
+++ b/docs/graph/postgresql.rst
@@ -1,98 +1,98 @@
Setup on a PostgreSQL instance
==============================
This tutorial will guide you through the steps required to set up the Software
Heritage Graph Dataset in a PostgreSQL database.
.. highlight:: bash
PostgreSQL local setup
----------------------
You need to have access to a running PostgreSQL instance to load the dataset.
This section contains information on how to set up PostgreSQL for the first
time.
*If you already have a PostgreSQL server running on your machine, you can skip
to the next section.*
- For **Ubuntu** and **Debian**::
sudo apt install postgresql
- For **Archlinux**::
sudo pacman -S --needed postgresql
sudo -u postgres initdb -D '/var/lib/postgres/data'
sudo systemctl enable --now postgresql
Once PostgreSQL is running, you also need a user that will be able to create
databases and run queries. The easiest way to achieve that is simply to create
an account that has the same name as your username and that can create
databases::
sudo -u postgres createuser --createdb $USER
Retrieving the dataset
----------------------
You need to download the dataset in SQL format. Use the following command on
your machine, after making sure that it has enough available space for the
dataset you chose:
.. tabs::
.. group-tab:: full
::
- mkdir full && cd full
- wget -c -A gz,sql -nd -r -np -nH https://annex.softwareheritage.org/public/dataset/graph/2019-01-28/sql/
+ mkdir swhgd && cd swhgd
+ wget -c -q --show-progress -A gz,sql -nd -r -np -nH https://annex.softwareheritage.org/public/dataset/graph/2019-01-28/sql/
.. group-tab:: teaser: popular-4k
::
- mkdir full && cd full
- wget -c -A gz,sql -nd -r -np -nH https://annex.softwareheritage.org/public/dataset/graph/2019-01-28/popular-4k/sql/
+ mkdir popular-4k && cd popular-4k
+ wget -c -q --show-progress -A gz,sql -nd -r -np -nH https://annex.softwareheritage.org/public/dataset/graph/2019-01-28/popular-4k/sql/
.. group-tab:: teaser: popular-3k-python
::
- mkdir full && cd full
- wget -c -A gz,sql -nd -r -np -nH https://annex.softwareheritage.org/public/dataset/graph/2019-01-28/popular-3k-python/sql/
+ mkdir popular-3k-python && cd popular-3k-python
+ wget -c -q --show-progress -A gz,sql -nd -r -np -nH https://annex.softwareheritage.org/public/dataset/graph/2019-01-28/popular-3k-python/sql/
Loading the dataset
-------------------
Once you have retrieved the dataset of your choice, create a database that will
contain it, and load the dataset into it:
.. tabs::
.. group-tab:: full
::
createdb swhgd
- psql swhgd < swh_import.sql
+ psql swhgd < load.sql
.. group-tab:: teaser: popular-4k
::
createdb swhgd-popular-4k
- psql swhgd-popular-4k < swh_import.sql
+ psql swhgd-popular-4k < load.sql
.. group-tab:: teaser: popular-3k-python
::
createdb swhgd-popular-3k-python
- psql swhgd-popular-3k-python < swh_import.sql
+ psql swhgd-popular-3k-python < load.sql
You can now run SQL queries on your database. Run ``psql <database_name>`` to
start an interactive PostgreSQL console.
diff --git a/docs/graph/schema.rst b/docs/graph/schema.rst
new file mode 100644
index 0000000..f4fa884
--- /dev/null
+++ b/docs/graph/schema.rst
@@ -0,0 +1,128 @@
+Relational schema
+=================
+
+The Merkle DAG of the Software Heritage archive is encoded in the dataset as a
+set of relational tables.
+A simplified view of the corresponding database schema is shown here:
+
+.. image:: _images/db-schema.svg
+
+This page documents the details of the schema.
+
+- **content**: contains information on the contents stored in
+ the archive.
+
+ - ``sha1`` (bytes): the SHA-1 of the content
+ - ``sha1_git`` (bytes): the Git SHA-1 of the content
+ - ``length`` (integer): the length of the content
+
+- **skipped_content**: contains information on the contents that were not archived for
+ various reasons.
+
+ - ``sha1`` (bytes): the SHA-1 of the missing content
+ - ``sha1_git`` (bytes): the Git SHA-1 of the missing content
+ - ``length`` (integer): the length of the missing content
+
+- **directory**: contains the directories stored in the archive.
+
+ - ``id`` (bytes): the intrinsic identifier of the directory, recursively
+ computed with the Git SHA-1 algorithm
+ - ``dir_entries`` (array of integers): the list of directories contained in
+ this directory, as references to an entry in the ``directory_entry_dir``
+ table.
+ - ``file_entries`` (array of integers): the list of files contained in
+ this directory, as references to an entry in the ``directory_entry_file``
+ table.
+ - ``rev_entries`` (array of integers): the list of revisions contained in
+ this directory, as references to an entry in the ``directory_entry_rev``
+ table.
+
+- **directory_entry_file**: contains information about file entries in
+ directories.
+
+ - ``id`` (integer): unique identifier for the entry
+ - ``target`` (bytes): the Git SHA-1 of the content this entry points to
+ - ``name`` (bytes): the name of the file (basename of its path)
+ - ``perms`` (integer): the permissions of the file
+
+- **directory_entry_dir**: contains information about directory entries in
+ directories.
+
+ - ``id`` (integer): unique identifier for the entry
+ - ``target`` (bytes): the Git SHA-1 of the directory this entry points to
+ - ``name`` (bytes): the name of the directory
+ - ``perms`` (integer): the permissions of the directory
+
+- **directory_entry_rev**: contains information about revision entries in
+ directories.
+
+ - ``id`` (integer): unique identifier for the entry
+ - ``target`` (bytes): the Git SHA-1 of the revision this entry points to
+ - ``name`` (bytes): the name of the directory that contains this revision
+ - ``perms`` (integer): the permissions of the revision
+
+- **revision**: contains the revisions stored in the archive.
+
+ - ``id`` (bytes): the intrinsic identifier of the revision, recursively
+ computed with the Git SHA-1 algorithm. For Git repositories, this
+ corresponds to the commit hash.
+
+- The ``revision`` table contains all the revisions, identified by
+ their intrinsic hash in the ``id`` field. Each revision points to the
+ root directory of the project source tree, identified by the
+ ``directory`` field which references the ``sha1_git`` cryptographic
+ hash of the directory. The table also contains metadata on the
+ revisions, notably the ``author`` and ``committer`` fields, the
+ ``date`` and ``committer_date`` fields and the ``message`` field.
+
+ Each revision has an ordered set of parents (0 for the initial commit
+ of a repository, 1 for a normal commit and 2 or more for a merge
+ commit). These parents are stored in the ``revision_history`` table,
+ one row per parent. Each parent is identified by the ``id``
+ identifier, pointing to the hash of the revision, the ``parent_id``
+ identifier, pointing to the hash of the parent revision, and the
+ ``parent_rank`` integer which defines the order of the parents of
+ each revision.
+
+- The ``person`` table deduplicates commit authors by their name and
+ e-mail addresses. For pseudonymization purposes and in order to
+ prevent abuse, these columns were removed from the dataset, and this
+ table only contains the ``id`` column referenced by the ``author``
+ and ``committer`` fields of the ``revision`` table. Individual
+ authors may be retrieved using this ID from the Software Heritage
+ API.
+
+- The ``release`` table contains the releases in the archive. They are
+ also identified by their intrinsic hash ``id`` and point to a
+ revision referenced by its hash in the ``target`` field. The metadata
+ fields are semantically similar to those of the ``revision`` table (i.e.,
+ ``author``, ``date``, ``comment``).
+
+- The ``snapshot`` table contains the list of snapshots identified by
+ their intrinsic hash ``id``, and their integer primary key in the
+ archive ``object_id``. Each snapshot maps to a list of branches
+ listed in the table ``snapshot_branch`` through the many-to-many
+ relationship intermediate table ``snapshot_branches``, which
+ references the ``object_id`` fields of the ``snapshot`` and
+ ``snapshot_branch`` tables. The ``snapshot_branch`` table also
+ contains the ``name`` of the branch and the ``target`` it points to
+ (identified by its intrinsic hash), either a ``release``,
+ ``revision``, ``directory`` or ``content`` object depending on the
+ value of the ``target_type`` field.
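Two of the relationships described above, the ordered parents in ``revision_history`` and the many-to-many link between ``snapshot`` and ``snapshot_branch``, can be sketched with a toy in-memory SQLite database. This is only an illustration: the real dataset is loaded in PostgreSQL, the column types are approximated (``target_type`` is plain text here), every row is made up, and the function names are hypothetical. Only the table and column names come from the schema description.

```python
import sqlite3

# Toy SQLite stand-in for the PostgreSQL tables; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revision_history (id BLOB, parent_id BLOB, parent_rank INTEGER);
    CREATE TABLE snapshot (object_id INTEGER PRIMARY KEY, id BLOB);
    CREATE TABLE snapshot_branch (
        object_id INTEGER PRIMARY KEY, name BLOB, target BLOB, target_type TEXT
    );
    CREATE TABLE snapshot_branches (snapshot_id INTEGER, branch_id INTEGER);
""")
# One merge revision with two parents, deliberately inserted out of order.
conn.executemany(
    "INSERT INTO revision_history VALUES (?, ?, ?)",
    [(b"merge", b"parent-b", 1), (b"merge", b"parent-a", 0)],
)
# One snapshot with two branches, linked through snapshot_branches.
conn.execute("INSERT INTO snapshot VALUES (?, ?)", (1, b"snap"))
conn.executemany(
    "INSERT INTO snapshot_branch VALUES (?, ?, ?, ?)",
    [
        (10, b"refs/heads/master", b"rev-1", "revision"),
        (11, b"refs/tags/v1.0", b"rel-1", "release"),
    ],
)
conn.executemany("INSERT INTO snapshot_branches VALUES (?, ?)", [(1, 10), (1, 11)])


def parents_of(revision_id: bytes) -> list:
    """Return the parent hashes of a revision, ordered by parent_rank."""
    rows = conn.execute(
        "SELECT parent_id FROM revision_history WHERE id = ? ORDER BY parent_rank",
        (revision_id,),
    )
    return [parent_id for (parent_id,) in rows]


def branches_of(snapshot_hash: bytes) -> list:
    """Return (name, target_type, target) for each branch of a snapshot."""
    rows = conn.execute(
        """
        SELECT b.name, b.target_type, b.target
        FROM snapshot AS s
        JOIN snapshot_branches AS sb ON sb.snapshot_id = s.object_id
        JOIN snapshot_branch AS b ON b.object_id = sb.branch_id
        WHERE s.id = ?
        ORDER BY b.name
        """,
        (snapshot_hash,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(parents_of(b"merge"))
    print(branches_of(b"snap"))
```

The join goes through the integer ``object_id`` columns rather than the intrinsic hashes, which mirrors how ``snapshot_branches`` is described above.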
+
+In addition to the nodes and edges of the graph, the dataset also
+contains crawling information, as a set of triples capturing where (an
+origin url) and when (a timestamp) a given snapshot has been
+encountered.
+
+- The ``origin`` table contains the origins from which the software
+ projects in the dataset were archived, identified by their ``id``
+ identifier, and ``type`` and ``url`` metadata.
+
+ Since Software Heritage archives software continuously, software
+ origins are crawled more than once. Every “visit” of an origin is
+ stored in the ``origin_visit`` table, which contains the identifier
+ ``origin`` of the origin visited, the ``date`` of the visit and a
+ ``snapshot_id`` integer which points to the ``object_id`` identifier
+ of the ``snapshot`` table.
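The crawling information can be followed in the same way. The sketch below, again a toy SQLite model with made-up rows and a hypothetical ``visits_of`` helper, lists the visits of one origin together with the hash of the snapshot observed at each visit. Dates are stored as ISO 8601 strings here purely for convenience.

```python
import sqlite3

# Toy SQLite stand-in for the PostgreSQL tables; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE origin (id INTEGER PRIMARY KEY, type TEXT, url TEXT);
    CREATE TABLE origin_visit (origin INTEGER, date TEXT, snapshot_id INTEGER);
    CREATE TABLE snapshot (object_id INTEGER PRIMARY KEY, id BLOB);
""")
conn.execute(
    "INSERT INTO origin VALUES (?, ?, ?)",
    (1, "git", "https://example.org/repo.git"),
)
conn.executemany(
    "INSERT INTO snapshot VALUES (?, ?)",
    [(1, b"snap-old"), (2, b"snap-new")],
)
conn.executemany(
    "INSERT INTO origin_visit VALUES (?, ?, ?)",
    [(1, "2019-01-15T00:00:00", 2), (1, "2018-06-01T00:00:00", 1)],
)


def visits_of(url: str) -> list:
    """Return (date, snapshot hash) for every visit of an origin, oldest first."""
    rows = conn.execute(
        """
        SELECT ov.date, s.id
        FROM origin AS o
        JOIN origin_visit AS ov ON ov.origin = o.id
        JOIN snapshot AS s ON s.object_id = ov.snapshot_id
        WHERE o.url = ?
        ORDER BY ov.date
        """,
        (url,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    for date, snapshot_hash in visits_of("https://example.org/repo.git"):
        print(date, snapshot_hash.decode())
```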
+

File Metadata

Mime Type
image/svg+xml
Expires
Sun, May 18, 2:47 PM (1 d, 17 h)
Storage Engine
blob
Storage Format
Raw Data
Storage Handle
3214707

Event Timeline