diff --git a/docs/_images/athena_tables.png b/docs/_images/athena_tables.png
deleted file mode 100644
index 94f67de..0000000
Binary files a/docs/_images/athena_tables.png and /dev/null differ
diff --git a/docs/athena.rst b/docs/athena.rst
deleted file mode 100644
index cf80875..0000000
--- a/docs/athena.rst
+++ /dev/null
@@ -1,115 +0,0 @@
-Setup on Amazon Athena
-======================
-
-The Software Heritage Graph Dataset is available as a public dataset in `Amazon
-Athena `_. Athena uses `presto
-`_, a distributed SQL query engine, to
-automatically scale queries on large datasets.
-
-The pricing of Athena depends on the amount of data scanned by each query,
-generally at a cost of $5 per TiB of data scanned. Full pricing details are
-available `here `_.
-
-Note that because the Software Heritage Graph Dataset is available as a public
-dataset, you **do not have to pay for the storage, only for the queries**
-(except for the data you store on S3 yourself, like query results).
-
-
-Loading the tables
-------------------
-
-.. highlight:: bash
-
-AWS account
-~~~~~~~~~~~
-
-In order to use Amazon Athena, you will first need to `create an AWS account
-and set up billing
-`_.
-
-
-Setup
-~~~~~
-
-Athena needs to be made aware of the location and the schema of the Parquet
-files available as a public dataset. Unfortunately, since Athena does not
-support queries that contain multiple commands, it is not as simple as pasting
-an installation script in the console. Instead, we provide a Python script that
-can be run locally on your machine and that will communicate with Athena to
-create the tables automatically with the appropriate schema.
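For readers curious what such a script does under the hood, here is a minimal sketch: build a ``CREATE EXTERNAL TABLE`` DDL statement for Parquet files on S3, then submit it to Athena with boto3. The table name, columns, and S3 location below are illustrative placeholders, not the real dataset schema; the provided ``tables.py`` and ``gen_schema.py`` scripts remain the authoritative source.

```python
def make_create_table_ddl(name, columns, location):
    """Build a CREATE EXTERNAL TABLE statement for Parquet data on S3."""
    cols = ",\n  ".join("{} {}".format(col, typ) for col, typ in columns)
    return (
        "CREATE EXTERNAL TABLE IF NOT EXISTS {} (\n  {}\n)\n"
        "STORED AS PARQUET\nLOCATION '{}';".format(name, cols, location)
    )


def submit_ddl(ddl, output_location):
    """Submit the DDL to Athena (requires AWS credentials, see `aws configure`)."""
    import boto3  # imported here so the DDL builder stays usable offline
    client = boto3.client("athena")
    return client.start_query_execution(
        QueryString=ddl,
        ResultConfiguration={"OutputLocation": output_location},
    )


if __name__ == "__main__":
    ddl = make_create_table_ddl(
        "directory_entry_file",                      # table from the dataset
        [("target", "binary"), ("name", "binary")],  # illustrative columns
        "s3://softwareheritage/graph/",              # placeholder S3 location
    )
    print(ddl)
```

The real scripts generate one such statement per table and submit them one at a time, which is exactly the "multiple commands" limitation the console cannot work around.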
-
-To run this script, you will need to install a few dependencies on your
-machine:
-
-- For **Ubuntu** and **Debian**::
-
-    sudo apt install python3 python3-boto3 awscli
-
-- For **Archlinux**::
-
-    sudo pacman -S --needed python python-boto3 aws-cli
-
-Once the dependencies are installed, run::
-
-    aws configure
-
-This will ask for an AWS Access Key ID and an AWS Secret Access Key in
-order to give Python access to your AWS account. These keys can be generated at
-`this address
-`_.
-
-It will also ask for the region in which you want to run the queries. We
-recommend using ``us-east-1``, since that is where the public dataset is
-located.
-
-Creating the tables
-~~~~~~~~~~~~~~~~~~~
-
-Download and run the Python script that will create the tables on your account:
-
-.. tabs::
-
-    .. group-tab:: full
-
-        ::
-
-            wget https://annex.softwareheritage.org/public/dataset/graph/latest/athena/tables.py
-            wget https://annex.softwareheritage.org/public/dataset/graph/latest/athena/gen_schema.py
-            ./gen_schema.py
-
-    .. group-tab:: popular-4k
-
-        This dataset is not available on Athena yet.
-
-    .. group-tab:: popular-3k-python
-
-        This dataset is not available on Athena yet.
-
-To check that the tables have been successfully created in your account, you
-can open your `Amazon Athena console
-`_. You should be able to select
-the database corresponding to your dataset, and see the tables:
-
-.. image:: _images/athena_tables.png
-
-
-Running queries
----------------
-
-.. highlight:: sql
-
-From the console, once you have selected the database of your dataset, you can
-run SQL queries directly from the Query Editor.
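Before moving on to heavier analytical queries, a cheap sanity-check query can confirm that the tables respond. A minimal sketch, assuming the ``revision`` table of the dataset (on Parquet, a plain ``COUNT(*)`` reads mostly column metadata, so it scans little data):

```sql
-- Count the number of revisions (commits) in the archive.
SELECT COUNT(*) AS nb_revisions
FROM revision;
```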
-
-Try for instance this query that computes the most frequent file names in the
-archive::
-
-    SELECT from_utf8(name, '?') AS name, COUNT(DISTINCT target) AS cnt
-    FROM directory_entry_file
-    GROUP BY name
-    ORDER BY cnt DESC
-    LIMIT 10;
-
-Other examples are available in the preprint of our article: `The Software
-Heritage Graph Dataset: Public software development under one roof.
-`_
diff --git a/docs/datasets.rst b/docs/datasets.rst
deleted file mode 100644
index dcd5cd1..0000000
--- a/docs/datasets.rst
+++ /dev/null
@@ -1,87 +0,0 @@
-Dataset
-=======
-
-We provide the full graph dataset along with two "teaser" datasets that can be
-used for trying out smaller-scale experiments before using the full graph.
-
-The main URLs of the datasets are relative to our dataset prefix:
-`https://annex.softwareheritage.org/public/dataset/ `__
-
-
-Main dataset
-------------
-
-The main dataset contains the full Software Heritage Graph. It is available
-in the following formats:
-
-- **PostgreSQL (compressed)**:
-
-  - **URL**: `/graph/latest/sql/
-    `_
-  - **Total size**: 1.2 TiB
-
-- **Apache Parquet**:
-
-  - **URL**: `/graph/latest/parquet/
-    `_
-  - **Total size**: 1.2 TiB
-
-Teaser datasets
----------------
-
-popular-4k
-~~~~~~~~~~
-
-The ``popular-4k`` teaser contains a subset of 4000 popular
-repositories from GitHub, Gitlab, PyPI and Debian. The selection criteria to
-pick the software origins were the following:
-
-- The 1000 most popular GitHub projects (by number of stars)
-- The 1000 most popular Gitlab projects (by number of stars)
-- The 1000 most popular PyPI projects (by usage statistics, according to the
-  `Top PyPI Packages `_ database),
-- The 1000 most popular Debian packages (by "votes" according to the `Debian
-  Popularity Contest `_ database)
-
-This teaser is available in the following formats:
-
-- **PostgreSQL (compressed)**:
-
-  - **URL**: `/graph/latest/popular-4k/sql/
-    `_
-  - **Total size**: TODO
-
-- **Apache Parquet**:
-
-  - **URL**: `/graph/latest/popular-4k/parquet/
-    `_
-  - **Total size**: TODO
-
-popular-3k-python
-~~~~~~~~~~~~~~~~~
-
-The ``popular-3k-python`` teaser contains a subset of 3052 popular
-repositories **tagged as being written in the Python language**, from GitHub,
-Gitlab, PyPI and Debian. The selection criteria to pick the software origins
-were the following, similar to ``popular-4k``:
-
-- the 1000 most popular GitHub projects written in Python (by number of stars),
-- the 131 Gitlab projects written in Python that have 2 stars or more,
-- the 1000 most popular PyPI projects (by usage statistics, according to the
-  `Top PyPI Packages `_ database),
-- the 1000 most popular Debian packages with the
-  `debtag `_ ``implemented-in::python`` (by
-  "votes" according to the `Debian Popularity Contest
-  `_ database).
-
-This teaser is available in the following formats:
-
-- **PostgreSQL (compressed)**:
-
-  - **URL**: `/graph/latest/popular-3k-python/sql/
-    `_
-  - **Total size**: TODO
-
-- **Apache Parquet**:
-
-  - **URL**: `/graph/latest/popular-3k-python/parquet/
-    `_
-  - **Total size**: TODO
diff --git a/docs/index.rst b/docs/index.rst
index 0e99243..f251325 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,53 +1,11 @@
 .. _swh-dataset:
-Software Heritage Graph Dataset
-===============================
+Software Heritage Datasets
+==========================
-
-This is the Software Heritage graph dataset: a fully-deduplicated Merkle
-DAG representation of the Software Heritage archive. The dataset links
-together file content identifiers, source code directories, Version
-Control System (VCS) commits tracking evolution over time, up to the
-full states of VCS repositories as observed by Software Heritage during
-periodic crawls. The dataset’s contents come from major development
-forges (including `GitHub `__ and
-`GitLab `__), FOSS distributions (e.g.,
-`Debian `__), and language-specific package managers (e.g.,
-`PyPI `__). Crawling information is also included,
-providing timestamps about when and where all archived source code
-artifacts have been observed in the wild.
+This page lists the different public datasets and periodic data dumps of the
+archive published by Software Heritage.
-
-The Software Heritage graph dataset is available in multiple formats,
-including downloadable CSV dumps and Apache Parquet files for local use,
-as well as a public instance on Amazon Athena interactive query service
-for ready-to-use powerful analytical processing.
-
-By accessing the dataset, you agree with the Software Heritage `Ethical
-Charter for using the archive
-data `__,
-and the `terms of use for bulk
-access `__.
-
-
-If you use this dataset for research purposes, please cite the following paper:
-
-*
-  | Antoine Pietri, Diomidis Spinellis, Stefano Zacchiroli.
-  | *The Software Heritage Graph Dataset: Public software development under one roof.*
-  | In proceedings of `MSR 2019 `_: The 16th International Conference on Mining Software Repositories, May 2019, Montreal, Canada. Co-located with `ICSE 2019 `_.
-  | `preprint `_, `bibtex `_
-
-.. toctree::
-   :maxdepth: 2
-   :caption: Contents:
-
-   datasets
-   postgresql
-   athena
-
-
-Indices and tables
-==================
-
-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
+:ref:`The Software Heritage Graph Dataset `
+  the entire graph of Software Heritage in a fully-deduplicated Merkle DAG
+  representation.
diff --git a/docs/postgresql.rst b/docs/postgresql.rst
deleted file mode 100644
index b3e7556..0000000
--- a/docs/postgresql.rst
+++ /dev/null
@@ -1,98 +0,0 @@
-Setup on a PostgreSQL instance
-==============================
-
-This tutorial will guide you through the steps required to set up the Software
-Heritage Graph Dataset in a PostgreSQL database.
-
-.. highlight:: bash
-
-PostgreSQL local setup
-----------------------
-
-You need to have access to a running PostgreSQL instance to load the dataset.
-This section contains information on how to set up PostgreSQL for the first
-time.
-
-*If you already have a PostgreSQL server running on your machine, you can skip
-to the next section.*
-
-- For **Ubuntu** and **Debian**::
-
-    sudo apt install postgresql
-
-- For **Archlinux**::
-
-    sudo pacman -S --needed postgresql
-    sudo -u postgres initdb -D '/var/lib/postgres/data'
-    sudo systemctl enable --now postgresql
-
-Once PostgreSQL is running, you also need a user that will be able to create
-databases and run queries. The easiest way to achieve that is simply to create
-an account that has the same name as your username and that can create
-databases::
-
-    sudo -u postgres createuser --createdb $USER
-
-
-Retrieving the dataset
-----------------------
-
-You need to download the dataset in SQL format. Use the following command on
-your machine, after making sure that it has enough available space for the
-dataset you chose:
-
-.. tabs::
-
-    .. group-tab:: full
-
-        ::
-
-            mkdir full && cd full
-            wget -c -A gz,sql -nd -r -np -nH https://annex.softwareheritage.org/public/dataset/graph/2019-01-28/sql/
-
-    .. group-tab:: popular-4k
-
-        ::
-
-            mkdir popular-4k && cd popular-4k
-            wget -c -A gz,sql -nd -r -np -nH https://annex.softwareheritage.org/public/dataset/graph/2019-01-28/popular-4k/sql/
-
-    .. group-tab:: popular-3k-python
-
-        ::
-
-            mkdir popular-3k-python && cd popular-3k-python
-            wget -c -A gz,sql -nd -r -np -nH https://annex.softwareheritage.org/public/dataset/graph/2019-01-28/popular-3k-python/sql/
-
-Loading the dataset
--------------------
-
-Once you have retrieved the dataset of your choice, create a database that will
-contain it, and load the database:
-
-.. tabs::
-
-    .. group-tab:: full
-
-        ::
-
-            createdb swhgd
-            psql swhgd < swh_import.sql
-
-    .. group-tab:: popular-4k
-
-        ::
-
-            createdb swhgd-popular-4k
-            psql swhgd-popular-4k < swh_import.sql
-
-    .. group-tab:: popular-3k-python
-
-        ::
-
-            createdb swhgd-popular-3k-python
-            psql swhgd-popular-3k-python < swh_import.sql
-
-
-You can now run SQL queries on your database. Run ``psql `` to
-start an interactive PostgreSQL console.
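Once connected, you can run the same kind of analytical queries as on Athena. As a hedged example, assuming the ``revision`` table with a ``date`` timestamp column (as in the Software Heritage schema), this counts archived revisions per year:

```sql
-- Number of archived revisions (commits) per year.
SELECT EXTRACT(YEAR FROM date) AS year,
       COUNT(*) AS nb_revisions
FROM revision
GROUP BY year
ORDER BY year;
```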