diff --git a/docs/getting-started.rst b/docs/getting-started.rst
new file mode 100644
index 0000000..08a048a
--- /dev/null
+++ b/docs/getting-started.rst
@@ -0,0 +1,58 @@
+.. _getting-started:
+
+.. highlight:: bash
+
+Run your own Software Heritage
+==============================
+
+This walkthrough will guide you from the basic step of obtaining the source
+code of the Software Heritage stack to running a local copy of it, in which you
+can ingest the source code of existing repositories and browse them using the
+archive web application.
+
+
+Step 0 - get the code
+---------------------
+
+The `swh-environment
+`_ Git (meta)
+repository orchestrates the Git repositories of all Software Heritage modules.
+Clone it::
+
+    git clone https://forge.softwareheritage.org/source/swh-environment.git
+
+Then recursively clone all Python module repositories. For this step you will
+need the `mr `_ tool; see the `README` file of
+swh-environment for more information::
+
+    cd swh-environment
+    readlink -f .mrconfig >> ~/.mrtrust
+    mr up
+
+For periodic code updates in the future, you can use the following helper::
+
+    cd swh-environment
+    bin/update
+
+
+Step 1 - set up storage
+-----------------------
+
+Next, you will need local storage to archive source code artifacts. It comes
+in two parts: a content-addressable object storage on your file system (for
+file contents) and a Postgres database (for the graph structure of the
+archive). See the :ref:`data-model` for more information.
+
+**TO BE WRITTEN**
+
+
+Step 2 - ingest repositories
+----------------------------
+
+**TO BE WRITTEN**
+
+
+Step 3 - browse the archive
+---------------------------
+
+**TO BE WRITTEN**
diff --git a/docs/index.rst b/docs/index.rst
index 1fcaf6e..97ca9f7 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,108 +1,115 @@
 .. _swh-docs:
 
 Software Heritage - Development Documentation
 =============================================
 
 .. toctree::
    :maxdepth: 2
    :caption: Contents:
 
 
+Getting started
+---------------
+
+* :ref:`getting-started` ← start here to hack on the Software Heritage software
+  stack
+
+
 Components
 ----------
 
 Here is brief overview of the most relevant software components in the Software
 Heritage stack. Each component name is linked to the development documentation
 of the corresponding Python module.
 
 :ref:`swh.archiver `
     orchestrator in charge of guaranteeing that object storage content is
     pristine and available in a sufficient amount of copies
 
 :ref:`swh.core `
     low-level utilities and helpers used by almost all other modules in the
     stack
 
 :ref:`swh.deposit `
     push-based deposit of software artifacts to the archive
 
 swh.docs
     developer documentation (used to generate this doc you are reading)
 
 :ref:`swh.indexer `
     tools and workers used to crawl the content of the archive and extract
     derived information from any artifact stored in it
 
 :ref:`swh.journal `
     persistent logger of changes to the archive, with publish-subscribe support
 
 :ref:`swh.lister `
     collection of listers for all sorts of source code hosting and distribution
     places (forges, distributions, package managers, etc.)
 
 :ref:`swh.loader-core `
     low-level loading utilities and helpers used by all other loaders
 
 :ref:`swh.loader-debian `
     loader for `Debian `_ source packages
 
 :ref:`swh.loader-dir `
     loader for source directories (e.g., expanded tarballs)
 
 :ref:`swh.loader-git `
     loader for `Git `_ repositories
 
 :ref:`swh.loader-mercurial `
     loader for `Mercurial `_ repositories
 
 :ref:`swh.loader-svn `
     loader for `Subversion `_ repositories
 
 :ref:`swh.loader-tar `
     loader for source tarballs (including Tar, ZIP and other archive formats)
 
 :ref:`swh.model `
     implementation of the :ref:`data-model` to archive source code artifacts
 
 :ref:`swh.objstorage `
     content-addressable object storage
 
 :ref:`swh.scheduler `
     task manager for asynchronous/delayed tasks, used for recurrent (e.g.,
     listing a forge, loading new stuff from a Git repository) and one-off
     activities (e.g., loading a specific version of a source package)
 
 :ref:`swh.storage `
     abstraction layer over the archive, allowing to access all stored source
     code artifacts as well as their metadata
 
 :ref:`swh.vault `
     implementation of the vault service, allowing to retrieve parts of the
     archive as self-contained bundles (e.g., individual releases, entire
     repository snapshots, etc.)
 
 :ref:`swh.web `
     Web application(s) to browse the archive, for both interactive (HTML UI)
     and mechanized (REST API) use
 
 
 Dependencies
 ------------
 
 The dependency relationships among the various modules are depicted below.
 
 .. _py-deps-swh:
 .. figure:: images/py-deps-swh.svg
    :width: 1024px
    :align: center
 
    Dependencies among top-level Python modules (click to zoom).
 
 
 Indices and tables
 ==================
 
 * :ref:`genindex`
 * :ref:`modindex`
 * `URLs index `_
 * :ref:`search`