diff --git a/docs/archive-journal.rst b/docs/archive-changelog.rst similarity index 91% copy from docs/archive-journal.rst copy to docs/archive-changelog.rst index 6497f12..18fcf22 100644 --- a/docs/archive-journal.rst +++ b/docs/archive-changelog.rst @@ -1,129 +1,129 @@ -.. _archive-journal: +.. _archive-changelog: -Software Heritage --- Notable Archive Changes -============================================= +Software Heritage --- Archive ChangeLog +======================================= Below you can find a time-indexed list of notable events and changes to archival policies in the Software Heritage Archive. Each of them might have -(had) an impact on how content is archived and explain apparent anomalies or -other changes in archival behaviour over time. They are collected in this -document for historical reasons. +(had) an impact on how content is archived and explain apparent statistical +anomalies or other changes in archival behaviour over time. They are collected +in this document for historical reasons. - -**WARNING:** this document is **work in progress** and not considered complete -yet (tracking: `T2793 `_). +**WARNING:** this document is **work in progress** and not complete yet. You +can follow the status of completing it on our development forge: `T2793 +`_. 2020 ---- * **2020-10-06 - 2020-11-23:** source code crawlers have been paused to avoid an out of disk condition, due to an unexpected delay in the arrival of new storage hardware. Push archival (both `deposit` and `save code now`) remained in operation. 
(tracking: `T2656 `_) * **2020-06-11:** completed integration with the IPOL_ journal, allowing paper authors to explicitly `deposit` source code to the archive (`announcement `_) 2019 ---- * **2019-09-10:** completed first ingestion of Bitbucket_ Git repositories and added Bitbucket as a regularly crawled forge (tracking: `T592 `_) * **2019-06-30:** completed first ingestion of, and added to regular crawling, several GitLab_ instances: `0xacab.org `_, `framagit.org `_, `gite.lirmm.fr `_, `gitlab.com `_, `gitlab.common-lisp.net `_, `gitlab.freedesktop.org `_, `gitlab.gnome.org `_, `gitlab.inria.fr `_, `salsa.debian.org `_ * **2019-06-12:** completed first ingestion of CRAN_ packages and added CRAN as a regularly crawled package repository (tracking: `T1709 `_) * **2019-06-11:** completed a full ingestion of GNU_ source code releases from `ftp.gnu.org`_, and added it to regular crawling (tracking: `T1722 `_) * **2019-05-27:** completed a full ingestion of NPM_ packages and added it as a regularly crawled package repository (tracking: `T1378 `_) * **2019-01-10:** enabled the `save code now`_ service, allowing users to explicitly request archival of a specific source code repository (`announcement `_) 2018 ---- * **2018-10-10:** completed first ingestion of PyPI_ packages and added PyPI as a regularly crawled package repository (`announcement `_) * **2018-09-25:** completed integration with HAL_, allowing paper authors to explicitly `deposit` source code to the archive (`announcement `_) * **2018-08-31:** completed first ingestion of public GitLab_ repositories from `gitlab.com `_ and added it as a regularly crawled forge (tracking: `T1111 `_) * **2018-03-21:** completed import of `Google Code`_ Mercurial repositories.
(tracking: `T682 `_) * **2018-02-20:** completed import of Debian_ packages and added Debian as a regularly crawled distribution (`announcement `_) 2017 ---- * **2017-10-02:** completed import of `Google Code`_ Subversion repositories (tracking: `T617 `_) * **2017-06-06:** completed import of `Google Code`_ Git repositories (tracking: `T673 `_) 2016 ---- * **2016-04-04:** completed import of the Gitorious_ (tracking: `T312 `_) 2015 ---- * **2015-11-06:** archived all GNU_ source code releases from `ftp.gnu.org`_ (tracking: `T90 `_) * **2015-07-28:** started archiving public GitHub_ repositories .. _Bitbucket: https://bitbucket.org .. _CRAN: https://cran.r-project.org .. _Debian: https://www.debian.org .. _ftp.gnu.org: http://ftp.gnu.org .. _GitHub: https://github.com .. _GitLab: https://gitlab.com .. _Gitorious: https://en.wikipedia.org/wiki/Gitorious .. _GNU: https://www.gnu.org .. _Google Code: https://en.wikipedia.org/wiki/Google_Code .. _HAL: https://hal.archives-ouvertes.fr .. _IPOL: http://www.ipol.im .. _NPM: https://www.npmjs.com .. _PyPI: https://pypi.org .. _deposit: https://deposit.softwareheritage.org .. _save code now: https://save.softwareheritage.org diff --git a/docs/archive-journal.rst b/docs/archive-journal.rst index 6497f12..0888a36 100644 --- a/docs/archive-journal.rst +++ b/docs/archive-journal.rst @@ -1,129 +1,129 @@ -.. _archive-journal: +.. _archive-changelog: -Software Heritage --- Notable Archive Changes -============================================= +Software Heritage --- Archive ChangeLog +======================================= Below you can find a time-indexed list of notable events and changes to archival policies in the Software Heritage Archive. Each of them might have -(had) an impact on how content is archived and explain apparent anomalies or -other changes in archival behaviour over time. They are collected in this -document for historical reasons.
+(had) an impact on how content is archived and explain apparent statistical +anomalies or other changes in archival behavior over time. They are collected +in this document for historical reasons. - -**WARNING:** this document is **work in progress** and not considered complete -yet (tracking: `T2793 `_). +**WARNING:** this document is **work in progress** and not complete yet. You +can follow the status of completing it on our development forge: `T2793 +`_. 2020 ---- * **2020-10-06 - 2020-11-23:** source code crawlers have been paused to avoid an out of disk condition, due to an unexpected delay in the arrival of new storage hardware. Push archival (both `deposit` and `save code now`) remained in operation. (tracking: `T2656 `_) * **2020-06-11:** completed integration with the IPOL_ journal, allowing paper authors to explicitly `deposit` source code to the archive (`announcement `_) 2019 ---- * **2019-09-10:** completed first ingestion of Bitbucket_ Git repositories and added Bitbucket as a regularly crawled forge (tracking: `T592 `_) * **2019-06-30:** completed first ingestion of, and added to regular crawling, several GitLab_ instances: `0xacab.org `_, `framagit.org `_, `gite.lirmm.fr `_, `gitlab.com `_, `gitlab.common-lisp.net `_, `gitlab.freedesktop.org `_, `gitlab.gnome.org `_, `gitlab.inria.fr `_, `salsa.debian.org `_ * **2019-06-12:** completed first ingestion of CRAN_ packages and added CRAN as a regularly crawled package repository (tracking: `T1709 `_) * **2019-06-11:** completed a full ingestion of GNU_ source code releases from `ftp.gnu.org`_, and added it to regular crawling (tracking: `T1722 `_) * **2019-05-27:** completed a full ingestion of NPM_ packages and added it as a regularly crawled package repository (tracking: `T1378 `_) * **2019-01-10:** enabled the `save code now`_ service, allowing users to explicitly request archival of a specific source code repository (`announcement `_) 2018 ---- * **2018-10-10:** completed first ingestion of PyPI_
packages and added PyPI as a regularly crawled package repository (`announcement `_) * **2018-09-25:** completed integration with HAL_, allowing paper authors to explicitly `deposit` source code to the archive (`announcement `_) * **2018-08-31:** completed first ingestion of public GitLab_ repositories from `gitlab.com `_ and added it as a regularly crawled forge (tracking: `T1111 `_) * **2018-03-21:** completed import of `Google Code`_ Mercurial repositories. (tracking: `T682 `_) * **2018-02-20:** completed import of Debian_ packages and added Debian as a regularly crawled distribution (`announcement `_) 2017 ---- * **2017-10-02:** completed import of `Google Code`_ Subversion repositories (tracking: `T617 `_) * **2017-06-06:** completed import of `Google Code`_ Git repositories (tracking: `T673 `_) 2016 ---- * **2016-04-04:** completed import of the Gitorious_ (tracking: `T312 `_) 2015 ---- * **2015-11-06:** archived all GNU_ source code releases from `ftp.gnu.org`_ (tracking: `T90 `_) * **2015-07-28:** started archiving public GitHub_ repositories .. _Bitbucket: https://bitbucket.org .. _CRAN: https://cran.r-project.org .. _Debian: https://www.debian.org .. _ftp.gnu.org: http://ftp.gnu.org .. _GitHub: https://github.com .. _GitLab: https://gitlab.com .. _Gitorious: https://en.wikipedia.org/wiki/Gitorious .. _GNU: https://www.gnu.org .. _Google Code: https://en.wikipedia.org/wiki/Google_Code .. _HAL: https://hal.archives-ouvertes.fr .. _IPOL: http://www.ipol.im .. _NPM: https://www.npmjs.com .. _PyPI: https://pypi.org .. _deposit: https://deposit.softwareheritage.org .. _save code now: https://save.softwareheritage.org diff --git a/docs/index.rst b/docs/index.rst index e2b76dc..50a5a0d 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,167 +1,169 @@ ..
_swh-docs: + Software Heritage - Development Documentation ============================================= Getting started --------------- * :ref:`getting-started` ← start here to get your own Software Heritage platform running in less than 5 minutes, or * :ref:`developer-setup` ← here to hack on the Software Heritage software stack Architecture ------------ * :ref:`architecture` ← go there for a glimpse of the Software Heritage software architecture Components ---------- Here is a brief overview of the most relevant software components in the Software Heritage stack. Each component name is linked to the development documentation of the corresponding Python module. :ref:`swh.core ` low-level utilities and helpers used by almost all other modules in the stack :ref:`swh.dataset ` public datasets and periodic data dumps of the archive released by Software Heritage :ref:`swh.deposit ` push-based deposit of software artifacts to the archive swh.docs developer documentation (used to generate the doc you are reading) :ref:`swh.fuse ` Virtual file system to browse the Software Heritage archive, based on `FUSE `_ :ref:`swh.graph ` Fast, compressed, in-memory representation of the archive, with tooling to generate and query it. :ref:`swh.indexer ` tools and workers used to crawl the content of the archive and extract derived information from any artifact stored in it :ref:`swh.journal ` persistent logger of changes to the archive, with publish-subscribe support :ref:`swh.lister ` collection of listers for all sorts of source code hosting and distribution places (forges, distributions, package managers, etc.)
:ref:`swh.loader-core ` low-level loading utilities and helpers used by all other loaders :ref:`swh.loader-git ` loader for `Git `_ repositories :ref:`swh.loader-mercurial ` loader for `Mercurial `_ repositories :ref:`swh.loader-svn ` loader for `Subversion `_ repositories :ref:`swh.model ` implementation of the :ref:`data-model` to archive source code artifacts :ref:`swh.objstorage ` content-addressable object storage :ref:`swh.objstorage.replayer ` Object storage replication tool :ref:`swh.scanner ` source code scanner to analyze code bases and compare them with source code artifacts archived by Software Heritage :ref:`swh.scheduler ` task manager for asynchronous/delayed tasks, used for recurrent (e.g., listing a forge, loading new stuff from a Git repository) and one-off activities (e.g., loading a specific version of a source package) :ref:`swh.storage ` abstraction layer over the archive, providing access to all stored source code artifacts as well as their metadata :ref:`swh.vault ` implementation of the vault service, which allows retrieving parts of the archive as self-contained bundles (e.g., individual releases, entire repository snapshots, etc.) :ref:`swh.web ` Web application(s) to browse the archive, for both interactive (HTML UI) and mechanized (REST API) use :ref:`swh.web.client ` Python client for :ref:`swh.web ` Dependencies ------------ The dependency relationships among the various modules are depicted below. .. _py-deps-swh: .. figure:: images/py-deps-swh.svg :width: 1024px :align: center Dependencies among top-level Python modules (click to zoom). Archive ------- -* :ref:`Notable archive changes ` over time +* :ref:`Archive ChangeLog `: notable changes to the archive + over time Indices and tables ================== * :ref:`genindex` * :ref:`modindex` * `URLs index `_ * :ref:`search` * :ref:`glossary` .. ensure sphinx does not complain about index files not being included ..
toctree:: :maxdepth: 2 :caption: Contents: :titlesonly: :hidden: architecture getting-started developer-setup API documentation swh.core swh.dataset swh.deposit swh.fuse swh.graph swh.indexer swh.journal swh.lister swh.loader swh.model swh.objstorage swh.scanner swh.scheduler swh.storage swh.vault swh.web swh.web.client