D668.id.diff

diff --git a/docs/getting-started.rst b/docs/getting-started.rst
--- a/docs/getting-started.rst
+++ b/docs/getting-started.rst
@@ -119,7 +119,7 @@
Then you will need a local storage service that will archive and serve source
code artifacts via a REST API. The Software Heritage storage layer comes in two
-parts: a content-addressable object storage on your file system (for file
+parts: a content-addressable :term:`object storage` on your file system (for file
contents) and a Postgres database (for the graph structure of the archive). See
the :ref:`data-model` for more information. The storage layer is configured via
a YAML configuration file, located at
@@ -137,13 +137,13 @@
root: /srv/softwareheritage/objects/
slicing: 0:2/2:4
-Make sure that the object storage root exists on the filesystem and is writable
+Make sure that the :term:`object storage` root exists on the filesystem and is writable
to your user, e.g.::
sudo mkdir -p /srv/softwareheritage/objects
sudo chown "${USER}:" /srv/softwareheritage/objects
-You are done with object storage setup! Let's setup the database::
+You are done with :term:`object storage` setup! Let's set up the database::
swh-db-init storage -d softwareheritage-dev
diff --git a/docs/glossary.rst b/docs/glossary.rst
new file mode 100644
--- /dev/null
+++ b/docs/glossary.rst
@@ -0,0 +1,169 @@
+:orphan:
+
+.. _glossary:
+
+Glossary
+========
+
+.. glossary::
+
+ archive
+
+ An instance of the |swh| data store.
+
+ archiver
+
+ A component dedicated to replicating an :term:`archive` and ensuring there
+ are enough copies of each element to guarantee resiliency.
+
+ ark
+
+ `Archival Resource Key`_ (ARK) is a Uniform Resource Locator (URL) that is
+ a multi-purpose persistent identifier for information objects of any type.
+
+ artifact
+ software artifact
+
+ An artifact is one of many kinds of tangible by-products produced during
+ the development of software.
+
+ content
+ blob
+
+ A (specific version of a) file stored in the archive, identified by its
+ cryptographic hashes (SHA1, "git-like" SHA1, SHA256) and its size. Also
+ known as: :term:`blob`. Note: it is incorrect to refer to Contents as
+ "files", because files are usually considered to be named, whereas
+ Contents are nameless. It is only in the context of specific
+ :term:`directories <directory>` that :term:`contents <content>` acquire
+ (local) names.
+
+ directory
+
+ A set of named pointers to contents (file entries), directories (directory
+ entries) and revisions (revision entries). Each entry is associated with
+ its local name (i.e., a relative path without any path
+ separator) and permission metadata (e.g., ``chmod`` value or equivalent).
+
+ doi
+
+ A Digital Object Identifier or DOI_ is a persistent identifier or handle
+ used to uniquely identify objects, standardized by the International
+ Organization for Standardization (ISO).
+
+ journal
+
+ The :ref:`journal <swh-journal>` is the persistent logger of the |swh| architecture, in charge
+ of logging changes to the archive, with publish-subscribe_ support.
+
+ lister
+
+ A :ref:`lister <swh-lister>` is a component of the |swh| architecture that is in charge of
+ enumerating the :term:`software origins <software origin>` (e.g., VCS repositories, packages)
+ available at a source code distribution place.
+
+ loader
+
+ A :ref:`loader <swh-loader-core>` is a component of the |swh| architecture
+ responsible for reading a source code :term:`origin` (typically a git
+ repository) and importing or updating its content in the :term:`archive` (i.e.,
+ adding new file contents into the :term:`object storage` and the repository structure
+ into the :term:`storage database`).
+
+ hash
+ cryptographic hash
+ checksum
+ digest
+
+ A fixed-size "summary" of a stream of bytes that is easy to compute and
+ hard to reverse (see the Wikipedia article on cryptographic hash functions). Also
+ known as: :term:`checksum`, :term:`digest`.
+
+ indexer
+
+ A component of the |swh| architecture dedicated to producing metadata
+ linked to the known :term:`blobs <blob>` in the :term:`archive`.
+
+ objstore
+ objstorage
+ object store
+ object storage
+
+ Content-addressable object storage. It is the place where the actual
+ :term:`blobs <blob>` are stored.
+
+ origin
+ software origin
+ data source
+
+ A location from which a coherent set of sources has been obtained, like a
+ git repository, a directory containing tarballs, etc.
+
+ person
+
+ An entity referenced by a revision as either the author or the committer
+ of the corresponding change. A person is associated with a full name and/or
+ an email address.
+
+ release
+ tag
+ milestone
+
+ A revision that has been marked as noteworthy with a specific name (e.g.,
+ a version number), together with associated development metadata (e.g.,
+ author, timestamp).
+
+ revision
+ commit
+ changeset
+
+ A point-in-time snapshot of the content of a directory, together with
+ associated development metadata (e.g., author, timestamp, and log
+ message).
+
+ scheduler
+
+ The component of the |swh| architecture dedicated to the management and
+ prioritization of tasks.
+
+ snapshot
+
+ The state of all visible branches during a specific visit of an origin.
+
+ storage
+ storage database
+
+ The main database of the |swh| platform, in which all the elements of
+ the :ref:`data-model` except the :term:`content` are stored as a :ref:`Merkle
+ DAG <swh-merkle-dag>`.
+
+ type of origin
+
+ Information about the kind of hosting, e.g., whether it is a forge, a
+ collection of repositories, a homepage publishing tarballs, or a one-shot
+ source code repository. For all kinds of repositories, please specify which
+ VCS is in use (Git, SVN, CVS, etc.).
+
+ vault
+ vault service
+
+ User-facing service that allows users to retrieve parts of the :term:`archive`
+ as self-contained bundles (e.g., individual releases, entire repository
+ snapshots).
+
+ visit
+
+ The passage of |swh| on a given :term:`origin`, to retrieve all source
+ code and metadata available there at the time. A visit object stores the
+ state of all visible branches (if any) available at the origin at visit
+ time; each of them points to a revision object in the archive. Future
+ visits of the same origin will create new visit objects, without removing
+ previous ones.
+
+
+
+.. _blob: https://en.wikipedia.org/wiki/Binary_large_object
+.. _DOI: https://www.doi.org
+.. _`persistent identifier`: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#persistent-identifiers
+.. _`Archival Resource Key`: http://n2t.net/e/ark_ids.html
+.. _publish-subscribe: https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern
diff --git a/docs/index.rst b/docs/index.rst
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -116,6 +116,7 @@
* :ref:`modindex`
* `URLs index <http-routingtable.html>`_
* :ref:`search`
+* :ref:`glossary`
.. ensure sphinx does not complain about index files not being included
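
As an illustration of the glossary entries for content and object storage, here is a minimal Python sketch of how a blob could be identified by the hashes listed above (SHA1, "git-like" SHA1, SHA256) and its size, and how a ``slicing: 0:2/2:4`` setting might map an identifier to a path under the configured root. The function names ``content_hashes`` and ``sliced_path`` are hypothetical, and the path layout (slice directories followed by the full hex identifier, keyed on SHA1) is an assumption made for illustration, not a description of the actual object storage implementation.

```python
import hashlib


def content_hashes(data: bytes) -> dict:
    """Compute the identifiers the glossary lists for a content (blob)."""
    # "git-like" SHA1: Git hashes the header "blob <size>\0" followed by the bytes.
    git_header = b"blob %d\x00" % len(data)
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha1_git": hashlib.sha1(git_header + data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "length": len(data),
    }


def sliced_path(root: str, hex_id: str, slicing: str = "0:2/2:4") -> str:
    """Hypothetical helper: derive an on-disk path from a hex identifier.

    Assumption: each "start:end" slice of the identifier names one directory
    level, and the file itself is named after the full identifier.
    """
    parts = []
    for bounds in slicing.split("/"):
        start, end = (int(x) for x in bounds.split(":"))
        parts.append(hex_id[start:end])
    return "/".join([root.rstrip("/"), *parts, hex_id])


if __name__ == "__main__":
    hashes = content_hashes(b"hello\n")
    print(hashes)
    print(sliced_path("/srv/softwareheritage/objects/", hashes["sha1"]))
```

Because the identifiers depend only on the bytes, two files with different names but identical contents produce the same hashes, which is why contents are nameless and only acquire names inside a directory.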
