diff --git a/swh/web/ui/templates/api-endpoints.html b/swh/web/ui/templates/api-endpoints.html
index 3361933f9..0e588d10d 100644
--- a/swh/web/ui/templates/api-endpoints.html
+++ b/swh/web/ui/templates/api-endpoints.html
@@ -1,61 +1,61 @@
 {% extends "layout.html" %}
-{% block title %}API endpoints overview{% endblock %}
+{% block title %}API endpoints{% endblock %}
 {% block content %}
This lists the current API endpoints for version 1. For a more general description, please refer to the main documentation.
{% for route, doc in doc_routes %} {% if 'tags' in doc and doc['tags'] is not none %} {% else %} {% endif %} {% endfor %}
Route Status Description
{{ route }} {{ ', '.join(doc['tags']) }}{{ route }} opened{{ doc['docstring'] | safe_docstring_display | safe }}
 {% endblock %}
diff --git a/swh/web/ui/templates/includes/apidoc-header-toc.html b/swh/web/ui/templates/includes/apidoc-header-toc.html
index 8716a2ef0..e3d81e2e7 100644
--- a/swh/web/ui/templates/includes/apidoc-header-toc.html
+++ b/swh/web/ui/templates/includes/apidoc-header-toc.html
@@ -1,26 +1,8 @@
+
  • Endpoint index
  • +
  • Data Model
  • Version
  • Schema
  • -
  • Mimetype override
  • -
  • Parameters - -
  • -
  • Client errors - -
  • -
  • Terminology - -
  • -
  • Opened endpoints
  • +
  • Parameters
  • +
  • Errors
  • +
  • Pagination
  • +
  • Rate limiting
diff --git a/swh/web/ui/templates/includes/apidoc-header.html b/swh/web/ui/templates/includes/apidoc-header.html
index 8a2f9038c..0a190cf69 100644
--- a/swh/web/ui/templates/includes/apidoc-header.html
+++ b/swh/web/ui/templates/includes/apidoc-header.html
@@ -1,109 +1,100 @@
-
-

    Welcome to Software Heritage project's API documentation.

    - -

    Table of Contents

    +

    This document describes the Software Heritage Web API.

    +

    Endpoint index

    +

    You can jump directly to the endpoint index, which lists all available API functionalities, or read on for more general information about the API.

    +

    Data model

    +

    The Software Heritage project harvests publicly available source code by tracking software distribution channels such as version control systems, tarball releases, and distribution packages.

    +

    All retrieved source code and related metadata are stored in the Software Heritage archive, which is conceptually a Merkle DAG. All nodes in the graph are content-addressable, i.e., their node identifiers are computed by hashing their content and, transitively, that of all nodes reachable from them; and no node or edge is ever removed from the graph: the Software Heritage archive is an append-only data structure.

    +

    The following types of objects (i.e., graph nodes) can be found in the Software Heritage archive (for more information see the Software Heritage glossary):

    + -

    Version

    -

    Current version is 1.

    +

    The current version of the API is v1.

    Schema

    -

    Api access is over https and accessed through https://archive.softwareheritage.org/api/1/.

    -

    Data is sent and received in json by default.

    -

    Examples:

    +

    API access is over HTTPS.

    +

    All API endpoints are rooted at https://archive.softwareheritage.org/api/1/.

    +

    Data is sent and received as JSON by default.

    +

    Example:

    -

    Mimetype override

    -

    The response output can be sent as yaml provided the client specifies it using the header field.

    -

    Examples:

    +

    Response format override

    +

    The response format can be overridden using the Accept request header. In particular, Accept: text/html (which web browsers send by default) requests HTML pretty-printing, whereas Accept: application/yaml requests YAML-encoded responses.
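
    As an illustration of the override described above, here is a minimal Python sketch that prepares (but does not send) a request with an explicit Accept header. The names `API_ROOT` and `build_request` are hypothetical helpers introduced for this example, and using `application/json` as the default header value is an assumption; the documentation only states that JSON is the default format.

```python
from urllib.request import Request

# Root URL taken from the Schema section of this documentation.
API_ROOT = "https://archive.softwareheritage.org/api/1/"


def build_request(endpoint: str, accept: str = "application/json") -> Request:
    """Prepare a GET request for `endpoint` with the given Accept header.

    Hypothetical helper: the request is only constructed here, not sent.
    """
    return Request(API_ROOT + endpoint.lstrip("/"), headers={"Accept": accept})


# Ask for a YAML-encoded response from the counters endpoint.
req = build_request("stat/counters/", accept="application/yaml")
print(req.full_url)
print(req.get_header("Accept"))
```

    Passing the prepared object to `urllib.request.urlopen` would perform the actual call; it is left out so the sketch runs without network access.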

    +

    Example:

    Parameters

    -

    Some API endpoints can be used with local parameters. The url then needs to be adapted accordingly.

    -

    For example:

    -
    https://archive.softwareheritage.org/api/1/<endpoint-name>?<field0>=<value0>&<field1>=<value1>
    -

    where:

    - -

    Global parameter

    -

    One parameter is defined for all api endpoints fields. It permits to filter the output fields per key.

    -

    For example, to only list the number of contents, revisions, directories on the statistical endpoints, one uses:

    -

    Examples:

    - -

    Note: If the keys provided to filter on do not exist, they are ignored.

    -

    Client errors

    -

    There are 2 kinds of error.

    -

    In that case, the http error code will reflect. Furthermore, the response is a dictionary with one key 'error' detailing the problem.

    -

    Bad request

    -

    This means that the input is incorrect.

    +

    Some API endpoints can be tweaked by passing optional parameters. For GET requests, optional parameters can be passed as an HTTP query string.

    +

    The optional parameter fields is accepted by all endpoints that return dictionaries and can be used to restrict the list of fields returned by the API, in case you are not interested in all of them. By default, all available fields are returned.
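
    A minimal Python sketch of how a client might build such a query string follows. The helper name `endpoint_url` and the constant `API_ROOT` are hypothetical, introduced only for illustration; the comma-separated encoding of `fields` mirrors the `curl` example in this documentation.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

# Root URL taken from the Schema section of this documentation.
API_ROOT = "https://archive.softwareheritage.org/api/1/"


def endpoint_url(endpoint: str, fields: Optional[Sequence[str]] = None) -> str:
    """Return the URL for `endpoint`, optionally restricted to `fields`.

    Hypothetical helper: field names are joined with commas, and commas
    are kept unescaped so the URL matches the documented form.
    """
    url = API_ROOT + endpoint.lstrip("/")
    if fields:
        url += "?" + urlencode({"fields": ",".join(fields)}, safe=",")
    return url


url = endpoint_url("stat/counters/", ["content", "directory", "revision"])
print(url)
```

    Omitting `fields` yields the plain endpoint URL, which matches the documented default of returning all available fields.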

    Example:

    -

    The api content expects an hash identifier so the error will mention that an hash identifier is expected.

    -

    Not found

    -

    This means that the request is ok but we do not found the information the user requests.

    -

    Examples:

    +

    Errors

    +

    While API endpoints will return different kinds of errors depending on their own semantics, some error patterns are common across all endpoints.

    +

    Sending malformed data, including syntactically incorrect object identifiers, will result in a 400 Bad Request HTTP response. Example:

    -

    The hash identifier is ok but nothing is found for that identifier.

    -

    Terminology

    -

    You will find below the terminology the project SWH uses. More details can be found on swh's wiki glossary page.

    -

    Content

    -

    A (specific version of a) file stored in the archive, identified by its cryptographic hashes (SHA1, "git-like" SHA1, SHA256) and its size.

    -

    Also known as: Blob Note.

    -

    (Cryptographic) hash

    -

    A fixed-size "summary" of a stream of bytes that is easy to compute, and hard to reverse.

    -

    Also known as: Checksum, Digest.

    -

    Directory

    -

    A set of named pointers to contents (file entries), directories (directory entries) and revisions (revision entries).

    -

    Origin

    -

    A location from which a coherent set of sources has been obtained.

    -

    Also known as: Data source.

    -

    Examples:

    +

    Requesting nonexistent resources will result in a 404 Not Found HTTP response. Example:

    -

    Project

    -

    An organized effort to develop a software product.

    -

    Projects might be nested following organizational structures (sub-project, sub-sub-project), are associated to a number of human-meaningful metadata, and release software products via Origins.

    -

    Release

    -

    A revision that has been marked by a project as noteworthy with a specific, usually mnemonic, name (for instance, a version number).

    -

    Also known as: Tag (Git-specific terminology).

    -

    Examples:

    +

    Pagination

    +

    Foo bar

    +

    Rate limiting

    +

    Due to limited resource availability on the back end side, API usage is currently rate limited. Furthermore, as API usage is currently entirely anonymous (i.e., without any authentication), API "users" are currently identified by their origin IP address.

    +

    Three HTTP response fields will inform you about the current state of limits that apply to your current rate limiting bucket:

    -

    Revision

    -

    A "point in time" snapshot in the development history of a project.

    -

    Also known as: Commit

    -

    Examples:

    +

    Example:

    -

    Opened endpoints

    -

    Accessible through https://archive.softwareheritage.org/api/1/.

    diff --git a/swh/web/ui/templates/includes/apidoc-header.md b/swh/web/ui/templates/includes/apidoc-header.md index 3ab041b92..82663fc51 100644 --- a/swh/web/ui/templates/includes/apidoc-header.md +++ b/swh/web/ui/templates/includes/apidoc-header.md @@ -1,209 +1,172 @@ -Welcome to Software Heritage project's API documentation. - - -**Table of Contents** - -- [Version](#version) -- [Schema](#schema) -- [Mimetype override](#mimetype-override) -- [Parameters](#parameters) - - [Global parameter](#global-parameter) -- [Client errors](#client-errors) - - [Bad request](#bad-request) - - [Not found](#not-found) -- [Terminology](#terminology) - - [Content](#content) - - [(Cryptographic) hash](#cryptographic-hash) - - [Directory](#directory) - - [Origin](#origin) - - [Project](#project) - - [Release](#release) - - [Revision](#revision) -- [Opened endpoints](#opened-endpoints) - - +This document describes the Software Heritage Web API. + + + +### Endpoint index + +You can jump directly to the endpoint +index, which lists all available API functionalities, or read on +for more general information about the API. + + +### Data model + +The [Software Heritage](https://www.softwareheritage.org/) project harvests +publicly available source code by tracking software distribution channels such +as version control systems, tarball releases, and distribution packages. + +All retrieved source code and related metadata are stored in the Software +Heritage archive, that is conceptually +a [Merkle DAG](https://en.wikipedia.org/wiki/Merkle_tree). All nodes in the +graph are content-addressable, i.e., their node identifiers are computed by +hashing their content and, transitively, that of all nodes reachable from them; +and no node or edge is ever removed from the graph: the Software Heritage +archive is an append-only data structure. 
+ +The following types of objects (i.e., graph nodes) can be found in the Software +Heritage archive (for more information see +the +[Software Heritage glossary](https://wiki.softwareheritage.org/index.php?title=Glossary)): + +- **Content**: a specific version of a file stored in the archive, identified + by its cryptographic hashes (currently: SHA1, Git-like "salted" SHA1, + SHA256). Note that content objects are nameless; their names are + context-dependent and stored as part of directory entries (see below).
    + *Also known as:* "blob" +- **Directory**: a list of directory entries, where each entry can point to + content objects ("file entries"), revisions ("revision entries"), or + transitively to other directories ("directory entries"). All entries are + associated to the local name of the entry (i.e., a relative path without any + path separator) and permission metadata (e.g., chmod value or equivalent). +- **Revision**: a point in time snapshot of the content of a directory, + together with associated development metadata (e.g., author, timestamp, log + message, etc).
    + *Also known as:* "commit". +- **Release**: a revision that has been marked as noteworthy with a specific + name (e.g., a version number), together with associated development metadata + (e.g., author, timestamp, etc).
    + *Also known as:* "tag" +- **Origin**: an Internet-based location from which a coherent set of objects + (contents, revisions, releases, etc.) archived by Software Heritage has been + obtained. Origins are currently identified by URLs. ### Version -Current version is [1](/api/1/). +The current version of the API is **v1**. + ### Schema -Api access is over https and accessed through [https://archive.softwareheritage.org/api/1/](/api/1/). +API access is over HTTPS. -Data is sent and received in json by default. +All API endpoints are rooted at . -Examples: +Data is sent and received as JSON by default. -- [/api/1/stat/counters/](/api/1/stat/counters/) +Example: -- From the command line: +- from the command line: ``` shell curl -i https://archive.softwareheritage.org/api/1/stat/counters/ ``` +#### Response format override -#### Mimetype override +The response format can be overridden using the `Accept` request header. In +particular, `Accept: text/html` (that web browsers send by default) requests +HTML pretty-printing, whereas `Accept: application/yaml` requests YAML-encoded +responses. -The response output can be sent as yaml provided the client specifies -it using the header field. - -Examples: - -- From your favorite REST client API, execute the same request as - before with the request header 'Accept' set to the - 'application/yaml'. +Example: -- From the command line: +- [/api/1/stat/counters/](/api/1/stat/counters/) +- from the command line: ``` shell curl -i -H 'Accept: application/yaml' https://archive.softwareheritage.org/api/1/stat/counters/ ``` ### Parameters -Some API endpoints can be used with local parameters. The url -then needs to be adapted accordingly. - -For example: - -``` text -https://archive.softwareheritage.org/api/1/?=&= -``` - -where: - -- field0 is an appropriate field for the and value0 -- field1 is an appropriate field for the and value1 +Some API endpoints can be tweaked by passing optional parameters. 
For GET +requests, optional parameters can be passed as an HTTP query string. -#### Global parameter +The optional parameter `fields` is accepted by all endpoints that return +dictionaries and can be used to restrict the list of fields returned by the +API, in case you are not interested in all of them. By default, all available +fields are returned. -One parameter is defined for all api endpoints `fields`. It permits -to filter the output fields per key. - -For example, to only list the number of contents, revisions, -directories on the statistical endpoints, one uses: - -Examples: +Example: - [/api/1/stat/counters/\?fields\=content,directory,revision](/api/1/stat/counters/?fields=content,directory,revision) - -- From the command line: +- from the command line: ``` shell -curl https://archive.softwareheritage.org/api/1/stat/counters/\?fields\=content,directory,revision +curl https://archive.softwareheritage.org/api/1/stat/counters/?fields=content,directory,revision ``` -Note: If the keys provided to filter on do not exist, they are -ignored. - -### Client errors - -There are 2 kinds of error. -In that case, the http error code will reflect. Furthermore, the -response is a dictionary with one key 'error' detailing the problem. +### Errors -#### Bad request +While API endpoints will return different kinds of errors depending on their +own semantics, some error patterns are common across all endpoints. -This means that the input is incorrect. +Sending malformed data, including syntactically incorrect object identifiers, +will result in a `400 Bad Request` HTTP response. 
Example: -Example: - -- [/api/1/content/1/](/api/1/content/1/) - -- From the command line: +- [/api/1/content/deadbeef/](/api/1/content/deadbeef/) (client error: + "deadbeef" is too short to be a syntactically valid object identifier) +- from the command line: ``` shell -curl -i https://archive.softwareheritage.org/api/1/content/1/ +curl -i https://archive.softwareheritage.org/api/1/content/deadbeef/ ``` -The api content expects an hash identifier so the error will mention -that an hash identifier is expected. - -#### Not found - -This means that the request is ok but we do not found the information -the user requests. +Requesting non existent resources will result in a `404 Not Found` HTTP +response. Example: -Examples: - -- [/api/1/content/04740277a81c5be6c16f6c9da488ca073b770d7f/](/api/1/content/04740277a81c5be6c16f6c9da488ca073b770d7f/) - -- From the command line: +- [/api/1/content/0123456789abcdef0123456789abcdef01234567/](/api/1/content/0123456789abcdef0123456789abcdef01234567/) + (error: no object with that identifier is available [yet?]) +- from the command line: ``` shell curl -i https://archive.softwareheritage.org/api/1/content/04740277a81c5be6c16f6c9da488ca073b770d7f/ ``` -The hash identifier is ok but nothing is found for that identifier. - -### Terminology - -You will find below the terminology the project SWH uses. -More details can be found -on -[swh's wiki glossary page](https://wiki.softwareheritage.org/index.php?title=Glossary). - -#### Content - -A (specific version of a) file stored in the archive, identified by -its cryptographic hashes (SHA1, "git-like" SHA1, SHA256) and its size. - -Also known as: Blob Note. - -#### (Cryptographic) hash - -A fixed-size "summary" of a stream of bytes that is easy to compute, -and hard to reverse. - -Also known as: Checksum, Digest. - -#### Directory - -A set of named pointers to contents (file entries), directories -(directory entries) and revisions (revision entries). 
-#### Origin +### Pagination -A location from which a coherent set of sources has been obtained. +Foo bar -Also known as: Data source. -Examples: +### Rate limiting -- a Git repository -- a directory containing tarballs -- the history of a Debian package on snapshot.debian.org. +Due to limited resource availability on the back end side, API usage is +currently rate limited. Furthermore, as API usage is currently entirely +anonymous (i.e., without any authentication), API "users" are currently +identified by their origin IP address. -#### Project +Three HTTP response fields will inform you about the current state of limits +that apply to your current rate limiting bucket: -An organized effort to develop a software product. +- `X-RateLimit-Limit`: maximum number of permitted requests per hour +- `X-RateLimit-Remaining`: number of permitted requests remaining before the + next reset +- `X-RateLimit-Reset`: the time (expressed + in [Unix time](https://en.wikipedia.org/wiki/Unix_time) seconds) at which the + current rate limiting will expire, resetting to a fresh `X-RateLimit-Limit` -Projects might be nested following organizational structures -(sub-project, sub-sub-project), are associated to a number of -human-meaningful metadata, and release software products via Origins. - -#### Release - -A revision that has been marked by a project as noteworthy with a -specific, usually mnemonic, name (for instance, a version number). - -Also known as: Tag (Git-specific terminology). - -Examples: - -- a Git tag with its name -- a tarball with its name -- a Debian source package with its version number. - -#### Revision - -A "point in time" snapshot in the development history of a project. - -Also known as: Commit - -Examples: - -- a Git commit +Example: -### Opened endpoints +- from the command line: -Accessible through [https://archive.softwareheritage.org/api/1/](/api/1/). 
+ curl -i https://archive.softwareheritage.org/api/1/stat/counters/ | grep ^X-RateLimit
+ X-RateLimit-Limit: 60
+ X-RateLimit-Remaining: 54
+ X-RateLimit-Reset: 1485794532
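
The three rate limiting headers documented in this patch can be interpreted on the client side roughly as follows. This is a sketch, not part of the patch: the function name `parse_rate_limit` is hypothetical, and the sample values are copied from the example output above rather than fetched from the live API.

```python
from datetime import datetime, timezone
from typing import Mapping


def parse_rate_limit(headers: Mapping[str, str]) -> dict:
    """Extract the rate limiting state from HTTP response headers.

    Hypothetical helper. Assumes `headers` uses the exact header
    spelling shown in the documentation; real HTTP client libraries
    usually expose a case-insensitive mapping instead of a plain dict.
    """
    return {
        # Maximum number of permitted requests per hour.
        "limit": int(headers["X-RateLimit-Limit"]),
        # Requests still permitted before the next reset.
        "remaining": int(headers["X-RateLimit-Remaining"]),
        # Reset time, converted from Unix time seconds to an aware datetime.
        "reset": datetime.fromtimestamp(
            int(headers["X-RateLimit-Reset"]), tz=timezone.utc
        ),
    }


# Sample values taken from the example output in the documentation.
sample = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "54",
    "X-RateLimit-Reset": "1485794532",
}
state = parse_rate_limit(sample)
print(state["remaining"], "requests left until", state["reset"].isoformat())
```

A client could sleep until `state["reset"]` once `state["remaining"]` reaches zero; whether that is the right back-off policy depends on the application.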