Welcome to the Software Heritage project's API documentation.

**Table of Contents**

- [Version](#version)
- [Schema](#schema)
  - [Mimetype override](#mimetype-override)
- [Parameters](#parameters)
  - [Global parameter](#global-parameter)
- [Client errors](#client-errors)
  - [Bad request](#bad-request)
  - [Not found](#not-found)
- [Terminology](#terminology)
  - [Content](#content)
  - [(Cryptographic) hash](#cryptographic-hash)
  - [Directory](#directory)
  - [Origin](#origin)
  - [Project](#project)
  - [Release](#release)
  - [Revision](#revision)
- [Opened endpoints](#opened-endpoints)

### Version

The current version is [1](/api/1/).

### Schema

The API is accessed over HTTPS at [https://archive.softwareheritage.org/api/1/](/api/1/).

Data is sent and received as JSON by default.

Example:

``` shell
$ curl -i https://archive.softwareheritage.org/api/1/stat/counters/
HTTP/1.1 200 OK
Date: Mon, 16 Jan 2017 10:57:56 GMT
Server: Apache
Content-Type: application/json
Content-Length: 395
Vary: Accept-Encoding
Access-Control-Allow-Origin: *
Connection: close

{
    "directory_entry_rev": 3039473,
    "person": 13903080,
    "entity": 7103795,
    "skipped_content": 17864,
    "entity_history": 7147753,
    "revision_history": 720840448,
    "revision": 703277184,
    "directory": 2616883200,
    "release": 5692900,
    "origin": 49938216,
    "directory_entry_dir": 2140887552,
    "occurrence_history": 254274832,
    "occurrence": 241899344,
    "content": 3155739136,
    "directory_entry_file": 3173807104
}
```

#### Mimetype override

The response can be sent as YAML, provided the client requests it using the `Accept` header field.
For example:

``` shell
$ curl -i -H 'Accept: application/yaml' https://archive.softwareheritage.org/api/1/stat/counters/
HTTP/1.1 200 OK
Date: Mon, 16 Jan 2017 12:31:50 GMT
Server: Apache
Content-Type: application/yaml
Content-Length: 372
Access-Control-Allow-Origin: *
Connection: close

{content: 3155758336, directory: 2616955136, directory_entry_dir: 2140925824,
  directory_entry_file: 3173833984, directory_entry_rev: 3039473, entity: 7103741,
  entity_history: 7148121, occurrence: 241887488, occurrence_history: 254277584,
  origin: 49939848, person: 13898394, release: 5693922, revision: 703275840,
  revision_history: 720842176, skipped_content: 17864}
```

### Parameters

Some API endpoints accept local parameters. The URL then needs to be adapted accordingly.

For example:

``` text
https://archive.softwareheritage.org/api/1/<endpoint>?<field0>=<value0>&<field1>=<value1>
```

where:

- `field0` is an appropriate field for the endpoint and `value0` is its value
- `field1` is an appropriate field for the endpoint and `value1` is its value

#### Global parameter

One parameter, `fields`, is defined for all API endpoints. It filters the output, keeping only the listed keys.

For example, to list only the number of contents, revisions, and directories on the statistics endpoint, use:

``` shell
$ curl https://archive.softwareheritage.org/api/1/stat/counters/\?fields\=content,directory,revision | jq .
{
    "content": 3155739136,
    "revision": 703277184,
    "directory": 2616883200
}
```

Note: Keys that do not exist in the output are ignored.

### Client errors

There are 2 kinds of client errors. In both cases, the HTTP status code reflects the error and a JSON response is sent with the details.

#### Bad request

This means that the input is incorrect.
Example:

``` shell
$ curl -i https://archive.softwareheritage.org/api/1/content/1/
HTTP/1.1 400 BAD REQUEST
Date: Mon, 16 Jan 2017 11:28:08 GMT
Server: Apache
Content-Type: application/json
Content-Length: 44
Connection: close

{"error": "Invalid checksum query string 1"}
```

Here, the content endpoint expects a hash identifier.

#### Not found

This means that the request is valid but the requested information was not found.

Example:

``` shell
$ curl -i https://archive.softwareheritage.org/api/1/content/04740277a81c5be6c16f6c9da488ca073b770d7f/
HTTP/1.1 404 NOT FOUND
Date: Mon, 16 Jan 2017 11:31:46 GMT
Server: Apache
Content-Type: application/json
Content-Length: 77
Connection: close

{"error": "Content with 04740277a81c5be6c16f6c9da488ca073b770d7f not found."}
```

### Terminology

Below is the terminology used by the Software Heritage (swh) project. More details can be found on [swh's wiki glossary page](https://wiki.softwareheritage.org/index.php?title=Glossary).

#### Content

A (specific version of a) file stored in the archive, identified by its cryptographic hashes (SHA1, "git-like" SHA1, SHA256) and its size.

Also known as: Blob

#### (Cryptographic) hash

A fixed-size "summary" of a stream of bytes that is easy to compute, and hard to reverse.

Also known as: Checksum, Digest.

#### Directory

A set of named pointers to contents (file entries), directories (directory entries) and revisions (revision entries).

#### Origin

A location from which a coherent set of sources has been obtained.

Also known as: Data source.

Examples:

- a Git repository
- a directory containing tarballs
- the history of a Debian package on snapshot.debian.org.

#### Project

An organized effort to develop a software product.

Projects might be nested following organizational structures (sub-project, sub-sub-project), are associated with human-meaningful metadata, and release software products via Origins.
#### Release

A revision that has been marked by a project as noteworthy with a specific, usually mnemonic, name (for instance, a version number).

Also known as: Tag (Git-specific terminology).

Examples:

- a Git tag with its name
- a tarball with its name
- a Debian source package with its version number.

#### Revision

A "point in time" snapshot in the development history of a project.

Also known as: Commit

Examples:

- a Git commit

### Opened endpoints

Open API endpoints are accessed at [https://archive.softwareheritage.org/api/1/](/api/1/).
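
As a minimal sketch of how the base URL and the global `fields` parameter described earlier combine, the shell helper below builds a request URL for an endpoint. The function name `swh_api_url` and the variable `SWH_API_BASE` are hypothetical and not part of any Software Heritage tooling. The helper only prints the URL and makes no network request.

```shell
# Hypothetical helper, for illustration only.
SWH_API_BASE="https://archive.softwareheritage.org/api/1"

# Build the URL for an endpoint, optionally filtering output keys with `fields`.
#   $1: endpoint path, e.g. stat/counters
#   $2: optional comma-separated list of keys, e.g. content,revision
swh_api_url() {
    endpoint="${1%/}"   # drop a trailing slash so it is not doubled below
    fields="${2:-}"
    url="${SWH_API_BASE}/${endpoint}/"
    if [ -n "$fields" ]; then
        url="${url}?fields=${fields}"
    fi
    printf '%s\n' "$url"
}

# Print the URL for the counters endpoint, restricted to three keys.
swh_api_url stat/counters content,directory,revision
```

The printed URL can then be passed to curl inside double quotes, for example `curl -i "$(swh_api_url stat/counters content,revision)"`. Quoting removes the need to escape `?` and `=` by hand.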