
Fix pagination of the /revision/<rev>/log/ public API
Closed, Migrated

Description

The pagination of the public API endpoint https://archive.softwareheritage.org/api/1/revision/log is currently broken.

The Link header returned does not include a context element allowing the BFS to be resumed from that link.


Event Timeline

douardda created this task.

As discussed on IRC, a possible fix is to clearly document the "limitations" of the current implementation.

I'll reiterate my suggestions in this task, if we want to keep this endpoint stateless:

  • clarify documentation: remove mentions of pagination and pages; rename the per_page argument to limit, and document it as a limit to the number of objects returned by the BFS;
  • warn users that they need to keep track of the multiple branches of history when there are merge revisions in the returned objects;
  • make sure the "next" links actually provide links to all the revision logs that need to be followed to get the full history through recursion (or remove the next links entirely, leaving recursion entirely up to the API user).
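To illustrate what "keeping track of the multiple branches" could mean for an API user, here is a minimal sketch. It assumes each revision object in a returned batch carries an "id" and a "parents" list whose entries are either ids or objects with an "id" key; the function name pending_parents is hypothetical and not part of any Software Heritage client.

```python
from typing import Iterable


def pending_parents(batch: Iterable[dict]) -> list[str]:
    """Return ids of parents referenced by `batch` but absent from it.

    These are the revisions a client would still have to request to
    continue the traversal, in first-seen order.
    """
    revisions = list(batch)
    seen = {rev["id"] for rev in revisions}
    frontier: list[str] = []
    for rev in revisions:
        for parent in rev.get("parents", []):
            # Assumption: a parent is either a plain id or a dict with "id".
            parent_id = parent["id"] if isinstance(parent, dict) else parent
            if parent_id not in seen:
                seen.add(parent_id)
                frontier.append(parent_id)
    return frontier
```

When a truncated batch contains a merge revision, this frontier can hold more than one id, which is why a single "next" link cannot describe how to resume.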

Another option is to simply drop this method from the public Web API, and keep the revision graph visit logic only in swh-web (the UI). If users want to do a full visit of the revision graph they can use /revision and implement the visit policy they want. (I've suggested this design consideration for API v2 in T1805.)
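As a rough sketch of that client-side approach, the visit below does a breadth-first walk using only single-revision lookups. The fetch_revision callable is a hypothetical stand-in for a call to the /revision endpoint, and the response shape (an "id" plus a "parents" list of objects with an "id" key) is an assumption.

```python
from collections import deque
from typing import Callable, Iterator


def walk_history(root: str, fetch_revision: Callable[[str], dict]) -> Iterator[dict]:
    """Yield revisions reachable from `root` in breadth-first order.

    `fetch_revision` is supplied by the caller (for example a wrapper
    around an HTTP GET); each revision is yielded exactly once.
    """
    queue = deque([root])
    visited = {root}
    while queue:
        rev = fetch_revision(queue.popleft())
        yield rev
        for parent in rev.get("parents", []):
            parent_id = parent["id"]  # assumed response shape
            if parent_id not in visited:
                visited.add(parent_id)
                queue.append(parent_id)
```

Because the queue and the visited set live on the client, the endpoint stays stateless and the user can stop, resume, or switch to another visit policy (depth-first, first-parent only) by changing a few lines.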

Clarifying documentation for the v1 API method would be a good thing nonetheless.

anlambert changed the task status from Open to Work in Progress. Jun 16 2020, 1:12 PM
In T2450#45391, @olasd wrote:

I'll reiterate my suggestions in this task, if we want to keep this endpoint stateless:

  • clarify documentation: remove mentions of pagination and pages; rename the per_page argument to limit, and document it as a limit to the number of objects returned by the BFS;
  • warn users that they need to keep track of the multiple branches of history when there are merge revisions in the returned objects;
  • make sure the "next" links actually provide links to all the revision logs that need to be followed to get the full history through recursion (or remove the next links entirely, leaving recursion entirely up to the API user).

I agree with the changes proposed by @olasd. I will go for the removal of the next link response header and a documentation update.
I also propose removing the prev_sha1s optional URL argument, as it does not bring any particularly interesting feature (it just adds extra revision data at the beginning of the returned revisions list).