
Extract the `extra_headers` away from `Revision.metadata` into a top-level immutable object
Open, Normal, Public

Description

This requires coordinated work on 3 parts:

  • swh.model, see D3389
  • swh.storage, see D3426
  • swh.loader

See also https://forge.softwareheritage.org/D3177#77454

Event Timeline

douardda triaged this task as Normal priority. May 26 2020, 5:07 PM
douardda created this task.
olasd renamed this task from Extract the `extra_git_headers` away from `Revision.metadata` into a top-level immutable object to Extract the `extra_headers` away from `Revision.metadata` into a top-level immutable object. May 26 2020, 6:30 PM
olasd added a subscriber: olasd. May 26 2020, 7:18 PM

In the git "specification" (really, in the git code), extra headers are a sequence of arbitrary (key: bytes, value: bytes) tuples that are serialized in the commit object between the common headers and the commit message.

Keys are allowed to repeat, and they do: for example, when several tags are merged together, there are several mergetag extra headers in the commit object. Headers are serialized in an arbitrary order that needs to be preserved (in objects generated by git itself, encoding comes first and gpgsig comes last).
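
To make the constraints above concrete, here is a minimal sketch of how such headers could be held and serialized. The names `ExtraHeaders` and `serialize_extra_headers` are hypothetical and not taken from swh.model; the header values are placeholders. It assumes the usual git convention that continuation lines of a multi-line value are prefixed with a single space.

```python
from typing import Tuple

# Hypothetical type alias: an immutable, ordered sequence of (key, value) pairs.
# A dict would not work here, since keys may repeat and order must be kept.
ExtraHeaders = Tuple[Tuple[bytes, bytes], ...]


def serialize_extra_headers(headers: ExtraHeaders) -> bytes:
    """Serialize extra headers in the order given, one header per entry."""
    lines = []
    for key, value in headers:
        # Multi-line values are continued by prefixing each following line
        # with a single space.
        lines.append(key + b" " + value.replace(b"\n", b"\n ") + b"\n")
    return b"".join(lines)


# Placeholder values, for illustration only.
extra_headers: ExtraHeaders = (
    (b"encoding", b"ISO-8859-1"),
    (b"mergetag", b"object 1111\ntype commit"),
    (b"mergetag", b"object 2222\ntype commit"),
    (b"gpgsig", b"-----BEGIN PGP SIGNATURE-----\n...\n-----END PGP SIGNATURE-----"),
)

serialized = serialize_extra_headers(extra_headers)
```

A tuple of tuples keeps both the repeated mergetag entries and their position relative to encoding and gpgsig, which a mapping keyed on the header name would lose.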

We have reused the extra headers mechanism in the SVN loader, to reference the original revision id and repository uuid in archived revisions (and have them affect the identifier of the revision, as we've considered that this information was an intrinsic part of the revision).
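
As a rough illustration of why extra headers affect the identifier, the sketch below computes a git-style object id over a commit body with and without extra headers. The helper names (`git_object_id`, `commit_body`), the author line, and the header keys `svn_repo_uuid` and `svn_revision` are assumptions made for this example, not a statement of what the SVN loader actually writes.

```python
import hashlib
from typing import Tuple

ExtraHeaders = Tuple[Tuple[bytes, bytes], ...]


def git_object_id(object_type: bytes, body: bytes) -> str:
    """Compute a git object id: sha1 over '<type> <length>\\0' plus the body."""
    header = object_type + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(extra_headers: ExtraHeaders = ()) -> bytes:
    """Build a commit body: common headers, extra headers, blank line, message."""
    common = (
        b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
        b"author A U Thor <author@example.com> 0 +0000\n"
        b"committer A U Thor <author@example.com> 0 +0000\n"
    )
    extra = b"".join(
        key + b" " + value.replace(b"\n", b"\n ") + b"\n"
        for key, value in extra_headers
    )
    return common + extra + b"\n" + b"message\n"


# Illustrative header keys and values only.
svn_headers: ExtraHeaders = (
    (b"svn_repo_uuid", b"00000000-0000-0000-0000-000000000000"),
    (b"svn_revision", b"42"),
)

id_without = git_object_id(b"commit", commit_body())
id_with = git_object_id(b"commit", commit_body(svn_headers))
```

Because the extra headers are part of the hashed bytes, two revisions that differ only in these headers get different identifiers, which is the property the SVN loader relies on.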

douardda updated the task description. Mon, Jul 6, 1:09 PM