Our data model is based on git, and normalizes some of the data we read; this means that "weird" git objects cannot be represented.
This meta-task groups issues of this kind.
Possible options so far:
1. extend the data model to support them (like "negative_utc_offset", but somewhat generalized, e.g. store the text representation of offsets)
2. store a binary delta between the object we would generate from the model object and the original
3. store the full original manifest for all objects that can't be losslessly represented in the model, alongside the main graph storage
4. store the full original manifest for all objects, in a separate storage
5. give up on all/some "weird objects"
Some mixes of the options are possible, especially 1 with 2, 3, or 5.
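To make the "cannot be losslessly represented" criterion from option 3 concrete, here is a minimal sketch of a round-trip check on timezone offsets. It is not swh-model code: `parse_offset`, `format_offset` and `manifest_to_keep` are hypothetical names, and the model is assumed to store an offset as a plain integer number of minutes.

```python
import re
from typing import Optional

# Hypothetical normalizing model: an offset is stored as integer minutes.
_OFFSET_RE = re.compile(rb"([+-])(\d{2})(\d{2})")


def parse_offset(text: bytes) -> int:
    """Parse a '+HHMM' / '-HHMM' offset into minutes east of UTC."""
    m = _OFFSET_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"unsupported offset: {text!r}")
    sign = -1 if m.group(1) == b"-" else 1
    return sign * (int(m.group(2)) * 60 + int(m.group(3)))


def format_offset(minutes: int) -> bytes:
    """Serialize minutes back to the canonical '+HHMM' / '-HHMM' form."""
    sign = "-" if minutes < 0 else "+"
    hours, mins = divmod(abs(minutes), 60)
    return f"{sign}{hours:02d}{mins:02d}".encode()


def manifest_to_keep(offset_text: bytes) -> Optional[bytes]:
    """Return the original bytes if the model cannot reproduce them, else None."""
    try:
        regenerated = format_offset(parse_offset(offset_text))
    except ValueError:
        # The model cannot even parse it: keep the original.
        return offset_text
    return None if regenerated == offset_text else offset_text
```

With this sketch, `manifest_to_keep(b"+0200")` returns `None`, while `b"-0000"` is kept because the integer model regenerates `b"+0000"`. This is the case that "negative_utc_offset" handles under option 1.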
Discussion of these options:
1 -> is annoying to handle, and needs continuous effort, but this is essentially what we are already doing
2 -> brittle, as a botched migration or a bug in swh-model would make the deltas unusable
4 -> probably doubles or triples the size of the graph, but it's the only way to protect against bad migrations (short of recomputing all checksums in migrations). On the other hand, parser errors may go unnoticed because we would rely on these manifests.
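The brittleness of option 2 can be illustrated with a toy delta format. This is only a sketch under assumptions: `make_delta`, `apply_delta` and the sample manifests are made up for illustration, and a real implementation would likely use a dedicated binary delta format rather than Python's `difflib`.

```python
import hashlib
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical delta format: (start, end, replacement) spans over the generated bytes.
Delta = List[Tuple[int, int, bytes]]


def make_delta(generated: bytes, original: bytes) -> Delta:
    """Record the spans of `generated` that must be replaced to obtain `original`."""
    matcher = SequenceMatcher(None, generated, original, autojunk=False)
    return [
        (i1, i2, original[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_delta(generated: bytes, delta: Delta) -> bytes:
    """Rebuild the original bytes from the generated bytes and a delta."""
    out = bytearray()
    pos = 0
    for start, end, replacement in delta:
        out += generated[pos:start]
        out += replacement
        pos = end
    out += generated[pos:]
    return bytes(out)


# Made-up manifests: the model normalizes "-0000" to "+0000".
original = b"author A <a@x> 0 -0000\n"
generated = b"author A <a@x> 0 +0000\n"
delta = make_delta(generated, original)
expected_id = hashlib.sha1(original).hexdigest()

# As long as the model output is stable, the delta restores the original.
restored = apply_delta(generated, delta)

# If a migration or bug changes what the model generates, the same delta
# silently produces different bytes; only a checksum comparison reveals it.
drifted = b"author  A <a@x> 0 +0000\n"
broken = apply_delta(drifted, delta)
```

Here `restored` equals `original`, whereas `broken` does not and its SHA-1 differs from `expected_id`. A stored delta is only valid against the exact bytes the model generated when the delta was computed.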