schema of timezones in the journal
Closed, Migrated

Description

We recently changed the format of new objects written in the journal: D7003

However, offset_bytes is going to be renamed to simply offset in the model, as we are getting rid of the numeric offset. This will introduce yet another change to the journal format, and a somewhat confusing one, since the name offset would then refer to a bytes value instead of a number.

What should we do about this?

  1. keep existing objects as-is, leaving a mix of three forms: offset_bytes, a numeric offset, and a bytes offset
  2. keep the schema as it is currently, with a mix of offset_bytes and a numeric offset
  3. rewrite every object to use a bytes offset

Option 1 is the hardest on future consumers, and option 3 has the highest risk of corrupting existing data. Option 2 is a middle ground.
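
To illustrate what option 2 means for a consumer, here is a minimal sketch of reading the offset from a journal date that may carry either form. It assumes the date is a plain dict, that offset_bytes looks like b"+0200" (sign, two hour digits, two minute digits), and that the numeric offset is in minutes. The helper name offset_minutes is hypothetical and not part of any existing API.

```python
import re
from typing import Any, Mapping

# Assumed shape of offset_bytes: sign, two hour digits, two minute digits.
_OFFSET_RE = re.compile(rb"([+-])(\d{2})(\d{2})")


def offset_minutes(date: Mapping[str, Any]) -> int:
    """Return the UTC offset in minutes from either representation.

    Hypothetical helper: prefers offset_bytes (newer objects) and falls
    back to the numeric offset (older objects).
    """
    raw = date.get("offset_bytes")
    if raw is not None:
        match = _OFFSET_RE.fullmatch(raw)
        if match is None:
            # Values outside the assumed shape are not guessed at.
            raise ValueError(f"unparseable offset_bytes: {raw!r}")
        sign, hours, minutes = match.groups()
        total = int(hours) * 60 + int(minutes)
        return -total if sign == b"-" else total
    # Older objects: numeric offset, assumed to be in minutes.
    return int(date["offset"])
```

With this sketch, offset_minutes({"offset_bytes": b"+0200"}) and offset_minutes({"offset": 120}) both return 120. Under option 1 the same function would also need a third branch for a bytes value stored under the offset key.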

Event Timeline

vlorentz triaged this task as Normal priority. Jan 26 2022, 11:57 AM
vlorentz created this task.

Looks like we are going to keep the status quo in the short term, i.e. a numeric offset for old objects, and offset_bytes for new objects, without renaming.

@douardda @olasd agreed?

yes