
Migrate revision metadata to extid in the storage
Closed, Migrated


We want to make loaders read from the ExtID storage instead of 'revision.metadata'. We could do this without migrating the data, but loaders would then be slow for a while, as they would lose track of which packages they have already loaded.

This means we should read all the existing revision metadata, and generate ExtIDs before getting rid of the 'revision.metadata' column.
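
As a rough illustration of that migration step, here is a minimal sketch, not the actual migration script. It assumes each revision row carries a metadata dict with an 'original_artifact' list holding a sha256 checksum; the ExtID class, the extid_from_row and migrate helpers, and the "tarball-sha256" type name are all hypothetical and do not reflect the real storage API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class ExtID:
    """Hypothetical stand-in for an external identifier record."""

    extid_type: str
    extid: bytes
    target: str  # identifier of the revision this ExtID points to


def extid_from_row(revision_id: str, metadata: Optional[dict]) -> Optional[ExtID]:
    """Derive one ExtID from a revision's metadata, or None if unusable.

    The metadata layout used here is an assumption made for the example.
    """
    if not metadata:
        return None
    artifacts = metadata.get("original_artifact") or []
    if len(artifacts) != 1:
        return None
    sha256 = artifacts[0].get("checksums", {}).get("sha256")
    if not sha256:
        return None
    try:
        digest = bytes.fromhex(sha256)
    except ValueError:
        # Malformed checksum: treat the row as unusable.
        return None
    return ExtID("tarball-sha256", digest, revision_id)


def migrate(
    rows: Iterable[Tuple[str, Optional[dict]]], skipped: List[str]
) -> Iterator[ExtID]:
    """Yield an ExtID per usable row and record the ids of skipped rows."""
    for revision_id, metadata in rows:
        extid = extid_from_row(revision_id, metadata)
        if extid is None:
            skipped.append(revision_id)
        else:
            yield extid
```

Collecting the identifiers of skipped rows keeps a trace of which revisions could not be converted, so they can be revisited later.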

Event Timeline

I've deployed the extid schema changes on all storages, and I've started the migration script on getty.

olasd changed the task status from Open to Work in Progress. Mar 30 2021, 7:43 PM

The migration script has now run to completion (took around a week).

I had an issue with a Debian source package with extra checksums (see attached diff - go figure), and with a tar revision whose clearly bogus metadata triggered the assert False at the end of handle_tar_row. I've just skipped it. Of course, it looks like I didn't note the relevant SWHIDs in either case, so reproducing the issues is going to be problematic unless we run the script again in dry-run mode, this time to completion. I don't think it's worth it.

If you remember the crash times (.zsh_history?), we could find a range of candidate SWHIDs...