
Add recover_corrupt_objects.py
Closed, Public

Authored by vlorentz on Jan 17 2022, 3:21 PM.

Details

Summary

Depends on D6956 to support all objects

Test Plan

Tested in an empty Docker environment.

Diff Detail

Repository
rDSNIP Code snippets
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

vlorentz created this revision.

simplify the code a little bit

vlorentz/recover_corrupt_objects.py
75

I'm not sure that's worth bothering.

If you want to do it, the takedown snippet has the proper code.

You have to:

  • keep track of the ids referenced by the directories you've removed
  • delete / update the directory
  • then remove all the directory_entry_* ids that are not referenced from any directory anymore.
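The three steps above can be sketched roughly as follows. This is a minimal in-memory illustration, not the code from the takedown snippet: the function name `remove_directories` and the two dict-based stores (`directories` mapping a directory id to the set of entry ids it references, `entries` standing in for the directory_entry_* tables) are hypothetical.

```python
def remove_directories(directories, entries, ids_to_remove):
    """Remove directories, then drop entries that no directory references anymore.

    directories: dict mapping directory id -> set of entry ids (hypothetical layout)
    entries: dict mapping entry id -> entry data (stand-in for directory_entry_*)
    Returns the set of entry ids that were removed.
    """
    # 1. Keep track of the ids referenced by the directories being removed.
    candidates = set()
    for dir_id in ids_to_remove:
        candidates |= directories.get(dir_id, set())

    # 2. Delete the directories themselves.
    for dir_id in ids_to_remove:
        directories.pop(dir_id, None)

    # 3. Remove only the candidate entries that no remaining directory references.
    still_referenced = set().union(*directories.values())
    orphans = candidates - still_referenced
    for entry_id in orphans:
        entries.pop(entry_id, None)
    return orphans


directories = {"d1": {1, 2}, "d2": {2, 3}}
entries = {1: "a", 2: "b", 3: "c"}
orphans = remove_directories(directories, entries, ["d1"])
```

Entry 2 survives here because "d2" still references it, which is why the referenced ids have to be collected before the directories are deleted.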
79–80

I think these statements need to be swapped to avoid a rejection due to constraints (or add a cascade to the first statement).
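To illustrate why the order matters, here is a small self-contained sketch using SQLite rather than the actual PostgreSQL schema. The `parent` and `child` tables and the `DELETE` statements are assumptions made for the example; the real statements at lines 79–80 are not reproduced here.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with one foreign-key constraint."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER NOT NULL REFERENCES parent(id))"
    )
    conn.execute("INSERT INTO parent VALUES (1)")
    conn.execute("INSERT INTO child VALUES (10, 1)")
    return conn


def delete_parent_first(conn):
    """Return False if the constraint rejects deleting the referenced row."""
    try:
        conn.execute("DELETE FROM parent WHERE id = 1")
    except sqlite3.IntegrityError:
        return False
    return True


def delete_child_first(conn):
    """Delete the referencing rows first, then the referenced row."""
    conn.execute("DELETE FROM child WHERE parent_id = 1")
    conn.execute("DELETE FROM parent WHERE id = 1")
    return conn.execute("SELECT count(*) FROM parent").fetchone()[0] == 0
```

A database that does not enforce the constraint accepts either order, so the wrong order can go unnoticed in a test setup.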

vlorentz/recover_corrupt_objects.py
79–80

Ah, right. I tested it with the default Docker db; it looks like that one doesn't have the flavor that does the check.

  • recover_corrupt_objects.py: Double-check insertion (this allowed me to find the bug fixed by D7024)

I'm not 100% convinced we need to recheck the objects at every addition (within a transaction that can still fail to commit) instead of afterwards, but it doesn't /hurt/ either. We'll make a full pass on all objects again later anyway.
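For reference, the "recheck at every addition" pattern being discussed looks roughly like this. It is only a sketch: the `objects` table, the helper names, and the plain SHA-1 used as an identifier are stand-ins, not what recover_corrupt_objects.py actually does.

```python
import hashlib
import sqlite3


def object_id(data: bytes) -> str:
    # Stand-in identifier: a plain SHA-1 of the payload (an assumption for this sketch).
    return hashlib.sha1(data).hexdigest()


def make_store():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (id TEXT PRIMARY KEY, data BLOB NOT NULL)")
    return conn


def insert_checked(conn, expected_id, data):
    """Insert an object, read it back, and verify it before committing."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO objects (id, data) VALUES (?, ?)", (expected_id, data)
        )
        (stored,) = conn.execute(
            "SELECT data FROM objects WHERE id = ?", (expected_id,)
        ).fetchone()
        if object_id(stored) != expected_id:
            raise ValueError(f"object {expected_id} failed the post-insert check")
```

The check runs before the commit, so a passing check does not guarantee the transaction lands; a later full pass over all objects still covers that case.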

This revision is now accepted and ready to land. Jan 25 2022, 1:43 PM