Add recover_corrupt_objects.py
Closed, Public

Authored by vlorentz on Jan 17 2022, 3:21 PM.

Details

Summary

Depends on D6956 to support all objects

Test Plan

Tested in an empty Docker environment.

Diff Detail

Repository: rDSNIP Code snippets
Branch: master
Lint: No Linters Available
Unit: No Unit Test Coverage
Build Status: Buildable 26182
Build 40925: arc lint + arc unit

Event Timeline

vlorentz created this revision.

simplify the code a little bit

vlorentz/recover_corrupt_objects.py
75

I'm not sure that's worth bothering with.

If you want to do it, the takedown snippet has the proper code.

You have to:

  • keep track of the ids referenced by the directories you've removed
  • delete / update the directory
  • then remove all the directory_entry_* ids that are not referenced from any directory anymore.
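
The three steps above can be sketched in plain Python. This is only an illustration under assumptions: the in-memory dicts stand in for the directory and directory_entry_* tables, and `remove_directories` is a hypothetical helper, not the code from the takedown snippet.

```python
def remove_directories(directories, entries, ids_to_remove):
    """Delete directories, then drop entries no remaining directory references.

    directories: dict mapping directory id -> set of entry ids (hypothetical model)
    entries: dict mapping entry id -> entry payload (hypothetical model)
    Returns the set of entry ids that were garbage-collected.
    """
    # Step 1: keep track of the entry ids referenced by the removed directories.
    candidates = set()
    for dir_id in ids_to_remove:
        # Step 2: delete the directory itself.
        candidates |= directories.pop(dir_id, set())

    # Step 3: remove only the entries that no remaining directory references.
    still_referenced = set().union(*directories.values())
    orphans = candidates - still_referenced
    for entry_id in orphans:
        entries.pop(entry_id, None)
    return orphans


# Two directories share entry 2, so removing "d1" may only collect entry 1.
directories = {"d1": {1, 2}, "d2": {2, 3}}
entries = {1: "a.txt", 2: "b.txt", 3: "c.txt"}
removed = remove_directories(directories, entries, ["d1"])
```

Collecting the candidate ids before deleting matters: once a directory row is gone, there is no longer a cheap way to tell which entries it used to reference.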
79–80

I think these statements need to be swapped to avoid a rejection due to constraints (or add a cascade to the first statement).
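
For illustration, here is a minimal sketch of why statement order matters under a foreign-key constraint. The statements on lines 79–80 are not reproduced here, so the `parent` and `child` tables are hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for the PostgreSQL database the script targets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE child (id INTEGER PRIMARY KEY, "
    "parent_id INTEGER NOT NULL REFERENCES parent(id))"
)
conn.execute("INSERT INTO parent (id) VALUES (1)")
conn.execute("INSERT INTO child (id, parent_id) VALUES (10, 1)")

# Wrong order: deleting the referenced row first violates the constraint.
try:
    conn.execute("DELETE FROM parent WHERE id = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Right order: delete the referencing row first, then the referenced one.
conn.execute("DELETE FROM child WHERE parent_id = 1")
conn.execute("DELETE FROM parent WHERE id = 1")
conn.commit()
remaining = conn.execute("SELECT COUNT(*) FROM parent").fetchone()[0]
```

Declaring the reference with `ON DELETE CASCADE` would be the other option mentioned above: the first delete would then remove the referencing rows itself instead of being rejected.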

vlorentz/recover_corrupt_objects.py
79–80

Ah, right. I tested it with the default Docker db; it looks like it doesn't have the flavor that does the check.

  • recover_corrupt_objects.py: Double-check insertion (this allowed me to find the bug fixed by D7024)

I'm not 100% convinced we need to recheck the objects at every addition (within a transaction that can still fail to commit) instead of afterwards, but it doesn't /hurt/ either. We'll make a full pass on all objects again later anyway.
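
As a rough sketch of what rechecking at every addition could look like: the object is written, read back, and its hash recomputed before moving on. Everything here is an assumption for illustration: `add_and_check` is a hypothetical helper, a dict stands in for the storage, and a plain SHA-1 of the bytes stands in for however the real script computes object ids.

```python
import hashlib


def add_and_check(store, data):
    """Insert data under its hash, then read it back and verify the hash."""
    obj_id = hashlib.sha1(data).hexdigest()
    store[obj_id] = data
    # Double-check: what was stored must still hash to the id it is stored under.
    if hashlib.sha1(store[obj_id]).hexdigest() != obj_id:
        raise ValueError(f"object {obj_id} is corrupt after insertion")
    return obj_id


class TruncatingStore(dict):
    """Hypothetical faulty storage that drops the last byte on write."""

    def __setitem__(self, key, value):
        super().__setitem__(key, value[:-1])


good_id = add_and_check({}, b"hello")
try:
    add_and_check(TruncatingStore(), b"hello")
    caught = False
except ValueError:
    caught = True
```

A check like this only sees what is visible at insertion time; as noted above, a later full pass over all objects is still what confirms the committed state.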

This revision is now accepted and ready to land. Jan 25 2022, 1:43 PM