stuff related to content (of all kinds, not only "blobs") that is already stored in the Software Heritage archive
Apr 11 2022
Looks like the number of affected revisions is fluctuating a bit:
- needs review against the list generated for T3656
- quantify the “hole” problem, which prevents us from doing T3655 altogether
Feb 1 2022
With some more heuristics, retrying failed origins, and looking at dumps on banco (great idea from @olasd), I was able to bring down the total number of unrecoverable objects to 95k:
Jan 31 2022
Jan 18 2022
Dec 13 2021
Nov 26 2021
Copy of an email I sent on 2021-11-17:
Nov 22 2021
Nov 10 2021
At least the deposit and npm loaders are fine.
Nov 9 2021
Nov 8 2021
Here is an overview of the fields (+ internal version name + branch name) used by each package loader, after D6616:
Oct 22 2021
Great news: of the 469k corrupt SVN revisions, all but 14 (yes, 14) can be fixed simply by adding 1 microsecond to their timestamp.
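To illustrate the kind of check this implies, here is a minimal sketch: bump the timestamp by one microsecond, recompute the identifier, and see whether it now matches the stored one. Everything in it is an assumption made for illustration: `toy_id` is a hypothetical stand-in for the real revision hashing, and the `(seconds, microseconds)` timestamp layout and the function names are invented, not taken from the loaders or the archive code.

```python
import hashlib
from typing import Callable, Optional, Tuple

# Assumed layout for this sketch: (seconds since epoch, microseconds).
Timestamp = Tuple[int, int]


def add_one_microsecond(ts: Timestamp) -> Timestamp:
    """Return ts + 1 microsecond, carrying into the seconds field."""
    seconds, micros = ts
    micros += 1
    if micros == 1_000_000:
        return (seconds + 1, 0)
    return (seconds, micros)


def toy_id(message: bytes, ts: Timestamp) -> str:
    """Hypothetical identifier: NOT the real revision hashing scheme."""
    payload = b"%d.%06d\n" % ts + message
    return hashlib.sha1(payload).hexdigest()


def try_fix(
    expected_id: str,
    message: bytes,
    ts: Timestamp,
    compute_id: Callable[[bytes, Timestamp], str] = toy_id,
) -> Optional[Timestamp]:
    """Return the bumped timestamp if it reproduces expected_id, else None."""
    candidate = add_one_microsecond(ts)
    if compute_id(message, candidate) == expected_id:
        return candidate
    return None
```

For example, if the stored identifier was computed with `(10, 5)` but the stored timestamp is `(10, 4)`, `try_fix` returns `(10, 5)`; a revision whose hash still does not match after the bump returns `None` and would need a different fix.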
Oct 20 2021
After further investigation, I can't find any directory that is in a completely bad order; they are either ordered like git does (by adding a / at the end of dir entries) or by assuming a null byte at the end of dir entries.
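The two orderings can be sketched as sort keys, together with a small classifier that reports which of them a given entry list follows. This is only a sketch under assumptions: entries are modelled as hypothetical `(name, is_directory)` pairs, and the "null byte" variant is read here as appending `b"\x00"` to directory names before comparing.

```python
from typing import List, Tuple

# Hypothetical entry model for this sketch: (name, is_directory).
Entry = Tuple[bytes, bool]


def git_key(entry: Entry) -> bytes:
    """Sort key used by git: directory names compare as if they ended in '/'."""
    name, is_dir = entry
    return name + b"/" if is_dir else name


def nul_key(entry: Entry) -> bytes:
    """Alternative key: directory names compare as if they ended in a null byte."""
    name, is_dir = entry
    return name + b"\x00" if is_dir else name


def classify_order(entries: List[Entry]) -> List[str]:
    """Return the names of the orderings that `entries` already satisfies."""
    matches = []
    for label, key in (("git", git_key), ("nul", nul_key)):
        keys = [key(e) for e in entries]
        if keys == sorted(keys):
            matches.append(label)
    return matches
```

The two keys only disagree when a directory name is a prefix of a sibling entry's name: a directory `a` next to a file `a.txt` sorts after the file under the git key (`/` is 0x2f, `.` is 0x2e) and before it under the null-byte key. Most directories therefore match both orderings, and an empty result would mean a directory in a completely bad order.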
Oct 15 2021
Analysis of directories (some are also part of the fixable_trivial above, but I lost the exact number during my analysis):