À la recherche du content perdu
Open, Normal, Public

Description

[with apologies to Marcel Proust]

While finalizing the copy of all contents to Azure, 225 objects failed to copy, each for one of two reasons:

  • file not found
  • file empty

This means that, for all intents and purposes, these contents are lost.

A CSV export of the metadata for these contents is available here.

To try to restore these, we first need to find where they came from and check whether they're still available at the source. If they are, we can ingest them manually. If they're not, I believe we're out of luck for now, unless someone has a better idea.
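To check whether a file fetched again from its source really is one of the missing contents, its intrinsic identifier can be recomputed and compared against the exported list. The sketch below assumes the CSV export includes the `sha1_git` of each content (the Git blob hash, which is the hash used in `swh:1:cnt:` PIDs). The helper names and the shape of the inputs are hypothetical, not part of any Software Heritage tool.

```python
import hashlib


def sha1_git(data: bytes) -> str:
    """Return the Git blob hash of `data`: SHA-1 over a 'blob <len>\\0' header plus the bytes."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def content_pid(data: bytes) -> str:
    """Return the persistent identifier of a content object holding `data`."""
    return "swh:1:cnt:" + sha1_git(data)


def match_candidates(missing_sha1_git, candidates):
    """Return {name: sha1_git} for the candidate files whose hash is in the missing list.

    `missing_sha1_git` is an iterable of hex digests (assumed to come from the CSV export);
    `candidates` maps a hypothetical file name to the bytes fetched from the source.
    """
    wanted = set(missing_sha1_git)
    matches = {}
    for name, data in candidates.items():
        digest = sha1_git(data)
        if digest in wanted:
            matches[name] = digest
    return matches
```

Only candidates whose hash appears in the missing list are worth ingesting manually; anything else is a different version of the file and does not restore the lost content.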

Update (2019-06-20): 74 contents are still missing

Event Timeline

olasd triaged this task as Normal priority. Jun 17 2019, 5:59 PM
olasd created this task.
olasd added a subscriber: grouss. Jun 20 2019, 6:22 PM

151 contents have been restored with help from the provenance index, thanks to @grouss.

The list of missing contents has been updated.

olasd updated the task description. Jun 20 2019, 6:23 PM
zack added a subscriber: zack. Edited Nov 18 2019, 5:54 PM

I've used swh-graph to look up the 74 contents that were still missing, and managed to find 67 of them; see the cnt→ori mapping (tracing them back to actual origins requires T2045):

7 are still missing; here are their PIDs: