Restore CRAN visits deleted in January 2020 from backups
Open, High, Public

Description

This script was used in January 2020 to delete all CRAN visits: P585.

This caused some links between origins and snapshots/revisions to be lost, for all packages that had a version removed or were entirely removed from CRAN (see T2536).

It also lost some release history for packages that weren't removed from CRAN (i.e. we lost the information that they didn't change, or which releases they added between visits).

So we should restore these visits from a backup.

Event Timeline

vlorentz triaged this task as Normal priority. Aug 28 2020, 4:06 PM
vlorentz created this task.
vlorentz added a subscriber: ardumont.
vlorentz raised the priority of this task from Normal to High. Aug 31 2020, 4:37 PM
olasd added a subscriber: olasd. Mon, Oct 26, 11:44 AM

We don't have database backups going back 10 months (nor 8 months, which was the horizon we would have needed at the time the task was submitted).

However, @ardumont kept the dumps of the tables before dropping the entries: belvedere:~ardumont/cran-origins-to-cleanup-dump-files.tar.gz

If there's some data worth saving in there (AFAIK the format of the snapshots is very wrong, so it's not obvious if there is) we could certainly restore them.
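
To judge whether the archive holds anything worth restoring, a first step could be to list what is in it. The following is only a rough sketch: the internal layout of the tarball is not described in this task, the function name `summarize_dump` is made up for illustration, and it assumes a local copy of the archive is available.

```python
import tarfile
from collections import Counter


def summarize_dump(archive_path):
    """Return (file_count, total_bytes, extensions) for a tar archive.

    `extensions` is a Counter mapping file extension -> number of files.
    Nothing is extracted; only the archive's member headers are read.
    """
    extensions = Counter()
    file_count = 0
    total_bytes = 0
    # Mode "r:*" lets tarfile detect the compression (gzip here) by itself.
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, etc.
            file_count += 1
            total_bytes += member.size
            base_name = member.name.rsplit("/", 1)[-1]
            ext = base_name.rsplit(".", 1)[-1] if "." in base_name else ""
            extensions[ext] += 1
    return file_count, total_bytes, extensions
```

Calling `summarize_dump` on the local copy of the tarball returns the number of dumped files, their total uncompressed size, and a breakdown by extension, which should be enough to decide whether a closer look at the snapshot data is warranted.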

And now it's here

;)