
complete object storage mirror on Azure (meta task)
Closed, ResolvedPublic

Description

As per the title: we want a first complete, off-site mirror of our object storage, hosted on Azure.

Event Timeline

olasd changed the task status from Open to Work in Progress. Apr 24 2017, 3:05 PM
olasd created subtask Unknown Object (Maniphest Task).
olasd added a subscriber: olasd.

We now have a "full" content mirror on Azure (the data for each 16th bucket is up to date as of the time the snapshot was taken).

We still need to process the logs for the full injection.

zack added a comment. May 25 2019, 4:56 PM

@olasd recently made a lot of progress on this one.

Can you leave a note about how many objects (roughly) are now on the Azure object storage and how many are missing?

zack closed subtask Unknown Object (Maniphest Task) as Resolved.
olasd added a comment. Jun 7 2019, 7:30 PM
  • The main archive currently writes all contents synchronously to Azure as well as to the local storage (so the gap is strictly closing).
  • All partitions from uffizi have been copied to Azure and mass-injected (except for partition 8, which was only partially mass-injected).
  • After this process, it looks like Azure is missing 10% of all objects (excluding partition 8), all of which are on banco.
    • I've started a procedure to copy the missing objects from banco directly. Estimated time to completion: ~1 month.
    • The same procedure has been started to copy the missing objects from partition 8 on uffizi. Estimated time to completion: ~15 days.
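
For readers unfamiliar with the approach, the two mechanisms above (synchronous dual writes plus a backfill of missing objects) can be sketched roughly as follows. This is a minimal illustration only, not the actual archive code: `DictStore`, `DualWriteStore` and `backfill` are hypothetical names, and the in-memory dictionaries stand in for the local storage and the Azure mirror.

```python
import hashlib


class DictStore:
    """Hypothetical in-memory stand-in for an object storage backend."""

    def __init__(self):
        self.objects = {}

    def add(self, content: bytes) -> str:
        # Objects are addressed by a hash of their content (SHA-1 here,
        # chosen only for the example).
        obj_id = hashlib.sha1(content).hexdigest()
        self.objects[obj_id] = content
        return obj_id

    def get(self, obj_id: str) -> bytes:
        return self.objects[obj_id]

    def ids(self) -> set:
        return set(self.objects)


class DualWriteStore:
    """Writes every new object to both backends before returning."""

    def __init__(self, primary, mirror):
        self.primary = primary
        self.mirror = mirror

    def add(self, content: bytes) -> str:
        obj_id = self.primary.add(content)
        self.mirror.add(content)  # synchronous: new objects never widen the gap
        return obj_id


def backfill(source, mirror) -> int:
    """Copy objects present in source but absent from mirror; return the count."""
    missing = source.ids() - mirror.ids()
    for obj_id in sorted(missing):
        mirror.add(source.get(obj_id))
    return len(missing)
```

With dual writes in place, the set of missing objects can only shrink, so a one-off backfill from each source is enough to close the gap.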
olasd closed this task as Resolved. Jun 17 2019, 4:25 PM
olasd claimed this task.

After processing the logs of the backfilling process to make sure to redo all the ranges that were interrupted in various database migrations, I'm now confident that this task is complete: we have a full mirror of all contents on Azure, which is kept up to date by the main archive storage backend writing synchronously to it.
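
As an illustration of the kind of check described here (not the tooling actually used), finding the ranges to redo amounts to computing the gaps left by the ranges that completed. The sketch below assumes a hypothetical log format in which each completed run is recorded as a half-open integer interval `(start, end)` over the key space; the function name and the interval representation are assumptions made for the example.

```python
def missing_ranges(completed, total_start, total_end):
    """Return the sub-ranges of [total_start, total_end) not covered by
    any completed (start, end) interval. Intervals are half-open and may
    overlap or arrive in any order."""
    gaps = []
    cursor = total_start  # everything before cursor is known to be covered
    for start, end in sorted(completed):
        if start > cursor:
            # Nothing completed between cursor and start: redo this range.
            gaps.append((cursor, min(start, total_end)))
        cursor = max(cursor, end)
        if cursor >= total_end:
            break
    if cursor < total_end:
        gaps.append((cursor, total_end))
    return gaps


# Example: runs covering 0-10, 20-30 and 25-40 out of a 0-50 key space.
print(missing_ranges([(20, 30), (0, 10), (25, 40)], 0, 50))
# prints [(10, 20), (40, 50)]
```

Re-running the backfill on exactly the returned ranges, then repeating the check until it returns an empty list, gives reasonable confidence that no interrupted range was left behind.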

zack added a comment. Jun 17 2019, 4:45 PM
In T691#33551, @olasd wrote:

After processing the logs of the backfilling process to make sure to redo all the ranges that were interrupted in various database migrations, I'm now confident that this task is complete: we have a full mirror of all contents on Azure, which is kept up to date by the main archive storage backend writing synchronously to it.

\o/