Leveraging the Azure infrastructure, trigger the blake2s256 update on the existing contents.
This means:
- Provisioning Azure VMs (sizing: DS2_V2 with 7GB RAM, 14GB SSD disk, 2 cores; 85.33E/month); 2 VMs for now
- code: making the storage read/write configuration composable and adapting the objstorage reads
- puppet: swh_indexer_rehash puppetization
- Deploying the swh.indexer.rehash module (+ fix bits and pieces along the way)
- Computing the list of sha1s to rehash from the swh.content table (IN PROGRESS in uffizi:/srv/storage/space/lists/contents-sha1-to-rehash.txt.gz).
- Sending all contents to the swh_indexer_rehash queue
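As a rough illustration of the last two steps, here is a minimal sketch in Python. It assumes the list file holds one hex-encoded sha1 per line (the actual format is not stated here), and the helper names `iter_sha1s`, `batches` and `blake2s256` are hypothetical, not part of swh.indexer.rehash. Only the standard library `hashlib.blake2s` and `gzip.open` calls are real APIs.

```python
import gzip
import hashlib


def blake2s256(data: bytes) -> str:
    """Return the hex BLAKE2s digest of data, with a 32-byte (256-bit) output."""
    return hashlib.blake2s(data, digest_size=32).hexdigest()


def iter_sha1s(path):
    """Yield sha1s from a gzip-compressed list.

    Assumption: one hex sha1 per line; blank lines are skipped.
    """
    with gzip.open(path, "rt") as f:
        for line in f:
            sha1 = line.strip()
            if sha1:
                yield sha1


def batches(items, size):
    """Group items into lists of at most `size` elements.

    Hypothetical helper: each list could become one message sent to the queue.
    """
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Batching the sha1s keeps the number of queue messages manageable; the right batch size depends on the queue and worker setup and is not specified here.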
Note:
Regarding the storage stack to use, we can:
- either use Azure's objstorage (the copy is 'complete', as in the snapshot copy). This will be the starting point.
- or use uffizi's objstorage (or banco's) if the cost projection is too high, since Azure's in-transit cost is null.
- or use a multiplexer objstorage that reads from Azure first, falls back to banco if the object is not found, then falls back to uffizi if it is still not found (solution used)
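The fallback behaviour of the multiplexer option can be sketched as follows. This is only an illustration of the read path under stated assumptions: the classes `DictObjStorage`, `FallbackObjStorage` and the exception `ObjNotFound` are hypothetical stand-ins and do not reflect the actual swh.objstorage API or configuration.

```python
class ObjNotFound(Exception):
    """Raised when an object id is absent from a backend (hypothetical)."""


class DictObjStorage:
    """In-memory stand-in for one objstorage backend (azure, banco, uffizi)."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def get(self, obj_id):
        try:
            return self._objects[obj_id]
        except KeyError:
            raise ObjNotFound(obj_id) from None


class FallbackObjStorage:
    """Read from the backends in order, returning the first hit."""

    def __init__(self, backends):
        self.backends = list(backends)

    def get(self, obj_id):
        for backend in self.backends:
            try:
                return backend.get(obj_id)
            except ObjNotFound:
                continue  # try the next backend in the chain
        # No backend holds the object.
        raise ObjNotFound(obj_id)


# Order matters: azure first, then banco, then uffizi.
azure = DictObjStorage({"a": b"from-azure"})
banco = DictObjStorage({"a": b"stale", "b": b"from-banco"})
uffizi = DictObjStorage({"c": b"from-uffizi"})
multiplexer = FallbackObjStorage([azure, banco, uffizi])
```

With this ordering, reads only leave Azure for the objects the Azure copy lacks.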