Leveraging the Azure infrastructure, trigger the indexation of contents with the following indexers:
- mimetype
- language
- license
This means:
- [x] Provisioning VMs (reuse the VMs created for T712)
- [x] Deploying the chained indexers by adapting the current setup (swh-site)
- [ ] Send contents for indexation (in-progress)
- [x] 0-prefixed hashes
- ...
- [ ] Cost/speed projection as soon as some threshold is reached (~1M sounds good)
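For the cost/speed projection, a minimal sketch of a linear extrapolation from the first batch is given here. The function name `project_run` and its parameters are hypothetical, and the sketch assumes that throughput and cost scale linearly with the number of contents, which may not hold if many reads fall back to another objstorage.

```python
from typing import Tuple


def project_run(indexed: int, elapsed_hours: float, cost: float,
                total: int) -> Tuple[float, float]:
    """Extrapolate total duration and cost from a measured first batch.

    Assumes linear scaling: every content takes the same time and costs
    the same as the average content of the batch already indexed.
    """
    if indexed <= 0:
        raise ValueError("need at least one indexed content to project")
    scale = total / indexed
    # Projected (hours, cost) for indexing all `total` contents.
    return elapsed_hours * scale, cost * scale
```

The measured values (contents indexed, elapsed time, amount billed) would come from the run once the threshold is reached; none are assumed here.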
Note:
At the moment, each indexer reads the content from a multiplexer objstorage (starting from azure; if not found, fall back to reading from banco; if still not found, fall back to reading from uffizi).
If the cost is too high, we may improve the code to read the contents less often or, as in T712, reuse the other objstorages (uffizi or banco, whose in-transit cost is null).
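The fallback read order described in this note can be sketched as follows. This is an illustration only, not the actual objstorage code: `InMemoryObjStorage`, `FallbackReader` and `ObjNotFound` are hypothetical names, and the backends are in-memory stand-ins for azure, banco and uffizi.

```python
class ObjNotFound(Exception):
    """Raised when no backend holds the requested content (hypothetical)."""


class InMemoryObjStorage:
    """Stand-in for one backend (azure, banco or uffizi)."""

    def __init__(self, name, contents=None):
        self.name = name
        self.contents = dict(contents or {})
        self.reads = 0  # read attempts, as a rough proxy for cost

    def get(self, obj_id):
        self.reads += 1
        return self.contents.get(obj_id)  # None when the content is absent


class FallbackReader:
    """Try each backend in order and return the first hit."""

    def __init__(self, backends):
        self.backends = list(backends)

    def get(self, obj_id):
        for backend in self.backends:
            data = backend.get(obj_id)
            if data is not None:
                return backend.name, data
        raise ObjNotFound(obj_id)


azure = InMemoryObjStorage("azure", {"0a": b"from azure"})
banco = InMemoryObjStorage("banco", {"0b": b"from banco"})
uffizi = InMemoryObjStorage("uffizi", {"0c": b"from uffizi"})
reader = FallbackReader([azure, banco, uffizi])
```

Counting read attempts per backend, as the stand-in does, is one way to see how often each indexer hits each objstorage, which is the quantity that drives the cost mentioned above.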