Leveraging the Azure infrastructure, trigger the indexing of contents with the following indexers:
- mimetype
- language
- license
This means:
- [x] Provision the VMs (reuse the VMs created for T712)
- [ ] Deploy the indexer chain, adapting the current setup (swh-site)
- [ ] Send contents for indexing
- [ ] Make a cost/speed projection as soon as some threshold is reached (~1M contents sounds good)
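The projection in the last item could be a simple linear extrapolation from the first batch. The sketch below is only an illustration: the function name, the `Projection` type, and all numbers are hypothetical placeholders, not measurements, and it assumes time and cost scale linearly with the number of contents.

```python
from dataclasses import dataclass


@dataclass
class Projection:
    hours: float  # projected wall-clock time for the full run
    cost: float   # projected cost, in the same currency as sample_cost


def project(sample_count: int, sample_seconds: float,
            sample_cost: float, total_count: int) -> Projection:
    """Linearly extrapolate time and cost from a sample batch to the full set."""
    if sample_count <= 0:
        raise ValueError("sample_count must be positive")
    scale = total_count / sample_count
    return Projection(hours=sample_seconds * scale / 3600,
                      cost=sample_cost * scale)


# Placeholder inputs only: 1M contents indexed in 2 hours for 10 currency units,
# extrapolated to a hypothetical total of 5M contents.
p = project(sample_count=1_000_000, sample_seconds=7_200,
            sample_cost=10.0, total_count=5_000_000)
print(p)
```

A linear model ignores effects such as content-size distribution or throttling, so the result is a first estimate to refine once more batches are done.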
Note:
At the moment, each indexer reads the content from the objstorage (here, Azure) on its own.
If the cost is too high, we can either improve the code so that contents are read less often or, as in T712, read from the other objstorages instead (uffizi or banco, for which the transfer cost is zero).
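One way to read the contents less often would be to fetch each content once and hand the same bytes to every indexer. The following is a minimal sketch of that idea, not the current indexer code: `index_content`, `fake_fetch`, and the lambda indexers are hypothetical stand-ins.

```python
from typing import Callable, Dict

# Hypothetical signature: an indexer is any callable taking the raw bytes.
Indexer = Callable[[bytes], str]


def index_content(content_id: str,
                  fetch: Callable[[str], bytes],
                  indexers: Dict[str, Indexer]) -> Dict[str, str]:
    """Fetch the content once, then run every indexer on the same bytes."""
    data = fetch(content_id)  # single objstorage read for all indexers
    return {name: indexer(data) for name, indexer in indexers.items()}


reads = []  # records every objstorage access made by the stand-in below


def fake_fetch(content_id: str) -> bytes:
    """Stand-in for an objstorage read; a real one would hit the network."""
    reads.append(content_id)
    return b"hello world\n"


results = index_content(
    "some-content-id",
    fake_fetch,
    {
        "mimetype": lambda data: "text/plain",  # placeholder indexers
        "language": lambda data: "unknown",
        "license": lambda data: "unknown",
    },
)
print(len(reads), results)
```

With three indexers this drops the number of reads per content from three to one, at the price of running the indexers together rather than as independent tasks.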