We want to be able to analyze, in batch, all the content blobs stored by Software Heritage.
Sample use cases are:
- compute the MIME type
- detect the license using Ninka or FOSSology
- detect the programming language
To this end, we need scheduling tooling that lets us add and remove analyzers, (re)run analyses in batch, and incrementally stay up to date with new incoming content blobs.
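
The three requirements above (add/remove analyzers, batch (re)runs, incremental catch-up) can be illustrated with a minimal in-memory sketch. It is not the actual Software Heritage design: the names `AnalyzerScheduler`, `add_analyzer`, `remove_analyzer`, `run` and `toy_mime_type` are hypothetical, blobs are assumed to be a plain mapping from an identifier to bytes, and the MIME detection is a toy stand-in for a real tool.

```python
from typing import Callable, Dict, Tuple

# Hypothetical type: an analyzer maps raw blob content to a result string.
Analyzer = Callable[[bytes], str]


def toy_mime_type(data: bytes) -> str:
    """Toy stand-in for a real MIME detector (assumption, not a real tool)."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    try:
        data.decode("utf-8")
        return "text/plain"
    except UnicodeDecodeError:
        return "application/octet-stream"


class AnalyzerScheduler:
    """In-memory sketch: tracks which (blob, analyzer) pairs are done."""

    def __init__(self) -> None:
        self.analyzers: Dict[str, Analyzer] = {}
        # Results keyed by (blob id, analyzer name).
        self.results: Dict[Tuple[str, str], str] = {}

    def add_analyzer(self, name: str, func: Analyzer) -> None:
        self.analyzers[name] = func

    def remove_analyzer(self, name: str) -> None:
        # Drop the analyzer and every result it produced.
        self.analyzers.pop(name, None)
        self.results = {
            key: value for key, value in self.results.items() if key[1] != name
        }

    def run(self, blobs: Dict[str, bytes], rerun: bool = False) -> int:
        """Analyze blobs in batch and return the number of analyses executed.

        With rerun=False only missing (blob, analyzer) pairs are computed,
        which is how new blobs or new analyzers are caught up incrementally.
        """
        executed = 0
        for blob_id, data in blobs.items():
            for name, func in self.analyzers.items():
                key = (blob_id, name)
                if rerun or key not in self.results:
                    self.results[key] = func(data)
                    executed += 1
        return executed
```

Keying results by (blob, analyzer) pair means that a newly added analyzer or a newly arrived blob shows up as missing pairs, so the same `run` call covers both the initial batch and later incremental updates. A real deployment would presumably persist this state and distribute the work across workers rather than loop in a single process.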