~16M contents were archived in total in Azure and are still being indexed (ctags and language indexing are still running).
It'd be nice to check, for a few random origins, that the archiving did the right job.
My naive check consists of:
- choosing 3 origins at random (4 now, since one was not relevant)
- checking, for those origins, how many contents the git reader listed
- checking, for those origins, how many contents swh knows about
- verifying, for each origin, that the two counts agree in order of magnitude
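
The comparison step could be sketched roughly as below. This is only an illustration, not the actual tooling: the function names, the per-origin count dictionaries, and the "within a factor of 10" reading of "same order of magnitude" are all assumptions. The counts themselves would have to be gathered by hand from the git reader output and from swh.

```python
def same_order_of_magnitude(a: int, b: int, factor: int = 10) -> bool:
    """Return True when the larger count is at most `factor` times the smaller.

    Assumption: "same order of magnitude" is read as "within a factor of 10".
    Two zero counts match; a zero against a non-zero count does not.
    """
    return max(a, b) <= factor * min(a, b)


def compare_counts(reader_counts: dict, swh_counts: dict) -> dict:
    """Compare content counts per origin.

    Both arguments are hypothetical {origin_url: number_of_contents} mappings.
    Only origins present in both mappings are compared.
    """
    common = reader_counts.keys() & swh_counts.keys()
    return {
        origin: same_order_of_magnitude(reader_counts[origin], swh_counts[origin])
        for origin in common
    }
```

An origin mapped to `False` is one where the two counts differ by more than the chosen factor and deserves a closer look.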