
Check reader git did its job
Closed, Migrated

Description

~16M contents were archived in total in Azure and are still being indexed (ctags and language indexing are still running).

It would be nice to check, for a few random origins, that the reader git did its job correctly.

My naive check consists of:

  • choosing 3 origins at random (4 now, since one was not relevant)
  • checking, for those origins, how many contents the reader git listed
  • checking, for those origins, how many contents swh knows about
  • comparing the two counts to confirm they match in order of magnitude
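
The comparison step could be sketched roughly as below. This is only an illustration, not the script actually used for the check: the function names, the two input mappings (origin to number of contents) and the "less than a factor of 10" definition of a matching order of magnitude are all assumptions.

```python
import random


def same_order_of_magnitude(a: int, b: int, factor: float = 10.0) -> bool:
    """Return True when two non-negative counts differ by less than `factor`.

    The factor-of-10 threshold is an assumed definition of "same order
    of magnitude"; adjust it to taste.
    """
    if a < 0 or b < 0:
        raise ValueError("counts must be non-negative")
    if a == 0 or b == 0:
        # Zero has no order of magnitude; only treat 0 vs 0 as a match.
        return a == b
    return max(a, b) / min(a, b) < factor


def check_origins(reader_counts, archive_counts, sample_size=3, seed=None):
    """Sample origins present in both mappings and compare their counts.

    `reader_counts` and `archive_counts` are hypothetical dicts mapping an
    origin to the number of contents seen by the reader git and by swh.
    """
    rng = random.Random(seed)
    # Only origins known on both sides can be compared.
    candidates = sorted(set(reader_counts) & set(archive_counts))
    chosen = rng.sample(candidates, min(sample_size, len(candidates)))
    return {
        origin: same_order_of_magnitude(
            reader_counts[origin], archive_counts[origin]
        )
        for origin in chosen
    }
```

Origins missing from either mapping are skipped rather than reported as mismatches, which is one possible way to handle an origin that cannot be checked on one side.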

Event Timeline

ardumont changed the task status from Open to Work in Progress. Edited Nov 4 2016, 3:13 PM
ardumont created this task.

Details of the current analysis are in P120.

TL;DR

  • 3 origins OK
  • 1 origin not relevant (I use the cache to check, but that origin is not cached).
ardumont renamed this task from Check that reader git did its job to Check reader git did its job. Nov 4 2016, 3:23 PM

Updated previous comment. It's ok.