May 16 2020
The s3 object copy has now completely caught up with where kafka was when the backfill of all objects from postgresql ended. This means we're now copying the "newer" objects, and there are pretty much no hits on the inventory file anymore.
May 4 2020
Apr 30 2020
Let's consider this done now.
Apr 29 2020
Apr 24 2020
Apr 23 2020
I've updated the exclusion file with data from the 2020-04-19 s3 inventory.
Apr 22 2020
Apr 14 2020
Remains open because there are decisions to be made about the few real collisions (3) we have so far.
So our high number of false hash collisions is fixed thanks to D2977 now \m/.
Apr 9 2020
Apr 8 2020
An interesting experiment: disabling the proxy buffer storage in the nixguix loader configuration. The number of hash collisions dropped to 0 (no new event for that loader since yesterday around 6pm our time).
Apr 3 2020
All in all, this task serves the purpose of making sure those exist.
Mar 25 2020
Mar 24 2020
Finally, we should make sure that the storage implementations reject objects with hashes of the wrong length. I'm /almost/ sure that's the case, but we should verify it.
To be more sure of that, I think we should make sure that all hash data in all exception arguments is hex-encoded unicode strings, rather than bytes objects left for Python to repr(); this would avoid a lot of places where encoding or decoding the data in transfer can go wrong.
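The two points above could be combined in a small validation step. This is only a sketch under assumptions: the function and exception names are illustrative, not the actual swh.storage API. The key ideas are rejecting hashes whose byte length doesn't match the algorithm, and putting only hex-encoded strings into exception arguments so nothing depends on Python's repr() of bytes.

```python
import binascii

# Illustrative expected byte lengths per algorithm (not taken from the
# actual storage code).
EXPECTED_HASH_LENGTHS = {
    "sha1": 20,
    "sha1_git": 20,
    "sha256": 32,
    "blake2s256": 32,
}


class HashLengthError(ValueError):
    """Raised when a hash value has the wrong length for its algorithm."""


def validate_hashes(hashes: dict) -> None:
    """Reject objects whose hashes have the wrong length.

    Exception arguments carry hex-encoded unicode strings, not raw
    bytes, so the message survives serialization and logging intact.
    """
    for algo, expected in EXPECTED_HASH_LENGTHS.items():
        value = hashes.get(algo)
        if value is not None and len(value) != expected:
            raise HashLengthError(
                f"{algo}: expected {expected} bytes, got {len(value)} "
                f"({binascii.hexlify(value).decode()})"
            )
```

Running such a check on every insert path would turn any length mismatch into an immediate, readable error instead of a spurious collision later on.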
It looks like there are a few actual collisions; they seem to be the known-colliding Google PDFs.
I'll write my remarks down here for tracking purposes
sampled collisions extracted from sentry and storage 
Mar 12 2020
Feb 18 2020
Jan 22 2020
Dec 7 2019
I've grown tired of babysitting this, so I've added systemd notify calls to the journal replayer, allowing us to just use the systemd watchdog to restart hung processes.
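The watchdog mechanism mentioned above can be sketched as follows. This is a minimal illustration of the sd_notify(3) protocol, assuming the replayer pings systemd after each processed batch; the real replayer may use a helper library, and the function name here is hypothetical.

```python
import os
import socket


def sd_notify(message: str) -> None:
    """Send a message to the systemd notify socket, if one is set.

    Minimal sketch of the sd_notify(3) protocol: systemd passes the
    socket address in the NOTIFY_SOCKET environment variable; a leading
    "@" denotes a Linux abstract socket.
    """
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return  # not running under systemd with Type=notify
    if addr.startswith("@"):
        addr = "\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), addr)


# In the replay loop, after each successfully processed batch:
#   sd_notify("WATCHDOG=1")
# With WatchdogSec= set in the unit file, systemd restarts the service
# automatically when a hung process stops pinging.
```

The point of the design is that the liveness check sits in the replay loop itself, so a process that is alive but stuck no longer needs manual babysitting.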
Nov 25 2019
Nov 24 2019
So, the number of contents on S3 went up fairly quickly between Nov 10th and Nov 20th, but then it stopped again; is that expected/normal?
Nov 8 2019
I've added a metric with the S3 objects to https://grafana.softwareheritage.org/d/jScG7g6mk/objstorage-object-counts. There's... "some" work to do still.
So I've deployed this (by hand for now) on uffizi and it seems to be doing its job.