When mirroring the archive with the storage replayer tooling, we want to go as fast as possible without losing any object. If a problem prevents an object from being inserted in the destination storage, we want to be made aware of it through a better communication channel than the logs of the process.
In order to make sure we do not insert invalid/corrupted objects, a mirroring session will have to use a ValidationProxyStorage step in the destination storage config.
It also makes sense to have a TenacityProxyStorage in the destination storage config pipeline: one does not want a whole batch of objects to be rejected when only one of them is invalid or fails to be inserted, and this proxy adds a bit of resilience against transient failures.
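The batch fallback behaviour described above can be sketched as follows. This is only an illustration of the idea, not the actual TenacityProxyStorage code: the `insert_with_fallback` function, the `add_batch` callable and the `fake_add` backend are hypothetical names introduced for this example.

```python
from typing import Callable, Iterable, List, Tuple


def insert_with_fallback(
    add_batch: Callable[[List[dict]], None],
    objects: Iterable[dict],
) -> List[Tuple[dict, Exception]]:
    """Try to insert the whole batch; on failure, retry one object at a time.

    Returns the (object, exception) pairs that could not be inserted.
    `add_batch` is a hypothetical stand-in for a storage insertion method.
    """
    batch = list(objects)
    try:
        add_batch(batch)
        return []
    except Exception:
        pass  # fall through to per-object insertion

    failures: List[Tuple[dict, Exception]] = []
    for obj in batch:
        try:
            add_batch([obj])
        except Exception as exc:
            # Only this object is lost; the rest of the batch still goes in.
            failures.append((obj, exc))
    return failures


# Toy backend for the example: rejects a batch if any object is flagged
# as corrupted, otherwise stores all of its objects.
stored: List[dict] = []


def fake_add(batch: List[dict]) -> None:
    for obj in batch:
        if obj.get("corrupted"):
            raise ValueError("invalid object")
    stored.extend(batch)


failures = insert_with_fallback(
    fake_add,
    [{"id": 1}, {"id": 2, "corrupted": True}, {"id": 3}],
)
```

With this toy backend, the two valid objects end up in `stored` and the corrupted one is returned in `failures`, which is exactly the list that currently only reaches the logs.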
Currently, these two proxy storages do log insertion errors, but there is no mechanism to report, in a consistent and persistent database, the list of objects that failed to be inserted for some reason.
Ideally, the reported objects should be stored in a k/v-like database, using a unique key as identifier, typically the swhid, the hash, or something forged from BaseModel.unique_key() (a bit like what is done in the kafka writer, but the latter uses msgpack-encoded keys, which are not very practical for a k/v store like redis). The value should be the kafka message, so one can inspect the problem and possibly replay the insertion for these objects.
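A minimal sketch of such a reporter, under several assumptions: the key layout (`<object type>:<hex or k=v pairs>`), the `forge_key` helper, the `ErrorReporter` class and the `InMemoryBackend` are all hypothetical and only meant to illustrate the idea. It assumes the unique key is either a bytes identifier or a dict of hashes. The in-memory backend stands in for a real k/v client; a `redis.Redis` client exposes compatible `set` and `get` methods.

```python
from typing import Dict, Optional, Union

# Assumption: a unique key is either raw bytes or a dict of named hashes.
KeyType = Union[bytes, Dict[str, Union[str, bytes]]]


def forge_key(object_type: str, unique_key: KeyType) -> str:
    """Build a printable, deterministic key for a k/v store."""

    def to_text(value: Union[str, bytes]) -> str:
        return value.hex() if isinstance(value, bytes) else str(value)

    if isinstance(unique_key, dict):
        # Sort the entries so the same object always yields the same key.
        suffix = ";".join(
            f"{name}={to_text(unique_key[name])}" for name in sorted(unique_key)
        )
    else:
        suffix = to_text(unique_key)
    return f"{object_type}:{suffix}"


class InMemoryBackend:
    """Stand-in for a k/v client, exposing only set() and get()."""

    def __init__(self) -> None:
        self.data: Dict[str, bytes] = {}

    def set(self, name: str, value: bytes) -> None:
        self.data[name] = value

    def get(self, name: str) -> Optional[bytes]:
        return self.data.get(name)


class ErrorReporter:
    """Record objects that failed to be inserted, keyed by a readable id."""

    def __init__(self, backend) -> None:
        self.backend = backend

    def report(self, object_type: str, unique_key: KeyType, message: bytes) -> str:
        key = forge_key(object_type, unique_key)
        # The value is the raw message, kept as-is so it can be replayed.
        self.backend.set(key, message)
        return key


backend = InMemoryBackend()
reporter = ErrorReporter(backend)
key = reporter.report("revision", bytes.fromhex("aa" * 20), b"raw kafka message")
```

Keeping the key human-readable makes it possible to look up a given object directly from a command-line client, and storing the untouched message bytes leaves the replay path independent of how the reporter itself evolves.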
This task is about adding such a reporting mechanism.
Redis is probably a good candidate as the database backend for this reporting tool.