As explained in T1144#21336, it would simplify post-analysis of problems if we kept all the metadata received from the client.
Good idea!
Similarly to what we do for the loaders (e.g., with the Git pack files), we should just keep everything (metadata + tarball) received from a deposit in raw format somewhere, to allow further re-processing. So I think this issue should not only be about keeping raw metadata, but rather the entire (raw) deposit.
For the raw archive(s) it's already the case, as we do not know yet when the actual deposit will be done.
A deposit can stay 'partial' for a long time (that is, not finalized, so the user can still update it), which is why the raw archive(s) were kept. The archive(s) and metadata are only used from status 'deposited' onward.
We just did not do it for the metadata.
I don't remember the reason(s) that made us inconsistent there. Probably a missing reasoning step.
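To make the proposal concrete, here is a minimal sketch of keeping the raw metadata next to the raw archive for every request of a deposit. It is not the actual deposit code: the `Deposit` and `DepositRequest` classes, their fields, and the `receive` and `raw_payloads` methods are hypothetical names chosen for illustration. Only the 'partial' and 'deposited' statuses come from the discussion above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DepositRequest:
    """One request received from the client (hypothetical model)."""
    raw_archive: Optional[bytes] = None   # already kept as received
    raw_metadata: Optional[bytes] = None  # proposed: keep the bytes as received too


@dataclass
class Deposit:
    """A deposit that stays 'partial' until the client finalizes it."""
    status: str = "partial"
    requests: List[DepositRequest] = field(default_factory=list)

    def receive(self, archive: Optional[bytes] = None,
                metadata: Optional[bytes] = None,
                final: bool = False) -> None:
        # Updates are only accepted while the deposit is still 'partial'.
        if self.status != "partial":
            raise ValueError("deposit can no longer be updated")
        # Store both payloads untouched so they can be re-processed later.
        self.requests.append(DepositRequest(archive, metadata))
        if final:
            self.status = "deposited"

    def raw_payloads(self) -> List[Tuple[Optional[bytes], Optional[bytes]]]:
        # Archives and metadata are only used from 'deposited' onward.
        if self.status == "partial":
            return []
        return [(r.raw_archive, r.raw_metadata) for r in self.requests]
```

With this shape, a later re-processing step could read the original bytes of each request instead of relying on an already parsed form of the metadata.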