As our project grows in size and visibility, we'll have to respond to more takedown requests.
Expunging data from Software Heritage is a very involved process: even within SWH itself, all the archive data is replicated across multiple systems (PostgreSQL, Kafka, Elasticsearch, to name a few), each with its own behavior. Adding mirrors to the mix makes the process even more arduous.
Finally, as a project of public interest, we have a duty to be transparent about which operations have been performed in response to any given takedown request.
This task is twofold: tracking which systems replicate which data, and deciding how to clear that data in response to a takedown. Once the brainstorming is over, this can be used as a basis for workflow documentation.
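To make the two halves of the task concrete, here is a minimal sketch of the kind of per-request, per-system tracking record that could back both the operational workflow and the transparency reporting. All names here (`TakedownRecord`, `SystemRecord`, `RemovalStatus`) are hypothetical illustrations, not existing SWH code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class RemovalStatus(Enum):
    """Progress of a removal in one storage system (hypothetical)."""
    PENDING = "pending"          # takedown accepted, not yet acted on
    REMOVED = "removed"          # data confirmed gone from this system
    NOT_PRESENT = "not-present"  # system never held a copy


@dataclass
class SystemRecord:
    """One system that may replicate the data covered by the request."""
    name: str                    # e.g. "postgresql", "kafka", "elasticsearch", "mirror:example"
    status: RemovalStatus = RemovalStatus.PENDING
    updated_at: datetime | None = None


@dataclass
class TakedownRecord:
    """Audit trail for a single takedown request (hypothetical data model)."""
    request_id: str                      # identifier of the takedown request
    swhids: list[str]                    # objects covered by the request
    systems: list[SystemRecord] = field(default_factory=list)

    def mark_removed(self, system_name: str) -> None:
        """Record that one system has confirmed removal of the data."""
        for record in self.systems:
            if record.name == system_name:
                record.status = RemovalStatus.REMOVED
                record.updated_at = datetime.now(timezone.utc)
                return
        raise ValueError(f"unknown system: {system_name}")

    def fully_removed(self) -> bool:
        """True once every tracked system has removed the data or never held it."""
        return all(
            r.status in (RemovalStatus.REMOVED, RemovalStatus.NOT_PRESENT)
            for r in self.systems
        )
```

Such a record, whatever its final shape, would list every system (including mirrors) expected to act on the request, and would double as the material we can publish when reporting what was done.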