Meta task to get my thoughts in order about adding a Cassandra backend to swh-storage. (Started in April)
[x] have a draft implementation https://forge.softwareheritage.org/source/swh-storage-cassandra/
[x] benchmark to check that performance is not catastrophic https://forge.softwareheritage.org/source/storage-benchmark-deployment/
[x] increase test coverage of all behaviors of swh-storage (D1534 to D1552)
[x] numeric origin ids
[x] define a replacement T1731
[x] get rid of numeric origin ids in all storage clients T1816
[x] non-swh-web clients
[x] swh-web
[x] queries by origin-id D1969
[x] paginated queries T1912
[ ] ~~public API v2 T1805~~ (postponed)
[x] Add the draft Cassandra backend to the docker env
[ ] Run the draft Cassandra backend with production data
[ ] ~~Rewrite the Cassandra backend using the experience gained from working on the draft~~
[x] Add it to the docker env
[ ] Write a storage proxy component that queries both backends (postgres and cassandra) and compares their results to check they are the same, then run it in the docker env. This will make sure migrating to Cassandra does not introduce regressions
[ ] Run it with production data
[ ] Deploy in production (possibly with the proxy at first)
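
For the storage proxy item above, here is a minimal sketch of what such a comparing proxy could look like. It is not the actual swh-storage design: the names `ComparingProxy`, `ResultMismatch` and `_normalize` are hypothetical, and it assumes both backends expose the same methods and return values that can be compared with `==` once iterators are turned into lists.

```python
import logging

logger = logging.getLogger(__name__)


class ResultMismatch(Exception):
    """Raised in strict mode when the two backends disagree (hypothetical name)."""


def _normalize(result):
    # Materialize generators and other iterables into lists so that results
    # can be both compared and returned; scalars, strings and dicts are kept.
    if isinstance(result, (str, bytes, dict)) or not hasattr(result, "__iter__"):
        return result
    return list(result)


class ComparingProxy:
    """Forwards each method call to two backends and compares their results.

    Hypothetical sketch: `primary` and `secondary` are any two objects with
    the same methods, e.g. a postgres-backed and a cassandra-backed storage.
    """

    def __init__(self, primary, secondary, strict=False):
        self._primary = primary
        self._secondary = secondary
        self._strict = strict

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself.
        if name.startswith("_"):
            raise AttributeError(name)
        primary_method = getattr(self._primary, name)
        secondary_method = getattr(self._secondary, name)

        def call_both(*args, **kwargs):
            expected = _normalize(primary_method(*args, **kwargs))
            actual = _normalize(secondary_method(*args, **kwargs))
            if expected != actual:
                message = f"{name}: backends disagree: {expected!r} != {actual!r}"
                if self._strict:
                    raise ResultMismatch(message)
                logger.warning(message)
            # The primary backend stays the source of truth for callers.
            return expected

        return call_both
```

In strict mode a mismatch raises, which suits test runs in the docker env; in non-strict mode it is only logged, which would be closer to what a production run with the proxy needs. The comparison is order-sensitive, so endpoints that return results in an unspecified order would need sorting before comparing.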