I'm not quite sure what this is about.
Jun 25 2021
Jun 22 2021
An array with the possible node counts relative to the replication factor was added to the HedgeDoc document: https://hedgedoc.softwareheritage.org/m2MBUViUQl2r9dwcq3-_Nw?both
Jun 18 2021
@vlorentz If you have an idea of how to implement that, I'll take it ;) I'm not sure whether I've missed something
Several tests were executed with Cassandra nodes on the parasilo cluster [1]
The configuration was kept identical to calibrate the runs:
- ZFS is used to manage the datasets
- the commitlogs are on the 200 GB SSD drive
- the data is on the four 600 GB HDDs configured in RAID0
- default memory configuration (8 GB heap / default GC, not G1)
- Cassandra configuration: [2]
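Under this layout, the split between commitlog (SSD) and data (HDD RAID0 / ZFS) would typically be expressed in cassandra.yaml roughly as below. The mount points are hypothetical, not taken from the actual setup:

```yaml
# Hypothetical cassandra.yaml fragment: commitlogs on the SSD,
# data files on the ZFS dataset backed by the RAID0 HDDs.
# Paths are illustrative only.
commitlog_directory: /srv/cassandra/commitlog   # 200 GB SSD
data_file_directories:
    - /srv/cassandra/data                       # 4x 600 GB HDD, RAID0
```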
Jun 16 2021
Some notes on how to perform common actions with cassandra: https://hedgedoc.softwareheritage.org/m2MBUViUQl2r9dwcq3-_Nw
Jun 15 2021
The environment can be stopped and rebuilt as long as the disks remain reserved on the servers.
Jun 10 2021
Some status about the automation:
- Cassandra nodes are ok (OS installation; ZFS configuration according to the defined environment, except for a problem during the first initialization with new disks; startup; cluster configuration)
- swh-storage node is ok (OS installation, gunicorn/swh-storage installation and startup)
- Cassandra database initialization:
root@parasilo-3:~# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
UN  172.16.97.3  78.85 KiB   256     31.6%             49d46dd8-4640-45eb-9d4c-b6b16fc954ab  rack1
UN  172.16.97.5  105.45 KiB  256     26.0%             47e99bb4-4846-4e03-a06c-53ea2862172d  rack1
UN  172.16.97.4  98.35 KiB   256     18.1%             e2aeff29-c89a-4c7a-9352-77aaf78e91b3  rack1
UN  172.16.97.2  78.85 KiB   256     24.3%             edd1b72b-4c35-44bd-b7e5-316f41a156c4  rack1
root@parasilo-3:~# cqlsh 172.16.97.3
Connected to swh-storage at 172.16.97.3:9042
[cqlsh 6.0.0 | Cassandra 4.0 | CQL spec 3.4.5 | Native protocol v5]
cqlsh> desc KEYSPACES
Jun 3 2021
I played with Grid'5000 to experiment with how the jobs work and how to initialize the reserved nodes.
Jun 2 2021
May 26 2021
May 19 2021
May 7 2021
May 3 2021
Closing this as resolved, now that the search feature uses Elasticsearch in production.
Apr 28 2021
Apr 26 2021
Apr 23 2021
Apr 21 2021
Apr 19 2021
I just discussed the multiplexer-based migration process I described above with ardumont/olasd/vsellier.
Doesn't this deserve a state-of-the-art kind of thing? Is there documentation material on the subject? How do other (big) Cassandra users handle this?
In T2602#63432, @vlorentz wrote: For the harder cases, which involve changes to the PK, we could do something like this:
- create a new table with a new name (eg. revision_v[n+1]; like we do in swh-search except Cassandra does not support aliases)
- start an extra storage backend, that reads from that table instead of the old one (eg. revision_v[n]), and also reads from all the other tables as usual
- have a multiplexing storage proxy (like we have for the objstorage), that queries this new backend (which reads from v[n+1]), and falls back to the old backend (which reads from v[n])
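The multiplexing proxy described above can be sketched as follows. This is only an illustration of the read-fallback idea, not the actual swh-storage API; the class and method names are hypothetical:

```python
# Hypothetical sketch of the multiplexing storage proxy: reads try the
# new backend (backed by revision_v[n+1]) first, then fall back to the
# old backend (backed by revision_v[n]). Not the real swh-storage code.

class MultiplexerStorageProxy:
    def __init__(self, new_backend, old_backend):
        self.new_backend = new_backend
        self.old_backend = old_backend

    def revision_get(self, revision_ids):
        """Return one revision per id, preferring the new table."""
        results = []
        for rev_id in revision_ids:
            rev = self.new_backend.revision_get(rev_id)
            if rev is None:
                # Not yet migrated: fall back to the old table.
                rev = self.old_backend.revision_get(rev_id)
            results.append(rev)
        return results

    def revision_add(self, revisions):
        # Writes go only to the new table, so v[n+1] fills up while
        # v[n] gradually becomes redundant and can be dropped.
        return self.new_backend.revision_add(revisions)
```

Once every row has been backfilled into the new table, the proxy and the old backend can be retired.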
Apr 16 2021
What we can do, however:
Apr 15 2021
Apr 14 2021
Apr 12 2021
Apr 9 2021
Schema image is now properly displayed: https://docs.softwareheritage.org/devel/swh-storage/sql-storage.html#sql-storage
Thanks @faux @KShivendu @anlambert, team work ;)
Apr 6 2021
if you remember the crash times (.zsh_history?), we could find a range of candidate SWHIDs...
The migration script has now run to completion (took around a week).
@KShivendu The linked script is a start. As it is, it requires direct access to the DB, so you would need to create abstractions for it in swh-storage and swh-web.
Ok, thanks. It's actually tested in test_stat_counters in swh-storage/swh/storage/tests/storage_tests.py, which is used to test all four storage classes.
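The pattern of one test suite exercising several storage implementations can be sketched like this. The classes below are illustrative, not the actual swh-storage code:

```python
# Hypothetical sketch: a shared test mixin is reused against each
# storage backend by subclassing and overriding make_storage().
# This mirrors the idea of storage_tests.py covering all backends;
# names and APIs here are invented for illustration.

class StorageTestMixin:
    def make_storage(self):
        raise NotImplementedError

    def test_stat_counters(self):
        storage = self.make_storage()
        storage.content_add(["c1", "c2"])
        counters = storage.stat_counters()
        assert counters["content"] == 2


class InMemoryStorage:
    """Toy backend used to demonstrate the mixin."""
    def __init__(self):
        self._contents = set()

    def content_add(self, contents):
        self._contents.update(contents)

    def stat_counters(self):
        return {"content": len(self._contents)}


class TestInMemoryStorage(StorageTestMixin):
    def make_storage(self):
        return InMemoryStorage()
```

Each concrete backend then only supplies its own make_storage(), and the whole shared suite runs against it unchanged.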
Apr 5 2021
Hi guys. Any pointers on where to start?
I might be wrong, but I think it has been completed. Check these out:
Apr 3 2021
No longer relevant
Apr 1 2021
Mar 30 2021
I've deployed the extid schema changes on all storages, and I've started the migration script on getty.
Mar 29 2021
Mar 25 2021
Mar 23 2021
(and we should keep the origin topic; we already have an ExtSWHID for origins anyway)
The following objects remain:
After a lot of back and forth, and the release of swh.model v2.3.0 and swh.storage v0.26.0, this is now all done and deployed in staging and production.