Since this "migration problem" also concerns Cassandra, maybe a simple approach would be to add a Final version attribute to all model entities (a simple monotonic integer).
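A minimal sketch of what such a version attribute could look like. The class `VersionedOrigin`, its `url` field and the `to_dict`/`from_dict` helpers are hypothetical names for illustration, not the actual model classes, and `ClassVar` is used here as an assumption about how the attribute would be declared:

```python
from dataclasses import dataclass
from typing import Any, ClassVar, Dict


@dataclass(frozen=True)
class VersionedOrigin:
    """Hypothetical model entity carrying a schema version."""

    # Monotonic integer, bumped on every incompatible change to the entity.
    version: ClassVar[int] = 1

    url: str

    def to_dict(self) -> Dict[str, Any]:
        # The version travels with the serialized object, so any backend
        # (postgres, Cassandra, the journal) can tell which schema wrote it.
        return {"version": self.version, "url": self.url}

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "VersionedOrigin":
        # Objects written before versioning existed are treated as version 0.
        found = d.get("version", 0)
        if found > cls.version:
            raise ValueError(
                f"object has version {found}, but this code only knows {cls.version}"
            )
        # A real migration step for found < cls.version would go here.
        return cls(url=d["url"])
```

Readers can then refuse objects written by a newer schema and decide, per version, whether older objects need migrating.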
Oct 8 2020
Sep 22 2020
Thanks to @seirl using the full journal to do a graph export (and therefore having the time to check whether all objects were there), we've found a bunch of bugs in the journal backfiller / configuration that prevented large objects from being added.
After resetting a local consumer to these offsets, I was completely unable to reproduce this issue.
(the backfill had, in fact, completed within a month)
At this point, I don't think we'll make it much better with postgres as source.
Sep 14 2020
Aug 27 2020
Aug 26 2020
Jul 31 2020
Jul 30 2020
Jul 29 2020
Jul 20 2020
Jul 1 2020
Jun 29 2020
Jun 17 2020
I suspect this hasn't happened recently
Jun 9 2020
Build is green
- Rework commit message
- Reuse same string format to reduce diff stat change
Build is green
Use pprint_key function as suggested in irc ;)
Build is green
Fix according to review \m/
Checking for a dict instance seems completely arbitrary, and doesn't catch other non-hashable types.
I suggest this instead:
    try:
        key_str = hash_to_hex(key)
    except TypeError:
        key_str = repr(key)
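As a runnable sketch of that suggestion: the `hash_to_hex` below is a local stand-in written for this example (assumed to behave like the real helper for bytes and str input), and the body of `pprint_key` is an assumption about how the function mentioned elsewhere in this thread might look.

```python
import binascii


def hash_to_hex(hash_):
    """Stand-in for the real helper: hex-encode bytes, pass str through."""
    if isinstance(hash_, str):
        return hash_
    # binascii.hexlify raises TypeError for anything that is not bytes-like.
    return binascii.hexlify(hash_).decode("ascii")


def pprint_key(key):
    """Return a printable form of a journal key, whatever its type."""
    try:
        return hash_to_hex(key)
    except TypeError:
        # dicts, tuples, ints, None, ...: fall back to repr() instead of
        # special-casing one particular type.
        return repr(key)
```

The point of catching `TypeError` is that the fallback covers every key type the hex conversion rejects, rather than only dicts.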
May 13 2020
Build is green
In D3151#76517, @ardumont wrote: Remove global var and put that definition local to its use
Build is green
- Drop no longer required setUp method
- Specify the deinitialize/initialize in comments
Build is green
Fix tests (in the end, reset issue)
Build has FAILED
Keep initial assertions order
Build has FAILED
May 5 2020
Build is green
Adapt according to review
In D3122#75850, @vlorentz wrote: Why was get_journal_client removed from swh.journal.cli?
All the code you're adding in swh/search/cli.py should be in swh-journal, so it can be used by other CLIs using a journal client
May 4 2020
Apr 30 2020
Let's consider this done now.
Apr 29 2020
Apr 28 2020
We've bumped the max message size to 100 MB in all producers.
The kafka producer in swh.journal now reads message receipts and fails if they're negative, or if they didn't arrive within two minutes.
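To illustrate the receipt check described above, here is a library-free sketch. `DeliveryTracker`, `DeliveryError` and their methods are hypothetical names made up for this example; this is not the actual swh.journal code, only the bookkeeping such a check needs.

```python
import time


class DeliveryError(Exception):
    """Raised when a message was rejected or never acknowledged."""


class DeliveryTracker:
    """Hypothetical bookkeeping for producer delivery receipts."""

    def __init__(self, timeout=120.0, clock=time.monotonic):
        self.timeout = timeout  # seconds to wait for a receipt (two minutes)
        self.clock = clock      # injectable so the logic can be tested
        self.pending = {}       # message key -> time it was sent
        self.failed = []        # (key, error) pairs for negative receipts

    def sent(self, key):
        self.pending[key] = self.clock()

    def on_delivery(self, key, error=None):
        # Meant to be called from the producer's delivery callback.
        self.pending.pop(key, None)
        if error is not None:
            self.failed.append((key, error))

    def check(self):
        if self.failed:
            raise DeliveryError(f"negative receipts: {self.failed!r}")
        now = self.clock()
        late = [k for k, t in self.pending.items() if now - t > self.timeout]
        if late:
            raise DeliveryError(f"no receipt within {self.timeout}s for: {late!r}")
```

Failing loudly on either condition is what makes silently dropped objects, such as oversized messages, visible at write time.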
Snapshots, releases, revisions and directories have now been completely backfilled, and no objects of these types are (known to be) missing from the kafka cluster on azure.
Apr 24 2020
Apr 23 2020
Apr 22 2020
Apr 17 2020
Backfilled objects:
- snapshot
- release
Apr 15 2020
I've pulled the list of objects from kafka using @seirl's graph export. I'm now working on diffing postgres against that list of objects.
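One way such a diff could be computed without loading both lists in memory, assuming both sides can be streamed as sorted object ids. The function `missing_ids` and its arguments are hypothetical; this is a sketch, not the script actually used.

```python
def missing_ids(source_ids, journal_ids):
    """Yield ids found in source_ids but absent from journal_ids.

    Both inputs must be iterables of ids sorted in ascending order,
    e.g. an ordered query on the postgres side and a sorted export
    on the kafka side.
    """
    journal_iter = iter(journal_ids)
    current = next(journal_iter, None)
    for obj_id in source_ids:
        # Advance the journal side until it catches up with obj_id.
        while current is not None and current < obj_id:
            current = next(journal_iter, None)
        if current != obj_id:
            yield obj_id
```

For example, `list(missing_ids([b"a", b"b", b"d"], [b"b", b"c"]))` gives the ids that would still need backfilling.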
rDJNL7ff372a02de4 has now been deployed to production