lgtm
All Stories
Sep 27 2021
Good idea. Thanks.
I don't see why (I'm guessing for simplification), but ok
Why?
First listing and loading done in production.
I'll update the archive changelog about this.
Draft analysis [1]
tl;dr: So far so good, the staging workers are reliably finishing their ingestion (no hash mismatch)
with their patched dulwich.
Looks fine to me, but it needs some extensive tests indeed.
Sep 25 2021
I've opened a PR with the patch initially proposed by val (I patched the tests so that the dulwich CI is green as well).
Sep 24 2021
By the way, this morning we discussed with david and zack that it'd be great if that were made dynamic.
No idea how to fetch the correct logo though.
In D6273#164561, @aeviso wrote: However, I would prefer not to squash this together with D6272, since moving those commits together in the git history is really painful in terms of conflict resolution (I even end up with empty files that later reappear).
There is no harm in keeping ProvenanceStorageRPC until ProvenanceStorageRabbitMQ has landed.
In D6165#164547, @olasd wrote: Thanks for this massive implementation work!
I still want to do a deeper dive in this code (and give others the chance to do so), but I think that before that, and now that bugs and wrinkles have been ironed out and this code seems to be working, we need a large pass of updating the docstrings to describe the actual behavior of the code.
I expect a lot of this is present inside the hedgedoc document, so you should try to land it as documentation at the same time as this code.
When reading this diff, I would like to find the following:
- a description of all threads and subprocesses (on the client and server side), as well as their associated workflows (who does what)
- a description of how RabbitMQ queues and exchanges are handled (the request queues, the response queues, the way the acknowledgements are managed)
- a description of how objects are serialised to be passed on to the queues
- a description of what queues feed to what server processes, and how the messages are "bundled" before being sent to the database
- a list of "tunables" (number of queues, batch sizes, timeouts, etc.) to watch out for
I would suggest documenting the "lifecycle" of the client and server threads/processes, for instance by writing a summarised list of all the methods that are called in sequence, on initialization of the classes, with how the callbacks mesh together.
When this lifecycle doc is available (centrally), I think most of the "boilerplate" documentation that's been pulled from the pika example code can go away (with a shorter reference to the full lifecycle documentation).
@douardda here we go ^ (3 new instances in the screenshot, including the logilab one and heptapod)
Build is green
rebase
In D6273#164555, @olasd wrote: I would suggest squashing D6272 and this together to land them at the same time.
I think you can remove types-werkzeug from requirements-test.txt. I'm not sure you can drop the http extra from swh.core dependencies in requirements-swh.txt, as the serialization/deserialization scaffolding is still in use in the rabbitmq backend.
I would suggest squashing D6272 and this together to land them at the same time.
Thanks for this massive implementation work!
It looks ok for the puppet code
Build is green
Update:
- address most of @vlorentz comments
- use a DFS instead of a BFS to walk the commit graph
- improve some comments
- add missing docstring for new test suite
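
To illustrate the DFS-versus-BFS point above, here is a minimal sketch of an iterative depth-first walk over a commit graph. The function name `walk_commits_dfs` and the `parents` mapping (commit id to list of parent ids) are hypothetical and only stand in for whatever structures the diff actually uses.

```python
from typing import Dict, Iterator, List


def walk_commits_dfs(head: str, parents: Dict[str, List[str]]) -> Iterator[str]:
    """Yield each commit reachable from `head` once, in depth-first order.

    `parents` is a hypothetical mapping from a commit id to its parent ids.
    """
    stack = [head]
    seen = set()
    while stack:
        commit = stack.pop()  # LIFO: a BFS would pop from the front instead
        if commit in seen:
            continue
        seen.add(commit)
        yield commit
        # Push parents in reverse so that the first parent is visited first.
        stack.extend(reversed(parents.get(commit, [])))


# Small diamond-shaped history: D has parents B and C, both descend from A.
history = {"D": ["B", "C"], "B": ["A"], "C": ["A"], "A": []}
order = list(walk_commits_dfs("D", history))
```

With a depth-first walk, one branch is followed down to the root before the sibling branch is visited; a breadth-first walk would visit both parents of `D` before reaching `A`.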
In D6334#164535, @olasd wrote: Thanks!
I still think that the postgres and mongodb close methods on ProvenanceStorage instances should be shutting down their respective database connections.
I remember that you didn't want to do that because the database connection is currently passed to the class already opened, which is at least consistent.
However, would it make sense to instead have the storage classes take connection parameters and handle connecting to the database themselves (and therefore having their close methods close the database connections)?
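
The suggestion above (the storage takes connection parameters, connects by itself, and closes the connection in `close`) could look roughly like the following sketch. The class name `OwnedConnectionStorage` is hypothetical, and the standard-library `sqlite3` module is used purely as a stand-in for the postgres or mongodb driver so that the example runs on its own; it is not the actual ProvenanceStorage code.

```python
import sqlite3


class OwnedConnectionStorage:
    """Hypothetical storage that owns the lifecycle of its database connection."""

    def __init__(self, **conn_params):
        # Only the parameters are stored; nothing is opened yet.
        self._conn_params = conn_params
        self._conn = None

    def open(self):
        if self._conn is None:
            # Stand-in for the real driver's connect call.
            self._conn = sqlite3.connect(**self._conn_params)

    def close(self):
        # Since the storage opened the connection, it is also the one closing it.
        if self._conn is not None:
            self._conn.close()
            self._conn = None

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()


storage = OwnedConnectionStorage(database=":memory:")
with storage:
    row = storage._conn.execute("SELECT 1").fetchone()
```

With this shape, callers never hold a raw connection, so `close` can shut it down without affecting anyone else.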
I still think that the postgres and mongodb close methods on ProvenanceStorage instances should be shutting down their respective database connections.
Looks good to me.
LGTM overall
Build is green
The blog post was indeed published today:
https://www.softwareheritage.org/2021/09/24/building-object-storage-swh/
Use types-psycopg2 instead of ignoring it in mypy.ini
Build is green
Adapt docstring to make explicit what's said in the commit/diff message/description
Build is green
Use right commit range
Yes, thanks. I agree it would simplify setup in container solutions.
Build is green
Drop unnecessary changes
Build is green
Rework commit message, drop spurious changes.
You may use fcntl.flock for this
I mean using an empty (lock) file in the opam_root directory.
See also 3rd party libraries like https://pypi.org/project/filelock/
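
A minimal sketch of the `fcntl.flock` approach suggested above, assuming a POSIX system. The helper name `opam_root_lock` and the lock file name `.lock` are hypothetical; only the idea of an empty lock file inside the opam_root directory comes from the comment.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def opam_root_lock(opam_root, lock_name=".lock"):
    """Hold an exclusive advisory lock on an empty file inside `opam_root`.

    `lock_name` is a hypothetical file name chosen for this sketch.
    """
    os.makedirs(opam_root, exist_ok=True)
    lock_path = os.path.join(opam_root, lock_name)
    with open(lock_path, "w") as lock_file:
        # Blocks until no other holder has the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield lock_path
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Any code that touches the shared directory would then run inside `with opam_root_lock(path):`. The lock is advisory, so it only protects against other processes that take the same lock.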
Build is green
Build is green
Build is green
rebase
rebase
rebase
Build is green
- Add new RabbitMQ-based client/server API
- Rework ProvenanceStorageRabbitMQWorker to handle connection loss
- Improve server/client shutdown logic and error handling
Build is green
Fix tests