
Sep 27 2021

vsellier added a revision to T3408: Provide read-only access to production servers: D6350: service urls: Fix the public url of the staging brocker.
Sep 27 2021, 10:57 AM · System administration
ardumont added inline comments to D6348: Clarify local/remote heads type as those are hexadecimal bytes str.
Sep 27 2021, 10:55 AM
ardumont accepted D6342: loader: Add support for dumb HTTP transfer protocol.

lgtm

Sep 27 2021, 10:51 AM
ardumont added 1 blocking reviewer(s) for D6342: loader: Add support for dumb HTTP transfer protocol: Reviewers.
Sep 27 2021, 10:51 AM
ardumont added a revision to T3568: Deploy opam lister/loader to production: D6349: Update archive changelog about the opam.ocaml.org instance.
Sep 27 2021, 10:49 AM · System administration, Archive coverage, Opam
ardumont requested review of D6349: Update archive changelog about the opam.ocaml.org instance.
Sep 27 2021, 10:49 AM
ardumont added inline comments to D6348: Clarify local/remote heads type as those are hexadecimal bytes str.
Sep 27 2021, 10:46 AM
ardumont accepted D6345: docker: do not override the DJANGO_SETTINGS_MODULE in swh-web/entrypoint.sh.

Good idea. Thanks.

Sep 27 2021, 10:44 AM
vlorentz accepted D6346: docker: use dsn connection string in web.yml.

I don't see why (I'm guessing for simplification), but ok

Sep 27 2021, 10:39 AM
vlorentz added a comment to D6347: docker: use a dedicated container for memcached.

Why?

Sep 27 2021, 10:37 AM
vlorentz requested changes to D6348: Clarify local/remote heads type as those are hexadecimal bytes str.
Sep 27 2021, 10:36 AM
vlorentz added inline comments to D6341: model: Replace attrs-strict with stricter validation.
Sep 27 2021, 10:28 AM
ardumont requested review of D6348: Clarify local/remote heads type as those are hexadecimal bytes str.
Sep 27 2021, 10:26 AM
douardda requested review of D6347: docker: use a dedicated container for memcached.
Sep 27 2021, 10:13 AM
douardda requested review of D6346: docker: use dsn connection string in web.yml.
Sep 27 2021, 10:12 AM
ardumont updated subscribers of T3568: Deploy opam lister/loader to production.

First listing and loading done in production.
I'll update the archive changelog about this.

Sep 27 2021, 10:11 AM · System administration, Archive coverage, Opam
douardda requested review of D6345: docker: do not override the DJANGO_SETTINGS_MODULE in swh-web/entrypoint.sh.
Sep 27 2021, 10:11 AM
ardumont added a comment to T3457: Some git repositories are failing to be ingested because of MemoryError.

Draft analysis [1]
tl;dr: So far so good. The staging workers reliably finish their ingestion
(no hash mismatch) with their patched dulwich.

Sep 27 2021, 10:04 AM · Git loader
ardumont created P1176 Patching dulwich to decrease memory footprint.
Sep 27 2021, 10:01 AM
douardda added a comment to D6341: model: Replace attrs-strict with stricter validation.

Looks fine to me, but it needs some extensive tests indeed.

Sep 27 2021, 9:42 AM

Sep 25 2021

ardumont changed the status of T3457: Some git repositories are failing to be ingested because of MemoryError from Open to Work in Progress.
Sep 25 2021, 4:10 PM · Git loader
ardumont added a comment to T3457: Some git repositories are failing to be ingested because of MemoryError.

I've opened a PR with the proposed patch initially done by val (I patched the tests so that the dulwich CI is green as well).

Sep 25 2021, 4:10 PM · Git loader

Sep 24 2021

anlambert added a comment to D6343: misc/coverage: Add heptapod to listed origins.

By the way, this morning we discussed with david and zack that it would be great to make this dynamic.
No idea how to fetch the correct logo though.

Sep 24 2021, 6:56 PM
ardumont committed rSPSITEdd7d8dc8c291: Deploy opam lister service to production (authored by ardumont).
Deploy opam lister service to production
Sep 24 2021, 6:06 PM
ardumont committed rSPSITEdecc82873fb2: Deploy opam loader service (authored by ardumont).
Deploy opam loader service
Sep 24 2021, 5:42 PM
olasd added inline comments to D6341: model: Replace attrs-strict with stricter validation.
Sep 24 2021, 5:38 PM
olasd added a comment to D6273: Remove remote storage based on `swh.core.api.RPCClient`.

However, I would prefer not to squash this together with D6272, since moving those commits together in the git history is a real pain in terms of conflict resolution (I even end up with empty files that later reappear).
There is no harm in keeping ProvenanceStorageRPC until ProvenanceStorageRabbitMQ has landed.

Sep 24 2021, 5:37 PM
aeviso added a comment to D6165: Add new RabbitMQ-based client/server API.
In D6165#164547, @olasd wrote:

Thanks for this massive implementation work!

I still want to do a deeper dive in this code (and give others the chance to do so), but I think that before that, and now that bugs and wrinkles have been ironed out and this code seems to be working, we need a large pass of updating the docstrings to describe the actual behavior of the code.

I expect a lot of this is present inside the hedgedoc document, so you should try to land it as documentation at the same time as this code.

When reading this diff, I would like to find the following:

  • a description of all threads and subprocesses (on the client and server side), as well as their associated workflows (who does what)
  • a description of how RabbitMQ queues and exchanges are handled (the request queues, the response queues, the way the acknowledgements are managed)
  • a description of how objects are serialised to be passed on to the queues
  • a description of what queues feed to what server processes, and how the messages are "bundled" before being sent to the database
  • a list of "tunables" (number of queues, batch sizes, timeouts, etc.) to watch out for

I would suggest documenting the "lifecycle" of the client and server threads/processes, for instance by writing a summarised list of all the methods that are called in sequence, on initialization of the classes, with how the callbacks mesh together.

When this lifecycle doc is available (centrally), I think most of the "boilerplate" documentation that's been pulled from the pika example code can go away (with a shorter reference to the full lifecycle documentation).

Sep 24 2021, 5:04 PM
ardumont accepted D6343: misc/coverage: Add heptapod to listed origins.

@douardda here we go ^ (3 new instances in the screenshot, including the logilab one and heptapod)

Sep 24 2021, 5:03 PM
swh-public-ci added a comment to D6273: Remove remote storage based on `swh.core.api.RPCClient`.

Build is green

Sep 24 2021, 5:02 PM
aeviso updated the diff for D6273: Remove remote storage based on `swh.core.api.RPCClient`.

rebase

Sep 24 2021, 4:57 PM
aeviso added a comment to D6273: Remove remote storage based on `swh.core.api.RPCClient`.
In D6273#164555, @olasd wrote:

I would suggest squashing D6272 and this together to land them at the same time.

I think you can remove types-werkzeug from requirements-test.txt. I'm not sure you can drop the http extra from swh.core dependencies in requirements-swh.txt, as the serialization/deserialization scaffolding is still in use in the rabbitmq backend.

Sep 24 2021, 4:56 PM
anlambert requested review of D6343: misc/coverage: Add heptapod to listed origins.
Sep 24 2021, 4:43 PM
olasd added a comment to D6273: Remove remote storage based on `swh.core.api.RPCClient`.

I would suggest squashing D6272 and this together to land them at the same time.

Sep 24 2021, 4:30 PM
olasd added a comment to D6165: Add new RabbitMQ-based client/server API.

Thanks for this massive implementation work!

Sep 24 2021, 4:25 PM
vsellier accepted D6305: opam: Install and maintain up-to-date shared opam root directories.

It looks ok for the puppet code

Sep 24 2021, 4:15 PM
ardumont created P1175 version ordering in opam is a bit specific.
Sep 24 2021, 4:14 PM
swh-public-ci added a comment to D6342: loader: Add support for dumb HTTP transfer protocol.

Build is green

Sep 24 2021, 4:03 PM
anlambert updated the diff for D6342: loader: Add support for dumb HTTP transfer protocol.

Update:

  • address most of @vlorentz comments
  • use a DFS to walk on the commits graph instead of a BFS
  • improve some comments
  • add missing docstring for new test suite
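
As an illustration of the DFS point above, here is a minimal sketch of an iterative depth-first walk over a commit graph. It is not the loader's actual code: the function name `walk_commits_dfs` and the `get_parents` callback are hypothetical, and commits are represented as plain strings.

```python
from typing import Callable, Hashable, Iterable, Iterator, Set


def walk_commits_dfs(
    heads: Iterable[Hashable],
    get_parents: Callable[[Hashable], Iterable[Hashable]],
) -> Iterator[Hashable]:
    """Yield each commit reachable from ``heads`` once, depth first.

    ``get_parents`` is a hypothetical callback returning the parent ids
    of a commit; a real loader would fetch and parse the commit object.
    """
    seen: Set[Hashable] = set()
    stack = list(heads)  # LIFO stack gives DFS; a FIFO queue would give BFS
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue  # merge commits can reach the same ancestor twice
        seen.add(commit)
        yield commit
        stack.extend(get_parents(commit))
```

With a stack, the walk follows one line of history to its root before backtracking, so the pending set tends to stay small on long, mostly linear histories.
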
Sep 24 2021, 4:01 PM
aeviso added a comment to D6334: Add `close` method to both `ProvenanceInterface` and `ProvenanceStorageInterface`.
In D6334#164535, @olasd wrote:

Thanks!

I still think that the postgres and mongodb close methods on ProvenanceStorage instances should be shutting down their respective database connections.

I remember that you didn't want to do that because currently the database connection is passed to the class opened already, which is at least consistent.

However, would it make sense to instead have the storage classes take connection parameters and handle connecting to the database themselves (and therefore having their close methods close the database connections)?
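
A minimal sketch of that alternative, assuming a storage class that receives connection parameters and owns the connection it opens. The class name `OwnedConnectionStorage` is hypothetical, and `sqlite3` stands in for the postgres and mongodb drivers only to keep the example self-contained.

```python
import sqlite3
from typing import Any, Optional


class OwnedConnectionStorage:
    """Hypothetical storage that connects by itself and closes what it opened."""

    def __init__(self, **conn_args: Any) -> None:
        # Only the parameters are stored; no connection is passed in.
        self._conn_args = conn_args
        self._conn: Optional[sqlite3.Connection] = None

    def open(self) -> None:
        if self._conn is None:
            self._conn = sqlite3.connect(**self._conn_args)

    def close(self) -> None:
        # The storage created the connection, so it may safely shut it down.
        if self._conn is not None:
            self._conn.close()
            self._conn = None

    def __enter__(self) -> "OwnedConnectionStorage":
        self.open()
        return self

    def __exit__(self, *exc_info: Any) -> None:
        self.close()
```

Because the class owns the connection, `close` has an unambiguous meaning, and callers can rely on `with OwnedConnectionStorage(database=":memory:") as storage:` to release it.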

Sep 24 2021, 3:44 PM
anlambert added inline comments to D6342: loader: Add support for dumb HTTP transfer protocol.
Sep 24 2021, 3:36 PM
olasd added a comment to D6334: Add `close` method to both `ProvenanceInterface` and `ProvenanceStorageInterface`.

I still think that the postgres and mongodb close methods on ProvenanceStorage instances should be shutting down their respective database connections.

Sep 24 2021, 3:35 PM
douardda closed D6336: Naive attempt to add support for dsn url config style for production db.
Sep 24 2021, 3:33 PM
douardda committed rDWAPPSdd6dde3e44cd: Naive attempt to add support for dsn url config style for production db (authored by douardda).
Naive attempt to add support for dsn url config style for production db
Sep 24 2021, 3:33 PM
douardda closed D6335: Wrap long lines in the README file.
Sep 24 2021, 3:33 PM
douardda committed rDWAPPS15b0e84456ae: Wrap long lines in the README file (authored by douardda).
Wrap long lines in the README file
Sep 24 2021, 3:33 PM
anlambert accepted D6336: Naive attempt to add support for dsn url config style for production db.

Looks good to me.

Sep 24 2021, 3:33 PM
vlorentz added inline comments to D6342: loader: Add support for dumb HTTP transfer protocol.
Sep 24 2021, 3:33 PM
anlambert added inline comments to D6342: loader: Add support for dumb HTTP transfer protocol.
Sep 24 2021, 3:24 PM
vlorentz added a parent task for T3594: Faithfully store weird git objects: T3552: Fix corrupted releases, revisions, and directories in the storage.
Sep 24 2021, 3:13 PM · meta-task, Data Model, Storage manager
vlorentz added a subtask for T3552: Fix corrupted releases, revisions, and directories in the storage: T3594: Faithfully store weird git objects.
Sep 24 2021, 3:13 PM · Storage manager
vlorentz added a comment to D6342: loader: Add support for dumb HTTP transfer protocol.

LGTM overall

Sep 24 2021, 3:08 PM
anlambert added inline comments to D6342: loader: Add support for dumb HTTP transfer protocol.
Sep 24 2021, 3:03 PM
vlorentz added inline comments to D6342: loader: Add support for dumb HTTP transfer protocol.
Sep 24 2021, 2:54 PM
anlambert updated the test plan for D6342: loader: Add support for dumb HTTP transfer protocol.
Sep 24 2021, 2:51 PM
anlambert requested review of D6342: loader: Add support for dumb HTTP transfer protocol.
Sep 24 2021, 2:44 PM
anlambert added a revision to T2489: Git origin without smart transfer protocol support cannot be loaded: D6342: loader: Add support for dumb HTTP transfer protocol.
Sep 24 2021, 2:42 PM · Git loader
swh-public-ci added a comment to D6336: Naive attempt to add support for dsn url config style for production db.

Build is green

Sep 24 2021, 2:29 PM
vlorentz requested review of D6341: model: Replace attrs-strict with stricter validation.
Sep 24 2021, 2:28 PM
vlorentz closed D6338: persistent-identifiers.rst: Update references to manifest formats.
Sep 24 2021, 2:25 PM
vlorentz committed rDMODe30eb7d29170: persistent-identifiers.rst: Update references to manifest formats (authored by vlorentz).
persistent-identifiers.rst: Update references to manifest formats
Sep 24 2021, 2:25 PM
marla.dasilva closed T3536: Blog post Easter Eggs NLNet as Resolved.
Sep 24 2021, 2:23 PM · Unknown Object (Project)
marla.dasilva added a comment to T3536: Blog post Easter Eggs NLNet.

The blog post was published today:
https://www.softwareheritage.org/2021/09/24/building-object-storage-swh/

Sep 24 2021, 2:22 PM · Unknown Object (Project)
douardda updated the diff for D6336: Naive attempt to add support for dsn url config style for production db.

use types-psycopg2 instead of ignoring it in mypy.ini

Sep 24 2021, 2:14 PM
ardumont added inline comments to D6336: Naive attempt to add support for dsn url config style for production db.
Sep 24 2021, 2:13 PM
ardumont updated the summary of D6305: opam: Install and maintain up-to-date shared opam root directories.
Sep 24 2021, 2:12 PM
douardda added inline comments to D6336: Naive attempt to add support for dsn url config style for production db.
Sep 24 2021, 2:10 PM
ardumont added inline comments to D6336: Naive attempt to add support for dsn url config style for production db.
Sep 24 2021, 2:09 PM
douardda added inline comments to D6336: Naive attempt to add support for dsn url config style for production db.
Sep 24 2021, 2:08 PM
ardumont retitled D6318: opam: Allow shared state between loader runs using multi-instance opam root from loader-opam: Allow shared state between loader runs using multi-instance opam root to opam: Allow shared state between loader runs using multi-instance opam root.
Sep 24 2021, 2:08 PM
swh-public-ci added a comment to D6340: opam: Define a initialize_opam_root parameter for opam loader.

Build is green

Sep 24 2021, 2:07 PM
ardumont updated the diff for D6340: opam: Define a initialize_opam_root parameter for opam loader.

Adapt docstring to make explicit what's said in the commit/diff message/description

Sep 24 2021, 2:05 PM
swh-public-ci added a comment to D6340: opam: Define a initialize_opam_root parameter for opam loader.

Build is green

Sep 24 2021, 2:03 PM
ardumont updated the diff for D6340: opam: Define a initialize_opam_root parameter for opam loader.

Use right commit range

Sep 24 2021, 2:00 PM
ardumont accepted D6336: Naive attempt to add support for dsn url config style for production db.

Yes, thanks. I agree it would simplify setup in container solutions.

Sep 24 2021, 1:59 PM
ardumont added inline comments to D6336: Naive attempt to add support for dsn url config style for production db.
Sep 24 2021, 1:59 PM
swh-public-ci added a comment to D6340: opam: Define a initialize_opam_root parameter for opam loader.

Build is green

Sep 24 2021, 1:58 PM
ardumont updated the diff for D6340: opam: Define a initialize_opam_root parameter for opam loader.

Drop unnecessary changes

Sep 24 2021, 1:56 PM
swh-public-ci added a comment to D6340: opam: Define a initialize_opam_root parameter for opam loader.

Build is green

Sep 24 2021, 1:53 PM
ardumont updated the diff for D6340: opam: Define a initialize_opam_root parameter for opam loader.

Rework commit message, drop spurious changes.

Sep 24 2021, 1:51 PM
ardumont updated the summary of D6340: opam: Define a initialize_opam_root parameter for opam loader.
Sep 24 2021, 1:48 PM
ardumont requested review of D6340: opam: Define a initialize_opam_root parameter for opam loader.
Sep 24 2021, 1:41 PM
ardumont added a revision to T3590: opam loader: Ensure required opam state is shared amongst ingestion/listing runs: D6340: opam: Define a initialize_opam_root parameter for opam loader.
Sep 24 2021, 1:39 PM · Archive coverage, Opam
ardumont added a comment to D6316: opam: Share opam root directory even on multiple instances.

You may use fcntl.flock for this

I mean using an empty (lock) file in the opam_root directory.

See also 3rd party libraries like https://pypi.org/project/filelock/
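
A minimal sketch of the lock-file idea, assuming a POSIX system (`fcntl` is not available on Windows). The helper name `opam_root_lock` and the lock file name `.lock` are hypothetical, not taken from the diff.

```python
import fcntl
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def opam_root_lock(opam_root: str, lock_name: str = ".lock") -> Iterator[str]:
    """Hold an exclusive advisory lock on an empty file inside ``opam_root``."""
    os.makedirs(opam_root, exist_ok=True)
    lock_path = os.path.join(opam_root, lock_name)
    with open(lock_path, "w") as lock_file:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield lock_path
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Code that initializes or updates the shared opam root would run inside `with opam_root_lock(opam_root):`. The lock is advisory, so it only protects against other processes that take the same lock.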

Sep 24 2021, 1:25 PM
ardumont accepted D6335: Wrap long lines in the README file.

Thanks a lot.

Sep 24 2021, 1:21 PM
zack accepted D6338: persistent-identifiers.rst: Update references to manifest formats.
Sep 24 2021, 1:02 PM
swh-public-ci added a comment to D6339: Add support for remote backend on existing storage tests.

Build is green

Sep 24 2021, 12:33 PM
swh-public-ci added a comment to D6273: Remove remote storage based on `swh.core.api.RPCClient`.

Build is green

Sep 24 2021, 12:33 PM
swh-public-ci added a comment to D6165: Add new RabbitMQ-based client/server API.

Build is green

Sep 24 2021, 12:29 PM
aeviso updated the diff for D6273: Remove remote storage based on `swh.core.api.RPCClient`.

rebase

Sep 24 2021, 12:27 PM
aeviso updated the diff for D6339: Add support for remote backend on existing storage tests.

rebase

Sep 24 2021, 12:26 PM
aeviso updated the diff for D6165: Add new RabbitMQ-based client/server API.

rebase

Sep 24 2021, 12:25 PM
aeviso requested review of D6339: Add support for remote backend on existing storage tests.
Sep 24 2021, 12:25 PM
swh-public-ci added a comment to D6165: Add new RabbitMQ-based client/server API.

Build is green

Sep 24 2021, 12:21 PM
douardda requested review of D6336: Naive attempt to add support for dsn url config style for production db.
Sep 24 2021, 12:18 PM
aeviso updated the diff for D6165: Add new RabbitMQ-based client/server API.
  • Add new RabbitMQ-based client/server API
  • Rework ProvenanceStorageRabbitMQWorker to handle connection loss
  • Improve server/client shutdown logic and error handling
Sep 24 2021, 12:18 PM
douardda requested review of D6335: Wrap long lines in the README file.
Sep 24 2021, 12:18 PM
vlorentz requested review of D6338: persistent-identifiers.rst: Update references to manifest formats.
Sep 24 2021, 12:11 PM
swh-public-ci added a comment to D6316: opam: Share opam root directory even on multiple instances.

Build is green

Sep 24 2021, 11:58 AM
ardumont updated the diff for D6316: opam: Share opam root directory even on multiple instances.

Fix tests

Sep 24 2021, 11:55 AM