Aug 4 2022

ardumont added a revision to T4371: Deploy swh-scrubber on all storage instances: D8181: scrubber: Make service parametric on the db instance to scrub.
Aug 4 2022, 3:03 PM · System administration, Archive integrity, Storage manager
ardumont moved T4371: Deploy swh-scrubber on all storage instances from Backlog to Weekly backlog on the System administration board.
Aug 4 2022, 11:34 AM · System administration, Archive integrity, Storage manager

Jul 13 2022

vsellier added a comment to T4373: [cassandra] Test the new hardware.

Unfortunately, the operator test failed because the operator does not offer enough configuration options:

  • non-blocker: the init containers are OOM-killed during startup; this can be solved by editing the cassandra statefulset created by the operator to raise the limits
  • blocker: it's not possible to configure the commitlog_directory explicitly; it defaults to /var/lib/cassandra/commitlog
    • it's not easy to propagate the host mounts to use 2 mountpoints, /srv/cassandra and /srv/cassandra/commitlog, without tweaking the kernel / rancher configuration
    • it's not possible to add a second volume to the pod description created by the operator
Jul 13 2022, 10:25 AM · Storage manager, System administration
vlorentz added a subtask for T3841: regularly scrub all the data stores of swh: Restricted Maniphest Task.
Jul 13 2022, 9:40 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager

Jul 12 2022

vsellier added a revision to T4373: [cassandra] Test the new hardware: D8116: Deploy the cassandra operator on the production cassandra cluster.
Jul 12 2022, 3:01 PM · Storage manager, System administration
vsellier changed the status of T4374: [cassandra] Test basic topology, a subtask of T4379: [cassandra] create etcd / controlplane servers, from Open to Work in Progress.
Jul 12 2022, 12:10 PM · Storage manager, System administration
vsellier changed the status of T4374: [cassandra] Test basic topology from Open to Work in Progress.
Jul 12 2022, 12:10 PM · Storage manager, System administration
vsellier closed T4389: [cassandra] Configure the monitoring of the cluster, a subtask of T4373: [cassandra] Test the new hardware, as Resolved.
Jul 12 2022, 12:09 PM · Storage manager, System administration

Jul 11 2022

vsellier closed T4379: [cassandra] create etcd / controlplane servers as Resolved.

Finally, the cluster is up.
I'm not sure what unblocked the node registration, but I suspect a node with all the roles is needed to bootstrap the cluster.
I tried this initially and it didn't work, but I'm not sure what state the cluster was in at the time.

Jul 11 2022, 4:33 PM · Storage manager, System administration
vsellier closed T4379: [cassandra] create etcd / controlplane servers, a subtask of T4373: [cassandra] Test the new hardware, as Resolved.
Jul 11 2022, 4:33 PM · Storage manager, System administration
vsellier added a revision to T4373: [cassandra] Test the new hardware: D8105: Install zfs and docker on the cassandra node to prepare the cass operator tests.
Jul 11 2022, 9:33 AM · Storage manager, System administration

Jul 7 2022

vsellier added a comment to T4379: [cassandra] create etcd / controlplane servers.

The management nodes were created correctly, but it seems rancher is having some issues registering them in the cluster.

Jul 7 2022, 6:52 PM · Storage manager, System administration
vsellier added a revision to T4379: [cassandra] create etcd / controlplane servers: D8094: Declare the kubernetes cluster and management nodes for cassandra.
Jul 7 2022, 12:11 PM · Storage manager, System administration
vsellier changed the status of T4379: [cassandra] create etcd / controlplane servers, a subtask of T4373: [cassandra] Test the new hardware, from Open to Work in Progress.
Jul 7 2022, 11:56 AM · Storage manager, System administration
vsellier changed the status of T4379: [cassandra] create etcd / controlplane servers from Open to Work in Progress.
Jul 7 2022, 11:56 AM · Storage manager, System administration

Jul 5 2022

vsellier removed a parent task for T4374: [cassandra] Test basic topology: T4373: [cassandra] Test the new hardware.
Jul 5 2022, 5:50 PM · Storage manager, System administration
vsellier removed a subtask for T4373: [cassandra] Test the new hardware: T4374: [cassandra] Test basic topology.
Jul 5 2022, 5:50 PM · Storage manager, System administration
vsellier removed a subtask for T4373: [cassandra] Test the new hardware: T4375: [cassandra] One cassandra per data disk.
Jul 5 2022, 5:50 PM · Storage manager, System administration
vsellier removed a parent task for T4375: [cassandra] One cassandra per data disk: T4373: [cassandra] Test the new hardware.
Jul 5 2022, 5:50 PM · Storage manager, System administration
vsellier added a parent task for T4374: [cassandra] Test basic topology: T4379: [cassandra] create etcd / controlplane servers.
Jul 5 2022, 5:49 PM · Storage manager, System administration
vsellier added a subtask for T4379: [cassandra] create etcd / controlplane servers: T4374: [cassandra] Test basic topology.
Jul 5 2022, 5:49 PM · Storage manager, System administration
vsellier added a parent task for T4375: [cassandra] One cassandra per data disk: T4379: [cassandra] create etcd / controlplane servers.
Jul 5 2022, 5:49 PM · Storage manager, System administration
vsellier added a subtask for T4379: [cassandra] create etcd / controlplane servers: T4375: [cassandra] One cassandra per data disk.
Jul 5 2022, 5:49 PM · Storage manager, System administration
vsellier triaged T4379: [cassandra] create etcd / controlplane servers as Normal priority.
Jul 5 2022, 5:47 PM · Storage manager, System administration
vsellier changed the status of T4373: [cassandra] Test the new hardware from Open to Work in Progress.
Jul 5 2022, 5:41 PM · Storage manager, System administration
ardumont updated the task description for T4373: [cassandra] Test the new hardware.
Jul 5 2022, 10:04 AM · Storage manager, System administration
vsellier triaged T4375: [cassandra] One cassandra per data disk as Normal priority.
Jul 5 2022, 9:52 AM · Storage manager, System administration
vsellier triaged T4374: [cassandra] Test basic topology as Normal priority.
Jul 5 2022, 9:43 AM · Storage manager, System administration
vsellier triaged T4373: [cassandra] Test the new hardware as Normal priority.
Jul 5 2022, 9:36 AM · Storage manager, System administration

Jul 4 2022

ardumont added a project to T4371: Deploy swh-scrubber on all storage instances: System administration.
Jul 4 2022, 10:47 AM · System administration, Archive integrity, Storage manager
ardumont added projects to T4371: Deploy swh-scrubber on all storage instances: Storage manager, Archive integrity.
Jul 4 2022, 10:47 AM · System administration, Archive integrity, Storage manager

Jul 1 2022

douardda triaged T4370: Refactor the origin visit data model (aka get rid of the OriginVisit model object) as High priority.
Jul 1 2022, 4:35 PM · Storage manager, Data Model
douardda triaged T4368: Loosen "foreign key" validation in storages used as mirror ingestion endpoint as High priority.
Jul 1 2022, 11:13 AM · Storage manager
douardda created T4368: Loosen "foreign key" validation in storages used as mirror ingestion endpoint.
Jul 1 2022, 11:13 AM · Storage manager

Jun 23 2022

vlorentz added a comment to T4185: Loader profiling : Add Measure of ignored objects .

The Git loader now exports a swh_loader_filtered_objects_total metric. We should eventually generalize this to other loaders, using one of the options above.

Jun 23 2022, 10:40 AM · Storage manager

Jun 10 2022

douardda triaged T4325: Remove (useless) metadata_authority and metadata_fetcher from the journal as Normal priority.
Jun 10 2022, 2:51 PM · Storage manager

Jun 2 2022

vlorentz moved T3841: regularly scrub all the data stores of swh from Backlog to Work in progress on the Roadmap 2022 board.
Jun 2 2022, 9:58 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz moved T2214: Scale-out graph and database storage in production from Backlog to Work in progress on the Roadmap 2022 board.
Jun 2 2022, 9:57 AM · meta-task, Roadmap 2022, Roadmap 2021, Storage manager

May 31 2022

douardda closed T4286: Replace usage of swh.core's postgresql_fact by stock pytest_postgresql's factory function as Resolved.
May 31 2022, 7:28 PM · Storage manager
douardda added a revision to T4286: Replace usage of swh.core's postgresql_fact by stock pytest_postgresql's factory function : D7918: pytest_plugin: use the stock pytest_postgresql postgresql factory.
May 31 2022, 4:43 PM · Storage manager
douardda triaged T4286: Replace usage of swh.core's postgresql_fact by stock pytest_postgresql's factory function as Normal priority.
May 31 2022, 4:23 PM · Storage manager

May 4 2022

olasd added a project to T4185: Loader profiling : Add Measure of ignored objects : Storage manager.
May 4 2022, 10:54 AM · Storage manager

Apr 19 2022

anlambert closed T4090: Add method to efficiently retrieve latest statuses of origin visits as Resolved.

Feature has been implemented and deployed, closing this.

Apr 19 2022, 11:42 AM · Storage manager

Apr 12 2022

ardumont closed T4137: Deploy swh.storage v1.3 as Resolved.
Apr 12 2022, 5:08 PM · System administration, Storage manager
anlambert added a revision to T4090: Add method to efficiently retrieve latest statuses of origin visits : D7559: common/archive: Improve lookup_origin_visits performance.
Apr 12 2022, 4:32 PM · Storage manager
ardumont moved T4137: Deploy swh.storage v1.3 from in-progress to deployed/landed/monitoring on the System administration board.
Apr 12 2022, 3:54 PM · System administration, Storage manager
ardumont added a comment to T4137: Deploy swh.storage v1.3.

done ^

Apr 12 2022, 3:54 PM · System administration, Storage manager
ardumont moved T4137: Deploy swh.storage v1.3 from deployed/landed/monitoring to in-progress on the System administration board.
Apr 12 2022, 3:27 PM · System administration, Storage manager
ardumont added a comment to T4137: Deploy swh.storage v1.3.

New fix needed so another round for v1.3.1

Apr 12 2022, 3:26 PM · System administration, Storage manager
anlambert added a revision to T4090: Add method to efficiently retrieve latest statuses of origin visits : D7552: origin_get_with_statuses: Fix case when fetched visits list is empty.
Apr 12 2022, 12:10 PM · Storage manager

Apr 11 2022

ardumont moved T4137: Deploy swh.storage v1.3 from in-progress to deployed/landed/monitoring on the System administration board.
Apr 11 2022, 4:06 PM · System administration, Storage manager
ardumont updated the task description for T4137: Deploy swh.storage v1.3.
Apr 11 2022, 4:06 PM · System administration, Storage manager
ardumont updated the task description for T4137: Deploy swh.storage v1.3.
Apr 11 2022, 4:03 PM · System administration, Storage manager
ardumont updated the task description for T4137: Deploy swh.storage v1.3.
Apr 11 2022, 4:01 PM · System administration, Storage manager
ardumont updated the task description for T4137: Deploy swh.storage v1.3.
Apr 11 2022, 2:29 PM · System administration, Storage manager
ardumont changed the status of T4137: Deploy swh.storage v1.3 from Open to Work in Progress.
Apr 11 2022, 2:27 PM · System administration, Storage manager
ardumont added a project to T4137: Deploy swh.storage v1.3: System administration.
Apr 11 2022, 2:27 PM · System administration, Storage manager
ardumont updated the task description for T4137: Deploy swh.storage v1.3.
Apr 11 2022, 2:25 PM · System administration, Storage manager
ardumont renamed T4137: Deploy swh.storage v1.3 from storage v1.3 to Deploy swh.storage v1.3.
Apr 11 2022, 2:10 PM · System administration, Storage manager
ardumont triaged T4137: Deploy swh.storage v1.3 as Normal priority.
Apr 11 2022, 2:10 PM · System administration, Storage manager
bchauvet removed a project from T4136: Add an "history completeness check": Roadmap 2022.
Apr 11 2022, 12:02 PM · Storage manager
bchauvet triaged T4136: Add an "history completeness check" as Normal priority.
Apr 11 2022, 12:02 PM · Storage manager

Mar 30 2022

vsellier closed T4117: Storage metrics not refreshed as Resolved.

Thanks olasd for restarting the service following this documentation: https://docs.gunicorn.org/en/stable/signals.html#upgrading-to-a-new-binary-on-the-fly

First, replace the old binary with a new one, then send a USR2 signal to the current master process. It executes a new binary whose PID file is postfixed with .2 (e.g. /var/run/gunicorn.pid.2), which in turn starts a new master process and new worker processes.
At this point, two instances of Gunicorn are running, handling the incoming requests together. To phase the old instance out, you have to send a WINCH signal to the old master process, and its worker processes will start to gracefully shut down.
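
The two-signal sequence quoted above can be sketched in Python as follows. This is a minimal illustration, not the procedure that was actually run: the function name `upgrade_gunicorn`, the injectable `kill` argument, and the `settle_seconds` pause are assumptions made for the example.

```python
import os
import signal
import time


def upgrade_gunicorn(old_master_pid, kill=os.kill, settle_seconds=0.0):
    """Send USR2 then WINCH to the old Gunicorn master process.

    `kill` is injectable (an assumption of this sketch) so the sequence
    can be dry-run without touching a real process.
    """
    # USR2: the old master executes the (already replaced) binary, which
    # starts a new master and new workers next to the old ones.
    kill(old_master_pid, signal.SIGUSR2)
    # Hypothetical pause to give the new master time to come up.
    time.sleep(settle_seconds)
    # WINCH: the old master's workers start to shut down gracefully.
    kill(old_master_pid, signal.SIGWINCH)
```

In practice the PID would be read from the old master's PID file (the quoted example suggests a path such as /var/run/gunicorn.pid, but the real location depends on the service configuration).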

Mar 30 2022, 11:51 AM · Storage manager, System administration
vsellier added a comment to T4117: Storage metrics not refreshed.

Sorry, the description was not completely clear.
The duration metrics are still updated, but not the operation count ones.
Two metrics snapshots taken about 10 minutes apart:

vsellier@saam ~ % ls -al /tmp/m[12]              
-rw-r--r-- 1 vsellier vsellier 176850 Mar 30 07:25 /tmp/m1
-rw-r--r-- 1 vsellier vsellier 176846 Mar 30 07:34 /tmp/m2
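
Comparing two such snapshots can be automated. The sketch below assumes the files are in the Prometheus text exposition format with one `name{labels} value` sample per line and no trailing timestamps; the helper names and the `hypothetical_operations_total` series are made up for illustration, and only `swh_storage_request_duration_seconds_count{endpoint="index"}` comes from this thread.

```python
def parse_metrics(text):
    """Parse metrics text into {series: value}.

    Assumes one `name{labels} value` sample per line, no timestamps.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated token on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def unchanged_series(before, after):
    """Return series present in both snapshots whose value did not move."""
    old, new = parse_metrics(before), parse_metrics(after)
    return sorted(s for s in old.keys() & new.keys() if old[s] == new[s])


# Made-up sample data standing in for /tmp/m1 and /tmp/m2.
m1 = """# TYPE swh_storage_request_duration_seconds_count counter
swh_storage_request_duration_seconds_count{endpoint="index"} 10
hypothetical_operations_total{endpoint="index"} 5
"""
m2 = """swh_storage_request_duration_seconds_count{endpoint="index"} 12
hypothetical_operations_total{endpoint="index"} 5
"""
stalled = unchanged_series(m1, m2)
```

An unchanged counter is only a hint: an endpoint that received no traffic between the two snapshots would show up the same way.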
Mar 30 2022, 10:17 AM · Storage manager, System administration
olasd added a comment to T4117: Storage metrics not refreshed.

swh.storage 1.2.0 increased a bunch of timeouts and made some queries smarter so it's entirely plausible that the number of errors has dropped drastically.

Mar 30 2022, 10:09 AM · Storage manager, System administration
olasd added a comment to T4117: Storage metrics not refreshed.

What exact metric are you saying isn't updating? In your diff, the swh_storage_request_duration_seconds_count{endpoint="index"} metric seems to be increasing normally

Mar 30 2022, 10:07 AM · Storage manager, System administration
vsellier renamed T4117: Storage metrics not refreshed from Storage metrics not updated to Storage metrics not refreshed.
Mar 30 2022, 10:02 AM · Storage manager, System administration
vsellier changed the status of T4117: Storage metrics not refreshed from Open to Work in Progress.
Mar 30 2022, 10:01 AM · Storage manager, System administration

Mar 29 2022

olasd added a comment to T4090: Add method to efficiently retrieve latest statuses of origin visits .

SGTM, thanks!

Mar 29 2022, 2:34 PM · Storage manager

Mar 28 2022

anlambert added a comment to T4090: Add method to efficiently retrieve latest statuses of origin visits .

I have the feeling that, in terms of API extensibility, we'll want to be returning both the OriginVisit and its latest OriginVisitStatus.
If we don't do that, we might find ourselves in a situation where we want to combine the fields from both objects into one, which feels a bit clunky when we could just return a proper composite type.
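
A composite return type of the kind discussed here could look roughly like the sketch below. All class and function names are hypothetical stand-ins, not the actual swh.model or swh.storage definitions, and statuses are matched on the visit id alone for brevity.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Visit:  # hypothetical stand-in for OriginVisit
    origin: str
    visit: int


@dataclass(frozen=True)
class VisitStatus:  # hypothetical stand-in for OriginVisitStatus
    visit: int
    date: datetime
    status: str


@dataclass(frozen=True)
class VisitWithLatestStatus:
    """Composite type: the visit and its most recent status, kept separate."""
    visit: Visit
    latest_status: Optional[VisitStatus]


def with_latest_status(visit, statuses):
    """Pair a visit with its most recent status, or None if it has none."""
    mine = [s for s in statuses if s.visit == visit.visit]
    latest = max(mine, key=lambda s: s.date, default=None)
    return VisitWithLatestStatus(visit, latest)
```

Keeping the two objects as separate fields avoids merging their attributes into one flat record, which is the clunkiness the comment warns about.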

Mar 28 2022, 4:08 PM · Storage manager
olasd added a comment to T4090: Add method to efficiently retrieve latest statuses of origin visits .

I have the feeling that, in terms of API extensibility, we'll want to be returning both the OriginVisit and its latest OriginVisitStatus.

Mar 28 2022, 2:47 PM · Storage manager
anlambert added a revision to T4090: Add method to efficiently retrieve latest statuses of origin visits : D7442: interface: Add new method origin_visit_get_with_statuses.
Mar 28 2022, 2:37 PM · Storage manager
anlambert renamed T4090: Add method to efficiently retrieve latest statuses of origin visits from Add method to efficiently retrieve origin visits and their latest statuses to Add method to efficiently retrieve latest statuses of origin visits .
Mar 28 2022, 10:41 AM · Storage manager

Mar 25 2022

vlorentz closed T3552: Fix corrupted releases, revisions, and directories in the storage as Resolved.
Mar 25 2022, 5:36 PM · Storage manager
bchauvet raised the priority of T2214: Scale-out graph and database storage in production from Normal to High.
Mar 25 2022, 5:30 PM · meta-task, Roadmap 2022, Roadmap 2021, Storage manager
anlambert triaged T4090: Add method to efficiently retrieve latest statuses of origin visits as Normal priority.
Mar 25 2022, 11:40 AM · Storage manager

Mar 23 2022

bchauvet added projects to T2214: Scale-out graph and database storage in production: Roadmap 2022, meta-task.
Mar 23 2022, 5:04 PM · meta-task, Roadmap 2022, Roadmap 2021, Storage manager
bchauvet added projects to T3841: regularly scrub all the data stores of swh: Roadmap 2022, meta-task.
Mar 23 2022, 4:38 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager

Mar 16 2022

vlorentz added a revision to T3841: regularly scrub all the data stores of swh: D7360: Initialize DB schema and postgresql storage checker.
Mar 16 2022, 4:08 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz closed T3752: Store/represent time offsets as strings, a subtask of T3594: Faithfully store weird git objects, as Resolved.
Mar 16 2022, 10:36 AM · meta-task, Data Model, Storage manager
vlorentz closed T3752: Store/represent time offsets as strings as Resolved.

swh-model 5.0.0 released, which finalizes these changes

Mar 16 2022, 10:36 AM · Data Model, Storage manager
vlorentz added revisions to T3752: Store/represent time offsets as strings: D7011: Revert "Restore 'offset' and 'negative_utc' arguments and make them optional", D7012: Remove deprecated property 'TimestampWithTimezone.offset'.
Mar 16 2022, 10:36 AM · Data Model, Storage manager

Mar 15 2022

vlorentz closed T3878: Fix existing corrupt objects, a subtask of T3135: Improve integrity of ingested content, as Resolved.
Mar 15 2022, 1:46 PM · Storage manager, Roadmap 2021, meta-task
vlorentz closed T3878: Fix existing corrupt objects as Resolved.

Closing this, because we already fixed everything we can for now.

Mar 15 2022, 1:46 PM · Storage manager
vlorentz added a revision to T3841: regularly scrub all the data stores of swh: D7347: Add swh-scrubber to .mrconfig.
Mar 15 2022, 11:41 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz added a revision to T3841: regularly scrub all the data stores of swh: D7346: Add swh-scrubber package to the CI.
Mar 15 2022, 11:40 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz removed a project from T3841: regularly scrub all the data stores of swh: meta-task.
Mar 15 2022, 11:26 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz claimed T3841: regularly scrub all the data stores of swh.
Mar 15 2022, 11:25 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz updated the task description for T3752: Store/represent time offsets as strings.
Mar 15 2022, 10:33 AM · Data Model, Storage manager

Mar 14 2022

olasd added a comment to T3841: regularly scrub all the data stores of swh.
In T3841#80779, @olasd wrote:

I think it's fine to remove the entries when we don't need them anymore (i.e. the object has been restored). Worst case, it'll be re-added at the next iteration of the script :-)

Mar 14 2022, 3:23 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz added a comment to T3841: regularly scrub all the data stores of swh.
In T3841#80779, @olasd wrote:

You'll need a column for which datastore has the corrupted object.

Mar 14 2022, 3:21 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
olasd added a comment to T3841: regularly scrub all the data stores of swh.

You'll need a column for which datastore has the corrupted object.

Mar 14 2022, 3:19 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz added a comment to T3841: regularly scrub all the data stores of swh.

I wrote a script to scrub postgres and kafka: https://forge.softwareheritage.org/source/snippets/browse/master/vlorentz/recheck_consistency.py

Mar 14 2022, 2:45 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager

Mar 11 2022

vsellier added a watcher for Storage manager: vsellier.
Mar 11 2022, 10:58 AM

Feb 8 2022

vlorentz closed T3594: Faithfully store weird git objects, a subtask of T3135: Improve integrity of ingested content, as Resolved.
Feb 8 2022, 11:53 AM · Storage manager, Roadmap 2021, meta-task
vlorentz closed T3594: Faithfully store weird git objects, a subtask of T3552: Fix corrupted releases, revisions, and directories in the storage, as Resolved.
Feb 8 2022, 11:53 AM · Storage manager
vlorentz closed T3594: Faithfully store weird git objects as Resolved.
Feb 8 2022, 11:53 AM · meta-task, Data Model, Storage manager
vlorentz closed T3753: Store original git manifests as Resolved.
Feb 8 2022, 11:53 AM · Data Model, Storage manager
vlorentz closed T3753: Store original git manifests, a subtask of T3594: Faithfully store weird git objects, as Resolved.
Feb 8 2022, 11:53 AM · meta-task, Data Model, Storage manager

Feb 2 2022

vlorentz added a revision to T3753: Store original git manifests: D7067: git_bare: Use raw_manifest when available.
Feb 2 2022, 6:32 PM · Data Model, Storage manager