
Mar 30 2022

vsellier added a comment to T4117: Storage metrics not refreshed.

Sorry, the description was not completely clear.
The duration metrics are still updated, but the operation count metrics are not.
Comparing two metrics dumps taken about 10 minutes apart:

vsellier@saam ~ % ls -al /tmp/m[12]              
-rw-r--r-- 1 vsellier vsellier 176850 Mar 30 07:25 /tmp/m1
-rw-r--r-- 1 vsellier vsellier 176846 Mar 30 07:34 /tmp/m2
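A minimal sketch of how two such dumps could be compared to list the series whose values did not move between captures. It assumes the dumps are in the Prometheus text exposition format with no trailing timestamps; the helper names `parse_metrics` and `unchanged_series` and the `example_operations_total` series are hypothetical, used only for illustration.

```python
from typing import Dict, List


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse text-format samples into {series_with_labels: value}.

    Assumption: each sample line is '<series> <value>' with no timestamp.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Split on the last space so label values containing spaces survive.
        series, _, value = line.rpartition(" ")
        try:
            samples[series] = float(value)
        except ValueError:
            continue  # ignore lines that do not end with a number
    return samples


def unchanged_series(before: Dict[str, float], after: Dict[str, float]) -> List[str]:
    """Return the series present in both dumps whose value is identical."""
    common = before.keys() & after.keys()
    return sorted(s for s in common if before[s] == after[s])


if __name__ == "__main__":
    # Inline sample data; in practice the two strings would be read from
    # the dump files (for example /tmp/m1 and /tmp/m2).
    m1 = (
        'swh_storage_request_duration_seconds_count{endpoint="index"} 10\n'
        'example_operations_total{operation="add"} 5\n'
    )
    m2 = (
        'swh_storage_request_duration_seconds_count{endpoint="index"} 12\n'
        'example_operations_total{operation="add"} 5\n'
    )
    for series in unchanged_series(parse_metrics(m1), parse_metrics(m2)):
        print("not refreshed:", series)
```

A series that is legitimately idle would also show up as unchanged, so the output is a starting point for investigation rather than proof of a bug.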
Mar 30 2022, 10:17 AM · Storage manager, System administration
olasd added a comment to T4117: Storage metrics not refreshed.

swh.storage 1.2.0 increased a bunch of timeouts and made some queries smarter so it's entirely plausible that the number of errors has dropped drastically.

Mar 30 2022, 10:09 AM · Storage manager, System administration
olasd added a comment to T4117: Storage metrics not refreshed.

Which exact metric are you saying isn't updating? In your diff, the swh_storage_request_duration_seconds_count{endpoint="index"} metric seems to be increasing normally.

Mar 30 2022, 10:07 AM · Storage manager, System administration
vsellier renamed T4117: Storage metrics not refreshed from Storage metrics not updated to Storage metrics not refreshed.
Mar 30 2022, 10:02 AM · Storage manager, System administration
vsellier changed the status of T4117: Storage metrics not refreshed from Open to Work in Progress.
Mar 30 2022, 10:01 AM · Storage manager, System administration

Mar 29 2022

olasd added a comment to T4090: Add method to efficiently retrieve latest statuses of origin visits.

SGTM, thanks!

Mar 29 2022, 2:34 PM · Storage manager

Mar 28 2022

anlambert added a comment to T4090: Add method to efficiently retrieve latest statuses of origin visits.

I have the feeling that, in terms of API extensibility, we'll want to be returning both the OriginVisit and its latest OriginVisitStatus.
If we don't do that, we might find ourselves in a situation where we want to combine the fields from both objects into one, which feels a bit clunky when we could just return a proper composite type.
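To make the composite-type idea concrete, here is a minimal sketch. The classes below are simplified stand-ins, not the swh.model definitions, and `VisitWithLatestStatus` and `pair_visits_with_latest_status` are hypothetical names chosen for illustration. It also assumes statuses are supplied oldest first.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class OriginVisit:
    """Simplified stand-in for a visit record (assumption)."""
    origin: str
    visit: int
    type: str


@dataclass(frozen=True)
class OriginVisitStatus:
    """Simplified stand-in for a visit status record (assumption)."""
    origin: str
    visit: int
    status: str
    snapshot: Optional[str] = None


@dataclass(frozen=True)
class VisitWithLatestStatus:
    """Hypothetical composite: keeps both objects instead of merging fields."""
    visit: OriginVisit
    latest_status: Optional[OriginVisitStatus]


def pair_visits_with_latest_status(
    visits: Iterable[OriginVisit],
    statuses: Iterable[OriginVisitStatus],
) -> List[VisitWithLatestStatus]:
    """Attach to each visit its most recent status, or None if it has none.

    Assumption: `statuses` is ordered oldest to newest, so later entries
    overwrite earlier ones for the same (origin, visit) key.
    """
    latest: Dict[Tuple[str, int], OriginVisitStatus] = {}
    for status in statuses:
        latest[(status.origin, status.visit)] = status
    return [
        VisitWithLatestStatus(v, latest.get((v.origin, v.visit))) for v in visits
    ]
```

Returning the pair keeps each object's fields intact, so callers that later need another field from either side do not require a change to the return type.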

Mar 28 2022, 4:08 PM · Storage manager
olasd added a comment to T4090: Add method to efficiently retrieve latest statuses of origin visits.

I have the feeling that, in terms of API extensibility, we'll want to be returning both the OriginVisit and its latest OriginVisitStatus.

Mar 28 2022, 2:47 PM · Storage manager
anlambert added a revision to T4090: Add method to efficiently retrieve latest statuses of origin visits: D7442: interface: Add new method origin_visit_get_with_statuses.
Mar 28 2022, 2:37 PM · Storage manager
anlambert renamed T4090: Add method to efficiently retrieve latest statuses of origin visits from Add method to efficiently retrieve origin visits and their latest statuses to Add method to efficiently retrieve latest statuses of origin visits.
Mar 28 2022, 10:41 AM · Storage manager

Mar 25 2022

vlorentz closed T3552: Fix corrupted releases, revisions, and directories in the storage as Resolved.
Mar 25 2022, 5:36 PM · Storage manager
bchauvet raised the priority of T2214: Scale-out graph and database storage in production from Normal to High.
Mar 25 2022, 5:30 PM · meta-task, Roadmap 2022, Roadmap 2021, Storage manager
anlambert triaged T4090: Add method to efficiently retrieve latest statuses of origin visits as Normal priority.
Mar 25 2022, 11:40 AM · Storage manager

Mar 23 2022

bchauvet added projects to T2214: Scale-out graph and database storage in production: Roadmap 2022, meta-task.
Mar 23 2022, 5:04 PM · meta-task, Roadmap 2022, Roadmap 2021, Storage manager
bchauvet added projects to T3841: regularly scrub all the data stores of swh: Roadmap 2022, meta-task.
Mar 23 2022, 4:38 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager

Mar 16 2022

vlorentz added a revision to T3841: regularly scrub all the data stores of swh: D7360: Initialize DB schema and postgresql storage checker.
Mar 16 2022, 4:08 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz closed T3752: Store/represent time offsets as strings, a subtask of T3594: Faithfully store weird git objects, as Resolved.
Mar 16 2022, 10:36 AM · meta-task, Data Model, Storage manager
vlorentz closed T3752: Store/represent time offsets as strings as Resolved.

swh-model 5.0.0 released, which finalizes these changes

Mar 16 2022, 10:36 AM · Data Model, Storage manager
vlorentz added revisions to T3752: Store/represent time offsets as strings: D7011: Revert "Restore 'offset' and 'negative_utc' arguments and make them optional", D7012: Remove deprecated property 'TimestampWithTimezone.offset'.
Mar 16 2022, 10:36 AM · Data Model, Storage manager

Mar 15 2022

vlorentz closed T3878: Fix existing corrupt objects, a subtask of T3135: Improve integrity of ingested content, as Resolved.
Mar 15 2022, 1:46 PM · Storage manager, Roadmap 2021, meta-task
vlorentz closed T3878: Fix existing corrupt objects as Resolved.

Closing this, because we already fixed everything we can for now.

Mar 15 2022, 1:46 PM · Storage manager
vlorentz added a revision to T3841: regularly scrub all the data stores of swh: D7347: Add swh-scrubber to .mrconfig.
Mar 15 2022, 11:41 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz added a revision to T3841: regularly scrub all the data stores of swh: D7346: Add swh-scrubber package to the CI.
Mar 15 2022, 11:40 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz removed a project from T3841: regularly scrub all the data stores of swh: meta-task.
Mar 15 2022, 11:26 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz claimed T3841: regularly scrub all the data stores of swh.
Mar 15 2022, 11:25 AM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz updated the task description for T3752: Store/represent time offsets as strings.
Mar 15 2022, 10:33 AM · Data Model, Storage manager

Mar 14 2022

olasd added a comment to T3841: regularly scrub all the data stores of swh.
In T3841#80779, @olasd wrote:

I think it's fine to remove the entries when we don't need them anymore (i.e. the object has been restored). Worst case, it'll be re-added at the next iteration of the script :-)
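The bookkeeping discussed in this thread (a column recording which datastore holds the corrupted object, entries removed once the object is restored, and re-added on a later pass if still corrupt) could look roughly like the sketch below. It uses an in-process SQLite database purely as a stand-in; the table name `corrupt_object`, its columns, and the helper functions are all hypothetical and are not taken from the script linked in the thread.

```python
import sqlite3
from typing import List, Tuple


def init_db(conn: sqlite3.Connection) -> None:
    """Create the tracking table; (datastore, object_id) identifies an entry."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS corrupt_object (
            datastore TEXT NOT NULL,   -- which datastore holds the bad copy
            object_id TEXT NOT NULL,   -- identifier of the corrupted object
            PRIMARY KEY (datastore, object_id)
        )
        """
    )


def report_corrupt(conn: sqlite3.Connection, datastore: str, object_id: str) -> None:
    """Record a corrupt object; reporting it again on a later pass is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO corrupt_object (datastore, object_id) VALUES (?, ?)",
        (datastore, object_id),
    )


def mark_restored(conn: sqlite3.Connection, datastore: str, object_id: str) -> None:
    """Drop the entry once the object has been restored in that datastore."""
    conn.execute(
        "DELETE FROM corrupt_object WHERE datastore = ? AND object_id = ?",
        (datastore, object_id),
    )


def list_corrupt(conn: sqlite3.Connection) -> List[Tuple[str, str]]:
    """Return all tracked (datastore, object_id) pairs in a stable order."""
    cursor = conn.execute(
        "SELECT datastore, object_id FROM corrupt_object "
        "ORDER BY datastore, object_id"
    )
    return cursor.fetchall()
```

Because inserts are idempotent and deletes are keyed by datastore, removing an entry too early is harmless: the next scrubbing pass simply reports it again.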

Mar 14 2022, 3:23 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz added a comment to T3841: regularly scrub all the data stores of swh.
In T3841#80779, @olasd wrote:

You'll need a column for which datastore has the corrupted object.

Mar 14 2022, 3:21 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
olasd added a comment to T3841: regularly scrub all the data stores of swh.

You'll need a column for which datastore has the corrupted object.

Mar 14 2022, 3:19 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
vlorentz added a comment to T3841: regularly scrub all the data stores of swh.

I wrote a script to scrub postgres and kafka: https://forge.softwareheritage.org/source/snippets/browse/master/vlorentz/recheck_consistency.py

Mar 14 2022, 2:45 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager

Mar 11 2022

vsellier added a watcher for Storage manager: vsellier.
Mar 11 2022, 10:58 AM

Feb 8 2022

vlorentz closed T3594: Faithfully store weird git objects, a subtask of T3135: Improve integrity of ingested content, as Resolved.
Feb 8 2022, 11:53 AM · Storage manager, Roadmap 2021, meta-task
vlorentz closed T3594: Faithfully store weird git objects, a subtask of T3552: Fix corrupted releases, revisions, and directories in the storage, as Resolved.
Feb 8 2022, 11:53 AM · Storage manager
vlorentz closed T3594: Faithfully store weird git objects as Resolved.
Feb 8 2022, 11:53 AM · meta-task, Data Model, Storage manager
vlorentz closed T3753: Store original git manifests as Resolved.
Feb 8 2022, 11:53 AM · Data Model, Storage manager
vlorentz closed T3753: Store original git manifests, a subtask of T3594: Faithfully store weird git objects, as Resolved.
Feb 8 2022, 11:53 AM · meta-task, Data Model, Storage manager

Feb 2 2022

vlorentz added a revision to T3753: Store original git manifests: D7067: git_bare: Use raw_manifest when available.
Feb 2 2022, 6:32 PM · Data Model, Storage manager

Jan 28 2022

douardda closed T3693: Provide a mechanism to report (with persistence) objects that fail to get replayed (mirror) as Resolved.
Jan 28 2022, 9:38 AM · Storage manager

Jan 24 2022

douardda triaged T3882: Handle updated kafka messages for the storage replayer as High priority.
Jan 24 2022, 5:22 PM · Storage manager, Mirror
vlorentz added a revision to T3878: Fix existing corrupt objects: D6957: Add recover_corrupt_objects.py.
Jan 24 2022, 11:02 AM · Storage manager
vlorentz triaged T3878: Fix existing corrupt objects as Normal priority.
Jan 24 2022, 11:01 AM · Storage manager

Jan 21 2022

vlorentz added revisions to T3752: Store/represent time offsets as strings: D7008: Stop using the deprecated 'TimestampWithTimezone.offset' attribute, D7007: Stop using the deprecated 'TimestampWithTimezone.offset' attribute, D7006: Stop using the deprecated 'TimestampWithTimezone.offset' attribute, D7005: Add method 'TimestampWithTimezone.offset_minutes', D7003: journal: Document the new format for gitdate..
Jan 21 2022, 1:29 PM · Data Model, Storage manager

Jan 20 2022

ardumont closed T3869: Deploy storage v0.41.2 as Resolved.
Jan 20 2022, 3:51 PM · System administration, Storage manager
ardumont moved T3869: Deploy storage v0.41.2 from in-progress to deployed/landed/monitoring on the System administration board.
Jan 20 2022, 3:42 PM · System administration, Storage manager
ardumont added a comment to T3869: Deploy storage v0.41.2.

migration done.

Jan 20 2022, 3:42 PM · System administration, Storage manager
ardumont updated the task description for T3869: Deploy storage v0.41.2.
Jan 20 2022, 3:00 PM · System administration, Storage manager
ardumont added a comment to T3869: Deploy storage v0.41.2.

belvedere migration status: first index on directory created, ongoing index creation for revision, and then release.

Jan 20 2022, 3:00 PM · System administration, Storage manager
ardumont added a comment to T3869: Deploy storage v0.41.2.

staging, prod: storage deployed and service restarted (it's not dependent on the sql migration to be complete).

Jan 20 2022, 11:52 AM · System administration, Storage manager
ardumont updated the task description for T3869: Deploy storage v0.41.2.
Jan 20 2022, 10:48 AM · System administration, Storage manager
ardumont added a comment to T3869: Deploy storage v0.41.2.
  • staging db already migrated during the deployment of T3861.
  • production db migration ongoing
Jan 20 2022, 10:37 AM · System administration, Storage manager
ardumont changed the status of T3869: Deploy storage v0.41.2 from Open to Work in Progress.
Jan 20 2022, 10:33 AM · System administration, Storage manager
ardumont updated the task description for T3869: Deploy storage v0.41.2.
Jan 20 2022, 10:33 AM · System administration, Storage manager
ardumont triaged T3869: Deploy storage v0.41.2 as Normal priority.
Jan 20 2022, 10:33 AM · System administration, Storage manager

Jan 19 2022

olasd closed T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production, a subtask of T3752: Store/represent time offsets as strings, as Resolved.
Jan 19 2022, 7:12 PM · Data Model, Storage manager

Jan 18 2022

olasd merged task T2449: Consider switching timestamp offset storage to strings/byte arrays into T3752: Store/represent time offsets as strings.
Jan 18 2022, 12:27 PM · Storage manager, Data Model
olasd merged T2449: Consider switching timestamp offset storage to strings/byte arrays into T3752: Store/represent time offsets as strings.
Jan 18 2022, 12:27 PM · Data Model, Storage manager
olasd added a subtask for T3752: Store/represent time offsets as strings: T3819: Deploy swh.model 4.1.0 / swh.storage 0.41.0 to production.
Jan 18 2022, 12:26 PM · Data Model, Storage manager

Jan 13 2022

vlorentz added revisions to T3752: Store/represent time offsets as strings: D6940: tests: Use 'offset_bytes' instead of 'offset'/'negative_utc', D6939: Stop passing 'offset' and 'negative_utc' to TimestampWithTimezone(), D6938: tests: Replace 'offset' and 'negative_utc' with 'offset_bytes', D6937: Remove 'offset' and 'negative_utc', D6935: deposit: Remove 'negative_utc' from test data.
Jan 13 2022, 12:26 PM · Data Model, Storage manager
vlorentz added a revision to T3752: Store/represent time offsets as strings: D6936: TimestampWithTimezone: Make 'offset' and 'negative_utc' optional.
Jan 13 2022, 12:16 PM · Data Model, Storage manager
vlorentz added a revision to T3752: Store/represent time offsets as strings: D6929: Remove 'negative_utc'..
Jan 13 2022, 11:28 AM · Data Model, Storage manager

Jan 12 2022

vlorentz added revisions to T3752: Store/represent time offsets as strings: D6927: Remove special handling of negative_utc, D6923: converters: Write raw_manifest of Directory objects, D6894: converters: Write object_bytes and raw_manifest on revisions and releases.
Jan 12 2022, 3:24 PM · Data Model, Storage manager

Jan 11 2022

vlorentz added a revision to T3752: Store/represent time offsets as strings: D6915: tests: Use TimestampWithTimezone.from_datetime() instead of the constructor.
Jan 11 2022, 3:25 PM · Data Model, Storage manager
vlorentz added revisions to T3752: Store/represent time offsets as strings: D6911: Remove strdate_to_timestamp, D6913: tests: Use TimestampWithTimezone.from_datetime() instead of the constructor, D6910: tests: Use TimestampWithTimezone.from_datetime() instead of the constructor, D6909: tests: Use TimestampWithTimezone.from_datetime() instead of the constructor, D6908: tests: Use TimestampWithTimezone.from_datetime() instead of the constructor.
Jan 11 2022, 2:15 PM · Data Model, Storage manager
douardda renamed T3841: regularly scrub all the data stores of swh from regularly scrub all the data sources of swh to regularly scrub all the data stores of swh.
Jan 11 2022, 12:32 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
douardda removed a project from T3841: regularly scrub all the data stores of swh: Roadmap 2021.
Jan 11 2022, 12:31 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
douardda triaged T3841: regularly scrub all the data stores of swh as Normal priority.
Jan 11 2022, 12:31 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager

Jan 7 2022

vlorentz added revisions to T3752: Store/represent time offsets as strings: D6848: Add columns {,committer_}date_offset to rev/rel and raw_manifest to dir/rev/rel, D6890: git_objects: Use raw offset_bytes to format dates, and remove format_offset().
Jan 7 2022, 1:54 PM · Data Model, Storage manager
vlorentz added revisions to T3753: Store original git manifests: D6801: model: Add a raw_manifest attribute, D6811: model: Exclude 'raw_manifest' from dictionaries when it is null, D6847: hypothesis_strategies: Generate raw_manifest, D6848: Add columns {,committer_}date_offset to rev/rel and raw_manifest to dir/rev/rel.
Jan 7 2022, 1:54 PM · Data Model, Storage manager

Jan 6 2022

vlorentz added revisions to T3577: Parallel loaders performances: D6888: cassandra: Rewrite content_missing to run queries concurrently., D6885: cassandra: Use concurrent queries in *_missing() instead of naive grouping.
Jan 6 2022, 5:32 PM · System administration, Storage manager

Dec 22 2021

vlorentz closed T3585: Fix inconsistencies of the Cassandra backend with postgres as Wontfix.
Dec 22 2021, 2:34 PM · meta-task, Storage manager
vlorentz closed T3585: Fix inconsistencies of the Cassandra backend with postgres, a subtask of T1892: Cassandra as a storage backend, as Wontfix.
Dec 22 2021, 2:34 PM · meta-task, Storage manager

Dec 7 2021

vlorentz added a revision to T3752: Store/represent time offsets as strings: D6776: Add attribute TimestampWithTimezone.offset_bytes, to store raw Git offsets.
Dec 7 2021, 4:51 PM · Data Model, Storage manager
anlambert closed T3776: cassandra tests are failing in the swh-environment build as Resolved by committing rDSTO615fb99eb708: test_cassandra: Fix failing tests since swh-model update.
Dec 7 2021, 1:56 PM · Storage manager
anlambert added a revision to T3776: cassandra tests are failing in the swh-environment build: D6768: test_cassandra: Fix failing tests since swh-model update.
Dec 7 2021, 1:38 PM · Storage manager
vsellier triaged T3776: cassandra tests are failing in the swh-environment build as High priority.
Dec 7 2021, 1:16 PM · Storage manager

Dec 2 2021

vlorentz updated the task description for T3752: Store/represent time offsets as strings.
Dec 2 2021, 4:04 PM · Data Model, Storage manager
vlorentz closed T3586: Figure out what to do with 'misordered' directories in Cassandra, a subtask of T3585: Fix inconsistencies of the Cassandra backend with postgres, as Resolved.
Dec 2 2021, 3:14 PM · meta-task, Storage manager
vlorentz closed T3586: Figure out what to do with 'misordered' directories in Cassandra as Resolved.

We don't care anymore, this will be handled by T3753.

Dec 2 2021, 3:14 PM · Data Model, Storage manager
vlorentz removed a parent task for T3752: Store/represent time offsets as strings: T3753: Store original git manifests.
Dec 2 2021, 3:01 PM · Data Model, Storage manager
vlorentz removed a subtask for T3753: Store original git manifests: T3752: Store/represent time offsets as strings.
Dec 2 2021, 3:01 PM · Data Model, Storage manager
vlorentz added a parent task for T3752: Store/represent time offsets as strings: T3753: Store original git manifests.
Dec 2 2021, 3:00 PM · Data Model, Storage manager
vlorentz added a subtask for T3753: Store original git manifests: T3752: Store/represent time offsets as strings.
Dec 2 2021, 3:00 PM · Data Model, Storage manager
vlorentz updated the task description for T3752: Store/represent time offsets as strings.
Dec 2 2021, 2:59 PM · Data Model, Storage manager
vlorentz updated the task description for T3752: Store/represent time offsets as strings.
Dec 2 2021, 2:55 PM · Data Model, Storage manager
vlorentz updated the task description for T3752: Store/represent time offsets as strings.
Dec 2 2021, 2:52 PM · Data Model, Storage manager
vlorentz updated the task description for T3753: Store original git manifests.
Dec 2 2021, 2:48 PM · Data Model, Storage manager
vlorentz updated the task description for T3752: Store/represent time offsets as strings.
Dec 2 2021, 2:22 PM · Data Model, Storage manager
vsellier closed T3357: Perform some tests of the cassandra storage on Grid5000, a subtask of T1892: Cassandra as a storage backend, as Resolved.
Dec 2 2021, 10:10 AM · meta-task, Storage manager
vsellier closed T3357: Perform some tests of the cassandra storage on Grid5000 as Resolved.

The slides of the retrospective of the experiment are available at: https://hedgedoc.softwareheritage.org/VOP9qh1MTqm4DjPQfFgNbQ

Dec 2 2021, 10:10 AM · System administration, Storage manager
vsellier closed T3573: [cassandra] directory and content read benchmarks, a subtask of T3357: Perform some tests of the cassandra storage on Grid5000, as Resolved.
Dec 2 2021, 10:08 AM · System administration, Storage manager
vsellier closed T3573: [cassandra] directory and content read benchmarks as Resolved.

It was not easy to tell whether this came from many calls or from long-running calls, because the data is sampled at regular intervals and we don't have that granularity.

Dec 2 2021, 10:08 AM · System administration, Storage manager

Dec 1 2021

zack moved T2053: support graph export for the cassandra backend from Backlog to Deployed on the Compressed graph service board.
Dec 1 2021, 4:37 PM · Compressed graph service, Storage manager
zack moved T2045: add support for reverse lookup from swh:1:ori:... PIDs to origin URLs from Backlog to Deployed on the Compressed graph service board.
Dec 1 2021, 4:35 PM · Compressed graph service, Storage manager

Nov 26 2021

vlorentz removed a project from T3752: Store/represent time offsets as strings: meta-task.
Nov 26 2021, 5:19 PM · Data Model, Storage manager
vlorentz removed a project from T3753: Store original git manifests: meta-task.
Nov 26 2021, 5:19 PM · Data Model, Storage manager
vlorentz claimed T3594: Faithfully store weird git objects.
Nov 26 2021, 4:43 PM · meta-task, Data Model, Storage manager
vlorentz claimed T3753: Store original git manifests.
Nov 26 2021, 4:43 PM · Data Model, Storage manager
vlorentz triaged T3753: Store original git manifests as Normal priority.
Nov 26 2021, 4:43 PM · Data Model, Storage manager
vlorentz triaged T3752: Store/represent time offsets as strings as Normal priority.
Nov 26 2021, 4:42 PM · Data Model, Storage manager
vlorentz closed T3598: Support revisions with "extra headers" not at the end, a subtask of T3594: Faithfully store weird git objects, as Wontfix.
Nov 26 2021, 4:41 PM · meta-task, Data Model, Storage manager