Jan 27 2022

douardda triaged T3892: Add a dashboard/page with a summary of which version of swh components are running on which service/machine as Normal priority.
Jan 27 2022, 11:04 AM · System administration

Jan 26 2022

douardda added inline comments to D7003: journal: Document the new format for gitdate..
Jan 26 2022, 3:50 PM
douardda added a comment to D7039: Update the debian local package building section.
In D7039#183023, @olasd wrote:

Thanks.

--build-dep-resolver=aptitude should only be used when building with an extra-repository which has a non-default priority, that is only when using a -backports suite (so only for the bullseye and buster instructions). It should probably be documented in the list of "useful options" rather than as the default.

Jan 26 2022, 2:26 PM
douardda requested changes to D7003: journal: Document the new format for gitdate..
Jan 26 2022, 2:24 PM
douardda added inline comments to D7003: journal: Document the new format for gitdate..
Jan 26 2022, 2:23 PM
douardda added inline comments to D7003: journal: Document the new format for gitdate..
Jan 26 2022, 2:18 PM
douardda requested review of D7039: Update the debian local package building section.
Jan 26 2022, 12:33 PM

Jan 25 2022

douardda committed rDDOC7ff0c981c3e9: Update displayed copyright to 2022 (authored by douardda).
Update displayed copyright to 2022
Jan 25 2022, 9:55 AM

Jan 24 2022

douardda triaged T3883: Handle updated kafka messages for the objstorage replayer as High priority.
Jan 24 2022, 5:25 PM · Mirror
douardda triaged T3882: Handle updated kafka messages for the storage replayer as High priority.
Jan 24 2022, 5:22 PM · Storage manager, Mirror
douardda triaged T3881: Mirror - handle handling of multiple kafka messages for the same object as High priority.
Jan 24 2022, 5:22 PM · Mirror
douardda added a comment to T3877: Automate the weekly-planning script.

It might need a dedicated bot user to be created on hedgedoc also.

Jan 24 2022, 5:17 PM · System administration
douardda added a comment to T3877: Automate the weekly-planning script.

Things to fix in the script:

Jan 24 2022, 5:16 PM · System administration
douardda committed rDDOC52c6a83ea2cd: Fix rst syntax in mirror-operations/docker.rst (authored by douardda).
Fix rst syntax in mirror-operations/docker.rst
Jan 24 2022, 5:03 PM
douardda closed D7004: Update the docker mirror doc.
Jan 24 2022, 5:03 PM
douardda committed rDDOC7396e4263eec: Update the 'updating a configuration' section of the swarm-based mirror (authored by douardda).
Update the 'updating a configuration' section of the swarm-based mirror
Jan 24 2022, 5:03 PM
douardda updated the diff for D7004: Update the docker mirror doc.

forgot one...

Jan 24 2022, 4:57 PM
douardda updated the diff for D7004: Update the docker mirror doc.

Use roles as suggested by vlorentz

Jan 24 2022, 4:47 PM
douardda triaged T3877: Automate the weekly-planning script as High priority.
Jan 24 2022, 10:46 AM · System administration

Jan 21 2022

douardda requested review of D7004: Update the docker mirror doc.
Jan 21 2022, 12:54 PM
douardda closed D6994: Add a quickstart section in the doc.
Jan 21 2022, 12:46 PM
douardda committed rDOBJSRPLeca8d7714b00: Add a quickstart section in the doc (authored by douardda).
Add a quickstart section in the doc
Jan 21 2022, 12:46 PM
douardda updated the diff for D6994: Add a quickstart section in the doc.

yat (yet-another-typo)

Jan 21 2022, 11:39 AM
douardda closed D6944: Add a few statsd metrics in the kafka journal client.
Jan 21 2022, 11:30 AM
douardda committed rDJNL5a26dae22928: Add a few statsd metrics in the kafka journal client (authored by douardda).
Add a few statsd metrics in the kafka journal client
Jan 21 2022, 11:30 AM
douardda updated the diff for D6994: Add a quickstart section in the doc.

more typos

Jan 21 2022, 11:06 AM
douardda updated the diff for D6994: Add a quickstart section in the doc.

and fix the spurious () (thx ardumont)

Jan 21 2022, 11:04 AM
douardda updated the diff for D6994: Add a quickstart section in the doc.

fix rst syntax (thx D6995 review)

Jan 21 2022, 11:03 AM
douardda accepted D6995: Fix ReST syntax.

thx

Jan 21 2022, 11:02 AM
douardda requested review of D6994: Add a quickstart section in the doc.
Jan 21 2022, 10:37 AM
douardda closed D6884: Add support for the rdkafka 'stats_cb' config option in JournalClient.
Jan 21 2022, 9:52 AM
douardda committed rDJNL5b17c50d3280: Add support for the rdkafka 'stats_cb' config option in get_journal_client (authored by douardda).
Add support for the rdkafka 'stats_cb' config option in get_journal_client
Jan 21 2022, 9:52 AM
douardda closed D6980: Add an "Hosting a mirror" page.
Jan 21 2022, 9:49 AM
douardda committed rDDOC2fc9925ba5a3: Add an "Hosting a mirror" page (authored by douardda).
Add an "Hosting a mirror" page
Jan 21 2022, 9:49 AM
douardda closed D6945: Make the copy process of blob objects run with thread concurrency.
Jan 21 2022, 9:46 AM
douardda committed rDOBJSRPL0dffebc423a1: Make the copy process of blob objects run with thread concurrency (authored by douardda).
Make the copy process of blob objects run with thread concurrency
Jan 21 2022, 9:46 AM

Jan 20 2022

douardda added a comment to T3127: Compute and display distribution of origins by forge.

Is there a reason not to close this task?

Jan 20 2022, 6:35 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
douardda updated the diff for D6944: Add a few statsd metrics in the kafka journal client.

update using new statsd.status_gauge() context manager (in swh.core 1.1)

Jan 20 2022, 4:22 PM
douardda closed D6989: Add a Statsd.status_gauge() context manager.
Jan 20 2022, 3:32 PM
douardda committed rDCOREaba5c80765ad: Add a Statsd.status_gauge() context manager (authored by douardda).
Add a Statsd.status_gauge() context manager
Jan 20 2022, 3:32 PM
douardda requested review of D6989: Add a Statsd.status_gauge() context manager.
Jan 20 2022, 2:27 PM
douardda created P1260 (An Untitled Masterwork).
Jan 20 2022, 1:46 PM
douardda added a comment to D6944: Add a few statsd metrics in the kafka journal client.
In D6944#181139, @olasd wrote:

The code for the gauges feels like something that would be usefully handled with a context manager.

Something like (untested)

from typing import Collection, Dict, Optional

# Assumption: `statsd` is the module-level client exposed by swh.core.
from swh.core.statsd import statsd


class StatsdStatusGauges:
    """Expose one gauge per status; only the current status is set to 1."""

    def __init__(
        self,
        metric_name: str,
        statuses: Collection[str],
        common_tags: Optional[Dict[str, str]] = None,
    ):
        self.metric_name = metric_name
        self.statuses = set(statuses)
        self.common_tags = common_tags or {}
        self.current_status: Optional[str] = None

    def reset_gauges(self):
        # Forget the current status and zero the gauge of every known status.
        self.current_status = None
        for status in self.statuses:
            statsd.gauge(self.metric_name, 0, {**self.common_tags, "status": status})

    def send_current_gauge(self, value: int):
        # No-op until a status has been set.
        if self.current_status is not None:
            tags = {**self.common_tags, "status": self.current_status}
            statsd.gauge(self.metric_name, value, tags)

    def set(self, new_status: str):
        if new_status not in self.statuses:
            raise ValueError(f"{new_status} not in {self.statuses}")

        # May not be needed; may even be counter-productive if we want to
        # send the gauges to keep them around in the statsd exporter
        if new_status == self.current_status:
            return

        # Drop the previous status to 0, then raise the new one to 1.
        self.send_current_gauge(0)
        self.current_status = new_status
        self.send_current_gauge(1)

    def __enter__(self):
        self.reset_gauges()
        return self

    def __exit__(self, *exc):
        # Zero all gauges on exit; returning False lets exceptions propagate.
        self.reset_gauges()
        return False

Which would be used like:

with StatsdStatusGauges(JOURNAL_STATUS_METRIC, {"processing", "waiting"}) as status_gauge:
    [...]
    status_gauge.set("waiting")
    [...]
    status_gauge.set("processing")
Jan 20 2022, 1:31 PM
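
To make the behaviour of the context manager suggested in the D6944 comment above easier to follow, here is a minimal self-contained sketch of the same idea. It is not the code that landed in swh.core: `RecordingStatsd` is a hypothetical stand-in that records gauges instead of sending them, `StatusGauges` is a simplified variant that takes the client as an argument, and the metric name `journal_status` is made up for the example.

```python
from typing import Collection, Dict, List, Optional, Tuple


class RecordingStatsd:
    """Hypothetical stand-in for a statsd client: records gauges locally."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, int, Dict[str, str]]] = []

    def gauge(self, metric: str, value: int, tags: Dict[str, str]) -> None:
        self.sent.append((metric, value, dict(tags)))


class StatusGauges:
    """Keep at most one per-status gauge at 1 while inside the `with` block."""

    def __init__(
        self,
        client: RecordingStatsd,
        metric_name: str,
        statuses: Collection[str],
        common_tags: Optional[Dict[str, str]] = None,
    ) -> None:
        self.client = client
        self.metric_name = metric_name
        # Sorted so that the order of the reset gauges is deterministic.
        self.statuses = sorted(set(statuses))
        self.common_tags = common_tags or {}
        self.current_status: Optional[str] = None

    def _send(self, status: str, value: int) -> None:
        tags = {**self.common_tags, "status": status}
        self.client.gauge(self.metric_name, value, tags)

    def reset_gauges(self) -> None:
        self.current_status = None
        for status in self.statuses:
            self._send(status, 0)

    def set(self, new_status: str) -> None:
        if new_status not in self.statuses:
            raise ValueError(f"{new_status} not in {self.statuses}")
        if new_status == self.current_status:
            return
        # Lower the previous status (if any) before raising the new one.
        if self.current_status is not None:
            self._send(self.current_status, 0)
        self.current_status = new_status
        self._send(new_status, 1)

    def __enter__(self) -> "StatusGauges":
        self.reset_gauges()
        return self

    def __exit__(self, *exc) -> bool:
        self.reset_gauges()
        return False


client = RecordingStatsd()
with StatusGauges(client, "journal_status", {"processing", "waiting"}) as gauges:
    gauges.set("waiting")
    gauges.set("processing")

# (status, value) pairs in the order they were emitted.
emitted = [(tags["status"], value) for _, value, tags in client.sent]
```

Every gauge is zeroed on entry and on exit, so a crashed or stopped process does not leave a stale status at 1 in the exporter.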
douardda added a comment to D6884: Add support for the rdkafka 'stats_cb' config option in JournalClient.
In D6884#181115, @olasd wrote:

I'm kinda wondering if this import stuff should move to a common module - I think we do kind of the same thing with entrypoints?

Jan 20 2022, 1:29 PM
douardda updated the diff for D6980: Add an "Hosting a mirror" page.

Rebase and update according to suggestions

Jan 20 2022, 12:41 PM

Jan 19 2022

douardda requested review of D6980: Add an "Hosting a mirror" page.
Jan 19 2022, 5:37 PM

Jan 14 2022

douardda committed rMSLDbee9e94855e1: Tech talk at #swh5years (authored by douardda).
Tech talk at #swh5years
Jan 14 2022, 5:14 PM
douardda added inline comments to D6945: Make the copy process of blob objects run with thread concurrency.
Jan 14 2022, 5:04 PM
douardda closed D6943: Add support for env var substitution in statsd tags from STATSD_TAGS.
Jan 14 2022, 11:39 AM
douardda committed rDCOREde9b0c9fb441: Add support for env var substitution in statsd tags from STATSD_TAGS (authored by douardda).
Add support for env var substitution in statsd tags from STATSD_TAGS
Jan 14 2022, 11:39 AM
douardda retitled D6884: Add support for the rdkafka 'stats_cb' config option in JournalClient from [WIP] Add support for the rdkafka 'stats_cb' config option in JournalClient to Add support for the rdkafka 'stats_cb' config option in JournalClient.
Jan 14 2022, 10:52 AM
douardda requested review of D6944: Add a few statsd metrics in the kafka journal client.
Jan 14 2022, 10:50 AM
douardda abandoned D6875: Add statsd metrics in JournalClient.process.
Jan 14 2022, 10:40 AM
douardda updated the diff for D6943: Add support for env var substitution in statsd tags from STATSD_TAGS.

improve comment as suggested by ardumont

Jan 14 2022, 10:29 AM

Jan 13 2022

douardda updated the diff for D6945: Make the copy process of blob objects run with thread concurrency.

Add the cli option to configure this concurrency value

Jan 13 2022, 4:23 PM
douardda requested review of D6945: Make the copy process of blob objects run with thread concurrency.
Jan 13 2022, 4:08 PM
douardda requested review of D6943: Add support for env var substitution in statsd tags from STATSD_TAGS.
Jan 13 2022, 3:49 PM

Jan 12 2022

douardda accepted D6889: cassandra: Make content_missing run in linear time instead of quadratic.
Jan 12 2022, 11:32 AM
douardda accepted D6888: cassandra: Rewrite content_missing to run queries concurrently..

fine for me (but plz give a bit more insight)

Jan 12 2022, 11:28 AM

Jan 11 2022

douardda renamed T3841: regularly scrub all the data stores of swh from regularly scrub all the data sources of swh to regularly scrub all the data stores of swh.
Jan 11 2022, 12:32 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
douardda removed a project from T3841: regularly scrub all the data stores of swh: Roadmap 2021.
Jan 11 2022, 12:31 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
douardda triaged T3841: regularly scrub all the data stores of swh as Normal priority.
Jan 11 2022, 12:31 PM · Datastore Scrubber, meta-task, Roadmap 2022, Storage manager
douardda added a comment to T3544: Deal with GitHub removing support for git:// URLs.

I guess this is then related to T3653 somehow

Jan 11 2022, 10:55 AM · Origin-GitHub, Git loader

Jan 6 2022

douardda closed D6882: Remove 'process_timeout' from JournalClient's arguments.
Jan 6 2022, 2:25 PM
douardda committed rDJNL0d115993e0f1: Remove 'process_timeout' from JournalClient's arguments (authored by douardda).
Remove 'process_timeout' from JournalClient's arguments
Jan 6 2022, 2:25 PM
douardda updated the diff for D6884: Add support for the rdkafka 'stats_cb' config option in JournalClient.

move the code in get_journal_client

Jan 6 2022, 12:41 PM
douardda updated the diff for D6882: Remove 'process_timeout' from JournalClient's arguments.

make the warning an exception, as suggested by vlorentz

Jan 6 2022, 12:34 PM
douardda requested review of D6884: Add support for the rdkafka 'stats_cb' config option in JournalClient.
Jan 6 2022, 12:20 PM
douardda added inline comments to D6882: Remove 'process_timeout' from JournalClient's arguments.
Jan 6 2022, 12:15 PM
douardda requested review of D6882: Remove 'process_timeout' from JournalClient's arguments.
Jan 6 2022, 11:49 AM

Jan 4 2022

douardda added a comment to D6875: Add statsd metrics in JournalClient.process.
In D6875#178755, @olasd wrote:

statsd.timing/statsd.timed do full histograms. Do we really want to keep bucketed counts for all of these values, or just a running total?

Jan 4 2022, 4:46 PM
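
The trade-off raised in the D6875 comment above can be illustrated with a small, library-free sketch. It only contrasts the two aggregation styles; the sample durations and bucket bounds are invented, and real statsd exporters typically use cumulative buckets rather than the per-bucket counts shown here.

```python
from bisect import bisect_left

# Hypothetical durations (in seconds) of five processing calls.
durations = [0.02, 0.4, 0.07, 1.5, 0.3]

# Histogram-style aggregation: one counter per bucket, which means one
# stored series per bucket upper bound.
bucket_bounds = [0.1, 0.5, 1.0, float("inf")]
bucket_counts = [0] * len(bucket_bounds)
for duration in durations:
    # Index of the first bucket whose upper bound is >= duration.
    bucket_counts[bisect_left(bucket_bounds, duration)] += 1

# Running-total aggregation: only two values are kept, whatever the
# distribution of the durations looks like.
total_seconds = 0.0
call_count = 0
for duration in durations:
    total_seconds += duration
    call_count += 1

mean_seconds = total_seconds / call_count
```

The histogram answers questions such as "how many calls took longer than one second?", at the cost of one series per bucket; the running total only yields a mean.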
douardda added inline comments to D6875: Add statsd metrics in JournalClient.process.
Jan 4 2022, 4:45 PM
douardda added a comment to D6839: utils: Add a function to parse a subversion external definition.

Gawd this is horrible! (not your fault!)

Jan 4 2022, 4:12 PM
douardda added a comment to D6874: docker/conf/loader: Configure storage with retry proxy.

I'd feel more comfortable also if we have a good (aka documented and understood) reason for doing this.

Jan 4 2022, 3:52 PM
douardda requested review of D6875: Add statsd metrics in JournalClient.process.
Jan 4 2022, 3:02 PM
douardda closed D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Jan 4 2022, 2:54 PM
douardda committed rDOBJSRPLa2d1aa994400: Move the 'error_reporter' config entry in a dedicated 'replayer' section (authored by douardda).
Move the 'error_reporter' config entry in a dedicated 'replayer' section
Jan 4 2022, 2:54 PM
douardda added inline comments to D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Jan 4 2022, 2:52 PM
douardda updated the diff for D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.

vlorentz' comment + update copyright timestamps

Jan 4 2022, 2:49 PM
douardda added inline comments to D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Jan 4 2022, 2:46 PM
douardda closed D6817: Move the 'error_reporter' config entry in a dedicated 'replayer' section.

closed by 259bf6fe1e3bacbcd2e91f8f3d55d49f5219892c

Jan 4 2022, 2:44 PM
douardda committed rDSTO259bf6fe1e3b: Improve documentation of the replay command (authored by douardda).
Improve documentation of the replay command
Jan 4 2022, 12:08 PM
douardda committed rDSTO1071781d8483: Move the 'error_reporter' config entry in a dedicated 'replayer' section (authored by douardda).
Move the 'error_reporter' config entry in a dedicated 'replayer' section
Jan 4 2022, 12:08 PM

Jan 3 2022

douardda added a comment to T3134: SWHID v2.

wishlist: it would be nice to be able to check the whole hash of a revision/release even when the author name/email are replaced by a hash (e.g. by making SWHIDv2 a tree hash).

Jan 3 2022, 12:02 PM · Roadmap 2022, Roadmap 2020, Data Model, Web app, meta-task, Roadmap 2021

Dec 16 2021

douardda added a comment to T3477: Add alerting when the copy to S3 starts lagging.

I guess https://grafana.softwareheritage.org/d/d3l2oqXWz/s3-object-copy?orgId=1 is almost an answer to this task

Dec 16 2021, 3:15 PM · Roadmap 2021, System administration

Dec 13 2021

douardda accepted D6815: postgresql: Fix one-by-one error in db_to_date on negative dates.

LGTM (but see my comment)

Dec 13 2021, 10:21 AM

Dec 10 2021

douardda accepted D6714: Add support to flatten directories in the isochrone frontiers separately.

Same comment as in D6712

Dec 10 2021, 5:24 PM
douardda accepted D6712: Add explicit flag for flattenned directories to `ProvenanceStorageInterface`.
Dec 10 2021, 5:23 PM
douardda added inline comments to D6714: Add support to flatten directories in the isochrone frontiers separately.
Dec 10 2021, 4:30 PM
douardda added inline comments to D6712: Add explicit flag for flattenned directories to `ProvenanceStorageInterface`.
Dec 10 2021, 4:30 PM
douardda added inline comments to D6712: Add explicit flag for flattenned directories to `ProvenanceStorageInterface`.
Dec 10 2021, 4:12 PM
douardda updated subscribers of D6714: Add support to flatten directories in the isochrone frontiers separately.
Dec 10 2021, 4:09 PM
douardda requested changes to D6712: Add explicit flag for flattenned directories to `ProvenanceStorageInterface`.

It would be really nice to have a better explanation of what this flag is added for and why. Why do we add complexity in the code for this? I know there are good reasons for that, but I cannot see them just by reading the code or the commit message.

Dec 10 2021, 3:48 PM
douardda accepted D6717: Add new flag to skip directory flattening while processing revisions.
Dec 10 2021, 3:16 PM
douardda accepted D6746: Unify frontier definition between track-all vs track-first strategies.

There is a small typo in the commit message (adn instead of and)

Dec 10 2021, 3:13 PM

Dec 9 2021

douardda requested review of D6817: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Dec 9 2021, 5:33 PM
douardda requested review of D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Dec 9 2021, 5:24 PM

Dec 6 2021

douardda added inline comments to D6745: fix expansion of the Log keyword with rsync origins.
Dec 6 2021, 10:05 AM

Dec 3 2021

douardda committed rDOBJSRPL5da286e8d1db: Updated debian changelog for version 0.3.1-2 (authored by douardda).
Updated debian changelog for version 0.3.1-2
Dec 3 2021, 12:20 PM
douardda committed rDOBJSRPL093687737c11: Add forgotten build-dependency on redis-server and python3-pytest-redis (authored by douardda).
Add forgotten build-dependency on redis-server and python3-pytest-redis
Dec 3 2021, 12:15 PM