Software Heritage

douardda (David Douard)
User

User Details

User Since
Jul 10 2018, 12:38 PM (184 w, 12 h)

Recent Activity

Fri, Jan 14

douardda committed rMSLDbee9e94855e1: Tech talk at #swh5years (authored by douardda).
Tech talk at #swh5years
Fri, Jan 14, 5:14 PM
douardda added inline comments to D6945: Make the copy process of blob objects run with thread concurrency.
Fri, Jan 14, 5:04 PM
douardda closed D6943: Add support for env var substitution in statsd tags from STATSD_TAGS.
Fri, Jan 14, 11:39 AM
douardda committed rDCOREde9b0c9fb441: Add support for env var substitution in statsd tags from STATSD_TAGS (authored by douardda).
Add support for env var substitution in statsd tags from STATSD_TAGS
Fri, Jan 14, 11:39 AM
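
For illustration only, environment-variable substitution in statsd tags could look roughly like the sketch below. The feed does not show the actual change, so the function name `parse_statsd_tags`, the `key:value,key:value` tag format, and the `$VAR` / `${VAR}` syntax are all assumptions, not the swh.core implementation.

```python
import os
from string import Template
from typing import Dict, Mapping, Optional


def parse_statsd_tags(
    raw: str, env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Parse a 'key:value,key2:value2' string into a dict of tags.

    Hypothetical helper: $VAR and ${VAR} references in tag values are
    replaced using `env` (defaults to os.environ). Unknown variables
    are left untouched rather than raising an error.
    """
    env = os.environ if env is None else env
    tags: Dict[str, str] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        key, sep, value = item.partition(":")
        if not sep:
            raise ValueError(f"invalid tag (expected key:value): {item!r}")
        # safe_substitute leaves undefined variables as-is
        tags[key.strip()] = Template(value.strip()).safe_substitute(env)
    return tags
```

With this sketch, a value such as `host:$HOSTNAME,role:replayer` would pick up the host name from the environment at start-up, while tags without variable references pass through unchanged.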
douardda retitled D6884: Add support for the rdkafka 'stats_cb' config option in JournalClient from [WIP] Add support for the rdkafka 'stats_cb' config option in JournalClient to Add support for the rdkafka 'stats_cb' config option in JournalClient.
Fri, Jan 14, 10:52 AM
douardda requested review of D6944: Add a few statsd metrics in the kafka journal client.
Fri, Jan 14, 10:50 AM
douardda abandoned D6875: Add statsd metrics in JournalClient.process.
Fri, Jan 14, 10:40 AM
douardda updated the diff for D6943: Add support for env var substitution in statsd tags from STATSD_TAGS.

improve comment as suggested by ardumont

Fri, Jan 14, 10:29 AM

Thu, Jan 13

douardda updated the diff for D6945: Make the copy process of blob objects run with thread concurrency.

Add the cli option to configure this concurrency value

Thu, Jan 13, 4:23 PM
douardda requested review of D6945: Make the copy process of blob objects run with thread concurrency.
Thu, Jan 13, 4:08 PM
douardda requested review of D6943: Add support for env var substitution in statsd tags from STATSD_TAGS.
Thu, Jan 13, 3:49 PM

Wed, Jan 12

douardda accepted D6889: cassandra: Make content_missing run in linear time instead of quadratic.
Wed, Jan 12, 11:32 AM
douardda accepted D6888: cassandra: Rewrite content_missing to run queries concurrently..

Fine for me (but please give a bit more insight)

Wed, Jan 12, 11:28 AM

Tue, Jan 11

douardda renamed T3841: regularly scrub all the data stores of swh from regularly scrub all the data sources of swh to regularly scrub all the data stores of swh.
Tue, Jan 11, 12:32 PM · Storage manager, meta-task
douardda removed a project from T3841: regularly scrub all the data stores of swh: Roadmap 2021.
Tue, Jan 11, 12:31 PM · Storage manager, meta-task
douardda triaged T3841: regularly scrub all the data stores of swh as Normal priority.
Tue, Jan 11, 12:31 PM · Storage manager, meta-task
douardda added a comment to T3544: Deal with GitHub removing support for git:// URLs.

I guess this is then related to T3653 somehow

Tue, Jan 11, 10:55 AM · Origin-GitHub, Git loader

Thu, Jan 6

douardda closed D6882: Remove 'process_timeout' from JournalClient's arguments.
Thu, Jan 6, 2:25 PM
douardda committed rDJNL0d115993e0f1: Remove 'process_timeout' from JournalClient's arguments (authored by douardda).
Remove 'process_timeout' from JournalClient's arguments
Thu, Jan 6, 2:25 PM
douardda updated the diff for D6884: Add support for the rdkafka 'stats_cb' config option in JournalClient.

move the code in get_journal_client

Thu, Jan 6, 12:41 PM
douardda updated the diff for D6882: Remove 'process_timeout' from JournalClient's arguments.

make the warning an exception, as suggested by vlorentz

Thu, Jan 6, 12:34 PM
douardda requested review of D6884: Add support for the rdkafka 'stats_cb' config option in JournalClient.
Thu, Jan 6, 12:20 PM
douardda added inline comments to D6882: Remove 'process_timeout' from JournalClient's arguments.
Thu, Jan 6, 12:15 PM
douardda requested review of D6882: Remove 'process_timeout' from JournalClient's arguments.
Thu, Jan 6, 11:49 AM

Tue, Jan 4

douardda added a comment to D6875: Add statsd metrics in JournalClient.process.
In D6875#178755, @olasd wrote:

statsd.timing/statsd.timed do full histograms. Do we really want to keep bucketed counts for all of these values, or just a running total?

Tue, Jan 4, 4:46 PM
douardda added inline comments to D6875: Add statsd metrics in JournalClient.process.
Tue, Jan 4, 4:45 PM
douardda added a comment to D6839: utils: Add a function to parse a subversion external definition.

Gawd this is horrible! (not your fault!)

Tue, Jan 4, 4:12 PM
douardda added a comment to D6874: docker/conf/loader: Configure storage with retry proxy.

I'd also feel more comfortable if we had a good (i.e. documented and understood) reason for doing this.

Tue, Jan 4, 3:52 PM
douardda requested review of D6875: Add statsd metrics in JournalClient.process.
Tue, Jan 4, 3:02 PM
douardda closed D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Tue, Jan 4, 2:54 PM
douardda committed rDOBJSRPLa2d1aa994400: Move the 'error_reporter' config entry in a dedicated 'replayer' section (authored by douardda).
Move the 'error_reporter' config entry in a dedicated 'replayer' section
Tue, Jan 4, 2:54 PM
douardda added inline comments to D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Tue, Jan 4, 2:52 PM
douardda updated the diff for D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.

vlorentz' comment + update copyright timestamps

Tue, Jan 4, 2:49 PM
douardda added inline comments to D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Tue, Jan 4, 2:46 PM
douardda closed D6817: Move the 'error_reporter' config entry in a dedicated 'replayer' section.

closed by 259bf6fe1e3bacbcd2e91f8f3d55d49f5219892c

Tue, Jan 4, 2:44 PM
douardda committed rDSTO259bf6fe1e3b: Improve documentation of the replay command (authored by douardda).
Improve documentation of the replay command
Tue, Jan 4, 12:08 PM
douardda committed rDSTO1071781d8483: Move the 'error_reporter' config entry in a dedicated 'replayer' section (authored by douardda).
Move the 'error_reporter' config entry in a dedicated 'replayer' section
Tue, Jan 4, 12:08 PM

Mon, Jan 3

douardda added a comment to T3134: SWHID v2.

wishlist: it would be nice to be able to check the whole hash of a revision/release even when the author name/email are replaced by a hash (e.g. by making SWHIDv2 a tree hash).

Mon, Jan 3, 12:02 PM · Roadmap 2020, Data Model, Web app, meta-task, Roadmap 2021

Dec 16 2021

douardda added a comment to T3477: Add alerting when the copy to S3 starts lagging.

I guess https://grafana.softwareheritage.org/d/d3l2oqXWz/s3-object-copy?orgId=1 is almost an answer to this task

Dec 16 2021, 3:15 PM · Roadmap 2021, System administration

Dec 13 2021

douardda accepted D6815: postgresql: Fix one-by-one error in db_to_date on negative dates.

LGTM (but see my comment)

Dec 13 2021, 10:21 AM

Dec 10 2021

douardda accepted D6714: Add support to flatten directories in the isochrone frontiers separately.

Same comment as in D6712

Dec 10 2021, 5:24 PM
douardda accepted D6712: Add explicit flag for flattenned directories to `ProvenanceStorageInterface`.
Dec 10 2021, 5:23 PM
douardda added inline comments to D6714: Add support to flatten directories in the isochrone frontiers separately.
Dec 10 2021, 4:30 PM
douardda added inline comments to D6712: Add explicit flag for flattenned directories to `ProvenanceStorageInterface`.
Dec 10 2021, 4:30 PM
douardda added inline comments to D6712: Add explicit flag for flattenned directories to `ProvenanceStorageInterface`.
Dec 10 2021, 4:12 PM
douardda updated subscribers of D6714: Add support to flatten directories in the isochrone frontiers separately.
Dec 10 2021, 4:09 PM
douardda requested changes to D6712: Add explicit flag for flattenned directories to `ProvenanceStorageInterface`.

It would be really nice to have a better explanation of what this flag is added for and why. Why do we add complexity in the code for this? I know there are good reasons for that, but I cannot see them just by reading the code or the commit message.

Dec 10 2021, 3:48 PM
douardda accepted D6717: Add new flag to skip directory flattening while processing revisions.
Dec 10 2021, 3:16 PM
douardda accepted D6746: Unify frontier definition between track-all vs track-first strategies.

There is a small typo in the commit message (adn instead of and)

Dec 10 2021, 3:13 PM

Dec 9 2021

douardda requested review of D6817: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Dec 9 2021, 5:33 PM
douardda requested review of D6818: Move the 'error_reporter' config entry in a dedicated 'replayer' section.
Dec 9 2021, 5:24 PM

Dec 6 2021

douardda added inline comments to D6745: fix expansion of the Log keyword with rsync origins.
Dec 6 2021, 10:05 AM

Dec 3 2021

douardda committed rDOBJSRPL5da286e8d1db: Updated debian changelog for version 0.3.1-2 (authored by douardda).
Updated debian changelog for version 0.3.1-2
Dec 3 2021, 12:20 PM
douardda committed rDOBJSRPL093687737c11: Add forgotten build-dependency on redis-server and python3-pytest-redis (authored by douardda).
Add forgotten build-dependency on redis-server and python3-pytest-redis
Dec 3 2021, 12:15 PM

Dec 2 2021

douardda closed D6727: Fix compatibility with tenacity 6.2.
Dec 2 2021, 2:37 PM
douardda committed rDOBJSRPL86b8509fc2c7: Fix compatibility with tenacity 6.2 (authored by douardda).
Fix compatibility with tenacity 6.2
Dec 2 2021, 2:37 PM
douardda committed rDOBJSRPL273675865266: Fix dependencies in d/control (authored by douardda).
Fix dependencies in d/control
Dec 2 2021, 2:36 PM
douardda requested review of D6727: Fix compatibility with tenacity 6.2.
Dec 2 2021, 2:27 PM
douardda abandoned D6492: Add support for pathslicing in seaweedfs backend.

probably won't be used

Dec 2 2021, 1:16 PM
douardda closed D6693: Add support for a redis-based reporter for failed replayed objects.
Dec 2 2021, 1:12 PM
douardda committed rDOBJSRPLa7bd6bc4427b: Add doctrings and comments in test_cli.py (authored by douardda).
Add doctrings and comments in test_cli.py
Dec 2 2021, 1:12 PM
douardda committed rDOBJSRPLf5051ce1cbf4: Add support for a redis-based reporter for failed replayed objects (authored by douardda).
Add support for a redis-based reporter for failed replayed objects
Dec 2 2021, 1:12 PM
douardda closed D6692: Rework the retry and reporting system in replay.py.
Dec 2 2021, 1:11 PM
douardda committed rDOBJSRPL1d8ea80c7d01: Add tests for expected statsd reports during a content replay session (authored by douardda).
Add tests for expected statsd reports during a content replay session
Dec 2 2021, 1:11 PM
douardda committed rDOBJSRPL8098798820bb: Rework the retry and reporting system in replay.py (authored by douardda).
Rework the retry and reporting system in replay.py
Dec 2 2021, 1:11 PM
douardda closed D6724: Add tests for expected statsd reports during a content replay session.
Dec 2 2021, 1:11 PM
douardda committed rDOBJSRPL002443567791: Small code refactoring in test_cli (authored by douardda).
Small code refactoring in test_cli
Dec 2 2021, 1:11 PM
douardda updated the diff for D6693: Add support for a redis-based reporter for failed replayed objects.

rebase

Dec 2 2021, 11:34 AM
douardda updated the diff for D6692: Rework the retry and reporting system in replay.py.

rebase

Dec 2 2021, 11:34 AM
douardda updated the diff for D6724: Add tests for expected statsd reports during a content replay session.

typo and small improvement in the test

Dec 2 2021, 11:32 AM
douardda added a comment to D6692: Rework the retry and reporting system in replay.py.
In D6692#174126, @olasd wrote:

You seem to have dropped the CONTENT_BYTES_METRIC ?

Dec 2 2021, 11:10 AM
douardda updated the diff for D6692: Rework the retry and reporting system in replay.py.

Rebase on D6724 and adapt statsd tests to match new statsd probes

Dec 2 2021, 11:09 AM
douardda requested review of D6724: Add tests for expected statsd reports during a content replay session.
Dec 2 2021, 11:08 AM
douardda updated the diff for D6692: Rework the retry and reporting system in replay.py.

Add tests for expected statsd reports during a content replay session

Dec 2 2021, 11:01 AM
douardda committed rDCORE862e1e51a67f: Port test_statsd.py to pytest and add a statsd test fixture (authored by douardda).
Port test_statsd.py to pytest and add a statsd test fixture
Dec 2 2021, 10:48 AM
douardda closed D6709: Port test_statsd.py to pytest and add a statsd test fixture.
Dec 2 2021, 10:48 AM
douardda committed rDCORE58ff77995512: Add magic to mypy ignored packages (authored by douardda).
Add magic to mypy ignored packages
Dec 2 2021, 10:48 AM
douardda updated the diff for D6709: Port test_statsd.py to pytest and add a statsd test fixture.

Add a docstring in statsd fixture and update copyright dates in pytest_plugin.py

Dec 2 2021, 10:45 AM

Dec 1 2021

douardda updated the diff for D6709: Port test_statsd.py to pytest and add a statsd test fixture.

fix and swap args of 'assert' statements

Dec 1 2021, 5:47 PM
douardda added inline comments to D6709: Port test_statsd.py to pytest and add a statsd test fixture.
Dec 1 2021, 5:39 PM
douardda requested review of D6709: Port test_statsd.py to pytest and add a statsd test fixture.
Dec 1 2021, 2:02 PM
douardda added a comment to D6692: Rework the retry and reporting system in replay.py.
In D6692#174129, @olasd wrote:

Could you make sure that this dashboard https://grafana.softwareheritage.org/d/d3l2oqXWz/s3-object-copy?orgId=1 is not affected (or that its functionality can be replaced easily?)

Dec 1 2021, 11:57 AM
douardda added a comment to D6692: Rework the retry and reporting system in replay.py.
In D6692#174126, @olasd wrote:

You seem to have dropped the CONTENT_BYTES_METRIC ?

Dec 1 2021, 11:47 AM
douardda accepted D6701: objectstorage: http backend: test cannonicalization of URL.

sorry I thought I had accepted it already :-)

Dec 1 2021, 9:54 AM

Nov 29 2021

douardda added a comment to D6697: Add tests for conflict resolution functions.

When I did so with other tests your claim was that they were too difficult to follow and that tests should be as explicit as possible. These ones have a declarative function name for each case. What's the criteria after all?

Nov 29 2021, 12:08 PM
douardda accepted D6697: Add tests for conflict resolution functions.

Fine for me; these tests could be written as parametrized tests (https://docs.pytest.org/en/6.2.x/parametrize.html#parametrize-basics) but not a big deal.

Nov 29 2021, 10:04 AM
douardda accepted D6699: Stop writing swhid2node.bin maps.

Looks overall ok to me, but I miss a few "why?" here:

  • the "why stop writing this file" should be in the commit message,
  • why the need for SortOutputHandler attributes to become final?

Nov 29 2021, 10:00 AM

Nov 26 2021

douardda requested review of D6693: Add support for a redis-based reporter for failed replayed objects.
Nov 26 2021, 1:35 PM
douardda requested review of D6692: Rework the retry and reporting system in replay.py.
Nov 26 2021, 1:34 PM
douardda added a revision to T3693: Provide a mecanism to report (with persistence) objects that fails to get replayed (mirror): D6693: Add support for a redis-based reporter for failed replayed objects.
Nov 26 2021, 1:33 PM · Storage manager

Nov 24 2021

douardda committed rMSLDab780bbec731: Try to improve a bit the swh-dataflow-merkle image (authored by douardda).
Try to improve a bit the swh-dataflow-merkle image
Nov 24 2021, 12:08 PM

Nov 17 2021

douardda committed rCDFP0526bf35c087: Use dnsrr as endpoint_mode for storage and objstorage and document this (authored by douardda).
Use dnsrr as endpoint_mode for storage and objstorage and document this
Nov 17 2021, 11:02 AM
douardda committed rCDFP62767c6937ab: Improve entrypoint files (authored by douardda).
Improve entrypoint files
Nov 17 2021, 11:02 AM
douardda committed rCDFPc4c400dd69f5: Make nginx always resolve upstream server names (authored by douardda).
Make nginx always resolve upstream server names
Nov 17 2021, 11:02 AM
douardda committed rCDFPd131fe26e0e5: Fix the content-replayer grafana dashboard (authored by douardda).
Fix the content-replayer grafana dashboard
Nov 17 2021, 11:02 AM
douardda committed rDSTOba105df3e5a0: d/changelog: version 0.40.0-2~swh1 (authored by douardda).
d/changelog: version 0.40.0-2~swh1
Nov 17 2021, 9:47 AM
douardda committed rDSTO9af9b2641cfc: Update dependencies in d/control (authored by douardda).
Update dependencies in d/control
Nov 17 2021, 9:47 AM

Nov 16 2021

douardda closed D6645: Replace usage of 'stop_after_objects' by 'stop_on_eof' in tests.
Nov 16 2021, 4:28 PM
douardda committed rDJNLa38c6e7e94a5: Replace usage of 'stop_after_objects' by 'stop_on_eof' in tests (authored by douardda).
Replace usage of 'stop_after_objects' by 'stop_on_eof' in tests
Nov 16 2021, 4:28 PM
douardda closed D6644: Fix flakyness in test_client_with_deserializer.
Nov 16 2021, 4:28 PM