
douardda (David Douard)
User

User Details

User Since
Jul 10 2018, 12:38 PM (171 w, 3 d)

Recent Activity

Today

douardda requested review of D6538: Remove the RADOS backend.
Fri, Oct 22, 12:26 PM

Yesterday

douardda closed D6521: Add a simple read-only HTTP backend.
Thu, Oct 21, 2:23 PM
douardda committed rDOBJS8ed5f4ebc915: Add a simple read-only HTTP backend (authored by douardda).
Add a simple read-only HTTP backend
Thu, Oct 21, 2:23 PM
douardda updated the diff for D6521: Add a simple read-only HTTP backend.

rebase

Thu, Oct 21, 2:09 PM
douardda closed D6526: Reorganise the seaweedfs backend in a subpackage.
Thu, Oct 21, 2:06 PM
douardda committed rDOBJSbcbbfd466987: Reorganise the seaweedfs backend in a subpackage (authored by douardda).
Reorganise the seaweedfs backend in a subpackage
Thu, Oct 21, 2:06 PM
douardda closed D6525: Use get_objstorage in seaweedfs tests instead of direct class instanciation.
Thu, Oct 21, 2:06 PM
douardda closed D6524: Add support for deprecation of objstorage cls in factory.
Thu, Oct 21, 2:06 PM
douardda committed rDOBJS38c02dcfae2b: Add support for deprecation of objstorage cls in factory (authored by douardda).
Add support for deprecation of objstorage cls in factory
Thu, Oct 21, 2:06 PM
douardda committed rDOBJS82d9714b0ae5: Use get_objstorage in seaweedfs tests instead of direct class (authored by douardda).
Use get_objstorage in seaweedfs tests instead of direct class
Thu, Oct 21, 2:06 PM
douardda updated the diff for D6526: Reorganise the seaweedfs backend in a subpackage.

and fix the LOGGER.error usage

Thu, Oct 21, 2:03 PM
douardda updated the diff for D6526: Reorganise the seaweedfs backend in a subpackage.

remove hardcoded log levels

Thu, Oct 21, 1:53 PM
douardda added inline comments to D6526: Reorganise the seaweedfs backend in a subpackage.
Thu, Oct 21, 1:49 PM
douardda requested review of D6526: Reorganise the seaweedfs backend in a subpackage.
Thu, Oct 21, 1:09 PM
douardda requested review of D6525: Use get_objstorage in seaweedfs tests instead of direct class instanciation.
Thu, Oct 21, 1:09 PM
douardda requested review of D6524: Add support for deprecation of objstorage cls in factory.
Thu, Oct 21, 1:08 PM
douardda updated the diff for D6521: Add a simple read-only HTTP backend.

Use ReadOnlyObjStorage and NonIterableObjStorage instead of NotImplementedError

Thu, Oct 21, 12:50 PM
douardda added inline comments to D6521: Add a simple read-only HTTP backend.
Thu, Oct 21, 12:24 PM

Wed, Oct 20

douardda added inline comments to D6424: Perfect hashmap C implementation.
Wed, Oct 20, 5:17 PM
douardda updated the diff for D6521: Add a simple read-only HTTP backend.

document the build_objstorage() test helper function

Wed, Oct 20, 5:00 PM
douardda updated the diff for D6521: Add a simple read-only HTTP backend.

remove useless statement

Wed, Oct 20, 4:55 PM
douardda updated the diff for D6521: Add a simple read-only HTTP backend.

remove mistakenly committed mypy.ini file

Wed, Oct 20, 4:53 PM
douardda requested review of D6521: Add a simple read-only HTTP backend.
Wed, Oct 20, 4:53 PM
douardda committed rDOBJS6269067ca7b3: Improve a bit the seaweedfs backend (authored by douardda).
Improve a bit the seaweedfs backend
Wed, Oct 20, 2:40 PM
douardda closed D6517: Improve tests of the seaweedfs backend.
Wed, Oct 20, 2:40 PM
douardda committed rDOBJS55ff4b95d306: Improve tests of the seaweedfs backend (authored by douardda).
Improve tests of the seaweedfs backend
Wed, Oct 20, 2:40 PM
douardda updated the diff for D6492: Add support for pathslicing in seaweedfs backend.

rebase

Wed, Oct 20, 2:36 PM
douardda updated the diff for D6517: Improve tests of the seaweedfs backend.

remove a (comment) garbage line

Wed, Oct 20, 2:35 PM
douardda updated the diff for D6492: Add support for pathslicing in seaweedfs backend.

rebase

Wed, Oct 20, 2:31 PM
douardda updated the diff for D6517: Improve tests of the seaweedfs backend.

slight simplification as suggested by vlorentz

Wed, Oct 20, 2:30 PM
douardda added inline comments to D6517: Improve tests of the seaweedfs backend.
Wed, Oct 20, 2:17 PM
douardda updated the diff for D6492: Add support for pathslicing in seaweedfs backend.

respawn jenkins

Wed, Oct 20, 2:15 PM
douardda updated the diff for D6517: Improve tests of the seaweedfs backend.

respawn jenkins

Wed, Oct 20, 2:13 PM
douardda requested review of D6517: Improve tests of the seaweedfs backend.
Wed, Oct 20, 2:08 PM
douardda updated the diff for D6492: Add support for pathslicing in seaweedfs backend.

rebase

Wed, Oct 20, 12:34 PM
douardda added inline comments to D6492: Add support for pathslicing in seaweedfs backend.
Wed, Oct 20, 12:23 PM
douardda updated the diff for D6492: Add support for pathslicing in seaweedfs backend.

split the diff in 2

Wed, Oct 20, 12:14 PM

Tue, Oct 19

douardda created P1203 (An Untitled Masterwork).
Tue, Oct 19, 11:43 AM

Mon, Oct 18

douardda triaged T3668: Improve the seaweedfs backend as Normal priority.
Mon, Oct 18, 3:34 PM · Object storage
douardda created T3668: Improve the seaweedfs backend.
Mon, Oct 18, 3:33 PM · Object storage
douardda created P1202 (An Untitled Masterwork).
Mon, Oct 18, 2:47 PM
douardda added a comment to T3627: Consider dropping pull request references from the git loader ingestion.

B3 I am not convinced a "synthetic" flag on the Snapshot branch makes sense, or at least I find this name confusing, especially considering we already have a synthetic flag on Revision: it is not synthetic in the sense that it is not an object crafted by SWH; it comes from the origin.

Mon, Oct 18, 11:59 AM · Git loader

Fri, Oct 15

douardda requested review of D6492: Add support for pathslicing in seaweedfs backend.
Fri, Oct 15, 6:35 PM
douardda triaged T3663: Make the swh-environment jenkins job green and activate notifications as High priority.
Fri, Oct 15, 10:45 AM · System administration

Thu, Oct 14

douardda added a comment to T3635: git loader: enable "partial" global deduplication of revisions via the extid mapping table.

Ok, I think what puzzles me in this description is the fact that the first 2 bullets of the "git loader adaptations" are actually only one point: at the end of a successful loading, store a mapping in the extid table.

Thu, Oct 14, 11:23 AM · Git loader

Wed, Oct 13

douardda closed D6442: Extract the path slicing logic in a dedicated PathSlicer class.
Wed, Oct 13, 3:19 PM
douardda committed rDOBJS23b7f81c1483: Extract the path slicing logic in a dedicated PathSlicer class (authored by douardda).
Extract the path slicing logic in a dedicated PathSlicer class
Wed, Oct 13, 3:19 PM

Tue, Oct 12

douardda updated the diff for D6442: Extract the path slicing logic in a dedicated PathSlicer class.

forgotten print statement...

Tue, Oct 12, 5:52 PM
douardda added inline comments to D6442: Extract the path slicing logic in a dedicated PathSlicer class.
Tue, Oct 12, 5:50 PM
douardda committed rDDOC807d63991a8e: sysadm: fill the mirror deployment section (authored by douardda).
sysadm: fill the mirror deployment section
Tue, Oct 12, 4:53 PM
douardda committed rDDOCfefabca8e6d3: conf: add swh-sysadm intershpinx mapping entry (authored by douardda).
conf: add swh-sysadm intershpinx mapping entry
Tue, Oct 12, 2:27 PM
douardda committed rDDOCe6ebb39c4b6d: sysadm: add mirror-operations without content (authored by douardda).
sysadm: add mirror-operations without content
Tue, Oct 12, 2:07 PM

Mon, Oct 11

douardda added a comment to T3627: Consider dropping pull request references from the git loader ingestion.

An alternative to annotating synthetic refs: add a "type" or "forge_type" attribute to snapshots.

Mon, Oct 11, 12:33 PM · Git loader
douardda added a comment to T3632: Investigate the ContentDisallowed exception.

What's the difference in deployed dependencies versions (staging vs. prod)?

Mon, Oct 11, 12:15 PM · Scheduling utilities
douardda added a comment to T3621: Create a production read-only objstorage.

For ENEA I'd like to test different scenarios for the source objstorage:

Mon, Oct 11, 12:12 PM · System administration
douardda added a comment to T3592: POC elastic worker infrastructure.

just a quick remark about the scheduling of (sub)tasks of this task: IMHO the autoscaling should come last; all the supervision/monitoring/logging related tasks are much more important than the autoscaling.

Mon, Oct 11, 10:29 AM · System administration

Fri, Oct 8

douardda updated the diff for D6442: Extract the path slicing logic in a dedicated PathSlicer class.

Better docstrings and kill a few map()

Fri, Oct 8, 4:19 PM
douardda added inline comments to D6442: Extract the path slicing logic in a dedicated PathSlicer class.
Fri, Oct 8, 3:53 PM
douardda closed D6444: docker: configure and document the APP evironment variable for celery.
Fri, Oct 8, 3:51 PM
douardda committed rDENVeefd5e532124: docker: configure and document the APP evironment variable for celery (authored by douardda).
docker: configure and document the APP evironment variable for celery
Fri, Oct 8, 3:51 PM
douardda accepted D6443: buffer: add a threshold for the number of directory entries in one batch.

Thx

Fri, Oct 8, 3:46 PM
douardda updated the diff for D6444: docker: configure and document the APP evironment variable for celery.

be a bit more consistent...

Fri, Oct 8, 3:29 PM
douardda added a comment to D6410: Allow application/x-msgpack deserialization again.

as @vlorentz pointed out [1], this change should be irrelevant though...

[1] https://github.com/celery/kombu/blob/master/kombu/serialization.py#L369-L372

does not seem to be the proper fix.

FTR, using the celery cli tool directly from a development venv to interact with the celery server running in the docker compose test setup (as described there) used to work ok, but does not any more.

One has to specify the app, like:

celery --app=swh.scheduler.celery_backend.config.app status

[edit] I use celery 5.1.2 in my venv.

Fri, Oct 8, 3:27 PM
douardda requested review of D6444: docker: configure and document the APP evironment variable for celery.
Fri, Oct 8, 3:25 PM
douardda added a comment to T3632: Investigate the ContentDisallowed exception.

Unless I'm mistaken, this error does not appear in sentry any more, right?

Fri, Oct 8, 3:06 PM · Scheduling utilities
douardda added a comment to D6410: Allow application/x-msgpack deserialization again.

as @vlorentz pointed out [1], this change should be irrelevant though...

[1] https://github.com/celery/kombu/blob/master/kombu/serialization.py#L369-L372

Fri, Oct 8, 3:04 PM
douardda accepted D6427: swh.storage filter/buffer improvements.

looks fine to me

Fri, Oct 8, 2:53 PM
douardda accepted D6428: docs: Add a save forge documentation.

Ok but see my 2 (nitpicky) comments

Fri, Oct 8, 2:49 PM
douardda accepted D6431: Rename imports of swh.model.identifiers to fix deprecation warnings..

LGTM thx

Fri, Oct 8, 2:46 PM
douardda added a comment to T3104: Persistent readonly perfect hash table.

See https://forge.softwareheritage.org/source/swh-perfecthash/

Fri, Oct 8, 2:43 PM · Object storage
douardda updated the diff for D6442: Extract the path slicing logic in a dedicated PathSlicer class.

allow the pathslicer to be a noop (with an empty slicing)

Fri, Oct 8, 2:12 PM
douardda requested review of D6442: Extract the path slicing logic in a dedicated PathSlicer class.
Fri, Oct 8, 2:10 PM

Thu, Oct 7

douardda accepted D6401: Filter out pull request related branches.

LGTM

Thu, Oct 7, 9:32 AM

Wed, Oct 6

douardda added a comment to T3627: Consider dropping pull request references from the git loader ingestion.

FTR without D6401, the packfile received from GH for the CocoaPods/Specs repo contains 21162 references, 21146 of which start with /refs/pull/ and 7126 of which end with /merge (even though those have explicitly not been requested, thanks to the filtering in RepoRepresentation.determine_wanted()).
When D6401 is applied, we only get the 20-ish references that are not pull request related.
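
A minimal sketch of this kind of reference filtering, for illustration only. It assumes references are available as a mapping from ref name (bytes) to target id and that pull request refs use the GitHub-style `refs/pull/` prefix; the function name `filter_pull_request_refs` and the sample data are hypothetical and are not taken from the git loader or from D6401.

```python
from typing import Dict

# Assumption: GitHub-style prefix for pull request references.
PULL_REQUEST_PREFIX = b"refs/pull/"


def filter_pull_request_refs(refs: Dict[bytes, bytes]) -> Dict[bytes, bytes]:
    """Return only the refs that are not pull request related (hypothetical helper)."""
    return {
        name: target
        for name, target in refs.items()
        if not name.startswith(PULL_REQUEST_PREFIX)
    }


# Made-up sample data: two regular refs and two pull request refs.
refs = {
    b"refs/heads/master": b"a" * 40,
    b"refs/tags/v1.0": b"b" * 40,
    b"refs/pull/1/head": b"c" * 40,
    b"refs/pull/1/merge": b"d" * 40,
}
kept = filter_pull_request_refs(refs)
print(sorted(kept))
```

Filtering on the ref name before asking the remote for objects is what would keep the number of fetched references small; whether the real change does it exactly this way is not stated here.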

Wed, Oct 6, 2:56 PM · Git loader

Tue, Oct 5

douardda committed rMSLDfef6e8ca5b60: EOSC-Pillar F2F meeting: presentation of the UC6.4 (authored by douardda).
EOSC-Pillar F2F meeting: presentation of the UC6.4
Tue, Oct 5, 5:03 PM
douardda added a comment to T3633: staging/production - Kafka access for ENEA mirror.

token for the prod will be needed after that as well, thanks

Tue, Oct 5, 3:40 PM · System administration
douardda committed rCDFPe00b10ea28c8: Fix memcache config in web.yml (authored by douardda).
Fix memcache config in web.yml
Tue, Oct 5, 2:32 PM
douardda committed rCDFP44d8b4cad1ed: Fix replayers' entrypoint script (authored by douardda).
Fix replayers' entrypoint script
Tue, Oct 5, 2:32 PM
douardda committed rCDFP1831360b6c84: Improve posgresql config, especially for swh-web (authored by douardda).
Improve posgresql config, especially for swh-web
Tue, Oct 5, 2:32 PM
douardda committed rCDFP5de6a2ec92ea: Dockerfile: install postgresql-client in swh-web image (authored by douardda).
Dockerfile: install postgresql-client in swh-web image
Tue, Oct 5, 2:32 PM
douardda committed rCDFPb466ad7a743f: Improve nginx config (authored by douardda).
Improve nginx config
Tue, Oct 5, 2:32 PM
douardda committed rCDFPc36f34d1e137: Add support for postgresql as swh-web database (authored by douardda).
Add support for postgresql as swh-web database
Tue, Oct 5, 2:32 PM
douardda committed rCDFPf1cf061a3177: Add explicit rw and Z to volume definitions (authored by Jonas Eriksson <jonas.eriksson@fossid.com>).
Add explicit rw and Z to volume definitions
Tue, Oct 5, 2:32 PM
douardda committed rCDFP0fb09c414448: Storage conf: Point to correct objstorage port (authored by Jonas Eriksson <jonas.eriksson@fossid.com>).
Storage conf: Point to correct objstorage port
Tue, Oct 5, 2:32 PM
douardda closed D6403: docker: use a dedicated container for the cron-like job of swh-web.
Tue, Oct 5, 10:56 AM
douardda committed rDENVaf0a2af3e7c8: docker: use a dedicated container for the cron-like job of swh-web (authored by douardda).
docker: use a dedicated container for the cron-like job of swh-web
Tue, Oct 5, 10:56 AM
douardda closed D6402: docker: Do not limit the list of task types handled by swh-scheduler-runner-priority.
Tue, Oct 5, 10:56 AM
douardda committed rDENVebb07bdae059: docker: Do not limit the list of task types handled by swh-scheduler-runner… (authored by douardda).
docker: Do not limit the list of task types handled by swh-scheduler-runner…
Tue, Oct 5, 10:56 AM
douardda added a comment to D6165: Add new RabbitMQ-based client/server API.

Also there is no real value in keeping 3 revisions: the last 2 revisions actually improve/modify the code from the first revision.

Tue, Oct 5, 10:47 AM
douardda added a comment to D6339: Add support for remote backend on existing storage tests.

this should be squashed with the previous diff, and still my previous question about .gitignore

Tue, Oct 5, 10:45 AM
douardda accepted D6165: Add new RabbitMQ-based client/server API.

As others (and I) said, this must come with actual documentation.
As is, I have a hard time understanding how this actually works (even after reading the document in HedgeDoc).

Tue, Oct 5, 10:39 AM
douardda updated the diff for D6403: docker: use a dedicated container for the cron-like job of swh-web.

indent...

Tue, Oct 5, 10:01 AM
douardda updated the diff for D6403: docker: use a dedicated container for the cron-like job of swh-web.

improve entrypoint script to properly handle a SIGTERM

Tue, Oct 5, 9:59 AM
douardda accepted D6334: Add `close` method to both `ProvenanceInterface` and `ProvenanceStorageInterface`.
Tue, Oct 5, 9:34 AM
douardda added a comment to D6334: Add `close` method to both `ProvenanceInterface` and `ProvenanceStorageInterface`.

looks ok to me. Just one question: why do you need __future__.annotations?

Tue, Oct 5, 9:34 AM

Mon, Oct 4

douardda requested review of D6403: docker: use a dedicated container for the cron-like job of swh-web.
Mon, Oct 4, 5:33 PM
douardda requested review of D6402: docker: Do not limit the list of task types handled by swh-scheduler-runner-priority.
Mon, Oct 4, 5:33 PM
douardda accepted D6387: type_validator: Re-allow subclasses.

Oh well...

Mon, Oct 4, 4:04 PM
douardda created P1195 (An Untitled Masterwork).
Mon, Oct 4, 3:31 PM
douardda added a comment to T3611: Define the mapping for Bazaar repositories/branches to the SWH data model.

Ideally this doc would (briefly) describe how bazaar works and how it is different from already supported DVCS, then document the chosen "mapping" of the bzr model into swh (especially mentioning what is lost in the process).

Mon, Oct 4, 11:43 AM · Data Model, BZR loader