douardda (David Douard)
User

User Details

User Since
Jul 10 2018, 12:38 PM (145 w, 6 h)

Recent Activity

Today

douardda closed T3229: Update "software architecture" image as Resolved by committing rDDOCbf30a9270a87: Update the general architecture diagram.
Tue, Apr 20, 5:27 PM · Documentation
douardda closed D5558: Update the general architecture diagram.
Tue, Apr 20, 5:27 PM
douardda committed rDDOCbf30a9270a87: Update the general architecture diagram (authored by douardda).
Update the general architecture diagram
Tue, Apr 20, 5:27 PM
douardda added a comment to T3087: Implement support for takedown notices (infra, admin tools, workflow).

So what about exports of the archive available on git-annex?

Tue, Apr 20, 11:09 AM · meta-task, Roadmap 2021, Web app
douardda added a comment to T3246: Document takedown request processing workflow.

do we also intend to have a takedown topic on kafka?

Tue, Apr 20, 11:08 AM · Archive content, Roadmap 2021
douardda added a comment to T1481: add metric to monitor "save code now" efficiency.

Note that there is the same transient vs cumulative discrepancy on the "Accepted requests" graph.

Tue, Apr 20, 11:06 AM · SaveCodeNow, System administration, Metrics/monitoring
douardda added a comment to T1481: add metric to monitor "save code now" efficiency.

I think the "submitted requests per visit type / status" graph should be split into 2 parts. Both accepted and rejected are cumulative values that will grow indefinitely, while pending is a transient value that should stay near zero, so it makes no sense to have them on the same graph.

Tue, Apr 20, 11:02 AM · SaveCodeNow, System administration, Metrics/monitoring
douardda added a comment to T1481: add metric to monitor "save code now" efficiency.

I think the "submitted requests per visit type / status" graph should be split into 2 parts. Both accepted and rejected are cumulative values that will grow indefinitely, while pending is a transient value that should stay near zero, so it makes no sense to have them on the same graph.

Tue, Apr 20, 11:00 AM · SaveCodeNow, System administration, Metrics/monitoring
douardda added a comment to T3084: Fast track save code now requests.

is there a grafana dashboard dedicated to this queue?

Tue, Apr 20, 10:55 AM · System administration, Web app
douardda changed the status of T3227: DB Schema link broken in docs under swh-storage. from Duplicate to Resolved.
Tue, Apr 20, 10:52 AM · Easy hack, Documentation
douardda requested review of D5558: Update the general architecture diagram.
Tue, Apr 20, 10:51 AM
douardda added a revision to T3229: Update "software architecture" image: D5558: Update the general architecture diagram.
Tue, Apr 20, 10:51 AM · Documentation
douardda closed D5553: Update black to 20.8b1.
Tue, Apr 20, 9:44 AM
douardda committed rDPROV8009af31a914: Update black to 20.8b1 (authored by douardda).
Update black to 20.8b1
Tue, Apr 20, 9:44 AM

Yesterday

douardda updated subscribers of D5553: Update black to 20.8b1.

Looks good to me. It's true the black version we are using is quite outdated.

We should upgrade black in all other swh repositories in the same manner then.

Mon, Apr 19, 5:46 PM
douardda requested review of D5553: Update black to 20.8b1.
Mon, Apr 19, 5:06 PM
douardda committed rDPROV982e2c1a2a9a: Add synthetic test files for the mindepth=2 heuristic (authored by douardda).
Add synthetic test files for the mindepth=2 heuristic
Mon, Apr 19, 4:56 PM
douardda closed D5389: Improve tests.
Mon, Apr 19, 4:56 PM
douardda closed D5388: Also test the provenance db with ArchiveStorage.
Mon, Apr 19, 4:56 PM
douardda committed rDPROVf2ffd718c468: Rename synthetic result test files (authored by douardda).
Rename synthetic result test files
Mon, Apr 19, 4:56 PM
douardda committed rDPROV594490576b00: Improve and rename test_provenance_db() as test_probenance_heuristics() (authored by douardda).
Improve and rename test_provenance_db() as test_probenance_heuristics()
Mon, Apr 19, 4:56 PM
douardda committed rDPROV5e89689ef08b: Also test the provenance db with ArchiveStorage (authored by douardda).
Also test the provenance db with ArchiveStorage
Mon, Apr 19, 4:56 PM
douardda closed D5387: Refactor the model and simplify a bit origin.py.
Mon, Apr 19, 4:56 PM
douardda committed rDPROV7eaeebb6091a: Simplify a bit origin.py (authored by douardda).
Simplify a bit origin.py
Mon, Apr 19, 4:56 PM
douardda committed rDPROVa23a33c5a77d: Refactor the model (authored by douardda).
Refactor the model
Mon, Apr 19, 4:56 PM
douardda committed rDPROV8853314af981: Add a test for the (noroot, upper) case (authored by douardda).
Add a test for the (noroot, upper) case
Mon, Apr 19, 4:55 PM
douardda closed D5337: Add a test to compare the result of revision_add() with known results.
Mon, Apr 19, 4:55 PM
douardda committed rDPROVadbc99dd357d: Add a test to compare the result of revision_add() with known results (authored by douardda).
Add a test to compare the result of revision_add() with known results
Mon, Apr 19, 4:55 PM
douardda updated the diff for D5389: Improve tests.

rebase

Mon, Apr 19, 4:45 PM
douardda updated the diff for D5388: Also test the provenance db with ArchiveStorage.

rebase

Mon, Apr 19, 4:44 PM
douardda updated the diff for D5387: Refactor the model and simplify a bit origin.py.

rebased

Mon, Apr 19, 4:43 PM
douardda updated the diff for D5337: Add a test to compare the result of revision_add() with known results.

rebased

Mon, Apr 19, 4:42 PM
douardda committed rDPROV62617e500649: Enforce black version 19.10b0 in tox to be consistent with pre-commit (authored by douardda).
Enforce black version 19.10b0 in tox to be consistent with pre-commit
Mon, Apr 19, 4:41 PM
douardda added a comment to T3269: Investigate scheduling policy for fsfe's gitea.

The loading tasks created during this first listing were oneshot tasks. So they have been modified to recurring tasks with something like:

Mon, Apr 19, 3:29 PM
douardda added a comment to T2602: Investigate how to upgrade the schema of the Cassandra storage.

Doesn't this deserve a state-of-the-art kind of thing? Is there documentation material on the subject? How do other (big) cassandra users handle this?

Mon, Apr 19, 2:14 PM · Storage manager
douardda added a comment to T3269: Investigate scheduling policy for fsfe's gitea.

The listing task has been disabled, I think because of failures in the last executions:

Mon, Apr 19, 11:51 AM
douardda triaged T3269: Investigate scheduling policy for fsfe's gitea as High priority.
Mon, Apr 19, 10:19 AM
douardda added a comment to T3084: Fast track save code now requests.

is there a grafana dashboard dedicated to this queue?

Mon, Apr 19, 10:14 AM · System administration, Web app
douardda added a comment to T3246: Document takedown request processing workflow.

also: what about exports we provide on git annex?

Mon, Apr 19, 10:10 AM · Archive content, Roadmap 2021
douardda added a comment to T3246: Document takedown request processing workflow.

do we also intend to have a takedown topic on kafka?

Mon, Apr 19, 10:09 AM · Archive content, Roadmap 2021

Fri, Apr 9

douardda created P1003 (An Untitled Masterwork).
Fri, Apr 9, 4:04 PM

Thu, Apr 8

douardda added a comment to T3198: Mirror: unexpected closed connection to the pg server.

Just got this one below. Note that this occurred just when the replayer actually started to insert objects in the storage (before that, since the start of the replayer process, only kafka scaffolding took place for quite some time, around 30 min!)

Thu, Apr 8, 12:02 PM · Mirror
douardda triaged T3218: The graph replayer generates REQTMOUT Timeout errors as High priority.
Thu, Apr 8, 11:44 AM · Mirror

Wed, Apr 7

douardda added a comment to T3214: Restrict accepted timestamps to values that can be processed all along.

looks like there is no revision with date or committer_date > 9999-12-31 in the main storage...

Wed, Apr 7, 3:04 PM · Data Model
douardda triaged T3214: Restrict accepted timestamps to values that can be processed all along as High priority.
Wed, Apr 7, 2:30 PM · Data Model

Tue, Apr 6

douardda closed T3201: Mirror: unsupported Unicode escape sequence as Resolved by committing rDSTO39507b24d0f4: Make the replayer drop the Revision.metadata.
Tue, Apr 6, 4:42 PM · Mirror
douardda closed D5414: Make the replayer drop the Revision.metadata.
Tue, Apr 6, 4:42 PM
douardda closed T3201: Mirror: unsupported Unicode escape sequence, a subtask of T3197: Mirror: fix common issues of a replayer session, as Resolved.
Tue, Apr 6, 4:42 PM · Mirror
douardda committed rDSTO39507b24d0f4: Make the replayer drop the Revision.metadata (authored by douardda).
Make the replayer drop the Revision.metadata
Tue, Apr 6, 4:42 PM
douardda committed rDSTO84dcbe3d0e56: Merge test_replay's _check_replayed and check_replayed in a single function (authored by douardda).
Merge test_replay's _check_replayed and check_replayed in a single function
Tue, Apr 6, 4:42 PM
douardda closed D5413: Make pg Storage.extid_add() write extid objects to the journal.
Tue, Apr 6, 4:42 PM
douardda committed rDSTO36a7fd34f3ba: Fix pg Storage.extid_add(): write ExtID objects to the journal (authored by douardda).
Fix pg Storage.extid_add(): write ExtID objects to the journal
Tue, Apr 6, 4:42 PM
douardda retitled D5413: Make pg Storage.extid_add() write extid objects to the journal from Make pg Strorage.extid_add() write extid objects to the journal to Make pg Storage.extid_add() write extid objects to the journal.
Tue, Apr 6, 4:33 PM
douardda updated the diff for D5414: Make the replayer drop the Revision.metadata.

fix commit message

Tue, Apr 6, 4:32 PM
douardda added a comment to D5413: Make pg Storage.extid_add() write extid objects to the journal.

Could you add a test for the storage? All other *_add have a journal test IIRC

let me check that

Tue, Apr 6, 4:09 PM
douardda updated the diff for D5414: Make the replayer drop the Revision.metadata.

rebased

Tue, Apr 6, 4:08 PM
douardda updated the diff for D5413: Make pg Storage.extid_add() write extid objects to the journal.

Add explicit checks for extid being written in the journal and split the revision in 2

Tue, Apr 6, 4:07 PM
douardda added a comment to D5413: Make pg Storage.extid_add() write extid objects to the journal.

Could you add a test for the storage? All other *_add have a journal test IIRC

Tue, Apr 6, 3:39 PM
douardda added a comment to D5413: Make pg Storage.extid_add() write extid objects to the journal.

lgtm

(I would have made that 2 commits, each with its own perimeter, 1 for the actual change, 1 to refactor the test, but whatever)

Tue, Apr 6, 3:38 PM
douardda added inline comments to D5414: Make the replayer drop the Revision.metadata.
Tue, Apr 6, 3:33 PM
douardda triaged T3209: Fix swh-scanner for python > 3.7 as High priority.
Tue, Apr 6, 11:55 AM · Code scanner
douardda committed rDSNIP78408668a12f: Add the weekly-planning.sh script (authored by douardda).
Add the weekly-planning.sh script
Tue, Apr 6, 10:19 AM

Fri, Apr 2

douardda requested review of D5414: Make the replayer drop the Revision.metadata.
Fri, Apr 2, 4:26 PM
douardda requested review of D5413: Make pg Storage.extid_add() write extid objects to the journal.
Fri, Apr 2, 4:20 PM
douardda added a comment to T3197: Mirror: fix common issues of a replayer session.

Currently, the mirror test session is running with:

Fri, Apr 2, 10:15 AM · Mirror
douardda added a comment to T3201: Mirror: unsupported Unicode escape sequence.

easy fix: modify the replayer to ignore this 'metadata' column while inserting revisions

Fri, Apr 2, 10:05 AM · Mirror
douardda added a comment to T3201: Mirror: unsupported Unicode escape sequence.
09:45 <+vlorentz> douardda: yes and the only way around it (short of dropping data) is T3089
09:46 -swhbot:#swh-devel- T3089 (submitter: vlorentz, owner: vlorentz, status: Open): Remove the 'metadata' column of the 'revision' table <https://forge.softwareheritage.org/T3089>
09:46 <+vlorentz> or switching to cassandra
09:46 <+vlorentz> the good news is, they couldn't be inserted in the storage either, so you can safely drop them for now
Fri, Apr 2, 9:59 AM · Mirror
douardda triaged T3201: Mirror: unsupported Unicode escape sequence as High priority.
Fri, Apr 2, 9:54 AM · Mirror
douardda triaged T3200: Mirror: year is out of range as High priority.
Fri, Apr 2, 9:51 AM · Mirror
douardda triaged T3199: Mirror: key value violates unique constraint "person_fullname_idx" as High priority.
Fri, Apr 2, 9:48 AM · Mirror
douardda triaged T3198: Mirror: unexpected closed connection to the pg server as High priority.
Fri, Apr 2, 9:47 AM · Mirror
douardda triaged T3197: Mirror: fix common issues of a replayer session as High priority.
Fri, Apr 2, 9:41 AM · Mirror
douardda created P998 (An Untitled Masterwork).
Fri, Apr 2, 9:34 AM

Thu, Apr 1

douardda created P996 (An Untitled Masterwork).
Thu, Apr 1, 12:48 PM
douardda created P995 (An Untitled Masterwork).
Thu, Apr 1, 11:11 AM

Wed, Mar 31

douardda added inline comments to D5387: Refactor the model and simplify a bit origin.py.
Wed, Mar 31, 3:24 PM
douardda added a comment to D5387: Refactor the model and simplify a bit origin.py.

test coverage of the code touched by this diff isn't great

Wed, Mar 31, 3:22 PM
douardda added inline comments to D5388: Also test the provenance db with ArchiveStorage.
Wed, Mar 31, 3:20 PM

Tue, Mar 30

douardda added a comment to D5389: Improve tests.

Why .hex() everywhere? Does swh-provenance use hex strings internally?

Tue, Mar 30, 6:51 PM
douardda added reviewers for D5389: Improve tests: zack, grouss.
Tue, Mar 30, 5:36 PM
douardda requested review of D5389: Improve tests.
Tue, Mar 30, 5:34 PM
douardda requested review of D5388: Also test the provenance db with ArchiveStorage.
Tue, Mar 30, 5:32 PM
douardda requested review of D5387: Refactor the model and simplify a bit origin.py.
Tue, Mar 30, 5:31 PM
douardda added inline comments to D5337: Add a test to compare the result of revision_add() with known results.
Tue, Mar 30, 11:02 AM

Mon, Mar 29

douardda added reviewers for D5337: Add a test to compare the result of revision_add() with known results: zack, grouss.
Mon, Mar 29, 12:20 PM
douardda accepted D5363: extid: remove unicity on (extid_type, extid) and (target_type, target).

looks indeed reasonable (both the 1. point and the code) thanks

Mon, Mar 29, 11:33 AM

Fri, Mar 26

douardda updated the diff for D5337: Add a test to compare the result of revision_add() with known results.

rebase

Fri, Mar 26, 4:21 PM
douardda committed rDPROV4a5a99ea7d20: Add missing mypy.ini entry for iso8601 (authored by douardda).
Add missing mypy.ini entry for iso8601
Fri, Mar 26, 4:20 PM
douardda updated the diff for D5337: Add a test to compare the result of revision_add() with known results.

rebase

Fri, Mar 26, 2:58 PM
douardda committed rDPROV877a8a02b5ed: Add missing dependency on iso8601 (authored by douardda).
Add missing dependency on iso8601
Fri, Mar 26, 2:57 PM
douardda updated the diff for D5337: Add a test to compare the result of revision_add() with known results.

rebased

Fri, Mar 26, 2:49 PM
douardda committed rDPROV41bc4cef338c: Fix invalid extra dependency on swh-core (authored by douardda).
Fix invalid extra dependency on swh-core
Fri, Mar 26, 2:48 PM
douardda updated the diff for D5337: Add a test to compare the result of revision_add() with known results.

apply vlorentz comments

Fri, Mar 26, 12:09 PM
douardda added inline comments to D5337: Add a test to compare the result of revision_add() with known results.
Fri, Mar 26, 11:03 AM
douardda created P991 (An Untitled Masterwork).
Fri, Mar 26, 9:52 AM

Thu, Mar 25

douardda updated the diff for D5337: Add a test to compare the result of revision_add() with known results.

refactor the test a bit

Thu, Mar 25, 3:11 PM
douardda requested review of D5337: Add a test to compare the result of revision_add() with known results.
Thu, Mar 25, 3:04 PM
douardda committed rDPROVeb524713c3d6: Refactor ProvenanceWithPathDB.insert_location() (authored by douardda).
Refactor ProvenanceWithPathDB.insert_location()
Thu, Mar 25, 10:16 AM
douardda committed rDPROV15e390fb66c9: Add tests for revision_add() and content_find_first() (authored by douardda).
Add tests for revision_add() and content_find_first()
Thu, Mar 25, 10:16 AM
douardda committed rDPROV2ec4a0ab9da6: Make ArchivePostgreSQL.directory_ls_internal close the db cursor (authored by douardda).
Make ArchivePostgreSQL.directory_ls_internal close the db cursor
Thu, Mar 25, 10:16 AM