
Apr 28 2021

douardda committed rDSTO98804f9e1262: Add a fixer for ExtrinsicRawMetadata (authored by douardda).
Add a fixer for ExtrinsicRawMetadata
Apr 28 2021, 3:00 PM
douardda added inline comments to D5634: Add a fixer for ExtrinsicRawMetadata.
Apr 28 2021, 2:16 PM
douardda updated the diff for D5634: Add a fixer for ExtrinsicRawMetadata.

remove the pprint from the doctest

Apr 28 2021, 2:14 PM
douardda added inline comments to D5634: Add a fixer for ExtrinsicRawMetadata.
Apr 28 2021, 2:11 PM
douardda requested review of D5634: Add a fixer for ExtrinsicRawMetadata.
Apr 28 2021, 12:44 PM
douardda created P1027 (An Untitled Masterwork).
Apr 28 2021, 12:34 PM
douardda created P1026 (An Untitled Masterwork).
Apr 28 2021, 12:27 PM
douardda created P1025 (An Untitled Masterwork).
Apr 28 2021, 12:23 PM

Apr 27 2021

douardda closed D5589: Fix swh_model_data hardcoded id values.
Apr 27 2021, 10:42 AM
douardda committed rDMOD446bd2b167c3: Fix swh_model_data hardcoded id values (authored by douardda).
Fix swh_model_data hardcoded id values
Apr 27 2021, 10:42 AM
douardda added a comment to D5589: Fix swh_model_data hardcoded id values.

if you are convinced it's a report covering issue, ok then ;)

Apr 27 2021, 10:41 AM
douardda accepted D5584: cassandra: Add a test of a 'complex' migration, with a PK update.

I would not say all this is crystal clear to me, but overall looks fine.

Apr 27 2021, 10:34 AM
douardda added inline comments to D5582: cassandra: Add 'allow_overwrite' option, to allow updating objects.
Apr 27 2021, 10:26 AM
douardda accepted D5614: tarball: properly normalize perms for all extracted files.

lgtm

Apr 27 2021, 10:16 AM
douardda added a comment to D5614: tarball: properly normalize perms for all extracted files.

Why only 0o100644 and 0o100755? I don't think we should make the package loaders discard this information just because Git does.

Apr 27 2021, 10:16 AM
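
For context on the comment above: a minimal sketch of what collapsing file permissions to the two regular-file modes 0o100644 and 0o100755 could look like. The helper name `normalize_mode` is hypothetical and not taken from the swh code base, and deciding on the owner-execute bit alone is an assumption; the actual change in D5614 may differ.

```python
import stat


def normalize_mode(mode: int) -> int:
    """Hypothetical helper: collapse a regular file's mode to 0o100644 or 0o100755."""
    if not stat.S_ISREG(mode):
        # Directories, symlinks, etc. are left untouched in this sketch.
        return mode
    # Assumption: only the owner-execute bit decides whether the file is executable.
    if mode & stat.S_IXUSR:
        return 0o100755
    return 0o100644


# Distinct source modes end up identical, which is the information loss
# the comment above is asking about.
print(oct(normalize_mode(0o100600)), oct(normalize_mode(0o100664)))
```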
douardda added inline comments to D5589: Fix swh_model_data hardcoded id values.
Apr 27 2021, 10:12 AM

Apr 26 2021

douardda updated the diff for D3334: Add a new TenaciousProxyStorage.

Rebased and updated to current HEAD

Apr 26 2021, 4:23 PM
douardda committed rDSTO2c477ec442b7: Fix storage_data hardcoded id values (authored by douardda).
Fix storage_data hardcoded id values
Apr 26 2021, 4:19 PM
douardda closed D5587: Fix storage_data hardcoded id values.
Apr 26 2021, 4:19 PM
douardda added inline comments to D5589: Fix swh_model_data hardcoded id values.
Apr 26 2021, 4:06 PM
douardda requested review of D5587: Fix storage_data hardcoded id values.
Apr 26 2021, 2:35 PM
douardda added inline comments to D5589: Fix swh_model_data hardcoded id values.
Apr 26 2021, 2:29 PM
douardda committed rDENVf9991bc8ed2b: Remove most of the README content and point to the Developer setup page (authored by douardda).
Remove most of the README content and point to the Developer setup page
Apr 26 2021, 1:57 PM
douardda closed D5585: Remove most of the README content and point to the Developer setup page.
Apr 26 2021, 1:57 PM
douardda updated the diff for D5585: Remove most of the README content and point to the Developer setup page.

rebase

Apr 26 2021, 1:55 PM

Apr 23 2021

douardda requested review of D5589: Fix swh_model_data hardcoded id values.
Apr 23 2021, 5:27 PM
douardda updated the diff for D5585: Remove most of the README content and point to the Developer setup page.

also add link to the docker page

Apr 23 2021, 12:10 PM
douardda requested review of D5585: Remove most of the README content and point to the Developer setup page.
Apr 23 2021, 11:57 AM
douardda added a comment to T3283: Create a vm to test the mirror environment.

it works! thx

Apr 23 2021, 11:06 AM · System administration

Apr 22 2021

douardda created P1014 (An Untitled Masterwork).
Apr 22 2021, 5:37 PM
douardda created P1013 (An Untitled Masterwork).
Apr 22 2021, 5:33 PM
douardda updated the task description for T3281: Create a list of known test/buggy repos and use them in loader/storage tests.
Apr 22 2021, 2:59 PM
douardda added a subtask for T1957: Handling missing DAG nodes: T3282: Add support for "uninterpreted upstream object" in SWH model and storage.
Apr 22 2021, 2:44 PM · Data Model
douardda added a parent task for T3282: Add support for "uninterpreted upstream object" in SWH model and storage: T1957: Handling missing DAG nodes.
Apr 22 2021, 2:44 PM · Data Model
douardda reopened T3282: Add support for "uninterpreted upstream object" in SWH model and storage as "Open".

actually no, it's not the same...

Apr 22 2021, 2:43 PM · Data Model
douardda added a comment to T1957: Handling missing DAG nodes.

Examples of such missing objects are revisions with attributes that cannot fit the current data model, e.g. out-of-range dates. We have examples of such revisions in Kafka, as mentioned in T3200 and T3170.

Apr 22 2021, 2:39 PM · Data Model
douardda closed T3282: Add support for "uninterpreted upstream object" in SWH model and storage as Wontfix.

Same as T1957

Apr 22 2021, 2:37 PM · Data Model
douardda renamed T3282: Add support for "uninterpreted upstream object" in SWH model and storage from Add support for "uninterpreted upstream object" in our model and storage to Add support for "uninterpreted upstream object" in SWH model and storage.
Apr 22 2021, 2:33 PM · Data Model
douardda created T3282: Add support for "uninterpreted upstream object" in SWH model and storage.
Apr 22 2021, 2:32 PM · Data Model
douardda triaged T3281: Create a list of known test/buggy repos and use them in loader/storage tests as Normal priority.
Apr 22 2021, 12:25 PM
douardda added a comment to T3200: Mirror: year is out of range.

These revisions are probably coming from https://gitlab.com/gitlab-org/gitlab-test (or a clone)

Apr 22 2021, 12:10 PM · Mirror
douardda added a comment to T3200: Mirror: year is out of range.

Ah fun, one of the revisions with this problem on staging (ba3343bc4fa403a8dfbfcab7fc1a8c29ee34bd69) seems to have been crafted by https://gitlab.com/gitlab-org/gitlab-foss/-/blob/staging-26-fix_add_deploy_key_spec/spec/models/merge_request_diff_commit_spec.rb

Apr 22 2021, 12:05 PM · Mirror

Apr 21 2021

douardda added a comment to T3170: Revisions in the journal with out of range dates.

Note that none of their parent revisions can be found in the archive either (I suppose one invalid revision in a set of ingested revisions prevents any of them from being inserted in the database, but they are already inserted in Kafka at that point).

Apr 21 2021, 7:08 PM · Data Model, Journal
douardda added a comment to T3200: Mirror: year is out of range.

See T3170 (error generated by the same invalid kafka messages).

Apr 21 2021, 6:58 PM · Mirror
douardda created P1010 (An Untitled Masterwork).
Apr 21 2021, 10:27 AM

Apr 20 2021

douardda closed T3229: Update "software architecture" image as Resolved by committing rDDOCbf30a9270a87: Update the general architecture diagram.
Apr 20 2021, 5:27 PM · Documentation
douardda closed D5558: Update the general architecture diagram.
Apr 20 2021, 5:27 PM
douardda committed rDDOCbf30a9270a87: Update the general architecture diagram (authored by douardda).
Update the general architecture diagram
Apr 20 2021, 5:27 PM
douardda added a comment to T3087: Implement support for takedown notices (infra, admin tools, workflow).

So what about exports of the archive available on git-annex?

Apr 20 2021, 11:09 AM · Roadmap 2022, meta-task, Roadmap 2021, Web app
douardda added a comment to T3246: Document takedown request processing workflow.

do we also intend to have a takedown topic on kafka?

Apr 20 2021, 11:08 AM · Archive content
douardda added a comment to T1481: add metric to monitor "save code now" efficiency.

Note that there is the same transient vs cumulative discrepancy on the "Accepted requests" graph.

Apr 20 2021, 11:06 AM · Save Code Now, System administration, Metrics/monitoring
douardda added a comment to T1481: add metric to monitor "save code now" efficiency.

I think the "submitted requests per visit type / status" graph should be split in 2 parts. Both accepted and rejected are cumulative values that will grow indefinitely, while pending is a transient value that should stay near zero, so it makes no sense to have them on the same graph.

Apr 20 2021, 11:02 AM · Save Code Now, System administration, Metrics/monitoring
douardda added a comment to T1481: add metric to monitor "save code now" efficiency.

I think the "submitted requests per visit type / status" graph should be split in 2 parts. Both accepted and rejected are cumulative values that will grow indefinitely, while pending is a transient value that should stay near zero, so it makes no sense to have them on the same graph.

Apr 20 2021, 11:00 AM · Save Code Now, System administration, Metrics/monitoring
douardda added a comment to T3084: Fast track save code now requests.

is there a grafana dashboard dedicated to this queue?

Apr 20 2021, 10:55 AM · System administration, Web app
douardda changed the status of T3227: DB Schema link broken in docs under swh-storage. from Duplicate to Resolved.
Apr 20 2021, 10:52 AM · Easy hack, Documentation
douardda requested review of D5558: Update the general architecture diagram.
Apr 20 2021, 10:51 AM
douardda added a revision to T3229: Update "software architecture" image: D5558: Update the general architecture diagram.
Apr 20 2021, 10:51 AM · Documentation
douardda closed D5553: Update black to 20.8b1.
Apr 20 2021, 9:44 AM
douardda committed rDPROV8009af31a914: Update black to 20.8b1 (authored by douardda).
Update black to 20.8b1
Apr 20 2021, 9:44 AM

Apr 19 2021

douardda updated subscribers of D5553: Update black to 20.8b1.

Looks good to me. It's true the black version we are using is quite outdated.

We should upgrade black in all other swh repositories in the same manner then.

Apr 19 2021, 5:46 PM
douardda requested review of D5553: Update black to 20.8b1.
Apr 19 2021, 5:06 PM
douardda committed rDPROV982e2c1a2a9a: Add synthetic test files for the mindepth=2 heuristic (authored by douardda).
Add synthetic test files for the mindepth=2 heuristic
Apr 19 2021, 4:56 PM
douardda closed D5389: Improve tests.
Apr 19 2021, 4:56 PM
douardda closed D5388: Also test the provenance db with ArchiveStorage.
Apr 19 2021, 4:56 PM
douardda committed rDPROVf2ffd718c468: Rename synthetic result test files (authored by douardda).
Rename synthetic result test files
Apr 19 2021, 4:56 PM
douardda committed rDPROV594490576b00: Improve and rename test_provenance_db() as test_probenance_heuristics() (authored by douardda).
Improve and rename test_provenance_db() as test_probenance_heuristics()
Apr 19 2021, 4:56 PM
douardda committed rDPROV5e89689ef08b: Also test the provenance db with ArchiveStorage (authored by douardda).
Also test the provenance db with ArchiveStorage
Apr 19 2021, 4:56 PM
douardda closed D5387: Refactor the model and simplify a bit origin.py.
Apr 19 2021, 4:56 PM
douardda committed rDPROV7eaeebb6091a: Simplify a bit origin.py (authored by douardda).
Simplify a bit origin.py
Apr 19 2021, 4:56 PM
douardda committed rDPROVa23a33c5a77d: Refactor the model (authored by douardda).
Refactor the model
Apr 19 2021, 4:56 PM
douardda committed rDPROV8853314af981: Add a test for the (noroot, upper) case (authored by douardda).
Add a test for the (noroot, upper) case
Apr 19 2021, 4:55 PM
douardda closed D5337: Add a test to compare the result of revision_add() with known results.
Apr 19 2021, 4:55 PM
douardda committed rDPROVadbc99dd357d: Add a test to compare the result of revision_add() with known results (authored by douardda).
Add a test to compare the result of revision_add() with known results
Apr 19 2021, 4:55 PM
douardda updated the diff for D5389: Improve tests.

rebase

Apr 19 2021, 4:45 PM
douardda updated the diff for D5388: Also test the provenance db with ArchiveStorage.

rebase

Apr 19 2021, 4:44 PM
douardda updated the diff for D5387: Refactor the model and simplify a bit origin.py.

rebased

Apr 19 2021, 4:43 PM
douardda updated the diff for D5337: Add a test to compare the result of revision_add() with known results.

rebased

Apr 19 2021, 4:42 PM
douardda committed rDPROV62617e500649: Enforce black version 19.10b0 in tox to be consistent with pre-commit (authored by douardda).
Enforce black version 19.10b0 in tox to be consistent with pre-commit
Apr 19 2021, 4:41 PM
douardda added a comment to T3269: Investigate scheduling policy for fsfe's gitea.

The loading tasks created during this first listing were oneshot tasks. So they have been modified to recurring tasks with something like:

Apr 19 2021, 3:29 PM · Origin-Gitea/Gogs
douardda added a comment to T2602: Investigate how to upgrade the schema of the Cassandra storage.

Doesn't this deserve a state-of-the-art kind of thing? Is there documentation material on the subject? How do other (big) Cassandra users handle this?

Apr 19 2021, 2:14 PM · Storage manager
douardda added a comment to T3269: Investigate scheduling policy for fsfe's gitea.

The listing task has been disabled, I think because of failures in the last executions:

Apr 19 2021, 11:51 AM · Origin-Gitea/Gogs
douardda triaged T3269: Investigate scheduling policy for fsfe's gitea as High priority.
Apr 19 2021, 10:19 AM · Origin-Gitea/Gogs
douardda added a comment to T3084: Fast track save code now requests.

is there a grafana dashboard dedicated to this queue?

Apr 19 2021, 10:14 AM · System administration, Web app
douardda added a comment to T3246: Document takedown request processing workflow.

also: what about exports we provide on git-annex?

Apr 19 2021, 10:10 AM · Archive content
douardda added a comment to T3246: Document takedown request processing workflow.

do we also intend to have a takedown topic on kafka?

Apr 19 2021, 10:09 AM · Archive content

Apr 9 2021

douardda created P1003 (An Untitled Masterwork).
Apr 9 2021, 4:04 PM

Apr 8 2021

douardda added a comment to T3198: Mirror: unexpected closed connection to the pg server.

Just got this one below. Note that this occurred just when the replayer actually started to insert objects in the storage (before that, since the start of the replayer process, only kafka scaffolding took place for quite some time, around 30 min!)

Apr 8 2021, 12:02 PM · Mirror
douardda triaged T3218: The graph replayer generates REQTMOUT Timeout errors as High priority.
Apr 8 2021, 11:44 AM · Mirror

Apr 7 2021

douardda added a comment to T3214: Restrict accepted timestamps to values that can be processed all along.

looks like there is no revision with date or committer_date > 9999-12-31 in the main storage...

Apr 7 2021, 3:04 PM · Data Model
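
For context on the comment above: Python's `datetime` cannot represent years beyond 9999, which is one place where a "year is out of range" error can surface. The sketch below checks whether a POSIX timestamp (in seconds) can be converted to a UTC `datetime`. The names `MAX_TS` and `is_representable` are hypothetical and are not part of swh.model or swh.storage; how the storage actually validates dates is not shown here.

```python
from datetime import datetime, timezone

# Last second that datetime can represent, as a POSIX timestamp (UTC).
MAX_TS = int(datetime(9999, 12, 31, 23, 59, 59, tzinfo=timezone.utc).timestamp())


def is_representable(seconds: int) -> bool:
    """Hypothetical check: can this timestamp be turned into a datetime?"""
    try:
        datetime.fromtimestamp(seconds, tz=timezone.utc)
    except (OverflowError, OSError, ValueError):
        # Raised when the resulting year falls outside what datetime supports.
        return False
    return True


print(is_representable(0))           # the epoch converts fine
print(is_representable(MAX_TS + 1))  # first second of year 10000 does not
```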
douardda triaged T3214: Restrict accepted timestamps to values that can be processed all along as High priority.
Apr 7 2021, 2:30 PM · Data Model

Apr 6 2021

douardda closed T3201: Mirror: unsupported Unicode escape sequence as Resolved by committing rDSTO39507b24d0f4: Make the replayer drop the Revision.metadata.
Apr 6 2021, 4:42 PM · Mirror
douardda closed D5414: Make the replayer drop the Revision.metadata.
Apr 6 2021, 4:42 PM
douardda closed T3201: Mirror: unsupported Unicode escape sequence, a subtask of T3197: Mirror: fix common issues of a replayer session, as Resolved.
Apr 6 2021, 4:42 PM · Mirror
douardda committed rDSTO39507b24d0f4: Make the replayer drop the Revision.metadata (authored by douardda).
Make the replayer drop the Revision.metadata
Apr 6 2021, 4:42 PM
douardda committed rDSTO84dcbe3d0e56: Merge test_replay's _check_replayed and check_replayed in a single function (authored by douardda).
Merge test_replay's _check_replayed and check_replayed in a single function
Apr 6 2021, 4:42 PM
douardda closed D5413: Make pg Storage.extid_add() write extid objects to the journal.
Apr 6 2021, 4:42 PM
douardda committed rDSTO36a7fd34f3ba: Fix pg Storage.extid_add(): write ExtID objects to the journal (authored by douardda).
Fix pg Storage.extid_add(): write ExtID objects to the journal
Apr 6 2021, 4:42 PM
douardda retitled D5413: Make pg Storage.extid_add() write extid objects to the journal from Make pg Strorage.extid_add() write extid objects to the journal to Make pg Storage.extid_add() write extid objects to the journal.
Apr 6 2021, 4:33 PM
douardda updated the diff for D5414: Make the replayer drop the Revision.metadata.

fix commit message

Apr 6 2021, 4:32 PM
douardda added a comment to D5413: Make pg Storage.extid_add() write extid objects to the journal.

Could you add a test for the storage? All other *_add have a journal test IIRC

let me check that

Apr 6 2021, 4:09 PM