Jul 28 2021

douardda updated the diff for D6031: Add a quick start section in the documentation and simplify the configuration file loading mechanism in the cli.

rebase

Jul 28 2021, 2:41 PM
douardda updated the diff for D6015: Use stored SQL functions for content_find_{all,one}() and merge Provenance*DB classes in a single ProvenanceDB.

move _relation_uses_location_table at the end of the class

Jul 28 2021, 2:40 PM
douardda added inline comments to D6015: Use stored SQL functions for content_find_{all,one}() and merge Provenance*DB classes in a single ProvenanceDB.
Jul 28 2021, 2:28 PM
douardda updated the diff for D6031: Add a quick start section in the documentation and simplify the configuration file loading mechanism in the cli.

fix typos reported by ardumont and vlorentz (thx)

Jul 28 2021, 2:20 PM
douardda added a comment to D5843: Add support for a denormalized version of the provenance DB.

It's not clear to me how the denormalized version handles the insertion of duplicated entries.

It's something I am still trying to figure out too (whether this code performs as expected under heavy concurrent workload). I want to run more tests (by hand; this is hard to implement as a "unit" test) ASAP.

Jul 28 2021, 1:57 PM

Jul 27 2021

douardda accepted D5985: Simplify history graph creation and origin-revision algorithm.
Jul 27 2021, 6:15 PM
douardda requested changes to D6026: Add test for origin-revision layer.

I am not fond at all of the code duplication (between the R-C and O-R synth file parsers); it looks to me like at least parts of it could be factored out into a dedicated module (I agree it should not live in conftest any more: too much code and logic now). It would then be best to have these test-helper functions tested themselves (in tests as unit-level as possible).

Jul 27 2021, 6:11 PM
douardda requested review of D6031: Add a quick start section in the documentation and simplify the configuration file loading mechanism in the cli.
Jul 27 2021, 6:05 PM
douardda added a comment to D5843: Add support for a denormalized version of the provenance DB.

It's not clear to me how the denormalized version handles the insertion of duplicated entries.

Jul 27 2021, 4:39 PM
douardda added a comment to D5843: Add support for a denormalized version of the provenance DB.

It's not clear to me how the denormalized version handles the insertion of duplicated entries.

Jul 27 2021, 4:36 PM
douardda updated the diff for D6015: Use stored SQL functions for content_find_{all,one}() and merge Provenance*DB classes in a single ProvenanceDB.

rebase and capitalize sql queries

Jul 27 2021, 4:27 PM
douardda updated the diff for D5843: Add support for a denormalized version of the provenance DB.

capitalize sql queries

Jul 27 2021, 4:26 PM
douardda added inline comments to D6015: Use stored SQL functions for content_find_{all,one}() and merge Provenance*DB classes in a single ProvenanceDB.
Jul 27 2021, 4:02 PM
douardda accepted D6002: git_bare: Add support for swh-graph when loading a snapshot.

LGTM but see my questions (not sure they really make sense, but who knows)

Jul 27 2021, 11:28 AM
douardda added a comment to T3444: 26/07/2021: Unstuck infrastructure outage then post-mortem.

ceph is not properly monitored (ENOSPC should not go unnoticed on these machines),

P1099 and further earlier logs from that moment do not seem to warn about this... T3945 got created for this.

Jul 27 2021, 9:52 AM · System administration

Jul 26 2021

douardda added a comment to T3444: 26/07/2021: Unstuck infrastructure outage then post-mortem.

Potential issues/weaknesses of our current infra:

Jul 26 2021, 5:15 PM · System administration

Jul 22 2021

douardda added a comment to D6015: Use stored SQL functions for content_find_{all,one}() and merge Provenance*DB classes in a single ProvenanceDB.

I would have loved to also replace the logic in relation_add() and _relation_get() by stored SQL functions, but it's above my poor SQL skills...

Jul 22 2021, 5:47 PM
douardda requested review of D6015: Use stored SQL functions for content_find_{all,one}() and merge Provenance*DB classes in a single ProvenanceDB.
Jul 22 2021, 3:05 PM

Jul 21 2021

douardda updated the diff for D5843: Add support for a denormalized version of the provenance DB.

rebase

Jul 21 2021, 3:09 PM

Jul 19 2021

douardda added a comment to T3104: Persistent readonly perfect hash table.

sorry I don't understand everything here:

Jul 19 2021, 5:20 PM · Object storage (RedHat collaboration)

Jul 2 2021

douardda accepted D5943: Fix database queries related to the origin-revision layer.

I still disagree with the implementation of get_dates() but meh

Jul 2 2021, 4:38 PM
douardda added inline comments to D5943: Fix database queries related to the origin-revision layer.
Jul 2 2021, 4:36 PM
douardda added inline comments to D5943: Fix database queries related to the origin-revision layer.
Jul 2 2021, 4:35 PM
douardda accepted D5947: Add `ProvenanceStorageInterface` as discussed during backend design.

I've made several small comments / nitpicks, feel free to address them or not.

Jul 2 2021, 4:32 PM
douardda added inline comments to D5947: Add `ProvenanceStorageInterface` as discussed during backend design.
Jul 2 2021, 4:30 PM
douardda accepted D5946: Rework `ProvenanceInterface` as discussed during backend design.

okay but as stated, I don't like too much the general usage of the RealDictCursor; sometimes it helps, but sometimes it does not. Ideally both should be available (depending on the query).

Jul 2 2021, 3:48 PM
douardda requested changes to D5943: Fix database queries related to the origin-revision layer.
Jul 2 2021, 3:40 PM
douardda accepted D5925: Refactor ArchiveInterface to fit origin-revision layer needs.
Jul 2 2021, 3:33 PM

Jul 1 2021

douardda added inline comments to D5943: Fix database queries related to the origin-revision layer.
Jul 1 2021, 3:28 PM
douardda added inline comments to D5925: Refactor ArchiveInterface to fit origin-revision layer needs.
Jul 1 2021, 3:25 PM
douardda accepted D5944: Add tests for history graph topology.

ok but please remove print statements before

Jul 1 2021, 12:37 PM
douardda added inline comments to D5944: Add tests for history graph topology.
Jul 1 2021, 12:33 PM
douardda updated subscribers of D5943: Fix database queries related to the origin-revision layer.
Jul 1 2021, 12:29 PM
douardda requested changes to D5925: Refactor ArchiveInterface to fit origin-revision layer needs.
Jul 1 2021, 12:07 PM
douardda added a comment to D5943: Fix database queries related to the origin-revision layer.

Why do all these queries use LOCK TABLE?

Jul 1 2021, 10:53 AM
douardda accepted D5948: Force `snapshot_get_heads` to return revisions in chronological order.

ok but the SQL query could be improved to not return unwanted dates

Jul 1 2021, 10:49 AM

Jun 29 2021

douardda triaged T3416: Implement the replayer service for Vitam as High priority.
Jun 29 2021, 9:33 AM
douardda added a comment to T3415: Specify the Vitam archiving format.

This initial proposal from CINES has not been selected because it de facto normalizes a number of relations of the SWH graph, making it unfit for storage in a solution like Vitam (too many objects, hard to manage incremental updates).

Jun 29 2021, 9:30 AM
douardda added a comment to T3415: Specify the Vitam archiving format.
  1. Proposal from CINES
Jun 29 2021, 9:27 AM
douardda triaged T3415: Specify the Vitam archiving format as High priority.
Jun 29 2021, 9:27 AM
douardda triaged T3414: Save the Archive in CINES' Vitam platform as High priority.
Jun 29 2021, 9:22 AM · meta-task, Roadmap 2022

Jun 28 2021

douardda added a comment to D5843: Add support for a denormalized version of the provenance DB.

Should this be documented somewhere? (How to use it / why)

Jun 28 2021, 3:35 PM

Jun 25 2021

douardda created P1078 (An Untitled Masterwork).
Jun 25 2021, 3:33 PM
douardda accepted D5893: hypothesis_strategies: Add raw_extrinsic_metadata() strategy.
Jun 25 2021, 11:27 AM
douardda accepted D5914: backend: Auto-generate origin visit stats upsert query.
Jun 25 2021, 11:25 AM
douardda accepted D5916: cli/task: Ensure cli output is always in the same order.
Jun 25 2021, 11:23 AM
douardda requested changes to D5917: journal_client: Only check last_* fields for some permutation tests.
Jun 25 2021, 11:22 AM
douardda added a comment to D5917: journal_client: Only check last_* fields for some permutation tests.

I think I'd rather have an explicit list of excluded fields (when these extra fields are added). So I'd prefer to see this diff be something that compares dicts (as a result of BaseObject.to_dict()), possibly filtered to exclude some fields.
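
As an illustration of the comparison suggested above, here is a minimal sketch. It is not code from D5917: the helper `filtered_dict`, the `EXCLUDED_FIELDS` tuple and the field names are hypothetical, and plain dicts stand in for the output of `BaseObject.to_dict()`.

```python
from typing import Any, Dict, Iterable

# Hypothetical list of fields to ignore when comparing two objects.
EXCLUDED_FIELDS = ("last_scheduled",)


def filtered_dict(d: Dict[str, Any], excluded: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of d without the keys listed in excluded."""
    excluded = set(excluded)
    return {k: v for k, v in d.items() if k not in excluded}


# Plain dicts standing in for obj.to_dict(); field names are made up.
expected = {"url": "https://example.org/repo", "last_status": "ok", "last_scheduled": None}
actual = {"url": "https://example.org/repo", "last_status": "ok", "last_scheduled": "2021-06-25"}

# The two dicts differ only on an excluded field, so the filtered views match.
assert filtered_dict(actual, EXCLUDED_FIELDS) == filtered_dict(expected, EXCLUDED_FIELDS)
```

Keeping the exclusion list explicit means a newly added field is compared by default, and has to be opted out on purpose.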

Jun 25 2021, 11:21 AM

Jun 23 2021

douardda added a comment to D5843: Add support for a denormalized version of the provenance DB.

Also, at some point we might want to use better templating to write these SQL queries, or use stored procedures (with the proper "variation" being chosen at db creation time on the selected flavor); this would simplify the python code a lot.
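
A rough sketch of what the templating option could look like, under loud assumptions: the table and column names, the "with-path" flavor label and the `render_query` helper are all hypothetical and do not reflect the actual provenance schema or code. It uses only `string.Template` from the standard library.

```python
from string import Template

# Hypothetical query template; $columns and $joins vary with the DB flavor.
CONTENT_IN_DIR = Template(
    "SELECT $columns FROM content_in_directory $joins WHERE content = %s"
)

# Hypothetical per-flavor fragments (names are illustrative only).
FLAVORS = {
    "with-path": {
        "columns": "directory, location.path",
        "joins": "JOIN location ON location.id = content_in_directory.location",
    },
    "without-path": {"columns": "directory", "joins": ""},
}


def render_query(template: Template, flavor: str) -> str:
    """Fill the template with the fragments of the given flavor."""
    try:
        params = FLAVORS[flavor]
    except KeyError:
        raise ValueError(f"unknown flavor: {flavor!r}") from None
    # Collapse the extra whitespace left behind by empty fragments.
    return " ".join(template.substitute(params).split())


print(render_query(CONTENT_IN_DIR, "without-path"))
```

The `%s` placeholder is left untouched by `Template.substitute`, so the rendered string can still be handed to a DB driver with bound parameters.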

Jun 23 2021, 11:22 AM
douardda updated the diff for D5843: Add support for a denormalized version of the provenance DB.

reword a bit the ci message and kill a few tabs in 30-schema.sql

Jun 23 2021, 11:12 AM
douardda added a comment to D5843: Add support for a denormalized version of the provenance DB.

yes I know, names for the subqueries are horrible...

Jun 23 2021, 11:07 AM
douardda retitled D5843: Add support for a denormalized version of the provenance DB from [WIP] Add support for a denormalized version of the provenance DB to Add support for a denormalized version of the provenance DB.
Jun 23 2021, 11:04 AM
douardda updated the diff for D5843: Add support for a denormalized version of the provenance DB.

rebase, adapt and implement denormalization for content_in_dir and dir_in_rev

Jun 23 2021, 11:03 AM

Jun 22 2021

douardda abandoned D5841: Remove the without-path flavor of ProvenanceDB.

we keep it for now

Jun 22 2021, 5:12 PM
douardda abandoned D5885: Add support for (topological) branches and merges in generate_repo.py.

I believe this diff is duplicated and the other one was already landed.

Jun 22 2021, 5:11 PM
douardda accepted D5902: Remove origin_get_id method from ProvenanceInterface.

overall ok but see the comment

Jun 22 2021, 11:05 AM

Jun 21 2021

douardda closed D5894: Allow to add extra origins and snapshots in generated test storages.
Jun 21 2021, 4:48 PM
douardda closed D5892: Add support for (topological) branches and merges in generate_repo.py.
Jun 21 2021, 4:48 PM
douardda committed rDPROV011645221cf6: Allow to add extra origins and snapshots in generated test storages (authored by douardda).
Allow to add extra origins and snapshots in generated test storages
Jun 21 2021, 4:48 PM
douardda committed rDPROV6734fd36b872: Add support for (topological) branches and merges in generate_repo.py (authored by douardda).
Add support for (topological) branches and merges in generate_repo.py
Jun 21 2021, 4:48 PM
douardda closed D5891: Refactor the generate_storage_from_git dataset creation tool.
Jun 21 2021, 4:48 PM
douardda committed rDPROV7886bf494ab8: Refactor the generate_storage_from_git dataset creation tool (authored by douardda).
Refactor the generate_storage_from_git dataset creation tool
Jun 21 2021, 4:48 PM
douardda updated the diff for D5891: Refactor the generate_storage_from_git dataset creation tool.

rebase

Jun 21 2021, 4:46 PM
douardda updated the diff for D5892: Add support for (topological) branches and merges in generate_repo.py.

rebase

Jun 21 2021, 4:45 PM
douardda updated the diff for D5894: Allow to add extra origins and snapshots in generated test storages.

typos

Jun 21 2021, 4:43 PM
douardda added inline comments to D5894: Allow to add extra origins and snapshots in generated test storages.
Jun 21 2021, 4:39 PM
douardda accepted D5886: Refactor origin-revision layer.

ok but some questions/remarks have not been addressed...

Jun 21 2021, 4:37 PM
douardda accepted D5880: Update methods associated to the origin-revision layer.

Thanks for the DatetimeCache & co.

Jun 21 2021, 4:32 PM
douardda added a comment to T3382: Save process seems to be stuck.

I agree having access to the logs of the task (more or less) in real-time would be very handy (as one can expect on any CI-like tool nowadays).

Jun 21 2021, 10:50 AM · Save Code Now
douardda added inline comments to D5886: Refactor origin-revision layer.
Jun 21 2021, 10:20 AM

Jun 18 2021

douardda added inline comments to D5880: Update methods associated to the origin-revision layer.
Jun 18 2021, 5:36 PM
douardda added inline comments to D5880: Update methods associated to the origin-revision layer.
Jun 18 2021, 2:49 PM
douardda accepted D5862: Rework ArchiveInterface.

but please add a comment in the ArchivePostgreSQL's version of snapshot_get_heads explaining why it's (for now) a duplication of the other implementation, thanks

Jun 18 2021, 2:34 PM
douardda added a comment to D5862: Rework ArchiveInterface.

OK for the first two items, but I don't agree on the third one. The idea is to replace one of them by a direct SQL query in the near future so reworking this will be useless. I just didn't implement the query because I needed to move forward with the other stuff

Jun 18 2021, 2:31 PM
douardda accepted D5884: Fix bugs when retrieving parents in RevisionEntry.

thanks, the diff looks much simpler now :-)

Jun 18 2021, 2:26 PM
douardda added inline comments to D5884: Fix bugs when retrieving parents in RevisionEntry.
Jun 18 2021, 2:22 PM
douardda requested changes to D5886: Refactor origin-revision layer.

I know I am rambling, but could it come with some testing?

Jun 18 2021, 2:20 PM
douardda requested review of D5892: Add support for (topological) branches and merges in generate_repo.py.
Jun 18 2021, 12:35 PM
douardda updated the diff for D5894: Allow to add extra origins and snapshots in generated test storages.

use 'branches' instead of "revisions" as section in the yaml file

Jun 18 2021, 12:35 PM
douardda requested review of D5891: Refactor the generate_storage_from_git dataset creation tool.
Jun 18 2021, 12:32 PM
douardda requested review of D5894: Allow to add extra origins and snapshots in generated test storages.
Jun 18 2021, 12:29 PM
douardda committed rDJNLa06bab98b115: Add a StreamJournalWriter backend (authored by douardda).
Add a StreamJournalWriter backend
Jun 18 2021, 11:21 AM
douardda closed D5890: Add a StreamJournalWriter backend.
Jun 18 2021, 11:21 AM
douardda committed rDJNLa4ae96d12d2c: Better annotation for InMemoryJournalWriter's value_sanitizer (authored by douardda).
Better annotation for InMemoryJournalWriter's value_sanitizer
Jun 18 2021, 11:21 AM
douardda requested changes to D5880: Update methods associated to the origin-revision layer.

Mostly nitpicking comments, but I'd really prefer that:

  • the cache is kept properly typed
  • the cache clearing thing gets its own git revision
Jun 18 2021, 11:06 AM
douardda added inline comments to D5890: Add a StreamJournalWriter backend.
Jun 18 2021, 10:49 AM
douardda updated the diff for D5890: Add a StreamJournalWriter backend.

Add a revision to fix the annotation of InMemory's value_sanitizer

Jun 18 2021, 10:48 AM
douardda added inline comments to D5890: Add a StreamJournalWriter backend.
Jun 18 2021, 10:37 AM
douardda updated the diff for D5890: Add a StreamJournalWriter backend.

small simplification in the docstring

Jun 18 2021, 10:34 AM
douardda updated the diff for D5890: Add a StreamJournalWriter backend.

use get_journal_writer in test_stream

Jun 18 2021, 10:31 AM
douardda updated the diff for D5890: Add a StreamJournalWriter backend.

fix docstring, thx ardumont

Jun 18 2021, 10:25 AM
douardda added inline comments to D5890: Add a StreamJournalWriter backend.
Jun 18 2021, 10:23 AM
douardda added inline comments to D5890: Add a StreamJournalWriter backend.
Jun 18 2021, 10:17 AM
douardda added inline comments to D5890: Add a StreamJournalWriter backend.
Jun 18 2021, 10:14 AM
douardda requested changes to D5862: Rework ArchiveInterface.

Thanks for the split. Almost there, but a few nitpicks/comments remain:

Jun 18 2021, 10:06 AM
douardda requested changes to D5884: Fix bugs when retrieving parents in RevisionEntry.

Thanks a lot for the split.

Jun 18 2021, 9:55 AM
douardda updated the diff for D5890: Add a StreamJournalWriter backend.

update the copyright in __init__.py

Jun 18 2021, 9:46 AM

Jun 17 2021

douardda committed rDPROV8ff1ab5860a6: Improve .gitignore (authored by douardda).
Improve .gitignore
Jun 17 2021, 8:47 PM
douardda retitled D5890: Add a StreamJournalWriter backend from Add a StreamJournalWrtier backend to Add a StreamJournalWriter backend.
Jun 17 2021, 8:39 PM