Oct 18 2022

douardda requested review of D8718: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:44 PM
douardda requested review of D8734: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:38 PM
douardda requested review of D8736: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:37 PM
douardda requested review of D8738: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:36 PM
douardda requested review of D8732: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:35 PM
douardda requested review of D8731: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:34 PM
douardda requested review of D8712: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:33 PM
douardda requested review of D8730: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:31 PM
douardda requested review of D8727: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:31 PM
douardda requested review of D8728: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:31 PM
douardda requested review of D8726: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:30 PM
douardda requested review of D8729: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:30 PM
douardda requested review of D8724: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:29 PM
douardda requested review of D8725: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:28 PM
douardda requested review of D8717: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:28 PM
douardda requested review of D8723: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:27 PM
douardda requested review of D8719: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:27 PM
douardda requested review of D8722: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:27 PM
douardda requested review of D8721: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:26 PM
douardda requested review of D8720: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:24 PM
douardda requested review of D8711: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:23 PM
douardda requested review of D8714: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:22 PM
douardda requested review of D8713: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:21 PM
douardda requested review of D8716: pre-commit, tox: Bump pre-commit, black and flake8.
Oct 18 2022, 7:20 PM
douardda requested review of D8715: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:19 PM
douardda requested review of D8710: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:17 PM
douardda requested review of D8709: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:15 PM
douardda requested review of D8708: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:14 PM
douardda requested review of D8706: pre-commit, tox: Bump pre-commit, codespell, black and flake8.
Oct 18 2022, 7:10 PM
douardda committed rDMODda5c23bdbdd0: remove now unneeded mypy missing entries (authored by douardda).
Oct 18 2022, 5:31 PM
douardda created P1501 (An Untitled Masterwork).
Oct 18 2022, 11:53 AM
douardda requested review of D8695: Move all non pytest-related functions from conftest to utils.py.
Oct 18 2022, 11:03 AM
douardda abandoned D8692: tox: run grpc-related tests in a dedicated pytest session.

nope, does not work (as expected, actually)

Oct 18 2022, 10:00 AM

Oct 17 2022

douardda requested review of D8692: tox: run grpc-related tests in a dedicated pytest session.
Oct 17 2022, 6:20 PM
douardda closed D8678: Add a 'swh provenance replay' cli command.
Oct 17 2022, 6:08 PM
douardda committed rDPROV022b6f76614e: Add a 'swh provenance replay' cli command (authored by douardda).
Oct 17 2022, 6:08 PM
douardda added a comment to D8678: Add a 'swh provenance replay' cli command.

Bikeshedding: it should be called "journal-client" rather than "replay" for consistency with swh-indexer and swh-search. (swh-storage only calls it "replay" because it's used to copy from another instance of the same code so it "replays" the same API calls; but here it may be the first "play")

I'm not sure I follow you there; this really is a replayer feature: it aims at replicating a provenance DB via a kafka journal.
We already have a journal client in provenance that consumes the main archive's revision and origin-visit-status topics. The CLI commands are `swh provenance revision from-journal` and `swh provenance origin from-journal` (i.e. they execute the {origin,revision} layer reading from the journal; there are from-csv versions of these commands as well).

Oct 17 2022, 5:20 PM
douardda added a comment to D8678: Add a 'swh provenance replay' cli command.

Bikeshedding: it should be called "journal-client" rather than "replay" for consistency with swh-indexer and swh-search. (swh-storage only calls it "replay" because it's used to copy from another instance of the same code so it "replays" the same API calls; but here it may be the first "play")

Oct 17 2022, 5:17 PM

Oct 14 2022

douardda requested review of D8678: Add a 'swh provenance replay' cli command.
Oct 14 2022, 12:16 PM

Oct 13 2022

douardda abandoned D7477: Improve `revision_get_parents` implementation in `ArchiveGraph`.
Oct 13 2022, 4:38 PM
douardda commandeered D7477: Improve `revision_get_parents` implementation in `ArchiveGraph`.
Oct 13 2022, 4:38 PM
douardda abandoned D7478: Clean up `model` removing old unused logic.
Oct 13 2022, 4:37 PM
douardda commandeered D7478: Clean up `model` removing old unused logic.
Oct 13 2022, 4:37 PM
douardda closed D8658: Add a journal replayer for the revision layer.
Oct 13 2022, 4:34 PM
douardda closed D8670: Make relation_add sql function prefill entity tables if needed.
Oct 13 2022, 4:34 PM
douardda committed rDPROV0850a3943df2: Add a journal replayer for the revision layer (authored by douardda).
Oct 13 2022, 4:34 PM
douardda committed rDPROVe1da37d4375f: Make relation_add sql function prefill entity tables if needed (authored by douardda).
Oct 13 2022, 4:34 PM
douardda closed D8668: Remove the without-path db flavor.
Oct 13 2022, 4:33 PM
douardda committed rDPROVa0b2f0e9da09: Remove the without-path db flavor (authored by douardda).
Oct 13 2022, 4:33 PM
douardda added inline comments to D8673: packagist: Canonicalize github origins.
Oct 13 2022, 1:50 PM
douardda added a comment to D8662: FindEarliestRevision: Add earliest_ts and rev_occurrences columns.

Also I don't understand what "this is an import from replication package" means in this context...

Oct 13 2022, 12:02 PM
douardda added a comment to D8662: FindEarliestRevision: Add earliest_ts and rev_occurrences columns.

not sure I understand what's going on here.

Oct 13 2022, 12:00 PM
douardda accepted D8609: storage_checker: Notify database when ranges are fully checked.

lgtm (but with a couple of comments)

Oct 13 2022, 11:49 AM
douardda added inline comments to D8670: Make relation_add sql function prefill entity tables if needed.
Oct 13 2022, 11:23 AM
douardda requested review of D8670: Make relation_add sql function prefill entity tables if needed.
Oct 13 2022, 11:18 AM
douardda updated the diff for D8658: Add a journal replayer for the revision layer.

split in 2 (one for each rev)

Oct 13 2022, 11:13 AM
douardda updated the diff for D8668: Remove the without-path db flavor.

move the 'not null' removal into the proper revision

Oct 13 2022, 11:09 AM
douardda added inline comments to D8668: Remove the without-path db flavor.
Oct 13 2022, 11:00 AM

Oct 12 2022

douardda added a comment to D8658: Add a journal replayer for the revision layer.

This seems to work:

diff --git a/swh/provenance/storage/replay.py b/swh/provenance/storage/replay.py
index d42d134..63e44cb 100644
--- a/swh/provenance/storage/replay.py
+++ b/swh/provenance/storage/replay.py
@@ -4,7 +4,7 @@
 # See top-level LICENSE file for more information
 
 import logging
-from typing import Any, Callable, Dict, List, Optional
+from typing import Any, Callable, Dict, List, Optional, Tuple
 
 try:
     from systemd.daemon import notify
@@ -29,19 +29,19 @@
 
 
 def cvrt_directory(msg_d):
-    return {msg_d["id"]: DirectoryData(**msg_d["value"])}
+    return (msg_d["id"], DirectoryData(**msg_d["value"]))
 
 
 def cvrt_revision(msg_d):
-    return {msg_d["id"]: RevisionData(**msg_d["value"])}
+    return (msg_d["id"], RevisionData(**msg_d["value"]))
 
 
 def cvrt_default(msg_d):
-    return {msg_d["id"]: msg_d["value"]}
+    return (msg_d["id"], msg_d["value"])
 
 
 def cvrt_relation(msg_d):
-    return {msg_d["id"]: {RelationData(**v) for v in msg_d["value"]}}
+    return (msg_d["id"], {RelationData(**v) for v in msg_d["value"]})
 
 
 OBJECT_CONVERTERS: Dict[str, Callable[[Dict], Dict]] = {
@@ -75,7 +75,9 @@ def report_failure(self, msg: bytes, obj: Dict):
 
 
 def process_replay_objects(
-    all_objects: Dict[str, List[Dict]], *, storage: ProvenanceStorageInterface
+    all_objects: Dict[str, List[Tuple[bytes, Any]]],
+    *,
+    storage: ProvenanceStorageInterface,
 ) -> None:
     for object_type, objects in all_objects.items():
         logger.debug("Inserting %s %s objects", len(objects), object_type)
@@ -89,14 +91,16 @@ def process_replay_objects(
 
 
 def _insert_objects(
-    object_type: str, objects: List[Any], storage: ProvenanceStorageInterface
+    object_type: str,
+    objects: List[Tuple[bytes, Any]],
+    storage: ProvenanceStorageInterface,
 ) -> None:
     """Insert objects of type object_type in the storage."""
     if object_type not in OBJECT_CONVERTERS:
         logger.warning("Received a series of %s, this should not happen", object_type)
         return
 
-    data = dict(next(iter(obj.items())) for obj in objects)
+    data = dict(objects)
     if "_in_" in object_type:
         storage.relation_add(relation=RelationType(object_type), data=data)
     else:
Oct 12 2022, 6:09 PM
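
The effect of the diff above can be illustrated in isolation. This is a minimal sketch, not the swh-provenance code: the function names `cvrt_as_dict` and `cvrt_as_pair` and the sample messages are hypothetical, and plain strings stand in for the `DirectoryData`/`RevisionData` values. It only shows that having converters return `(id, value)` tuples lets the batch be assembled with a plain `dict(objects)` instead of unpacking one-entry dicts.

```python
from typing import Any, Dict, List, Tuple


def cvrt_as_dict(msg_d: Dict[str, Any]) -> Dict[bytes, Any]:
    # Old style (hypothetical name): each message becomes a one-entry dict.
    return {msg_d["id"]: msg_d["value"]}


def cvrt_as_pair(msg_d: Dict[str, Any]) -> Tuple[bytes, Any]:
    # New style (hypothetical name): each message becomes an (id, value) tuple.
    return (msg_d["id"], msg_d["value"])


# Made-up sample messages; real journal messages carry richer values.
messages: List[Dict[str, Any]] = [
    {"id": b"\x01", "value": "first"},
    {"id": b"\x02", "value": "second"},
]

# Old style: pull the single item back out of every one-entry dict.
old_objects = [cvrt_as_dict(m) for m in messages]
old_data = dict(next(iter(obj.items())) for obj in old_objects)

# New style: a list of pairs is already what dict() expects.
new_objects = [cvrt_as_pair(m) for m in messages]
new_data = dict(new_objects)

# Both approaches build the same mapping.
assert old_data == new_data
```

Both forms produce the same mapping; the tuple form simply avoids building and then dismantling an intermediate dict per message.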
douardda added inline comments to D8658: Add a journal replayer for the revision layer.
Oct 12 2022, 6:08 PM
douardda added inline comments to D8658: Add a journal replayer for the revision layer.
Oct 12 2022, 6:02 PM
douardda updated the summary of D8658: Add a journal replayer for the revision layer.
Oct 12 2022, 5:45 PM
douardda updated the diff for D8658: Add a journal replayer for the revision layer.

fix tabs in 40-funcs.sql file

Oct 12 2022, 5:23 PM
douardda requested review of D8668: Remove the without-path db flavor.
Oct 12 2022, 5:00 PM
douardda updated the diff for D8658: Add a journal replayer for the revision layer.

Attempt to make replaying work in the general situation

Oct 12 2022, 4:57 PM
douardda committed rDPROVb3fa1f59243a: Remove unused test data directory (authored by douardda).
Oct 12 2022, 4:08 PM
douardda closed D8657: Add support for kafka journalization of the ProvenanceStorageInterface.
Oct 12 2022, 4:07 PM
douardda committed rDPROV08f2e604b074: Add support for kafka journalization of the ProvenanceStorageInterface (authored by douardda).
Oct 12 2022, 4:07 PM
douardda committed rDPROV7e6a62c990b7: Rename ProvenanceInterface.directory_xxx_flattenned as directory_xxx_flattened (authored by douardda).
Oct 12 2022, 4:07 PM
douardda closed D8656: Normalize _add() methods of the ProvenanceStorage interface.
Oct 12 2022, 4:07 PM
douardda committed rDPROV2bd74fc7d97d: Normalize _add() methods of the ProvenanceStorage interface (authored by douardda).
Oct 12 2022, 4:07 PM

Oct 11 2022

douardda updated the diff for D8658: Add a journal replayer for the revision layer.

simplify the code a bit (vlorentz's suggestion)

Oct 11 2022, 5:17 PM
douardda added inline comments to D8658: Add a journal replayer for the revision layer.
Oct 11 2022, 5:13 PM
douardda updated the diff for D8658: Add a journal replayer for the revision layer.

rebase

Oct 11 2022, 5:02 PM
douardda updated the diff for D8657: Add support for kafka journalization of the ProvenanceStorageInterface.

rebase

Oct 11 2022, 4:47 PM
douardda updated the diff for D8656: Normalize _add() methods of the ProvenanceStorage interface.

also rename ProvenanceInterface.directory_xxx_flattenned as directory_xxx_flattened

Oct 11 2022, 4:38 PM
douardda added inline comments to D8657: Add support for kafka journalization of the ProvenanceStorageInterface.
Oct 11 2022, 4:25 PM
douardda updated the diff for D8657: Add support for kafka journalization of the ProvenanceStorageInterface.

rebase

Oct 11 2022, 4:12 PM
douardda updated the diff for D8656: Normalize _add() methods of the ProvenanceStorage interface.

apply vlorentz' comments

Oct 11 2022, 4:05 PM
douardda requested review of D8657: Add support for kafka journalization of the ProvenanceStorageInterface.
Oct 11 2022, 3:57 PM
douardda added inline comments to D8656: Normalize _add() methods of the ProvenanceStorage interface.
Oct 11 2022, 3:52 PM
douardda requested review of D8656: Normalize _add() methods of the ProvenanceStorage interface.
Oct 11 2022, 3:04 PM
douardda requested review of D8658: Add a journal replayer for the revision layer.
Oct 11 2022, 2:09 PM
douardda added a revision to T4616: Add kafka journal log for the revision layer : D8658: Add a journal replayer for the revision layer.
Oct 11 2022, 12:26 PM · Provenance database
douardda added a revision to T4616: Add kafka journal log for the revision layer : D8657: Add support for kafka journalization of the ProvenanceStorageInterface.
Oct 11 2022, 12:25 PM · Provenance database
douardda added a revision to T4616: Add kafka journal log for the revision layer : D8656: Normalize _add() methods of the ProvenanceStorage interface.
Oct 11 2022, 12:24 PM · Provenance database
douardda added inline comments to D8634: Prepare the tests to run in Jenkins.
Oct 11 2022, 10:43 AM
douardda accepted D8641: storage_checker: Do not re-check ranges already marked as checked.

lgtm

Oct 11 2022, 10:27 AM

Oct 10 2022

douardda triaged T4616: Add kafka journal log for the revision layer as High priority.
Oct 10 2022, 10:51 AM · Provenance database
douardda closed T3555: Re-factor the MongoDB backend, a subtask of T3431: Implement a MongoDB backend for SWH-provenance , as Wontfix.
Oct 10 2022, 10:50 AM · Provenance database
douardda closed T3555: Re-factor the MongoDB backend as Wontfix.
Oct 10 2022, 10:50 AM · Provenance database

Oct 7 2022

douardda closed T3557: Run experiments against the MongoDB backend as Wontfix.
Oct 7 2022, 6:07 PM · Provenance database
douardda closed T3557: Run experiments against the MongoDB backend , a subtask of T3431: Implement a MongoDB backend for SWH-provenance , as Wontfix.
Oct 7 2022, 6:07 PM · Provenance database
douardda closed T3431: Implement a MongoDB backend for SWH-provenance as Wontfix.
Oct 7 2022, 6:06 PM · Provenance database

Oct 3 2022

douardda closed D8593: Reorganize the code.
Oct 3 2022, 12:26 PM
douardda committed rDPROV7c882f571656: Reorganize the code (authored by douardda).
Oct 3 2022, 12:26 PM
douardda committed rDPROV6f4a193e9081: More core reorganization (authored by douardda).
Oct 3 2022, 12:26 PM
douardda closed D8591: Adapt postgresql backend to swh.core.db >= 2.0.
Oct 3 2022, 12:26 PM
douardda closed D8592: Mark origin layer tests as "origin_layer".
Oct 3 2022, 12:26 PM
douardda committed rDPROV9a63bd8164af: Mark origin layer tests as "origin_layer" (authored by douardda).
Oct 3 2022, 12:26 PM