to avoid processing twice the history graph of a revision that was
already seen as the head of another origin.
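
As a rough illustration of the idea in this summary, the sketch below skips the history walk for any head revision that an earlier origin already contributed. It is a toy model and not the code in `swh/provenance/origin.py`: the `PARENTS` graph, `walk_history`, and `add_origins` are hypothetical names introduced here for illustration only.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical toy history graph: revision id -> parent revision ids.
PARENTS: Dict[str, List[str]] = {
    "r3": ["r2"],
    "r2": ["r1"],
    "r1": [],
}


def walk_history(head: str) -> List[str]:
    """Return every revision reachable from `head`, head included."""
    seen: Set[str] = set()
    stack = [head]
    order: List[str] = []
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        order.append(rev)
        stack.extend(PARENTS.get(rev, []))
    return order


def add_origins(
    origin_heads: Dict[str, Iterable[str]],
) -> Tuple[Dict[str, Set[str]], int]:
    """Record the heads of each origin, walking each head's history once."""
    known_heads: Set[str] = set()  # heads already processed for some origin
    heads_of: Dict[str, Set[str]] = {}
    walks = 0
    for origin, heads in origin_heads.items():
        for head in heads:
            # The origin -> head link is always recorded.
            heads_of.setdefault(origin, set()).add(head)
            if head in known_heads:
                # Already seen as the head of another origin: skip the walk.
                continue
            known_heads.add(head)
            walk_history(head)
            walks += 1
    return heads_of, walks
```

With two origins that share the head `r3`, the history of `r3` is walked only for the first one; the second origin still gets its link to `r3` recorded.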
Details
- Reviewers: olasd
- Group Reviewers: Reviewers
- Commits:
  - rDPROV8feeadaea130: Improve `origin_add` logic
  - rDPROVb0fccc251c5e: Fix tests that broke after `swh-environment` update
Diff Detail
- Repository: rDPROV Provenance database
- Lint: Automatic diff as part of commit; lint not applicable.
- Unit: Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build is green
Patch application report for D7166 (id=25973)
Could not rebase; Attempt merge onto bb800f80ed...
Updating bb800f8..9df61c2
Fast-forward
requirements-swh-graph.txt | 1 +
setup.py | 10 ++++----
swh/provenance/__init__.py | 9 ++++++-
swh/provenance/archive.py | 3 +--
swh/provenance/interface.py | 4 ++++
swh/provenance/origin.py | 16 +++++++------
swh/provenance/postgresql/archive.py | 2 +-
swh/provenance/provenance.py | 3 +++
swh/provenance/storage/archive.py | 2 +-
swh/provenance/swhgraph/__init__.py | 0
swh/provenance/swhgraph/archive.py | 46 ++++++++++++++++++++++++++++++++++++
tox.ini | 4 ++++
12 files changed, 83 insertions(+), 17 deletions(-)
create mode 100644 requirements-swh-graph.txt
create mode 100644 swh/provenance/swhgraph/__init__.py
create mode 100644 swh/provenance/swhgraph/archive.py
Changes applied before test
commit 9df61c25de6509881078cbcd502a947ca3612862
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Feb 11 20:23:35 2022 +0100
Improve `origin_add` logic to avoid processing twice the history graph of a revision that was already seen as the head of another origin.

commit 2833438c3aaa44ec34c3a125cb7fad2739b51b6c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jan 20 16:08:32 2022 +0100
Add partial implementation of `ArchiveGraph` class

commit 45c5b78ba0c61b9bf3e37ea10408cf6de26a5071
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jan 20 15:35:43 2022 +0100
Remove ordered result constrain from `snapshot_get_heads`
It is not require anymore after simplifying the origin-revision layer algorithm.
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/579/ for more details.
swh/provenance/origin.py | ||
---|---|---|
53 ↗ | (On Diff #25973) | This "commit" parameter change looks a bit strange. If it's really needed (to prevent flushing from time to time), then it should be opened in the interface above. Either way, it also looks like a change that could go in a separate diff. What do you think? |
swh/provenance/origin.py | ||
---|---|---|
53 ↗ | (On Diff #25973) | I don't really follow what you mean by "should be opened in the interface above". The commit parameter is used when testing to simulate the (partial) processing of batches. It's the same for the other methods in the "interface" as well, i.e. directory_add and revision_add. |
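
For readers following this thread, here is a minimal sketch of what a `commit` flag of this kind could look like: when it is false, the batch stays in a write cache so that a test can inspect a partially processed state. The `ToyProvenance` class and `toy_origin_add` function are hypothetical stand-ins written for this illustration; they are not the signatures used in `swh/provenance`.

```python
from typing import Iterable, List


class ToyProvenance:
    """Hypothetical stand-in for a provenance backend with a write cache."""

    def __init__(self) -> None:
        self.cache: List[str] = []   # pending, not yet persisted
        self.stored: List[str] = []  # persisted entries

    def add(self, item: str) -> None:
        self.cache.append(item)

    def flush(self) -> None:
        # Persist everything that is pending, then empty the cache.
        self.stored.extend(self.cache)
        self.cache.clear()


def toy_origin_add(
    provenance: ToyProvenance, origins: Iterable[str], commit: bool = True
) -> None:
    """Process a batch of origins; flush only when `commit` is true."""
    for origin in origins:
        provenance.add(origin)
    if commit:
        provenance.flush()
```

Calling `toy_origin_add(p, ["a"], commit=False)` leaves `"a"` in `p.cache`; a later call with the default `commit=True` persists both batches in one flush.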
Build has FAILED
Patch application report for D7166 (id=26640)
Rebasing onto bb800f80ed...
Current branch diff-target is up to date.
Changes applied before test
commit 80071ce5d92c51487e191a8890a43cd42123a16d
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Feb 11 20:23:35 2022 +0100
Improve `origin_add` logic to avoid processing twice the history graph of a revision that was already seen as the head of another origin.
Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/580/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/580/console
Build has FAILED
Patch application report for D7166 (id=26640)
Rebasing onto bb800f80ed...
Current branch diff-target is up to date.
Changes applied before test
commit 80071ce5d92c51487e191a8890a43cd42123a16d
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Feb 11 20:23:35 2022 +0100
Improve `origin_add` logic to avoid processing twice the history graph of a revision that was already seen as the head of another origin.
Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/581/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/581/console
swh/provenance/origin.py | ||
---|---|---|
53 ↗ | (On Diff #25973) | I'm the one who does not get it. Also, opening specific code for tests is usually not a good thing. |
swh/provenance/origin.py | ||
---|---|---|
53 ↗ | (On Diff #25973) | I guess by interface here you mean the CLI. This method is the entry point for the origin-revision layer ingestion command, but provenance and archive are resolved from the configuration file by the CLI function itself. The extra commit parameter is there for testing purposes. On test-specific code: I agree, but it was designed that way, so I'm just making it consistent with the other methods (directory_add and revision_add). |
Build is green
Patch application report for D7166 (id=26642)
Rebasing onto bb800f80ed...
Current branch diff-target is up to date.
Changes applied before test
commit 8feeadaea1306edb67b7b241f0fcf1f37f9570e2
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Feb 11 20:23:35 2022 +0100
Improve `origin_add` logic to avoid processing twice the history graph of a revision that was already seen as the head of another origin.

commit b0fccc251c5e59d020e9c0f9953726e102c09891
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Mar 17 14:54:29 2022 +0100
Fix tests that broke after `swh-environment` update
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/582/ for more details.