the queries should be exactly the same as before (query plans are the
same); just written (hopefully) in a bit more readable manner.
Depends on D5780.
Differential D5781
Simplify DB queries in ProvenanceWithPathDB.content_find_(first|all) douardda on May 25 2021, 3:21 PM. Authored by
Details
the queries should be exactly the same as before (query plans are the Depends on D5780.
Diff Detail
Event TimelineComment Actions Build has FAILED Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/27/ Comment Actions Build has FAILED Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/31/ Comment Actions Build is green Patch application report for D5781 (id=20675)Could not rebase; Attempt merge onto 77fce4e59d... Updating 77fce4e..15301e7 Fast-forward swh/provenance/model.py | 21 ++- swh/provenance/postgresql/provenancedb_base.py | 22 +-- .../postgresql/provenancedb_with_path.py | 115 ++++-------- swh/provenance/provenance.py | 202 ++++++++++++--------- swh/provenance/tests/test_provenance_db.py | 169 ++++++++++++++++- 5 files changed, 351 insertions(+), 178 deletions(-) Changes applied before testcommit 15301e78ef2f3b305791d83b120f02a5d6192904 Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:31:51 2021 +0200 Simplify DB queries in ProvenanceWithPathDB.content_find_(first|all) the queries should be exactly the same as before (query plans are the same); just written (hopefully) in a bit more readable manner. commit a2fe6386596e89f3af8c2d5b00a0b8d8d20d82ce Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:59:16 2021 +0200 Add a test for content_find_all() commit 489197686cfa447dc4ee0a827ef578cdc041b1d4 Author: David Douard <david.douard@sdfa3.org> Date: Fri May 21 19:34:07 2021 +0200 Replace ProvenanceDB.remove_cache by a dict of Set instead of a dict of dict, which was actually used as a set. commit 2860fe6126de1804ab9190026545c935bbbbd99a Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 09:47:03 2021 +0200 Refactor the isochrone graph computation attempt to simplify a bit this part of the code: - IsochroneNode are now only used for directories - FileEntry are stored in a new IsochroneNode.files attribute, so - IsochroneNode.children only stores IsochroneNode (thus DirectoryEntry) objects, - rename IsochroneNode.date as 'dbdate' and clarify its semantics commit 8e4a2f69b53fd8ef74613509eb7a5f6707855a7a Author: David Douard <david.douard@sdfa3.org> Date: Wed May 19 16:16:53 2021 +0200 Add 'ls_files()' and 'ls_dirs()' methods to the DirectoryEntry class to make it a bit easier to compute the isochrone graph (see following revisions). commit 4ba05e92698d73a6a05078a969eea85d08cd5dca Author: David Douard <david.douard@sdfa3.org> Date: Wed May 19 16:14:41 2021 +0200 Add __str__ methods to RevisionEntry, DirectoryEntry and FileEntry to ease logging and debugging. commit fb0ef598657a9810deac43d495ab882718265543 Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 12:44:07 2021 +0200 Improve a bit the code of ProvenanceDBBase commit edff905d0df269cb90246ef77b554b1ca58cbeef Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 10:30:23 2021 +0200 Add a test for the build_isochrone_graph() function this test is far from ideal, since it's mostly the record of what happen during a "known good" session of revision insertions, but at least it should allow to refactor code related to the isochrone graph computation with a bit more confidence... commit 113e11031aa5365a9244409a0cfe646061cb94f9 Author: David Douard <david.douard@sdfa3.org> Date: Tue May 11 15:33:57 2021 +0200 Replace the 'dates' argument of IsochroneNode() by a simple 'date' one there is no need for passing a dict here, we only care about the date for the node being instanciated. See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/37/ for more details. Comment Actions lgtm
Comment Actions Build is green Patch application report for D5781 (id=20736)Could not rebase; Attempt merge onto 5aa0314dd7... Updating 5aa0314..08d8dd6 Fast-forward swh/provenance/model.py | 21 ++- swh/provenance/postgresql/provenancedb_base.py | 12 +- .../postgresql/provenancedb_with_path.py | 115 ++++-------- swh/provenance/provenance.py | 201 ++++++++++++--------- swh/provenance/tests/test_provenance_db.py | 171 +++++++++++++++++- 5 files changed, 344 insertions(+), 176 deletions(-) Changes applied before testcommit 08d8dd6478836ff4ab1c00c67f553b6d705b5a9c Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:31:51 2021 +0200 Simplify DB queries in ProvenanceWithPathDB.content_find_(first|all) the queries should be exactly the same as before (query plans are the same); just written (hopefully) in a bit more readable manner. commit fd43523fd594e70ccd002827d379321f52c2b6da Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:59:16 2021 +0200 Add a test for content_find_all() commit d85f2b0ee48aefe03ad32311623e5390f43d7261 Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 09:47:03 2021 +0200 Refactor the isochrone graph computation attempt to simplify a bit this part of the code: - IsochroneNode are now only used for directories - FileEntry are stored in a new IsochroneNode.files attribute, so - IsochroneNode.children only stores IsochroneNode (thus DirectoryEntry) objects, - rename IsochroneNode.date as 'dbdate' and clarify its semantics commit 31d833ec86bf041e100795e7796ce832d00450ef Author: David Douard <david.douard@sdfa3.org> Date: Wed May 19 16:16:53 2021 +0200 Add 'ls_files()' and 'ls_dirs()' methods to the DirectoryEntry class to make it a bit easier to compute the isochrone graph (see following revisions). commit 72644b98a218132c0b173f360c503438688ecebb Author: David Douard <david.douard@sdfa3.org> Date: Wed May 19 16:14:41 2021 +0200 Add __str__ methods to RevisionEntry, DirectoryEntry and FileEntry to ease logging and debugging. commit a71041fbaf3f0d7ec3ea944cbbf04286c57d8b7e Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 12:44:07 2021 +0200 Improve a bit the code of ProvenanceDBBase commit defcb388ffba0869edb1a126b6626710c396c2ac Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 10:30:23 2021 +0200 Add a test for the build_isochrone_graph() function this test is far from ideal, since it's mostly the record of what happen during a "known good" session of revision insertions, but at least it should allow to refactor code related to the isochrone graph computation with a bit more confidence... See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/43/ for more details. Comment Actions We are about to start refactoring the provenance backend. It would be nice to have this changes pushed, since re-basing them might by complicated after refactoring. Comment Actions I'd be delighted, but I need you to accept the diff in the stack... This "{merge} might by complicated after refactoring" is true for the entire stack of diff, so yes, I'd like very much to push this stack. Comment Actions This diff is already accepted by ardumont, who knows SQL better than I do. There is no much for me to review here Comment Actions Yes but it is part of a diff stack. I cannot push it without reworking my stack of git revisions, which can be quite a painfull process; in this very case, it should be rather painless, BUT the ides was also to have this refactoring coming AFTER better testing has been added to the code so we can validate that this refactoring does not break the code. Comment Actions Also, allow me to emphasis that even is @ardumont already gave his "ok" on this, you can also give your own green or red light: you may see something neither I nor @ardumont saw. If you are ok with the diff, you are not obliged to Now the reason it's still not pushed in the main git repo is because it sits in the middle of a stack of diffs, some of which have not been accepted (especially some on which this diff depends). See my previous comment for the reason I have not "moved" it at the base of the stack of diff. Comment Actions Build is green Patch application report for D5781 (id=20752)Could not rebase; Attempt merge onto 49e47c3ea7... Merge made by the 'recursive' strategy. swh/provenance/model.py | 31 ++- swh/provenance/postgresql/provenancedb_base.py | 12 +- .../postgresql/provenancedb_with_path.py | 115 ++++------- swh/provenance/provenance.py | 220 ++++++++++++--------- swh/provenance/tests/test_provenance_db.py | 171 +++++++++++++++- 5 files changed, 365 insertions(+), 184 deletions(-) Changes applied before testcommit e242a69878f57c8b48227681dae76bee3ee01cf2 Merge: 49e47c3 16bab3c Author: Jenkins user <jenkins@localhost> Date: Wed Jun 2 15:36:30 2021 +0000 Merge branch 'diff-target' into HEAD commit 16bab3c60c2a3a80782273f1aaff796826e7dc2c Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:31:51 2021 +0200 Simplify DB queries in ProvenanceWithPathDB.content_find_(first|all) the queries should be exactly the same as before (query plans are the same); just written (hopefully) in a bit more readable manner. commit 024cc9ce93e545782a980f8e81d5d09651b8231b Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:59:16 2021 +0200 Add a test for content_find_all() commit af15ad65f4a34e7703bfec80666102a6403cb505 Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 09:47:03 2021 +0200 Refactor the isochrone graph computation attempt to simplify a bit this part of the code: - IsochroneNode are now only used for directories - FileEntry are used directly from IsochroneNode.entry.files (no need for creating new FileEntry instances), so - IsochroneNode.children only stores IsochroneNode (thus DirectoryEntry) objects, - rename IsochroneNode.date as 'dbdate' and clarify its semantics, - attempt to document (comments) a bit more the algorithm and semantics of several attributes/variables used in there. commit 1f49fdc967a2854d3a68dec34886b824fdf045f6 Author: David Douard <david.douard@sdfa3.org> Date: Wed May 19 16:16:53 2021 +0200 Replace 'DirectoryEntry.ls()' method by 'files' and 'dirs' properties and make the retrieval of children from the archive explicit in a dedicated retrieve_children() method. commit 72644b98a218132c0b173f360c503438688ecebb Author: David Douard <david.douard@sdfa3.org> Date: Wed May 19 16:14:41 2021 +0200 Add __str__ methods to RevisionEntry, DirectoryEntry and FileEntry to ease logging and debugging. commit a71041fbaf3f0d7ec3ea944cbbf04286c57d8b7e Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 12:44:07 2021 +0200 Improve a bit the code of ProvenanceDBBase commit defcb388ffba0869edb1a126b6626710c396c2ac Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 10:30:23 2021 +0200 Add a test for the build_isochrone_graph() function this test is far from ideal, since it's mostly the record of what happen during a "known good" session of revision insertions, but at least it should allow to refactor code related to the isochrone graph computation with a bit more confidence... See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/52/ for more details. Comment Actions Build is green Patch application report for D5781 (id=20767)Could not rebase; Attempt merge onto 49e47c3ea7... Updating 49e47c3..de30f33 Fast-forward swh/provenance/model.py | 31 ++- swh/provenance/postgresql/provenancedb_base.py | 12 +- .../postgresql/provenancedb_with_path.py | 115 ++++------- swh/provenance/provenance.py | 220 ++++++++++++--------- swh/provenance/tests/test_provenance_db.py | 171 +++++++++++++++- 5 files changed, 365 insertions(+), 184 deletions(-) Changes applied before testcommit de30f332f219e4edb299bb50a0b808a779c57d85 Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:31:51 2021 +0200 Simplify DB queries in ProvenanceWithPathDB.content_find_(first|all) the queries should be exactly the same as before (query plans are the same); just written (hopefully) in a bit more readable manner. commit ee8e4b0b7ce6a85eac0665a916a37b2d63e3bb4d Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:59:16 2021 +0200 Add a test for content_find_all() commit 94598b3ce8c49eb6dfe5308b47b74271a7f9d625 Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 09:47:03 2021 +0200 Refactor the isochrone graph computation attempt to simplify a bit this part of the code: - IsochroneNode are now only used for directories - FileEntry are used directly from IsochroneNode.entry.files (no need for creating new FileEntry instances), so - IsochroneNode.children only stores IsochroneNode (thus DirectoryEntry) objects, - rename IsochroneNode.date as 'dbdate' and clarify its semantics, - attempt to document (comments) a bit more the algorithm and semantics of several attributes/variables used in there. commit 9d110b93e9c39d65bf2986b148c4bf3467b0efa3 Author: David Douard <david.douard@sdfa3.org> Date: Wed May 19 16:16:53 2021 +0200 Replace 'DirectoryEntry.ls()' method by 'files' and 'dirs' properties and make the retrieval of children from the archive explicit in a dedicated retrieve_children() method. commit fcfbb250e688a4ade6849522714832ec49238a8d Author: David Douard <david.douard@sdfa3.org> Date: Wed May 19 16:14:41 2021 +0200 Add __str__ methods to RevisionEntry, DirectoryEntry and FileEntry to ease logging and debugging. commit 1f823ac01491ee0f27eac685d32322f8558c26bc Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 12:44:07 2021 +0200 Improve a bit the code of ProvenanceDBBase commit cb623cb0e7dd9a2a568b6d2645e89c4d86ba0a66 Author: David Douard <david.douard@sdfa3.org> Date: Wed May 12 10:30:23 2021 +0200 Add a test for the build_isochrone_graph() function this test is far from ideal, since it's mostly the record of what happen during a "known good" session of revision insertions, but at least it should allow to refactor code related to the isochrone graph computation with a bit more confidence... See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/61/ for more details. Comment Actions Build is green Patch application report for D5781 (id=20784)Could not rebase; Attempt merge onto 33eada55f0... Updating 33eada5..08344d3 Fast-forward .../postgresql/provenancedb_with_path.py | 115 ++++++--------------- swh/provenance/tests/test_provenance_db.py | 75 ++++++++++++++ 2 files changed, 109 insertions(+), 81 deletions(-) Changes applied before testcommit 08344d3f76c000b2fc30452c271a8ce3caf95cdd Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:31:51 2021 +0200 Simplify DB queries in ProvenanceWithPathDB.content_find_(first|all) the queries should be exactly the same as before (query plans are the same); just written (hopefully) in a bit more readable manner. commit ce700ffb9810d3607090d123b3b627e1f0cfd271 Author: David Douard <david.douard@sdfa3.org> Date: Tue May 25 14:59:16 2021 +0200 Add a test for content_find_all() See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/68/ for more details. |