
Add a test for content_find_all()
Closed, Public

Authored by douardda on May 25 2021, 3:20 PM.

Details

Summary

Depends on D5774

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed. May 25 2021, 3:20 PM
Harbormaster failed remote builds in B21621: Diff 20657!
Harbormaster returned this revision to the author for changes because remote builds failed. May 25 2021, 3:27 PM
Harbormaster failed remote builds in B21626: Diff 20662!

Build is green

Patch application report for D5780 (id=20673)

Could not rebase; Attempt merge onto 77fce4e59d...

Updating 77fce4e..a2fe638
Fast-forward
 swh/provenance/model.py                        |  21 ++-
 swh/provenance/postgresql/provenancedb_base.py |  22 +--
 swh/provenance/provenance.py                   | 202 +++++++++++++++----------
 swh/provenance/tests/test_provenance_db.py     | 169 ++++++++++++++++++++-
 4 files changed, 316 insertions(+), 98 deletions(-)
Changes applied before test
commit a2fe6386596e89f3af8c2d5b00a0b8d8d20d82ce
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 25 14:59:16 2021 +0200

    Add a test for content_find_all()

commit 489197686cfa447dc4ee0a827ef578cdc041b1d4
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri May 21 19:34:07 2021 +0200

    Replace ProvenanceDB.remove_cache by a dict of Set
    
    instead of a dict of dict, which was actually used as a set.

commit 2860fe6126de1804ab9190026545c935bbbbd99a
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 09:47:03 2021 +0200

    Refactor the isochrone graph computation
    
    attempt to simplify a bit this part of the code:
    
    - IsochroneNode are now only used for directories
    - FileEntry are stored in a new IsochroneNode.files attribute, so
    - IsochroneNode.children only stores IsochroneNode (thus DirectoryEntry)
      objects,
    - rename IsochroneNode.date as 'dbdate' and clarify its semantics

commit 8e4a2f69b53fd8ef74613509eb7a5f6707855a7a
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 19 16:16:53 2021 +0200

    Add 'ls_files()' and 'ls_dirs()' methods to the DirectoryEntry class
    
    to make it a bit easier to compute the isochrone graph (see following
    revisions).

commit 4ba05e92698d73a6a05078a969eea85d08cd5dca
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 19 16:14:41 2021 +0200

    Add __str__ methods to RevisionEntry, DirectoryEntry and FileEntry
    
    to ease logging and debugging.

commit fb0ef598657a9810deac43d495ab882718265543
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 12:44:07 2021 +0200

    Improve a bit the code of ProvenanceDBBase

commit edff905d0df269cb90246ef77b554b1ca58cbeef
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 10:30:23 2021 +0200

    Add a test for the build_isochrone_graph() function
    
    this test is far from ideal, since it's mostly the record of what happen
    during a "known good" session of revision insertions, but at least it
    should allow to refactor code related to the isochrone graph computation
    with a bit more confidence...

commit 113e11031aa5365a9244409a0cfe646061cb94f9
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 11 15:33:57 2021 +0200

    Replace the 'dates' argument of IsochroneNode() by a simple 'date' one
    
    there is no need for passing a dict here, we only care about the date
    for the node being instanciated.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/36/ for more details.

what about disabling E501 for the whole file, instead of repeating the comment? https://stackoverflow.com/a/64431741

aeviso requested changes to this revision. May 28 2021, 3:06 PM
aeviso added inline comments.
swh/provenance/tests/test_provenance_db.py
328

I believe we should check for set equality instead, as there is no guarantee on the order when a blob occurs several times in the same revision. Maybe also check that the lists have the same number of elements before converting them to sets.
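
A minimal sketch of the comparison suggested here, assuming each occurrence is a hashable tuple. The helper name `assert_same_occurrences` and the sample data are hypothetical and are not taken from the actual test file.

```python
def assert_same_occurrences(actual, expected):
    """Compare two collections of occurrences while ignoring their order."""
    actual = list(actual)
    expected = list(expected)
    # Check sizes first: converting to a set would silently collapse
    # duplicated entries and could hide a missing or extra occurrence.
    assert len(actual) == len(expected), (
        f"expected {len(expected)} occurrences, got {len(actual)}"
    )
    # Order-insensitive comparison of the contents.
    assert set(actual) == set(expected)


# Hypothetical (blob, revision, path) occurrences, for illustration only.
expected = [
    ("blob1", "rev1", "a/x"),
    ("blob1", "rev1", "b/x"),
]
actual = list(reversed(expected))
assert_same_occurrences(actual, expected)
```

Note that a length check followed by set equality still accepts lists whose duplicates differ (for example `[a, a, b]` versus `[a, b, b]`); comparing `collections.Counter(actual)` with `collections.Counter(expected)` would be a stricter alternative.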

This revision now requires changes to proceed. May 28 2021, 3:06 PM

> what about disabling E501 for the whole file, instead of repeating the comment? https://stackoverflow.com/a/64431741

I would have loved to be able to disable it only for the namespace/block (here, the dictionary, similar to what I did to trick black)...
I also thought about putting this in a dedicated file (so I could use the big "per-file-ignores" hammer), but I preferred to keep the data with the test; it's small enough to be readable within the test.
So meh, I think I still prefer repeating the comment for this one.
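
For reference, a sketch of what the file-wide alternative discussed above could look like, assuming flake8 3.7 or later, where a `# flake8: noqa: <codes>` comment on a line by itself applies to the whole file. The dictionary name and its contents are hypothetical placeholders, not the data used in the test.

```python
# flake8: noqa: E501
# The comment above asks flake8 to skip E501 (line too long) for this whole
# file, so the long data lines below need no per-line "# noqa" marker.

# Hypothetical test data, for illustration only.
EXPECTED_OCCURRENCES = {
    "hypothetical-blob-id": [
        ("hypothetical-revision-id-with-a-deliberately-long-value-to-exceed-the-limit", "a/rather/long/path/to/some/file"),
    ],
}
```

The trade-off is the one described in the comment above: the file-level form silences E501 for every line in the module, whereas repeating `# noqa: E501` keeps the exemption limited to the lines that need it.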

swh/provenance/tests/test_provenance_db.py
328

ok but then it should be clearly documented that the provenance db does not guarantee ordered / stable results...

rebase + use set() to compare expected results, as requested by aeviso

Build is green

Patch application report for D5780 (id=20735)

Could not rebase; Attempt merge onto 5aa0314dd7...

Updating 5aa0314..fd43523
Fast-forward
 swh/provenance/model.py                        |  21 ++-
 swh/provenance/postgresql/provenancedb_base.py |  12 +-
 swh/provenance/provenance.py                   | 201 ++++++++++++++-----------
 swh/provenance/tests/test_provenance_db.py     | 171 ++++++++++++++++++++-
 4 files changed, 310 insertions(+), 95 deletions(-)
Changes applied before test
commit fd43523fd594e70ccd002827d379321f52c2b6da
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 25 14:59:16 2021 +0200

    Add a test for content_find_all()

commit d85f2b0ee48aefe03ad32311623e5390f43d7261
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 09:47:03 2021 +0200

    Refactor the isochrone graph computation
    
    attempt to simplify a bit this part of the code:
    
    - IsochroneNode are now only used for directories
    - FileEntry are stored in a new IsochroneNode.files attribute, so
    - IsochroneNode.children only stores IsochroneNode (thus DirectoryEntry)
      objects,
    - rename IsochroneNode.date as 'dbdate' and clarify its semantics

commit 31d833ec86bf041e100795e7796ce832d00450ef
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 19 16:16:53 2021 +0200

    Add 'ls_files()' and 'ls_dirs()' methods to the DirectoryEntry class
    
    to make it a bit easier to compute the isochrone graph (see following
    revisions).

commit 72644b98a218132c0b173f360c503438688ecebb
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 19 16:14:41 2021 +0200

    Add __str__ methods to RevisionEntry, DirectoryEntry and FileEntry
    
    to ease logging and debugging.

commit a71041fbaf3f0d7ec3ea944cbbf04286c57d8b7e
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 12:44:07 2021 +0200

    Improve a bit the code of ProvenanceDBBase

commit defcb388ffba0869edb1a126b6626710c396c2ac
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 10:30:23 2021 +0200

    Add a test for the build_isochrone_graph() function
    
    this test is far from ideal, since it's mostly the record of what happen
    during a "known good" session of revision insertions, but at least it
    should allow to refactor code related to the isochrone graph computation
    with a bit more confidence...

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/42/ for more details.

This revision is now accepted and ready to land. Jun 2 2021, 1:24 PM

Build is green

Patch application report for D5780 (id=20751)

Could not rebase; Attempt merge onto 49e47c3ea7...

Merge made by the 'recursive' strategy.
 swh/provenance/model.py                        |  31 +++-
 swh/provenance/postgresql/provenancedb_base.py |  12 +-
 swh/provenance/provenance.py                   | 220 ++++++++++++++-----------
 swh/provenance/tests/test_provenance_db.py     | 171 ++++++++++++++++++-
 4 files changed, 331 insertions(+), 103 deletions(-)
Changes applied before test
commit fe2954a5a30c37ae2364b85a4978e5e485325932
Merge: 49e47c3 024cc9c
Author: Jenkins user <jenkins@localhost>
Date:   Wed Jun 2 15:35:04 2021 +0000

    Merge branch 'diff-target' into HEAD

commit 024cc9ce93e545782a980f8e81d5d09651b8231b
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 25 14:59:16 2021 +0200

    Add a test for content_find_all()

commit af15ad65f4a34e7703bfec80666102a6403cb505
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 09:47:03 2021 +0200

    Refactor the isochrone graph computation
    
    attempt to simplify a bit this part of the code:
    
    - IsochroneNode are now only used for directories
    - FileEntry are used directly from IsochroneNode.entry.files (no need
      for creating new FileEntry instances), so
    - IsochroneNode.children only stores IsochroneNode (thus DirectoryEntry)
      objects,
    - rename IsochroneNode.date as 'dbdate' and clarify its semantics,
    - attempt to document (comments) a bit more the algorithm and semantics
      of several attributes/variables used in there.

commit 1f49fdc967a2854d3a68dec34886b824fdf045f6
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 19 16:16:53 2021 +0200

    Replace 'DirectoryEntry.ls()' method by 'files' and 'dirs' properties
    
    and make the retrieval of children from the archive explicit in a
    dedicated retrieve_children() method.

commit 72644b98a218132c0b173f360c503438688ecebb
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 19 16:14:41 2021 +0200

    Add __str__ methods to RevisionEntry, DirectoryEntry and FileEntry
    
    to ease logging and debugging.

commit a71041fbaf3f0d7ec3ea944cbbf04286c57d8b7e
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 12:44:07 2021 +0200

    Improve a bit the code of ProvenanceDBBase

commit defcb388ffba0869edb1a126b6626710c396c2ac
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 10:30:23 2021 +0200

    Add a test for the build_isochrone_graph() function
    
    this test is far from ideal, since it's mostly the record of what happen
    during a "known good" session of revision insertions, but at least it
    should allow to refactor code related to the isochrone graph computation
    with a bit more confidence...

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/51/ for more details.

Build is green

Patch application report for D5780 (id=20766)

Could not rebase; Attempt merge onto 49e47c3ea7...

Updating 49e47c3..ee8e4b0
Fast-forward
 swh/provenance/model.py                        |  31 +++-
 swh/provenance/postgresql/provenancedb_base.py |  12 +-
 swh/provenance/provenance.py                   | 220 ++++++++++++++-----------
 swh/provenance/tests/test_provenance_db.py     | 171 ++++++++++++++++++-
 4 files changed, 331 insertions(+), 103 deletions(-)
Changes applied before test
commit ee8e4b0b7ce6a85eac0665a916a37b2d63e3bb4d
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 25 14:59:16 2021 +0200

    Add a test for content_find_all()

commit 94598b3ce8c49eb6dfe5308b47b74271a7f9d625
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 09:47:03 2021 +0200

    Refactor the isochrone graph computation
    
    attempt to simplify a bit this part of the code:
    
    - IsochroneNode are now only used for directories
    - FileEntry are used directly from IsochroneNode.entry.files (no need
      for creating new FileEntry instances), so
    - IsochroneNode.children only stores IsochroneNode (thus DirectoryEntry)
      objects,
    - rename IsochroneNode.date as 'dbdate' and clarify its semantics,
    - attempt to document (comments) a bit more the algorithm and semantics
      of several attributes/variables used in there.

commit 9d110b93e9c39d65bf2986b148c4bf3467b0efa3
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 19 16:16:53 2021 +0200

    Replace 'DirectoryEntry.ls()' method by 'files' and 'dirs' properties
    
    and make the retrieval of children from the archive explicit in a
    dedicated retrieve_children() method.

commit fcfbb250e688a4ade6849522714832ec49238a8d
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 19 16:14:41 2021 +0200

    Add __str__ methods to RevisionEntry, DirectoryEntry and FileEntry
    
    to ease logging and debugging.

commit 1f823ac01491ee0f27eac685d32322f8558c26bc
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 12:44:07 2021 +0200

    Improve a bit the code of ProvenanceDBBase

commit cb623cb0e7dd9a2a568b6d2645e89c4d86ba0a66
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed May 12 10:30:23 2021 +0200

    Add a test for the build_isochrone_graph() function
    
    this test is far from ideal, since it's mostly the record of what happen
    during a "known good" session of revision insertions, but at least it
    should allow to refactor code related to the isochrone graph computation
    with a bit more confidence...

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/60/ for more details.

Build is green

Patch application report for D5780 (id=20783)

Rebasing onto 33eada55f0...

Current branch diff-target is up to date.
Changes applied before test
commit ce700ffb9810d3607090d123b3b627e1f0cfd271
Author: David Douard <david.douard@sdfa3.org>
Date:   Tue May 25 14:59:16 2021 +0200

    Add a test for content_find_all()

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/67/ for more details.

This revision was automatically updated to reflect the committed changes.