Refactor origin-revision layer
Closed, Public

Authored by aeviso on Jun 17 2021, 3:54 PM.

Details

Summary

Remove hash_to_hex usage in the revision-content layer
It was only used for debug messages and is now replaced by bytes' hex method.

Add history graph structure to be used in the origin-revision layer algorithm
Also all uses of hash_to_hex were removed from the graph module in favour
of bytes' hex method.

Refactor origin-revision layer algorithm to use the new history graph structure

Depends on D5880
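
As a minimal sketch of the replacement described in the summary, the built-in `bytes.hex()` method already produces the hexadecimal string needed for debug output. The identifier value and variable names below are made up for illustration and are not taken from the diff.

```python
# A made-up 20-byte identifier standing in for a revision id (illustrative only).
rev_id = bytes.fromhex("00112233445566778899aabbccddeeff00112233")

# bytes.hex() is built in, so no helper import is needed for debug output.
hex_id = rev_id.hex()
debug_line = f"Processing revision {hex_id}"

# The conversion round-trips through bytes.fromhex().
round_trip = bytes.fromhex(hex_id)
```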

Diff Detail

Repository
rDPROV Provenance database
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D5886 (id=21094)

Could not rebase; Attempt merge onto c9d1369ba1...

Updating c9d1369..2228a3c
Fast-forward
 swh/provenance/archive.py                          |  42 +++++---
 swh/provenance/graph.py                            |  98 +++++++++++++++---
 swh/provenance/model.py                            | 111 ++++++++++-----------
 swh/provenance/origin.py                           |  88 +++++++---------
 swh/provenance/postgresql/archive.py               |  67 +++++++------
 swh/provenance/postgresql/provenancedb_base.py     | 102 ++++++++++++-------
 .../postgresql/provenancedb_with_path.py           |  26 ++---
 .../postgresql/provenancedb_without_path.py        |  22 ++--
 swh/provenance/provenance.py                       |  48 +++++----
 swh/provenance/revision.py                         |   8 +-
 swh/provenance/sql/30-schema.sql                   |  10 +-
 swh/provenance/storage/archive.py                  |  72 +++++++------
 12 files changed, 412 insertions(+), 282 deletions(-)
Changes applied before test
commit 2228a3c1b05569830732768d43308567f4a55814
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:51:20 2021 +0200

    Refactor origin-revision layer algorithm to use the new history graph structure

commit 7d783236977c9f90c2f5ff20ba1898f6e564b295
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:47:35 2021 +0200

    Add history graph structure to be used in the origin-revision layer algorithm
    
    Also all uses of `hash_to_bytes` were removed from the graph module in favour
    of bytes' `hex` method.

commit 03295d50ffb5fb6aa5d436b1d8d8b9ae9e736ba5
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:23:29 2021 +0200

    Remove `hash_to_bytes` usage in the revision-content layer
    
    It was only used for debug messaged and it's now replaced by bytes' `hex` method.

commit 4c204561c0d3264e81b3fb1b75cf80e44ccbd9be
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 13:53:25 2021 +0200

    Update backend methods associated to the origin-revision layer

commit cf9136d10a143c962b5b37baad9321d7ff4959d1
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 11:11:23 2021 +0200

    Fix outdated comments and code styling.

commit efb4c361330a1402ec25ecb0b62b8ab41382c740
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 14 14:03:09 2021 +0200

    Refactor RevisionEntry's parents iterator
    
    Make parents a class property and create a separate method to retrieve information
    from the archive, just as it is done for the other model classes

commit a12961f7536eb2e25c81b4f106c028dc6c655b29
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:51:54 2021 +0200

    Rework ArchiveInterface
    
    Unused methods were removed and type annotations fixed.
    Other methos in OriginEntry and RevisionEntry were updated accordingly

commit 12edf4a3b7992575f5f24cd8a648c803dbda142a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:48:24 2021 +0200

    Fix bugs when retrieven parents in RevisionEntry

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/154/ for more details.
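
The "Refactor RevisionEntry's parents iterator" commit above describes a pattern where a property exposes cached data and a separate method fetches it from the archive. A rough sketch of that pattern follows; `FakeArchive`, `revision_get_parents` and the attribute names are assumptions made for illustration, not the code from the diff.

```python
from typing import Dict, Iterator, List, Optional


class FakeArchive:
    """Hypothetical in-memory archive mapping a revision id to its parent ids."""

    def __init__(self, parents: Dict[bytes, List[bytes]]) -> None:
        self._parents = parents

    def revision_get_parents(self, rev_id: bytes) -> Iterator[bytes]:
        yield from self._parents.get(rev_id, [])


class RevisionEntry:
    """Simplified stand-in for the model class; not the code from the diff."""

    def __init__(self, rev_id: bytes) -> None:
        self.id = rev_id
        self._parents: Optional[List["RevisionEntry"]] = None

    def retrieve_parents(self, archive: FakeArchive) -> None:
        """Query the archive once and cache the parent entries."""
        if self._parents is None:
            self._parents = [
                RevisionEntry(pid) for pid in archive.revision_get_parents(self.id)
            ]

    @property
    def parents(self) -> Iterator["RevisionEntry"]:
        """Iterate over cached parents; requires a prior retrieve_parents() call."""
        if self._parents is None:
            raise RuntimeError("parents not retrieved; call retrieve_parents() first")
        return iter(self._parents)


archive = FakeArchive({b"c": [b"a", b"b"]})
rev = RevisionEntry(b"c")
rev.retrieve_parents(archive)
parent_ids = [p.id for p in rev.parents]
```

Keeping the archive access in an explicit method makes the I/O visible at the call site, while the property stays cheap to read.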

Build is green

Patch application report for D5886 (id=21134)

Could not rebase; Attempt merge onto 8ff1ab5860...

Updating 8ff1ab5..fa649b7
Fast-forward
 requirements-swh.txt                               |   2 +-
 swh/provenance/archive.py                          |  42 +++++--
 swh/provenance/graph.py                            |  98 +++++++++++++--
 swh/provenance/model.py                            |  90 +++++++-------
 swh/provenance/origin.py                           |  88 ++++++--------
 swh/provenance/postgresql/archive.py               |  67 ++++++-----
 swh/provenance/postgresql/provenancedb_base.py     | 133 +++++++++++++--------
 .../postgresql/provenancedb_with_path.py           |  26 ++--
 .../postgresql/provenancedb_without_path.py        |  22 ++--
 swh/provenance/provenance.py                       |  48 +++++---
 swh/provenance/revision.py                         |   8 +-
 swh/provenance/sql/30-schema.sql                   |  10 +-
 swh/provenance/storage/archive.py                  |  72 ++++++-----
 13 files changed, 418 insertions(+), 288 deletions(-)
Changes applied before test
commit fa649b7add05cde5aaf6de35e69896e54f9e3402
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:51:20 2021 +0200

    Refactor origin-revision layer algorithm to use the new history graph structure

commit 6c4c039af0be1f18564b37dc9ec2de62d70e202b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:47:35 2021 +0200

    Add history graph structure to be used in the origin-revision layer algorithm
    
    Also all uses of `hash_to_bytes` were removed from the graph module in favour
    of bytes' `hex` method.

commit 5954211b2f69c9fb54788b9487cf03fa76e1e214
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:23:29 2021 +0200

    Remove `hash_to_bytes` usage in the revision-content layer
    
    It was only used for debug messaged and it's now replaced by bytes' `hex` method.

commit e89b9a7ccfccd526c09f66c292664a7b5c7e4aa9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 13:53:25 2021 +0200

    Update backend methods associated to the origin-revision layer

commit ce6a2a4dedc80f6a19918b6b927b4d35462fc484
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 11:11:23 2021 +0200

    Fix outdated comments and code styling.

commit a7e977c8af91a7b7a40655a30fe19211b6d794da
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 14 14:03:09 2021 +0200

    Refactor RevisionEntry's parents iterator
    
    Make parents a class property and create a separate method to retrieve information
    from the archive, just as it is done for the other model classes

commit 969e3359123b781ae1101e2fb62b9934b0988478
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:51:54 2021 +0200

    Rework ArchiveInterface
    
    Remove Unused methods and fix type annotations. Update Other methods in
    OriginEntry and RevisionEntry accordingly.

commit d28d3eed7049e36c5841405021fd0a0e78c11cb7
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:48:24 2021 +0200

    Fix bugs when retrieving parents in RevisionEntry
    
    Convert `Revision.date` from` TimestampWithTimezone` to `datetime` as expected by` RevisionEntry`.
    
    Create a list with the iterator returned by `ArchiveInterface.revision_get ()` before comparison.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/165/ for more details.
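
The second fix in the "Fix bugs when retrieving parents" commit above (building a list from the iterator before comparing) addresses a general Python pitfall. The sketch below uses a hypothetical `revision_get` stand-in rather than the real `ArchiveInterface` method.

```python
from typing import Iterator, List


def revision_get(ids: List[bytes]) -> Iterator[bytes]:
    """Hypothetical stand-in for an archive call that returns an iterator."""
    return (rev_id for rev_id in ids)


expected = [b"a", b"b"]

# A generator object never compares equal to a list, whatever it would yield.
naive_match = revision_get(expected) == expected

# Materializing the iterator first makes the comparison element-wise.
list_match = list(revision_get(expected)) == expected
```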

douardda added a subscriber: douardda.

I know I am rambling, but could it come with some testing?

swh/provenance/graph.py
54 ↗(On Diff #21094)

why exclude self.parents here? Looks very odd. If it's by design, it should be commented (explain why), otherwise, it should be added (and possibly __eq__ could be simplified as hash(self) == hash(other)).

62 ↗(On Diff #21094)

please make this a proper docstring (with a quick explanation of what it does, e.g., the fact that it recursively builds the history graph from the revision using the archive to retrieve revisions).

swh/provenance/origin.py
102 ↗(On Diff #21134)

would be nice to make this a docstring.

Also, is the notion of "preferred origin" defined/documented somewhere?

74 ↗(On Diff #21094)

s/head it's/head is/ right?

This revision now requires changes to proceed. Jun 18 2021, 2:20 PM

Build is green

Patch application report for D5886 (id=21142)

Could not rebase; Attempt merge onto 8ff1ab5860...

Updating 8ff1ab5..25d12d8
Fast-forward
 requirements-swh.txt                               |   2 +-
 swh/provenance/archive.py                          |  42 +++++--
 swh/provenance/graph.py                            |  98 +++++++++++++--
 swh/provenance/model.py                            |  90 +++++++-------
 swh/provenance/origin.py                           |  88 ++++++--------
 swh/provenance/postgresql/archive.py               |  70 ++++++-----
 swh/provenance/postgresql/provenancedb_base.py     | 133 +++++++++++++--------
 .../postgresql/provenancedb_with_path.py           |  26 ++--
 .../postgresql/provenancedb_without_path.py        |  22 ++--
 swh/provenance/provenance.py                       |  48 +++++---
 swh/provenance/revision.py                         |   8 +-
 swh/provenance/sql/30-schema.sql                   |  10 +-
 swh/provenance/storage/archive.py                  |  72 ++++++-----
 13 files changed, 421 insertions(+), 288 deletions(-)
Changes applied before test
commit 25d12d88fc9492b895f02a5cdfe7254108425f29
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:51:20 2021 +0200

    Refactor origin-revision layer algorithm to use the new history graph structure

commit a74288e2bea94e97715910bd73ae00379e25bbd0
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:47:35 2021 +0200

    Add history graph structure to be used in the origin-revision layer algorithm
    
    Also all uses of `hash_to_hex` were removed from the graph module in favour
    of bytes' `hex` method.

commit 2a98a575f4cde500f6628af7da0ebd90790e7161
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:23:29 2021 +0200

    Remove `hash_to_hex` usage in the revision-content layer
    
    It was only used for debug messaged and it's now replaced by bytes' `hex` method.

commit ef13a5a5072cd3f0a68367571613ba45ff8e3c84
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 13:53:25 2021 +0200

    Update backend methods associated to the origin-revision layer

commit 4953a1688a19b956a33164b4e5540edb3b8e2c78
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 11:11:23 2021 +0200

    Fix outdated comments and code styling.

commit e610883f5d89c58e409cc7ddf490a4a778b99dcd
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 14 14:03:09 2021 +0200

    Refactor RevisionEntry's parents iterator
    
    Make parents a class property and create a separate method to retrieve information
    from the archive, just as it is done for the other model classes

commit 01bbb0ce850a5cfafaa1837fa346a19b2a851fc9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:51:54 2021 +0200

    Rework ArchiveInterface
    
    Remove Unused methods and fix type annotations. Update Other methods in
    OriginEntry and RevisionEntry accordingly.

commit cd48f9d71efeaf73cfcc24f397d22589a5f7987b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:48:24 2021 +0200

    Fix bugs when retrieving parents in RevisionEntry
    
    Convert `Revision.date` from` TimestampWithTimezone` to `datetime` as expected by` RevisionEntry`.
    Create a list with the iterator returned by `ArchiveInterface.revision_get()` before comparison.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/170/ for more details.

swh/provenance/graph.py
54 ↗(On Diff #21094)

This is the same as before with IsochroneNode: we need __hash__ to be able to create sets of HistoryNode, but sets themselves are not hashable. Regarding the redefinition of __eq__, correct me if I'm wrong, but I believe the hash function has collisions (i.e. two distinct elements may have the same hash). So I don't think it is a good notion of equality. I know the probability is low, but not null. So I'd rather implement equality explicitly.
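
To make the collision point concrete: Python sets use `__hash__` only to pick a bucket and then fall back on `__eq__`, so equal hashes do not imply equal objects. The toy `Key` class below forces every hash to collide; it is purely illustrative and unrelated to the classes in the diff.

```python
class Key:
    """Toy class whose hash always collides (illustrative assumption)."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __hash__(self) -> int:
        return 0  # every instance lands in the same bucket

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Key):
            return NotImplemented
        return self.value == other.value


a, b, a_again = Key(1), Key(2), Key(1)

# Same hash, yet the set keeps a and b apart because __eq__ says they differ.
keys = {a, b, a_again}
distinct_count = len(keys)
```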

62 ↗(On Diff #21094)

What? The comment line? I don't really understand the criteria for defining docstrings: some functions have them, others don't. What's the general rule?

swh/provenance/origin.py
102 ↗(On Diff #21134)

No, there is no notion of preferred origin actually. We just set the first we find

swh/provenance/graph.py
54 ↗(On Diff #21094)

If you fear collisions, then don't use the hash function at all (and don't use sets).
It makes no sense to have a "better" __eq__ than __hash__ here.

About excluding self.parents, I guess a "valid" reason could be that they are already taken into consideration thanks to self.entry, but it depends on how the latter (a RevisionEntry) is itself hashed.
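
One way to reconcile both positions is to have `__hash__` and `__eq__` rely on exactly the same immutable identity and leave the mutable `parents` set out of both. The sketch below assumes a plain `entry_id` in place of the `RevisionEntry`; it is an illustration, not the implementation under review.

```python
from typing import Set


class HistoryNode:
    """Simplified sketch; `entry_id` stands in for the RevisionEntry."""

    def __init__(self, entry_id: bytes) -> None:
        self.entry_id = entry_id  # treated as the immutable identity
        self.parents: Set["HistoryNode"] = set()  # mutable, grows during build

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, HistoryNode):
            return NotImplemented
        return self.entry_id == other.entry_id

    def __hash__(self) -> int:
        # Uses exactly the field __eq__ compares, so equal nodes hash equally.
        return hash(self.entry_id)


head = HistoryNode(b"head")
seen = {head}
head.parents.add(HistoryNode(b"parent"))  # mutation does not change the hash
still_found = head in seen
same_identity = HistoryNode(b"head") in seen
```

With this shape, a node stays retrievable from a set while its parents are being filled in.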

62 ↗(On Diff #21094)

The general rule is every function should have a docstring documenting what the function does.

Comments may be present in the code to explain how the code works, but if the code requires too many comments to be understandable, it's generally a bad smell: something is not right with this piece of code or this architecture. (Again, it's a general rule, so sometimes you really need comments to help the reader understand the code, why it's written as it is, etc.)
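
As an illustration of that rule, the sketch below puts the "what" in a docstring and keeps a single inline comment for the "why". The function name, its signature, and the dictionary standing in for archive lookups are assumptions; the traversal is iterative rather than recursive to keep the example short.

```python
from typing import Dict, List, Set


def build_history(
    parents_of: Dict[bytes, List[bytes]], head: bytes
) -> Dict[bytes, Set[bytes]]:
    """Build the history graph reachable from ``head``.

    Starting at ``head``, follow parent links found in ``parents_of`` (a
    hypothetical stand-in for archive lookups) and return a mapping from each
    visited revision to the set of its parents.
    """
    graph: Dict[bytes, Set[bytes]] = {}
    stack = [head]
    while stack:
        rev = stack.pop()
        if rev in graph:
            # Merges can reach the same revision twice; visit it only once.
            continue
        graph[rev] = set(parents_of.get(rev, []))
        stack.extend(graph[rev])
    return graph


history = build_history({b"c": [b"a", b"b"], b"b": [b"a"]}, b"c")
```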

swh/provenance/origin.py
102 ↗(On Diff #21134)

> No, there is no notion of preferred origin actually. We just set the first we find

well, the function is called check_preferred_origin...

Build is green

Patch application report for D5886 (id=21155)

Could not rebase; Attempt merge onto 8ff1ab5860...

Updating 8ff1ab5..7127a5e
Fast-forward
 requirements-swh.txt                               |   2 +-
 swh/provenance/archive.py                          |  42 +++++--
 swh/provenance/graph.py                            | 133 +++++++++++++--------
 swh/provenance/model.py                            |  90 +++++++-------
 swh/provenance/origin.py                           |  88 ++++++--------
 swh/provenance/postgresql/archive.py               |  70 ++++++-----
 swh/provenance/postgresql/provenancedb_base.py     | 133 +++++++++++++--------
 .../postgresql/provenancedb_with_path.py           |  26 ++--
 .../postgresql/provenancedb_without_path.py        |  22 ++--
 swh/provenance/provenance.py                       |  63 ++++++----
 swh/provenance/revision.py                         |   8 +-
 swh/provenance/sql/30-schema.sql                   |  10 +-
 swh/provenance/storage/archive.py                  |  72 ++++++-----
 swh/provenance/tests/test_isochrone_graph.py       |   4 +-
 14 files changed, 431 insertions(+), 332 deletions(-)
Changes applied before test
commit 7127a5efb285bf08dea8c0aeffaaed6ccb178f66
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:51:20 2021 +0200

    Refactor origin-revision layer algorithm to use the new history graph structure

commit 65ed210e72500cf7522411295f6ad1fa674bfa46
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 21 14:47:30 2021 +0200

    Add history graph structure to be used in the origin-revision layer algorithm

commit 446ee1ec2a497831b0e88e7291699d33d04092b9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 21 14:42:12 2021 +0200

    Fix IsochroneNode hash calculation
    
    There was a bug as the previous implementation was considering mutable attributes.
    This fix allows to replace the list of children by a set instead. Also, all uses
    of `hash_to_hex` were removed from the graph module in favour of bytes' `hex` method.

commit ac9461eb350b6bc6dfae5adb7ef9b6c0678e23f7
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:23:29 2021 +0200

    Remove `hash_to_hex` usage in the revision-content layer
    
    It was only used for debug messaged and it's now replaced by bytes' `hex` method.

commit 31f9aad728a70b7bfe9f065b86bf78ab1126bc99
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 13:53:25 2021 +0200

    Update backend methods associated to the origin-revision layer

commit 4953a1688a19b956a33164b4e5540edb3b8e2c78
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 11:11:23 2021 +0200

    Fix outdated comments and code styling.

commit e610883f5d89c58e409cc7ddf490a4a778b99dcd
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 14 14:03:09 2021 +0200

    Refactor RevisionEntry's parents iterator
    
    Make parents a class property and create a separate method to retrieve information
    from the archive, just as it is done for the other model classes

commit 01bbb0ce850a5cfafaa1837fa346a19b2a851fc9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:51:54 2021 +0200

    Rework ArchiveInterface
    
    Remove Unused methods and fix type annotations. Update Other methods in
    OriginEntry and RevisionEntry accordingly.

commit cd48f9d71efeaf73cfcc24f397d22589a5f7987b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:48:24 2021 +0200

    Fix bugs when retrieving parents in RevisionEntry
    
    Convert `Revision.date` from` TimestampWithTimezone` to `datetime` as expected by` RevisionEntry`.
    Create a list with the iterator returned by `ArchiveInterface.revision_get()` before comparison.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/172/ for more details.
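
The "Fix IsochroneNode hash calculation" commit above mentions a bug caused by hashing mutable attributes. The general pitfall can be reproduced with the toy class below; `BadNode` and its fields are invented for illustration and do not mirror the real `IsochroneNode`.

```python
from typing import List


class BadNode:
    """Toy node whose hash depends on a mutable attribute (the pitfall)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children: List[str] = []

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, BadNode):
            return NotImplemented
        return (self.name, self.children) == (other.name, other.children)

    def __hash__(self) -> int:
        return hash((self.name, tuple(self.children)))


node = BadNode("root")
nodes = {node}
found_before = node in nodes

# Mutating the hashed attribute changes the hash after insertion, so the
# set looks in the wrong place and the lookup misses the stored node.
node.children.append("leaf")
found_after = node in nodes
```

Restricting the hash to attributes that never change after construction avoids this, which is also what makes it safe to store such nodes in sets.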

ok but some questions/remarks have not been addressed...

swh/provenance/graph.py
51 ↗(On Diff #21155)

s/on/from/ (from the given revision)

swh/provenance/origin.py
50 ↗(On Diff #21155)

why remove the -> None annotation?

(and same below)

This revision is now accepted and ready to land. Jun 21 2021, 4:37 PM

Build is green

Patch application report for D5886 (id=21174)

Could not rebase; Attempt merge onto 011645221c...

Updating 0116452..9191002
Fast-forward
 requirements-swh.txt                               |   2 +-
 swh/provenance/archive.py                          |  42 +++++--
 swh/provenance/graph.py                            | 133 +++++++++++++--------
 swh/provenance/model.py                            |  90 +++++++-------
 swh/provenance/origin.py                           |  88 ++++++--------
 swh/provenance/postgresql/archive.py               |  70 ++++++-----
 swh/provenance/postgresql/provenancedb_base.py     | 133 +++++++++++++--------
 .../postgresql/provenancedb_with_path.py           |  26 ++--
 .../postgresql/provenancedb_without_path.py        |  22 ++--
 swh/provenance/provenance.py                       |  62 ++++++----
 swh/provenance/revision.py                         |   8 +-
 swh/provenance/sql/30-schema.sql                   |  10 +-
 swh/provenance/storage/archive.py                  |  72 ++++++-----
 swh/provenance/tests/test_isochrone_graph.py       |   4 +-
 14 files changed, 430 insertions(+), 332 deletions(-)
Changes applied before test
commit 9191002381e097ef8cfd48bb7f7e327963b5f7c3
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:51:20 2021 +0200

    Refactor origin-revision layer algorithm to use the new history graph structure

commit 61206fef3c9a6d75967e373ca77973bbe7df4052
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 21 14:47:30 2021 +0200

    Add history graph structure to be used in the origin-revision layer algorithm

commit 87fbfa18c1884e7ccd732d07304e57494400eaa4
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 21 14:42:12 2021 +0200

    Fix IsochroneNode hash calculation
    
    There was a bug as the previous implementation was considering mutable attributes.
    This fix allows to replace the list of children by a set instead. Also, all uses
    of `hash_to_hex` were removed from the graph module in favour of bytes' `hex` method.

commit 6a97e7e2e30ec20a7265f1526c3fff8ea70d2cbb
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:23:29 2021 +0200

    Remove `hash_to_hex` usage in the revision-content layer
    
    It was only used for debug messaged and it's now replaced by bytes' `hex` method.

commit f2ec1e58c91f1fda1121c6d7c1e29dede6431d45
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 13:53:25 2021 +0200

    Update backend methods associated to the origin-revision layer

commit fd66d83c119d8f8b283098f6320bf6f68cc7114f
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 11:11:23 2021 +0200

    Fix outdated comments and code styling.

commit 417fd014d96f09b8a20831a6a060cfb408376d5c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 14 14:03:09 2021 +0200

    Refactor RevisionEntry's parents iterator
    
    Make parents a class property and create a separate method to retrieve information
    from the archive, just as it is done for the other model classes

commit f354b65e52ed78b3a637b2632e1b74bb920669be
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:51:54 2021 +0200

    Rework ArchiveInterface
    
    Remove Unused methods and fix type annotations. Update Other methods in
    OriginEntry and RevisionEntry accordingly.

commit e6f39d0244b10b49942b0ab93d4628828e343642
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:48:24 2021 +0200

    Fix bugs when retrieving parents in RevisionEntry
    
    Convert `Revision.date` from` TimestampWithTimezone` to `datetime` as expected by` RevisionEntry`.
    Create a list with the iterator returned by `ArchiveInterface.revision_get()` before comparison.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/180/ for more details.

Build is green

Patch application report for D5886 (id=21177)

Could not rebase; Attempt merge onto 011645221c...

Updating 0116452..532a9cb
Fast-forward
 requirements-swh.txt                               |   2 +-
 swh/provenance/archive.py                          |  42 +++++--
 swh/provenance/graph.py                            | 133 +++++++++++++--------
 swh/provenance/model.py                            |  90 +++++++-------
 swh/provenance/origin.py                           |  88 ++++++--------
 swh/provenance/postgresql/archive.py               |  70 ++++++-----
 swh/provenance/postgresql/provenancedb_base.py     | 133 +++++++++++++--------
 .../postgresql/provenancedb_with_path.py           |  26 ++--
 .../postgresql/provenancedb_without_path.py        |  22 ++--
 swh/provenance/provenance.py                       |  62 ++++++----
 swh/provenance/revision.py                         |   8 +-
 swh/provenance/sql/30-schema.sql                   |  10 +-
 swh/provenance/storage/archive.py                  |  72 ++++++-----
 swh/provenance/tests/test_isochrone_graph.py       |   4 +-
 14 files changed, 430 insertions(+), 332 deletions(-)
Changes applied before test
commit 532a9cba75fc066b82300a58c3467a44e5cc1809
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:51:20 2021 +0200

    Refactor origin-revision layer algorithm to use the new history graph structure

commit 1773a39d2b026d7574ebb69a903652fc36accd07
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 21 14:47:30 2021 +0200

    Add history graph structure to be used in the origin-revision layer algorithm

commit 87fbfa18c1884e7ccd732d07304e57494400eaa4
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 21 14:42:12 2021 +0200

    Fix IsochroneNode hash calculation
    
    There was a bug as the previous implementation was considering mutable attributes.
    This fix allows to replace the list of children by a set instead. Also, all uses
    of `hash_to_hex` were removed from the graph module in favour of bytes' `hex` method.

commit 6a97e7e2e30ec20a7265f1526c3fff8ea70d2cbb
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:23:29 2021 +0200

    Remove `hash_to_hex` usage in the revision-content layer
    
    It was only used for debug messaged and it's now replaced by bytes' `hex` method.

commit f2ec1e58c91f1fda1121c6d7c1e29dede6431d45
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 13:53:25 2021 +0200

    Update backend methods associated to the origin-revision layer

commit fd66d83c119d8f8b283098f6320bf6f68cc7114f
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 11:11:23 2021 +0200

    Fix outdated comments and code styling.

commit 417fd014d96f09b8a20831a6a060cfb408376d5c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 14 14:03:09 2021 +0200

    Refactor RevisionEntry's parents iterator
    
    Make parents a class property and create a separate method to retrieve information
    from the archive, just as it is done for the other model classes

commit f354b65e52ed78b3a637b2632e1b74bb920669be
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:51:54 2021 +0200

    Rework ArchiveInterface
    
    Remove Unused methods and fix type annotations. Update Other methods in
    OriginEntry and RevisionEntry accordingly.

commit e6f39d0244b10b49942b0ab93d4628828e343642
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:48:24 2021 +0200

    Fix bugs when retrieving parents in RevisionEntry
    
    Convert `Revision.date` from` TimestampWithTimezone` to `datetime` as expected by` RevisionEntry`.
    Create a list with the iterator returned by `ArchiveInterface.revision_get()` before comparison.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/182/ for more details.

Build is green

Patch application report for D5886 (id=21185)

Could not rebase; Attempt merge onto 011645221c...

Updating 0116452..86b731b
Fast-forward
 requirements-swh.txt                               |   2 +-
 swh/provenance/archive.py                          |  42 +++++--
 swh/provenance/graph.py                            | 133 +++++++++++++--------
 swh/provenance/model.py                            |  90 +++++++-------
 swh/provenance/origin.py                           |  88 ++++++--------
 swh/provenance/postgresql/archive.py               |  70 ++++++-----
 swh/provenance/postgresql/provenancedb_base.py     | 133 +++++++++++++--------
 .../postgresql/provenancedb_with_path.py           |  26 ++--
 .../postgresql/provenancedb_without_path.py        |  22 ++--
 swh/provenance/provenance.py                       |  62 ++++++----
 swh/provenance/revision.py                         |   8 +-
 swh/provenance/sql/30-schema.sql                   |  10 +-
 swh/provenance/storage/archive.py                  |  72 ++++++-----
 swh/provenance/tests/test_isochrone_graph.py       |   4 +-
 14 files changed, 430 insertions(+), 332 deletions(-)
Changes applied before test
commit 86b731bbde4d55cdc6f193d405811dd86cbde555
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:51:20 2021 +0200

    Refactor origin-revision layer algorithm to use the new history graph structure

commit f358690963061c8b0afdddd4f67f5dec2b5be53f
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 21 14:47:30 2021 +0200

    Add history graph structure to be used in the origin-revision layer algorithm

commit 78614359c07e61abf9f1d8293644430dcba09273
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 21 14:42:12 2021 +0200

    Fix IsochroneNode hash calculation
    
    There was a bug as the previous implementation was considering mutable attributes.
    This fix allows to replace the list of children by a set instead. Also, all uses
    of `hash_to_hex` were removed from the graph module in favour of bytes' `hex` method.

commit acebbfcf4527e9f69e5a47e7ebb847025078ed2f
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 15:23:29 2021 +0200

    Remove `hash_to_hex` usage in the revision-content layer
    
    It was only used for debug messaged and it's now replaced by bytes' `hex` method.

commit a0e6dffc572d46a8b0b4407319eeb5b189cf9dc8
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 13:53:25 2021 +0200

    Update backend methods associated to the origin-revision layer

commit fd66d83c119d8f8b283098f6320bf6f68cc7114f
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 16 11:11:23 2021 +0200

    Fix outdated comments and code styling.

commit 417fd014d96f09b8a20831a6a060cfb408376d5c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 14 14:03:09 2021 +0200

    Refactor RevisionEntry's parents iterator
    
    Make parents a class property and create a separate method to retrieve information
    from the archive, just as it is done for the other model classes

commit f354b65e52ed78b3a637b2632e1b74bb920669be
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:51:54 2021 +0200

    Rework ArchiveInterface
    
    Remove Unused methods and fix type annotations. Update Other methods in
    OriginEntry and RevisionEntry accordingly.

commit e6f39d0244b10b49942b0ab93d4628828e343642
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 17 13:48:24 2021 +0200

    Fix bugs when retrieving parents in RevisionEntry
    
    Convert `Revision.date` from` TimestampWithTimezone` to `datetime` as expected by` RevisionEntry`.
    Create a list with the iterator returned by `ArchiveInterface.revision_get()` before comparison.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/185/ for more details.