Page MenuHomeSoftware Heritage

mercurial: Build snapshot on visits
ClosedPublic

Authored by ardumont on Sep 15 2021, 2:55 PM.

Details

Summary

If an uneventful visit happens, then an empty snapshot should happen. When filtering
revisions for speed, still the heads should appear in the snapshot.

Related to T3571

Test Plan

tox

Diff Detail

Repository
rDLDHG Mercurial loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D6268 (id=22694)

Could not rebase; Attempt merge onto 9be124af21...

Updating 9be124a..6110c47
Fast-forward
 swh/loader/mercurial/from_disk.py            | 132 ++++++++++++++++++---------
 swh/loader/mercurial/hgutil.py               |   4 +-
 swh/loader/mercurial/tests/test_from_disk.py |  79 ++++++++--------
 3 files changed, 134 insertions(+), 81 deletions(-)
Changes applied before test
commit 6110c47d14392e46c70b61561d2430e206192cce
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 14:53:24 2021 +0200

    Build snapshot on visits
    
    If an uneventful visit happens, then an empty snapshot should happen. When filtering
    revisions for speed, still the heads should appear in the snapshot.
    
    Related to T3571

commit 5fe2a9bf77d6667b55a912b58c0f3350390d2cda
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 10 17:00:15 2021 +0200

    Add support to filter out already seen revisions across hg origins
    
    This uses the extid mapping from the storage to actually achieve this.
    
    Related to T3563

commit 139061bcc38bd5652a266b5eb0645a1492e7a5aa
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 10:24:48 2021 +0200

    test: Explicit that 2 visits without change ends up with no snapshot

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/292/ for more details.

Rebase with the missing test added back and adapted.

This drops fixmes \o/

swh/loader/mercurial/from_disk.py
526

that's actually better in bytes to debug but i can revert it.

Build is green

Patch application report for D6268 (id=22697)

Could not rebase; Attempt merge onto 9be124af21...

Updating 9be124a..f7eb2d7
Fast-forward
 swh/loader/mercurial/from_disk.py            | 132 ++++++++++++++++++--------
 swh/loader/mercurial/hgutil.py               |   4 +-
 swh/loader/mercurial/tests/test_from_disk.py | 134 +++++++++++++++++++--------
 3 files changed, 189 insertions(+), 81 deletions(-)
Changes applied before test
commit f7eb2d710016e24d1a06f6794e26d0c9d5aa9fbe
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 14:53:24 2021 +0200

    Build snapshot on visits
    
    If an uneventful visit happens, then an empty snapshot should happen. When filtering
    revisions for speed, still the heads should appear in the snapshot.
    
    Related to T3571

commit 3231f2ed19962275d567f7bf39127a2d0621697d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 10 17:00:15 2021 +0200

    Add support to filter out already seen revisions across hg origins
    
    This uses the extid mapping from the storage to actually achieve this.
    
    Related to T3563

commit 139061bcc38bd5652a266b5eb0645a1492e7a5aa
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 10:24:48 2021 +0200

    test: Explicit that 2 visits without change ends up with no snapshot

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/295/ for more details.

olasd requested changes to this revision.Sep 15 2021, 3:12 PM
olasd added a subscriber: olasd.
olasd added inline comments.
swh/loader/mercurial/from_disk.py
271–283

targets are swhids/revision ids, not HgNodeIds.

481–482

This should happen regardless of whether there is a default branch alias or not.

526

Please revert it to keep it human-readable. One can use bytes.fromhex() to revert it.

swh/loader/mercurial/tests/test_from_disk.py
724–725

You can keep only the last line, but yeah.

This revision now requires changes to proceed.Sep 15 2021, 3:12 PM
swh/loader/mercurial/from_disk.py
481–482

oh that got indented, the f***?

swh/loader/mercurial/from_disk.py
526

Yes, ok (i knew but that's annoying to do ;).
Reverting.

swh/loader/mercurial/tests/test_from_disk.py
724–725

the last 2 lines, i also want to ensure the snapshot is truthy.

Fix/revert almost all that got said.

Except one, i'm still trying to understand the remark.

Build is green

Patch application report for D6268 (id=22712)

Could not rebase; Attempt merge onto 9be124af21...

Updating 9be124a..efe34a7
Fast-forward
 swh/loader/mercurial/from_disk.py            | 118 ++++++++++++++++--------
 swh/loader/mercurial/hgutil.py               |   4 +-
 swh/loader/mercurial/tests/test_from_disk.py | 131 +++++++++++++++++++--------
 3 files changed, 177 insertions(+), 76 deletions(-)
Changes applied before test
commit efe34a792251a6f48b30757979a245056d4f6441
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 14:53:24 2021 +0200

    Build snapshot on visits
    
    If an uneventful visit happens, then an empty snapshot should happen. When filtering
    revisions for speed, still the heads should appear in the snapshot.
    
    Related to T3571

commit 3231f2ed19962275d567f7bf39127a2d0621697d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 10 17:00:15 2021 +0200

    Add support to filter out already seen revisions across hg origins
    
    This uses the extid mapping from the storage to actually achieve this.
    
    Related to T3563

commit 139061bcc38bd5652a266b5eb0645a1492e7a5aa
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 10:24:48 2021 +0200

    test: Explicit that 2 visits without change ends up with no snapshot

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/296/ for more details.

swh/loader/mercurial/hgutil.py
22 ↗(On Diff #22712)

I dropped this because mypy was unhappy about it (with the use
of it within the from_disk implementation i mean).

We just want an alias on bytes here so i kept it simpler.

Fix the remaining suggestion (i finally got it)

Build is green

Patch application report for D6268 (id=22713)

Could not rebase; Attempt merge onto 9be124af21...

Updating 9be124a..0e4c3e4
Fast-forward
 swh/loader/mercurial/from_disk.py            | 118 ++++++++++++++++--------
 swh/loader/mercurial/hgutil.py               |   4 +-
 swh/loader/mercurial/tests/test_from_disk.py | 129 +++++++++++++++++++--------
 3 files changed, 176 insertions(+), 75 deletions(-)
Changes applied before test
commit 0e4c3e49b3cc4293821ee7aefdc3b7921675c34b
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 14:53:24 2021 +0200

    Build snapshot on visits
    
    If an uneventful visit happens, then an empty snapshot should happen. When filtering
    revisions for speed, still the heads should appear in the snapshot.
    
    Related to T3571

commit 3231f2ed19962275d567f7bf39127a2d0621697d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 10 17:00:15 2021 +0200

    Add support to filter out already seen revisions across hg origins
    
    This uses the extid mapping from the storage to actually achieve this.
    
    Related to T3563

commit 139061bcc38bd5652a266b5eb0645a1492e7a5aa
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 10:24:48 2021 +0200

    test: Explicit that 2 visits without change ends up with no snapshot

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/297/ for more details.

Build is green

Patch application report for D6268 (id=22716)

Could not rebase; Attempt merge onto 9be124af21...

Updating 9be124a..29bc541
Fast-forward
 swh/loader/mercurial/from_disk.py            | 118 +++++++++++++++++--------
 swh/loader/mercurial/hgutil.py               |   4 +-
 swh/loader/mercurial/tests/test_from_disk.py | 127 +++++++++++++++++++--------
 3 files changed, 174 insertions(+), 75 deletions(-)
Changes applied before test
commit 29bc541e692cb2032b6f2475251adc5ba50ac571
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 14:53:24 2021 +0200

    Build snapshot on visits
    
    If an uneventful visit happens, then an empty snapshot should happen. When filtering
    revisions for speed, still the heads should appear in the snapshot.
    
    Related to T3571

commit 1c6c7e5be82df5f2419e9727d6b5f0e74e9494d9
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 10 17:00:15 2021 +0200

    Add support to filter out already seen revisions across hg origins
    
    This uses the extid mapping from the storage to actually achieve this.
    
    Related to T3563

commit bf39050051ec5c7b4c6cc7b4801079ea591cd549
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 10:24:48 2021 +0200

    test: Explicit that 2 visits without change ends up with no snapshot

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/300/ for more details.

swh/loader/mercurial/from_disk.py
271–283

Right, i fixed the signature, thanks.

swh/loader/mercurial/hgutil.py
22 ↗(On Diff #22712)

Well the point is to make an alias of bytes that /cannot/ be mistaken for raw bytes.

If mypy is unhappy, this means that the typing is unsound somehow.

swh/loader/mercurial/hgutil.py
22 ↗(On Diff #22712)

(I assume that this is not a problem anymore with the update to the signature of _get_extids_for_targets)

swh/loader/mercurial/hgutil.py
22 ↗(On Diff #22712)

Well, i can also revert and let the bytes flow everywhere.
I was merely trying to make sense of the signature.

I was not attempting to untangle deep typing issues (i've no idea what that NewType is).
I'm already enough side-tracked as it is.z

22 ↗(On Diff #22712)

Nope it still is.
It's at the frontier with the StorageInterface that comes problems or something.

$ mypy swh
swh/loader/mercurial/from_disk.py:264: error: Generator has incompatible item type "bytes"; expected "HgNodeId"
swh/loader/mercurial/from_disk.py:267: error: Generator has incompatible item type "bytes"; expected "HgNodeId"
swh/loader/mercurial/from_disk.py:279: error: Invalid index type "bytes" for "Dict[HgNodeId, bytes]"; expected type "HgNodeId"
swh/loader/mercurial/from_disk.py:299: error: Argument 2 to "extid_get_from_extid" of "StorageInterface" has incompatible type "List[HgNodeId]"; expected "List[bytes]"
swh/loader/mercurial/from_disk.py:299: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance
swh/loader/mercurial/from_disk.py:299: note: Consider using "Sequence" instead, which is covariant
swh/loader/mercurial/from_disk.py:303: error: Invalid index type "bytes" for "Dict[HgNodeId, bytes]"; expected type "HgNodeId"
swh/loader/mercurial/from_disk.py:386: error: List comprehension has incompatible type List[bytes]; expected List[HgNodeId]
Found 6 errors in 1 file (checked 28 source files)
swh/loader/mercurial/hgutil.py
22 ↗(On Diff #22712)

> Well, i can also revert a...

That was supposedly destroyed ^ ¯\_(ツ)_/¯

swh/loader/mercurial/hgutil.py
22 ↗(On Diff #22712)

and of course, swh/loader/mercurial/from_disk.py:279: error: Cannot use isinstance() with NewType type
because why not.

swh/loader/mercurial/hgutil.py
22 ↗(On Diff #22712)

To avoid making nico repeat himself.

18:19 <+olasd> supposedly this avoids the mistake that happened in the blablabla_from_targets(targets: List[HgNodeId]) signature
18:20 <+olasd> where you passed a list of bytes that weren't really NodeIds
18:21 <+olasd> most of them can probably be fixed

An update for the diff is ongoing.

Build is green

Patch application report for D6268 (id=22723)

Could not rebase; Attempt merge onto bf39050051...

Updating bf39050..da1bd7c
Fast-forward
 swh/loader/mercurial/from_disk.py            | 132 +++++++++++++++++++--------
 swh/loader/mercurial/tests/test_from_disk.py | 124 ++++++++++++++++---------
 2 files changed, 175 insertions(+), 81 deletions(-)
Changes applied before test
commit da1bd7c460b238edcbfdf8c7e931a4cf6fd9ab58
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 14:53:24 2021 +0200

    Build snapshot on visits
    
    If an uneventful visit happens, then an empty snapshot should happen. When filtering
    revisions for speed, still the heads should appear in the snapshot.
    
    Related to T3571

commit 4c9eb38ec516de3f65c3996d937cd3328ded6261
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 10 17:00:15 2021 +0200

    Add support to filter out already seen revisions across hg origins
    
    This uses the extid mapping from the storage to actually achieve this.
    
    Related to T3563

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/302/ for more details.

This revision is now accepted and ready to land.Thu, Sep 16, 9:47 AM

Build is green

Patch application report for D6268 (id=22745)

Could not rebase; Attempt merge onto bf39050051...

Updating bf39050..a0f8e9e
Fast-forward
 swh/loader/mercurial/from_disk.py            | 129 +++++++++++++++++++--------
 swh/loader/mercurial/tests/test_from_disk.py | 124 ++++++++++++++++---------
 2 files changed, 173 insertions(+), 80 deletions(-)
Changes applied before test
commit a0f8e9e856a4d659e9e73d5973f4c87884b79f6f
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Sep 15 14:53:24 2021 +0200

    Build snapshot on visits
    
    If an uneventful visit happens, then an empty snapshot should happen. When filtering
    revisions for speed, still the heads should appear in the snapshot.
    
    Related to T3571

commit 765cb5aefa8bb370bc1343d990ad328188a4eba5
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 10 17:00:15 2021 +0200

    Add support to filter out already seen revisions across hg origins
    
    This uses the extid mapping from the storage to actually achieve this.
    
    Related to T3563

See https://jenkins.softwareheritage.org/job/DLDHG/job/tests-on-diff/305/ for more details.

This revision was automatically updated to reflect the committed changes.