Page MenuHomeSoftware Heritage

Use bytes as snapshot id and get rid of dict manipulation in check_snapshot()
ClosedPublic

Authored by douardda on Jul 10 2020, 4:50 PM.

Details

Summary

This would break tests in other swh-loader-xx unless D3499 D3498 and D3496
are landed. <- this is possibly a lie...

In fact it would break loader git and mercurial until D3516 and D3517 are landed.

Event Timeline

Build is green

Patch application report for D3501 (id=12374)

Rebasing onto fcc7e61bf9...

Current branch diff-target is up to date.
Changes applied before test
commit e45fd137e573f3757da6a5bc54c6db15c918af76
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Jul 10 16:28:16 2020 +0200

    Use bytes as snapshot id and get rid of dict manipulation in check_snapshot()

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/145/ for more details.

This would break tests in other swh-loader-xx unless D3499 D3498 and D3496
are landed. <- this is possibly a lie...

yeah, i don't think that's the case.

fwiw, i was heading towards making the transition you started *after* having deployed
the current implementation of "check-snapshot" (without that diff so with the internal
manipulation) And then commit the current diff ;)

fwiw, i was heading towards making the transition you started *after* having deployed

the current implementation of "check-snapshot" (without that diff so with the internal
manipulation) And then commit the current diff ;)

Do you still want to land it first?

olasd added inline comments.
swh/loader/package/archive/tests/test_archive.py
86

I don't think this noqa is needed?

fwiw, i was heading towards making the transition you started *after* having deployed

the current implementation of "check-snapshot" (without that diff so with the internal
manipulation) And then commit the current diff ;)

Do you still want to land it first?

I'm very fine with landing a reasonably consistent and stable baseline any time you want.

swh/loader/package/archive/tests/test_archive.py
86

probably not, indeed

Build is green

Patch application report for D3501 (id=12393)

Rebasing onto 9f04d72516...

First, rewinding head to replay your work on top of it...
Applying: Use bytes as snapshot id and get rid of dict manipulation in check_snapshot()
Changes applied before test
commit 98b251120b4e3d7d2f89c83178f599c3f27773bb
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Jul 10 16:28:16 2020 +0200

    Use bytes as snapshot id and get rid of dict manipulation in check_snapshot()

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/155/ for more details.

Build is green

Patch application report for D3501 (id=12396)

Rebasing onto 9f04d72516...

Current branch diff-target is up to date.
Changes applied before test
commit d4a29630f9cdac90fcba9ea1c8c49981313ce0a7
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Jul 10 16:28:16 2020 +0200

    Use bytes as snapshot id and get rid of dict manipulation in check_snapshot()

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/156/ for more details.

I'm very fine with landing a reasonably consistent and stable baseline any time you want.

In that case, i think a plan matching those properties could be:

  • review/land first D3503 (so the check-snapshot implementation is complete, it checks up to the contents). It's not per say a blocker but that'd be more consistent.
  • Then release new loader-core.
  • Trigger back the builds of your mutiple diffs on the dvcs loaders (and land them when green, they should be ;)
  • Then getting back here, rebase on latest loader-core and land this one.
  • And then we (either you or me) should pick up those tests and replace the snapshots as dict with snapshot model objects (both core and again dvcs). <- that was my initial plan
  • Optionally (or not), then we can come back here again and drop the from_dict conversion from the input... (only model object!!! at least in tests)

Does that sound reasonable?

I'm very fine with landing a reasonably consistent and stable baseline any time you want.

In that case, i think a plan matching those properties could be:

  • review/land first D3503 (so the check-snapshot implementation is complete, it checks up to the contents). It's not per say a blocker but that'd be more consistent.
  • Then release new loader-core.
  • Trigger back the builds of your mutiple diffs on the dvcs loaders (and land them when green, they should be ;)
  • Then getting back here, rebase on latest loader-core and land this one.
  • And then we (either you or me) should pick up those tests and replace the snapshots as dict with snapshot model objects (both core and again dvcs). <- that was my initial plan
  • Optionally (or not), then we can come back here again and drop the from_dict conversion from the input... (only model object!!! at least in tests)

Does that sound reasonable?

yep

review/land first D3503 (so the check-snapshot implementation is complete, it checks up to the contents). It's not per say a blocker but that'd be more consistent.
Then release new loader-core.

That part is done btw.
(I need to check why the debian package fails but from tox's point of view, everything is fine ;)

(I need to check why the debian package fails but from tox's point of view, everything is fine ;)

It's because D3514 btw
(we are current implicitely pulling the optional swh.journal dependency in the loader-core tests only within the debian build somehow)

It's because D3514 btw
(we are current implicitely pulling the optional swh.journal dependency in the loader-core tests only within the debian build somehow)

scratch D3514 (loader-core), D3515 is more appropriate a fix (in storage) so it is not to be replicated everywhere

Otherwise, i think it's fine ;)
And according to the plan, the next step is to land it.

This revision is now accepted and ready to land.Jul 16 2020, 9:48 AM
douardda edited the summary of this revision. (Show Details)

Otherwise, i think it's fine ;)
And according to the plan, the next step is to land it.

Thanks but I need D3516 and D3517 landed and tagged to prevent any glitch in the matrix.