Page MenuHomeSoftware Heritage

Store the result of MetadataFetcher.get_parent_origins
ClosedPublic

Authored by vlorentz on Apr 26 2022, 5:42 PM.

Details

Summary

So it is available to concrete loaders.

The Git loader will use it to incrementally load new origins by fetching
the last snapshot of a parent, instead of starting from scratch.

Diff Detail

Event Timeline

Build is green

Patch application report for D7691 (id=27790)

Rebasing onto 1089330c3b...

Current branch diff-target is up to date.
Changes applied before test
commit 199dcb8f0683c6547b0e7bbb42e755a374fe6f7f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Apr 26 17:41:09 2022 +0200

    Store the result of MetadataFetcher.get_parent_origins
    
    So it is available to concrete loaders.
    
    The Git loader will use it to incrementally load new origins by fetching
    the last snapshot of a parent, instead of starting from scratch.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/744/ for more details.

olasd added inline comments.
swh/loader/core/metadata_fetchers.py
35

Should probably be Optional[List[Origin]]...

swh/loader/core/tests/test_loader.py
27

Please keep these consistent (I guess by always using the kwargs form)?

105–106

...because this one doesn't match the MetadataFetcherProtocol. Maybe it would be worth adding a test that all known/registered MetadataFetchers do match the protocol?

LGTM (save from a couple of nitpicks), thanks

This revision is now accepted and ready to land.Apr 27 2022, 1:35 PM
swh/loader/core/tests/test_loader.py
105–106

I'll just updated this one instead.

And it's tested in swh-metadata-loaders. Or do you mean runtime checks?

vlorentz marked 2 inline comments as done.

fix inconsistencies in tests

Build has FAILED

Patch application report for D7691 (id=27819)

Rebasing onto 1089330c3b...

Current branch diff-target is up to date.
Changes applied before test
commit e05ea3a4d2c395564b61674ca52d2da79c090295
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Apr 26 17:41:09 2022 +0200

    Store the result of MetadataFetcher.get_parent_origins
    
    So it is available to concrete loaders.
    
    The Git loader will use it to incrementally load new origins by fetching
    the last snapshot of a parent, instead of starting from scratch.

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/745/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/745/console

swh/loader/core/tests/test_loader.py
105–106

No, I meant consistency checks at testing time. It's probably worth adding an ad-hoc test here for these two mock classes?

swh/loader/core/tests/test_loader.py
105–106

I can, but it wouldn't have caught this issue; isinstance doesn't check type signatures. https://docs.python.org/3/library/typing.html#typing.runtime_checkable

Build has FAILED

Patch application report for D7691 (id=27821)

Rebasing onto 1089330c3b...

Current branch diff-target is up to date.
Changes applied before test
commit 97999bf67068743279d1c9927556e8cb7383a3ba
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Apr 26 17:41:09 2022 +0200

    Store the result of MetadataFetcher.get_parent_origins
    
    So it is available to concrete loaders.
    
    The Git loader will use it to incrementally load new origins by fetching
    the last snapshot of a parent, instead of starting from scratch.

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/746/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/746/console

fix test again + add isinstance check in tests

swh/loader/core/tests/test_loader.py
105–106

done

swh/loader/core/tests/test_loader.py
105–106

I mean, we do have such a set of explicit tests for our StorageInterface, ... etc. implementations.

I agree that this may be overkill for such a small interface, though. :-)

Build is green

Patch application report for D7691 (id=27822)

Rebasing onto 1089330c3b...

Current branch diff-target is up to date.
Changes applied before test
commit 07fb382655a043f3e144fb35784d79635c564af9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Apr 26 17:41:09 2022 +0200

    Store the result of MetadataFetcher.get_parent_origins
    
    So it is available to concrete loaders.
    
    The Git loader will use it to incrementally load new origins by fetching
    the last snapshot of a parent, instead of starting from scratch.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/747/ for more details.

swh/loader/core/tests/test_loader.py
105–106

oh, you're right. I'll consider it in swh-loader-metadata; but I don't think it's worth it to check test classes this deeply