
Add method get_parent_origins()
Closed, Public

Authored by vlorentz on Apr 26 2022, 1:28 PM.

Details

Summary

This will allow the Git loader to incrementally load GitHub forks

Diff Detail

Repository
rDLDMD Extrinsic Metadata Loaders
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D7663 (id=27728)

Rebasing onto aa0af83a5e...

Current branch diff-target is up to date.
Changes applied before test
commit 427d6cdb5f72b9955023b81c6e060f7f2d4cd1c5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Apr 26 13:28:37 2022 +0200

    Add method get_parent_origin()
    
    This will allow the Git loader to incrementally load GitHub forks

See https://jenkins.softwareheritage.org/job/DLDMD/job/tests-on-diff/6/ for more details.

Should we be using the parent, or the source (which, afaik, is the root of all forks) repo? Or both?

Ideally we'd have a way to attempt all known parent repos in succession, from the closest to the farthest
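
To make the "closest to farthest" idea concrete, here is a minimal sketch. It assumes the caller already has an ordered list of candidate ancestor origins and some way to tell whether an origin is usable as a base for incremental loading. The names `first_loadable_origin` and `has_snapshot`, and the example URLs, are hypothetical and not part of this diff.

```python
from typing import Callable, Iterable, Optional


def first_loadable_origin(
    candidates: Iterable[str], has_snapshot: Callable[[str], bool]
) -> Optional[str]:
    """Return the first candidate origin URL accepted by has_snapshot.

    candidates must be ordered from the closest ancestor to the farthest;
    has_snapshot is a caller-supplied predicate (an assumption of this sketch).
    """
    for url in candidates:
        if has_snapshot(url):
            return url
    return None


# Hypothetical data: only the root repository has been archived so far.
archived = {"https://github.com/example/root"}
chain = [
    "https://github.com/example/parent",  # direct parent (closest)
    "https://github.com/example/root",  # source, root of the fork network
]
result = first_loadable_origin(chain, archived.__contains__)
```

In this example the direct parent is skipped because it is not in `archived`, so the lookup falls back to the root repository.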

swh/loader/metadata/github.py
79–81

Maybe we should build that from clone_url (stripping the .git ending if it's there) instead. I guess it's consistent with the way the lister does it?
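
For reference, the `clone_url` alternative suggested here could look roughly like the sketch below. The helper name `origin_url_from_clone_url` is hypothetical. Only a trailing `.git` is removed, so a `.git` appearing elsewhere in the URL is left alone.

```python
def origin_url_from_clone_url(clone_url: str) -> str:
    """Strip a trailing ".git" from a clone URL, if present."""
    suffix = ".git"
    if clone_url.endswith(suffix):
        return clone_url[: -len(suffix)]
    return clone_url
```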

vlorentz marked an inline comment as done.
  • use html_url like listers, instead of building the URL
  • support both parent and source
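
A rough sketch of what these two changes amount to is shown below. It is not the code from this diff. It assumes the input is the JSON object the GitHub REST API returns for a repository, which for a fork carries `parent` and `source` objects, each with an `html_url` field. The function name `parent_origin_urls` is hypothetical.

```python
from typing import Any, Dict, List


def parent_origin_urls(repo_metadata: Dict[str, Any]) -> List[str]:
    """List ancestor origin URLs of a fork, closest first.

    Reads html_url from the "parent" and then the "source" entry of the
    repository metadata, and skips duplicates (for a first-level fork,
    parent and source are the same repository).
    """
    urls: List[str] = []
    for key in ("parent", "source"):
        ancestor = repo_metadata.get(key)
        if not ancestor:
            continue  # not a fork, or the key is absent
        url = ancestor.get("html_url")
        if url and url not in urls:
            urls.append(url)
    return urls
```

A repository that is not a fork has neither key, so the result is an empty list and a caller can fall back to a full load.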

Build is green

Patch application report for D7663 (id=27788)

Rebasing onto aa0af83a5e...

Current branch diff-target is up to date.
Changes applied before test
commit 263515f3538550213159ce1706d30d4e4c015d7c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Apr 26 13:28:37 2022 +0200

    Add method get_parent_origin()
    
    This will allow the Git loader to incrementally load GitHub forks

See https://jenkins.softwareheritage.org/job/DLDMD/job/tests-on-diff/7/ for more details.

vlorentz retitled this revision from Add method get_parent_origin() to Add method get_parent_origins(). Apr 27 2022, 1:11 PM

Thanks. Your commit message needs an update :-)

This revision is now accepted and ready to land. Apr 27 2022, 1:30 PM
This revision was landed with ongoing or failed builds. Apr 27 2022, 1:31 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D7663 (id=27817)

Rebasing onto 01d34fa768...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-8-D7663.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DLDMD/job/tests-on-diff/8/ for more details.