
packagist: Canonicalize github origins
Closed, Public

Authored by vlorentz on Oct 13 2022, 1:06 PM.

Details

Summary

In particular, there seems to be a negligible number of origins
using SSH instead of HTTPS, which the git loader cannot deal with.

https://sentry.softwareheritage.org/share/issue/709325715f6d44cdbb7cc27a244a472a/
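For illustration only, the kind of rewrite the summary describes (an SSH-style GitHub origin turned into an HTTPS one) could be sketched as below. This is not the code in this diff: the regular expression, the function name `ssh_to_https`, and the example owner/repo names are all assumptions made up for this sketch, and it works purely on the URL string without querying GitHub.

```python
import re

# Hypothetical pattern (not taken from the diff): matches scp-like remotes
# ("git@github.com:owner/repo.git") and "ssh://git@github.com/owner/repo".
_GITHUB_SSH = re.compile(
    r"^(?:ssh://)?git@github\.com[:/]"
    r"(?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$"
)


def ssh_to_https(url: str) -> str:
    """Rewrite an SSH-style GitHub URL to HTTPS; return other URLs unchanged."""
    match = _GITHUB_SSH.match(url)
    if match is None:
        # Not an SSH GitHub remote (e.g. already HTTPS, or another forge).
        return url
    return f"https://github.com/{match['owner']}/{match['repo']}"
```

A string-only rewrite like this cannot detect renamed or deleted repositories, which is one reason a lister might prefer asking the GitHub API for the canonical URL instead.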

Test Plan

Build fails because it depends on D8674

Diff Detail

Repository
rDLS Listers
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D8673 (id=31327)

Rebasing onto f5c5599f2e...

Current branch diff-target is up to date.
Changes applied before test
commit e52c0183076898fb0d8c6b3e174b365bd60196f2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/792/ for more details.

anlambert added a subscriber: anlambert.
anlambert added inline comments.
swh/lister/packagist/lister.py
66

I think you forgot the call to self.github_session.get_canonical_url.

This revision now requires changes to proceed. Oct 13 2022, 1:21 PM
douardda added inline comments.
swh/lister/packagist/lister.py
66

how can the test be green then?

swh/lister/packagist/lister.py
66

because the origin URL is already canonicalized in tests data.

that'll teach me to TDD

but also wtf, I don't see why the URL is already canon when I query the Packagist API, but the lister somehow used an SSH URL

  • actually implement it
  • use a non-canon URL in test
  • add a gitlab-based project in test data (instead of only github)
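The exchange above (a green build although the canonicalization call was missing) can be reproduced in miniature. The sketch below is hypothetical and self-contained: `canonicalize`, `buggy_lister`, `fixed_lister`, and `fixture_detects_bug` are stand-in names invented for this example, not functions from swh-lister. It shows that a fixture whose URL is already canonical cannot tell the two listers apart, while a non-canonical fixture can.

```python
def canonicalize(url: str) -> str:
    """Stand-in canonicalizer (assumption): scp-like GitHub remote -> HTTPS."""
    prefix = "git@github.com:"
    if url.startswith(prefix):
        path = url[len(prefix):]
        if path.endswith(".git"):
            path = path[: -len(".git")]
        return f"https://github.com/{path}"
    # Anything else (HTTPS GitHub, GitLab, ...) is returned unchanged.
    return url


def buggy_lister(url: str) -> str:
    # The mistake being illustrated: the canonicalization call is forgotten.
    return url


def fixed_lister(url: str) -> str:
    return canonicalize(url)


def fixture_detects_bug(fixture_url: str, expected: str) -> bool:
    """True when this fixture fails on the buggy lister and passes on the fixed one."""
    return buggy_lister(fixture_url) != expected and fixed_lister(fixture_url) == expected
```

With an already-canonical fixture such as `https://github.com/owner/repo`, `fixture_detects_bug` returns `False`, so the test passes either way; with `git@github.com:owner/repo.git` it returns `True`. A GitLab URL in the test data additionally checks that non-GitHub origins pass through untouched.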

Test the case of deleted github repos

swh/lister/packagist/tests/data/https_api.github.com/repos_gitlky_wx_article
1

the original project was deleted, so I wrote this manually

Build has FAILED

Patch application report for D8673 (id=31334)

Rebasing onto f5c5599f2e...

Current branch diff-target is up to date.
Changes applied before test
commit c7454c6a205e3f048dae006b4137d2031296fbcd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/793/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/793/console

Build has FAILED

Patch application report for D8673 (id=31335)

Rebasing onto f5c5599f2e...

Current branch diff-target is up to date.
Changes applied before test
commit a3ba391ab0bd645cb0d61a48234d03847465d6a5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/794/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/794/console

This revision is now accepted and ready to land. Oct 13 2022, 4:34 PM

bump dependency on swh-core

Build has FAILED

Patch application report for D8673 (id=31344)

Rebasing onto 82b936a277...

First, rewinding head to replay your work on top of it...
Applying: packagist: Canonicalize github origins
Changes applied before test
commit a325693d9103d2d04f8f1cea55f210fe814e5ff6
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/795/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/795/console

Build is green

Patch application report for D8673 (id=31344)

Rebasing onto 82b936a277...

First, rewinding head to replay your work on top of it...
Applying: packagist: Canonicalize github origins
Changes applied before test
commit 17adddd988fd2bdbce7e8070ec2640fbef441779
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/796/ for more details.

This revision was landed with ongoing or failed builds. Oct 13 2022, 5:15 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D8673 (id=31345)

Rebasing onto a681f2f405...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-797-D8673.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/797/ for more details.