
packagist: Canonicalize github origins
Closed · Public

Authored by vlorentz on Oct 13 2022, 1:06 PM.

Details

Summary

In particular, there seems to be a negligible number of origins
using SSH instead of HTTPS, which the git loader cannot deal with.

https://sentry.softwareheritage.org/share/issue/709325715f6d44cdbb7cc27a244a472a/
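For illustration only, here is a minimal sketch of the SSH-to-HTTPS part of the problem. It is purely syntactic and is not the code in this diff: the helper name `github_ssh_to_https` and its regular expressions are hypothetical. The diff appears to go through the GitHub API instead (the test data lives under `https_api.github.com`).

```python
import re
from typing import Optional

# Hypothetical helper, not part of swh-lister or swh-core.
# scp-like form: git@github.com:owner/repo(.git)
_SCP_STYLE = re.compile(r"^git@github\.com:(?P<path>[^/]+/[^/]+?)(?:\.git)?/?$")
# URL form: ssh://git@github.com/owner/repo(.git)
_SSH_STYLE = re.compile(r"^ssh://git@github\.com/(?P<path>[^/]+/[^/]+?)(?:\.git)?/?$")


def github_ssh_to_https(url: str) -> Optional[str]:
    """Return an HTTPS URL for a GitHub SSH URL, or None if it is not one."""
    for pattern in (_SCP_STYLE, _SSH_STYLE):
        match = pattern.match(url)
        if match:
            return f"https://github.com/{match.group('path')}"
    return None


print(github_ssh_to_https("git@github.com:owner/repo.git"))
print(github_ssh_to_https("https://github.com/owner/repo"))
```

A rewrite like this cannot follow renamed or transferred repositories, which is one reason to prefer an API lookup.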

Test Plan

Build fails because it depends on D8674

Diff Detail

Repository
rDLS Listers
Branch
packagist
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 32292
Build 50580: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 50579: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D8673 (id=31327)

Rebasing onto f5c5599f2e...

Current branch diff-target is up to date.
Changes applied before test
commit e52c0183076898fb0d8c6b3e174b365bd60196f2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/792/ for more details.

anlambert added a subscriber: anlambert.
anlambert added inline comments.
swh/lister/packagist/lister.py
66

I think you forgot the call to self.github_session.get_canonical_url.

This revision now requires changes to proceed. Oct 13 2022, 1:21 PM
douardda added inline comments.
swh/lister/packagist/lister.py
66

how can the test be green then?

swh/lister/packagist/lister.py
66

because the origin URL is already canonicalized in tests data.

that'll teach me to TDD

but also wtf, I don't see why the URL is already canon when I query the Packagist API, but the lister somehow used an SSH URL

  • actually implement it
  • use a non-canon URL in test
  • add a gitlab-based project in test data (instead of only github)
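To make the point about test data concrete, here is a small self-contained sketch of why a test fed only canonical URLs stays green even when the canonicalization call is missing. Every name in it (`canonicalize`, `list_origins_buggy`, `list_origins_fixed`, the lookup table) is hypothetical and stands in for the real lister and session code.

```python
from typing import Dict, Iterable, List

# Hypothetical stand-in for a canonicalizer; the real one is not shown here.
CANONICAL: Dict[str, str] = {
    "git@github.com:owner/repo.git": "https://github.com/owner/repo",
    "https://github.com/owner/repo": "https://github.com/owner/repo",
}


def canonicalize(url: str) -> str:
    """Look the URL up in the table, leaving unknown URLs untouched."""
    return CANONICAL.get(url, url)


def list_origins_buggy(urls: Iterable[str]) -> List[str]:
    """Forgets to canonicalize, like the first version of the diff."""
    return list(urls)


def list_origins_fixed(urls: Iterable[str]) -> List[str]:
    """Canonicalizes every origin URL before returning it."""
    return [canonicalize(url) for url in urls]


expected = ["https://github.com/owner/repo"]

# Canonical input: the buggy version passes, so the test proves nothing.
assert list_origins_buggy(["https://github.com/owner/repo"]) == expected

# Non-canonical input: only the fixed version passes.
assert list_origins_buggy(["git@github.com:owner/repo.git"]) != expected
assert list_origins_fixed(["git@github.com:owner/repo.git"]) == expected
```

Feeding a non-canonical URL into the test data is what makes the test able to fail.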

Test the case of deleted github repos

swh/lister/packagist/tests/data/https_api.github.com/repos_gitlky_wx_article
1

the original project was deleted, so I wrote this manually

Build has FAILED

Patch application report for D8673 (id=31334)

Rebasing onto f5c5599f2e...

Current branch diff-target is up to date.
Changes applied before test
commit c7454c6a205e3f048dae006b4137d2031296fbcd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/793/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/793/console

Build has FAILED

Patch application report for D8673 (id=31335)

Rebasing onto f5c5599f2e...

Current branch diff-target is up to date.
Changes applied before test
commit a3ba391ab0bd645cb0d61a48234d03847465d6a5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/794/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/794/console

This revision is now accepted and ready to land. Oct 13 2022, 4:34 PM

bump dependency on swh-core

Build has FAILED

Patch application report for D8673 (id=31344)

Rebasing onto 82b936a277...

First, rewinding head to replay your work on top of it...
Applying: packagist: Canonicalize github origins
Changes applied before test
commit a325693d9103d2d04f8f1cea55f210fe814e5ff6
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/795/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/795/console

Build is green

Patch application report for D8673 (id=31344)

Rebasing onto 82b936a277...

First, rewinding head to replay your work on top of it...
Applying: packagist: Canonicalize github origins
Changes applied before test
commit 17adddd988fd2bdbce7e8070ec2640fbef441779
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Oct 13 13:05:34 2022 +0200

    packagist: Canonicalize github origins
    
    In particular, there seems to be a negligeable number of origins
    using SSH instead of HTTPS, which the git loader cannot deal with.

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/796/ for more details.

This revision was landed with ongoing or failed builds. Oct 13 2022, 5:15 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D8673 (id=31345)

Rebasing onto a681f2f405...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-797-D8673.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/797/ for more details.