
Adapt maven lister to list canonical gh urls if any
Closed, Public

Authored by ardumont on May 20 2022, 4:38 PM.

Details

Summary

That is, detected GitHub URLs of the form {https,git,http}://github.com/${user_repo}(.git) are
canonicalized to the https://github.com/${user_repo} format.

This avoids duplication of origins.
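
As an illustration of the canonicalization described above, here is a minimal sketch that works on the URL string alone. The function name `canonicalize_github_url` and the regular expression are assumptions made for this example; they are not taken from the diff, whose actual implementation may differ (the later "mock github api call" note suggests it queries the GitHub API).

```python
import re
from typing import Optional

# Hypothetical pattern (not from the diff): accepts the https, git and http
# schemes on github.com, a "<user>/<repo>" path, and an optional ".git" suffix.
_GITHUB_URL_RE = re.compile(
    r"^(?:https|git|http)://github\.com/"
    r"(?P<user_repo>[^/\s]+/[^/\s]+?)(?:\.git)?/?$"
)


def canonicalize_github_url(url: str) -> Optional[str]:
    """Return https://github.com/<user>/<repo>, or None if url does not match."""
    match = _GITHUB_URL_RE.match(url.strip())
    if match is None:
        return None
    return f"https://github.com/{match.group('user_repo')}"


# The three spellings below name the same repository; after canonicalization
# they collapse into a single origin URL.
variants = [
    "git://github.com/user/repo.git",
    "http://github.com/user/repo",
    "https://github.com/user/repo.git",
]
origins = {canonicalize_github_url(u) for u in variants}
```

With this sketch, `origins` holds one entry, which is the deduplication effect the summary refers to.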

Related to T4232
Depends on D7880

Diff Detail

Repository
rDLS Listers
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 29529
Build 46145: Phabricator diff pipeline on jenkins
Build 46144: arc lint + arc unit

Unit Tests: Failed

Time  Test
361 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.lister.maven.tests.test_lister::test_maven_full_listing
swh_scheduler = <swh.scheduler.backend.SchedulerBackend object at 0x7f0648358a58> def test_maven_full_listing(swh_scheduler):
368 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.lister.maven.tests.test_lister::test_maven_full_listing_malformed
swh_scheduler = <swh.scheduler.backend.SchedulerBackend object at 0x7f06480d0048> requests_mock = <requests_mock.mocker.Mocker object at 0x7f06480d0da0> maven_pom_1_malformed = b'<?xml version="1.0" encoding="UTF-8"?>\n<project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven....actId>\n <version>3.10.0</version>\n <scope>test</scope>\n </dependency>\n </dependencies>\n</project>\n'
460 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.lister.maven.tests.test_lister::test_maven_incremental_listing
swh_scheduler = <swh.scheduler.backend.SchedulerBackend object at 0x7f06483b29b0> requests_mock = <requests_mock.mocker.Mocker object at 0x7f064816fba8> maven_index_full = b'doc 0\n field 0\n name u\n type string\n value al.aldi|sprova4j|0.1.0|sources|jar\n field 1\n name m\n...otGroups\n field 19\n name rootGroupsList\n type string\n value com|al\nEND\nchecksum 00000000004102281591\n'
377 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.lister.bitbucket.tests.test_lister::test_bitbucket_full_lister
1,794 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.lister.bitbucket.tests.test_lister::test_bitbucket_incremental_lister
View Full Test Results (3 Failed · 206 Passed)

Event Timeline

Build has FAILED

Patch application report for D7879 (id=28432)

Rebasing onto 2ffe9c2aea...

Current branch diff-target is up to date.
Changes applied before test
commit f9bf6aefb1a4ef407d6f64bb4cbea778a9667edc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 20 16:37:15 2022 +0200

    Adapt maven lister to list canonical gh urls if any
    
    Related to T4232

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/524/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/524/console

Harbormaster returned this revision to the author for changes because remote builds failed. May 20 2022, 4:42 PM
Harbormaster failed remote builds in B29523: Diff 28432!

Add missing pytest configuration (mock github api call)

Build has FAILED

Patch application report for D7879 (id=28433)

Rebasing onto 2ffe9c2aea...

Current branch diff-target is up to date.
Changes applied before test
commit c7d4faae7c81ef2fc1230738ff42cb56a771e0d2
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 20 16:37:15 2022 +0200

    Adapt maven lister to list canonical gh urls if any
    
    Related to T4232

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/525/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/525/console

Harbormaster returned this revision to the author for changes because remote builds failed. May 20 2022, 4:56 PM
Harbormaster failed remote builds in B29524: Diff 28433!

Fix requests_mock override definition

Build has FAILED

Patch application report for D7879 (id=28434)

Rebasing onto 2ffe9c2aea...

Current branch diff-target is up to date.
Changes applied before test
commit c2c4d7d440ef512f3424748a33ef8b001b6311b7
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 20 16:37:15 2022 +0200

    Adapt maven lister to list canonical gh urls if any
    
    Related to T4232

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/526/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/526/console

Harbormaster returned this revision to the author for changes because remote builds failed. May 20 2022, 5:02 PM
Harbormaster failed remote builds in B29525: Diff 28434!

Build has FAILED

Patch application report for D7879 (id=28438)

Rebasing onto 2ffe9c2aea...

Current branch diff-target is up to date.
Changes applied before test
commit 6fa40a1d4c32746f30d2180c67901b888ced9187
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 20 16:37:15 2022 +0200

    Adapt maven lister to list canonical gh urls if any
    
    Related to T4232

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/527/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/527/console

Harbormaster returned this revision to the author for changes because remote builds failed. May 20 2022, 5:52 PM
Harbormaster failed remote builds in B29529: Diff 28438!
ardumont edited the test plan for this revision.
ardumont retitled this revision from wip: Adapt maven lister to list canonical gh urls if any to Adapt maven lister to list canonical gh urls if any.

Bump requirements, tests should pass now.

What I said, but using the right repository this time ¯\_(ツ)_/¯

Build is green

Patch application report for D7879 (id=28441)

Rebasing onto 817c9e099c...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-427-D7879.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/427/ for more details.

Build is green

Patch application report for D7879 (id=28442)

Rebasing onto 2ffe9c2aea...

Current branch diff-target is up to date.
Changes applied before test
commit 7f8c778fd7f315641fac8c4507a6fc9eb8d8de9d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 20 16:37:15 2022 +0200

    Adapt maven lister to list canonical gh urls if any
    
    Related to T4232

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/528/ for more details.

  • Improve commit message
  • Reword docstrings
  • Add comments

Build is green

Patch application report for D7879 (id=28445)

Rebasing onto 2ffe9c2aea...

Current branch diff-target is up to date.
Changes applied before test
commit 0a997670d4c01ba11f5c7cfb27a13bc1be1ded58
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 20 16:37:15 2022 +0200

    Adapt maven lister to list canonical gh urls if any
    
    That means detected github urls {https,git,http}://github.com/${user_repo}(.git) are
    canonicalized to https://github.com/${user_repo} format.
    
    This avoids duplication of origins.
    
    Related to T4232

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/529/ for more details.

vlorentz added inline comments.
swh/lister/maven/lister.py
56

could you move it to the toplevel? it doesn't need to be an attribute

302–321

It is simpler to return early than to re-check the same conditions every time, IMO

swh/lister/maven/lister.py
56

sure

302–321

Yeah, I was missing the else return None you put early on, thanks.

Build has FAILED

Patch application report for D7879 (id=28447)

Rebasing onto 2ffe9c2aea...

Current branch diff-target is up to date.
Changes applied before test
commit 114074a0ead6e38cef7f6b0dba37cd7f65161ef3
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 20 16:37:15 2022 +0200

    Adapt maven lister to list canonical gh urls if any
    
    That means detected github urls {https,git,http}://github.com/${user_repo}(.git) are
    canonicalized to https://github.com/${user_repo} format.
    
    This avoids duplication of origins.
    
    Related to T4232

Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/530/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/530/console

swh/lister/maven/lister.py
313–318

another one

swh/lister/maven/lister.py
313–318

No, here the check must come after the get_canonical call, because that call could return None.
That is why I put it there.
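
To make the two review points concrete (return early instead of re-checking conditions, and test for None only after the canonicalization call), here is a small hypothetical sketch. The function `build_origin`, its parameters, the returned dictionary, and the choice to skip non-GitHub URLs are all assumptions for illustration; the actual code at lines 302-321 of swh/lister/maven/lister.py is not shown in this thread.

```python
from typing import Callable, Dict, Optional

GITHUB_PREFIXES = (
    "https://github.com/",
    "http://github.com/",
    "git://github.com/",
)


def build_origin(
    scm_url: Optional[str],
    get_canonical: Callable[[str], Optional[str]],
) -> Optional[Dict[str, str]]:
    """Hypothetical helper: return an origin description, or None to skip it."""
    # Early returns: once a precondition fails, nothing below re-checks it.
    if not scm_url:
        return None
    if not scm_url.startswith(GITHUB_PREFIXES):
        # Assumption of this sketch only: non-GitHub URLs are skipped.
        return None

    canonical = get_canonical(scm_url)
    # This check cannot be hoisted above the call: the lookup itself may
    # return None, so it has to run after canonicalization.
    if canonical is None:
        return None

    return {"url": canonical, "visit_type": "git"}
```

The `get_canonical` callable is passed in as a parameter here only to keep the sketch self-contained and easy to test with a stub.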

vlorentz added inline comments.
swh/lister/maven/lister.py
313–318

ack

This revision is now accepted and ready to land. May 23 2022, 1:37 PM

So now let's fix the unit tests.
I've managed to break the tests with the recent fixes.

Build is green

Patch application report for D7879 (id=28454)

Rebasing onto 2ffe9c2aea...

Current branch diff-target is up to date.
Changes applied before test
commit 263db667d09c4090dbc9d7be1df5aae0721c612b
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri May 20 16:37:15 2022 +0200

    Adapt maven lister to list canonical gh urls if any
    
    That means detected github urls {https,git,http}://github.com/${user_repo}(.git) are
    canonicalized to https://github.com/${user_repo} format.
    
    This avoids duplication of origins.
    
    Related to T4232

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/531/ for more details.