
migrate_extrinsic_metadata: improve pypi_project_from_filename to support suffixes after the version number.
Closed, Public

Authored by vlorentz on Wed, Sep 16, 10:45 AM.

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

vlorentz created this revision. Wed, Sep 16, 10:45 AM

Build is green

Patch application report for D3960 (id=13939)

Could not rebase; Attempt merge onto 3b781a8a52...

Merge made by the 'recursive' strategy.
 swh/storage/migrate_extrinsic_metadata.py          | 123 +++++--
 .../migrate_extrinsic_metadata/test_debian.py      | 301 ++++++++++++++++-
 .../tests/migrate_extrinsic_metadata/test_pypi.py  | 365 ++++++++++++++-------
 3 files changed, 640 insertions(+), 149 deletions(-)
Changes applied before test
commit 0ab6adec36ad9178a9fb6506644e068d4c3fcec6
Merge: 3b781a8a a69df4c4
Author: Jenkins user <jenkins@localhost>
Date:   Wed Sep 16 08:53:00 2020 +0000

    Merge branch 'diff-target' into HEAD

commit a69df4c400c86bf98942b1610ede7b184cf15322
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 10:45:29 2020 +0200

    migrate_extrinsic_metadata: improve pypi_project_from_filename to support suffixes after the version number.

commit a69bb3b76a6ef3225d7b74cc6d1c23112b9fee70
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 10:42:19 2020 +0200

    migrate_extrinsic_metadata: guess PyPI origins.
    
    This works by guessing the package name from the original_artifact data,
    then building an origin that would match the package name, then
    checking if the revision can be reached from it.

commit 8e8c7ee79a832e59a20ff894299b58024addd967
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 09:50:08 2020 +0200

    migrate_extrinsic_metadata.test_pypi: use the in-memory storage instead of mocks
    
    in a future commit, migrating pypi revisions will become more interactive with
    the storage, so it's easier to have a real one instead of a mock.

commit f6943400ff48ba840ab604747252ad4197fab5d9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Sep 14 10:51:48 2020 +0200

    migrate_extrinsic_metadata.test_debian: use the in-memory storage instead of mocks
    
    in tests that need to read in the storage.
    
    Using mocks just makes it more complicated, and we decided not to do that
    a while ago.

commit 7a0467972fcd0c05ff16d806b18261aba8624288
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Sep 11 14:16:21 2020 +0200

    migrate_extrinsic_metadata: fix crash on dangling branch.

commit 7969d368966c43ebfd51b2901c827217b0712dd5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Sep 11 13:59:55 2020 +0200

    migrate_extrinsic_metadata: fix crash when a Debian revision is missing.
    
    https://forge.softwareheritage.org/T997

commit 265fc387f7b3d5f1a55d136b74fa2ee9b9f11f58
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 10 14:23:12 2020 +0200

    migrate_extrinsic_metadata: guess Debian origins.
    
    This works by guessing the package name from the original_artifact data,
    then building origins that would match the package name, then filtering
    out origins by checking if the revision can be reached from them.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/935/ for more details.
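
The origin-guessing approach described in the commit messages above (guess the package name, build candidate origins, keep only those from which the revision is reachable) could be sketched roughly as follows. This is not the code from the diff: the function names, the `deb://` URL scheme, the distribution list, and the toy reachability table are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set


def debian_candidate_origins(package_name: str) -> List[str]:
    """Build origin URLs that could plausibly host the package.

    The URL scheme and distribution names are assumptions for this sketch.
    """
    distributions = ["Debian", "Debian-Security"]
    return [f"deb://{dist}/packages/{package_name}" for dist in distributions]


def guess_origins(
    package_name: str,
    revision_id: str,
    is_reachable: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the candidate origins from which the revision is reachable."""
    return [
        origin
        for origin in debian_candidate_origins(package_name)
        if is_reachable(origin, revision_id)
    ]


# Toy stand-in for a storage lookup: origin URL -> revisions reachable from it.
REACHABLE: Dict[str, Set[str]] = {
    "deb://Debian/packages/kalgebra": {"rev-1", "rev-2"},
}


def toy_is_reachable(origin: str, revision_id: str) -> bool:
    return revision_id in REACHABLE.get(origin, set())


print(guess_origins("kalgebra", "rev-1", toy_is_reachable))
```

Passing the reachability check in as a callable keeps the filtering logic independent of the storage backend, which fits the move to the in-memory storage in the tests.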

ardumont accepted this revision. Wed, Sep 16, 12:20 PM
This revision is now accepted and ready to land. Wed, Sep 16, 12:20 PM
vlorentz updated this revision to Diff 13949. Wed, Sep 16, 2:13 PM

allow digits in the suffix.

Build is green

Patch application report for D3960 (id=13949)

Rebasing onto 3b781a8a52...

First, rewinding head to replay your work on top of it...
Applying: migrate_extrinsic_metadata: guess Debian origins.
Applying: migrate_extrinsic_metadata: fix crash when a Debian revision is missing.
Applying: migrate_extrinsic_metadata: fix crash on dangling branch.
Applying: migrate_extrinsic_metadata.test_debian: use the in-memory storage instead of mocks
Applying: migrate_extrinsic_metadata.test_pypi: use the in-memory storage instead of mocks
Applying: migrate_extrinsic_metadata: guess PyPI origins.
Applying: migrate_extrinsic_metadata: improve pypi_project_from_filename to support suffixes after the version number.
Changes applied before test
commit 8b9ddbab1f96589daf25ff177d224d2fcaa09e09
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 10:45:29 2020 +0200

    migrate_extrinsic_metadata: improve pypi_project_from_filename to support suffixes after the version number.

commit b7445b5c353190722c72e4d7383486decc17e76f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 10:42:19 2020 +0200

    migrate_extrinsic_metadata: guess PyPI origins.
    
    This works by guessing the package name from the original_artifact data,
    then building an origin that would match the package name, then
    checking if the revision can be reached from it.

commit 39fdf79eaae60e11b41aea5f9e84ab73dd3800d2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 09:50:08 2020 +0200

    migrate_extrinsic_metadata.test_pypi: use the in-memory storage instead of mocks
    
    in a future commit, migrating pypi revisions will become more interactive with
    the storage, so it's easier to have a real one instead of a mock.

commit 7388693db56f41c0610bd18d88e55546d1638a03
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Sep 14 10:51:48 2020 +0200

    migrate_extrinsic_metadata.test_debian: use the in-memory storage instead of mocks
    
    in tests that need to read in the storage.
    
    Using mocks just makes it more complicated, and we decided not to do that
    a while ago.

commit f885fba4ccee076c5b5c4131ac51c1a238c9fc60
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Sep 11 14:16:21 2020 +0200

    migrate_extrinsic_metadata: fix crash on dangling branch.

commit 4f63f8d1ae0cccab821a1b95ead425cf63901a02
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Sep 11 13:59:55 2020 +0200

    migrate_extrinsic_metadata: fix crash when a Debian revision is missing.
    
    https://forge.softwareheritage.org/T997

commit 3ef0039e6e2e125189f8af8380c690c01fab346a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 10 14:23:12 2020 +0200

    migrate_extrinsic_metadata: guess Debian origins.
    
    This works by guessing the package name from the original_artifact data,
    then building origins that would match the package name, then filtering
    out origins by checking if the revision can be reached from them.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/936/ for more details.

vlorentz updated this revision to Diff 13951. Wed, Sep 16, 2:19 PM

also allow 'dev' suffix without a dash.
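
For readers without the diff at hand, here is a minimal sketch of what "suffixes after the version number" might look like in a filename parser. It is not the actual `pypi_project_from_filename` implementation: the function name `guess_project_name`, the regular expression, and the list of archive extensions are assumptions chosen to illustrate the three cases discussed in this revision (a dashed suffix, digits in the suffix, and a suffix such as `dev` with no dash).

```python
import re

# Archive extensions handled by this sketch (an assumption, not an exhaustive list).
_ARCHIVE_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tgz", ".zip")

# "<project>-<version>", where the version is dotted digits optionally followed
# by an alphabetic suffix that may start with "-" or "." (or nothing) and may
# end with digits, e.g. "1.0-beta", "1.0rc1", "1.0dev", "1.0-dev3".
_STEM_RE = re.compile(
    r"^(?P<project>.+)-(?P<version>\d+(\.\d+)*([-.]?[a-zA-Z]+\d*)?)$"
)


def guess_project_name(filename: str) -> str:
    """Return the project part of a source archive filename."""
    for ext in _ARCHIVE_EXTENSIONS:
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        raise ValueError(f"unsupported archive extension: {filename!r}")

    match = _STEM_RE.match(stem)
    if match is None:
        raise ValueError(f"cannot split project and version: {filename!r}")
    return match.group("project")


for name in (
    "foo-1.2.3.tar.gz",
    "foo-bar-1.0-beta.zip",
    "foo-1.0rc1.tar.gz",
    "foo-1.0dev.tar.gz",
    "foo-1.0-dev3.tar.gz",
):
    print(name, "->", guess_project_name(name))
```

Because the project group is greedy, the split happens at the last dash that is followed by something shaped like a version, so project names that themselves contain dashes are kept whole.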

Build is green

Patch application report for D3960 (id=13950)

Could not rebase; Attempt merge onto 3b781a8a52...

Merge made by the 'recursive' strategy.
 swh/storage/migrate_extrinsic_metadata.py          | 123 +++++--
 .../migrate_extrinsic_metadata/test_debian.py      | 301 ++++++++++++++++-
 .../tests/migrate_extrinsic_metadata/test_pypi.py  | 366 ++++++++++++++-------
 3 files changed, 641 insertions(+), 149 deletions(-)
Changes applied before test
commit 50846927b2d36b41dd8f796110a6e5e1c38a05fa
Merge: 3b781a8a 963c991b
Author: Jenkins user <jenkins@localhost>
Date:   Wed Sep 16 12:17:55 2020 +0000

    Merge branch 'diff-target' into HEAD

commit 963c991bb1a3f9e13fd41eb8a4ddefa7ca282665
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 10:45:29 2020 +0200

    migrate_extrinsic_metadata: improve pypi_project_from_filename to support suffixes after the version number.

commit a69bb3b76a6ef3225d7b74cc6d1c23112b9fee70
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 10:42:19 2020 +0200

    migrate_extrinsic_metadata: guess PyPI origins.
    
    This works by guessing the package name from the original_artifact data,
    then building an origin that would match the package name, then
    checking if the revision can be reached from it.

commit 8e8c7ee79a832e59a20ff894299b58024addd967
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 09:50:08 2020 +0200

    migrate_extrinsic_metadata.test_pypi: use the in-memory storage instead of mocks
    
    in a future commit, migrating pypi revisions will become more interactive with
    the storage, so it's easier to have a real one instead of a mock.

commit f6943400ff48ba840ab604747252ad4197fab5d9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Sep 14 10:51:48 2020 +0200

    migrate_extrinsic_metadata.test_debian: use the in-memory storage instead of mocks
    
    in tests that need to read in the storage.
    
    Using mocks just makes it more complicated, and we decided not to do that
    a while ago.

commit 7a0467972fcd0c05ff16d806b18261aba8624288
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Sep 11 14:16:21 2020 +0200

    migrate_extrinsic_metadata: fix crash on dangling branch.

commit 7969d368966c43ebfd51b2901c827217b0712dd5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Sep 11 13:59:55 2020 +0200

    migrate_extrinsic_metadata: fix crash when a Debian revision is missing.
    
    https://forge.softwareheritage.org/T997

commit 265fc387f7b3d5f1a55d136b74fa2ee9b9f11f58
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 10 14:23:12 2020 +0200

    migrate_extrinsic_metadata: guess Debian origins.
    
    This works by guessing the package name from the original_artifact data,
    then building origins that would match the package name, then filtering
    out origins by checking if the revision can be reached from them.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/937/ for more details.

Build is green

Patch application report for D3960 (id=13951)

Could not rebase; Attempt merge onto 3b781a8a52...

Merge made by the 'recursive' strategy.
 swh/storage/migrate_extrinsic_metadata.py          | 123 +++++--
 .../migrate_extrinsic_metadata/test_debian.py      | 301 ++++++++++++++++-
 .../tests/migrate_extrinsic_metadata/test_pypi.py  | 367 ++++++++++++++-------
 3 files changed, 642 insertions(+), 149 deletions(-)
Changes applied before test
commit d7ecee8ef5e495400f460e2b168701d462ed8c13
Merge: 3b781a8a bda6b32c
Author: Jenkins user <jenkins@localhost>
Date:   Wed Sep 16 12:22:31 2020 +0000

    Merge branch 'diff-target' into HEAD

commit bda6b32c73c84f0823c8ea6cadcc5c53d29919a7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 10:45:29 2020 +0200

    migrate_extrinsic_metadata: improve pypi_project_from_filename to support suffixes after the version number.

commit a69bb3b76a6ef3225d7b74cc6d1c23112b9fee70
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 10:42:19 2020 +0200

    migrate_extrinsic_metadata: guess PyPI origins.
    
    This works by guessing the package name from the original_artifact data,
    then building an origin that would match the package name, then
    checking if the revision can be reached from it.

commit 8e8c7ee79a832e59a20ff894299b58024addd967
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 16 09:50:08 2020 +0200

    migrate_extrinsic_metadata.test_pypi: use the in-memory storage instead of mocks
    
    in a future commit, migrating pypi revisions will become more interactive with
    the storage, so it's easier to have a real one instead of a mock.

commit f6943400ff48ba840ab604747252ad4197fab5d9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Sep 14 10:51:48 2020 +0200

    migrate_extrinsic_metadata.test_debian: use the in-memory storage instead of mocks
    
    in tests that need to read in the storage.
    
    Using mocks just makes it more complicated, and we decided not to do that
    a while ago.

commit 7a0467972fcd0c05ff16d806b18261aba8624288
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Sep 11 14:16:21 2020 +0200

    migrate_extrinsic_metadata: fix crash on dangling branch.

commit 7969d368966c43ebfd51b2901c827217b0712dd5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Sep 11 13:59:55 2020 +0200

    migrate_extrinsic_metadata: fix crash when a Debian revision is missing.
    
    https://forge.softwareheritage.org/T997

commit 265fc387f7b3d5f1a55d136b74fa2ee9b9f11f58
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 10 14:23:12 2020 +0200

    migrate_extrinsic_metadata: guess Debian origins.
    
    This works by guessing the package name from the original_artifact data,
    then building origins that would match the package name, then filtering
    out origins by checking if the revision can be reached from them.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/938/ for more details.

vlorentz requested review of this revision. Wed, Sep 16, 3:58 PM
ardumont accepted this revision. Wed, Sep 16, 4:48 PM
This revision is now accepted and ready to land. Wed, Sep 16, 4:48 PM
This revision was landed with ongoing or failed builds. Wed, Sep 16, 5:02 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D3960 (id=13975)

Rebasing onto f008a597fd...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-945-D3960.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/945/ for more details.