Page MenuHomeSoftware Heritage

Make the base PackageLoader write extrinsic revision metadata.
ClosedPublic

Authored by vlorentz on Jul 24 2020, 2:04 PM.

Details

Summary

For now, no loader outside the tests use it; but DepositLoader will soon.

Depends on D3612.

Event Timeline

Build is green

Patch application report for D3613 (id=12715)

Could not rebase; Attempt merge onto 14f700d0fa...

Updating 14f700d..24f8137
Fast-forward
 swh/loader/package/archive/loader.py             |  10 +-
 swh/loader/package/archive/tests/test_archive.py |   4 +-
 swh/loader/package/cran/loader.py                |   6 +-
 swh/loader/package/debian/loader.py              |   6 +-
 swh/loader/package/deposit/loader.py             |   6 +-
 swh/loader/package/loader.py                     | 128 +++++++++++++-
 swh/loader/package/nixguix/loader.py             |   6 +-
 swh/loader/package/npm/loader.py                 |  10 +-
 swh/loader/package/pypi/loader.py                |   6 +-
 swh/loader/package/tests/test_loader.py          |   1 -
 swh/loader/package/tests/test_loader_metadata.py | 202 +++++++++++++++++++++++
 11 files changed, 353 insertions(+), 32 deletions(-)
 create mode 100644 swh/loader/package/tests/test_loader_metadata.py
Changes applied before test
commit 24f8137107c558ad4c5849d36f153782758a0cca
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jul 24 14:04:04 2020 +0200

    Make the base PackageLoader write extrinsic revision metadata.
    
    For now, no loader outside the tests use it; but DepositLoader will soon.

commit ec522be394ebe8eefd489a85e21a3d7ac593851d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jul 24 14:00:17 2020 +0200

    Rename PackageInfo.raw to PackageInfo.raw_info and remove it from BasePackageInfo.
    
    The rename is to disambiguate with 'raw metadata', which may differ from the
    raw info.
    And the base PackageLoader doesn't need to access this field, so removing
    it from BasePackageInfo.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/214/ for more details.

For now, no loader outside the tests use it; but DepositLoader will soon.

I'm under the impression it's the opposite, right now, all package loaders will write extrinsic revision metadata...

never mind, i read it again [1] ;)

[1] https://forge.softwareheritage.org/D3613#inline-24758

swh/loader/package/loader.py
103

Shouldn't this be false by default, and opening it once it's ok instead?

614

make it clearer we stop early (I had to read this multiple times ;)

if not metadata_objects:
    return None
swh/loader/package/tests/test_loader_metadata.py
89

is there a difference with yield from [(version, p_info)]?

or even yield (version, p_info)?

swh/loader/package/loader.py
103

well, maybe it's fine, i don't know.

Why do you think this is needed? What package loaders would not want those created?

swh/loader/package/loader.py
99

erf, i tried to remove it once...
Can't we do without?

swh/loader/package/loader.py
99

Yes, thanks to the comment below

103

The current behavior is equivalent to True. But you're right, it doesn't need to be configurable. I'll update the diff to remove it.

swh/loader/package/tests/test_loader_metadata.py
89

It's very different in how Python parses and runs the function/generator, but it works the same for us. I'll change it.

vlorentz marked 3 inline comments as done.

apply comments

vlorentz added inline comments.
swh/loader/package/loader.py
614

There's something to that effect in my uncommitted code

Build is green

Patch application report for D3613 (id=12719)

Could not rebase; Attempt merge onto 14f700d0fa...

Updating 14f700d..8851951
Fast-forward
 swh/loader/package/archive/loader.py             |  10 +-
 swh/loader/package/archive/tests/test_archive.py |   4 +-
 swh/loader/package/cran/loader.py                |   6 +-
 swh/loader/package/debian/loader.py              |   6 +-
 swh/loader/package/deposit/loader.py             |   6 +-
 swh/loader/package/loader.py                     | 122 ++++++++++++++++-
 swh/loader/package/nixguix/loader.py             |   6 +-
 swh/loader/package/npm/loader.py                 |  10 +-
 swh/loader/package/pypi/loader.py                |   6 +-
 swh/loader/package/tests/test_loader.py          |   1 -
 swh/loader/package/tests/test_loader_metadata.py | 158 +++++++++++++++++++++++
 11 files changed, 305 insertions(+), 30 deletions(-)
 create mode 100644 swh/loader/package/tests/test_loader_metadata.py
Changes applied before test
commit 8851951fc1689b402f44f087dad49d8ffc62d15e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jul 24 14:04:04 2020 +0200

    Make the base PackageLoader write extrinsic revision metadata.
    
    For now, no loader outside the tests use it; but DepositLoader will soon.

commit ec522be394ebe8eefd489a85e21a3d7ac593851d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jul 24 14:00:17 2020 +0200

    Rename PackageInfo.raw to PackageInfo.raw_info and remove it from BasePackageInfo.
    
    The rename is to disambiguate with 'raw metadata', which may differ from the
    raw info.
    And the base PackageLoader doesn't need to access this field, so removing
    it from BasePackageInfo.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/217/ for more details.

swh/loader/package/loader.py
614

It's in D3616 now

This revision is now accepted and ready to land.Jul 24 2020, 4:57 PM