Page MenuHomeSoftware Heritage

Make the base PackageLoader write extrinsic revision metadata.
ClosedPublic

Authored by vlorentz on Jul 24 2020, 2:04 PM.

Details

Summary

For now, no loader outside the tests use it; but DepositLoader will soon.

Depends on D3612.

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Branch
package-metadata
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 13941
Build 21391: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 21390: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D3613 (id=12715)

Could not rebase; Attempt merge onto 14f700d0fa...

Updating 14f700d..24f8137
Fast-forward
 swh/loader/package/archive/loader.py             |  10 +-
 swh/loader/package/archive/tests/test_archive.py |   4 +-
 swh/loader/package/cran/loader.py                |   6 +-
 swh/loader/package/debian/loader.py              |   6 +-
 swh/loader/package/deposit/loader.py             |   6 +-
 swh/loader/package/loader.py                     | 128 +++++++++++++-
 swh/loader/package/nixguix/loader.py             |   6 +-
 swh/loader/package/npm/loader.py                 |  10 +-
 swh/loader/package/pypi/loader.py                |   6 +-
 swh/loader/package/tests/test_loader.py          |   1 -
 swh/loader/package/tests/test_loader_metadata.py | 202 +++++++++++++++++++++++
 11 files changed, 353 insertions(+), 32 deletions(-)
 create mode 100644 swh/loader/package/tests/test_loader_metadata.py
Changes applied before test
commit 24f8137107c558ad4c5849d36f153782758a0cca
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jul 24 14:04:04 2020 +0200

    Make the base PackageLoader write extrinsic revision metadata.
    
    For now, no loader outside the tests use it; but DepositLoader will soon.

commit ec522be394ebe8eefd489a85e21a3d7ac593851d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jul 24 14:00:17 2020 +0200

    Rename PackageInfo.raw to PackageInfo.raw_info and remove it from BasePackageInfo.
    
    The rename is to disambiguate with 'raw metadata', which may differ from the
    raw info.
    And the base PackageLoader doesn't need to access this field, so removing
    it from BasePackageInfo.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/214/ for more details.

For now, no loader outside the tests use it; but DepositLoader will soon.

I'm under the impression it's the opposite, right now, all package loaders will write extrinsic revision metadata...

never mind, i read it again [1] ;)

[1] https://forge.softwareheritage.org/D3613#inline-24758

swh/loader/package/loader.py
104

Shouldn't this be false by default, and opening it once it's ok instead?

615

make it clearer we stop early (I had to read this multiple times ;)

if not metadata_objects:
    return None
swh/loader/package/tests/test_loader_metadata.py
89

is there a difference with yield from [(version, p_info)]?

or even yield (version, p_info)?

swh/loader/package/loader.py
104

well, maybe it's fine, i don't know.

Why do you think this is needed? What package loaders would not want those created?

swh/loader/package/loader.py
99

erf, i tried to remove it once...
Can't we do without?

swh/loader/package/loader.py
99

Yes, thanks to the comment below

104

The current behavior is equivalent to True. But you're right, it doesn't need to be configurable. I'll update the diff to remove it.

swh/loader/package/tests/test_loader_metadata.py
89

It's very different in how Python parses and runs the function/generator, but it works the same for us. I'll change it.

vlorentz marked 3 inline comments as done.

apply comments

vlorentz added inline comments.
swh/loader/package/loader.py
615

There's something to that effect in my uncommitted code

Build is green

Patch application report for D3613 (id=12719)

Could not rebase; Attempt merge onto 14f700d0fa...

Updating 14f700d..8851951
Fast-forward
 swh/loader/package/archive/loader.py             |  10 +-
 swh/loader/package/archive/tests/test_archive.py |   4 +-
 swh/loader/package/cran/loader.py                |   6 +-
 swh/loader/package/debian/loader.py              |   6 +-
 swh/loader/package/deposit/loader.py             |   6 +-
 swh/loader/package/loader.py                     | 122 ++++++++++++++++-
 swh/loader/package/nixguix/loader.py             |   6 +-
 swh/loader/package/npm/loader.py                 |  10 +-
 swh/loader/package/pypi/loader.py                |   6 +-
 swh/loader/package/tests/test_loader.py          |   1 -
 swh/loader/package/tests/test_loader_metadata.py | 158 +++++++++++++++++++++++
 11 files changed, 305 insertions(+), 30 deletions(-)
 create mode 100644 swh/loader/package/tests/test_loader_metadata.py
Changes applied before test
commit 8851951fc1689b402f44f087dad49d8ffc62d15e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jul 24 14:04:04 2020 +0200

    Make the base PackageLoader write extrinsic revision metadata.
    
    For now, no loader outside the tests use it; but DepositLoader will soon.

commit ec522be394ebe8eefd489a85e21a3d7ac593851d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jul 24 14:00:17 2020 +0200

    Rename PackageInfo.raw to PackageInfo.raw_info and remove it from BasePackageInfo.
    
    The rename is to disambiguate with 'raw metadata', which may differ from the
    raw info.
    And the base PackageLoader doesn't need to access this field, so removing
    it from BasePackageInfo.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/217/ for more details.

swh/loader/package/loader.py
615

It's in D3616 now

This revision is now accepted and ready to land.Jul 24 2020, 4:57 PM