Page MenuHomeSoftware Heritage

package loaders: write original_artifact metadata to directories instead of revisions.
ClosedPublic

Authored by vlorentz on Oct 23 2020, 5:01 PM.

Details

Summary

For consistency with the change in the previous diff (D4346), writing other
extrinsic metadata on directories.

Event Timeline

Build is green

Patch application report for D4347 (id=15380)

Could not rebase; Attempt merge onto c30e485120...

Updating c30e485..5412e90
Fast-forward
 swh/loader/package/deposit/loader.py             |  2 +-
 swh/loader/package/deposit/tests/test_deposit.py | 44 +++++++++++----------
 swh/loader/package/loader.py                     | 50 +++++++++++++++---------
 swh/loader/package/npm/loader.py                 |  2 +-
 swh/loader/package/npm/tests/test_npm.py         | 16 +++++---
 swh/loader/package/pypi/loader.py                |  2 +-
 swh/loader/package/pypi/tests/test_pypi.py       | 12 ++++--
 swh/loader/package/tests/test_loader_metadata.py | 43 +++++++++++---------
 8 files changed, 101 insertions(+), 70 deletions(-)
Changes applied before test
commit 5412e90f8360ec22d9c780b8cc8a2d0870405df5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Oct 23 17:01:16 2020 +0200

    package loaders: write original_artifact metadata to directories instead of revisions.
    
    For consistency with the change in the previous commit, writing other
    extrinsic metadata on directories.

commit 28bd28ca57a23d123dcc0931843e9a1d83c4d53d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Oct 23 16:52:23 2020 +0200

    package loaders: write extrinsic metadata to directories instead of revisions.
    
    They are more useful on directories, as directory ids are more intrinsic than
    synthetic revision ids.
    And this adds the revision swhid in the context, so the revision relationship
    is still available when it's useful (eg. because the same directory can be
    referenced from multiple revisions).

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/331/ for more details.

moranegg added inline comments.
swh/loader/package/loader.py
563

this TODO is aiming for the diffs goal but the line below says that some metadata are still on the revision, right?

swh/loader/package/loader.py
563

no, it means in the revision table

Build is green

Patch application report for D4347 (id=15485)

Could not rebase; Attempt merge onto e2726ef650...

Merge made by the 'recursive' strategy.
 swh/loader/package/deposit/loader.py             |  2 +-
 swh/loader/package/deposit/tests/test_deposit.py | 52 +++++++++++++-----------
 swh/loader/package/loader.py                     | 50 +++++++++++++++--------
 swh/loader/package/npm/loader.py                 |  2 +-
 swh/loader/package/npm/tests/test_npm.py         | 16 +++++---
 swh/loader/package/pypi/loader.py                |  2 +-
 swh/loader/package/pypi/tests/test_pypi.py       | 12 ++++--
 swh/loader/package/tests/test_loader_metadata.py | 43 +++++++++++---------
 8 files changed, 105 insertions(+), 74 deletions(-)
Changes applied before test
commit eaf544224e713c364dba7a62f5d60df8d3328f83
Merge: e2726ef 4eb0f0f
Author: Jenkins user <jenkins@localhost>
Date:   Thu Oct 29 10:10:59 2020 +0000

    Merge branch 'diff-target' into HEAD

commit 4eb0f0f4237e887d5eb25bc02727cf4a65741db5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Oct 23 17:01:16 2020 +0200

    package loaders: write original_artifact metadata to directories instead of revisions.
    
    For consistency with the change in the previous commit, writing other
    extrinsic metadata on directories.

commit ba364b6e603657880f9a82f61561cbec17f5b1fc
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Oct 23 16:52:23 2020 +0200

    package loaders: write extrinsic metadata to directories instead of revisions.
    
    They are more useful on directories, as directory ids are more intrinsic than
    synthetic revision ids.
    And this adds the revision swhid in the context, so the revision relationship
    is still available when it's useful (eg. because the same directory can be
    referenced from multiple revisions).

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/334/ for more details.

Build is green

Patch application report for D4347 (id=15487)

Could not rebase; Attempt merge onto e2726ef650...

Updating e2726ef..8f41aee
Fast-forward
 swh/loader/package/deposit/loader.py             |  2 +-
 swh/loader/package/deposit/tests/test_deposit.py | 52 +++++++++++++-----------
 swh/loader/package/loader.py                     | 50 +++++++++++++++--------
 swh/loader/package/npm/loader.py                 |  2 +-
 swh/loader/package/npm/tests/test_npm.py         | 16 +++++---
 swh/loader/package/pypi/loader.py                |  2 +-
 swh/loader/package/pypi/tests/test_pypi.py       | 12 ++++--
 swh/loader/package/tests/test_loader_metadata.py | 43 +++++++++++---------
 8 files changed, 105 insertions(+), 74 deletions(-)
Changes applied before test
commit 8f41aeee10c4e6a605d62330a11c46b1cba82d84
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Oct 23 17:01:16 2020 +0200

    package loaders: write original_artifact metadata to directories instead of revisions.
    
    For consistency with the change in the previous commit, writing other
    extrinsic metadata on directories.

commit cf58604ccaa469a7d31c30c476a3b9187493e9ce
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Oct 23 16:52:23 2020 +0200

    package loaders: write extrinsic metadata to directories instead of revisions.
    
    They are more useful on directories, as directory ids are more intrinsic than
    synthetic revision ids.
    And this adds the revision swhid in the context, so the revision relationship
    is still available when it's useful (eg. because the same directory can be
    referenced from multiple revisions).

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/336/ for more details.

Why didn't this go in the previous diff?

swh/loader/package/loader.py
563

I'm confused by the change and the apparent todo contradiction.
Can you please clarify this for me?

olasd added inline comments.
swh/loader/package/loader.py
563

The revision object (mutated by attr.evolve below) still contains the metadata. That needs to go away when it's not used anymore (by incremental loads)

Why didn't this go in the previous diff?

To have smaller diffs.

This revision is now accepted and ready to land.Nov 2 2020, 10:19 AM