
Pubdev: Add raw_extrinsic_metadata
Abandoned · Public

Authored by franckbret on Oct 12 2022, 12:42 PM.

Details

Reviewers
None
Group Reviewers
Reviewers
Maniphest Tasks
T4465: Ingest pub.dev (Dart, Flutter)
Summary

Populate directory_extrinsic_metadata with original-artifacts-json

Related to T4465

Event Timeline

Build is green

Patch application report for D8665 (id=31295)

Rebasing onto a13e3e6f35...

First, rewinding head to replay your work on top of it...
Applying: Pubdev: Add raw_extrinsic_metadata
Changes applied before test
commit 54b7bd5d9ad3d1c57bd159026f71a778ea9ffec3
Author: Franck Bret <franck.bret@octobus.net>
Date:   Wed Oct 12 12:38:58 2022 +0200

    Pubdev: Add raw_extrinsic_metadata
    
    Populate directory_extrinsic_metadata with `original-artifacts-json` and `pubdev-pubspec-json`
    
    Related T4465

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/995/ for more details.

vlorentz added inline comments.
swh/loader/package/pubdev/loader.py
156

Isn't this just a JSON representation of the pubspec.yaml file already present in the directory?

167–170

the base loader already does it itself, but with a different authority (that's why it doesn't show up when you call swh_storage.raw_extrinsic_metadata_get() in the test)
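
A minimal sketch of why a lookup under one authority can miss metadata written under another. This is a toy in-memory model, not the swh.storage API: `ToyMetadataStore`, `Authority`, the two authority values and the target identifier are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authority:
    """Hypothetical stand-in for a metadata authority (type + URL)."""

    type: str
    url: str


class ToyMetadataStore:
    """Toy store that keys metadata entries by (target, authority)."""

    def __init__(self):
        self._entries = {}

    def add(self, target, authority, fmt, payload):
        # Entries are grouped under the exact (target, authority) pair.
        self._entries.setdefault((target, authority), []).append((fmt, payload))

    def get(self, target, authority):
        # Only entries recorded under this exact authority are returned.
        return list(self._entries.get((target, authority), []))


# Illustrative values only; the real authorities are not stated in this thread.
base_loader_authority = Authority("registry", "https://example.org/")
test_authority = Authority("forge", "https://pub.dev")
target = "swh:1:dir:" + "0" * 40  # placeholder directory identifier

store = ToyMetadataStore()
# The base loader writes the entry under its own authority...
store.add(target, base_loader_authority, "original-artifacts-json", b"[]")

# ...so a query with another authority returns nothing for the same target.
found_with_test_authority = store.get(target, test_authority)
found_with_base_authority = store.get(target, base_loader_authority)
```

In this model the entry exists, but it is only visible when the query uses the authority it was written with.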

anlambert added inline comments.
swh/loader/package/pubdev/loader.py
156

Looks like it is; the Web API data only adds the tarball URL, checksum and release date (see the pubspec.yaml example and the related Web API data).

Should we instead implement an indexer that parses pubspec.yaml files and stores the result as intrinsic metadata?

swh/loader/package/pubdev/loader.py
167–170

Better to remove the pubdev-pubspec-json format from the loader for now?

swh/loader/package/pubdev/loader.py
167–170

Or indefinitely, yes. I don't see a reason to do it.

Remove 'pubdev-pubspec-json' format from raw_extrinsic_metadata

Build is green

Patch application report for D8665 (id=31305)

Rebasing onto a13e3e6f35...

First, rewinding head to replay your work on top of it...
Applying: Pubdev: Add raw_extrinsic_metadata
Changes applied before test
commit b8934da6e55a09af4350a51758ad6652a156de5b
Author: Franck Bret <franck.bret@octobus.net>
Date:   Wed Oct 12 12:38:58 2022 +0200

    Pubdev: Add raw_extrinsic_metadata
    
    Populate directory_extrinsic_metadata with `original-artifacts-json`
    
    Related to T4465

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/996/ for more details.

franckbret marked an inline comment as done.
swh/loader/package/pubdev/loader.py
167–170

I think you missed this comment:

the base loader already does it itself, but with a different authority (that's why it doesn't show up when you call swh_storage.raw_extrinsic_metadata_get() in the test)

swh/loader/package/pubdev/loader.py
167–170

OK, you mean I can drop "original-artifacts-json" too. Got it.

I checked whether another endpoint exposes useful extra data per version, but there is none.

The first URL below works, but it returns the same information as the second one, which is per package name:

https://pub.dev/api/packages/connectivity_plus/versions/1.0.0/score
https://pub.dev/api/packages/connectivity_plus/score

Not sure we are interested in it.

Pub.dev has no other metadata that we do not already parse.
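
A small sketch for re-checking that the two score endpoints quoted above return the same payload. The URL patterns come from the two links in this comment; the helper names (`score_urls`, `fetch_json`, `same_payload`) are hypothetical, and the comparison assumes both endpoints return JSON.

```python
import json
import urllib.request

API_BASE = "https://pub.dev/api/packages"


def score_urls(package, version):
    """Return the (per-version, per-package) score endpoint URLs."""
    return (
        f"{API_BASE}/{package}/versions/{version}/score",
        f"{API_BASE}/{package}/score",
    )


def fetch_json(url):
    """Fetch a URL and decode its JSON body (performs a network request)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def same_payload(package, version, fetch=fetch_json):
    """Tell whether both score endpoints return identical JSON."""
    versioned_url, package_url = score_urls(package, version)
    return fetch(versioned_url) == fetch(package_url)
```

Calling `same_payload("connectivity_plus", "1.0.0")` would hit the network; passing a custom `fetch` callable lets the comparison run offline against saved responses.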