
migrate_extrinsic_metadata: Add support for the current format of original_artifacts written by the CRAN loader.
Closed, Public

Authored by vlorentz on Sep 30 2020, 1:08 PM.

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D4099 (id=14445)

Rebasing onto 40997c0506...

Current branch diff-target is up to date.
Changes applied before test
commit a56dbbc055dc233841cc9fdb5c37ca622a7d121d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 30 13:07:49 2020 +0200

    migrate_extrinsic_metadata: Add support for the current format of original_artifacts written by the CRAN loader.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/990/ for more details.

This revision is now accepted and ready to land. Sep 30 2020, 1:17 PM

Thank you @vlorentz for this important script that will take us one step towards a searchable archive :-)

I am not commenting on the implementation itself, only on the doc, which needs one change: do not remove metadata, only copy it to the ERMDS as it is.

swh/storage/migrate_extrinsic_metadata.py
13

To continue with the conservative approach, I would suggest keeping the metadata and just copying it.

15

This way, you don't need to distinguish between the fields that require and those that do not.

21

Finally, it will be less stressful to run a script that doesn't change the archive but is very useful for the search mechanisms we want to implement on the ERMDS (Extrinsic Raw MetaData Storage).
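
The copy-only approach suggested in these comments could look roughly like the sketch below. It is only an illustration, not the code under review: the function name `copy_metadata_to_ermds`, the dict layout of a revision, and the plain list standing in for the ERMDS are all assumptions made for the example.

```python
import copy


def copy_metadata_to_ermds(revision: dict, ermds: list) -> None:
    """Append a copy of the revision's metadata to `ermds`.

    Hypothetical helper: the revision itself is never modified, so the
    archive stays unchanged and no field needs special treatment.
    """
    metadata = revision.get("metadata")
    if metadata is None:
        return  # nothing to copy for this revision
    ermds.append(
        {
            "revision_id": revision["id"],
            # Deep copy so later changes on either side stay independent.
            "metadata": copy.deepcopy(metadata),
        }
    )


# Assumed example data; the real original_artifacts layout may differ.
revision = {
    "id": "rev-1",
    "metadata": {"original_artifacts": [{"filename": "pkg.tar.gz"}]},
}
ermds = []  # stand-in for the Extrinsic Raw MetaData Storage
copy_metadata_to_ermds(revision, ermds)
```

Since nothing is removed, such a script can be rerun or rolled back without touching the source revisions.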

This revision was landed with ongoing or failed builds. Oct 2 2020, 11:08 AM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D4099 (id=14535)

Rebasing onto bef08d6316...

Current branch diff-target is up to date.
Changes applied before test
commit 07df3f6f6dbf0399ab5f34564d6f01f3861607b2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 30 13:07:49 2020 +0200

    migrate_extrinsic_metadata: Add support for the current format of original_artifacts written by the CRAN loader.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/996/ for more details.