Build a connector for software deposit via Zenodo/InvenioRDM
Open, Normal, Public

Description

The popular Zenodo platform is becoming a white-label open source application, InvenioRDM, built on top of the Invenio library: https://inveniosoftware.org/products/rdm/
Many partners are collaborating on this new version, so this is the right time to contribute a software deposit functionality similar to what we have for HAL:

  • SWORD 3 support is planned for InvenioRDM
  • CodeMeta support/export is planned for InvenioRDM
  • our new BibTeX types would be welcome in InvenioRDM

First public release expected this summer.

Event Timeline

rdicosmo triaged this task as Normal priority. Apr 1 2020, 5:41 PM
rdicosmo created this task.

Great news !!

Does this mean we need to be SWORD 3 compatible?

After some thought, it appears to me that this is not absolutely necessary, but we need to discuss this point with the people working on the Invenio codebase.

Here is the pad shared with the InvenioRDM team:
https://hackmd.io/YIJXcf3YTDiwwYGD-yePrA

The InvenioRDM team proposes to divide the work on their side into 3-4 phases:

  1. Core Development
  2. InvenioRDM to Software Heritage base integration: implement the interactions outlined in Figure 1 and Figure 2 (available in the HackMD pad)
  3. GitHub Integration: extend the integration to GitHub, enabling automatic deposit into Invenio and “save code now” in Software Heritage
  4. Advanced GitHub Integration

On the SWH side, the InvenioRDM integration and the HAL extension to deposit metadata require the following steps:

  1. Extend the same deposit endpoint
  2. Add a new option to the swh-deposit client
  3. specs about metadata verification
    • choose metadata format T2311
    • SWHID (core and/or with context)
    • not empty metadata
    • url / authors ?
    • (syntax) incorrect ?
  4. implementation of metadata verification
  5. specs about deposit metadata storage
    • keep only in metadata storage
    • don't create origin (or other graph artifacts) for metadata deposit
  6. implementation of deposit metadata storage
  7. keep release in mind for content deposit T1755
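
To make the verification items of step 3 more concrete, here is a minimal sketch of what such checks could look like. It only covers the "not empty metadata" and "SWHID (core and/or with context)" items. The function names (check_swhid, check_metadata) and the error messages are hypothetical and are not part of swh-deposit; the qualifier check is deliberately loose (any key=value pair) since the exact rules remain to be specified.

```python
import re
from typing import Dict, List, Optional

# Core SWHID: swh:1:<object type>:<40 lowercase hex digits>
CORE_SWHID_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}$")


def check_swhid(swhid: str, allow_context: bool = True) -> bool:
    """Hypothetical syntax check for a SWHID, with or without context.

    Context qualifiers follow the core identifier, separated by ';',
    each in the form key=value.
    """
    core, *qualifiers = swhid.split(";")
    if not CORE_SWHID_RE.match(core):
        return False
    if qualifiers and not allow_context:
        return False
    for qualifier in qualifiers:
        key, sep, value = qualifier.partition("=")
        # Assumption: any non-empty key=value pair is accepted here.
        if not (key and sep and value):
            return False
    return True


def check_metadata(metadata: Dict[str, str], swhid: Optional[str]) -> List[str]:
    """Return a list of error messages; an empty list means the checks passed."""
    errors = []
    if not metadata:
        errors.append("metadata must not be empty")
    if swhid is None:
        errors.append("a SWHID is required")
    elif not check_swhid(swhid):
        errors.append(f"invalid SWHID syntax: {swhid}")
    return errors
```

Returning a list of errors rather than raising on the first one would let the endpoint report every problem to the client in a single response; whether that is the desired behaviour is part of the specs to write.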

I surveyed the existing code to check that my understanding of how the
deposit works is correct. TL;DR: it is!

The existing metadata update endpoints are concentrated in the
swh.deposit.api.deposit_update module [1].

Up to now, a restriction in the base class prevents updating a deposit whose
status is anything other than 'partial' [2]. That restriction should be
relaxed for the deposit metadata update case (when a SWHID is provided in
some way). It should stay in place for the other existing cases.

[1] https://forge.softwareheritage.org/source/swh-deposit/browse/master/swh/deposit/api/deposit_update.py$86-161

[2] https://forge.softwareheritage.org/source/swh-deposit/browse/master/swh/deposit/api/common.py$755-778
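
The relaxed restriction described above could be sketched as follows. This is only an illustration under assumptions: check_update_allowed and DepositUpdateError are hypothetical names, not the actual code in swh.deposit.api.common, and the 'done' status in the usage note is just an example of a non-'partial' status.

```python
from typing import Optional

DEPOSIT_STATUS_PARTIAL = "partial"


class DepositUpdateError(Exception):
    """Hypothetical error raised when a deposit update is refused."""


def check_update_allowed(deposit_status: str, swhid: Optional[str] = None) -> None:
    """Refuse updates on non-'partial' deposits, unless a SWHID is provided.

    When a SWHID is given, the request is treated as a metadata-only
    update, so the status restriction does not apply.
    """
    if swhid is not None:
        # Metadata update case: the restriction is relaxed.
        return
    if deposit_status != DEPOSIT_STATUS_PARTIAL:
        raise DepositUpdateError(
            f"cannot update a deposit with status '{deposit_status}', "
            f"only '{DEPOSIT_STATUS_PARTIAL}' deposits can be updated"
        )
```

With this sketch, check_update_allowed("done") raises, while check_update_allowed("done", swhid=...) and check_update_allowed("partial") both pass, which keeps the existing behaviour for all other cases.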