
deposit client: Adapt metadata generation so metadata pass server side checks
Closed, Migrated

Description

When the deposit client generates an XML metadata file from the --name and
--title flags, it emits CodeMeta headers [1]: the author and name elements
are qualified under the codemeta namespace.

In that state, the current server-side metadata checks will reject it.

Those checks look for plain author, name or title fields (no namespace).

So, for example, the metadata update scenario won't work with the generated metadata.

Decide what to do and adapt the CLI accordingly.

[1]

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:codemeta="https://doi.org/10.5063/SCHEMA/CODEMETA-2.0">
        <codemeta:name>test-project</codemeta:name>
        <codemeta:identifier>41cebb4c-66ed-4692-87a8-7ba44aa25db8</codemeta:identifier>
        <codemeta:author>
                <codemeta:name>Jane Doe</codemeta:name>
        </codemeta:author>
</entry>
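
To make the mismatch concrete, here is a minimal sketch and not the actual server code. It assumes that "plain" fields are the unprefixed children of the entry (which resolve to the Atom default namespace) and that the rule is "an author plus a name or title", as stated later in this thread. The names plain_fields and passes_plain_check are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# The entry from [1], as the deposit client currently generates it.
GENERATED = b"""<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:codemeta="https://doi.org/10.5063/SCHEMA/CODEMETA-2.0">
    <codemeta:name>test-project</codemeta:name>
    <codemeta:identifier>41cebb4c-66ed-4692-87a8-7ba44aa25db8</codemeta:identifier>
    <codemeta:author>
        <codemeta:name>Jane Doe</codemeta:name>
    </codemeta:author>
</entry>"""


def plain_fields(xml_bytes):
    """Return the local names of the entry's unprefixed (Atom) direct children."""
    prefix = "{%s}" % ATOM
    entry = ET.fromstring(xml_bytes)
    return {child.tag[len(prefix):] for child in entry if child.tag.startswith(prefix)}


def passes_plain_check(xml_bytes):
    """Hypothetical check (assumption): a plain author plus a plain name or title."""
    fields = plain_fields(xml_bytes)
    return "author" in fields and bool(fields & {"name", "title"})


print(plain_fields(GENERATED))        # set(): every child is codemeta-qualified
print(passes_plain_check(GENERATED))  # False
```

Every child of the generated entry lives in the codemeta namespace, so a check that only looks at plain fields finds nothing to accept.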

Event Timeline

The Atom spec says that:

  • atom:entry elements MUST contain one or more atom:author elements, unless [irrelevant stuff]
  • atom:entry elements MUST contain exactly one atom:title element.
  • atom:entry elements MUST contain exactly one atom:updated element.

However, we also want to use CodeMeta, and we want some basic information to be mandatory.

Therefore, I recommend that we require all of the following:

  • http://www.w3.org/2005/Atom#title
  • http://www.w3.org/2005/Atom#updated
  • http://www.w3.org/2005/Atom#author
  • https://doi.org/10.5063/SCHEMA/CODEMETA-2.0#name (yes, in addition to http://www.w3.org/2005/Atom#title, even if they have somewhat the same meaning)
  • https://doi.org/10.5063/SCHEMA/CODEMETA-2.0#author (yes, also a duplicate)
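
As an illustration only, the list above could be checked roughly like this. It is a sketch that assumes the five properties must appear as direct children of the entry; missing_required is a hypothetical helper, not an existing swh function.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
CODEMETA = "https://doi.org/10.5063/SCHEMA/CODEMETA-2.0"

# The five recommended properties, as (namespace, local name) pairs.
REQUIRED = [
    (ATOM, "title"),
    (ATOM, "updated"),
    (ATOM, "author"),
    (CODEMETA, "name"),
    (CODEMETA, "author"),
]


def missing_required(xml_text):
    """Return the required properties, as 'namespace#name', absent from the entry."""
    entry = ET.fromstring(xml_text)
    # ElementTree exposes qualified tags as '{namespace}name'.
    present = {child.tag for child in entry}
    return [f"{ns}#{name}" for ns, name in REQUIRED if f"{{{ns}}}{name}" not in present]
```

With this rule, an entry that carries only codemeta:name and codemeta:author would be reported as missing the three Atom properties.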
ardumont renamed this task from "deposit client: Adapt metadata generation so metadata checks passes" to "deposit client: Adapt metadata generation so metadata checks pass". Oct 13 2020, 6:13 PM
ardumont updated the task description.

@ardumont thanks for this task! now it is more clear what the goal was with D4232.

I agree with @vlorentz that we MUST be SWORD compliant.
We also want to accept CodeMeta vocabulary, since there are more properties to describe software with CodeMeta than with AtomPub or DCTerms (which is mentioned on the SWORD spec).

I hesitate to require all mandatory properties in both vocabularies.
Note that HAL sends us the value HAL in atom:author and the real author values in codemeta:author.

About the cli, we can use @vlorentz's suggestion here without hesitation by duplicating the entries.
I was going to say that we need to add the namespace as well, but it's there in your example.
So just add:

<title> My epic software </title>
<author> swh cli </author>
<updated>  ? </updated>
  1. In this example, I use the same logic we used with HAL (even if it is faulty logic :-D): the channel or client goes in atom:author.
  2. I'm not sure we have required updated with HAL. I will check that, because adding a check on updated might break the workflow.
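
For illustration, a sketch of how the cli could emit both sets of elements. This is not the actual client code: build_entry, its parameters, the "swh cli" default and the use of the current UTC time for updated are all assumptions. Like the snippet above, it puts the client name directly in the author element.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM = "http://www.w3.org/2005/Atom"
CODEMETA = "https://doi.org/10.5063/SCHEMA/CODEMETA-2.0"

# Serialize Atom as the default namespace and CodeMeta under the "codemeta" prefix.
ET.register_namespace("", ATOM)
ET.register_namespace("codemeta", CODEMETA)


def build_entry(name, author, client="swh cli", updated=None):
    """Build an entry that duplicates the basic fields in Atom and CodeMeta."""
    entry = ET.Element(f"{{{ATOM}}}entry")

    # Atom side: title, author (the client, following the HAL logic), updated.
    ET.SubElement(entry, f"{{{ATOM}}}title").text = name
    ET.SubElement(entry, f"{{{ATOM}}}author").text = client
    if updated is None:
        # Assumption: fall back to the generation time as the updated value.
        updated = datetime.now(timezone.utc).isoformat()
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated

    # CodeMeta side: the real name and author, as in the current output.
    ET.SubElement(entry, f"{{{CODEMETA}}}name").text = name
    cm_author = ET.SubElement(entry, f"{{{CODEMETA}}}author")
    ET.SubElement(cm_author, f"{{{CODEMETA}}}name").text = author

    return ET.tostring(entry, encoding="unicode")


print(build_entry("test-project", "Jane Doe"))
```

Whether updated should be emitted at all depends on the outcome of point 2; dropping the three lines that set it leaves the rest of the sketch unchanged.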

@ardumont thanks for this task! now it is more clear what the goal was with D4232.

sure, but to be fair, the diff described the same predicament (granted, without as much detail ;)

Therefore, I recommend that we require all of the following:

I don't understand the "we require" in your sentence.

Are we talking about the server-side code [1] (which does the metadata checks), or about the generated metadata (done client side in the cli)?

[1] is a breaking change in my mind.

About the cli, we can use @vlorentz's suggestion here without hesitation by duplicating the entries.

I'm not sure that it was his suggestion, cf. my question above.
But yes, I gather we can do that.

I'm not sure we have required updated with HAL. I will check that, because adding a check on updated might break the workflow.

nope, it's not required.

Currently, server side, the required fields are:

  • author
  • name (or alternatively title), to convey the same information
ardumont renamed this task from "deposit client: Adapt metadata generation so metadata checks pass" to "deposit client: Adapt metadata generation so metadata checks (server side) pass". Oct 14 2020, 11:52 AM
ardumont renamed this task from "deposit client: Adapt metadata generation so metadata checks (server side) pass" to "deposit client: Adapt metadata generation so metadata checks server side pass".
ardumont renamed this task from "deposit client: Adapt metadata generation so metadata checks server side pass" to "deposit client: Adapt metadata generation so metadata pass server side checks". Oct 14 2020, 4:21 PM

I tried to adapt according to my current understanding of your suggestions in D4261.

ardumont claimed this task.