Review codemeta.json export from HAL
Closed, Resolved (Public)

Description

Here is a first test of the codemeta.json export:
https://hal.halpreprod.archives-ouvertes.fr/hal-02071874v1/codemeta

Comments from @vlorentz :

  1. "identifier" value should not be a dict
  2. it would be nice if the license was an SPDX name or URI

and I'm not sure what applicationCategory refers to, but it's underspecified in schema.org anyway
and developmentStatus should preferably not be in French, but use a term from repostatus.org

see P746 for Property value.
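
The review points above can be checked mechanically. The following is a minimal Python sketch, not part of the HAL export or of any Software Heritage tool: the function name `review_codemeta` and the sample fragment are hypothetical, and the status set lists the short names published on repostatus.org.

```python
import json

# Short names of the project statuses defined by repostatus.org.
REPOSTATUS_TERMS = {
    "concept", "wip", "suspended", "abandoned",
    "active", "inactive", "unsupported", "moved",
}


def review_codemeta(doc):
    """Return a list of human-readable problems found in a codemeta dict."""
    problems = []

    identifiers = doc.get("identifier", [])
    if not isinstance(identifiers, list):
        identifiers = [identifiers]
    for ident in identifiers:
        # A dict without "@type": "PropertyValue" is what the review flags.
        if isinstance(ident, dict) and ident.get("@type") != "PropertyValue":
            problems.append("identifier is a plain dict, not a PropertyValue")

    status = doc.get("developmentStatus")
    if status is not None:
        # Accept either a short name or a URI ending in "#<short name>".
        term = str(status).rsplit("#", 1)[-1].lower()
        if term not in REPOSTATUS_TERMS:
            problems.append(
                f"developmentStatus {status!r} is not a repostatus.org term"
            )

    return problems


# Hypothetical export fragment (assumption) that shows both issues.
sample = json.loads(
    '{"identifier": {"hal": "hal-02071874"}, "developmentStatus": "Actif"}'
)
for problem in review_codemeta(sample):
    print(problem)
```

The license and applicationCategory remarks are left out of the sketch because checking them needs an external vocabulary (for example the SPDX license list).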

Event Timeline

moranegg triaged this task as Normal priority. Aug 21 2020, 11:59 AM
moranegg created this task.
moranegg added a comment. Edited Aug 31 2020, 2:39 PM

List of comments from this collaborative document: https://hackmd.io/g_6J8cBETBi66R9AvPAGOA

List of changes

  1. Add schema.org to the context
"@context": ["https://doi.org/doi:10.5063/schema/codemeta-2.0", "http://schema.org"],
  1. Add author's role following (https://github.com/codemeta/codemeta/issues/240)
{
...
"author": [
        {
            "@type": "Role",
            "roleName": "Design",
            "author": {
                "@type": "Person",
                "@id": "http://example.org/~useIDHAL",
                "givenName": "John",
                "familyName": "Dupont",
                "email": "email@email.org",
                "affiliation": [
                    {
                        "@type": "Organization",
                        "name": "CNRS"
                    },
                    {
                        "@type": "Organization",
                        "name": "Inria"
                    },
                    {
                        "@type": "Organization",
                        "name": "Université de Paris"
                    }
                ]
            }
        }
    ]
}
  1. "identifier" value should not be a dict but a PropertyValue
    • decide which propertyID to use
{
...
"identifier": [
    {
        "@type": "PropertyValue",
        "propertyID": "SWHID",
        "value": "swh:1:dir:9f85c8f51850028a9fbc03463c74de29a2d24c6c"
    },
    {
        "@type": "PropertyValue",
        "propertyID": "HAL-ID",
        "value": "hal-02071874"
    }
],
...
}
  1. dateModified: what date is used here?
  2. add downloadUrl with the url used for file
  3. add email to author
  4. validate that all properties in the export are also in the deposit XML
    A. introduce an alternative solution for deposit metadata using the CodeMeta export
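
The Role wrapper from the second change can also be generated programmatically. This is a minimal Python sketch under assumptions: the helper name `role_author` and its parameters are hypothetical, and the values are the placeholder ones from the example above, not real HAL data.

```python
import json


def role_author(role_name, given_name, family_name, affiliations,
                person_id=None, email=None):
    """Wrap a Person in a schema.org Role, mirroring the example above."""
    person = {
        "@type": "Person",
        "givenName": given_name,
        "familyName": family_name,
        "affiliation": [
            {"@type": "Organization", "name": name} for name in affiliations
        ],
    }
    # Optional fields are only emitted when a value is available.
    if person_id:
        person["@id"] = person_id
    if email:
        person["email"] = email
    return {"@type": "Role", "roleName": role_name, "author": person}


entry = role_author(
    "Design", "John", "Dupont",
    ["CNRS", "Inria", "Université de Paris"],
    person_id="http://example.org/~useIDHAL",
    email="email@email.org",
)
# ensure_ascii=False keeps "Université" readable in the output.
print(json.dumps({"author": [entry]}, indent=4, ensure_ascii=False))
```

Serializing through `json.dumps` avoids the hand-written syntax slips (missing commas or colons) that are easy to make in nested snippets like these.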
vlorentz added a comment. Edited Aug 31 2020, 5:46 PM

Nitpick, but I'd rather use URIs as propertyID, as recommended by schema.org (it avoids name clashes, is self-documenting, etc.), e.g. https://softwareheritage.org/swhid instead of SWHID. I don't have a good resolvable one for IdHAL, but we could decide on https://hal.archives-ouvertes.fr/idhal or https://archives-ouvertes.fr/idhal even if they are not resolvable.

The resolver for SWHIDs can be https://archive.softwareheritage.org/ so should it be the value of PropertyID or the one you have written: https://softwareheritage.org/swhid?
The same goes for HAL: appending the HAL-ID to https://hal.archives-ouvertes.fr/ resolves the identifier.
So is that correct?

{
...
"identifier": [
    {
        "@type": "PropertyValue",
        "propertyID": "https://archive.softwareheritage.org/",
        "value": "swh:1:dir:9f85c8f51850028a9fbc03463c74de29a2d24c6c"
    },
    {
        "@type": "PropertyValue",
        "propertyID": "https://hal.archives-ouvertes.fr/",
        "value": "hal-02071874"
    }
],
...
}

or

{
...
"identifier": [
    {
        "@type": "PropertyValue",
        "propertyID": "https://archive.softwareheritage.org/SWHID",
        "value": "swh:1:dir:9f85c8f51850028a9fbc03463c74de29a2d24c6c"
    },
    {
        "@type": "PropertyValue",
        "propertyID": "https://hal.archives-ouvertes.fr/HAL-ID",
        "value": "hal-02071874"
    }
],
...
}
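
Separately from the choice of propertyID, the resolution step mentioned above (appending the identifier to a base URL) can be sketched in Python. The function name `resolver_url` and the prefix checks are assumptions made for illustration; only the two base URLs come from this discussion.

```python
# Base URLs quoted in the discussion above.
SWH_RESOLVER = "https://archive.softwareheritage.org/"
HAL_RESOLVER = "https://hal.archives-ouvertes.fr/"


def resolver_url(value):
    """Build a browsable URL by appending the identifier to its base URL."""
    # Prefix checks are a simplification (assumption), not a full validator.
    if value.startswith("swh:1:"):
        return SWH_RESOLVER + value
    if value.startswith("hal-"):
        return HAL_RESOLVER + value
    raise ValueError(f"unrecognized identifier: {value!r}")


print(resolver_url("swh:1:dir:9f85c8f51850028a9fbc03463c74de29a2d24c6c"))
print(resolver_url("hal-02071874"))
```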
vlorentz added a comment. Edited Sep 1 2020, 12:17 PM

The resolver for SWHIDs can be https://archive.softwareheritage.org/ so should it be the value of PropertyID or the one you have written: https://softwareheritage.org/swhid?

Sorry, that's not what I meant. The only requirement is that PropertyID must be a unique URI. By "resolvable", I mean that opening that URI in a web browser describes what that PropertyID is about (i.e. documentation), not that it can be used to resolve the identifier.

Another example is the URI of the XHTML namespace, which is https://www.w3.org/1999/xhtml/; when opened in a browser, it shows the specification of XHTML.
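
To make the distinction concrete, here is a minimal Python sketch of a PropertyValue builder that only accepts a URI as propertyID. The helper name `property_value` is hypothetical, and restricting the check to http(s) URIs is an assumption chosen so that the URI can be opened in a browser.

```python
from urllib.parse import urlparse


def property_value(property_id, value):
    """Build a schema.org PropertyValue whose propertyID is an http(s) URI."""
    parsed = urlparse(property_id)
    # Reject bare names such as "SWHID": they can clash and document nothing.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"propertyID should be a URI, got {property_id!r}")
    return {
        "@type": "PropertyValue",
        "propertyID": property_id,
        "value": value,
    }


swhid = property_value(
    "https://softwareheritage.org/swhid",
    "swh:1:dir:9f85c8f51850028a9fbc03463c74de29a2d24c6c",
)
print(swhid)
```

The propertyID names the identifier scheme; it is not concatenated with the value to locate the object.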

So should we have a redirection to the SWHID docs on https://softwareheritage.org/swhid or use the current link?

The current redirection is fine IMO

It's linking to a blog post; it's not even the formal documentation.
There is also the notion of persistence.

Well we can't change that redirection, it may be linked by other documents

Reviewed and opened the HAL issue (which is on a closed forge, so most of you can't open it):
https://gitlab.ccsd.cnrs.fr/ccsd/hal/-/issues/302

Its content is:

  1. Add schema.org to the context
...
"@context": ["https://doi.org/doi:10.5063/schema/codemeta-2.0", "http://schema.org"],
...
  1. Add author's role following (https://github.com/codemeta/codemeta/issues/240)
{
...
"author": [
        {
            "@type": "Role",
            "roleName": "Design",
            "author": {
                "@type": "Person",
                "@id": "http://example.org/~useIDHAL",
                "givenName": "John",
                "familyName": "Dupont",
                "email": "email@email.org",
                "affiliation": [
                    {
                        "@type": "Organization",
                        "name": "CNRS"
                    },
                    {
                        "@type": "Organization",
                        "name": "Inria"
                    },
                    {
                        "@type": "Organization",
                        "name": "Université de Paris"
                    }
                ]
            }
        }
    ]
}
  1. "identifier" value should not be a dict but a PropertyValue
    • decide which propertyID to use for the HAL-ID
{
...
"identifier": [
    {
        "@type": "PropertyValue",
        "propertyID": "https://softwareheritage.org/swhid",
        "value": "swh:1:dir:9f85c8f51850028a9fbc03463c74de29a2d24c6c"
    },
    {
        "@type": "PropertyValue",
        "propertyID": "https://hal.archives-ouvertes.fr/hal_id",
        "value": "hal-02071874"
    }
],
...
}
moranegg closed this task as Resolved. Sep 21 2020, 3:19 PM