
Extend CodeMeta vocabulary to qualify author relationships
Open, Normal, Public

Description

The current CodeMeta vocabulary needs to be extended to accommodate

  1. the rich set of contributor roles identified in https://hal.archives-ouvertes.fr/hal-02135891
  2. the possibility of adding multiple affiliations to any Person entity

Issue on CodeMeta repository:
https://github.com/codemeta/codemeta/issues/240

Event Timeline

rdicosmo triaged this task as Normal priority. Mar 21 2020, 11:57 AM
rdicosmo created this task.
vlorentz added a comment. Edited Mar 21 2020, 12:30 PM

Taxon map:

  • (Algorithm) Design: referencePublication -> author would work if there's an academic publication. (referencePublication is codemeta only)
  • Debugging: ?
  • Maintenance: maintainer (codemeta only for now, it's a pending property of schema.org)
  • Coding: author property of SoftwareSourceCode?
  • Architecture design: author property of SoftwareApplication?
  • Documentation: softwareHelp -> author
  • Testing: ? (I guess technically you could use review -> author, but that's stretching the meaning of "review")
  • Support: ?
  • Management: ?

Another relevant property not mentioned in the paper is translator.

Encoding the roles into existing properties of CodeMeta or schema.org is one possibility.
The other, which I prefer, is to associate a "role" property with authors/contributors: an author/contributor may have multiple roles, just as they may have multiple affiliations.

moranegg added a comment. Edited Mar 23 2020, 12:06 AM

I just discovered that there is a pending term maintainer on schema.org; thanks @vlorentz for pointing this out.
https://schema.org/maintainer

I agree that neither CodeMeta nor schema.org gives the possibility to specify all roles at the moment, and I'm still not sure what we should push for.
I do like the idea of adding a role to author, but schema.org is organized around adding properties with specific roles:

Instances of Person may appear as values for the following properties

On the Person page: https://schema.org/Person

I do like the idea of adding a role to author, but schema.org is organized around adding properties with specific roles

Indeed.

@rdicosmo JSON-LD (and therefore CodeMeta) is about describing nodes (Person, Organization, SoftwareSourceCode, SoftwareApplication, ...) and the relationships between them (author, funder, ...) as (subject, predicate, object) triples, e.g. (parmap, author, rdicosmo).
My understanding of your suggestion is that you want to qualify the relationship/triplet with extra attributes (the roles).
While this is possible with RDF ("statement reification", see this nice example: https://stackoverflow.com/a/1315775 ), JSON-LD unfortunately doesn't allow this directly.

One way to do this reification while keeping a JSON-LD representation is to turn each link into a node of a new type (e.g. AuthorRelationship). That node would be related to the two original objects ((parmap, authoredByRelationship, id1) and (rdicosmo, authoringRelationship, id1)), and the role would be added as a property on the new node ((id1, role, designer), (id1, level, 5)).
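The intermediate-node idea can be sketched in a few lines of Python. This is only an illustration: the triples are plain tuples, the helper `reify_authorship` is hypothetical, and the names AuthorRelationship, authoredByRelationship and authoringRelationship are the ones suggested in this comment, not terms that exist in CodeMeta or schema.org.

```python
# A plain triple has no slot for a role: (subject, predicate, object).
plain = [("parmap", "author", "rdicosmo")]


def reify_authorship(software, person, link_id, **attributes):
    """Replace one (software, author, person) link with an intermediate node.

    The type and predicate names are hypothetical; they come from the
    comment above and exist in neither CodeMeta nor schema.org.
    """
    triples = [
        (link_id, "@type", "AuthorRelationship"),
        (software, "authoredByRelationship", link_id),
        (person, "authoringRelationship", link_id),
    ]
    # Qualifiers hang off the relationship node, not off the person.
    triples += [(link_id, key, value) for key, value in attributes.items()]
    return triples


reified = reify_authorship("parmap", "rdicosmo", "id1", role="designer", level=5)
for triple in reified:
    print(triple)
```

The single author link becomes five triples, which shows the cost of this approach: every consumer has to follow the extra hop through id1 to find out who the author is.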

Unfortunately, it would mean ditching (the current version of) Codemeta as well as schema.org, because they absolutely do not have the vocabulary to express it.

Another way would be to use the Codemeta/schema.org vocabulary and use the RDF representation instead of JSON-LD. But then it's no longer Codemeta-as-a-file-format.

If you agree with this proposal with @vlorentz's example, I can submit the issue tomorrow.

Proposal for author roles in CodeMeta v3

The question of authorship and contribution roles in the scholarly ecosystem has been addressed in different ways, yet software remains uncharted territory.
In CRediT, for example, software is a single role that combines all roles related to software authorship (see https://casrai.org/credit/ ).

We know that the situation with software development is more complex and cannot be reduced to the distinction between author and contributor.

The Inria research center's citation working group recently published the following article, which includes a specific taxonomy of software roles:

Pierre Alliez, Roberto Di Cosmo, Benjamin Guedj, Alain Girault, Mohand-Said Hacid, Arnaud Legrand, Nicolas Rougier. Attributing and Referencing (Research) Software: Best Practices and Outlook From Inria. Computing in Science & Engineering, 22 (1), pp. 39-52, 2020, ISSN: 1558-366X. https://dx.doi.org/10.1109/MCSE.2019.2949413 also available at https://hal.archives-ouvertes.fr/hal-02135891

We want to find a way to integrate the identified roles into the CodeMeta vocabulary and eventually into schema.org.
To that end, we propose adding the hasOccupation property as a Role property under author, which is a Person in schema.org.

List of roles:

  • Algorithm design
  • Architecture design
  • Debugging
  • Maintenance
  • Coding
  • Documentation
  • Testing
  • Support
  • Management

Here is an example:

{
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "Foo Software",
    "author": [
        {
            "@type": "Role",
            "roleName": "Design",
            "startDate": "2000",
            "endDate": "2002",
            "author": {
                "@type": "Person",
                "@id": "http://example.org/~jdupont",
                "givenName": "John",
                "familyName": "Dupont",
                "affiliation": [
                    {
                        "@type": "Organization",
                        "name": "CNRS"
                    },
                    {
                        "@type": "Organization",
                        "name": "Inria"
                    },
                    {
                        "@type": "Organization",
                        "name": "Université de Paris"
                    }
                ]
            }
        },
        {
            "@type": "Role",
            "roleName": "Coding",
            "startDate": "2000",
            "endDate": "2002",
            "author": {
                "@id": "http://example.org/~jdupont"
            }
        },
        {
            "@type": "Role",
            "roleName": "Documentation",
            "startDate": "2000",
            "endDate": "2002",
            "author": {
                "@id": "http://example.org/~jdupont"
            }
        }
    ]
}

Note that I added an ID to avoid duplicating the person's data.
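To illustrate how a consumer might read this structure, here is a minimal Python sketch that groups role names by the person's @id. It works on the parsed JSON as a plain dictionary and does no real JSON-LD processing (no context expansion); the function name `roles_by_person` and the shortened copy of the example document are assumptions made for this illustration.

```python
from collections import defaultdict

# Shortened copy of the example above (dates and affiliations omitted).
doc = {
    "@type": "SoftwareSourceCode",
    "name": "Foo Software",
    "author": [
        {"@type": "Role", "roleName": "Design",
         "author": {"@type": "Person", "@id": "http://example.org/~jdupont",
                    "givenName": "John", "familyName": "Dupont"}},
        {"@type": "Role", "roleName": "Coding",
         "author": {"@id": "http://example.org/~jdupont"}},
        {"@type": "Role", "roleName": "Documentation",
         "author": {"@id": "http://example.org/~jdupont"}},
    ],
}


def roles_by_person(document):
    """Return ({person_id: [roleName, ...]}, {person_id: fullest description})."""
    roles = defaultdict(list)
    people = {}
    for entry in document.get("author", []):
        if entry.get("@type") != "Role":
            continue  # a bare Person carries no role information
        person = entry["author"]
        person_id = person["@id"]
        roles[person_id].append(entry["roleName"])
        # Later entries may hold only the @id, so keep the richest description.
        if len(person) > len(people.get(person_id, {})):
            people[person_id] = person
    return dict(roles), people


roles, people = roles_by_person(doc)
print(roles)
```

Because the second and third entries carry only the @id, the person's name has to be recovered from the first, fully described entry, which is why the sketch keeps the richest description seen for each identifier.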


Here is the link to the pad where we worked on the proposal: https://pad.inria.fr/p/np_4OhCOtQzpjSQBuAN_codemeta-proposal

I just realized I didn't follow schema.org's roleName examples when writing this example. Could you copy-paste the new example from the pad, which uses "designer", "coder", and "documenter"?

I've seen the new example, is this the right transformation?

Algorithm design = designer
Architecture design = architect
Debugging = debugger
Maintenance = maintainer
Coding = coder
Documentation = documenter (is that a word?)
Testing = tester
Support = supporter ? (not convinced)
Management = manager

I think I might prefer keeping the roles as first suggested.

zack added a subscriber: zack. Apr 2 2020, 5:23 PM

I've seen the new example, is this the right transformation?

just a few comments/suggestions on some of these:

Algorithm design = designer

This is too ambiguous, I think: it might mean visual design, and without qualification people are more likely to think of software architecture design than of algorithms.
(I don't have a better suggestion, unfortunately.)

Architecture design = architect
Debugging = debugger
Maintenance = maintainer
Coding = coder

"developer", rather?

Documentation = documenter (is that a word?)

this is usually "technical writer"

Testing = tester
Support = supporter ? (not convinced)
Management = manager

I think I might prefer keeping the roles as first suggested.

they don't have to be single words. I'd say:

Algorithm design -> "algorithm designer"
Support -> "customer support" (for lack of a better term)

Our proposal to extend the CodeMeta vocabulary is based on the following nine roles identified in https://hal.archives-ouvertes.fr/hal-02135891v2/document :

  • Design
  • Debugging
  • Maintenance
  • Coding
  • Architecture
  • Documentation
  • Testing
  • Support
  • Management

We need to keep the same terms, and refer to the article for the detailed explanation.
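If the terms are kept exactly as in the article, a tool could check role names against that fixed list. The following Python sketch assumes the Role/roleName layout from the example earlier in this thread; the names `PAPER_ROLES` and `unknown_role_names` are hypothetical and not part of CodeMeta.

```python
# The nine role terms listed above, as identified in the article.
PAPER_ROLES = frozenset({
    "Design", "Debugging", "Maintenance", "Coding", "Architecture",
    "Documentation", "Testing", "Support", "Management",
})


def unknown_role_names(document):
    """Return the roleName values of Role entries that are not in PAPER_ROLES."""
    return [
        entry.get("roleName")
        for entry in document.get("author", [])
        if entry.get("@type") == "Role"
        and entry.get("roleName") not in PAPER_ROLES
    ]


sample = {
    "author": [
        {"@type": "Role", "roleName": "Coding"},
        {"@type": "Role", "roleName": "coder"},  # rewritten term, not in the list
    ]
}
unknown = unknown_role_names(sample)
print(unknown)
```

The comparison is deliberately exact and case-sensitive, so variants such as "coder" or "developer" are reported rather than silently accepted.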

Here is the link to the open issue on the CodeMeta repository:
https://github.com/codemeta/codemeta/issues/240

moranegg updated the task description. Apr 3 2020, 4:36 PM
vlorentz renamed this task from "Extend CodeMeta vocabulary" to "Extend CodeMeta vocabulary to qualify author relationships". Apr 6 2020, 2:56 PM