
Check why origin is created for IPOL with uuid
Closed, Resolved (Public)

Description

The url is sent over in the metadata with the <codemeta:url> property (see T2369), but the origin that is created uses a uuid.
A uuid-based origin is meant for the scenario where a client doesn't have a url attached to the deposited artifact.

Here, the url should be used for the origin.
This might be because the url is sent in <codemeta:url> instead of <url>.

If this is the reason, we should check this is the behavior we aim for and fix the documentation.

Event Timeline

moranegg triaged this task as High priority. Apr 23 2020, 4:30 PM
moranegg created this task.
ardumont added a subscriber: ardumont. Edited Apr 23 2020, 5:17 PM

I think you'll probably need the user's input command as well to investigate
this. The deposit client can generate a uuid to use as external identifier (if none
is provided).

And that might be the origin of the noise here (hence why i was reluctant about that
uuid generation in the first place ¯\_(ツ)_/¯). Hopefully, i'm wrong and that's not
the issue.

Here is the command:

swh deposit upload --username ipol --password "mypassword" \
                   --archive mlheIPOL.tgz \
                   --collection 'ipol' \
                   --metadata metadata_test1.xml

In https://forge.softwareheritage.org/source/swh-deposit/browse/master/swh/deposit/cli/client.py$187,
a slug is generated (if it is not given as a parameter) before the client checks whether a metadata file exists.
Normally the slug is the external-identifier, which should be passed in the metadata file.

I'm not sure, but maybe the origin is created with the concatenation of the client-url and slug, instead of the actual url in the metadata.

Continuing my investigation...

The slug must be passed as a parameter.
The origin is created from the concatenation of provider_url + slug.
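
A minimal sketch of the behavior described above, assuming the origin is a plain string concatenation of provider_url and slug, and that a uuid is used when no slug is given. The function name origin_url and the example.org url are hypothetical; this is not the swh-deposit code.

```python
import uuid
from typing import Optional


def origin_url(provider_url: str, slug: Optional[str] = None) -> str:
    """Hypothetical helper: build the origin as provider_url + slug."""
    if slug is None:
        # Assumption: mirrors the client generating a uuid
        # when no slug is provided.
        slug = str(uuid.uuid4())
    # Plain concatenation: provider_url is expected to carry
    # its own trailing separator.
    return provider_url + slug


# With an explicit slug, the origin is predictable.
print(origin_url("https://example.org/", "my-artifact"))
# Without a slug, the origin ends with a random uuid.
print(origin_url("https://example.org/"))
```

Under these assumptions, a url sent only in the metadata has no effect on the origin, which matches what was observed for IPOL.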

See P662 for client configuration.

Users of the deposit API command line client must add the --slug command line option to avoid the creation of a uuid.
For IPOL, we agreed on the following:
SWH configuration side: provider_url = https://doi.org/10.5201/
Command line side: slug = the rest of the url, for example ipol.2018.236
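
As an illustration, the upload command quoted earlier in this task can be extended with the --slug option. The sketch below only assembles the command string; the helper name build_upload_command is hypothetical, and the credentials and file names are the placeholder values from the command above.

```python
import shlex


def build_upload_command(slug: str) -> str:
    """Hypothetical helper: assemble the upload command with an explicit slug."""
    args = [
        "swh", "deposit", "upload",
        "--username", "ipol",
        "--password", "mypassword",
        "--archive", "mlheIPOL.tgz",
        "--collection", "ipol",
        "--metadata", "metadata_test1.xml",
        # Passing the slug explicitly avoids the generated uuid.
        "--slug", slug,
    ]
    # shlex.join quotes each argument for safe use in a shell.
    return shlex.join(args)


print(build_upload_command("ipol.2018.236"))
```

With provider_url = https://doi.org/10.5201/ and this slug, the concatenation gives https://doi.org/10.5201/ipol.2018.236.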

rdicosmo closed this task as Resolved. May 4 2020, 6:03 PM

The issue has been clarified: no code change is needed, only clarifications in the documentation, with examples.

ardumont moved this task from Backlog to Archived on the SWORD deposit board. Tue, Nov 3, 4:04 PM