Consider making SWHID handling case insensitive
Closed, Migrated

Authored By

	rdicosmo
	Apr 29 2021, 12:02 PM

Description

Some of our (growing number of) users have raised an interesting issue: we should consider making SWHID handling case insensitive.

Here is an excerpt of a message from Mohammad Akhlaghi that explains the use case.

it would be good if the resolvers also interpret all-caps SWHIDs like:
SWH:1:CNT:66C1D53B2860A40AA9D350048F6B02C73C3B46C8
[...] the issue is that some LaTeX packages or web services automatically set everything to all caps, for example, the header on the top of the PDF pages of https://gitlab.com/makhlaghi/maneage-paper-pdf/-/raw/master/paper.pdf (that contain the DOI, arXiv or Zenodo links and uses '\markboth'). If you click on the arXiv link at the header of the PDF above, it won't work because while the 'arxiv.org' part is not case-sensitive, the 'abs/' part is. I have sent an email to the arXiv maintainers about this. But it works for the DOI links (I guess the 'doi.org' server is not-case-sensitive).

Should we consider changing the current SWHID specification, which mandates lowercase letters everywhere in a core SWHID, or simply lowercase core SWHIDs during resolution?

Revisions and Commits

rDWAPPS Web applications
	D5655	rDWAPPS619576342307 assets/webapp-utils: Add lowercase validator for core SWHIDs
	D5649	rDWAPPSfde7413968ad identifiers: Add support for resolving core SWHID with uppercase chars
rDMOD Data model
	D5654	rDMODdf036ef1c3d1 docs/persistent-identifiers: Add guidelines for fixing invalid SWHIDs.

Related Objects

Mentioned In: D5654: docs/persistent-identifiers: Add guidelines for fixing invalid SWHIDs (this time for uppercase)

Event Timeline

rdicosmo triaged this task as Normal priority. Apr 29 2021, 12:02 PM

rdicosmo created this task.

rdicosmo updated the task description.

Ah, this is an interesting practical problem.
I'm not a fan of changing the spec of SWHID version 1 to make them case insensitive, as it seems to be a significant change (in particular for the code that checks for the syntactic correctness of IDs).
But we can totally add a "SHOULD" section to the resolvers part of the spec recommending (but not mandating) that resolvers treat core SWHIDs as case insensitive. (Of course all the contextual parts cannot be considered case insensitive.)

This is going to be an interesting challenge/trade-off for SWHIDv2, because there I was considering using encodings more compact than hex, such as base58, to shorten SWHIDs; but those are case-sensitive in order to be denser.

So, as a counterargument to the "SHOULD" idea above, we need to be careful about promoting a practice now that might change when switching from SWHIDv1 to SWHIDv2.
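
To make the trade-off concrete, here is a minimal sketch of why case folding is harmless for hex but lossy for a denser encoding. It assumes the Bitcoin-style base58 alphabet; the helper names `b58encode_int` and `b58decode_int` are hypothetical, and nothing here reflects an actual SWHIDv2 encoding choice.

```python
# Bitcoin-style base58 alphabet (an assumption for illustration only).
# It contains both "A" and "a" as distinct digits, which is what makes it dense.
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode_int(n: int) -> str:
    """Encode a non-negative integer as a base58 string."""
    if n == 0:
        return B58[0]
    digits = []
    while n:
        n, r = divmod(n, 58)
        digits.append(B58[r])
    return "".join(reversed(digits))


def b58decode_int(s: str) -> int:
    """Decode a base58 string back to an integer."""
    n = 0
    for ch in s:
        n = n * 58 + B58.index(ch)
    return n


# Hex: upper and lower case denote the same value, so lowercasing is safe.
hex_is_case_insensitive = int("66C1D53B", 16) == int("66c1d53b", 16)

# Base58: "A" and "a" are different digits, so changing case changes the value.
b58_is_case_insensitive = b58decode_int("A") == b58decode_int("a")
```

In this sketch `hex_is_case_insensitive` is true and `b58_is_case_insensitive` is false: an all-caps rendering of a base58 identifier could not be mapped back to the original.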

In T3298#64426, @zack wrote:

This is going to be an interesting challenge/trade-off for SWHIDv2, because there I was considering using encodings more compact than hex, such as base58, to shorten SWHIDs; but those are case-sensitive in order to be denser.

So, as a counterargument to the "SHOULD" idea above, we need to be careful about promoting a practice now that might change when switching from SWHIDv1 to SWHIDv2.

Agreed, and nice to see this coming in just in time for the SWHIDv2 discussion :-)

vlorentz added a project: Data Model. Apr 29 2021, 12:28 PM

I'm not a fan of changing the spec of SWHID version 1 to make them case insensitive, as it seems to be a significant change (in particular for the code that checks for the syntactic correctness of IDs).

So for SWHID v1, the resolver should turn the core part into lowercase, am I right?

In T3298#64431, @anlambert wrote:

So for SWHID v1, the resolver should turn the core part into lowercase, am I right?

For SWHIDv1, this seems to be the consensus indeed.
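
The resolver-side behaviour agreed on above could be sketched roughly as follows. This is an illustrative sketch, not the code from D5649: the function name `normalize_swhid` and the regular expression are assumptions, based on the SWHIDv1 syntax in which qualifiers follow the core identifier after a `;`. Only the core part is lowercased; qualifiers are left untouched because they may be case-sensitive.

```python
import re

# Core SWHIDv1 syntax: swh:1:<object type>:<40 lowercase hex digits>.
_CORE_RE = re.compile(r"swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}")


def normalize_swhid(swhid: str) -> str:
    """Lowercase the core part of a SWHID, leaving any qualifiers as-is.

    Hypothetical helper for illustration; raises ValueError if the
    lowercased core part is not a syntactically valid core SWHIDv1.
    """
    # Qualifiers (origin, path, lines, ...) start at the first ";".
    core, sep, qualifiers = swhid.partition(";")
    core = core.lower()
    if _CORE_RE.fullmatch(core) is None:
        raise ValueError(f"invalid core SWHID: {core!r}")
    return core + sep + qualifiers


# The all-caps example from the task description.
resolved = normalize_swhid("SWH:1:CNT:66C1D53B2860A40AA9D350048F6B02C73C3B46C8")
```

Keeping the strict lowercase check after normalization means the specification itself stays unchanged: only the resolver is lenient about what it accepts.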

anlambert added a revision: D5649: identifiers: Add support for resolving core SWHID with uppercase chars. Apr 29 2021, 5:41 PM

anlambert added a commit: rDWAPPSfde7413968ad: identifiers: Add support for resolving core SWHID with uppercase chars. Apr 30 2021, 11:32 AM

vlorentz mentioned this in D5654: docs/persistent-identifiers: Add guidelines for fixing invalid SWHIDs (this time for uppercase). Apr 30 2021, 12:57 PM

vlorentz added a revision: D5654: docs/persistent-identifiers: Add guidelines for fixing invalid SWHIDs (this time for uppercase). Apr 30 2021, 12:57 PM

anlambert added a revision: D5655: assets/webapp-utils: Add lowercase validator for core SWHIDs. Apr 30 2021, 2:43 PM

anlambert added a commit: rDWAPPS619576342307: assets/webapp-utils: Add lowercase validator for core SWHIDs. Apr 30 2021, 5:25 PM

vlorentz added a commit: rDMODdf036ef1c3d1: docs/persistent-identifiers: Add guidelines for fixing invalid SWHIDs. Apr 30 2021, 6:26 PM

This task has been migrated to GitLab.
