Implement SWHID validation in frontend
Closed, Migrated

Description

Sending a malformed SWHID to the resolver implemented in the webapp results in parsing errors.

We should validate the SWHID format in every form of the webapp that takes such input (currently the search forms)
and report any format errors detected in the Web UI.

To do so, we must implement SWHID validation in JavaScript, in a manner similar to the Python code
available in the identifiers module of swh-model.

This code could later be packaged as a small npm module so that external projects can use it.
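To make the proposal concrete, here is a minimal sketch of what such a client-side check could look like, written in TypeScript. It is not the swh-web or swh-model code: the function name `validateSwhid` and the result shape are hypothetical, and the accepted object types and qualifier names are assumptions taken from the published SWHID syntax (`swh:1:<type>:<40 hex digits>` followed by optional `;key=value` qualifiers). Only the `lines` qualifier value is checked here; the other qualifier values are accepted as-is.

```typescript
// Hypothetical sketch of a client-side SWHID format check.
// Assumption: object types and qualifier names follow the published SWHID syntax.
type SwhidCheck = { valid: boolean; error?: string };

const OBJECT_TYPES = ["cnt", "dir", "rev", "rel", "snp"];
const KNOWN_QUALIFIERS = ["origin", "visit", "anchor", "path", "lines"];

function validateSwhid(input: string): SwhidCheck {
  // The core identifier comes first; qualifiers follow, separated by ";".
  const [core, ...qualifiers] = input.split(";");
  const parts = core.split(":");
  if (parts.length !== 4) {
    return { valid: false, error: "core identifier must have 4 colon-separated fields" };
  }
  const [scheme, version, objectType, objectId] = parts;
  if (scheme !== "swh") {
    return { valid: false, error: `invalid scheme "${scheme}", expected "swh"` };
  }
  if (version !== "1") {
    return { valid: false, error: `unsupported scheme version "${version}"` };
  }
  if (!OBJECT_TYPES.includes(objectType)) {
    return { valid: false, error: `invalid object type "${objectType}"` };
  }
  if (!/^[0-9a-f]{40}$/.test(objectId)) {
    return { valid: false, error: "object id must be 40 lowercase hexadecimal digits" };
  }
  for (const qualifier of qualifiers) {
    const eq = qualifier.indexOf("=");
    if (eq <= 0) {
      return { valid: false, error: `malformed qualifier "${qualifier}"` };
    }
    const key = qualifier.slice(0, eq);
    const value = qualifier.slice(eq + 1);
    if (!KNOWN_QUALIFIERS.includes(key)) {
      return { valid: false, error: `unknown qualifier "${key}"` };
    }
    // Only "lines" is value-checked in this sketch: a line number or a range.
    if (key === "lines" && !/^\d+(-\d+)?$/.test(value)) {
      return { valid: false, error: `invalid "lines" qualifier value "${value}"` };
    }
  }
  return { valid: true };
}
```

A search form could call `validateSwhid` on input and display `error` when `valid` is false. Returning a message per failed field, rather than a bare boolean, is what lets the UI say which part of the SWHID is wrong.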

Event Timeline

anlambert triaged this task as Normal priority. Apr 13 2021, 11:12 AM
anlambert created this task.

I wonder if this is not overkill: SWHID may evolve in the future, and maintaining two implementations (one of them in JS!) may be a source of headaches down the line.
A simple "sanitization" phase in the frontend catching the most common issues (trailing slashes, leading or trailing tabs or spaces, etc.) would probably be enough for our purpose.
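For illustration, the "sanitization" idea could be as small as the following TypeScript sketch. The function name `sanitizeSwhidInput` is hypothetical, and the set of cleaned-up issues is limited to the ones listed above: leading and trailing whitespace, and trailing slashes.

```typescript
// Hypothetical sketch: strip the most common copy/paste artifacts
// from a SWHID typed or pasted into a search form.
function sanitizeSwhidInput(raw: string): string {
  return raw
    .replace(/^\s+/, "")      // leading spaces, tabs, newlines
    .replace(/[\s/]+$/, "");  // trailing whitespace and trailing slashes
}
```

One caveat with this approach: a trailing slash that legitimately belongs to an `origin` URL in last position would also be removed, silently changing the identifier instead of reporting a problem to the user.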

Another solution would be to try to resolve a SWHID using the Web API when such input is entered or pasted in search forms.

If there is an error, report it in the Web UI in a similar manner as for the save code now request.
This has the advantage of relying on the Python implementation, which avoids re-implementing the validation in JS.

Better error messages regarding which part of a SWHID failed validation should be returned by the API, though.
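A rough TypeScript sketch of this approach follows. Several things in it are assumptions rather than facts from this task: the function name `checkSwhidWithApi` is hypothetical, the endpoint is assumed to be `/api/1/resolve/<swhid>/` on the archive's Web API, and the error response is assumed to be JSON carrying a human-readable `reason` field. The fetch function is passed in as a parameter so the logic can be exercised without network access.

```typescript
// Hypothetical sketch: delegate SWHID validation to the Web API.
// Assumptions: the resolve endpoint path and the "reason" field of error responses.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<any>;
}>;

// Resolves to null when the SWHID resolves, or to an error message to display.
async function checkSwhidWithApi(
  swhid: string,
  fetchFn: FetchLike,
  baseUrl: string = "https://archive.softwareheritage.org"
): Promise<string | null> {
  // The SWHID is placed in the URL path as-is (no extra encoding in this sketch).
  const response = await fetchFn(`${baseUrl}/api/1/resolve/${swhid}/`);
  if (response.ok) {
    return null;
  }
  // Fall back to a generic message if the body is not JSON or has no reason.
  const body = await response.json().catch(() => ({}));
  return body.reason ?? `SWHID resolution failed (HTTP ${response.status})`;
}
```

In a browser, the global `fetch` could be passed as `fetchFn`, and the returned message shown next to the search input. The quality of what the user sees then depends entirely on the `reason` text produced server-side.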

Ok, this is converging with the discussion in T3234: we fully agree that having proper errors reported to the user is the way to go, so let's forget about the "sanitization" approach.

Better error messages regarding which part of a SWHID failed validation should be returned by the API, though.

This would be a definite plus in all cases: how difficult would it be to report proper errors?

It turns out some errors were not properly reported due to small bugs in swh-model. I fixed those and updated the tests in D5492.

SWHIDs are now validated in each search input in production.

Give it a try by copying and pasting this erroneous SWHID into a search input:
swh:1:cnt:bb0faf6919fc60636b2696f32ec9b3c2adb247fe;origin=https://github.com/id-Software/Quake-III-Arena;lines=549-572/