Implement T1241 in documentation of PIDs
Details
Diff Detail
- Repository
- rDMOD Data model
- Branch
- pid-extension
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 11446 Build 17344: Phabricator diff pipeline on jenkins Jenkins console · Jenkins Build 17343: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D2924 (id=10362)
Rebasing onto 4a2233c5f7...
Current branch diff-target is up to date.
Changes applied before test
commit 0767c811c859a798a7cfaa4247c6b2b15cc7e8aa Author: Roberto Di Cosmo <roberto@dicosmo.org> Date: Sat Mar 28 15:16:04 2020 +0100 Extend SWH PID definition with additional context qualifiers.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/2/ for more details.
docs/persistent-identifiers.rst | ||
---|---|---|
138–141 | I think it would be nice to mention that these qualifiers are extrinsic and only contextualize one occurence of the object, even if that's redundant with the formal definition | |
166–168 | path-absolute = "/" [ segment-nz *( "/" segment ) ] segment = *pchar segment-nz = 1*pchar pchar = unreserved / pct-encoded / sub-delims / ":" / "@" sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "=" So the grammar is ambiguous because both ; and = are allowed in path_absolute. A possible fix is to require them to be percent-encoded. | |
180–215 | Object don't have to be ingested by SWH for their swh-id to be valid | |
181–182 | I'm really confused by this. Is it the PID of a snapshot? | |
185–188 | I'd change the punctuation to make this more readable: * **path** : the *absolute file path* from the *root directory* associated to the *anchor node*, to the object. When the anchor denotes a directory or a revision, and almost always when it's a release, the root directory is uniquely determined. When the anchor denotes a snapshot, the root directory is considered to be the one associated to the main branch of that snapshot; And you must define what the "main branch" is. You should also mention what to do if the anchor resolves to a snapshot doesn't have any branch that fits your definition of main branch. |
Build is green
Patch application report for D2924 (id=10368)
Rebasing onto 4a2233c5f7...
Current branch diff-target is up to date.
Changes applied before test
commit d14883e8d24efcf0c43e1037bb19179d3cbfb57b Author: Roberto Di Cosmo <roberto@dicosmo.org> Date: Sat Mar 28 18:22:11 2020 +0100 Clarify ambiguities in PID extensions commit 0767c811c859a798a7cfaa4247c6b2b15cc7e8aa Author: Roberto Di Cosmo <roberto@dicosmo.org> Date: Sat Mar 28 15:16:04 2020 +0100 Extend SWH PID definition with additional context qualifiers.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/3/ for more details.
You could add this example of a path that need extra percent escapes:
swh:1:cnt:f10371aa7b8ccabca8479196d6cd640676fd4a04?origin=https://github.com/web-platform-tests/wpt;anchor=swh:1:rev:259d0612af038d14f2cd889a14a3adb6c9e96d96;path=/html/semantics/document-metadata/the-meta-element/pragma-directives/attr-meta-http-equiv-refresh/support/x%3Burl=foo/
(It's almost the only file/dir I could find in a real-life repository with a ;...)
Unrelatedly, could you add a (non-normative) recommendation that qualifiers should be in the same order as the spec (unlike the current example), so they are ordered from the most generic (origin) to the most specific (line)? I doesn't change anything to how they are parsed, but it makes them more readable imo
docs/persistent-identifiers.rst | ||
---|---|---|
166–168 | So now this doesn't match the examples, as the definition requires all characters to be percent-encoded. You should redefine <path_absolute_encoded> as <path_absolute_escaped>, which is <path_absolute> where all ; must be percent-encoded (as %3B) | |
181–182 | I think it's confusing to call it visit if it's a snapshot PID. If it's for the lack of visit identifiers, maybe we should fix that issue first. | |
185–188 | I think "the one pointed to by the HEAD (possibly indirectly)" should be enough. HEAD might be a branch by itself |
Build is green
Patch application report for D2924 (id=10379)
Rebasing onto 4a2233c5f7...
Current branch diff-target is up to date.
Changes applied before test
commit accca603c42ad68252532222ca6467a19691524e Author: Roberto Di Cosmo <roberto@dicosmo.org> Date: Mon Mar 30 14:13:59 2020 +0200 Typo commit b6e92eae7a798dcb203ce786e984fd96a6f7ee07 Author: Roberto Di Cosmo <roberto@dicosmo.org> Date: Mon Mar 30 14:11:55 2020 +0200 Further clarifications in the PID extension commit d14883e8d24efcf0c43e1037bb19179d3cbfb57b Author: Roberto Di Cosmo <roberto@dicosmo.org> Date: Sat Mar 28 18:22:11 2020 +0100 Clarify ambiguities in PID extensions commit 0767c811c859a798a7cfaa4247c6b2b15cc7e8aa Author: Roberto Di Cosmo <roberto@dicosmo.org> Date: Sat Mar 28 15:16:04 2020 +0100 Extend SWH PID definition with additional context qualifiers.
See https://jenkins.softwareheritage.org/job/DMOD/job/tests-on-diff/13/ for more details.