Page MenuHomeSoftware Heritage

Use string equality instead of substring search to check for mandatory fields.
ClosedPublic

Authored by vlorentz on Dec 10 2020, 12:14 PM.

Details

Summary

eg. 'atom:authorblahblah' should not be accepted when we expect 'atom:author'

Issue found while working on T2871.

Diff Detail

Repository
rDDEP Push deposit
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 17862
Build 27599: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 27598: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D4714 (id=16695)

Could not rebase; Attempt merge onto 5ed8942e84...

Updating 5ed8942e..3418f16d
Fast-forward
 swh/deposit/api/checks.py                          | 32 ++++-----------
 swh/deposit/tests/api/test_checks.py               | 48 ++++++++++++++++++++--
 .../tests/api/test_deposit_private_check.py        | 14 ++++---
 swh/deposit/tests/api/test_deposit_update.py       | 10 +++--
 4 files changed, 66 insertions(+), 38 deletions(-)
Changes applied before test
commit 3418f16d22ffb0834d7fa9546d2a0a7fe863102a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Dec 10 12:13:21 2020 +0100

    Use string equality instead of substring search to check for mandatory fields.
    
    eg. 'atom:authorblahblah' should not be accepted when we expect 'atom:author'

commit 9e79057cc2f7d88f01e3bc2220ccd88ca26f565b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Dec 10 12:09:40 2020 +0100

    Accept <codemeta:name> and <codemeta:author> as alternatives to <atom:name>/<atom:title> and <atom:author>.
    
    This was broken by a8e86a92eeeba4ed5114ff2ef4096ba25213d98a,
    as the check_metadata() checks whether there is a tag that *contains*
    the expected name, checking for 'author' (the old name of 'atom:author')
    accidentally matched 'codemeta:author' as well.
    
    This resulted in the right behavior in the majority of the cases
    (accepted 'codemeta:author'), but for the wrong reason, and explicitly
    renaming to atom:author broke this.
    
    Ditto for name.
    
    A future commit will remove the substring matching to remove
    false positives (eg. 'atom:authorblahblah' should not be accepted
    as 'atom:author')

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/475/ for more details.

This revision is now accepted and ready to land.Dec 10 2020, 12:18 PM
This revision was landed with ongoing or failed builds.Dec 10 2020, 1:52 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D4714 (id=16697)

Could not rebase; Attempt merge onto 5ed8942e84...

Updating 5ed8942e..c436adcf
Fast-forward
 swh/deposit/api/checks.py                          | 32 ++++-----------
 swh/deposit/tests/api/test_checks.py               | 48 ++++++++++++++++++++--
 .../tests/api/test_deposit_private_check.py        | 14 ++++---
 swh/deposit/tests/api/test_deposit_update.py       |  9 ++--
 4 files changed, 65 insertions(+), 38 deletions(-)
Changes applied before test
commit c436adcf2b49694f640efc71debcf921ee5f06bd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Dec 10 12:13:21 2020 +0100

    Use string equality instead of substring search to check for mandatory fields.
    
    eg. 'atom:authorblahblah' should not be accepted when we expect 'atom:author'

commit 00795b41d6ab75a2d9d79c61eaea9ca70e0617ce
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Dec 10 12:09:40 2020 +0100

    Accept <codemeta:name> and <codemeta:author> as alternatives to <atom:name>/<atom:title> and <atom:author>.
    
    This was broken by a8e86a92eeeba4ed5114ff2ef4096ba25213d98a,
    as the check_metadata() checks whether there is a tag that *contains*
    the expected name, checking for 'author' (the old name of 'atom:author')
    accidentally matched 'codemeta:author' as well.
    
    This resulted in the right behavior in the majority of the cases
    (accepted 'codemeta:author'), but for the wrong reason, and explicitly
    renaming to atom:author broke this.
    
    Ditto for name.
    
    A future commit will remove the substring matching to remove
    false positives (eg. 'atom:authorblahblah' should not be accepted
    as 'atom:author')

See https://jenkins.softwareheritage.org/job/DDEP/job/tests-on-diff/477/ for more details.