api/metadata: Fix issues detected with hypothesis
ClosedPublic

Authored by anlambert on Aug 19 2021, 12:12 PM.

Details

Summary

Running the metadata tests with multiple Hypothesis examples uncovered the
following issues in the api-1-raw-extrinsic-metadata-swhid Web API view:

  • ExtendedSWHID.from_string must be used to parse an extended SWHID.
  • The link-next URL for pagination was invalid.
  • next_page_token must be encoded to bytes before being passed to urlsafe_b64encode.
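
The last two points can be illustrated with a minimal, standard-library-only sketch. It is not the code from swh/web/api/views/metadata.py: the helper names, the `page_token` query parameter, and the example URL are assumptions made for illustration. The only library fact relied on is that `base64.urlsafe_b64encode` accepts bytes-like input and rejects `str`.

```python
import base64
from urllib.parse import parse_qs, urlencode, urlsplit


def encode_page_token(token: str) -> str:
    """Return a URL-safe text form of a pagination token (hypothetical helper)."""
    # urlsafe_b64encode only accepts bytes-like objects, so the str token is
    # encoded first; the resulting base64 bytes are decoded back to str so
    # they can be placed in a URL.
    return base64.urlsafe_b64encode(token.encode("utf-8")).decode("ascii")


def decode_page_token(encoded: str) -> str:
    """Inverse of encode_page_token (hypothetical helper)."""
    return base64.urlsafe_b64decode(encoded.encode("ascii")).decode("utf-8")


def build_link_next(base_url: str, params: dict, next_page_token: str) -> str:
    """Build a next-page URL; the 'page_token' parameter name is an assumption."""
    query = dict(params, page_token=encode_page_token(next_page_token))
    # urlencode percent-escapes every value, including base64 '=' padding.
    return f"{base_url}?{urlencode(query)}"
```

Passing a `str` directly, as in `base64.urlsafe_b64encode("token")`, raises `TypeError`, which is the kind of failure that only shows up once a test actually exercises pagination.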

Depends on D6115

Diff Detail

Repository
rDWAPPS Web applications
Branch
api-metadata-fix
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 23085
Build 36000: Phabricator diff pipeline on jenkins
Build 35999: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D6116 (id=22121)

Could not rebase; Attempt merge onto 885be5dde5...

Updating 885be5dd..bb360104
Fast-forward
 swh/web/api/views/metadata.py             |  12 +-
 swh/web/tests/api/views/test_metadata.py  | 194 ++++++++--------
 swh/web/tests/api/views/test_origin.py    | 355 ++++++++++++++++--------------
 swh/web/tests/browse/views/test_origin.py |   2 -
 swh/web/tests/common/test_archive.py      |  33 +--
 swh/web/tests/common/test_identifiers.py  |   9 +-
 swh/web/tests/conftest.py                 |  42 +++-
 swh/web/tests/strategies.py               |  18 +-
 8 files changed, 371 insertions(+), 294 deletions(-)
Changes applied before test
commit bb36010483a2798b56d224e268256f5639bed9db
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Aug 19 12:04:12 2021 +0200

    api/metadata: Fix issues detected with hypothesis
    
    Running metadata tests with multiple hypothesis examples uncovered the
    following issues in the api-1-raw-extrinsic-metadata-swhid Web API view:
    
      - ExtendedSWHID.from_string must be used to parse extended SWHID.
    
      - link-next URL for pagination was invalid.
    
      - next_page_token must be encoded before providing it to urlsafe_b64encode.

commit fc086320d520c5f6ecc4411ac5fd4b2b71bcbdad
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Aug 19 12:00:40 2021 +0200

    tests: Ensure they all can be run with multiple hypothesis examples
    
    When running swh-web tests using 'make test-full', multiple hypothesis examples
    are provided as test inputs instead of a single one when running 'make test'.
    In that case some tests were failing mostly due to the fact they were not stateless
    between test runs.
    
    This commit fixes the execution of those tests and ensures stateless test runs by:
    
      - turning some hypothesis strategies into stateless ones
    
      - turning the archive_data fixture into a function scope one
    
      - using subtest fixture from pytest-subtesthack when it is required to reset
        the archive_data fixture for a test between hypothesis example runs
    
    As a consequence, tests will take longer to execute, as global state will be
    reset between each test.
    
    Nevertheless, metadata-related tests still fail when running with multiple
    hypothesis examples; the fix will be handled in the next commit.
    
    Related to T1695

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1008/ for more details.

Oh, I missed D6081. I guess we both encountered the same kind of issues, but the fixes are not exactly the same.

We don't want to allow extended SWHIDs in the public API; you should restrict the data generated by hypothesis instead.

Build is green

Patch application report for D6116 (id=22123)

Could not rebase; Attempt merge onto 885be5dde5...

Updating 885be5dd..a50e0c5a
Fast-forward
 swh/web/api/views/metadata.py             |  12 +-
 swh/web/tests/api/views/test_metadata.py  | 194 ++++++++--------
 swh/web/tests/api/views/test_origin.py    | 355 ++++++++++++++++--------------
 swh/web/tests/browse/views/test_origin.py |   2 -
 swh/web/tests/common/test_archive.py      |  33 +--
 swh/web/tests/common/test_identifiers.py  |   9 +-
 swh/web/tests/conftest.py                 |  42 +++-
 swh/web/tests/strategies.py               |  18 +-
 8 files changed, 371 insertions(+), 294 deletions(-)
Changes applied before test
commit a50e0c5aa23b40daecdec9a86053ff8b1d919273
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Aug 19 12:04:12 2021 +0200

    api/metadata: Fix issues detected with hypothesis
    
    Running metadata tests with multiple hypothesis examples uncovered the
    following issues in the api-1-raw-extrinsic-metadata-swhid Web API view:
    
      - ExtendedSWHID.from_string must be used to parse extended SWHID.
    
      - link-next URL for pagination was invalid.
    
      - next_page_token must be encoded before providing it to urlsafe_b64encode.

commit 87cc9e042dc2e3cc79c63c25b0bd5f04484693b4
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Aug 19 12:00:40 2021 +0200

    tests: Ensure they all can be run with multiple hypothesis examples
    
    When running swh-web tests using 'make test-full', multiple hypothesis examples
    are provided as test inputs instead of a single one when running 'make test'.
    In that case some tests were failing mostly due to the fact they were not stateless
    between test runs.
    
    This commit fixes the execution of those tests and ensures stateless test runs by:
    
      - turning some hypothesis strategies into stateless ones
    
      - turning the archive_data fixture into a function scope one
    
      - using subtest fixture from pytest-subtesthack when it is required to reset
        the archive_data fixture for a test between hypothesis example runs
    
    As a consequence, tests will take longer to execute, as global state will be
    reset between each test.
    
    Nevertheless, metadata-related tests still fail when running with multiple
    hypothesis examples; the fix will be handled in the next commit.
    
    Related to T1695

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1010/ for more details.

We don't want to allow extended SWHIDs in the public API; you should restrict the data generated by hypothesis instead.

Ack, I think I will create a new strategy ensuring target object types are not extended ones.

Update: Provide RawExtrinsicMetadata targeting core SWHIDs as test inputs and revert related changes.
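
The idea of restricting test inputs to core SWHIDs can be sketched without any dependency as follows. This is a plain-Python stand-in, not the Hypothesis strategy added in swh/web/tests/strategies.py; all function names are hypothetical. It assumes the core object types are cnt, dir, rev, rel and snp, with ori and emd existing only in extended SWHIDs. Drawing the object type from the core set up front avoids having to filter out unwanted examples afterwards.

```python
import random

# Object types allowed in core SWHIDs; "ori" and "emd" are extended-only.
CORE_TYPES = ("cnt", "dir", "rev", "rel", "snp")
HEX_DIGITS = "0123456789abcdef"


def is_core_swhid(swhid: str) -> bool:
    """Check the swh:1:<type>:<40 hex digits> shape with a core object type."""
    # Drop any qualifiers (";origin=..." and the like) before checking.
    parts = swhid.split(";")[0].split(":")
    return (
        len(parts) == 4
        and parts[0] == "swh"
        and parts[1] == "1"
        and parts[2] in CORE_TYPES
        and len(parts[3]) == 40
        and all(c in HEX_DIGITS for c in parts[3])
    )


def random_core_swhid(rng: random.Random) -> str:
    """Generate a syntactically valid core SWHID (hypothetical generator)."""
    object_id = "".join(rng.choice(HEX_DIGITS) for _ in range(40))
    # The object type is drawn from CORE_TYPES only, so extended types
    # can never appear in the generated inputs.
    return f"swh:1:{rng.choice(CORE_TYPES)}:{object_id}"


def core_swhid_examples(rng: random.Random, count: int) -> list:
    """Return `count` generated core SWHIDs to use as test targets."""
    return [random_core_swhid(rng) for _ in range(count)]
```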

Build is green

Patch application report for D6116 (id=22128)

Could not rebase; Attempt merge onto 885be5dde5...

Updating 885be5dd..869b6077
Fast-forward
 swh/web/api/views/metadata.py             |   8 +-
 swh/web/tests/api/views/test_metadata.py  | 216 ++++++++++--------
 swh/web/tests/api/views/test_origin.py    | 355 ++++++++++++++++--------------
 swh/web/tests/browse/views/test_origin.py |   2 -
 swh/web/tests/common/test_archive.py      |  33 +--
 swh/web/tests/common/test_identifiers.py  |   9 +-
 swh/web/tests/conftest.py                 |  42 +++-
 swh/web/tests/strategies.py               |  18 +-
 8 files changed, 390 insertions(+), 293 deletions(-)
Changes applied before test
commit 869b60771f2e3157d77de4f15f2af81e895c80e9
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Aug 19 12:04:12 2021 +0200

    api/metadata: Fix issues detected with hypothesis
    
    Running metadata tests with multiple hypothesis examples uncovered the
    following issues in the api-1-raw-extrinsic-metadata-swhid Web API view:
    
      - Only RawExtrinsicMetadata targeting core SWHIDs must be provided as test inputs.
    
      - link-next URL for pagination was invalid.
    
      - next_page_token must be encoded before providing it to urlsafe_b64encode.

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1012/ for more details.

This revision is now accepted and ready to land.
Aug 20 2021, 11:39 AM

Build is green

Patch application report for D6116 (id=22135)

Rebasing onto e18d30e5bc...

Current branch diff-target is up to date.
Changes applied before test
commit c7548f93a171667d7ac7ea84ecbaceaa2f54789b
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Thu Aug 19 12:04:12 2021 +0200

    api/metadata: Fix issues detected with hypothesis
    
    Running metadata tests with multiple hypothesis examples uncovered the
    following issues in the api-1-raw-extrinsic-metadata-swhid Web API view:
    
      - Only RawExtrinsicMetadata targeting core SWHIDs must be provided as test inputs.
    
      - link-next URL for pagination was invalid.
    
      - next_page_token must be encoded before providing it to urlsafe_b64encode.

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1014/ for more details.