Page MenuHomeSoftware Heritage

Type annotations in metadata mappings
Needs RevisionPublic

Authored by lucifer4j on Mar 13 2022, 9:42 AM.

Details

Summary

Add type annotations to metadata mappings (T2259)

This patch adds type annotations to metadata mappings so that the mypy
type checker can detect bugs in code. While working on the path, I also
encountered these issues:

  1. mypy raises error on detecting for implicit None returns, so I had to add return None at many places.
  1. At a few places, @property s' are accessed using cls. mypy rightfully raises an error, so I had to add ignore to silence it for now. It appears Python 3.9+ supports class level properties so the ignores could be replaced by the proper handling when the minimum python version is raised.

Closes T2259.

Diff Detail

Repository
rDCIDX Metadata indexer
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 27421
Build 42909: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 42908: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D7342 (id=26539)

Rebasing onto 9e89b83be4...

Current branch diff-target is up to date.
Changes applied before test
commit f60c67aa77c039db8f5399c7a79577b1dd658809
Author: Kartik Ohri <kartikohri13@gmail.com>
Date:   Sun Mar 13 14:09:23 2022 +0530

    Type annotations in metadata mappings
    
    Summary: Add type annotations to files in swh/indexer/metadata_dictionary/ (T2259)
    
    This patch adds type annotations to metadata mappings so that the mypy
    type checker can detect bugs in code. While working on the path, I also
    encountered these issues:
    
    1) TypedDict does not support using variables in defining or assigning
    keys. At a lot of places, the code uses SCHEMA_URI to construct the
    keys. mypy raises errors at these places. There is a tradeoff between
    replacing SCHEMA_URI with its literal value everywhere and using
    type: ignore hints.
    
    2) mypy raises error on detecting for implicit None returns, so I had
    to add return None at many places.
    
    3) At a few places, @property s' are accessed using `cls`. mypy
    rightfully raises an error, so I had to add ignore to silence it for
    now. It appears Python 3.9+ supports class level properties so the
    ignores could be replaced by the proper handling when the minimum
    python version is raised.
    
    Closes T2259.
    
    Reviewers: vlorentz

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/197/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/197/console

Harbormaster returned this revision to the author for changes because remote builds failed.Mar 13 2022, 9:43 AM
Harbormaster failed remote builds in B27421: Diff 26539!

Build has FAILED

Patch application report for D7342 (id=26540)

Rebasing onto 9e89b83be4...

Current branch diff-target is up to date.
Changes applied before test
commit 518ac10085e4109572a9108726286c6a18441b69
Author: Kartik Ohri <kartikohri13@gmail.com>
Date:   Sun Mar 13 15:51:53 2022 +0530

    Fix mypy issues

commit f60c67aa77c039db8f5399c7a79577b1dd658809
Author: Kartik Ohri <kartikohri13@gmail.com>
Date:   Sun Mar 13 14:09:23 2022 +0530

    Type annotations in metadata mappings
    
    Summary: Add type annotations to files in swh/indexer/metadata_dictionary/ (T2259)
    
    This patch adds type annotations to metadata mappings so that the mypy
    type checker can detect bugs in code. While working on the path, I also
    encountered these issues:
    
    1) TypedDict does not support using variables in defining or assigning
    keys. At a lot of places, the code uses SCHEMA_URI to construct the
    keys. mypy raises errors at these places. There is a tradeoff between
    replacing SCHEMA_URI with its literal value everywhere and using
    type: ignore hints.
    
    2) mypy raises error on detecting for implicit None returns, so I had
    to add return None at many places.
    
    3) At a few places, @property s' are accessed using `cls`. mypy
    rightfully raises an error, so I had to add ignore to silence it for
    now. It appears Python 3.9+ supports class level properties so the
    ignores could be replaced by the proper handling when the minimum
    python version is raised.
    
    Closes T2259.
    
    Reviewers: vlorentz

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/198/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/198/console

Harbormaster returned this revision to the author for changes because remote builds failed.Mar 13 2022, 11:26 AM
Harbormaster failed remote builds in B27422: Diff 26540!
  • Remove TypedDict's as per chat

Build is green

Patch application report for D7342 (id=26543)

Rebasing onto 9e89b83be4...

Current branch diff-target is up to date.
Changes applied before test
commit 601b6a0def92f7324cdb2ca12a20ba4f1504b62f
Author: Kartik Ohri <kartikohri13@gmail.com>
Date:   Mon Mar 14 15:29:48 2022 +0530

    Remove TypedDict's as per chat

commit 518ac10085e4109572a9108726286c6a18441b69
Author: Kartik Ohri <kartikohri13@gmail.com>
Date:   Sun Mar 13 15:51:53 2022 +0530

    Fix mypy issues

commit f60c67aa77c039db8f5399c7a79577b1dd658809
Author: Kartik Ohri <kartikohri13@gmail.com>
Date:   Sun Mar 13 14:09:23 2022 +0530

    Type annotations in metadata mappings
    
    Summary: Add type annotations to files in swh/indexer/metadata_dictionary/ (T2259)
    
    This patch adds type annotations to metadata mappings so that the mypy
    type checker can detect bugs in code. While working on the path, I also
    encountered these issues:
    
    1) TypedDict does not support using variables in defining or assigning
    keys. At a lot of places, the code uses SCHEMA_URI to construct the
    keys. mypy raises errors at these places. There is a tradeoff between
    replacing SCHEMA_URI with its literal value everywhere and using
    type: ignore hints.
    
    2) mypy raises error on detecting for implicit None returns, so I had
    to add return None at many places.
    
    3) At a few places, @property s' are accessed using `cls`. mypy
    rightfully raises an error, so I had to add ignore to silence it for
    now. It appears Python 3.9+ supports class level properties so the
    ignores could be replaced by the proper handling when the minimum
    python version is raised.
    
    Closes T2259.
    
    Reviewers: vlorentz

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/199/ for more details.

lucifer4j edited the summary of this revision. (Show Details)
swh/indexer/metadata_dictionary/base.py
132

Need to add # type: ignore because property is accessed using cls.

vlorentz added a reviewer: Reviewers.

Hi,

Thanks for this, it looks good.

One small change, though: instead of repeating Dict[str, Any] (and Dict[str, str] and others) everywhere, you could create a specific type alias for dict that represent JSON-LD documents, eg. JsonldDict = Dict[str, Any], and use it in signatures

This revision now requires changes to proceed.Mar 14 2022, 11:46 AM