Page MenuHomeSoftware Heritage

use RevisionIntrinsicMetadataRow in the storage interface instead of dicts.
ClosedPublic

Authored by vlorentz on Oct 7 2020, 4:01 PM.

Diff Detail

Repository
rDCIDX Metadata indexer
Branch
attrs
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 16007
Build 24627: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 24626: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D4184 (id=14730)

Could not rebase; Attempt merge onto 51b10e891d...

Updating 51b10e8..f954853
Fast-forward
 swh/indexer/ctags.py                         |  40 +-
 swh/indexer/fossology_license.py             |  25 +-
 swh/indexer/indexer.py                       |  56 +-
 swh/indexer/metadata.py                      | 118 ++--
 swh/indexer/metadata_dictionary/base.py      |  18 +-
 swh/indexer/mimetype.py                      |  31 +-
 swh/indexer/origin_head.py                   |  15 +-
 swh/indexer/storage/__init__.py              | 292 +++++----
 swh/indexer/storage/api/client.py            |   3 +
 swh/indexer/storage/api/serializers.py       |  26 +
 swh/indexer/storage/api/server.py            |   9 +-
 swh/indexer/storage/converters.py            |  15 +-
 swh/indexer/storage/db.py                    |  34 +-
 swh/indexer/storage/in_memory.py             | 172 +++--
 swh/indexer/storage/interface.py             | 162 +++--
 swh/indexer/storage/model.py                 |   3 +-
 swh/indexer/tests/storage/conftest.py        |  10 +-
 swh/indexer/tests/storage/test_converters.py |  17 +-
 swh/indexer/tests/storage/test_metrics.py    |   8 +-
 swh/indexer/tests/storage/test_server.py     |  14 +-
 swh/indexer/tests/storage/test_storage.py    | 928 +++++++++++++--------------
 swh/indexer/tests/test_cli.py                |  19 +-
 swh/indexer/tests/test_ctags.py              |  23 +-
 swh/indexer/tests/test_fossology_license.py  |  22 +-
 swh/indexer/tests/test_indexer.py            |   4 +-
 swh/indexer/tests/test_metadata.py           |  69 +-
 swh/indexer/tests/test_mimetype.py           |  46 +-
 swh/indexer/tests/test_origin_metadata.py    | 101 ++-
 swh/indexer/tests/utils.py                   |  72 +--
 29 files changed, 1226 insertions(+), 1126 deletions(-)
 create mode 100644 swh/indexer/storage/api/serializers.py
Changes applied before test
commit f954853a4cd8bfcc17fee17309db8aed4e75f7ad
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 16:01:15 2020 +0200

    use RevisionIntrinsicMetadataRow in the storage interface instead of dicts.
    
    also add typing to some test functions

commit a822ada3a012d1a384ac7950527a44344568cdf1
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 15:10:04 2020 +0200

    use ContentMetadataRow in the storage interface instead of dicts.

commit 7566fec49a46abd03c2eb6abced7a7f786da6a4c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 13:39:49 2020 +0200

    use ContentCtagsRow in the storage interface instead of dicts.

commit af3c220b14e22ae270679511045f1f6563febe9b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 12:43:45 2020 +0200

    use ContentLanguageRow in the storage interface instead of dicts.

commit b4d084d675bb89f5d6803ddbf75bbe54c3e91c03
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 11:25:05 2020 +0200

    indexer.storage: Change return types from Iterable to List
    
    For consistency with the main storage.

commit 570816fd87a1451e560fce8881a0da5626cf0ab4
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 11:06:15 2020 +0200

    license: use ContentLicenseRow in the storage interface instead of dicts.

commit d346feffffaac7e8cc5c3bc0eedbffc66db1d37e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 11:43:05 2020 +0200

    all indexers: make index() return a list of results instead of a single one.
    
    1. it was wrongfully annotated as '-> TResult' even though some indexers
       can return None
    2. in a future commit, the fossology indexer will need to return multiple
       results.

commit 7fe4a89dbcf221a0125c474640cbcdc8b01b1df2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 10:54:35 2020 +0200

    base indexers: add type annotation for self.{storage,idx_storage}.

commit 486ee085f5ee999da38793815ae462c00ece4efb
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 10:48:44 2020 +0200

    indexer.storage: Update docstrings of mimetype-related endpoints.

commit 4ec112337909cc364bec119b2ef6ae047a81b96f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 10:46:07 2020 +0200

    indexer.storage: Change return type annotation from Iterator to Iterable.
    
    When going through the RPC, it's turned into a list.

commit e8e94cf237471636e84ceac5d1740cfa7a982f90
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 7 10:01:26 2020 +0200

    tests: Enable type-checking on storage test functions.
    
    By adding a simple type annotation to the test functions' signature.

commit 44cee8f213f26e01aa63519a8c21962b2fca460b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Oct 6 15:42:54 2020 +0200

    Make base indexers generic, with the result of index() as their type parameter.
    
    So the type of results can be statically checked, instead of needing to
    assert it to please mypy.

commit c3caf300830030f0cd1b3002d715e430582db8f1
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Oct 6 15:35:39 2020 +0200

    mimetype: use ContentMimetypeRow in the storage interface instead of dicts.
    
    This temporarily adds mess in the generic tests to support both rows and dicts,
    but I'll remove it once I migrated all endpoints.

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/82/ for more details.

ardumont added a subscriber: ardumont.
ardumont added inline comments.
swh/indexer/metadata.py
149

What's the new data?

Maybe update a bit the docstring then.

This revision is now accepted and ready to land.Oct 8 2020, 9:16 AM
swh/indexer/metadata.py
149

In this diff, it's nothing; but D4189 starts using it.