Details
tox
Diff Detail
- Repository
- rDCORE Foundations and core functionalities
- Branch
- master
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 14019 Build 21516: Phabricator diff pipeline on jenkins Jenkins console · Jenkins Build 21515: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D3632 (id=12778)
Rebasing onto 028ef739da...
Current branch diff-target is up to date.
Changes applied before test
commit 3cab2527744dfcdbb831d7251258ded514036a1b Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Jul 28 15:41:23 2020 +0200 swh.core.api: Expose a serializable PagedResult object This will allow typing paginated endpoints across our modules (swh.storage, swh.search, swh.indexer, ...). Related to T645
See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/25/ for more details.
cool!
swh/core/api/serializers.py | ||
---|---|---|
56 | I wanted to name it like this for a moment, but I have a small preference for "api_class" now. Up to you. or "swh_core_class" / "swh_api_class" |
Oh, and please rename "swh/core/api/model.py" to "swh/core/api/classes.py", it has nothing to do with the model
swh/core/api/serializers.py | ||
---|---|---|
56 | api_class appeals to me ;) |
Oh, and please rename "swh/core/api/model.py" to "swh/core/api/classes.py", it has nothing to do with the model
well i could not make up a proper name so...
I'll adapt accordingly.
This looks like a good direction to go in, and a generic useful thing, but I'm a bit worried about how open-ended/lax the serialization/deserialization is, as it stands. Nothing would prevent a user expecting a PagedResult[ResultTypeA] from receiving a response full of ResultTypeB results if the internal API wires get crossed somehow (which seems unlikely, but...).
Of course, doing runtime type checking on these objects means opening a https://stackoverflow.com/questions/48572831/how-to-access-the-type-arguments-of-typing-generic-shaped can of worms, which doesn't seem very appealing either.
I'm not sure if the runtime type checking is tractable, but please at least address the inline comments (you can drop me from reviewers after you've done that)
swh/core/api/model.py | ||
---|---|---|
18 ↗ | (On Diff #12778) | Please give this type variable a meaningful name (e.g. ResultType). |
25 ↗ | (On Diff #12778) | If the class is frozen then this shouldn't be a mutable object. But I don't think making it frozen has much value in the first place. I'm not enthused by the lack of runtime type checking. Deserializing a PagedResult containing serialized results of an unexpected type would not report any error. |
27 ↗ | (On Diff #12778) | I think this should be a generic type too. We may need to use composite keys (e.g. what swh.scheduler needs for get_listed_origins). |
swh/core/api/serializers.py | ||
33 | I don't think this is needed. | |
56 | Why not just call it paged_result? your serializer/deserializer only supports these anyway. |
swh/core/api/model.py | ||
---|---|---|
18 ↗ | (On Diff #12778) | This goes against conventions in multiple languages (at least Python, C++, Rust) and PEP8:
https://www.python.org/dev/peps/pep-0008/#type-variable-names |
27 ↗ | (On Diff #12778) | The whole point of next_page_token is that it's an opaque token that can change at the server's will. If you need composite values, the server can use msgpack + base64 (that's what swh-search does) |
swh/core/api/serializers.py | ||
33 | It will be if we ever need another class that works like PagedResult | |
56 | In case we need another class in the future. |
Adapt according to the first review first:
- core_class -> api_class
- rename model to classes
swh/core/api/model.py | ||
---|---|---|
18 ↗ | (On Diff #12778) | ResultType is a short, CapWords name. |
27 ↗ | (On Diff #12778) | This is a return type for our backend API/RPC calls. It doesn't have to be opaque, let's not make it gratuitously so. |
swh/core/api/serializers.py | ||
33 | So, it's not needed yet. | |
56 | Then add another encoder and decoder pair then. |
Build is green
Patch application report for D3632 (id=12784)
Rebasing onto 028ef739da...
Current branch diff-target is up to date.
Changes applied before test
commit 94947bcba5b9e934f841b47c22d188388ab05ed3 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Jul 28 15:41:23 2020 +0200 swh.core.api: Expose a serializable PagedResult object This will allow typing paginated endpoints across our modules (swh.storage, swh.search, swh.indexer, ...). Related to T645
See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/26/ for more details.
swh/core/api/model.py | ||
---|---|---|
27 ↗ | (On Diff #12778) | It's not gratuitous, it's hiding implementation details. It also keeps PagedResult simpler and with a type validator |
swh/core/api/model.py | ||
---|---|---|
27 ↗ | (On Diff #12778) | Yeah, i was gonna ask to "type validate" it. |
swh/core/api/serializers.py | ||
---|---|---|
56 | then why do we need to abstract now the Optional[str] for the PagedResult next_page_token field? |
Adapt according to most of the second review:
- Drop unnecessary type
- Rename generic T type to ResultType
- Rename api_class to paged_result
- Drop the frozen property of PagedResult
What did not change:
- next_page_token is still typed Optional[str]
swh/core/api/model.py | ||
---|---|---|
27 ↗ | (On Diff #12778) | how* to "type validate" it... |
Build is green
Patch application report for D3632 (id=12785)
Rebasing onto 028ef739da...
Current branch diff-target is up to date.
Changes applied before test
commit 14f75721488bf0085f543cbdc5dd49e0b9f5bb9c Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Jul 28 15:41:23 2020 +0200 swh.core.api: Expose a serializable PagedResult object This will allow typing paginated endpoints across our modules (swh.storage, swh.search, swh.indexer, ...). Related to T645
See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/27/ for more details.
swh/core/api/serializers.py | ||
---|---|---|
56 | We have other typed paginated API endpoints already existing in our codebase; One of them uses a composite next page token: https://forge.softwareheritage.org/source/swh-scheduler/browse/master/swh/scheduler/model.py$183-193 If you hardcode the next page token type as a string, then to be able to reuse this code, you would need the API function using this return type to handle encoding / decoding an opaque next page token, instead of just using its own explicit and transparent types. I'm not convinced these internal APIs generally need opaque page tokens, on the contrary. |
swh/core/api/serializers.py | ||
---|---|---|
56 |
I missed that part. Thanks.
Well, I recall we decided to go that road for cassandra storage and for storage paginated endpoints in general then. Though i don't have links that demonstrates this... (Note that i'm not entirely clear on what you explained to me yet, still thinking, it's on me heh ;) |
swh/core/api/serializers.py | ||
---|---|---|
56 | Yeah, sorry, I didn't mean to question whether the specific swh.storage APIs you're looking at should have opaque tokens or not; I understand why they ended up that way and I think that's the most sensible choice. I'm saying having opaque pagination tokens in internal APIs should be a last resort for specific use-cases where we can't do better. |
Build is green
Patch application report for D3632 (id=12786)
Rebasing onto 028ef739da...
Current branch diff-target is up to date.
Changes applied before test
commit 2d88a36722d162eb458af437e16844a664832771 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Jul 28 15:41:23 2020 +0200 swh.core.api: Expose a serializable PagedResult object This will allow typing paginated endpoints across our modules (swh.storage, swh.search, swh.indexer, ...). Related to T645
See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/28/ for more details.
swh/core/api/classes.py | ||
---|---|---|
25 | Otherwise: swh/core/api/classes.py:25: error: 'TokenType' is a type variable and only valid in type context |
swh/core/api/model.py | ||
---|---|---|
18 ↗ | (On Diff #12778) | I disagree that it's short. The examples in PEP8 all shorten nouns (ie. they are not AnyString nor Number) But as a compromise we could use TRes, or even TResult. |
swh/core/api/serializers.py | ||
---|---|---|
56 |
Why do you see them as a bad thing? Even if they are for internal APIs, it's good to hide implementation details of one component from the others. And if you want clients to use the data they contain, then this data shouldn't be in a pagination token. |
swh/core/api/classes.py | ||
---|---|---|
18 | Compromising, TResult for shorter as expressed by val below. |
Build is green
Patch application report for D3632 (id=12791)
Rebasing onto 028ef739da...
Current branch diff-target is up to date.
Changes applied before test
commit 1e48fca78aa0a180c2af18be2cf2766aba6e3379 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Jul 28 15:41:23 2020 +0200 core.api: Expose serializable PagedResult object for pagination api This will allow typing paginated endpoints across our modules (swh.storage, swh.search, swh.indexer, swh.scheduler, ...). Related to T645
See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/29/ for more details.
We no longer use attrs-strict, so you should remove it from the requirements (and add attrs directly instead).
Actually, since we're no longer using any features specific to attrs, could you switch to a dataclass, so that's one less dependency for swh-core?
We no longer use attrs-strict, so you should remove it from the requirements (and add attrs directly instead).
yeah i saw, was gonna remove it but got side-tracked on how to properly type stuff in the storage...
Actually, since we're no longer using any features specific to attrs, could you switch to a dataclass, so that's one less dependency for swh-core?
do you have a link for the dataclass stuff please?
swh/core/api/classes.py | ||
---|---|---|
24 | Do we want to have a List here or an Iterable or a Sequence? |
https://docs.python.org/3/library/dataclasses.html
You shouldn't need any more than what the example at the top does
swh/core/api/classes.py | ||
---|---|---|
24 | all three are fine, but I don't see any reason not to use List. |
swh/core/api/classes.py | ||
---|---|---|
24 | In my tryouts, in storage, at some point, I tried a specific type involving PagedResult[BaseModel, str]. Still, mypy tried to lure me into the variance/co-variance/invariance things. Keeping List then ;) |
Adapt according to discussion:
- Use dataclasses over attr, 1 less deps
- Drop attr_strict as it's no longer used here
Build is green
Patch application report for D3632 (id=12793)
Rebasing onto 028ef739da...
Current branch diff-target is up to date.
Changes applied before test
commit 83d82112cf984ae9e473348d9532a254cfd90d9a Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Jul 28 15:41:23 2020 +0200 core.api: Expose serializable PagedResult object for pagination api This will allow typing paginated endpoints across our modules (swh.storage, swh.search, swh.indexer, swh.scheduler, ...). Related to T645
See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/30/ for more details.
Build is green
Patch application report for D3632 (id=12795)
Rebasing onto 028ef739da...
Current branch diff-target is up to date.
Changes applied before test
commit 1ec4df82e4774de702bd6614ab64b0609dcad720 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Jul 28 15:41:23 2020 +0200 core.api: Expose serializable PagedResult object for pagination api This will allow typing paginated endpoints across our modules (swh.storage, swh.search, swh.indexer, swh.scheduler, ...). Related to T645
See https://jenkins.softwareheritage.org/job/DCORE/job/tests-on-diff/32/ for more details.