- for consistency with the serializers in swh.core
- it's not clear to me that we ever needed that (it was added in rDJNL64cd01fc7c049123e16341fa93239013f152bacc with no rational)
- it messes with swh-search reading from the journal and expecting JSON-like data. (T2876#54729)
Details
- Reviewers
- None
- Group Reviewers
Reviewers - Commits
- rDJNLc4bf313c83e9: serializers: Deserialize as lists instead of tuples
Tests will fail because of https://github.com/msgpack/msgpack-python/issues/451
Diff Detail
- Repository
- rDJNL Journal infrastructure
- Lint
Automatic diff as part of commit; lint not applicable. - Unit
Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build has FAILED
Patch application report for D4758 (id=16847)
Rebasing onto d8cd3f9bbe...
Current branch diff-target is up to date.
Changes applied before test
commit 34bcfa4329983bb99e40ceb4b8b41a147c47baf0 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Dec 17 13:03:18 2020 +0100 serializers: Deserialize as lists instead of tuples 1. for consistency with the serializers in swh.core 2. it's not clear to me that we ever needed that (it was added in 64cd01fc7c049123e16341fa93239013f152bacc with no rational) 3. it messes with swh-search reading from the journal and expecting JSON-like data.
Link to build: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/134/
See console output for more information: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/134/console
Build has FAILED
Patch application report for D4758 (id=16848)
Rebasing onto d8cd3f9bbe...
Current branch diff-target is up to date.
Changes applied before test
commit 79c32d7a890894c31eef186adeffc35ebffbbc71 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Dec 17 13:03:18 2020 +0100 serializers: Deserialize as lists instead of tuples 1. for consistency with the serializers in swh.core 2. it's not clear to me that we ever needed that (it was added in 64cd01fc7c049123e16341fa93239013f152bacc with no rational) 3. it messes with swh-search reading from the journal and expecting JSON-like data.
Link to build: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/135/
See console output for more information: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/135/console
Build has FAILED
Patch application report for D4758 (id=16849)
Rebasing onto d8cd3f9bbe...
Current branch diff-target is up to date.
Changes applied before test
commit c4bf313c83e9a832cde2dc11f3fce6e351eed43b Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Thu Dec 17 13:03:18 2020 +0100 serializers: Deserialize as lists instead of tuples 1. for consistency with the serializers in swh.core 2. it's not clear to me that we ever needed that (it was added in 64cd01fc7c049123e16341fa93239013f152bacc with no rational) 3. it messes with swh-search reading from the journal and expecting JSON-like data.
Link to build: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/136/
See console output for more information: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/136/console
it's not clear to me that we ever needed that (it was added ...
Well, I only recall it was around the time we started using more tuple everywhere
because it's immutable and list are not.
Aside that, i don't remember...
So, to summarize here, the rationale here try to keep serliazed input and serialized output consistent.
Well, i think we should be consistent all the time then.
Tuple stays tuple, List stays list... to keep some sanity...
Although, now i'm wondering if there was not a joke about msgpack not dealing
correctly with list at some point...
@douardda Do you recall the rationale of D3207 (for the ensure_tuples function)?
(which is becoming ensure_lists in here)
rDJNL64cd01fc7c049123e16341fa93239013f152bacc
But dicts are mutable. We moved to model objects since then, and .from_dict will do the conversion to tuples when needed
What do you mean? If that's the answer to my question to david, that does not help.
You said yourself in the description it was missing the rationale.
I'm talking about the comment inlined below (from that diff ^ description).
It also needs the kafka_to_value serializer function to ensure tuples are
used instead of lists.
The previous diff D3207 description does not explain why it needs to be tuples instead of lists.
I'd like to know the explicit reason now [1]
All the more reasons that now you seem to want to keep list both for input as list and tuple...
That change back and forth in behavior does not bode well...
But dicts are mutable. We moved to model objects since then, and .from_dict will do the conversion to tuples when needed
Sorry, I don't follow...
[1] I forgot to ask the rationale at the time, time to fix it.
Sorry, I didn't mean to paste that
But dicts are mutable. We moved to model objects since then, and .from_dict will do the conversion to tuples when needed
Sorry, I don't follow...
You say we use tuples instead of lists because they are immutable. But kafka_to_value returns dicts, so they are mutable values anyway.
Sorry, I didn't mean to paste that
ok
You say we use tuples instead of lists because they are immutable. But kafka_to_value
returns dicts, so they are mutable anyway.
oh yeah, sure.
Still, that would not be the first time that we are inconsistent "temporarily" [1] and
then never get a chance to improve it later on.
[1] I actually don't recall if we have any plan to actually migrate the journal to use
model objects (and stop using dicts as well)...
But anyway, back on track, i'm a bit at a loss right now on this.
I'll ping @douardda ;)
It looks to me that this really about
- it messes with swh-search reading from the journal and expecting JSON-like data. (T2876#54729)
in which case I'm not sure why you do not adapt swh-search accordingly. Why not put and use that ensure_list function in the swh-search?
There are two options:
- if the journal keeps lists as-is, then:
- this couple of tests needs ensure_list
- swh-model converts to tuples
- if the journal converts to tuples with ensure_tuples, then:
- swh-search needs ensure_list
- swh-model still copies tuples (and needs the code to convert from lists anyway)
So option 2 needs more code in total, runs more code in production (convertion back and forth).
I don't see any reason to prefer option 2, besides running a little less code in tests.
Sorry @douardda, but I'm going to assume you are fine with this diff, because I'd really like this to work before Vincent goes on vacation