That looks promising.
I just have a couple of docstring fixes and questions.
Why not rename origin_id here?
origin (str): the origin's url
Do you still need to do that now?
I mean, do we still need to pass the full origin to the journal (including our origin id)?
Wow, it's been a while since I looked at the origin-get implementation...
It's kind of a mess to read...
That has nothing to do with the diff though...
In any case, in the last return statement, is it still possible at all for res['url'] to be None here?
I guess yes, because we don't know what the input url can be?
That doesn't include the id, but it includes the type, which is currently needed by origin_add.
Let's not make this diff even bigger.
Yes, if the origin is not already in storage: pg returns missing rows as (null, null), because the query is a left join instead of an inner join.
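To illustrate the left join behaviour, here is a minimal sketch using sqlite3 from the standard library rather than pg. The table names (requested, origin), the columns and the example URLs are made up for the illustration and are not the real storage schema; the join semantics shown here are standard SQL.

```python
import sqlite3

# Hypothetical two-table layout, only meant to mimic "look up these urls".
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE requested (url TEXT);
    CREATE TABLE origin (type TEXT, url TEXT);
    INSERT INTO requested VALUES ('https://example.org/known');
    INSERT INTO requested VALUES ('https://example.org/missing');
    INSERT INTO origin VALUES ('git', 'https://example.org/known');
    """
)

# LEFT JOIN keeps one row per requested url; unmatched rows come back
# with every origin column set to NULL (None in Python).
left_rows = conn.execute(
    "SELECT origin.type, origin.url FROM requested "
    "LEFT JOIN origin ON origin.url = requested.url "
    "ORDER BY requested.rowid"
).fetchall()

# INNER JOIN silently drops the unmatched url instead.
inner_rows = conn.execute(
    "SELECT origin.type, origin.url FROM requested "
    "INNER JOIN origin ON origin.url = requested.url "
    "ORDER BY requested.rowid"
).fetchall()

# Hence the need for a None check on the url before building a result.
results = [
    None if url is None else {"type": type_, "url": url}
    for type_, url in left_rows
]
```

With the left join, the result list stays aligned with the requested urls, which is why the None check on the url is still needed.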
Build has FAILED
Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tox/752/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tox/752/console
Build has FAILED
Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tox/759/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tox/759/console
I've rebased this on top of the origin['type'] removal to see whether it could land.
However, this now needs D2174 on swh.model, as the to_dict recursion introduced in OriginVisit breaks when you attr.evolve(OriginVisit, origin=origin_url) (instead of attr.evolve(OriginVisit, origin=Origin(origin_url))). This happens in a couple of places and is inconsistent with the declared type for OriginVisit.origin.
I guess we should eventually just replace origin with origin_url in OriginVisit objects, as there's not much point in carrying a single-key dict around...
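For readers unfamiliar with the failure mode, here is a rough sketch of the to_dict recursion problem described above. It uses stdlib dataclasses and dataclasses.replace as stand-ins for attrs classes and attr.evolve, so it runs without swh.model; the Origin and OriginVisit classes below are simplified assumptions, not the real model definitions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Origin:
    url: str

    def to_dict(self):
        return {"url": self.url}


@dataclass(frozen=True)
class OriginVisit:
    origin: Origin  # declared type; not enforced at runtime
    visit: int

    def to_dict(self):
        # Recurses into the nested object, assuming it is an Origin.
        return {"origin": self.origin.to_dict(), "visit": self.visit}


visit = OriginVisit(origin=Origin("https://example.org/repo"), visit=1)

# Consistent with the declared type: the recursion works.
ok_dict = replace(visit, origin=Origin("https://example.org/other")).to_dict()

# Passing a bare url string is accepted at construction time, but the
# recursion then fails because str has no to_dict method.
broken = replace(visit, origin="https://example.org/other")
try:
    broken.to_dict()
    error = None
except AttributeError as exc:
    error = exc
```

Storing only the url string on the visit would remove the need for the recursion altogether.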