Page MenuHomeSoftware Heritage

Stop supporting origin ids in API (except in origin_get_range).
ClosedPublic

Authored by vlorentz on Oct 17 2019, 7:04 PM.

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

vlorentz created this revision.Oct 17 2019, 7:04 PM
ardumont requested changes to this revision.Oct 18 2019, 2:35 AM
ardumont added a subscriber: ardumont.

Thanks.

That looks promising.

Just have a couple of docstring fix and interrogation.

swh/storage/in_memory.py
874

origin (str):

1493

why don't use rename origin_id here?

swh/storage/storage.py
1102

origin (str): the origin's url

1288

do you still need to do that now?

I mean, do we still need to pass the full origin to the journal (including our origin id)?

1493

wow, it's been a while since i looked to the origin-get implementation...
too many returns/else within the block instructions...

It's kind of a mess to read...
i'd either go with returns all around and remove the redundant else
or remove the returns and use a single result variable which i'd return at the bottom...

That has nothing to do with the diff though...
but since we are touching it...


In any case, last return statement, is res['url'] is None possible at all here now?

I guess yes because we don't know what the url inputted can be?

This revision now requires changes to proceed.Oct 18 2019, 2:35 AM

Just have a couple of docstring fix and interrogation.

Ah and you guessed it, i miss some migration scripts with the *sql functions impacted ;)

vlorentz marked 3 inline comments as done.Oct 18 2019, 11:25 AM
vlorentz added inline comments.
swh/storage/storage.py
1288

That doesn't include the id, but it includes the type, which is currently needed by origin_add

1493

That has nothing to do with the diff though...
but since we are touching it...

Let's not make this diff even bigger.

In any case, last return statement, is res['url'] is None possible at all here now?

Yes, if the origin is not already in storage, because missing rows are returned by pg as (null, null) because it's a left join instead of inner join.

vlorentz updated this revision to Diff 7338.Oct 18 2019, 11:31 AM
  • update docstrings as requested
  • add migration script
vlorentz updated this revision to Diff 7339.Oct 18 2019, 11:43 AM

fix argument names in API client.

ardumont added inline comments.Oct 18 2019, 3:09 PM
swh/storage/storage.py
1288

right!

ardumont accepted this revision.Oct 18 2019, 6:55 PM
ardumont added a subscriber: olasd.

sounds good.

If @olasd could also have a look, that'd be neat.

This revision is now accepted and ready to land.Oct 18 2019, 6:55 PM
olasd updated this revision to Diff 7415.Wed, Oct 23, 2:05 PM

Rebase on top of the origin['type'] removal

olasd added a comment.Wed, Oct 23, 2:19 PM

I've rebased this on top of the origin['type'] removal to see whether it could land.

However, this now needs D2174 on swh.model, as the to_dict recursion introduced in OriginVisit breaks when you attr.evlove(OriginVisit, origin=origin_url) (instead of attr.evolve(OriginVisit, origin=Origin(origin_url). This happens in a couple places and is inconsistent with the declared type for OriginVisit.origin.

I guess we should just end up replacing origin with origin_url in OriginVisit objects, as there's not much point carrying a single-key-dict around...

In D2157#51123, @olasd wrote:

I guess we should just end up replacing origin with origin_url in OriginVisit objects, as there's not much point carrying a single-key-dict around...

Yes, good idea. Should I land this diff as-is, and then we'll handle this renaming next week?