Page MenuHomeSoftware Heritage

Stop supporting origin ids in API (except in origin_get_range).
ClosedPublic

Authored by vlorentz on Oct 17 2019, 7:04 PM.

Diff Detail

Repository
rDSTO Storage manager
Branch
kill-origin-ids
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 8529
Build 12390: tox-on-jenkinsJenkins
Build 12389: arc lint + arc unit

Event Timeline

ardumont added a subscriber: ardumont.

Thanks.

That looks promising.

Just have a couple of docstring fix and interrogation.

swh/storage/in_memory.py
871

origin (str):

1494

why don't use rename origin_id here?

swh/storage/storage.py
1102

origin (str): the origin's url

1288

do you still need to do that now?

I mean, do we still need to pass the full origin to the journal (including our origin id)?

1495

wow, it's been a while since i looked to the origin-get implementation...
too many returns/else within the block instructions...

It's kind of a mess to read...
i'd either go with returns all around and remove the redundant else
or remove the returns and use a single result variable which i'd return at the bottom...

That has nothing to do with the diff though...
but since we are touching it...


In any case, last return statement, is res['url'] is None possible at all here now?

I guess yes because we don't know what the url inputted can be?

This revision now requires changes to proceed.Oct 18 2019, 2:35 AM

Just have a couple of docstring fix and interrogation.

Ah and you guessed it, i miss some migration scripts with the *sql functions impacted ;)

vlorentz added inline comments.
swh/storage/storage.py
1288

That doesn't include the id, but it includes the type, which is currently needed by origin_add

1495

That has nothing to do with the diff though...
but since we are touching it...

Let's not make this diff even bigger.

In any case, last return statement, is res['url'] is None possible at all here now?

Yes, if the origin is not already in storage, because missing rows are returned by pg as (null, null) because it's a left join instead of inner join.

  • update docstrings as requested
  • add migration script

fix argument names in API client.

swh/storage/storage.py
1288

right!

ardumont added a subscriber: olasd.

sounds good.

If @olasd could also have a look, that'd be neat.

This revision is now accepted and ready to land.Oct 18 2019, 6:55 PM

Rebase on top of the origin['type'] removal

I've rebased this on top of the origin['type'] removal to see whether it could land.

However, this now needs D2174 on swh.model, as the to_dict recursion introduced in OriginVisit breaks when you attr.evlove(OriginVisit, origin=origin_url) (instead of attr.evolve(OriginVisit, origin=Origin(origin_url). This happens in a couple places and is inconsistent with the declared type for OriginVisit.origin.

I guess we should just end up replacing origin with origin_url in OriginVisit objects, as there's not much point carrying a single-key-dict around...

In D2157#51123, @olasd wrote:

I guess we should just end up replacing origin with origin_url in OriginVisit objects, as there's not much point carrying a single-key-dict around...

Yes, good idea. Should I land this diff as-is, and then we'll handle this renaming next week?