Page MenuHomeSoftware Heritage

proper pagination for IndexerStorage.origin_intrinsic_metadata_search_by_producer
ClosedPublic

Authored by douardda on Nov 5 2019, 4:08 PM.

Details

Summary

replace the 'start' argument by a 'page_token' which is now expected to be an (not
so) opaque token used for pagination.

The 'end' argument is dropped.

Adapt tests and the list_origins_by_producer() helper function (in cli.py)
accordingly.

Diff Detail

Repository
rDCIDX Metadata indexer
Branch
pytest
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 8903
Build 12999: tox-on-jenkinsJenkins
Build 12998: arc lint + arc unit

Event Timeline

vlorentz requested changes to this revision.Nov 6 2019, 1:58 PM
vlorentz added a subscriber: vlorentz.

I would rather we move to opaque tokens for all pagination stuff.

Using last this way will not work if we move the idx-storage to a non-postgresql storage, and I don't want to go through this again.

swh/indexer/cli.py
157

lol I missed this

This revision now requires changes to proceed.Nov 6 2019, 1:58 PM

I would rather we move to opaque tokens for all pagination stuff.

That's why I put a 'Note' in the Diff's description ;-)

Using last this way will not work if we move the idx-storage to a non-postgresql storage, and I don't want to go through this again.

I know. Let's have this "specify pagination" meeting ASAP.

I would rather we move to opaque tokens for all pagination stuff.

That's why I put a 'Note' in the Diff's description ;-)

Well not exactly exactly, but meh.

use 'page_token' as (opaque) pagination argument

vlorentz requested changes to this revision.Nov 7 2019, 5:19 PM
vlorentz added inline comments.
swh/indexer/cli.py
152–153

I don't think this condition is needed.

swh/indexer/tests/storage/test_storage.py
1558–1587

Either test the value of next_page_token, or use its value for the next call.

I prefer the latter because it actually treats the token as opaque.

This revision now requires changes to proceed.Nov 7 2019, 5:19 PM
douardda added inline comments.
swh/indexer/cli.py
152–153

yes it is

swh/indexer/tests/storage/test_storage.py
1558–1587

That's a refactoring to include in another 'make opaque token really opaque' diff.

swh/indexer/cli.py
152–153

Why? If there are no origin, then next_page_token is None. (and was None at the previous iteration, if there are any results)

swh/indexer/tests/storage/test_storage.py
1558–1587

Then you should test its value

douardda edited the summary of this revision. (Show Details)

this version should address the points raised by vlorentz

hopefully, since I messed heavily my git rebase...

ok now I can get rid of the spurious if in list_origins_by_producer...

This revision is now accepted and ready to land.Nov 8 2019, 3:57 PM