Page MenuHomeSoftware Heritage

proper pagination for IndexerStorage.origin_intrinsic_metadata_search_by_producer
ClosedPublic

Authored by douardda on Tue, Nov 5, 4:08 PM.

Details

Summary

replace the 'start' argument by a 'page_token' which is now expected to be an (not
so) opaque token used for pagination.

The 'end' argument is dropped.

Adapt tests and the list_origins_by_producer() helper function (in cli.py)
accordingly.

Diff Detail

Repository
rDCIDX Metadata indexer
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

douardda created this revision.Tue, Nov 5, 4:08 PM
vlorentz requested changes to this revision.Wed, Nov 6, 1:58 PM
vlorentz added a subscriber: vlorentz.

I would rather we move to opaque tokens for all pagination stuff.

Using last this way will not work if we move the idx-storage to a non-postgresql storage, and I don't want to go through this again.

swh/indexer/cli.py
157

lol I missed this

This revision now requires changes to proceed.Wed, Nov 6, 1:58 PM

I would rather we move to opaque tokens for all pagination stuff.

That's why I put a 'Note' in the Diff's description ;-)

Using last this way will not work if we move the idx-storage to a non-postgresql storage, and I don't want to go through this again.

I know. Let's have this "specify pagination" meeting ASAP.

I would rather we move to opaque tokens for all pagination stuff.

That's why I put a 'Note' in the Diff's description ;-)

Well not exactly exactly, but meh.

douardda updated this revision to Diff 7701.Thu, Nov 7, 4:20 PM

use 'page_token' as (opaque) pagination argument

vlorentz requested changes to this revision.Thu, Nov 7, 5:19 PM
vlorentz added inline comments.
swh/indexer/cli.py
153–154

I don't think this condition is needed.

swh/indexer/tests/storage/test_storage.py
1558–1587

Either test the value of next_page_token, or use its value for the next call.

I prefer the latter because it actually treats the token as opaque.

This revision now requires changes to proceed.Thu, Nov 7, 5:19 PM
douardda edited the summary of this revision. (Show Details)Thu, Nov 7, 5:31 PM
douardda added inline comments.
swh/indexer/cli.py
153–154

yes it is

swh/indexer/tests/storage/test_storage.py
1558–1587

That's a refactoring to include in another 'make opaque token really opaque' diff.

vlorentz added inline comments.Thu, Nov 7, 6:25 PM
swh/indexer/cli.py
153–154

Why? If there are no origin, then next_page_token is None. (and was None at the previous iteration, if there are any results)

swh/indexer/tests/storage/test_storage.py
1558–1587

Then you should test its value

douardda updated this revision to Diff 7733.Fri, Nov 8, 1:57 PM
douardda edited the summary of this revision. (Show Details)

this version should address the points raised by vlorentz

hopefully, since I messed heavily my git rebase...

douardda updated this revision to Diff 7735.Fri, Nov 8, 2:08 PM

ok now I can get rid of the spurious if in list_origins_by_producer...

douardda marked 4 inline comments as done.Fri, Nov 8, 2:16 PM
vlorentz accepted this revision.Fri, Nov 8, 3:57 PM
This revision is now accepted and ready to land.Fri, Nov 8, 3:57 PM