Page MenuHomeSoftware Heritage

Fix buffer interface protocol and impls
ClosedPublic

Authored by tenma on Sep 28 2020, 4:15 PM.

Details

Summary

Default argument object_types was not properly declared in StorageInterface
and concrete implmentations PostgreSQL and Cassandra.
Reverted unnecessary fix in storage tests.

Related to D4033

Diff Detail

Repository
rDSTO Storage manager
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 15721
Build 24201: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 24200: arc lint + arc unit

Event Timeline

Tested on storage and loader-core as seen with mr grep '\.flush('.

This revision is now accepted and ready to land.Sep 28 2020, 4:19 PM

Build is green

Patch application report for D4066 (id=14358)

Rebasing onto e37c8f7ccc...

Current branch diff-target is up to date.
Changes applied before test
commit 7fe76e217133e3f3ba0f3ab86961c41f51a2944f
Author: tenma <tenma+swh@mailbox.org>
Date:   Mon Sep 28 15:59:39 2020 +0200

    Fix buffer interface protocol and impls
    
    Default argument object_types was not properly declared in StorageInterface
    and concrete implmentations PostgreSQL and Cassandra.
    Reverted unnecessary fix in storage tests.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/986/ for more details.

olasd requested changes to this revision.Sep 28 2020, 4:24 PM

What happens when we have the following layering:

  • swh.storage RPC client
  • swh.storage RPC server
  • buffer storage
  • PostgreSQL storage

and the top level swh.storage RPC client calls storage.flush()?

What are the arguments actually received by the RPC server? How does this behavior translate?

The earlier implementation, with a single default argument None, was at least crystal clear on what happened when the default argument fell through. Here I'm not sure at all that a call to flush with the default argument will do the right thing.

This revision now requires changes to proceed.Sep 28 2020, 4:24 PM

(I'll note that the buffer storage usually comes *before* the RPC client, so the actual concern is probably moot. But the apparent mismatch in default arguments between different implems might lead to surprising behavior in the future.)

If whatever component before BPS has its flush called with no object_types, it depends on the impl of this component how it chains well. As is, it should either chain down the arg unprocessed (from *args/**kwargs for example, what I have done for all proxies before I understood the getattr hooks), call it without this arg or give it a valid value like OBJECT_TYPES (which could be shared). In short intermediate should not have an invalid default passed down verbatim.
The case of concrete storage is not clear, because they don't chain call it nor need it in the first place. So any value is OK.
We have no other component than BPS that would do something with this (buffering) interface though, and it maybe would not make sense to have another.

And the first line of the commit message should be changed; it should say what was changed, not just what components were changed

If my last comment was unclear:

  • concrete storages having different default does not matter since they are the endpoints in this chain.
  • any other non-terminal impl the buffer interface, if there may exist, must follow some rules: pass arguments unprocessed (*args) or a subset of BPS.OBJECT_TYPES.

Does this answers your question?

Build was aborted

Patch application report for D4066 (id=14440)

Rebasing onto 40997c0506...

Current branch diff-target is up to date.
Changes applied before test
commit bef08d6316c8534d52a946ae49f9bb11b39b6b73
Author: tenma <tenma+swh@mailbox.org>
Date:   Mon Sep 28 15:59:39 2020 +0200

    Fix object_types default in buffer interface protocol and impls
    
    Default argument object_types was not properly declared in StorageInterface
    and concrete implmentations PostgreSQL and Cassandra.
    Reverted unnecessary fix in storage tests.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/989/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/989/console

This revision is now accepted and ready to land.Sep 30 2020, 2:01 PM

Build is green

Patch application report for D4066 (id=14440)

Rebasing onto 40997c0506...

Current branch diff-target is up to date.
Changes applied before test
commit bef08d6316c8534d52a946ae49f9bb11b39b6b73
Author: tenma <tenma+swh@mailbox.org>
Date:   Mon Sep 28 15:59:39 2020 +0200

    Fix object_types default in buffer interface protocol and impls
    
    Default argument object_types was not properly declared in StorageInterface
    and concrete implmentations PostgreSQL and Cassandra.
    Reverted unnecessary fix in storage tests.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/991/ for more details.