Page MenuHomeSoftware Heritage

cassandra: Discard Content ctime field in content_get_partition
ClosedPublic

Authored by anlambert on Sep 1 2020, 3:50 PM.

Details

Summary

That field should only be returned by content_find method.

Add tests to check expected behaviors according to methods.

This fixes failing tests observed in https://jenkins.softwareheritage.org/job/DENV/job/tests/7/console

Diff Detail

Repository
rDSTO Storage manager
Branch
cassandra-ctime-fixes
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 14881
Build 22929: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 22928: arc lint + arc unit

Event Timeline

swh/storage/cassandra/storage.py
244

not sure about that change but I did not get why the ctime value vas removed in that case and not in the other

Build is green

Patch application report for D3856 (id=13628)

Rebasing onto e6fcfb931a...

Current branch diff-target is up to date.
Changes applied before test
commit 9082a8173ef0ba53c837756dceee9ea3d0cd7b5e
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Tue Sep 1 11:55:24 2020 +0200

    cassandra: Do not return naive datetimes in content_get*

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/891/ for more details.

ardumont added inline comments.
swh/storage/cassandra/storage.py
244

I recall it's not used for the content hashes computations and something about us adding that value...
@vlorentz must have a clearer view than me on the subject.

But yeah, the inconsistency is not good and we should have comments on those kind of instructions...

it's fine, we're even doing it in content_find already. I just forgot these.

Could you add tests, though?

This revision now requires changes to proceed.Sep 2 2020, 10:29 AM

Also add ctime checks in test_content_find_ctime

Build has FAILED

Patch application report for D3856 (id=13643)

Rebasing onto e6fcfb931a...

Current branch diff-target is up to date.
Changes applied before test
commit 21823987a3ca2678c60d30392ff7d38bd0d1cbd3
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Tue Sep 1 11:55:24 2020 +0200

    cassandra: Do not return naive datetimes in content_get*

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/893/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/893/console

Build has FAILED

Patch application report for D3856 (id=13644)

Rebasing onto e6fcfb931a...

Current branch diff-target is up to date.
Changes applied before test
commit 8b58bb252f0893649e7c5bcf9790db7a8298f700
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Tue Sep 1 11:55:24 2020 +0200

    cassandra: Do not return naive datetimes in content_get*

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/894/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/894/console

swh/storage/cassandra/storage.py
244

Turns out ctime field is not returned by content_get in the postgresql storage but only by content_find. So I will revert that change.

Do not modify content_get implementation, the fix is only needed for content_find.

Looks like you fixed content_get_partition, not content_find

And wasn't content_find the one that was already fine, anyway?

Build is green

Patch application report for D3856 (id=13645)

Rebasing onto e6fcfb931a...

Current branch diff-target is up to date.
Changes applied before test
commit 813ad9959ba9c1677cdea31e0461099788c9996d
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Tue Sep 1 11:55:24 2020 +0200

    cassandra: Do not return naive datetimes in content_find

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/895/ for more details.

Looks like you fixed content_get_partition, not content_find

Right, I mixed up the functions. So in fact ctime should be discarded by content_get_partition like in the postgresql storage, correct ?

Turns out ctime field is not returned by content_get in the postgresql storage but

Damn, i forget that part each time...

So in fact ctime should be discarded by content_get_partition like in the postgresql storage, correct ?

yes, that'd be my opinion as well.

Final update (I hope): discard Content ctime field in content_get_partition cassandra implementation

anlambert retitled this revision from cassandra: Do not return naive datetimes in content_get* to cassandra: Discard Content ctime field in content_get_partition.Sep 2 2020, 3:16 PM
anlambert edited the summary of this revision. (Show Details)
anlambert edited the summary of this revision. (Show Details)

Build is green

Patch application report for D3856 (id=13647)

Rebasing onto e6fcfb931a...

Current branch diff-target is up to date.
Changes applied before test
commit 36d284c730565d99f6c3da6e07b8e5864ed2d216
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Tue Sep 1 11:55:24 2020 +0200

    cassandra: Discard Content ctime field in content_get_partition
    
    That field should only be returned by content_find method.
    
    Add tests to check expected behaviors according to methods.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/896/ for more details.

This revision was not accepted when it landed; it landed in state Needs Review.Sep 2 2020, 3:57 PM
This revision was automatically updated to reflect the committed changes.