Page MenuHomeSoftware Heritage

Adapt cassandra storage to ignore the new OriginVisitStatus.type field
ClosedPublic

Authored by vsellier on Jan 12 2021, 6:29 PM.

Details

Summary

In the incoming model release, OriginVisitStatus has a new "type" entry.
We want to ignore it temporarily.

Related to T2443

Test Plan

tox

Diff Detail

Repository
rDSTO Storage manager
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 18315
Build 28284: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 28283: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D4849 (id=17160)

Rebasing onto 2b35198d30...

Current branch diff-target is up to date.
Changes applied before test
commit 68c0cc9b3dfab13b80a415ca55eedbff70f793c2
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Tue Jan 12 18:26:04 2021 +0100

    Adapt cassandra storage to support the new OriginVisitStatus.type field
    
    Depends on D4848
    Related to T2443

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1092/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1092/console

Harbormaster returned this revision to the author for changes because remote builds failed.Jan 12 2021, 6:34 PM
Harbormaster failed remote builds in B18304: Diff 17160!

@olasd suggested this on irc

18:41 <+olasd> instead of D4849, how about changing the BaseRow.from_dict to only consider the fields that are actually expected in that row?
18:42 <+olasd> this would make it ignore future new fields until they're explicitly implemented
18:42 <+olasd> (rather than bomb out every time we introduce a new "backwards compatible but in fact not really" field)

And i think we should give it a shot ;)

Change the strategy to ignore the new added field

Build is green

Patch application report for D4849 (id=17171)

Rebasing onto 30945a5890...

First, rewinding head to replay your work on top of it...
Applying: Adapt cassandra storage to ignore the new OriginVisitStatus.type field
Changes applied before test
commit ea21d246879484cc7cb4a3905e5eeaaaab151a30
Author: Vincent SELLIER <vincent.sellier@softwareheritage.org>
Date:   Tue Jan 12 18:26:04 2021 +0100

    Adapt cassandra storage to ignore the new OriginVisitStatus.type field
    
    Depends on D4848
    Related to T2443

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1097/ for more details.

ardumont retitled this revision from Adapt cassandra storage to support the new OriginVisitStatus.type field to Adapt cassandra storage to ignore the new OriginVisitStatus.type field.Jan 13 2021, 11:14 AM
ardumont edited the summary of this revision. (Show Details)
This revision is now accepted and ready to land.Jan 13 2021, 12:14 PM

Here's my suggestion, more explicitly.

The main drawback is that it hides potential data retention problems until later runtime, so the function should probably implement a warning so that we don't think we've stored some data and then didn't.

All in all I don't know if making this more generic is worth the flexibility/risk.

swh/storage/cassandra/model.py
45

Here's my suggestion, more explicitly.

The main drawback is that it hides potential data retention problems until later runtime, so the function should probably implement a warning so that we don't think we've stored some data and then didn't.

All in all I don't know if making this more generic is worth the flexibility/risk.

Thanks for the suggestion.

As it's needed only once for a "short" period of time, we'll keep the explicit version though.