
content_add: Write to the objstorage before the DB or Kafka
Closed, Public

Authored by vlorentz on Mar 15 2021, 12:54 PM.

Details

Summary

content_add must write to the objstorage before writing to the DB and the journal. Otherwise:

  1. if a crash happens between the writes, the DB may "believe" we have the content even though it was never written to the objstorage
  2. the objstorage mirroring, which reads from the journal, may attempt to read an object from the objstorage before we have finished writing it

The postgresql backend has already behaved this way, unintentionally, since D2848.

This commit documents it, makes the cassandra backend behave that way too,
and adds a test.

Resolves T2003.
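
The ordering argued for above can be illustrated with a minimal sketch. This is not the swh.storage code: `FakeBackends`, `ObjstorageDown`, and this `content_add` signature are hypothetical in-memory stand-ins, and the relative order of the DB and journal writes is an assumption (the summary only requires that the objstorage write comes first).

```python
from typing import Dict, List


class ObjstorageDown(Exception):
    """Raised by the fake objstorage to simulate a crash during the write."""


class FakeBackends:
    """Hypothetical in-memory stand-ins for the objstorage, DB and journal."""

    def __init__(self, objstorage_fails: bool = False) -> None:
        self.objstorage: Dict[str, bytes] = {}
        self.db: Dict[str, int] = {}
        self.journal: List[str] = []
        self.calls: List[str] = []  # records the order of the writes
        self.objstorage_fails = objstorage_fails

    def objstorage_add(self, obj_id: str, data: bytes) -> None:
        self.calls.append("objstorage")
        if self.objstorage_fails:
            raise ObjstorageDown(obj_id)
        self.objstorage[obj_id] = data

    def db_add(self, obj_id: str, length: int) -> None:
        self.calls.append("db")
        self.db[obj_id] = length

    def journal_write(self, obj_id: str) -> None:
        self.calls.append("journal")
        self.journal.append(obj_id)


def content_add(backends: FakeBackends, obj_id: str, data: bytes) -> None:
    # 1. Write the bytes first: if this fails, nothing else references them.
    backends.objstorage_add(obj_id, data)
    # 2. Only then record the content in the DB (order vs. step 3 is assumed).
    backends.db_add(obj_id, len(data))
    # 3. Finally notify journal consumers, which may read the objstorage.
    backends.journal_write(obj_id)
```

With this ordering, a failure in the objstorage write leaves the DB and the journal untouched, so neither can point at content that does not exist.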

Diff Detail

Repository
rDSTO Storage manager
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 19908
Build 30921: Phabricator diff pipeline on jenkins
Build 30920: arc lint + arc unit

Event Timeline

vlorentz edited the summary of this revision.

Build is green

Patch application report for D5246 (id=18813)

Rebasing onto b565201dcf...

First, rewinding head to replay your work on top of it...
Applying: content_add: Write to the objstorage before the DB or Kafka
Changes applied before test
commit 1180902ab75f4b92fce4e68e4e00c111d5250156
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Mar 15 12:50:41 2021 +0100

    content_add: Write to the objstorage before the DB or Kafka
    
    Must add to the objstorage before the DB and journal. Otherwise:
    1. in case of a crash the DB may "believe" we have the content, but
       we didn't have time to write to the objstorage before the crash
    2. the objstorage mirroring, which reads from the journal, may attempt to
       read from the objstorage before we finished writing it
    
    This is already done in the postgresql backend unintentionally since
    209de5dbaa127dacd114fbbd084f22632982eb77.
    
    This commit makes the cassandra backend behave that way too, and adds a
    test.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1225/ for more details.

Build is green

Patch application report for D5246 (id=18814)

Rebasing onto b565201dcf...

First, rewinding head to replay your work on top of it...
Applying: content_add: Write to the objstorage before the DB or Kafka
Changes applied before test
commit 53df7470a8be339ec04fc9eea5ca8cbff68a61a7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Mar 15 12:50:41 2021 +0100

    content_add: Write to the objstorage before the DB or Kafka
    
    Must add to the objstorage before the DB and journal. Otherwise:
    1. in case of a crash the DB may "believe" we have the content, but
       we didn't have time to write to the objstorage before the crash
    2. the objstorage mirroring, which reads from the journal, may attempt to
       read from the objstorage before we finished writing it
    
    This is already done in the postgresql backend unintentionally since
    209de5dbaa127dacd114fbbd084f22632982eb77.
    
    This commit documents it, makes the cassandra backend behave that way too,
    and adds a test.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1226/ for more details.

This revision is now accepted and ready to land. (Mar 15 2021, 2:18 PM)

Landed as ffc0841bdc383762fccb002a8df21cea745e3c7d