
Document the existing metadata formats
Closed, Public

Authored by vlorentz on Mar 15 2021, 2:35 PM.

Details

Summary

Addresses the remaining part of @douardda's comment here: D5239#133331

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D5247 (id=18817)

Could not rebase; Attempt merge onto b565201dcf...

Updating b565201d..091d5c83
Fast-forward
 docs/extrinsic-metadata-specification.rst | 86 ++++++++++++++++++++++++++++++-
 swh/storage/cassandra/storage.py          | 14 ++---
 swh/storage/postgresql/storage.py         |  5 ++
 swh/storage/tests/storage_tests.py        | 20 +++++++
 swh/storage/tests/test_api_client.py      |  1 +
 5 files changed, 119 insertions(+), 7 deletions(-)
Changes applied before test
commit 091d5c83af6cf8fdf8d88092bdb49f109d7cd66c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Mar 15 14:35:03 2021 +0100

    Document the existing metadata formats

commit ffc0841bdc383762fccb002a8df21cea745e3c7d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Mar 15 12:50:41 2021 +0100

    content_add: Write to the objstorage before the DB or Kafka
    
    Must add to the objstorage before the DB and journal. Otherwise:
    1. in case of a crash the DB may "believe" we have the content, but
       we didn't have time to write to the objstorage before the crash
    2. the objstorage mirroring, which reads from the journal, may attempt to
       read from the objstorage before we finished writing it
    
    This is already done in the postgresql backend unintentionally since
    209de5dbaa127dacd114fbbd084f22632982eb77.
    
    This commit documents it, makes the cassandra backend behave that way too,
    and adds a test.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1227/ for more details.
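
The ordering argument in the `content_add` commit message above can be illustrated with a minimal sketch. This is not the swh.storage implementation: `FlakyObjStorage`, the dict standing in for the DB, the list standing in for the journal, and this `content_add` signature are all hypothetical names chosen for the illustration. It only shows that when the objstorage write happens first, a failure there leaves neither the DB nor the journal claiming the content exists.

```python
import hashlib
from typing import Dict, List


class FlakyObjStorage:
    """Hypothetical in-memory object store that can simulate a crash."""

    def __init__(self, fail: bool = False) -> None:
        self.blobs: Dict[str, bytes] = {}
        self.fail = fail

    def add(self, obj_id: str, data: bytes) -> None:
        if self.fail:
            raise OSError("simulated crash while writing to the objstorage")
        self.blobs[obj_id] = data


def content_add(
    data: bytes,
    objstorage: FlakyObjStorage,
    db: Dict[str, int],
    journal: List[str],
) -> str:
    """Store `data`, writing to the objstorage before the DB and the journal.

    `db` and `journal` are stand-ins (assumptions), not real backends.
    """
    obj_id = hashlib.sha1(data).hexdigest()

    # 1. Objstorage first: if this raises, nothing else has been recorded.
    objstorage.add(obj_id, data)

    # 2. Only then record the content in the DB (here: its length)...
    db[obj_id] = len(data)

    # 3. ...and publish to the journal, so a mirror reading the journal
    #    can always find the object in the objstorage.
    journal.append(obj_id)
    return obj_id
```

With a healthy objstorage all three stores end up consistent; with `FlakyObjStorage(fail=True)` the call raises and both `db` and `journal` stay empty, which is the property the commit message asks for.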

This revision is now accepted and ready to land. Mar 15 2021, 2:47 PM

Nicely documented.

docs/extrinsic-metadata-specification.rst

Line 281: sent

Line 334: maybe change "is bad" to "will alter the metadata document itself and its intrinsic identification"?

This revision was landed with ongoing or failed builds. Mar 15 2021, 3:59 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D5247 (id=18821)

Rebasing onto 8dd9f7b635...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-1228-D5247.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1228/ for more details.