
Drop content_metadata table and indexer.

Authored by vlorentz on Aug 9 2019, 4:15 PM.



Depends on D1835.
Mutually exclusive with D1836.

Motivation: the content_metadata table is currently used only as a cache when indexing revision/origin intrinsic metadata. This cache is not handled properly, however: it is not invalidated when indexers are updated.
Fixing this would require extensive changes, and I don't think they are worth it: content_metadata adds a lot of complexity for very little performance benefit as a cache, and it currently has no use other than as a cache.

So I think we should remove it, and maybe add it back later if needed.
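
To illustrate the invalidation problem described above, here is a minimal, generic sketch. It is not taken from swh-indexer: every name in it (`NaiveCache`, `VersionedCache`, `indexer_v1`, `indexer_v2`) is hypothetical, and the actual issue in the indexer may differ in detail. It only shows why a cache keyed on the content alone keeps serving results from an older indexer version, and how much extra bookkeeping a version-aware key implies.

```python
# Hypothetical illustration only: none of these names exist in swh-indexer.
from typing import Callable, Dict, Tuple

Metadata = Dict[str, str]


class NaiveCache:
    """Caches results by content id only, ignoring the indexer version."""

    def __init__(self) -> None:
        self._store: Dict[str, Metadata] = {}

    def get_or_compute(
        self, content_id: str, indexer_version: str,
        compute: Callable[[str], Metadata],
    ) -> Metadata:
        # indexer_version is deliberately unused: this is the bug.
        if content_id not in self._store:
            self._store[content_id] = compute(content_id)
        return self._store[content_id]


class VersionedCache:
    """Caches results by (content id, indexer version)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], Metadata] = {}

    def get_or_compute(
        self, content_id: str, indexer_version: str,
        compute: Callable[[str], Metadata],
    ) -> Metadata:
        key = (content_id, indexer_version)
        if key not in self._store:
            self._store[key] = compute(content_id)
        return self._store[key]


def indexer_v1(content_id: str) -> Metadata:
    """Pretend old indexer: extracts only the package name."""
    return {"name": "pkg"}


def indexer_v2(content_id: str) -> Metadata:
    """Pretend updated indexer: also extracts the license."""
    return {"name": "pkg", "license": "MIT"}


naive = NaiveCache()
naive.get_or_compute("c1", "1.0", indexer_v1)
stale = naive.get_or_compute("c1", "2.0", indexer_v2)  # still the v1 result

versioned = VersionedCache()
versioned.get_or_compute("c1", "1.0", indexer_v1)
fresh = versioned.get_or_compute("c1", "2.0", indexer_v2)  # recomputed with v2
```

In this sketch `stale` lacks the `license` key while `fresh` has it. Making a real cache version-aware would also require deciding when old entries are dropped, which is the kind of complexity the summary argues is not worth it.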

Diff Detail

rDCIDX Metadata indexer
No Linters Available
No Unit Test Coverage
Build Status
Buildable 7759
Build 11155: tox-on-jenkins (Jenkins)
Build 11154: arc lint + arc unit

Event Timeline

vlorentz created this revision. Aug 9 2019, 4:15 PM
vlorentz planned changes to this revision. Aug 12 2019, 10:11 AM
vlorentz updated this revision to Diff 6201. Aug 12 2019, 1:38 PM

remove WIP

vlorentz retitled this revision from "[WIP] Drop content_metadata table and indexer." to "Drop content_metadata table and indexer." Aug 12 2019, 1:38 PM
vlorentz updated this revision to Diff 6202. Aug 12 2019, 1:38 PM

fix base commit

ardumont requested changes to this revision. Aug 26 2019, 3:41 PM
ardumont added a subscriber: ardumont.

I'm not exactly sure of what that is (no real description attached to the diff).

I'm missing the migration script (-> required changes).


Please remove the print ;)

This revision now requires changes to proceed. Aug 26 2019, 3:41 PM
vlorentz edited the summary of this revision. Sep 11 2019, 10:38 AM
vlorentz updated this revision to Diff 6673. Sep 11 2019, 11:10 AM
  • rebase
  • remove print
  • add SQL migration
ardumont accepted this revision. Sep 11 2019, 11:18 AM
This revision is now accepted and ready to land. Sep 11 2019, 11:18 AM
vlorentz abandoned this revision. Sep 11 2019, 1:33 PM

<vlorentz> zack: so, conclusion of the content_metadata discussion: let's keep it but not use it as a cache?
<zack> vlorentz: that'd be fine with me