
rehash: Call objstorage.content_get() with a HashDict instead of single hash
Closed, Public

Authored by vlorentz on Jul 19 2022, 2:10 PM.

Details

Summary

Hash dicts are now preferred by swh-objstorage, in order to support
individual hash collisions.
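
To illustrate why a hash dict helps with individual hash collisions, here is a minimal sketch using a toy in-memory store. Everything in it is hypothetical and for illustration only: `ToyObjStorage`, its `add()` method, the local `HashDict` alias, and the faked sha1 value are assumptions, not the swh-objstorage API. Only the method name `content_get()` is taken from the title of this diff.

```python
import hashlib
from typing import Dict, List, Tuple

# Hypothetical alias for illustration: maps a hash algorithm name to a digest.
HashDict = Dict[str, bytes]


class ToyObjStorage:
    """Toy in-memory object store (an assumption, not the real swh-objstorage)."""

    def __init__(self) -> None:
        self._objects: List[Tuple[HashDict, bytes]] = []

    def add(self, data: bytes, hashes: HashDict) -> None:
        # Store the content together with all of its known hashes.
        self._objects.append((dict(hashes), data))

    def content_get(self, obj_id: HashDict) -> bytes:
        # Return the first object whose stored hashes match every hash given.
        if not obj_id:
            raise ValueError("obj_id must contain at least one hash")
        for hashes, data in self._objects:
            if all(hashes.get(algo) == digest for algo, digest in obj_id.items()):
                return data
        raise KeyError(obj_id)


# Simulate a collision on a single hash: both objects share a (faked) sha1
# but have different sha256 digests.
fake_sha1 = b"\x00" * 20
first_id = {"sha1": fake_sha1, "sha256": hashlib.sha256(b"first").digest()}
second_id = {"sha1": fake_sha1, "sha256": hashlib.sha256(b"second").digest()}

store = ToyObjStorage()
store.add(b"first", first_id)
store.add(b"second", second_id)

# Looking up by sha1 alone cannot tell the two objects apart.
ambiguous = store.content_get({"sha1": fake_sha1})

# Passing the full hash dict selects the intended object.
exact = store.content_get(second_id)
```

With a single hash as the identifier, the second object is unreachable in this toy store; with the full dict, the non-colliding hash disambiguates the lookup.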

Test Plan

Depends on D8122

Diff Detail

Repository
rDCIDX Metadata indexer
Branch
objid
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 30452
Build 47606: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 47605: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D8135 (id=29378)

Rebasing onto fa67b73d6a...

First, rewinding head to replay your work on top of it...
Applying: rehash: Call objstorage.content_get() with a HashDict instead of single hash
Changes applied before test
commit b319f2a4f54bd9ac676abf805b39cc47aff1787b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Jul 19 14:09:46 2022 +0200

    rehash: Call objstorage.content_get() with a HashDict instead of single hash
    
    Hash dicts are now prefered by swh-objstorage, in order to support
    individual hash collisions.

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/361/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/361/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jul 19 2022, 2:11 PM
Harbormaster failed remote builds in B30452: Diff 29378!

Build is green

Patch application report for D8135 (id=29378)

Rebasing onto 466108c166...

First, rewinding head to replay your work on top of it...
Applying: rehash: Call objstorage.content_get() with a HashDict instead of single hash
Changes applied before test
commit 16cb9749f0015f49345ac9be59d0c68023aa941a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Jul 19 14:09:46 2022 +0200

    rehash: Call objstorage.content_get() with a HashDict instead of single hash
    
    Hash dicts are now prefered by swh-objstorage, in order to support
    individual hash collisions.

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/452/ for more details.

ardumont added a subscriber: ardumont.

lgtm

Note: this is actually a tool, not an indexer. It was used once, a long time ago, to add the blake
columns to the content model.

We never used it after that. I wonder whether we want to keep it or simply drop it, to keep
maintenance of that part of the code to a minimum. If we do keep it, we might
want to move it somewhere else, probably turning it into a CLI that also uses the journal...

This revision is now accepted and ready to land. Aug 23 2022, 10:47 AM
This revision was landed with ongoing or failed builds. Aug 30 2022, 11:58 AM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D8135 (id=30121)

Rebasing onto 85b675fd19...

Current branch diff-target is up to date.
Changes applied before test
commit 42cb37769714e245803023ccf02105006dd0e474
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Jul 19 14:09:46 2022 +0200

    rehash: Call objstorage.content_get() with a HashDict instead of single hash
    
    Hash dicts are now prefered by swh-objstorage, in order to support
    individual hash collisions.

See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/466/ for more details.