ContentIndexer: convert hash to bytes.
Closed, Public

Authored by vlorentz on Dec 20 2018, 6:35 PM.

Details

Summary

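When a content hash is received via the scheduler (which passes task arguments as JSON), it arrives as a string rather than bytes, so the ContentIndexer now converts it to bytes.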

The RevisionIndexer uses the exact same conversion code.
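The conversion described above can be sketched as follows. This is a minimal illustration, not the code from this diff: `normalize_hash` is a hypothetical helper name, and it uses only the standard library (`bytes.fromhex`), whereas the actual change may rely on the project's own hash utilities.

```python
import json


def normalize_hash(content_id):
    """Return a hash as bytes, accepting either bytes or a hex string.

    Hypothetical helper for illustration only; not part of swh.indexer.
    """
    if isinstance(content_id, bytes):
        # Already in the internal representation: return unchanged.
        return content_id
    if isinstance(content_id, str):
        # JSON has no bytes type, so hashes sent through a JSON-based
        # scheduler arrive as hex strings.
        return bytes.fromhex(content_id)
    raise TypeError("expected bytes or str, got %s" % type(content_id).__name__)


# Task arguments as they appear in the test plan, decoded from JSON.
args = json.loads('[["6dfe5dd2ab86d1ad3677285155027332fb35e9e5"]]')
sha1s = [normalize_hash(h) for h in args[0]]
```

Accepting both types keeps direct callers (which pass bytes) and scheduler-triggered tasks (which pass strings) on the same code path, at the cost of the silent dual-type handling discussed in the review.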

Test Plan
docker-compose up
python3 -m swh.loader.git.loader --origin-url https://github.com/SoftwareHeritage/swh-storage.git
echo 'indexer_mimetype;oneshot;[["6dfe5dd2ab86d1ad3677285155027332fb35e9e5"]];{"policy_update": "update-dups"}' | python3 -m swh.scheduler.cli --cls remote --url http://localhost:5008/ task schedule /dev/stdin -c type -c policy -c args -c kwargs -d ';'

Diff Detail

Repository
rDCIDX Metadata indexer
Branch
content-indexer-hash-type
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 3241
Build 4164: tox-on-jenkins (Jenkins)
Build 4163: arc lint + arc unit

Event Timeline

olasd added a subscriber: olasd.

I'm not too fond of silently accepting both data types (I'd rather have limitations of the scheduler handled on the task side rather than here) but I guess that's fine.

This also probably needs test coverage so we don't break it in the future.

swh/indexer/indexer.py, line 339:

needs to be fixed

This revision is now accepted and ready to land. Dec 21 2018, 3:37 PM
In D874#18797, @olasd wrote:

I'm not too fond of silently accepting both data types (I'd rather have limitations of the scheduler handled on the task side rather than here) but I guess that's fine.

I agree, but we already do it all over the indexers :/

This also probably needs test coverage so we don't break it in the future.

It's already tested, modulo D876

This revision was automatically updated to reflect the committed changes.