In the Indexer Storage API, most get methods (e.g. content_ctags_get) yield items in this format:
{"id": sha1, "tool": TOOL, "ctags": ctags1}
{"id": sha1, "tool": TOOL, "ctags": ctags2}
Starting with T782/D301, content_fossology_license_get yields items in this format:
{sha1: {"tool": TOOL, "licenses": [license1, license2]}}
This task is twofold:
- first, change content_fossology_license_get to return a single dictionary instead of yielding dictionaries that each hold a single key-value pair
- second, refactor the other _get methods to use the same format.
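To make the target format concrete, here is a minimal sketch of how flat items could be grouped into one dictionary keyed by id. The helper name group_by_id, the TOOL value and the sample ids are hypothetical and do not come from the code base. It also assumes each id is associated with a single tool, which the format above implies but does not state.

```python
# Hypothetical placeholder for a tool description; not the real tool dict.
TOOL = {"name": "example-tool"}


def group_by_id(items, key):
    """Collapse flat items into one dictionary keyed by content id.

    Each input item looks like {"id": ..., "tool": ..., key: value}.
    The result looks like {id: {"tool": ..., key: [value, ...]}}.

    Assumption: one tool per id. A second tool for the same id raises
    ValueError instead of being silently dropped.
    """
    result = {}
    for item in items:
        entry = result.setdefault(item["id"], {"tool": item["tool"], key: []})
        if entry["tool"] != item["tool"]:
            raise ValueError("more than one tool for id %r" % (item["id"],))
        entry[key].append(item[key])
    return result


# Flat items, in the format most get methods yield today.
flat = [
    {"id": b"sha1-a", "tool": TOOL, "ctags": "ctags1"},
    {"id": b"sha1-a", "tool": TOOL, "ctags": "ctags2"},
    {"id": b"sha1-b", "tool": TOOL, "ctags": "ctags3"},
]

grouped = group_by_id(flat, "ctags")
```

If several tools per id have to be supported, the value for each id would need to become a list of such entries; that choice is left open here.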
The files that should be edited are:
- swh/indexer/tests/storage/test_storage.py: these are the test cases for both Indexer Storage implementations. They should be adapted to test for the new format.
- swh/indexer/storage/in_memory.py: a fully in-memory implementation of the Indexer Storage. This is the easiest implementation to start with.
- swh/indexer/storage/__init__.py and swh/indexer/storage/converters.py: an implementation of the Indexer Storage backed by PostgreSQL. Look at D301 for examples of how to do it.
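As a rough illustration of the first part of the task, the sketch below contrasts a getter that yields single-key dictionaries with one that returns a single dictionary. The class LicenseStore and its method names are hypothetical stand-ins, not the actual classes in in_memory.py, and the ids and license names are made up.

```python
class LicenseStore:
    """Hypothetical in-memory store, only meant to contrast the two shapes."""

    def __init__(self):
        # id -> {"tool": ..., "licenses": [...]}
        self._data = {}

    def add(self, id_, tool, licenses):
        self._data[id_] = {"tool": tool, "licenses": list(licenses)}

    def get_yielding(self, ids):
        # Current shape: one dictionary per id, each with a single key.
        for id_ in ids:
            if id_ in self._data:
                yield {id_: self._data[id_]}

    def get(self, ids):
        # Target shape: a single dictionary covering all known ids.
        return {id_: self._data[id_] for id_ in ids if id_ in self._data}


store = LicenseStore()
tool = {"name": "example-tool"}  # hypothetical tool description
store.add(b"sha1-a", tool, ["GPL-2.0+"])
store.add(b"sha1-b", tool, ["MIT", "BSD-3-Clause"])

requested = [b"sha1-a", b"sha1-b", b"unknown"]
as_generator = list(store.get_yielding(requested))
as_dict = store.get(requested)
```

A test written against the new format can then compare the result with one expected dictionary, instead of collecting the generator into a list first.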