
discovery: Fix compatibility with storage RPC API
Closed, Public

Authored by anlambert on Sep 28 2022, 8:28 PM.

Details

Summary

The Software Heritage homemade RPC layer does not know how to serialize
set objects, so we need to pass lists as parameters to the *_missing
methods of the storage API.
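
To illustrate the failure mode and the workaround, here is a minimal sketch. It does not use the real storage client: FakeRemoteStorage and query_missing are hypothetical names, the identifiers are made-up strings, and the standard library json encoder stands in for the msgpack-based encoding, under the assumption that it behaves similarly here (it also raises TypeError when handed a set).

```python
import json
from typing import Iterable, List


class FakeRemoteStorage:
    """Hypothetical stand-in for an RPC storage client (not the real swh API)."""

    def directory_missing(self, directories: List[str]) -> List[str]:
        # Encode the arguments before "sending" them. json is only a stand-in
        # for the real encoder: it also raises TypeError when given a set.
        json.dumps({"directories": directories})
        # Pretend the server knows none of the requested objects.
        return list(directories)


def query_missing(storage: FakeRemoteStorage, ids: Iterable[str]) -> List[str]:
    # Convert to a list at the call boundary so the arguments can be encoded.
    return storage.directory_missing(list(ids))


storage = FakeRemoteStorage()
sample = {"aa11", "bb22"}  # made-up identifiers, for illustration only

try:
    storage.directory_missing(sample)  # passing the set directly
    set_was_rejected = False
except TypeError:
    set_was_rejected = True

missing = query_missing(storage, sample)  # passing a list works
```

Converting at the call site keeps set semantics inside the caller (deduplication, membership tests) while sending only a type the encoder accepts.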

I found the issue while hacking on the loaders code and testing my changes
in the docker environment; it suddenly appeared after rebasing onto the
master branch.

docker-swh-loader-1  | [2022-09-28 18:14:27,330: DEBUG/ForkPoolWorker-1] filename: webpack-0.11.8.tgz
docker-swh-loader-1  | [2022-09-28 18:14:27,330: DEBUG/ForkPoolWorker-1] filepath: /tmp/tmpo2shpn1v/webpack-0.11.8.tgz
docker-swh-loader-1  | [2022-09-28 18:14:27,338: DEBUG/ForkPoolWorker-1] extrinsic_metadata
docker-swh-loader-1  | [2022-09-28 18:14:27,400: DEBUG/ForkPoolWorker-1] uncompressed_path: /tmp/tmpo2shpn1v/src
docker-swh-loader-1  | [2022-09-28 18:14:27,453: ERROR/ForkPoolWorker-1] Failed to load branch releases/0.11.8 for https://www.npmjs.com/package/webpack
docker-swh-loader-1  | Traceback (most recent call last):
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 684, in load
docker-swh-loader-1  |     res = self._load_release(p_info, origin)
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 869, in _load_release
docker-swh-loader-1  |     (uncompressed_path, directory) = self._load_directory(dl_artifacts, tmpdir)
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 837, in _load_directory
docker-swh-loader-1  |     contents, skipped_contents, directories, self.storage
docker-swh-loader-1  |   File "/usr/local/lib/python3.7/asyncio/runners.py", line 43, in run
docker-swh-loader-1  |     return loop.run_until_complete(main)
docker-swh-loader-1  |   File "/usr/local/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
docker-swh-loader-1  |     return future.result()
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/core/discovery.py", line 248, in filter_known_objects
docker-swh-loader-1  |     sample = await graph.get_sample()
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/core/discovery.py", line 184, in do_query
docker-swh-loader-1  |     known = set(sample_per_type)
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/core/discovery.py", line 93, in directory_missing
docker-swh-loader-1  |     logger.debug(directories)
docker-swh-loader-1  |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 188, in meth_
docker-swh-loader-1  |     return self._post(meth._endpoint_path, post_data)
docker-swh-loader-1  |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 267, in _post
docker-swh-loader-1  |     data = self._encode_data(data)
docker-swh-loader-1  |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 286, in _encode_data
docker-swh-loader-1  |     return encode_data(data, extra_encoders=self.extra_type_encoders)
docker-swh-loader-1  |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/serializers.py", line 129, in encode_data_client
docker-swh-loader-1  |     return msgpack_dumps(data, extra_encoders=extra_encoders)
docker-swh-loader-1  |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/serializers.py", line 278, in msgpack_dumps
docker-swh-loader-1  |     default=encode_types,
docker-swh-loader-1  |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/msgpack/__init__.py", line 38, in packb
docker-swh-loader-1  |     return Packer(**kwargs).pack(o)
docker-swh-loader-1  |   File "msgpack/_packer.pyx", line 294, in msgpack._cmsgpack.Packer.pack
docker-swh-loader-1  |   File "msgpack/_packer.pyx", line 300, in msgpack._cmsgpack.Packer.pack
docker-swh-loader-1  |   File "msgpack/_packer.pyx", line 297, in msgpack._cmsgpack.Packer.pack
docker-swh-loader-1  |   File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
docker-swh-loader-1  |   File "msgpack/_packer.pyx", line 291, in msgpack._cmsgpack.Packer._pack
docker-swh-loader-1  | TypeError: can not serialize 'set' object

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D8573 (id=30927)

Rebasing onto 1facea3cd2...

Current branch diff-target is up to date.
Changes applied before test
commit 7375a83c2fba8d0265023024d499a65db2ca6b60
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Sep 28 20:23:02 2022 +0200

    discovery: Fix compatibility with storage RPC API
    
    The Software Heritage homemade RPC layer does not know how to serialize
    set objects, so we need to pass lists as parameters to the *_missing
    methods of the storage API.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/908/ for more details.

This revision is now accepted and ready to land. Sep 28 2022, 8:31 PM