Some package.json files are encoded in something other than ASCII/UTF-8.
Detect the file encoding with chardet before parsing such a file.
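
A minimal sketch of the idea, not the actual loader change: read the raw bytes, ask chardet for its best guess, decode with it, then parse. The function name load_json_detect_encoding and the "utf-8-sig" fallback are assumptions made for illustration. The guarded import is there only so the sketch still runs where chardet is not installed.

```python
import json

try:
    import chardet  # third-party dependency: pip install chardet
except ImportError:  # assumption: keep the sketch runnable without it
    chardet = None


def load_json_detect_encoding(raw: bytes, fallback: str = "utf-8-sig"):
    """Decode raw bytes with a detected encoding, then parse them as JSON.

    Hypothetical helper; the name and the fallback are illustrative only.
    """
    # chardet.detect returns a dict whose "encoding" entry may be None
    # when no guess could be made (for example on empty input).
    detected = chardet.detect(raw)["encoding"] if chardet else None
    encoding = detected or fallback
    try:
        text = raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Wrong guess or unknown codec name: retry with the fallback and
        # replace undecodable bytes instead of failing.
        text = raw.decode(fallback, errors="replace")
    return json.loads(text)
```

The file has to be opened in binary mode so that no implicit decoding happens before detection, for example load_json_detect_encoding(open(path, "rb").read()). Detection is a heuristic, so a guess on a short or unusual file can still be wrong.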
Previously, the following errors were raised:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 893, in load
    more_data_to_fetch = self.fetch_data()
  File "/usr/lib/python3/dist-packages/swh/loader/npm/loader.py", line 203, in fetch_data
    data = next(self.new_versions)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 145, in prepare_package_versions
    version_data)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 197, in _prepare_package_version
    package_json = json.load(package_json_file)
  File "/usr/lib/python3.5/json/__init__.py", line 265, in load
    return loads(fp.read(),
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd in position 183: invalid continuation byte

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 893, in load
    more_data_to_fetch = self.fetch_data()
  File "/usr/lib/python3/dist-packages/swh/loader/npm/loader.py", line 203, in fetch_data
    data = next(self.new_versions)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 145, in prepare_package_versions
    version_data)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 197, in _prepare_package_version
    package_json = json.load(package_json_file)
  File "/usr/lib/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.5/json/__init__.py", line 315, in loads
    s, 0)
json.decoder.JSONDecodeError: Unexpected UTF-8 BOM (decode using utf-8-sig): line 1 column 1 (char 0)

Related T1644