Since the release of werkzeug 2.1.0, the TestRemoteObjStorage::test_content_iterator test has been failing with the error below:
(swh) ✘-1 ~/swh/swh-environment/swh-objstorage [master|✚ 1] 15:00 $ pytest -sv swh/objstorage/tests/test_objstorage_api.py::TestRemoteObjStorage::test_content_iterator ================================================================================================================================== test session starts ================================================================================================================================== platform linux -- Python 3.9.2, pytest-7.1.1, pluggy-1.0.0 -- /home/anlambert/.virtualenvs/swh/bin/python cachedir: .pytest_cache hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/anlambert/swh/swh-environment/swh-objstorage/.hypothesis/examples') rootdir: /home/anlambert/swh/swh-environment/swh-objstorage, configfile: pytest.ini plugins: redis-2.4.0, postgresql-3.1.3, forked-1.4.0, hypothesis-6.40.0, mock-3.7.0, flask-1.2.0, xdist-2.5.0, cov-3.0.0, requests-mock-1.9.3, django-test-migrations-1.2.0, django-4.5.2, dash-2.3.1, asyncio-0.18.3, swh.core-2.4.0, swh.journal-1.0.1.dev3+g3771edb asyncio: mode=legacy collected 1 item swh/objstorage/tests/test_objstorage_api.py::TestRemoteObjStorage::test_content_iterator FAILED ======================================================================================================================================= FAILURES ======================================================================================================================================== ______________________________________________________________________________________________________________________ TestRemoteObjStorage.test_content_iterator _______________________________________________________________________________________________________________________ self = <urllib3.response.HTTPResponse object at 0x7f5da43faa90> def _update_chunk_length(self): # First, we'll figure out length of a chunk and then # we'll try to read it from socket. 
if self.chunk_left is not None: return line = self._fp.fp.readline() line = line.split(b";", 1)[0] try: > self.chunk_left = int(line, 16) E ValueError: invalid literal for int() with base 16: b'' ../../../.virtualenvs/swh/lib/python3.9/site-packages/urllib3/response.py:700: ValueError During handling of the above exception, another exception occurred: self = <urllib3.response.HTTPResponse object at 0x7f5da43faa90> @contextmanager def _error_catcher(self): """ Catch low-level python exceptions, instead re-raising urllib3 variants, so that low-level exceptions are not leaked in the high-level api. On exit, release the connection back to the pool. """ clean_exit = False try: try: > yield ../../../.virtualenvs/swh/lib/python3.9/site-packages/urllib3/response.py:441: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <urllib3.response.HTTPResponse object at 0x7f5da43faa90>, amt = 4096, decode_content = True def read_chunked(self, amt=None, decode_content=None): """ Similar to :meth:`HTTPResponse.read`, but with an additional parameter: ``decode_content``. :param amt: How much of the content to read. If specified, caching is skipped because it doesn't make sense to cache partial content as the full response. :param decode_content: If True, will attempt to decode the body based on the 'content-encoding' header. """ self._init_decoder() # FIXME: Rewrite this method and make it a class with a better structured logic. if not self.chunked: raise ResponseNotChunked( "Response is not chunked. " "Header 'transfer-encoding: chunked' is missing." ) if not self.supports_chunked_reads(): raise BodyNotHttplibCompatible( "Body should be http.client.HTTPResponse like. " "It should have have an fp attribute which returns raw chunks." 
) with self._error_catcher(): # Don't bother reading the body of a HEAD request. if self._original_response and is_response_to_head(self._original_response): self._original_response.close() return # If a response is already read and closed # then return immediately. if self._fp.fp is None: return while True: > self._update_chunk_length() ../../../.virtualenvs/swh/lib/python3.9/site-packages/urllib3/response.py:767: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <urllib3.response.HTTPResponse object at 0x7f5da43faa90> def _update_chunk_length(self): # First, we'll figure out length of a chunk and then # we'll try to read it from socket. if self.chunk_left is not None: return line = self._fp.fp.readline() line = line.split(b";", 1)[0] try: self.chunk_left = int(line, 16) except ValueError: # Invalid chunked protocol response, abort. self.close() > raise InvalidChunkLength(self, line) E urllib3.exceptions.InvalidChunkLength: InvalidChunkLength(got length b'', 0 bytes read) ../../../.virtualenvs/swh/lib/python3.9/site-packages/urllib3/response.py:704: InvalidChunkLength During handling of the above exception, another exception occurred: def generate(): # Special case for urllib3. 
if hasattr(self.raw, 'stream'): try: > for chunk in self.raw.stream(chunk_size, decode_content=True): ../../../.virtualenvs/swh/lib/python3.9/site-packages/requests/models.py:760: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <urllib3.response.HTTPResponse object at 0x7f5da43faa90>, amt = 4096, decode_content = True def stream(self, amt=2 ** 16, decode_content=None): """ A generator wrapper for the read() method. A call will block until ``amt`` bytes have been read from the connection or until the connection is closed. :param amt: How much of the content to read. The generator will return up to much data per iteration, but may return less. This is particularly likely when using compressed data. However, the empty string will never be returned. :param decode_content: If True, will attempt to decode the body based on the 'content-encoding' header. """ if self.chunked and self.supports_chunked_reads(): > for line in self.read_chunked(amt, decode_content=decode_content): ../../../.virtualenvs/swh/lib/python3.9/site-packages/urllib3/response.py:575: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <urllib3.response.HTTPResponse object at 0x7f5da43faa90>, amt = 4096, decode_content = True def read_chunked(self, amt=None, decode_content=None): """ Similar to :meth:`HTTPResponse.read`, but with an additional parameter: ``decode_content``. :param amt: How much of the content to read. If specified, caching is skipped because it doesn't make sense to cache partial content as the full response. 
:param decode_content: If True, will attempt to decode the body based on the 'content-encoding' header. """ self._init_decoder() # FIXME: Rewrite this method and make it a class with a better structured logic. if not self.chunked: raise ResponseNotChunked( "Response is not chunked. " "Header 'transfer-encoding: chunked' is missing." ) if not self.supports_chunked_reads(): raise BodyNotHttplibCompatible( "Body should be http.client.HTTPResponse like. " "It should have have an fp attribute which returns raw chunks." ) with self._error_catcher(): # Don't bother reading the body of a HEAD request. if self._original_response and is_response_to_head(self._original_response): self._original_response.close() return # If a response is already read and closed # then return immediately. if self._fp.fp is None: return while True: self._update_chunk_length() if self.chunk_left == 0: break chunk = self._handle_chunk(amt) decoded = self._decode( chunk, decode_content=decode_content, flush_decoder=False ) if decoded: yield decoded if decode_content: # On CPython and PyPy, we should never need to flush the # decoder. However, on Jython we *might* need to, so # lets defensively do it anyway. decoded = self._flush_decoder() if decoded: # Platform-specific: Jython. yield decoded # Chunk content ends with \r\n: discard it. while True: line = self._fp.fp.readline() if not line: # Some sites may not end with '\r\n'. break if line == b"\r\n": break # We read everything; close the "file". 
if self._original_response: > self._original_response.close() ../../../.virtualenvs/swh/lib/python3.9/site-packages/urllib3/response.py:796: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <contextlib._GeneratorContextManager object at 0x7f5da43fa4c0>, type = <class 'urllib3.exceptions.InvalidChunkLength'>, value = InvalidChunkLength(got length b'', 0 bytes read), traceback = <traceback object at 0x7f5da43f0940> def __exit__(self, type, value, traceback): if type is None: try: next(self.gen) except StopIteration: return False else: raise RuntimeError("generator didn't stop") else: if value is None: # Need to force instantiation so we can reliably # tell if we get the same exception back value = type() try: > self.gen.throw(type, value, traceback) /usr/lib/python3.9/contextlib.py:135: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <urllib3.response.HTTPResponse object at 0x7f5da43faa90> @contextmanager def _error_catcher(self): """ Catch low-level python exceptions, instead re-raising urllib3 variants, so that low-level exceptions are not leaked in the high-level api. On exit, release the connection back to the pool. """ clean_exit = False try: try: yield except SocketTimeout: # FIXME: Ideally we'd like to include the url in the ReadTimeoutError but # there is yet no clean way to get at it from this context. raise ReadTimeoutError(self._pool, None, "Read timed out.") except BaseSSLError as e: # FIXME: Is there a better way to differentiate between SSLErrors? 
if "read operation timed out" not in str(e): # SSL errors related to framing/MAC get wrapped and reraised here raise SSLError(e) raise ReadTimeoutError(self._pool, None, "Read timed out.") except (HTTPException, SocketError) as e: # This includes IncompleteRead. > raise ProtocolError("Connection broken: %r" % e, e) E urllib3.exceptions.ProtocolError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read)) ../../../.virtualenvs/swh/lib/python3.9/site-packages/urllib3/response.py:458: ProtocolError During handling of the above exception, another exception occurred: self = <swh.objstorage.tests.test_objstorage_api.TestRemoteObjStorage testMethod=test_content_iterator> def test_content_iterator(self): sto_obj_ids = iter(self.storage) > sto_obj_ids = list(sto_obj_ids) swh/objstorage/tests/objstorage_testing.py:229: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ swh/objstorage/api/client.py:48: in __iter__ yield from self.list_content() swh/objstorage/api/client.py:54: in list_content yield from iter_chunks( ../../../.virtualenvs/swh/lib/python3.9/site-packages/swh/core/utils.py:83: in iter_chunks new_data = next(iterator) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ def generate(): # Special case for urllib3. 
if hasattr(self.raw, 'stream'): try: for chunk in self.raw.stream(chunk_size, decode_content=True): yield chunk except ProtocolError as e: > raise ChunkedEncodingError(e) E requests.exceptions.ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read)) ../../../.virtualenvs/swh/lib/python3.9/site-packages/requests/models.py:763: ChunkedEncodingError =================================================================================================================================== warnings summary ==================================================================================================================================== ../../../.virtualenvs/swh/lib/python3.9/site-packages/pytest_asyncio/plugin.py:191 /home/anlambert/.virtualenvs/swh/lib/python3.9/site-packages/pytest_asyncio/plugin.py:191: DeprecationWarning: The 'asyncio_mode' default value will change to 'strict' in future, please explicitly use 'asyncio_mode=strict' or 'asyncio_mode=auto' in pytest configuration file. config.issue_config_time_warning(LEGACY_MODE, stacklevel=2) swh/objstorage/tests/test_objstorage_api.py::TestRemoteObjStorage::test_content_iterator /home/anlambert/swh/swh-environment/swh-objstorage/swh/objstorage/factory.py:92: DeprecationWarning: Explicit "args" key is deprecated for objstorage initialization, use class arguments keys directly instead. 
warnings.warn( -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ================================================================================================================================ short test summary info ================================================================================================================================ FAILED swh/objstorage/tests/test_objstorage_api.py::TestRemoteObjStorage::test_content_iterator - requests.exceptions.ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
I managed to identify the commit that introduced the regression using git bisect.
The main difference between the previous werkzeug release (2.0.3) and the latest one (2.1.0) is that the HTTP protocol version used by our RPC servers was bumped from 1.0 to 1.1.
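With HTTP/1.1, a streamed response with no known length is normally sent using chunked transfer encoding, which is the code path urllib3 takes in the traceback (`read_chunked`, then `_update_chunk_length`). The sketch below is a simplified standalone decoder, not urllib3's actual implementation. It shows how a body that ends before the terminating zero-length chunk produces the same `int(b'', 16)` failure. The function name `read_chunked_body` and the sample payloads are made up for illustration, and whether our server really truncates the stream this way is an assumption that still needs to be confirmed.

```python
import io


def read_chunked_body(fp):
    """Decode an HTTP/1.1 chunked body from a binary file-like object."""
    body = b""
    while True:
        # Chunk-size line: hex length, optionally followed by ";extensions".
        line = fp.readline().split(b";", 1)[0]
        # int() tolerates the trailing CRLF but raises ValueError on b"".
        size = int(line, 16)
        if size == 0:
            break  # terminating zero-length chunk: end of body
        body += fp.read(size)
        fp.readline()  # discard the CRLF that follows the chunk data
    return body


# Well-formed body: one 5-byte chunk, then the terminating "0" chunk.
complete = io.BytesIO(b"5\r\nhello\r\n0\r\n\r\n")
# Hypothetical truncated body: the stream ends before the "0" chunk.
truncated = io.BytesIO(b"5\r\nhello\r\n")

print(read_chunked_body(complete))
try:
    read_chunked_body(truncated)
except ValueError as exc:
    print("truncated body:", exc)
```

If this is what happens, the client reads an empty chunk-size line at end of stream, which urllib3 reports as `InvalidChunkLength(got length b'', 0 bytes read)`.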