Rebase
Jan 12 2022
Rebase
Jan 11 2022
Nevertheless, we can still run into this issue if the ? gets quoted to %3F in the Web API URL. I will handle that case in another diff.
In D6917#179774, @vlorentz wrote: oops, I misunderstood the original code
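To illustrate the quoting problem mentioned above, here is a minimal sketch using only the Python standard library. The URL is a made-up example and this is not the Web API's actual code path; it only shows that once ? is percent-encoded to %3F, the URL no longer has a query component until it is unquoted again.

```python
from urllib.parse import quote, unquote, urlsplit

# Hypothetical origin URL, used for illustration only.
origin_url = "https://example.org/repo?view=log"

# Quoting everything except "/" and ":" turns "?" into "%3F" (and "=" into "%3D").
quoted = quote(origin_url, safe="/:")

# The unquoted URL has a query string; the quoted one does not, because the
# "?" separator has been folded into the path.
original_query = urlsplit(origin_url).query
quoted_query = urlsplit(quoted).query

# Unquoting restores the original URL.
restored = unquote(quoted)
```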
Looks good to me. I do not know the codebase of swh-graph really well but I do not see what could go wrong considering the few changes.
Jan 10 2022
Looks good to me.
Looks good to me.
Use try/except/else block
I will do it in another diff.
In D6898#179397, @vlorentz wrote: Would it make sense to hide the "history" button from the directory view?
Ah right, we now have snapshots that contain only releases targeting directories (typically those generated by the package loaders).
In T3831#76627, @seirl wrote: I made a temporary fix in D6893, it doesn't solve the underlying issue but greatly decreases the probability of it happening. I'm not quite sure what would be a proper test for this endpoint, but this is at least enough to fix this issue in particular.
Jan 7 2022
Fix edge case encountered while testing on real world repositories.
Abandon this in favor of D6891.
After more tests using nginx, I stumbled across this error:
swh-loader_1 | [2022-01-07 10:13:48,333: DEBUG/ForkPoolWorker-1] Flushing 1508 objects of type content (359031722 bytes)
swh-loader_1 | [2022-01-07 10:13:48,713: ERROR/ForkPoolWorker-1] Loading failure, updating to `failed` status
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 328, in raise_for_status
swh-loader_1 |     exception = pickle.loads(data)
swh-loader_1 | TypeError: a bytes-like object is required, not 'str'
swh-loader_1 |
swh-loader_1 | During handling of the above exception, another exception occurred:
swh-loader_1 |
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/loader/core/loader.py", line 339, in load
swh-loader_1 |     self.store_data()
swh-loader_1 |   File "/src/swh-loader-svn/swh/loader/svn/loader.py", line 486, in store_data
swh-loader_1 |     self.storage.content_add(self._contents)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 159, in content_add
swh-loader_1 |     return self.flush(["content"])
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 286, in flush
swh-loader_1 |     stats = add_fn(list(batch))
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/filter.py", line 58, in content_add
swh-loader_1 |     [x for x in content if x.sha256 in contents_to_add]
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 45, in content_add
swh-loader_1 |     return self.post("content/add", {"content": content})
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 278, in post
swh-loader_1 |     return self._decode_response(response)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 354, in _decode_response
swh-loader_1 |     self.raise_for_status(response)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 29, in raise_for_status
swh-loader_1 |     super().raise_for_status(response)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 341, in raise_for_status
swh-loader_1 |     raise RemoteException(payload=data, response=response)
swh-loader_1 | swh.core.api.RemoteException: <html>
swh-loader_1 | <head><title>413 Request Entity Too Large</title></head>
swh-loader_1 | <body>
swh-loader_1 | <center><h1>413 Request Entity Too Large</h1></center>
swh-loader_1 | <hr><center>nginx/1.21.3</center>
swh-loader_1 | </body>
swh-loader_1 | </html>
swh-loader_1 |
swh-loader_1 | [2022-01-07 10:13:48,743: DEBUG/ForkPoolWorker-1] Flushing 1508 objects of type content (359031722 bytes)
swh-loader_1 | [2022-01-07 10:13:49,327: ERROR/ForkPoolWorker-1] Task swh.loader.svn.tasks.DumpMountAndLoadSvnRepository[aa981fcc-863b-4e79-8280-9681b0a6f7fa] raised unexpected: RemoteException('<html>\r\n<head><title>413 Request Entity Too Large</title></head>\r\n<body>\r\n<center><h1>413 Request Entity Too Large</h1></center>\r\n<hr><center>nginx/1.21.3</center>\r\n</body>\r\n</html>\r\n')
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 328, in raise_for_status
swh-loader_1 |     exception = pickle.loads(data)
swh-loader_1 | TypeError: a bytes-like object is required, not 'str'
swh-loader_1 |
swh-loader_1 | During handling of the above exception, another exception occurred:
swh-loader_1 |
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 450, in trace_task
swh-loader_1 |     R = retval = fun(*args, **kwargs)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/scheduler/task.py", line 55, in __call__
swh-loader_1 |     result = super().__call__(*args, **kwargs)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 731, in __protected_call__
swh-loader_1 |     return self.run(*args, **kwargs)
swh-loader_1 |   File "/src/swh-loader-svn/swh/loader/svn/tasks.py", line 113, in load_svn_from_remote_dump
swh-loader_1 |     return loader.load()
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/loader/core/loader.py", line 382, in load
swh-loader_1 |     self.flush()
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/loader/core/loader.py", line 168, in flush
swh-loader_1 |     self.storage.flush()
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 286, in flush
swh-loader_1 |     stats = add_fn(list(batch))
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/filter.py", line 58, in content_add
swh-loader_1 |     [x for x in content if x.sha256 in contents_to_add]
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 45, in content_add
swh-loader_1 |     return self.post("content/add", {"content": content})
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 278, in post
swh-loader_1 |     return self._decode_response(response)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 354, in _decode_response
swh-loader_1 |     self.raise_for_status(response)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/api/client.py", line 29, in raise_for_status
swh-loader_1 |     super().raise_for_status(response)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 341, in raise_for_status
swh-loader_1 |     raise RemoteException(payload=data, response=response)
swh-loader_1 | swh.core.api.RemoteException: <html>
swh-loader_1 | <head><title>413 Request Entity Too Large</title></head>
swh-loader_1 | <body>
swh-loader_1 | <center><h1>413 Request Entity Too Large</h1></center>
swh-loader_1 | <hr><center>nginx/1.21.3</center>
swh-loader_1 | </body>
swh-loader_1 | </html>
@olasd, so I tested the nginx approach. First I configured the storage to use the nginx proxy but I encountered the following error at the objstorage level:
swh-storage_1 | Traceback (most recent call last):
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
swh-storage_1 |     rv = self.dispatch_request()
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
swh-storage_1 |     return self.view_functions[rule.endpoint](**req.view_args)
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/negotiation.py", line 153, in newf
swh-storage_1 |     return f.negotiator(*args, **kwargs)
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/negotiation.py", line 81, in __call__
swh-storage_1 |     result = self.func(*args, **kwargs)
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 460, in _f
swh-storage_1 |     return obj_meth(**kw)
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/metrics.py", line 24, in d
swh-storage_1 |     return f(*a, **kw)
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/metrics.py", line 77, in d
swh-storage_1 |     r = f(*a, **kw)
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/postgresql/storage.py", line 241, in content_add
swh-storage_1 |     objstorage_summary = self.objstorage.content_add(contents)
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/objstorage.py", line 62, in content_add
swh-storage_1 |     summary = self.objstorage.add_batch({cont.sha1: cont.data for cont in contents})
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/objstorage/api/client.py", line 47, in add_batch
swh-storage_1 |     {"contents": contents, "check_presence": check_presence,},
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 272, in post
swh-storage_1 |     **opts,
swh-storage_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 256, in raw_verb
swh-storage_1 |     raise self.api_exception(e)
swh-storage_1 | swh.objstorage.exc.ObjStorageAPIError: An unexpected error occurred in the api backend: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
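The TypeError at the top of the loader traceback above (the one ending in 413 Request Entity Too Large) looks like a side effect rather than the root cause: the client tries pickle.loads on the error payload, but nginx answered with an HTML page that was decoded to str. A minimal sketch reproducing just that behaviour with the standard library follows; the html_body value is a shortened stand-in, and nothing here is taken from swh.core beyond the pickle.loads call visible in the trace.

```python
import pickle

# Shortened stand-in for the HTML error page returned by the proxy.
html_body = "<html><head><title>413 Request Entity Too Large</title></head></html>"

# Expected case: the server sends a pickled exception as bytes.
pickled_error = pickle.dumps(ValueError("storage error"))
decoded = pickle.loads(pickled_error)

# Observed case: the payload is text, so pickle.loads raises TypeError
# before the client gets a chance to report the HTTP 413 itself.
try:
    pickle.loads(html_body)
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None
```

This is why the 413 only shows up in the second, chained exception (RemoteException) rather than in the first one.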
Jan 6 2022
Any logs in the storage container?
In D6874#178947, @vlorentz wrote: Can you reproduce it outside Docker?
Jan 5 2022
Rebase
rebase
rebase
In D6881#178901, @ardumont wrote: Thanks.
I guess it has something to do with your class merge last month or so (on that module).
That's no longer useful so yeah, let's drop it.
Pin core-js to 3.20.1 and deduplicate entries in yarn.lock
core-js 3.20.1 OK, bump to 3.20.2
core-js 3.20 OK, bump to 3.20.1
Check if core-js 3.20 introduces build failure.
It has been implemented in the archive coverage widget that can be found on the homepage of the webapp, closing this.
Hypothesis strategies have been replaced by pytest fixtures in swh-web tests (T3603), closing this.
So I managed to reproduce the issue I encountered. It turns out it was not a storage timeout but rather a Connection reset by peer error; see the stack trace below:
swh-loader_1 | [2022-01-04 21:07:53,446: DEBUG/ForkPoolWorker-1] Flushing 1 objects of type revision (1 parents, 145 estimated bytes)
swh-loader_1 | [2022-01-04 21:07:53,500: DEBUG/ForkPoolWorker-1] Flushing 2012 objects of type directory (39967 entries)
swh-loader_1 | [2022-01-04 21:07:55,932: DEBUG/ForkPoolWorker-1] rev: 5232, swhrev: a6c32f5ad6136e57f6b1a02726296fd9b274f337, dir: 3d3432d677755ecb3371a13416b7de5f9c6ffa73
swh-loader_1 | [2022-01-04 21:07:55,972: DEBUG/ForkPoolWorker-1] Flushing 13093 objects of type content (159073357 bytes)
swh-loader_1 | [2022-01-04 21:07:56,414: DEBUG/ForkPoolWorker-1] Flushing 1 objects of type revision (1 parents, 143 estimated bytes)
swh-loader_1 | [2022-01-04 21:07:56,463: DEBUG/ForkPoolWorker-1] Flushing 2012 objects of type directory (39967 entries)
swh-loader_1 | [2022-01-04 21:07:58,587: DEBUG/ForkPoolWorker-1] rev: 5233, swhrev: c55279cf36f060d8a0c62c21832db07dedb84044, dir: 0e2657f6e50ebc418960e7712a3456b7bc52b65c
swh-loader_1 | [2022-01-04 21:07:58,627: DEBUG/ForkPoolWorker-1] Flushing 13093 objects of type content (159073290 bytes)
swh-loader_1 | [2022-01-04 21:07:58,652: ERROR/ForkPoolWorker-1] Loading failure, updating to `failed` status
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
swh-loader_1 |     chunked=chunked,
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
swh-loader_1 |     conn.request(method, url, **httplib_request_kw)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/urllib3/connection.py", line 239, in request
swh-loader_1 |     super(HTTPConnection, self).request(method, url, body=body, headers=headers)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 1281, in request
swh-loader_1 |     self._send_request(method, url, body, headers, encode_chunked)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 1327, in _send_request
swh-loader_1 |     self.endheaders(body, encode_chunked=encode_chunked)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 1276, in endheaders
swh-loader_1 |     self._send_output(message_body, encode_chunked=encode_chunked)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 1075, in _send_output
swh-loader_1 |     self.send(chunk)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 997, in send
swh-loader_1 |     self.sock.sendall(data)
swh-loader_1 | ConnectionResetError: [Errno 104] Connection reset by peer
swh-loader_1 |
swh-loader_1 | During handling of the above exception, another exception occurred:
swh-loader_1 |
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/requests/adapters.py", line 450, in send
swh-loader_1 |     timeout=timeout
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
swh-loader_1 |     method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/urllib3/util/retry.py", line 532, in increment
swh-loader_1 |     raise six.reraise(type(error), error, _stacktrace)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/urllib3/packages/six.py", line 769, in reraise
swh-loader_1 |     raise value.with_traceback(tb)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
swh-loader_1 |     chunked=chunked,
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
swh-loader_1 |     conn.request(method, url, **httplib_request_kw)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/urllib3/connection.py", line 239, in request
swh-loader_1 |     super(HTTPConnection, self).request(method, url, body=body, headers=headers)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 1281, in request
swh-loader_1 |     self._send_request(method, url, body, headers, encode_chunked)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 1327, in _send_request
swh-loader_1 |     self.endheaders(body, encode_chunked=encode_chunked)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 1276, in endheaders
swh-loader_1 |     self._send_output(message_body, encode_chunked=encode_chunked)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 1075, in _send_output
swh-loader_1 |     self.send(chunk)
swh-loader_1 |   File "/usr/local/lib/python3.7/http/client.py", line 997, in send
swh-loader_1 |     self.sock.sendall(data)
swh-loader_1 | urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
swh-loader_1 |
swh-loader_1 | During handling of the above exception, another exception occurred:
swh-loader_1 |
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 254, in raw_verb
swh-loader_1 |     return getattr(self.session, verb)(self._url(endpoint), **opts)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/requests/sessions.py", line 577, in post
swh-loader_1 |     return self.request('POST', url, data=data, json=json, **kwargs)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/requests/sessions.py", line 529, in request
swh-loader_1 |     resp = self.send(prep, **send_kwargs)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/requests/sessions.py", line 645, in send
swh-loader_1 |     r = adapter.send(request, **kwargs)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/requests/adapters.py", line 501, in send
swh-loader_1 |     raise ConnectionError(err, request=request)
swh-loader_1 | requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
swh-loader_1 |
swh-loader_1 | During handling of the above exception, another exception occurred:
swh-loader_1 |
swh-loader_1 | Traceback (most recent call last):
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/loader/core/loader.py", line 339, in load
swh-loader_1 |     self.store_data()
swh-loader_1 |   File "/src/swh-loader-svn/swh/loader/svn/loader.py", line 489, in store_data
swh-loader_1 |     self.storage.content_add(self._contents)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 153, in content_add
swh-loader_1 |     keys=["sha1", "sha1_git", "sha256", "blake2s256"],
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 224, in object_add
swh-loader_1 |     return self.flush()
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/buffer.py", line 286, in flush
swh-loader_1 |     stats = add_fn(list(batch))
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/filter.py", line 54, in content_add
swh-loader_1 |     contents_to_add = self._filter_missing_contents(content)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/storage/proxies/filter.py", line 113, in _filter_missing_contents
swh-loader_1 |     return set(self.storage.content_missing(missing_contents, key_hash="sha256",))
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 181, in meth_
swh-loader_1 |     return self.post(meth._endpoint_path, post_data)
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 272, in post
swh-loader_1 |     **opts,
swh-loader_1 |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/api/__init__.py", line 256, in raw_verb
swh-loader_1 |     raise self.api_exception(e)
swh-loader_1 | swh.storage.exc.StorageAPIError: An unexpected error occurred in the api backend: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
swh-loader_1 | [2022-01-04 21:07:58,683: DEBUG/ForkPoolWorker-1] Flushing 13093 objects of type content (159073290 bytes)
swh-loader_1 | [2022-01-04 21:07:59,137: DEBUG/ForkPoolWorker-1] Flushing 1 objects of type revision (1 parents, 164 estimated bytes)
swh-loader_1 | [2022-01-04 21:07:59,171: ERROR/ForkPoolWorker-1] NOT FOR PRODUCTION - debug flag activated
swh-loader_1 | Local repository not cleaned up for investigation: /tmp/swh.loader.svn.arfljmvh-128/tmp4uusn8rz
swh-loader_1 | [2022-01-04 21:07:59,771: INFO/ForkPoolWorker-1] Task swh.loader.svn.tasks.DumpMountAndLoadSvnRepository[4f2645f8-b3bd-4c00-9dbb-6f98f3a610f3] succeeded in 11831.774478806s: {'status': 'failed'}
Jan 4 2022
Fix docstring
Rebase and adapt according to @douardda reviews
In D6839#178745, @douardda wrote: Gawd this is horrible! (not your fault!)
In D6874#178709, @olasd wrote: What kinds of storage timeouts? I'm not against this on principle, but I'm a bit worried that this could be masking a real bug.
Banner has been updated and deployed, closing this.
Thanks a lot for fixing this and the detailed report, this is pretty terrifying. I do not get why the Stripe API does not have some rate limit to prevent such abuse.