
indexer-license: Investigate timeouts
Open, Normal, Public

Description

Only 1 worker is currently running.
With such a setup, I expect only 1 query in the backend at a time.
That's not what pg_activity -p 5434 (softwareheritage-indexer) currently shows.

In the meantime, the current worker shows the following stacktrace [1].

So my take on this is that the query (using index scans as designed) works on a range too large for it to finish before the gateway times out (the 504 in [1]).
What's not expected though is that the worker side fails as in [1] while the query in the backend (indexer-storage's db) happily keeps running.
Thus the load on somerset keeps growing...

Maybe the following plan would be acceptable:

  • adding some @timeout on the indexer-storage's storage api (as we do in swh-storage's)
  • and reworking the ranges defined in the scheduler for the fossology-license indexer (if my memory serves me well, 100k range tasks were created; we should reduce the size of those ranges, thus increasing the number of tasks)
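
A minimal sketch of both points of the plan, to make the discussion concrete. It is not the swh-storage or swh-scheduler code: with_statement_timeout and split_sha1_range are hypothetical names. It assumes that the endpoint function receives a DB-API cursor as its first argument and runs inside a transaction, and that range bounds are 40-digit hex sha1 strings. The timeout relies on PostgreSQL's statement_timeout setting, which makes the server cancel the query itself instead of letting it outlive the failed HTTP request.

```python
import functools


def with_statement_timeout(ms):
    """Hypothetical decorator: cap the run time of the wrapped function's statements.

    Assumes the wrapped function takes a DB-API cursor as first argument and is
    called inside a transaction (SET LOCAL only lasts until the transaction ends).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(cur, *args, **kwargs):
            # PostgreSQL cancels any statement running longer than `ms` milliseconds.
            cur.execute(f"SET LOCAL statement_timeout = {int(ms)}")
            return func(cur, *args, **kwargs)
        return wrapper
    return decorator


def split_sha1_range(start_hex, end_hex, parts):
    """Split the inclusive range [start_hex, end_hex] into `parts` contiguous,
    inclusive sub-ranges of (almost) equal size, returned as hex string pairs."""
    start, end = int(start_hex, 16), int(end_hex, 16)
    if parts < 1 or end < start:
        raise ValueError("need parts >= 1 and start <= end")
    total = end - start + 1
    parts = min(parts, total)  # never produce empty sub-ranges
    ranges = []
    lo = start
    for i in range(parts):
        # Spread the remainder over the first sub-ranges.
        size = total // parts + (1 if i < total % parts else 0)
        hi = lo + size - 1
        ranges.append((format(lo, "040x"), format(hi, "040x")))
        lo = hi + 1
    return ranges
```

Each (start, end) pair returned by split_sha1_range could then become one smaller scheduler task. The timeout value and the number of sub-ranges would have to be tuned against what the indexer db can actually serve.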

[1]

Jun 07 06:17:53 worker08 python3[123583]: [2019-06-07 06:17:53,918: INFO/MainProcess] Received task: swh.indexer.tasks.ContentRangeFossologyLicense[452abd0b-8db8-465c-9a2d-eb84d3ed90e5]
Jun 07 07:17:57 worker08 python3[59331]: [2019-06-07 07:17:57,176: ERROR/ForkPoolWorker-3] Problem when computing metadata.
                                         Traceback (most recent call last):
                                           File "/usr/lib/python3/dist-packages/swh/indexer/indexer.py", line 516, in run
                                             n=self.config['write_batch_size']):
                                           File "/usr/lib/python3/dist-packages/swh/core/utils.py", line 48, in grouper
                                             for _data in itertools.zip_longest(*args, fillvalue=stop_value):
                                           File "/usr/lib/python3/dist-packages/swh/indexer/indexer.py", line 479, in _index_with_skipping_already_done
                                             indexed_page = self.indexed_contents_in_range(start, end)
                                           File "/usr/lib/python3/dist-packages/swh/indexer/fossology_license.py", line 172, in indexed_contents_in_range
                                             start, end, self.tool['id'])
                                           File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 133, in meth_
                                             return self.post(meth._endpoint_path, post_data)
                                           File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 198, in post
                                             return self._decode_response(response)
                                           File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 235, in _decode_response
                                             response.content,
                                         swh.core.api.RemoteException: Unexpected status code for API request: 504 (b'<html>\r\n<head><title>504 Gateway Time-out</title></head>\r\n<body bgcolor="white">\r\n<center><h1>504 Gateway Time-out</h1></center>\r\n<hr><center>nginx/1.10.3</center>\r\n</body>\r\n</html>\r\n')

Event Timeline

ardumont created this task. Jun 7 2019, 10:27 AM
ardumont triaged this task as Normal priority.

In the meantime, I've stopped those indexers as this impacts others (I see transactions piling up).