Page MenuHomeSoftware Heritage

NPM lister is failing with a database update conflict
Closed, MigratedEdits Locked

Description

The npm listing is stuck since the 1st of December

softwareheritage-scheduler=> select * from task where type = 'list-npm-full';
    id     |     type      |         arguments          |           next_run            | current_interval |       status       |  policy   | retries_left | priority 
-----------+---------------+----------------------------+-------------------------------+------------------+--------------------+-----------+--------------+----------
 153874548 | list-npm-full | {"args": [], "kwargs": {}} | 2021-12-01 15:31:30.993357+00 | 12:00:00         | next_run_scheduled | recurring |            0 | 
(1 row)

The loadings are failing with this error [1] (which prevents the scheduler task's status to be updated which prevents another run) :

Dec 06 11:29:54 worker11 python3[292484]: [2021-12-06 11:29:54,941: ERROR/ForkPoolWorker-6] Task swh.lister.npm.tasks.NpmListerTask[4fe2bd37-5c48-47b8-b55e-6ee14ffd9684] raised unexpected: RemoteException({'type': 'CardinalityViolation', 'module': 'psycopg2.errors', 'args': ['ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n'], 'message': 'ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n', 'traceback': ['Traceback (most recent call last):\n', '  File "/usr/lib/python3/dist-packages/flask/app.py", line 1813, in full_dispatch_request\n    rv = self.dispatch_request()\n', '  File "/usr/lib/python3/dist-packages/flask/app.py", line 1799, in dispatch_request\n    return self.view_functions[rule.endpoint](**req.view_args)\n', '  File "/usr/lib/python3/dist-packages/swh/core/api/negotiation.py", line 153, in newf\n    return f.negotiator(*args, **kwargs)\n', '  File "/usr/lib/python3/dist-packages/swh/core/api/negotiation.py", line 81, in __call__\n    result = self.func(*args, **kwargs)\n', '  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 460, in _f\n    return obj_meth(**kw)\n', '  File "/usr/lib/python3/dist-packages/swh/core/db/common.py", line 62, in _meth\n    return meth(self, *args, db=db, cur=cur, **kwargs)\n', '  File "/usr/lib/python3/dist-packages/swh/scheduler/backend.py", line 280, in record_listed_origins\n    fetch=True,\n', '  File "/usr/lib/python3/dist-packages/psycopg2/extras.py", line 1281, in execute_values\n    cur.execute(b\'\'.join(parts))\n', '  File "/usr/lib/python3/dist-packages/psycopg2/extras.py", line 243, in execute\n    return super(RealDictCursor, self).execute(query, vars)\n', 'psycopg2.errors.CardinalityViolation: ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n\n']})
                                          Traceback (most recent call last):
                                            File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 385, in trace_task
                                              R = retval = fun(*args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/swh/scheduler/task.py", line 55, in __call__
                                              result = super().__call__(*args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 650, in __protected_call__
                                              return self.run(*args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/sentry_sdk/integrations/celery.py", line 161, in _inner
                                              reraise(*exc_info)
                                            File "/usr/lib/python3/dist-packages/sentry_sdk/_compat.py", line 57, in reraise
                                              raise value
                                            File "/usr/lib/python3/dist-packages/sentry_sdk/integrations/celery.py", line 156, in _inner
                                              return f(*args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/swh/lister/npm/tasks.py", line 14, in list_npm_full
                                              return lister.run().dict()
                                            File "/usr/lib/python3/dist-packages/swh/lister/pattern.py", line 130, in run
                                              full_stats.origins += self.send_origins(origins)
                                            File "/usr/lib/python3/dist-packages/swh/lister/pattern.py", line 234, in send_origins
                                              ret = self.scheduler.record_listed_origins(batch_origins)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 181, in meth_
                                              return self.post(meth._endpoint_path, post_data)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 278, in post
                                              return self._decode_response(response)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 354, in _decode_response
                                              self.raise_for_status(response)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 344, in raise_for_status
                                              raise exception from None
                                          swh.core.api.RemoteException: <RemoteException 500 CardinalityViolation: ['ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n']>

[1] https://sentry.softwareheritage.org/share/issue/717820b5113441a985e5f521851f4c63/

Event Timeline

vsellier changed the task status from Open to Work in Progress.Dec 6 2021, 12:29 PM
vsellier triaged this task as High priority.
vsellier created this task.

Unstuck the task scheduling:

softwareheritage-scheduler=> begin; update task set next_run=now(), status='next_run_not_scheduled' where id=153874548;
BEGIN
UPDATE 1
softwareheritage-scheduler=*> commit;
COMMIT

The last run failed with this error:

Dec 01 13:17:42 worker10 python3[3443684]: [2021-12-01 13:17:42,546: ERROR/ForkPoolWorker-7] Task swh.lister.npm.tasks.NpmListerTask[f1ad0e88-6764-4691-998b-141eb9c0d5a0] raised unexpected: RemoteException({'type': 'CardinalityViolation', 'module': 'psycopg2.errors', 'args': ['ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n'], 'message': 'ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n', 'traceback': ['Traceback (most recent call last):\n', '  File "/usr/lib/python3/dist-packages/flask/app.py", line 1813, in full_dispatch_request\n    rv = self.dispatch_request()\n', '  File "/usr/lib/python3/dist-packages/flask/app.py", line 1799, in dispatch_request\n    return self.view_functions[rule.endpoint](**req.view_args)\n', '  File "/usr/lib/python3/dist-packages/swh/core/api/negotiation.py", line 153, in newf\n    return f.negotiator(*args, **kwargs)\n', '  File "/usr/lib/python3/dist-packages/swh/core/api/negotiation.py", line 81, in __call__\n    result = self.func(*args, **kwargs)\n', '  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 460, in _f\n    return obj_meth(**kw)\n', '  File "/usr/lib/python3/dist-packages/swh/core/db/common.py", line 62, in _meth\n    return meth(self, *args, db=db, cur=cur, **kwargs)\n', '  File "/usr/lib/python3/dist-packages/swh/scheduler/backend.py", line 280, in record_listed_origins\n    fetch=True,\n', '  File "/usr/lib/python3/dist-packages/psycopg2/extras.py", line 1281, in execute_values\n    cur.execute(b\'\'.join(parts))\n', '  File "/usr/lib/python3/dist-packages/psycopg2/extras.py", line 243, in execute\n    return super(RealDictCursor, self).execute(query, vars)\n', 'psycopg2.errors.CardinalityViolation: ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n\n']})

The new listing too:

Dec 06 11:25:16 worker11 python3[279116]: [2021-12-06 11:25:16,068: INFO/MainProcess] Received task: swh.lister.npm.tasks.NpmListerTask[4fe2bd37-5c48-47b8-b55e-6ee14ffd9684]  

Dec 06 11:29:54 worker11 python3[292484]: [2021-12-06 11:29:54,941: ERROR/ForkPoolWorker-6] Task swh.lister.npm.tasks.NpmListerTask[4fe2bd37-5c48-47b8-b55e-6ee14ffd9684] raised unexpected: RemoteException({'type': 'CardinalityViolation', 'module': 'psycopg2.errors', 'args': ['ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n'], 'message': 'ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n', 'traceback': ['Traceback (most recent call last):\n', '  File "/usr/lib/python3/dist-packages/flask/app.py", line 1813, in full_dispatch_request\n    rv = self.dispatch_request()\n', '  File "/usr/lib/python3/dist-packages/flask/app.py", line 1799, in dispatch_request\n    return self.view_functions[rule.endpoint](**req.view_args)\n', '  File "/usr/lib/python3/dist-packages/swh/core/api/negotiation.py", line 153, in newf\n    return f.negotiator(*args, **kwargs)\n', '  File "/usr/lib/python3/dist-packages/swh/core/api/negotiation.py", line 81, in __call__\n    result = self.func(*args, **kwargs)\n', '  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 460, in _f\n    return obj_meth(**kw)\n', '  File "/usr/lib/python3/dist-packages/swh/core/db/common.py", line 62, in _meth\n    return meth(self, *args, db=db, cur=cur, **kwargs)\n', '  File "/usr/lib/python3/dist-packages/swh/scheduler/backend.py", line 280, in record_listed_origins\n    fetch=True,\n', '  File "/usr/lib/python3/dist-packages/psycopg2/extras.py", line 1281, in execute_values\n    cur.execute(b\'\'.join(parts))\n', '  File "/usr/lib/python3/dist-packages/psycopg2/extras.py", line 243, in execute\n    return super(RealDictCursor, self).execute(query, vars)\n', 'psycopg2.errors.CardinalityViolation: ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n\n']})
                                          Traceback (most recent call last):
                                            File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 385, in trace_task
                                              R = retval = fun(*args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/swh/scheduler/task.py", line 55, in __call__
                                              result = super().__call__(*args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 650, in __protected_call__
                                              return self.run(*args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/sentry_sdk/integrations/celery.py", line 161, in _inner
                                              reraise(*exc_info)
                                            File "/usr/lib/python3/dist-packages/sentry_sdk/_compat.py", line 57, in reraise
                                              raise value
                                            File "/usr/lib/python3/dist-packages/sentry_sdk/integrations/celery.py", line 156, in _inner
                                              return f(*args, **kwargs)
                                            File "/usr/lib/python3/dist-packages/swh/lister/npm/tasks.py", line 14, in list_npm_full
                                              return lister.run().dict()
                                            File "/usr/lib/python3/dist-packages/swh/lister/pattern.py", line 130, in run
                                              full_stats.origins += self.send_origins(origins)
                                            File "/usr/lib/python3/dist-packages/swh/lister/pattern.py", line 234, in send_origins
                                              ret = self.scheduler.record_listed_origins(batch_origins)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 181, in meth_
                                              return self.post(meth._endpoint_path, post_data)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 278, in post
                                              return self._decode_response(response)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 354, in _decode_response
                                              self.raise_for_status(response)
                                            File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 344, in raise_for_status
                                              raise exception from None
                                          swh.core.api.RemoteException: <RemoteException 500 CardinalityViolation: ['ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n']>
vsellier renamed this task from Unstuck npm lister in production to NPM lister is failing with a database update conflict.Dec 6 2021, 12:34 PM
vsellier updated the task description. (Show Details)

This happens when a given listed page returns two origins with the same URL.

We have the same issue in the GitHub lister, and we do a manual deduplication there.

We should probably update the scheduler function to just coalesce multiple ListedOrigin objects with the same url before sending them to the database backend, so that this issue is taken care of once and for all.

We should probably update the scheduler function to just coalesce multiple ListedOrigin objects with the same url before sending them to the database backend, so that this issue is taken care of once and for all.

Yep, i've mentioned orally to @vsellier that we should probably move down the logic you recentfly fixed in the github lister.
Because that also happens for launchpad, gitlab lister runs so far [1]

When i said that, I thought at the time of moving down inside the lister code (swh.lister.pattern.Lister). But your solution
may make more sense directly within the scheduler indeed.

[1] https://sentry.softwareheritage.org/organizations/swh/issues/?project=6

vsellier moved this task from Backlog to done on the System administration board.

The lister was fixed with the deployment of the swh-scheduler v0.22.0.