Investigate and fix:
Jun 30 09:00:54 worker15 python3[23590]: [2019-06-30 09:00:54,448: ERROR/ForkPoolWorker-2] Task swh.lister.gitlab.tasks.RangeGitLabLister[474d600e-ff5c-43b0-83f9-afc29b1cfd88] raised unexpected: IntegrityError('(psycopg2.IntegrityError) duplicate key value violates unique constraint "gitlab_repo_pkey"\nDETAIL: Key (uid)=(debian/nathanruiz-guest/apt) already exists.\n',)
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
context)
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 450, in do_execute
cursor.execute(statement, parameters)
psycopg2.IntegrityError: duplicate key value violates unique constraint "gitlab_repo_pkey"
DETAIL: Key (uid)=(debian/nathanruiz-guest/apt) already exists.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 382, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/lib/python3/dist-packages/swh/scheduler/task.py", line 45, in __call__
return super().__call__(*args, **kwargs)
File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 641, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/lib/python3/dist-packages/swh/lister/gitlab/tasks.py", line 36, in range_gitlab_lister
lister.run(min_bound=start, max_bound=end)
File "/usr/lib/python3/dist-packages/swh/lister/core/page_by_page_lister.py", line 123, in run
checks=check_existence)
File "/usr/lib/python3/dist-packages/swh/lister/core/lister_base.py", line 492, in ingest_data
injected = self.inject_repo_data_into_db(models_list)
File "/usr/lib/python3/dist-packages/swh/lister/core/lister_base.py", line 435, in inject_repo_data_into_db
injected_repos[m['uid']] = self.db_inject_repo(m)
File "/usr/lib/python3/dist-packages/swh/lister/core/lister_base.py", line 372, in db_inject_repo
sql_repo = self.db_query_equal('uid', model_dict['uid'])
File "/usr/lib/python3/dist-packages/swh/lister/core/lister_base.py", line 335, in db_query_equal
.filter(key == value).first()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/query.py", line 2659, in first
ret = list(self[0:1])
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/query.py", line 2457, in __getitem__
return list(res)
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/query.py", line 2760, in __iter__
self.session._autoflush()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 1303, in _autoflush
util.raise_from_cause(e)
File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 186, in reraise
raise value
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 1293, in _autoflush
self.flush()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 2019, in flush
self._flush(objects)
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 2137, in _flush
transaction.rollback(_capture_exception=True)
File "/usr/lib/python3/dist-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 186, in reraise
raise value
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 2101, in _flush
flush_context.execute()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/unitofwork.py", line 373, in execute
rec.execute(self)
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/unitofwork.py", line 532, in execute
uow
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/persistence.py", line 174, in save_obj
mapper, table, insert)
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/persistence.py", line 767, in _emit_insert_statements
execute(statement, multiparams)
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 914, in execute
return meth(self, multiparams, params)
File "/usr/lib/python3/dist-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
compiled_sql, distilled_params
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
context)
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1341, in _handle_dbapi_exception
exc_info
File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 185, in reraise
raise value.with_traceback(tb)
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
context)
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 450, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely) (psycopg2.IntegrityError) duplicate key value violates unique constraint "gitlab_repo_pkey"
DETAIL: Key (uid)=(debian/nathanruiz-guest/apt) already exists.
[SQL: 'INSERT INTO gitlab_repo (name, full_name, html_url, origin_url, origin_type, last_seen, task_id, uid, instance) VALUES (%(name)s, %(full_name)s, %(html_url)s, %(origin_url)s, %(origin_type)s, %(last_seen)s, %(task_id)s, %(uid)s, %(instance)s)'] [parameters: {'instance': 'debian', 'last_seen': datetime.datetime(2019, 6, 30, 9, 0, 36, 155540), 'origin_url': 'https://salsa.debian.org/nathanruiz-guest/apt.git', 'full_name': 'nathanruiz-guest/apt', 'name': 'apt', 'html_url': 'https://salsa.debian.org/nathanruiz-guest/apt', 'task_id': None, 'origin_type': 'git', 'uid': 'debian/nathanruiz-guest/apt'}]
Jun 30 09:00:54 worker15 python3[23574]: [2019-06-30 09:00:54,518: INFO/MainProcess] Received task: swh.lister.gitlab.tasks.RangeGitLabLister[71da1490-b1ac-4d93-bc7f-5402472e05d1]

@douardda and I may have run into these errors before.
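If memory serves me well, it was possibly due to overlapping range intervals (two range tasks covering the same page would both try to insert the same uid).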
In any case, this must be dealt with:
- preferably by checking the range computations to avoid overlap
- as a fallback (for example, if the source of the error is not found), by trapping those errors and making sure the main process continues, to avoid leaving holes in the listing
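
To make the two options above concrete, here is a minimal sketch, not the actual swh.lister code. It assumes half-open page ranges and uses the stdlib `sqlite3` module as a stand-in for the PostgreSQL/SQLAlchemy setup; `split_range` and `inject_repos` are hypothetical helper names, and the table is reduced to two columns.

```python
import sqlite3


def split_range(start, end, step):
    """Yield half-open (lo, hi) ranges covering [start, end) without overlap.

    Hypothetical helper: each range starts exactly where the previous one
    stopped, so no page is handed to two tasks.
    """
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


def inject_repos(conn, repos):
    """Insert repos one by one, trapping duplicate-key errors.

    Hypothetical helper: a duplicate uid is recorded and skipped instead of
    aborting the whole batch, so the listing has no holes.
    """
    injected, skipped = [], []
    for repo in repos:
        try:
            # "with conn" commits on success and rolls back on error,
            # so one failed insert does not affect the following ones.
            with conn:
                conn.execute(
                    "INSERT INTO gitlab_repo (uid, full_name) VALUES (?, ?)",
                    (repo["uid"], repo["full_name"]),
                )
            injected.append(repo["uid"])
        except sqlite3.IntegrityError:
            skipped.append(repo["uid"])
    return injected, skipped


# In-memory stand-in for the real gitlab_repo table (assumption: reduced schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gitlab_repo (uid TEXT PRIMARY KEY, full_name TEXT)")
```

For example, `list(split_range(1, 10, 4))` gives `[(1, 5), (5, 9), (9, 10)]`, and calling `inject_repos` twice with the same repository inserts it once and reports it as skipped the second time. In the real lister the equivalent would presumably be catching `sqlalchemy.exc.IntegrityError` and rolling back the session, but that is an assumption to check against the code.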