
Bump max postgres connections on db1.staging: 200 is too low
Closed, Public

Authored by olasd on Fri, Sep 9, 6:53 PM.

Diff Detail

Repository
rSPSITE puppet-swh-site
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

olasd requested review of this revision. Fri, Sep 9, 6:53 PM
18:43:28         ⤷ ╡ icinga PROBLEM: service check_systemd on scrubber0.internal.staging.swh.network is CRITICAL: SYSTEMD CRITICAL - swh-scrubber-checker-postgres@primary-snapshot-0.service: failed, swh-scrubber-checker-postgres@primary-snapshot-1.service: failed
Sep 09 15:01:39 scrubber0 systemd[1]: Started Software Heritage Scrubber Checker Postgres primary-snapshot-0.
Sep 09 16:41:12 scrubber0 swh[1853081]: Traceback (most recent call last):
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/bin/swh", line 33, in <module>
Sep 09 16:41:12 scrubber0 swh[1853081]:     sys.exit(load_entry_point('swh.core==2.14.0', 'console_scripts', 'swh')())
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/swh/core/cli/__init__.py", line 184, in main
Sep 09 16:41:12 scrubber0 swh[1853081]:     return swh(auto_envvar_prefix="SWH")
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
Sep 09 16:41:12 scrubber0 swh[1853081]:     return self.main(*args, **kwargs)
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
Sep 09 16:41:12 scrubber0 swh[1853081]:     rv = self.invoke(ctx)
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
Sep 09 16:41:12 scrubber0 swh[1853081]:     return _process_result(sub_ctx.command.invoke(sub_ctx))
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
Sep 09 16:41:12 scrubber0 swh[1853081]:     return _process_result(sub_ctx.command.invoke(sub_ctx))
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
Sep 09 16:41:12 scrubber0 swh[1853081]:     return _process_result(sub_ctx.command.invoke(sub_ctx))
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
Sep 09 16:41:12 scrubber0 swh[1853081]:     return ctx.invoke(self.callback, **ctx.params)
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
Sep 09 16:41:12 scrubber0 swh[1853081]:     return callback(*args, **kwargs)
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/click/decorators.py", line 17, in new_func
Sep 09 16:41:12 scrubber0 swh[1853081]:     return f(get_current_context(), *args, **kwargs)
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/swh/scrubber/cli.py", line 131, in scrubber_check_storage
Sep 09 16:41:12 scrubber0 swh[1853081]:     checker.run()
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/swh/scrubber/storage_checker.py", line 94, in run
Sep 09 16:41:12 scrubber0 swh[1853081]:     return self._check_postgresql(db)
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/swh/scrubber/storage_checker.py", line 114, in _check_postgresql
Sep 09 16:41:12 scrubber0 swh[1853081]:     objects = list(objects)
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/swh/storage/backfill.py", line 572, in fetch
Sep 09 16:41:12 scrubber0 swh[1853081]:     record = converter(db, record)
Sep 09 16:41:12 scrubber0 swh[1853081]:   File "/usr/lib/python3/dist-packages/swh/storage/backfill.py", line 288, in snapshot_converter
Sep 09 16:41:12 scrubber0 swh[1853081]:     cur.execute(query, (snapshot_d["object_id"],))
Sep 09 16:41:12 scrubber0 swh[1853081]: psycopg2.errors.TooManyConnections: remaining connection slots are reserved for non-replication superuser connections
Sep 09 16:41:12 scrubber0 swh[1853081]: CONTEXT:  parallel worker
Sep 09 16:41:12 scrubber0 systemd[1]: swh-scrubber-checker-postgres@primary-snapshot-0.service: Main process exited, code=exited, status=1/FAILURE
Sep 09 16:41:12 scrubber0 systemd[1]: swh-scrubber-checker-postgres@primary-snapshot-0.service: Failed with result 'exit-code'.
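
For context on the error in the traceback above: PostgreSQL refuses new non-superuser connections once the free slots drop to the `superuser_reserved_connections` reserve, so the usable pool is somewhat smaller than `max_connections`. The following is a minimal sketch of that arithmetic, not part of this revision. The function names are hypothetical, and the reserve of 3 is PostgreSQL's stock default, assumed here because the actual setting on db1.staging is not shown on this page.

```python
# Hypothetical helpers illustrating the connection-slot arithmetic.
# In practice the inputs would come from `SHOW max_connections`,
# `SHOW superuser_reserved_connections` and
# `SELECT count(*) FROM pg_stat_activity` on the server.

def non_superuser_slots(max_connections: int, superuser_reserved: int = 3) -> int:
    """Slots that ordinary (non-superuser) clients can use in total."""
    return max(max_connections - superuser_reserved, 0)


def remaining_slots(max_connections: int, in_use: int, superuser_reserved: int = 3) -> int:
    """Slots still open to ordinary clients, given `in_use` connections."""
    return max(non_superuser_slots(max_connections, superuser_reserved) - in_use, 0)


def would_be_refused(max_connections: int, in_use: int, superuser_reserved: int = 3) -> bool:
    """True when a new ordinary connection would hit TooManyConnections."""
    return remaining_slots(max_connections, in_use, superuser_reserved) == 0


if __name__ == "__main__":
    # With the limit of 200 named in the title and the assumed default reserve.
    print(non_superuser_slots(200))
    print(would_be_refused(200, in_use=197))
```

Under those assumptions, a limit of 200 leaves 197 slots for services such as the scrubber checkers, which is the budget to compare against the number of concurrent clients when picking a new value.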
This revision is now accepted and ready to land. Mon, Sep 12, 9:31 AM