
pgstorage: Empty temp tables instead of dropping them
Closed · Public

Authored by ardumont on Jan 24 2020, 12:06 PM.

Details

Summary

Due to our pattern for adding objects [1], vacuum is triggered regularly on the
pg_catalog.* tables, which has a heavy impact on performance [2].

This commit avoids the "drop temporary tables" step, emptying the tables
instead (they are still dropped at the end of the session, just less often).
This should decrease the bloat on the pg_catalog.* tables.
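One way to observe the catalog churn described above is to watch the dead-tuple counts and sizes of the catalog tables that temporary-table DDL touches. This query is an illustration added here for context, not part of the original discussion:

```sql
-- Rough check of catalog bloat: pg_attribute, pg_class, pg_type and
-- pg_depend all get rows on every CREATE/DROP of a temporary table.
SELECT relname, n_dead_tup, last_autovacuum
FROM pg_stat_sys_tables
WHERE relname IN ('pg_attribute', 'pg_class', 'pg_type', 'pg_depend');

-- And the on-disk footprint of the worst offender:
SELECT pg_size_pretty(pg_total_relation_size('pg_catalog.pg_attribute'));
```

A steadily growing `n_dead_tup` on these tables between autovacuum runs is the symptom the diff is trying to reduce.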

[1] adding object pattern:

  • create temporary table
  • insert data from temporary table to production table with filtering
  • drop temporary table
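A minimal sketch of that pattern; the `object` table and its columns are made up for illustration, not the actual swh-storage schema:

```sql
-- 1. create a temporary table mirroring the production table
CREATE TEMPORARY TABLE tmp_object (LIKE object INCLUDING DEFAULTS);

-- 2. (bulk-load the batch into tmp_object, e.g. with COPY)

-- 3. insert from the temporary table, filtering out rows already present
INSERT INTO object (id, payload)
SELECT id, payload
FROM tmp_object t
WHERE NOT EXISTS (SELECT 1 FROM object o WHERE o.id = t.id);

-- 4. drop the temporary table -- the step that bloats pg_catalog.*
DROP TABLE tmp_object;
```

Each iteration of steps 1 and 4 inserts and then deletes rows in the system catalogs, which is what keeps triggering vacuum on them.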

[2] context:

irc discussion on sysadm:

...
10:09 <douardda> ardumont: any idea why the git loading workload seems to have drop drastically?
10:10 <+ardumont> no
10:11 <+ardumont> well, that could be we are finally ingesting relevant repositories
10:11 <+ardumont> relevant as in with new data
10:11 <douardda> not really https://grafana.softwareheritage.org/d/u8eBQQVZz/object-creation-rate?orgId=1&from=now-12h&to=now&var-instance=uffizi.internal.softwareheritage.org&var-object_type=content&var-interval=$__auto_interval_interval
10:12 <douardda> (same with revisions)
...
10:26 <+ardumont> it'd worth checking the storage backend
10:27 <+ardumont> pg_activity shows me a lot more (i don't recall seeing that much yesterday) select queries
10:27 <+ardumont> on the main belvedere storage (the one to write, not really for reading)
...
10:30 <+ardumont> i also realized that's it's not only the git tasks, it's also for the indexers
10:31 <+ardumont> well all tasks even, i barely see that there are other colors
10:42 <+olasd> I don't have my laptop but yesterday postgres was starting to look like it was spending a lot of time vacuuming the catalog tables (because of our usage of temporary tables)
10:43 <+ardumont> it still does
10:45 <+olasd> I think there's a way to reduce churn on temporary tables (making transactions empty them instead of dropping them) that might help
10:46 <+olasd> or get rid of temp tables altogether
10:46 <+olasd> (but that's less incremental)
10:52 <+olasd> https://stackoverflow.com/questions/50366509/temporary-tables-bloating-pg-attribute
10:55 <+ardumont> shall we try?
10:55 <+ardumont> one of the stanza to change are at https://forge.softwareheritage.org/source/swh-storage/browse/master/swh/storage/sql/40-swh-func.sql$22
10:55 <+ardumont> (i'm saying for david ;)
10:56 <+ardumont> the part 'on commit delete rows'
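The change the log points at boils down to creating the temporary table once per session with `ON COMMIT DELETE ROWS`, so commits empty it instead of the code dropping and recreating it per batch. A sketch, again with a made-up table name:

```sql
-- Before: the temporary table is created and dropped for every batch,
-- touching pg_catalog.* each time.
CREATE TEMPORARY TABLE tmp_object (LIKE object INCLUDING DEFAULTS);
-- ...load and flush the batch...
DROP TABLE tmp_object;

-- After: create it once per session; each COMMIT empties it (an
-- explicit TRUNCATE also works between uses), with no catalog churn.
CREATE TEMPORARY TABLE tmp_object (LIKE object INCLUDING DEFAULTS)
  ON COMMIT DELETE ROWS;
```

The table is still dropped when the session ends, but that happens far less often than once per batch.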
Test Plan

tox

Diff Detail

Event Timeline

ardumont edited the summary of this revision.
ardumont edited the summary of this revision.

accepted through irc ;)

...
11:45 <+olasd> temp tables were also a workaround for the lack of "on conflict do xxx" clauses when we started with 9.x
11:45 <+ardumont> that i also remember
11:46 <+olasd> ardumont: for the column drops you should add "if exists" to the clauses
11:46 <+olasd> that way they become idempotent
...
11:50 <+ardumont> i'll try to propose a diff
12:11 <+ardumont> here we go D2580
12:11 -- Notice(swhbot): D2580 (author: ardumont, Needs Review) on swh-storage: pgstorage: Empty temp tables instead of dropping them <https://forge.softwareheritage.org/D2580>
12:19 <+olasd> lgtm
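olasd's "if exists" suggestion makes the column drops safe to re-run now that the temporary tables outlive a single batch. Sketched on the same hypothetical table:

```sql
-- Without IF EXISTS, this errors on a second run once the column is
-- already gone; with it, the statement is idempotent.
ALTER TABLE tmp_object DROP COLUMN IF EXISTS object_id;
```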

Build has FAILED

yeah, well, that tends to happen when the environment to build the cassandra job is no longer there...

+ python3 -m tox --result-json tox-results.json -e py3-cassandra
ERROR: unknown environment 'py3-cassandra'

Build has FAILED

it worked when that diff was on another branch...
plugging that diff onto master, the cassandra part now fails with this:

+ python3 -m tox --result-json tox-results.json -e py3-cassandra
ERROR: unknown environment 'py3-cassandra'

@vlorentz is that related to D2573?

See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tox/892/console

my understanding is that there is a transition underway on jenkins, so the cassandra build was not supposed to run here yet, so meh.
and even though the (phabricator) build plans were reset, i guess the diff already has its pre-existing build plan attached, so there is no changing that...
(i've re-triggered all the builds, including the cassandra one, which i hoped would disappear, but no ;)
thus the failure again...

anyways, moving along now.

This revision was not accepted when it landed; it landed in state Needs Review. Jan 24 2020, 2:38 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.