Due to our pattern of adding objects [1], vacuum is triggered regularly on
pg_catalog.*, having an heavy impact on performance [2]
This commit tries to avoid the "dropping temporary tables" part, emptying
them instead (they still are dropped at the end of the session but less often).
This should decrease the bloat on pg_catalog.* tables.
[1] adding object pattern:
- create temporary table
- insert data from temporary table to production table with filtering
- drop temporary table
[2] context:
irc discussion on sysadm:
... 10:09 <douardda> ardumont: any idea why the git loading workload seems to have drop drastically? 10:10 <+ardumont> no 10:11 <+ardumont> well, that could be we are finally ingesting relevant repositories 10:11 <+ardumont> relevant as in with new data 10:11 <douardda> not really https://grafana.softwareheritage.org/d/u8eBQQVZz/object-creation-rate?orgId=1&from=now-12h&to=now&var-instance=uffizi.internal.softwareheritage.org&var-object_type=content&var-interval=$__auto_interval_interval 10:12 <douardda> (same with revisions) ... 10:26 <+ardumont> it'd worth checking the storage backend 10:27 <+ardumont> pg_activity shows me a lot more (i don't recall seeing that much yesterday) select queries 10:27 <+ardumont> on the main belvedere storage (the one to write, not really for reading) ... 10:30 <+ardumont> i also realized that's it's not only the git tasks, it's also for the indexers 10:31 <+ardumont> well all tasks even, i barely see that there are other colors 10:42 <+olasd> I don't have my laptop but yesterday postgres was starting to look like it was spending a lot of time vacuuming the catalog tables (because of our usage of temporary tables) 10:43 <+ardumont> it still does 10:45 <+olasd> I think there's a way to reduce churn on temporary tables (making transactions empty them instead of dropping them) that might help 10:46 <+olasd> or get rid of temp tables altogether 10:46 <+olasd> (but that's less incremental) 10:52 <+olasd> https://stackoverflow.com/questions/50366509/temporary-tables-bloating-pg-attribute 10:55 <+ardumont> shall we try? 10:55 <+ardumont> one of the stanza to change are at https://forge.softwareheritage.org/source/swh-storage/browse/master/swh/storage/sql/40-swh-func.sql$22 10:55 <+ardumont> (i'm saying for david ;) 10:56 <+ardumont> the part 'on commit delete rows'