From: | "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> |
---|---|
To: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Temporary tables prevent autovacuum, leading to XID wraparound |
Date: | 2018-01-25 06:14:41 |
Message-ID: | 0A3221C70F24FB45833433255569204D1F8A4DC6@G01JPEXMBYT05 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello,
I've found a problem that an orphaned temporary table could cause XID wraparound. Our customer encountered this problem with PG 9.5.2, but I think this will happen with the latest PG.
I'm willing to fix this, but I'd like to ask you what approach we should take.
PROBLEM
====================================
The customer has a database for application data, which I call it user_db here. They don't store application data in postgres database.
No tables in user_db was autovacuumed for more than a month, leading to user tables bloating. Those tables are eligible for autovacuum according to pg_stat_all_tables and autovacuum settings.
age(datfrozenxid) of user_db and postgres databases are greater than autovacuum_max_freeze_age, so they are eligible for autovacuuming for XID wraparound.
There are user tables in user_db whose age(relfrozenxid) is greater than autovacuum_freeze_max_age, so those tables should get autovacuum treatment.
CAUSE
====================================
postgres database has a table named pg_temp_3.fetchchunks, whose age(relfrozenxid) is greater than autovacuum_freeze_max_age. This temporary table is the culprit. pg_temp_3.fetchchunks is created by pg_rewind. The customer says they ran pg_rewind.
autovacuum launcher always choose postgres, because do_start_worker() scans pg_database and finds that postgres database needs vacuuming for XID wraparound. user_db is never chosen for vacuuming, although it also needs vacuuming for XID wraparound.
autovacuum worker doesn't delete pg_temp3.fetchchunks, because the backendid 3 is used by some application so autovacuum worker thinks that the backend is active and the temporary table cannot be dropped.
I don't know why pg_temp3.fetchchunks still exists. Maybe the user ran pg_ctl stop -mi while pg_rewind was running.
FIX
====================================
I have the following questions. Along which line should I proceed to fix the problem?
* Why does autovacuum launcher always choose only one database when that database need vacuuming for XID wraparound? Shouldn't it also choose other databases?
* I think temporary tables should not require vacuuming for XID wraparound. Furtherover, should updates/deletes to temporary tables be in-place instead of creating garbage, so that any form of vacuum is unnecessary? Other sessions do not need to read temporary tables.
* In this incident, autovacuum worker misjudged that pg_temp_3.fetchchunks can't be deleted, although the creator (pg_rewind) is no longer active. How can we delete orphan temporary tables safely?
Regards
Takayuki Tsunakawa
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2018-01-25 06:16:04 | Re: Rangejoin rebased |
Previous Message | Masahiko Sawada | 2018-01-25 05:31:21 | Re: PATCH: Exclude unlogged tables from base backups |