Toast issues with OldestXmin going backwards

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Toast issues with OldestXmin going backwards
Date: 2018-04-19 10:37:13
Message-ID: 87in8nec96.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

Various comments in GetOldestXmin mention the possibility of the oldest
xmin going backward, and assert that this is actually safe. It's not.

Consider:

A table has a toastable column. A row is updated in a way that changes
the toasted value. There are now two row versions pointing to different
toast values, one live and one dead.

Now suppose the toast table - but not the main table - is vacuumed; the
dead toast entries are removed, even though they are still referenced by
the dead main-table row. Autovacuum treats the main table and toast
table separately, so this can happen.

Now suppose that OldestXmin goes backwards, so that the older main-table
row version is no longer dead, but merely recently-dead.

At this point, VACUUM FULL (or similar rewrites) on the table will fail
with "missing chunk number 0 for ..." toast errors, because it tries to
copy the recently-dead row, but that row's toasted values have been
vacuumed away already.
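
The sequence above can be sketched as a toy model. This is not
PostgreSQL code: the classes and functions (MainTuple, ToastChunk,
vacuum_toast, rewrite_main, simulate) and the xid values are invented
for illustration, every transaction is assumed to have committed, and
"dead" is reduced to "xmax precedes OldestXmin". It only shows how a
toast vacuum under a later horizon, followed by a rewrite under an
earlier one, leaves a copyable row without its toast data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MainTuple:
    xmin: int
    xmax: Optional[int]  # xid that updated/deleted the row; None if live
    toast_id: int        # toast value this row version points to


@dataclass
class ToastChunk:
    toast_id: int
    xmax: Optional[int]  # xid that deleted the toast value; None if live


def is_dead(xmax: Optional[int], oldest_xmin: int) -> bool:
    # Simplification: a deleted row is "dead" (removable) only when its
    # xmax precedes OldestXmin; otherwise it is "recently dead".
    return xmax is not None and xmax < oldest_xmin


def vacuum_toast(toast: List[ToastChunk], oldest_xmin: int) -> List[ToastChunk]:
    # Vacuum of the toast table alone: drop chunks that are dead.
    return [c for c in toast if not is_dead(c.xmax, oldest_xmin)]


def rewrite_main(main: List[MainTuple], toast: List[ToastChunk],
                 oldest_xmin: int) -> List[MainTuple]:
    # Toy VACUUM FULL: copy every row version that is not dead,
    # which requires fetching its toast value.
    present = {c.toast_id for c in toast}
    copied = []
    for t in main:
        if is_dead(t.xmax, oldest_xmin):
            continue
        if t.toast_id not in present:
            raise LookupError(
                f"missing chunk number 0 for toast value {t.toast_id}")
        copied.append(t)
    return copied


def simulate() -> str:
    # xid 101 updated the row, replacing toast value 1 with toast value 2.
    main = [MainTuple(100, 101, 1), MainTuple(101, None, 2)]
    toast = [ToastChunk(1, 101), ToastChunk(2, None)]

    # Toast table vacuumed with OldestXmin = 105: toast value 1 is dead.
    toast = vacuum_toast(toast, oldest_xmin=105)

    # OldestXmin then goes backwards to 101: the old main-table row is
    # only recently-dead, so the rewrite tries to copy it.
    try:
        rewrite_main(main, toast, oldest_xmin=101)
    except LookupError as e:
        return str(e)
    return "ok"
```

In this model, running the rewrite with the same horizon the toast
vacuum used (105) skips the old row version and succeeds; only the
backwards move makes it fail.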

(I've been working for a while with someone on IRC to track down the
source of unexplained semi-repeatable "missing chunk number 0 ..."
errors from VACUUM FULL. I don't know at this stage whether this is the
actual problem, but it matches the symptoms.)

--
Andrew (irc:RhodiumToad)
