Re: New vacuum option to do only freezing

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New vacuum option to do only freezing
Date: 2019-04-16 15:59:31
Message-ID: 20190416155931.ivujhh5zfudfigze@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-04-16 10:54:34 -0400, Alvaro Herrera wrote:
> On 2019-Apr-16, Robert Haas wrote:
> > On Mon, Apr 15, 2019 at 9:07 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > > I'm not sure that's correct. If you do that, it'll end up in the
> > > > non-tupgone case, which might try to freeze a tuple that should've
> > > > been removed. Or am I confused?
> > >
> > > If we're failing to remove it, and it's below the desired freeze
> > > horizon, then we'd darn well better freeze it instead, no?
> >
> > I don't know that that's safe. IIRC, the freeze code doesn't cope
> > nicely with being given a tuple that actually ought to have been
> > deleted. It'll just freeze it anyway, which is obviously bad.
>
> Umm, but if we fail to freeze it, we'll leave a tuple around that's
> below the relfrozenxid for the table, causing later pg_commit to be
> truncated and error messages saying that the tuple cannot be read, no?

Is the below-relfrozenxid case actually reachable? Isn't the theory of
that whole codeblock that we ought to only get there if a transaction
concurrently commits?

* Ordinarily, DEAD tuples would have been removed by
* heap_page_prune(), but it's possible that the tuple
* state changed since heap_page_prune() looked. In
* particular an INSERT_IN_PROGRESS tuple could have
* changed to DEAD if the inserter aborted. So this
* cannot be considered an error condition.

And in case there was a concurrent transaction at the time of the
heap_page_prune(), it got to be above the OldestXmin passed to
HeapTupleSatisfiesVacuum() to - as it's the same OldestXmin value. And
as FreezeLimit should always be older than than OldestXmin, we should
never get into a situation where heap_page_prune() couldn't prune
something that we would have been forced to remove?

> > I don't know that that's safe. IIRC, the freeze code doesn't cope
> > nicely with being given a tuple that actually ought to have been
> > deleted. It'll just freeze it anyway, which is obviously bad.
> >
> > Unless this has been changed since I last looked at it.
>
> I don't think so.

I think it has changed a bit - these days heap_prepare_freeze_tuple()
will detect that case, and error out:

/*
* If we freeze xmax, make absolutely sure that it's not an XID
* that is important. (Note, a lock-only xmax can be removed
* independent of committedness, since a committed lock holder has
* released the lock).
*/
if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&
TransactionIdDidCommit(xid))
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg_internal("cannot freeze committed xmax %u",
xid)));
and the equivalent multixact case:

if (TransactionIdDidCommit(xid))
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg_internal("cannot freeze committed update xid %u", xid)));

We even complain if xmin is uncommitted and would need to be frozen:

if (TransactionIdPrecedes(xid, cutoff_xid))
{
if (!TransactionIdDidCommit(xid))
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg_internal("uncommitted xmin %u from before xid cutoff %u needs to be frozen",
xid, cutoff_xid)));

I IIRC added that after one of the multixact issues lead to precisely
that, heap_prepare_freeze_tuple() leading to a valid xmax just being
emptied out, resurfacing dead tuples (and HOT corruption and such).

These messages are obviously intended to be a backstop against
continuing to corrupt further, than actually something a user should
ever see in a working system.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-04-16 16:01:10 Re: New vacuum option to do only freezing
Previous Message Robert Treat 2019-04-16 15:38:49 Re: Checksum errors in pg_stat_database