Re: recovering from "found xmin ... from before relfrozenxid ..."

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."
Date: 2020-07-14 00:47:10
Message-ID: CA+TgmoaZwZHtFFU6NUJgEAp6adDs-qWfNOXpZGQpZMmm0VTDfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 13, 2020 at 6:38 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Not fully, I'm afraid. Afaict it doesn't currently tell you the item
> pointer offset, just the block numer, right? We probably should extend
> it to also include the offset...

Oh, I hadn't realized that limitation. That would be good to fix. It
would be even better, I think, if we could have VACUUM proceed with
the rest of vacuuming the table, emitting warnings about each
instance, instead of blowing up when it hits the first bad tuple, but
I think you may have told me sometime that doing so would be, uh, less
than straightforward. We probably should refuse to update
relfrozenxid/relminmxid when this is happening, but I *think* it would
be better to still proceed with dead tuple cleanup as far as we can,
or at least have an option to enable that behavior. I'm not positive
about that, but not being able to complete VACUUM at all is a FAR more
urgent problem than not being able to freeze, even though in the long
run the latter is more severe.

> > 2. In some other, similar situations, e.g. where the tuple data is
> > garbled, it's often possible to get out from under the problem by
> > deleting the tuple at issue. But I think that doesn't necessarily fix
> > anything in this case.
>
> Huh, why not? That worked in the cases I saw.

I'm not sure I've seen a case where that didn't work, but I don't see
a reason why it couldn't happen. Do you think the code is structured
in such a way that a deleted tuple is guaranteed to be pruned even if
the XID is old? What if clog has been truncated so that the xmin can't
be looked up?

> xmax is among the problematic cases IIRC, so yes, it'd be good to fix
> that.

Thanks for the input.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-07-14 00:58:43 Re: recovering from "found xmin ... from before relfrozenxid ..."
Previous Message Robert Haas 2020-07-14 00:41:22 Re: recovering from "found xmin ... from before relfrozenxid ..."