Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae
Date: 2024-03-30 16:19:33
Message-ID: CAAKRu_aq0oZS88NxN9gEW9Ff-HK6AegTZ0FdC8C4fOZdXbwx+A@mail.gmail.com
Lists: pgsql-bugs

On Fri, Mar 29, 2024 at 12:05 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Mar 22, 2024 at 3:34 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > I think we need to understand what's actually happening here before we
> > jump to solutions. I think it's clear that we can't determine a
> > relfrozenxid in a way that is untethered from the decisions made while
> > pruning, but surely whoever wrote this code intended for there to be
> > some relationship between GlobalVisState and
> > VacuumCutoffs->OldestXmin. For example, it may be that they intended
> > for VacuumCutoffs->OldestXmin to always contain an XID that is older
> > than any XID that the GlobalVisState could possibly have pruned; but
> > they might have done that incorrectly, in such a fashion that when
> > GlobalVisState->maybe_needed moves backward the intended invariant no
> > longer holds.
>
> OK, I did some more analysis of this. I kind of feel like maybe this
> should be getting more attention from Melanie (since it was the commit
> of her patch that triggered the addition of this to the open items
> list) or Andres (because it has been alleged to be related to the
> GlobalVisTest stuff), but here we are.

Agreed that I should be doing more of the analysis. I plan to spend
time on this very soon.

Obviously we should actually fix this on back branches, but could we
at least make the retry loop interruptible in some way so people could
use pg_cancel_backend() or pg_terminate_backend() on a stuck autovacuum
worker or vacuum process?

- Melanie
