Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, robertmhaas(at)gmail(dot)com, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Date: 2024-01-06 21:41:23
Message-ID: CAH2-Wz=HT+zHGyQ5bykPEh20KQWnS4uZS_uupxrTrtf-Wt886A@mail.gmail.com
Lists: pgsql-bugs

On Sat, Jan 6, 2024 at 1:30 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Sat, Jan 6, 2024 at 12:24 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > Fair enough. While I agree there's a decent chance back-patching would be
> > okay, I think there's also a decent chance that 1ccc1e05ae creates the problem
> > Matthias theorized. Something like: we update relfrozenxid based on
> > OldestXmin, even though GlobalVisState caused us to retain a tuple older than
> > OldestXmin. Then relfrozenxid disagrees with table contents.
>
> Either every relevant code path has the same OldestXmin to work off
> of, or the whole NewRelfrozenXid/relfrozenxid-tracking thing can't be
> expected to work as designed. I find it a bit odd that
> pruneheap.c/GlobalVisState has no direct understanding of this
> dependency (none that I can discern, at least).

What do you think of the idea of adding a defensive "can't happen"
error to lazy_scan_prune() that will catch DEAD or RECENTLY_DEAD
tuples with storage that still contain an xmax < OldestXmin? This
probably won't catch every possible problem, but I suspect it'll work
well enough.
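
To make the idea concrete, here is a minimal, self-contained sketch of the
kind of check I have in mind. It is not the actual vacuumlazy.c code: the
TupleStatus enum, xid_precedes(), and retained_tuple_is_suspect() are
hypothetical stand-ins, and the XID comparison only mimics the usual
modulo-2^32 rule for normal XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL types; not the real headers. */
typedef uint32_t TransactionId;
#define FirstNormalTransactionId ((TransactionId) 3)

/* Hypothetical enum, loosely modeled on the visibility results seen here. */
typedef enum
{
    TUPLE_LIVE,
    TUPLE_RECENTLY_DEAD,
    TUPLE_DEAD
} TupleStatus;

/* Wraparound-aware "a is older than b", valid for normal XIDs only. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Returns true when a tuple that still has storage after pruning looks
 * inconsistent: it is DEAD or RECENTLY_DEAD, yet its xmax is older than
 * OldestXmin. Such a tuple should have been removed, so the caller would
 * treat this as a "can't happen" condition.
 */
static bool
retained_tuple_is_suspect(TupleStatus status, TransactionId xmax,
                          TransactionId oldest_xmin)
{
    /* OldestXmin is assumed to always be a normal XID. */
    assert(oldest_xmin >= FirstNormalTransactionId);

    if (status != TUPLE_DEAD && status != TUPLE_RECENTLY_DEAD)
        return false;

    /* Assumption: skip tuples without a normal xmax (e.g. invalid xmax). */
    if (xmax < FirstNormalTransactionId)
        return false;

    return xid_precedes(xmax, oldest_xmin);
}
```

In the server itself the true case would presumably raise an error via
elog(ERROR, ...) rather than return a flag, so that relfrozenxid is never
advanced past a tuple that is still on the page.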

--
Peter Geoghegan
