Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Date: 2024-05-02 16:52:05
Message-ID: CAH2-WzmYAMYBRp==fSCskLmgPDmx13vPe07E4gOzLLSbDFO8-g@mail.gmail.com
Lists: pgsql-bugs

On Sat, Apr 27, 2024 at 10:38 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> In 17, we don't ever get a new HTSV_Result, so if the tuple is not
> removed, it would be because HeapTupleSatisfiesVacuumHorizon()
> returned HEAPTUPLE_RECENTLY_DEAD and, if GlobalVisTestIsRemovableXid()
> was called, dead_after did not precede GlobalVisState->maybe_needed.
> This tuple, during this vacuum of the relation, would never be
> determined to be HEAPTUPLE_DEAD or it would have been removed.

That makes sense.
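
For anyone following along, the v17 behavior described above can be sketched roughly as follows. This is a simplified stand-alone model, not the actual pruning code: the `Sketch*` types and `sketch_*` functions are hypothetical names made up for illustration, the horizon check is reduced to a single comparison against maybe_needed, and special (non-normal) XIDs are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t SketchXid;

/* Hypothetical stand-in for the relevant HTSV_Result values. */
typedef enum
{
    SKETCH_DEAD,
    SKETCH_RECENTLY_DEAD,
    SKETCH_LIVE
} SketchHtsv;

/*
 * Wraparound-aware "a logically precedes b" using modulo-2^32 arithmetic.
 * Assumption: both XIDs are normal XIDs; special XIDs are not modeled.
 */
static bool
sketch_xid_precedes(SketchXid a, SketchXid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Decide, once per tuple, whether pruning may remove it.  The visibility
 * result is computed a single time and never re-evaluated, which mirrors
 * the "we don't ever get a new HTSV_Result" point made above.
 */
static bool
sketch_tuple_removable(SketchHtsv res, SketchXid dead_after,
                       SketchXid maybe_needed)
{
    assert(res == SKETCH_DEAD || res == SKETCH_RECENTLY_DEAD ||
           res == SKETCH_LIVE);

    if (res == SKETCH_DEAD)
        return true;

    /* RECENTLY_DEAD is removable only if dead_after precedes the horizon. */
    if (res == SKETCH_RECENTLY_DEAD)
        return sketch_xid_precedes(dead_after, maybe_needed);

    return false;
}
```

In this model, a tuple that comes back RECENTLY_DEAD with dead_after at or after maybe_needed stays not-removable for the whole VACUUM, since nothing recomputes the answer later.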

> > > It will always be HEAPTUPLE_RECENTLY_DEAD in 17 and in <= 16, if
> > > HeapTupleSatisfiesVacuum() returns HEAPTUPLE_DEAD, we wouldn't call
> > > heap_prepare_freeze_tuple() because of the retry loop.
> >
> > The retry loop exists precisely because heap_prepare_freeze_tuple()
> > isn't prepared to deal with HEAPTUPLE_DEAD tuples. So I agree that
> > that won't be allowed to happen on versions that have the retry loop
> > (14 - 16).
>
> So, it can't happen in back branches. Let's just address 17. Help me
> understand how this can happen in 17.

Just to be clear, I never said that it was possible in 17. If I
somehow implied it, then I didn't mean to.

> > As Andres pointed out, even if we were to call
> > heap_prepare_freeze_tuple() with a HEAPTUPLE_DEAD tuple, we'd get a
> > "can't happen" error (though it's hard to see this because it doesn't
> > actually rely on the hint bits set in the tuple).
>
> I just don't see how in 17 we could end up calling
> heap_prepare_freeze_tuple() on a HEAPTUPLE_DEAD tuple. If
> heap_prune_satisfies_vacuum() returned HEAPTUPLE_DEAD, we wouldn't
> call heap_prepare_freeze_tuple() on the tuple.

That part seems relatively straightforward.

> I assume you are talking about a HEAPTUPLE_RECENTLY_DEAD tuple that
> *should* be HEAPTUPLE_DEAD. But it doesn't really matter if it should
> be HEAPTUPLE_DEAD and we fail to remove it, as long as we don't leave
> it unfrozen and advance relfrozenxid past its xids.

I'm not sure that I agree with this part. You're describing a scheme
under which aggressive VACUUM isn't strictly guaranteed to respect
FreezeLimit. And so (for example) a VACUUM FREEZE wouldn't advance
relfrozenxid right up to OldestXmin/FreezeLimit. Right?

It certainly seems possible that, all things considered, you're right
-- perhaps it literally doesn't matter at all on 17, at least in
practice. I cannot think of any problem that's worse than an assertion
failure within heap_vacuum_rel, just before we go to advance the rel's
pg_class.relfrozenxid (and even that seems extremely unlikely). Nobody
particularly intended for things to work this way (the FreezeLimit
invariant isn't supposed to be violated), but it'd hardly be the first
time that something like this happened. Far from it.
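
To illustrate the FreezeLimit concern, here is a rough sketch of how an unfrozen XID left behind on a page holds back the value that relfrozenxid can be advanced to. It is a toy model under stated assumptions, not the real bookkeeping in vacuumlazy.c: `DemoVacuum` and the `demo_*` functions are hypothetical names, only normal XIDs are handled, and MultiXacts are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t DemoXid;

/* Hypothetical per-VACUUM state: the cutoffs plus the tracked candidate. */
typedef struct
{
    DemoXid oldest_xmin;
    DemoXid freeze_limit;
    DemoXid new_relfrozenxid;
} DemoVacuum;

/* Wraparound-aware "a logically precedes b" (normal XIDs only). */
static bool
demo_xid_precedes(DemoXid a, DemoXid b)
{
    return (int32_t) (a - b) < 0;
}

/* Start optimistic: the candidate begins at OldestXmin. */
static void
demo_vacuum_begin(DemoVacuum *v, DemoXid oldest_xmin, DemoXid freeze_limit)
{
    v->oldest_xmin = oldest_xmin;
    v->freeze_limit = freeze_limit;
    v->new_relfrozenxid = oldest_xmin;
}

/* Every XID that remains unfrozen can only ratchet the candidate back. */
static void
demo_note_unfrozen_xid(DemoVacuum *v, DemoXid xid)
{
    if (demo_xid_precedes(xid, v->new_relfrozenxid))
        v->new_relfrozenxid = xid;
}

/*
 * The invariant an aggressive VACUUM is expected to uphold in this model:
 * the final candidate must not precede FreezeLimit.
 */
static bool
demo_freeze_limit_respected(const DemoVacuum *v)
{
    return !demo_xid_precedes(v->new_relfrozenxid, v->freeze_limit);
}
```

With a VACUUM FREEZE-like setup where FreezeLimit equals OldestXmin, a single surviving unfrozen XID older than that cutoff makes `demo_freeze_limit_respected()` return false, which is the kind of condition an assertion just before the pg_class update would trip over.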

--
Peter Geoghegan
