Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc:	Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Date:	2021-11-03 16:21:20
Message-ID:	CAH2-Wzm0DXvLxzCqdiuN7=BwrXWRcm_KTU2VK2aNuo0PqCLNaA@mail.gmail.com
Lists:	pgsql-bugs

On Wed, Nov 3, 2021 at 8:46 AM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
> I seem to repeatedly get backends of which the xmin is set from
> InvalidTransactionId to some value < min(ProcGlobal->xids), which then
> result in shared_oldest_nonremovable (and others) being less than the
> value of their previous iteration. This leads to the infinite loop in
> lazy_scan_prune (it stores and uses one value of
> *_oldest_nonremovable, whereas heap_page_prune uses a more up-to-date
> variant).

> I noticed that when this happens, generally a parallel vacuum worker
> is involved.

Hmm. That is plausible. The way that VACUUM (and concurrent index
builds) avoid being seen by other backends via the PROC_IN_VACUUM flag
is pretty delicate. It wouldn't surprise me if the parallel VACUUM
issue subtly broke lazy_scan_prune in the way that we see here.
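
To make the quoted failure mode concrete, here is a toy model of the
retry loop. It is only a sketch, not PostgreSQL source: every name
(ToyXid, toy_prune_removes, toy_scan_sees_dead, toy_scan_page) is
hypothetical, XIDs are plain integers with no wraparound handling, and
max_retries exists only so that the demonstration terminates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a transaction ID; no wraparound handling. */
typedef uint32_t ToyXid;

/* Pruning removes a deleted tuple only if its deleter precedes the
 * (possibly recomputed) prune horizon. */
static bool
toy_prune_removes(ToyXid tuple_xmax, ToyXid prune_horizon)
{
    return tuple_xmax < prune_horizon;
}

/* The scan treats a tuple as dead if its deleter precedes the cutoff
 * that was stored once, before the scan started. */
static bool
toy_scan_sees_dead(ToyXid tuple_xmax, ToyXid stored_cutoff)
{
    return tuple_xmax < stored_cutoff;
}

/*
 * Returns the number of passes needed to process one tuple, or -1 if
 * max_retries passes were not enough (the "hang" in this toy model).
 */
static int
toy_scan_page(ToyXid tuple_xmax, ToyXid stored_cutoff,
              ToyXid prune_horizon, int max_retries)
{
    for (int attempt = 1; attempt <= max_retries; attempt++)
    {
        if (toy_prune_removes(tuple_xmax, prune_horizon))
            return attempt;     /* tuple was pruned away */
        if (!toy_scan_sees_dead(tuple_xmax, stored_cutoff))
            return attempt;     /* tuple is legitimately kept */
        /* Dead by the stored cutoff, yet pruning left it: retry. */
    }
    return -1;
}
```

With a stored cutoff of 150 and a tuple deleted by XID 100, a prune
horizon of 160 finishes in one pass, while a prune horizon that has
moved backwards to 90 never removes the tuple and the loop retries
until max_retries runs out.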

What about testing? Can we find a simple way of reducing this
complicated repro to a less complicated repro with a failing
assertion? Maybe an assertion that we get to keep after the bug is
fixed?
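
One possible shape for such an assertion, purely as a sketch: remember
the last horizon that was computed and check that a newly computed one
never precedes it. The names below (ToyXid, toy_xid_precedes,
toy_horizon_is_monotonic) are hypothetical and not taken from the
PostgreSQL tree; the comparison uses modulo-2^32 arithmetic in the
spirit of TransactionIdPrecedes but ignores special XIDs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a transaction ID. */
typedef uint32_t ToyXid;

/* Wraparound-aware "a is older than b" (special XIDs not handled). */
static bool
toy_xid_precedes(ToyXid a, ToyXid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * The invariant a kept assertion could check: a newly computed
 * horizon must not be older than the previously computed one.
 */
static bool
toy_horizon_is_monotonic(ToyXid prev_horizon, ToyXid new_horizon)
{
    return !toy_xid_precedes(new_horizon, prev_horizon);
}
```

A caller would wrap the check in an assertion at the point where the
horizon is recomputed, for example
assert(toy_horizon_is_monotonic(prev, next)), and then store next as
the new prev.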

--
Peter Geoghegan

In response to

Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() at 2021-11-03 15:45:58 from Matthias van de Meent

Responses

Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() at 2021-11-05 11:43:00 from Matthias van de Meent