Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()

From: Noah Misch <noah(at)leadboat(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, robertmhaas(at)gmail(dot)com, Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Date: 2024-01-10 21:15:44
Message-ID: 20240110211544.0b.nmisch@google.com
Lists: pgsql-bugs

On Wed, Jan 10, 2024 at 02:57:34PM -0500, Peter Geoghegan wrote:
> On Wed, Jan 10, 2024 at 2:38 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > > I'm referring to calls such as the
> > > "GetOldestNonRemovableTransactionId(NULL)" and
> > > "GlobalVisCheckRemovableFullXid()" calls that take place inside
> > > _bt_pendingfsm_finalize(). It's not like we do stuff like that in very
> > > many other places.
> >
> > I see what you mean about the rarity and potential importance of
> > "GetOldestNonRemovableTransactionId(NULL)". There's just one other caller,
> > vac_update_datfrozenxid(), which calls it for an unrelated cause.
>
> I just noticed another detail that adds significant weight to this
> theory: it looks like the problem is hit on the first tuple located on
> the first heap page that VACUUM scans *after* it completes its first
> round of index vacuuming (I'm inferring this from vacrel state,
> particularly its lpdead_items instrumentation counter). The dead_items
> array is as large as possible here (just under 1 GiB), and
> lpdead_items is 178956692 (which uses up all of our dead_items space).
> VACUUM scans tens of gigabytes of heap pages before it begins this
> initial round of index vacuuming (according to vacrel->scanned_pages).
>
> What are the chances that all of this is just a coincidence? Low, I'd say.

Agreed. I bet you've made an important finding, there.
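
For anyone checking the numbers quoted above, here is a rough sketch of the capacity arithmetic. It is not taken from the PostgreSQL source. Every constant is an assumption made for illustration: an allocation cap of 0x3FFFFFFF bytes, 6 bytes per stored TID, an 8-byte header ahead of the TID array, and 291 as the most line pointers an 8 kB heap page can hold. The helper names max_dead_items and is_effectively_full are hypothetical. The "effectively full" rule is likewise an assumption: that index vacuuming starts once the array cannot be guaranteed to hold one more page's worth of TIDs.

```python
# Rough capacity arithmetic for VACUUM's dead_items array.
# Every constant here is an assumption for illustration, not read from the source tree.

MAX_ALLOC_SIZE = 0x3FFFFFFF      # assumed allocation cap, one byte under 1 GiB
TID_SIZE = 6                     # assumed bytes per stored heap TID
HEADER_SIZE = 8                  # assumed fixed header ahead of the TID array
MAX_HEAP_TUPLES_PER_PAGE = 291   # assumed per-page maximum for 8 kB blocks


def max_dead_items(alloc_bytes: int = MAX_ALLOC_SIZE) -> int:
    """Number of TIDs that fit in one allocation of alloc_bytes (hypothetical helper)."""
    return (alloc_bytes - HEADER_SIZE) // TID_SIZE


def is_effectively_full(num_items: int, max_items: int) -> bool:
    """True when the free slots could not absorb one more worst-case heap page."""
    return max_items - num_items < MAX_HEAP_TUPLES_PER_PAGE


reported = 178956692             # lpdead_items value quoted in the thread
capacity = max_dead_items()

# Print the capacity, the free slots left, and whether that counts as full.
print(capacity, capacity - reported, is_effectively_full(reported, capacity))
```

Under those assumptions the reported count leaves fewer free slots than one worst-case page would need, which fits the reading that lpdead_items had used up all of the dead_items space just before the first round of index vacuuming.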
