Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Date: 2021-06-06 19:03:53
Message-ID: CAH2-Wzm56eJrUhRPnXupx1-uONe4bPdZNUfEOb4S8iBE5smReA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jun 6, 2021 at 11:43 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> Sorry, but I already killed the process to try to follow Matthias' suggestion.
> I have a core file from "gcore" but it looks like it's incomplete and the
> address is now "out of bounds"...

Based on what you said about ending up back in lazy_scan_prune()
alone, I think he's right. That is, I agree that it's very likely that
the stuck VACUUM would not have become stuck had the "goto retry on
HEAPTUPLE_DEAD inside lazy_scan_prune" thing not been added by commit
8523492d4e3. But that in itself doesn't necessarily implicate commit
8523492d4e3.

The interesting question is: Why doesn't heap_page_prune() ever agree
with HeapTupleSatisfiesVacuum() calls made from lazy_scan_prune(), no
matter how many times the call to heap_page_prune() is repeated? (It's
repeated to try to resolve the disagreement that aborted xacts can
sometimes cause.)

If I had to guess I'd say that the underlying problem has something to
do with heap_prune_satisfies_vacuum() not agreeing with
HeapTupleSatisfiesVacuum(), perhaps only when GlobalVisCatalogRels is
used. But that's a pretty wild guess at this point.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-06-06 19:10:07 Re: Since '2001-09-09 01:46:40'::timestamp microseconds are lost when extracting epoch
Previous Message Justin Pryzby 2021-06-06 18:43:11 Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic