Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Date: 2021-06-10 17:29:29
Message-ID: CAEze2Wh-2XjU8GXiWXA-bnqFUas96De1YJH+oA58sqjScdi=rg@mail.gmail.com
Lists: pgsql-hackers

On Thu, 10 Jun 2021 at 19:07, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Thu, Jun 10, 2021 at 9:57 AM Matthias van de Meent
> <boekewurm+postgres(at)gmail(dot)com> wrote:
> > > By "matches what we expect", I meant "involves a just-aborted
> > > transaction". We could defensively verify that the inserting
> > > transaction concurrently aborted at the point of retrying/calling
> > > heap_page_prune() a second time. If there is no aborted transaction
> > > involved (as was the case with this bug), then we can be confident
> > > that something is seriously broken.
> >
> > I believe there are more cases than only the rolled back case, but
> > checking for those cases would potentially help, yes.
>
> Why do you believe that there are other cases?
>
> I'm not aware of any case that causes lazy_scan_prune() to retry using
> the goto, other than the aborted transaction case I described
> (excluding the bug that you diagnosed, which was of course never
> supposed to happen). If it really is possible to observe a retry for
> any other reason then I'd very much like to know all the details - it
> might well signal a distinct bug of the same general variety.

I see one exit that returns HEAPTUPLE_DEAD for a potentially recently
committed xvac (?). We might also check against recently committed
transactions if xmin == xmax, although apparently that is not
implemented right now.
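
For illustration only, here is a toy model of the defensive check
discussed above: retry pruning when the scan finds a dead tuple, but
treat the retry as legitimate only if the inserting transaction has
aborted. Every name in it (ToyTuple, toy_scan_prune, and so on) is
hypothetical and none of it is the actual PostgreSQL code; the
dead_to_scan/dead_to_prune flags are an assumed stand-in for the two
visibility checks disagreeing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT PostgreSQL's real types or functions. */
typedef enum
{
    TOY_XACT_IN_PROGRESS,
    TOY_XACT_COMMITTED,
    TOY_XACT_ABORTED
} ToyXactStatus;

typedef struct
{
    bool          removed;        /* already pruned away */
    ToyXactStatus inserter;       /* state of the inserting transaction */
    bool          dead_to_prune;  /* pruning considers the tuple dead */
    bool          dead_to_scan;   /* the later scan considers it dead */
} ToyTuple;

static bool
toy_prune_sees_dead(const ToyTuple *t)
{
    return t->inserter == TOY_XACT_ABORTED || t->dead_to_prune;
}

static bool
toy_scan_sees_dead(const ToyTuple *t)
{
    return t->inserter == TOY_XACT_ABORTED || t->dead_to_scan;
}

/* Hook that models a transaction aborting between prune and scan. */
static void
toy_abort_in_progress(ToyTuple *tuples, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tuples[i].inserter == TOY_XACT_IN_PROGRESS)
            tuples[i].inserter = TOY_XACT_ABORTED;
}

/*
 * Returns the number of retries, or -1 if a retry would be needed even
 * though the inserting transaction did not abort (the "something is
 * seriously broken" case).
 */
static int
toy_scan_prune(ToyTuple *tuples, size_t n,
               void (*after_prune)(ToyTuple *, size_t))
{
    int retries = 0;

retry:
    /* Prune pass: remove everything pruning considers dead. */
    for (size_t i = 0; i < n; i++)
        if (!tuples[i].removed && toy_prune_sees_dead(&tuples[i]))
            tuples[i].removed = true;

    /* Window in which a concurrent abort can happen (first pass only). */
    if (after_prune != NULL && retries == 0)
        after_prune(tuples, n);

    /* Scan pass: a remaining dead tuple forces a retry. */
    for (size_t i = 0; i < n; i++)
    {
        if (tuples[i].removed)
            continue;
        if (toy_scan_sees_dead(&tuples[i]))
        {
            /* Defensive check: only an aborted inserter is expected. */
            if (tuples[i].inserter != TOY_XACT_ABORTED)
                return -1;
            retries++;
            goto retry;
        }
    }
    return retries;
}
```

In this model a concurrent abort yields exactly one retry, while a
tuple that the scan considers dead but pruning does not is reported
instead of looping forever.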

With regards,

Matthias van de Meent
