Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Date: 2021-06-06 18:43:11
Message-ID: 20210606184311.GU14099@telsasoft.com
Lists: pgsql-hackers

On Sun, Jun 06, 2021 at 11:00:38AM -0700, Peter Geoghegan wrote:
> On Sun, Jun 6, 2021 at 9:35 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > I'll leave the instance running for a little bit before restarting (or kill-9)
> > in case someone requests more info.
>
> How about dumping the page image out, and sharing it with the list?
> This procedure should work fine from gdb:

Sorry, but I already killed the process to try to follow Matthias' suggestion.
I have a core file from "gcore" but it looks like it's incomplete and the
address is now "out of bounds"...

#2 0x00000000004fd9bf in lazy_scan_prune (vacrel=vacrel@entry=0x1d1b390, buf=buf@entry=14138, blkno=blkno@entry=75, page=page@entry=0x2aaab2089e00 <Address 0x2aaab2089e00 out of bounds>,

I saved a copy of the datadir, but a manual "vacuum" doesn't trigger the
problem. So if Matthias' theory is right, it seems like there may be a race
condition. Maybe that goes without saying.

--
Justin
