From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic |
Date: | 2021-06-16 19:22:02 |
Message-ID: | 20210616192202.6q63mu66h4uyn343@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2021-06-16 09:46:07 -0700, Peter Geoghegan wrote:
> On Wed, Jun 16, 2021 at 9:03 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > On Wed, Jun 16, 2021 at 3:59 AM Matthias van de Meent wrote:
> > > So the implicit assumption in heap_page_prune that
> > > HeapTupleSatisfiesVacuum(OldestXmin) is always consistent with
> > > heap_prune_satisfies_vacuum(vacrel) has never been true. In that case,
> > > we'll need to redo the condition in heap_page_prune as well.
> >
> > I don't think that this shows that the assumption within
> > lazy_scan_prune() (the assumption that both "satisfies vacuum"
> > functions agree) is wrong, with the obvious exception of cases
> > involving the bug that Justin reported. GlobalVis*.maybe_needed is
> > supposed to be conservative.
>
> I suppose it's true that they can disagree because we call
> vacuum_set_xid_limits() to get an OldestXmin inside vacuumlazy.c
> before calling GlobalVisTestFor() inside vacuumlazy.c to get a
> vistest. But that only implies that a tuple that would have been
> considered RECENTLY_DEAD inside lazy_scan_prune() (it just missed
> being considered DEAD according to OldestXmin) is seen as an LP_DEAD
> stub line pointer. Which really means it's DEAD to lazy_scan_prune()
> anyway. These days the only way that lazy_scan_prune() can consider a
> tuple fully DEAD is if it's no longer a tuple -- it has to actually be
> an LP_DEAD stub line pointer.
I think it's more complicated than that - "before" isn't a guarantee when the
horizon can go backwards. Consider the case where a hot_standby_feedback=on
replica without a slot connects - that can result in the xid horizon going
backwards.
I think a good way to address this might be to have GlobalVisUpdateApply()
ensure that maybe_needed does not go backwards within one backend.
This is *nearly* already guaranteed within vacuum. The exception is a
catalog access between vacuum_set_xid_limits() and GlobalVisTestFor() that
leads to an attempt at pruning: theoretically, that could move maybe_needed
backwards if, in between those two steps, a replica connected and caused
the horizon to go backwards.
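To make the suggestion concrete, here is a minimal sketch of the clamping idea. It is not the actual GlobalVisUpdateApply() code: the VisState struct, the FullXid typedef, and the vis_* / fullxid_newer helpers are hypothetical stand-ins, and xids are modelled as plain 64-bit counters with no wraparound handling. It only shows a backend-local maybe_needed that refuses to move backwards when a newly computed horizon is older than the one already applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a 64-bit full transaction id (assumption). */
typedef uint64_t FullXid;

/* Hypothetical, reduced model of backend-local visibility-test state. */
typedef struct VisState
{
    FullXid maybe_needed;       /* xids before this are treated as removable */
    FullXid definitely_needed;  /* xids at or after this are still needed */
} VisState;

/* Return the newer (larger) of two full xids. */
static inline FullXid
fullxid_newer(FullXid a, FullXid b)
{
    return (a > b) ? a : b;
}

/*
 * Apply freshly computed horizons to the backend-local state.
 *
 * maybe_needed is clamped so that it never retreats within this backend,
 * even if the newly computed horizon is older (for example after a
 * hot_standby_feedback replica without a slot connects).
 * definitely_needed is kept at or above maybe_needed.
 */
static void
vis_update_apply(VisState *state, FullXid new_maybe_needed,
                 FullXid new_definitely_needed)
{
    state->maybe_needed = fullxid_newer(state->maybe_needed,
                                        new_maybe_needed);
    state->definitely_needed = fullxid_newer(new_definitely_needed,
                                             state->maybe_needed);

    assert(state->maybe_needed <= state->definitely_needed);
}

/* In this reduced model, an xid is removable iff it precedes maybe_needed. */
static bool
vis_is_removable(const VisState *state, FullXid xid)
{
    return xid < state->maybe_needed;
}
```

With a clamp like this, an xid that was judged removable earlier in the backend stays removable later, so two "satisfies vacuum" style checks made at different times cannot disagree merely because the shared horizon regressed in between.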
Greetings,
Andres Freund