Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2021-11-12 04:22:08
Message-ID: CAH2-WznkPUmQEd-7Li35Lbram4agSZsEg_SB92j5xKvw-2QB4Q@mail.gmail.com
Lists: pgsql-bugs

On Thu, Nov 11, 2021 at 4:58 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > What prevents the scenario where some other backend has, e.g., a snapshot with
> > xmin=xmax=RECENTLY_DEAD-row? If the RECENTLY_DEAD row has an xid that is later
> > than the DEAD row, this afaict would make it perfectly legal to prune the DEAD
> > row, but *not* the RECENTLY_DEAD one.
>
> I'll need to think about this very carefully. I didn't think it was
> worth blocking v3 on, though naturally it's a big concern.

If we're to traverse HOT chains right to the end in
heap_prune_chain(), reading even LIVE tuples (per the approach
proposed in my bugfix patch), we probably need to be more careful
about concurrently aborted xacts -- relying on the usual
!HeapTupleHeaderIsHotUpdated(htup) test doesn't seem safe.

Imagine if we land on a concurrently-aborted DEAD tuple at the end of
a physical HOT chain -- this might not be caught before we test the
previous tuple in the chain using HeapTupleHeaderIsHotUpdated(htup) --
the abort might happen just as we land on the final/aborted tuple. We
certainly shouldn't conclude that the whole HOT chain is now DEAD,
just because that one tuple is dead.

That definitely cannot happen on HEAD, I think, because we just give
up as soon as we see anything that isn't either DEAD or RECENTLY_DEAD.
But maybe it's possible with the patch.
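
The contrast between the two traversal styles can be sketched with a toy
model in C. This is only an illustration under stated assumptions, not
PostgreSQL code and not the patch: ToyStatus, ToyTuple, toy_prune_head()
and toy_prune_naive_full_walk() are hypothetical names, the hot_updated
flag merely stands in for the HeapTupleHeaderIsHotUpdated(htup) test, and
the "naive" function is a deliberately careless full walk meant to show
the kind of wrong conclusion described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tuple states named in this message. */
typedef enum { TOY_DEAD, TOY_RECENTLY_DEAD, TOY_LIVE } ToyStatus;

typedef struct {
    ToyStatus status;
    bool hot_updated;   /* stand-in for HeapTupleHeaderIsHotUpdated(htup) */
} ToyTuple;

/*
 * HEAD-like walk (simplified): give up at the first tuple that is neither
 * DEAD nor RECENTLY_DEAD.  Returns how many leading chain members could be
 * pruned, i.e. everything up to and including the latest DEAD tuple seen.
 */
size_t toy_prune_head(const ToyTuple *chain, size_t n)
{
    size_t prunable = 0;

    for (size_t i = 0; i < n; i++) {
        if (chain[i].status == TOY_DEAD)
            prunable = i + 1;
        else if (chain[i].status != TOY_RECENTLY_DEAD)
            break;              /* LIVE (or anything else): stop here */
        if (!chain[i].hot_updated)
            break;              /* end of the chain */
    }
    return prunable;
}

/*
 * Deliberately naive full walk: keeps going past LIVE tuples and treats
 * any DEAD tuple as proof that everything before it is prunable.
 */
size_t toy_prune_naive_full_walk(const ToyTuple *chain, size_t n)
{
    size_t prunable = 0;

    for (size_t i = 0; i < n; i++) {
        if (chain[i].status == TOY_DEAD)
            prunable = i + 1;
        if (!chain[i].hot_updated)
            break;              /* end of the chain */
    }
    return prunable;
}
```

For a two-member chain whose first tuple is LIVE (with hot_updated still
set) and whose final tuple is DEAD because its inserter aborted, the
HEAD-like walk reports nothing prunable, while the naive walk reports both
members. A walk that reads to the end of the chain would need some extra
check to avoid that second outcome.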

--
Peter Geoghegan
