Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2021-11-13 01:57:07
Message-ID: CAH2-Wzn7GCX_JJ02WQfM9+hCtAEfYa8t5+1kftw1TTktjhsm4g@mail.gmail.com
Lists: pgsql-bugs

On Fri, Nov 12, 2021 at 5:48 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2021-11-12 16:37:24 -0800, Peter Geoghegan wrote:
> > There is one thing that I am fairly confident of here: the HOT chain
> > validation stuff is very robust. Experience has shown the invariants
> > to be very reliable (and if they're not reliable then we're in big
> > trouble anyway).
>
> What experience? Isn't this all new stuff?

I meant heapam since 2007, when HOT was first added -- all of it.

Notably, v4 of the patch makes the most conservative possible
assumptions about how HTSV might change its mind about an XID -- no
more "A RECENTLY_DEAD tuple is actually sometimes DEAD in the specific
context of processing a HOT chain". Now it's more like "A DEAD tuple
is actually sometimes RECENTLY_DEAD at the level of a HOT chain,
except for disconnected/aborted heap-only tuples, and if you don't
like that (i.e. during VACUUM) you can just retry pruning from
scratch immediately afterwards".

You said it yourself: who knows exactly what the justification for
RECENTLY_DEAD->DEAD was? I have to imagine it had something to do with the
"INSERT_IN_PROGRESS becomes DEAD due to concurrent xact abort" thing,
but that's unclear. And even if it was clear, and even if we knew that
it was 100% safe at one point, it still wouldn't be clear that it's
safe today, in Postgres 14.

--
Peter Geoghegan
