Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2021-11-13 00:37:24
Message-ID: CAH2-WzmLVPByK+5LMHQynn_M45vFot7YXWLNT5aAwvBBve_Ezg@mail.gmail.com
Lists: pgsql-bugs

On Fri, Nov 12, 2021 at 3:56 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Why is that simpler than a boolean array, which represents whether or
> > not the item has had its heap_prune_record_unused() call yet (if it's
> > a tuple with storage)?
>
> Well, your change is basically a new approach of pruning - a much better
> one. But it's a larger change than just eliminating the repeated HTSV calls so
> that they cannot change over time. That'd be ~10-15 lines.

Before Wednesday, I had no idea that a HEAPTUPLE_DELETE_IN_PROGRESS
tuple could become HEAPTUPLE_DEAD during VACUUM/pruning at all. And so
I now have to wonder: what else don't I know?

There is one thing that I am fairly confident of here: the HOT chain
validation stuff is very robust. Experience has shown the invariants
to be very reliable (and if they're not reliable then we're in big
trouble anyway). The invariants can be leveraged to make the pruning
fix robust against both the issue we know about, and hypothetical
undiscovered issues that also may affect pruning. The overall risk
seems much lower with my patch. It isn't the smallest possible patch,
certainly, but that doesn't seem all that relevant, given the
specifics of the situation.

--
Peter Geoghegan
