Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2021-11-12 23:47:45
Message-ID: CAH2-WzkDaMqv=Bw5fzdb750zBkuh=xeUJEy6L=ZWVuLRt4F7bw@mail.gmail.com
Lists: pgsql-bugs

On Fri, Nov 12, 2021 at 3:31 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I wonder if we should try to go for something considerably simpler for 14. How
> about having a new array that just stores the HTSV state for every
> ItemIdIsNormal(). For simplicity, we could populate that array eagerly in a
> separate loop.

Why is that simpler than a boolean array, which represents whether or
not the item has had its heap_prune_satisfies_vacuum() call yet (if
it's a tuple with storage)?

> That'd fix the known bugs, and yield better efficiency (because we'd not
> re-compute HTSV all the time). Then for HEAD go for something that fixes
> pruning more fundamentally.

I don't know what you mean about the patch recomputing HTSV all the
time. The patch doesn't do that.

It's true that we'll call HTSV (heap_prune_satisfies_vacuum(), actually)
more often when following HOT chains, because now we follow them until
the end. However, the heap_prune_satisfies_vacuum() calls only happen
after we've already validated that we found a heap-only tuple that's
part of the same HOT chain. That just leaves disconnected tuples. We
do call heap_prune_satisfies_vacuum() there too (which is theoretically
unnecessary), but only once.

Overall, we are *guaranteed* to only call heap_prune_satisfies_vacuum()
at most once per tuple with storage. I believe that this is a small
reduction, since HEAD will do the maybe-aborted precheck call to
heap_prune_satisfies_vacuum() before anything else.
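
[Editor's illustration, not code from the patch.] The at-most-once
guarantee described above can be sketched with a toy model. Everything
here is an assumption made for illustration: ToyPage, ToyPruneState,
toy_visibility_check() and toy_prune_page() are hypothetical names, and
the "page" is just an array of successor links rather than a real heap
page. The sketch only shows how a per-item boolean array lets chain
traversal and the later pass over disconnected heap-only tuples each
skip items whose visibility check (HTSV) has already happened.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ITEMS 16            /* toy limit, not a real page limit */

/* Hypothetical stand-in for a heap page: one entry per tuple with storage. */
typedef struct ToyPage
{
    int  nitems;
    int  next[MAX_ITEMS];       /* HOT-chain successor, or -1 at chain end */
    bool heap_only[MAX_ITEMS];  /* true if the item is a heap-only tuple */
} ToyPage;

typedef struct ToyPruneState
{
    bool checked[MAX_ITEMS];    /* has this item had its visibility check? */
    int  ncalls[MAX_ITEMS];     /* instrumentation: checks made per item */
} ToyPruneState;

/* Stand-in for the per-tuple visibility check; only counts invocations. */
static void
toy_visibility_check(ToyPruneState *st, int item)
{
    assert(item >= 0 && item < MAX_ITEMS);
    assert(!st->checked[item]); /* would fire on a second call */
    st->checked[item] = true;
    st->ncalls[item]++;
}

static void
toy_prune_page(const ToyPage *page, ToyPruneState *st)
{
    /* Pass 1: start at each root item and follow its chain to the end. */
    for (int i = 0; i < page->nitems; i++)
    {
        if (page->heap_only[i])
            continue;           /* only reachable via a root, or disconnected */
        for (int cur = i; cur != -1 && !st->checked[cur]; cur = page->next[cur])
            toy_visibility_check(st, cur);
    }

    /* Pass 2: disconnected heap-only tuples are checked here, once. */
    for (int i = 0; i < page->nitems; i++)
    {
        if (!st->checked[i])
            toy_visibility_check(st, i);
    }
}
```

For a page with the chain 0 -> 1 -> 2, a disconnected heap-only item 3,
and a lone root item 4, calling toy_prune_page() on a zero-initialized
ToyPruneState leaves ncalls[i] == 1 for every item.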

I guess you might have meant something about the more-conservative
behavior with DEAD vs RECENTLY_DEAD during HOT chain traversal. But
the cases where that makes any difference at all ought to be very rare
indeed.

--
Peter Geoghegan
