Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2022-03-04 00:22:11
Message-ID: CAH2-WznYsUxVT156rCQ+q=YD4S4=1M37hWvvHLz-H1pwSM8-Ew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Fri, Dec 10, 2021 at 8:57 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> 0004 is a patch that tries to address your point about the GlobalVisTestFor()
> placement, sort of

Following up on this now (will follow up on the "harden pruning" patch
separately).

> My earlier point that moving it to earlier would be a bad idea because it'd
> make the horizon "older" was bogus - GlobalVisTestFor() doesn't itself do
> any horizon determination. However moving the GlobalVisTestFor() earlier
> still seems wrong - imo the vacuum_set_xid_limits() call should be moved to
> later. The attached patch moves both, wrapped in a new function, to just
> before the scan in lazy_scan_heap().

Attached is (roughly speaking) a rebased version of this patch (i.e.
your v4-0004- patch from the patch series). Like your patch, this
patch documents our expectations for vistest and OldestXmin. Unlike
your patch, it makes all vacrel initialization take place before
lazy_scan_heap is called, from inside heap_vacuum_rel.

An important detail here is how OldestXmin (and vistest) are related
to rel_pages -- I also added something about that. It definitely
wouldn't be okay to (say) move the call to RelationGetNumberOfBlocks
to any place before the call to vacuum_set_xid_limits. And so it very
much seems in scope here.

I'm a bit confused about at least one detail here: you've said here
that "GlobalVisTestFor() doesn't itself do any horizon determination",
which seems to contradict comments from the v4-0004- patch that was
attached to the same email. This WIP patch claims that there is some
advantage to delaying the call to GlobalVisTestFor regardless, though
I think that there is a good chance that that detail is wrong. Please
let me know which it is.

I also updated nearby comments about the update to the target heap
rel's pg_class entry -- that work was done in passing.

Thanks
--
Peter Geoghegan

Attachment Content-Type Size
v1-0001-WIP-Move-GlobalVisState-state-into-vacrel.patch application/octet-stream 11.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Peter Geoghegan 2022-03-04 03:31:32 Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Previous Message Bruce Momjian 2022-03-03 17:51:57 Re: Cannot refresh materialized view concurrently if you have a column name called "mv"