Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2021-11-11 03:33:01
Message-ID: CAH2-Wz=cLQBJEYOD=c8HBwx9+s2X38Y6nhd=WGRFtUHS8fQwWg@mail.gmail.com
Lists: pgsql-bugs

On Wed, Nov 10, 2021 at 7:16 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I don't really think it's a good idea to move it to earlier. If anything I
> would want to move it later (and eventually occasionally re-compute it, that'd
> be a *huge* win for longrunning vacuums).

I agree that it would be a huge win, especially when we want to set
the VM after heap vacuuming, during the second heap pass. But...

> We can't really do the same with the
> hard horizons. It's currently super likely that the horizon moves forward a
> lot between vacuum_set_xid_limits() and GlobalVisTestFor(), but moving it
> closer still is going in the wrong direction.

...why is that super likely? I have to admit that I have no idea what
you mean by that.

> So I'm inclined to add a comment saying that we need to do the
> GlobalVisTestFor(), pointing to the comment in ComputeXidHorizons(), and
> adding a TODO that we should find a heuristic when to recompute horizons?

Okay. Even so, I suggest that you say something about this new
DELETE_IN_PROGRESS pruning behavior in vacuumlazy.c. Maybe in
lazy_scan_prune()?

> > We could probably *also* freeze tuples opportunistically (e.g., freeze
> > a few tuples on a page early to be able to mark it all-frozen sooner),
>
> We should definitely do that when a page is already being dirtied for other
> reasons.

Agreed. But at the very least we should be thinking about the
whole-page picture. If we made freezing a whole page cheap enough
(smaller WAL records), then maybe the separate all-visible state
becomes totally unnecessary (except for pg_upgrade).

--
Peter Geoghegan
