Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2022-01-14 02:15:49
Message-ID: 20220114021549.m6ugihzjmsunv3jd@alap3.anarazel.de
Lists: pgsql-bugs

On 2022-01-12 13:05:45 -0800, Peter Geoghegan wrote:
> On Wed, Jan 12, 2022 at 11:25 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > Any blockers?
> >
> > I'm just struggling with / procrastinating on the commit message, tbh. The
> > whole issue is kinda complicated to explain... :/

After struggling some more, I *finally* pushed the fix and the new assertions.

Thanks for the bug report, investigation, review, etc!

> I think that it would make sense for the commit message to frame the
> problem as: pruneheap.c doesn't take sufficient care when traversing
> HOT chains to determine their full extent, for the purposes of
> pruning. There was a general lack of robustness, and the snapshot
> scalability work happened to run into that, resulting in hot chain
> corruption under very specific conditions.
>
> If I was in your position I think I would resist framing the problem
> in this way; I'd probably be concerned that it would come off as
> shifting the blame elsewhere. This high level explanation of things
> makes the most sense to me, though. Surely that's the most important
> thing.

Thanks!
