Re: Assert in heapgettup_pagemode() fails due to underlying buffer change

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Assert in heapgettup_pagemode() fails due to underlying buffer change
Date: 2024-06-07 22:36:45
Message-ID: CA+hUKGK9KfChzZxgtXp=6iwzjRNHbMqS3Hje+jM6__z2XAFR7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jun 8, 2024 at 12:47 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Jun 7, 2024 at 4:05 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > > static void
> > > -ZeroBuffer(Buffer buffer, ReadBufferMode mode)
> > > +ZeroBuffer(Buffer buffer, ReadBufferMode mode, bool zero)
> >
> > This change makes the API very strange. Should the function be called
> > ZeroAndLockBuffer() instead? Then the addition of a "bool zero"
> > argument makes a lot more sense.
>
> I agree that's better, but it still looks a bit weird. You have to
> realize that 'bool zero' means 'is already zeroed' here -- or at
> least, I guess that's the intention. But then I wonder why you'd call
> a function called ZeroAndLockBuffer if all you need to do is
> LockBuffer.

The name weirdness comes directly from RBM_ZERO_AND_LOCK (the fact
that it doesn't always zero despite shouting ZERO is probably what
temporarily confused me). But coming up with a better name is hard
and I certainly don't propose to change it now. I think it's
reasonable for this internal helper function to have that matching
name as Alvaro suggested, with a good comment about that.

Even though that quick-demonstration change fixed the two reported
repros, I think it is still probably racy (or if it isn't, it relies
on higher level interlocking that I don't want to rely on). This case
really should be using the standard StartBufferIO/TerminateBufferIO
infrastructure as it was before. I had moved that around to deal with
multi-block I/O, but dropped the ball on the zero case... sorry.

Here's a version like that. The "zero" argument (yeah that was not a
good name) is now inverted and called "already_valid", but it's only a
sort of optimisation for the case where we already know for sure that
it's valid. If it isn't, we do the standard
BM_IO_IN_PROGRESS/BM_VALID/CV dance, for correct interaction with any
concurrent read or zero operation.

Attachment Content-Type Size
v2-0001-Fix-RBM_ZERO_AND_LOCK.patch application/octet-stream 3.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2024-06-08 04:00:00 Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed
Previous Message Michael Paquier 2024-06-07 22:31:21 Re: PgStat_KindInfo.named_on_disk not required in shared stats