Re: Is this a problem in GenericXLogFinish()?

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is this a problem in GenericXLogFinish()?
Date: 2023-10-16 23:31:14
Message-ID: b1f2d0f230d60fa8df33bb0b2af3236fde9d750d.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2023-10-11 at 14:53 +0300, Heikki Linnakangas wrote:
>
> The comment suggests that you don't need to hold an exclusive lock
> when
> you call this, but there's an assertion that you do.

Reworded.

> Is it a new requirement that you must hold an exclusive lock on the
> buffer when you call XLogRegisterBuffer()? I think that's reasonable.

Agreed.

>
> I'd suggest a comment here to explain what's wrong if someone hits
> this
> assertion. Something like "The buffer must be marked as dirty before
> registering, unless REGBUF_CLEAN_OK is set to indicate that the
> buffer
> is not being modified".

Done, with different wording.

>
> > If we commit this, ideally someone would look into the places where
> > REGBUF_CLEAN_OK is used and make sure that it's not a bug, and
> > perhaps
> > refactor so that it would benefit from the Assert. But refactoring
> > those places is outside of the scope of this patch.

I don't think it makes sense to register a buffer that is perhaps clean
and perhaps not. After registering a buffer and writing WAL, you need
to PageSetLSN(), and that should only be done after MarkBufferDirty(),
right?

So if you need to condition PageSetLSN() on whether MarkBufferDirty()
has happened, it would be trivial to use the same condition for
XLogRegisterBuffer(). Therefore I changed the name back to
REGBUF_NO_CHANGE, as you suggested originally.

The hash indexing code knows it's not modifying the bucket buffer,
doesn't mark it dirty, and doesn't set the LSN. I presume that's safe.

However, there are quite a few places where
XLogRegisterBuffer()+WAL+PageSetLSN() all happen, but it's not
immediately obvious that all code paths to get there also
MarkBufferDirty(). For instanace:

lazy_scan_heap() -- conditionally marks heapbuf as dirty
visibilitymap_set() -- conditionally calls log_heap_visible
log_heap_visible()
XLogRegisterBuffer(1, heap_buffer, flags)

if those conditions don't match up exactly, it seems we could get into
a situation where we update the LSN of a clean page, which seems bad.

There are a few other places in the hash code and spgist code where I
have similar concerns. Not many, though, I looked at all of the call
sites (at least briefly) and most of them are fine.

> About that: you added the flag to a couple of XLogRegisterBuffer()
> calls
> in GIN, because they call MarkBufferDirty() only after
> XLogRegisterBuffer(). Those would be easy to refactor so I'd suggest
> doing that now.

Done.

> > It sounds like we have no intention to change the hash index code,
> > so
> > are we OK if this flag just lasts forever? Do you still think it
> > offers
> > a useful check?
>
> Yeah, I think this is a useful assertion. It might catch some bugs in
> extensions too.

I was asking more about the REGBUF_NO_CHANGE flag. It feels very
specific to the hash indexes, and I'm not sure we want to encourage
extensions to use it.

Though maybe the locking protocol used by hash indexes is more
generally applicable, and other indexing strategies might want to use
it, too?

Another option might be to just change the hash indexing code to follow
the correct protocol, locking and calling MarkBufferDirty() in those 3
call sites. Marking the buffer dirty is easy, but making sure that it's
locked might require some refactoring. Robert, would following the
right protocol here affect performance?

Regards,
Jeff Davis

Attachment Content-Type Size
v4-0002-Assert-that-buffers-are-marked-dirty-before-XLogR.patch text/x-patch 11.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-10-16 23:38:40 Re: The danger of deleting backup_label
Previous Message Michael Paquier 2023-10-16 23:25:29 Re: Add support for AT LOCAL