Re: InstallXLogFileSegment() vs concurrent WAL flush

From: Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: thomas(dot)munro(at)gmail(dot)com, nagata(at)sraoss(dot)co(dot)jp, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InstallXLogFileSegment() vs concurrent WAL flush
Date: 2024-10-31 08:03:06
Message-ID: CAE-ML+-vkbp_oU8qUdJux01EjQK1sDic48yT_aEPmoA0yC8mxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 5, 2024 at 11:58 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Fri, 2 Feb 2024 14:42:46 +0100, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote in
> > On Fri, Feb 2, 2024 at 12:56 PM Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:
> > > On Fri, 2 Feb 2024 11:18:18 +0100
> > > Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > > > One simple way to address that would be to make XLogFileInitInternal()
> > > > wait for InstallXLogFileSegment() to finish. It's a little
> > >
> > > Or, can we make sure the rename is durable by calling fsync before
> > > returning the fd, as a patch attached here?
> >
> > Right, yeah, that works too. I'm not sure which way is better.
>
> I'm not sure I like issuing spurious syncs unconditionally. Therefore,
> I prefer Thomas' approach in that regard. 0002 would be beneficial,
> considering the case of a very large max_wal_size, and the code seems
> to be the minimal required. I don't think it matters that the lock
> attempts occur uselessly until the first segment installation. That
> being said, we could avoid it by initializing
> last_known_installed_segno properly.

I agree with avoiding unconditional fsyncs as well. 0002 seems safer in terms
of avoiding contention. I attached a patch initing the
last_known_installed_segno to the segno of the starting checkpoint. I added
some more commentary too.

I added a comment that I'm not 100% sure about for durable_rename(). It was my
attempt at making other users of durable_rename() aware of a generalization of
the issue we are trying to tackle here.

Are there other uses of durable_rename() that can suffer a similar issue? For
instance, relmapper seems safe because we rely on RelationMappingLock. However,
does the wal summarizer?

Generally speaking, it's bad that durable_rename() has this window that makes
it less atomic than it seems. Ideally, we could have held some generic lock to
prevent that perhaps. Like have some file API within fd.c to grab a lock to
open a rename-able file, with the same LWLock being held during a
durable_rename()? Guess it would invite more contention though, and surrounding
code paths are already holding other LWLocks, which we may not be able to get
rid of.

Regards,
Soumyadeep (Broadcom)

Attachment Content-Type Size
v3-0001-InstallXLogFileSegment-vs-concurrent-WAL-flush.patch text/x-patch 4.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Previous Message Kyotaro Horiguchi 2024-10-31 08:01:30 Re: In-placre persistance change of a relation