Re: Get rid of WALBufMappingLock

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Cc: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, "Zhou, Zhiguo" <zhiguo(dot)zhou(at)intel(dot)com>
Subject: Re: Get rid of WALBufMappingLock
Date: 2025-02-14 10:24:15
Message-ID: CAPpHfdto+M6uTcOh1Snw2isHUs0oZwA_ywioY7KW_R26nu7Wzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 14, 2025 at 11:45 AM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote:
> On Fri, 14 Feb 2025 at 00:59, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> > On Thu, Feb 13, 2025 at 6:39 PM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote:
> > > On Thu, 13 Feb 2025 at 14:08, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> > > >
> > > > On Thu, Feb 13, 2025 at 11:45 AM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
> > > > > 13.02.2025 12:34, Alexander Korotkov пишет:
> > > > > > On Wed, Feb 12, 2025 at 8:16 PM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
> > > > > >> 08.02.2025 13:07, Alexander Korotkov пишет:
> > > > > >>> On Fri, Feb 7, 2025 at 1:39 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> > > > > >>>> Good, thank you. I think 0001 patch is generally good, but needs some
> > > > > >>>> further polishing, e.g. more comments explaining how does it work.
> > > > > >>
> > > > > >> I tried to add more comments. I'm not good at, so recommendations are welcome.
> > > > > >>
> > > > > >>> Two things comes to my mind worth rechecking about 0001.
> > > > > >>> 1) Are XLogCtl->InitializeReserved, XLogCtl->InitializedUpTo and
> > > > > >>> XLogCtl->xlblocks always page-aligned? Because algorithm seems to be
> > > > > >>> sensitive to that. If so, I would propose to explicitly comment that
> > > > > >>> and add corresponding asserts.
> > > > > >>
> > > > > >> They're certainly page aligned, since they are page borders.
> > > > > >> I added assert on alignment of InitializeReserved for the sanity.
> > > > > >>
> > > > > >>> 2) Check if there are concurrency issues between
> > > > > >>> AdvanceXLInsertBuffer() and switching to the new WAL file.
> > > > > >>
> > > > > >> There are no issues:
> > > > > >> 1. CopyXLogRecordToWAL for isLogSwitch follows same protocol, ie uses
> > > > > >> GetXLogBuffer to zero-out WAL page.
> > > > > >> 2. WALINSERT_SPECIAL_SWITCH forces exclusive lock on all insertion locks,
> > > > > >> so switching wal is not concurrent. (Although, there is no need in this
> > > > > >> exclusiveness, imho.)
> > > > > >
> > > > > > Good, thank you. I've also revised commit message and comments.
> > > > > >
> > > > > > But I see another issue with this patch. In the worst case, we do
> > > > > > XLogWrite() by ourselves, and it could potentially could error out.
> > > > > > Without patch, that would cause WALBufMappingLock be released and
> > > > > > XLogCtl->InitializedUpTo not advanced. With the patch, that would
> > > > > > cause other processes infinitely waiting till we finish the
> > > > > > initialization.
> > > > > >
> > > > > > Possible solution would be to save position of the page to be
> > > > > > initialized, and set it back to XLogCtl->InitializeReserved on error
> > > > > > (everywhere we do LWLockReleaseAll()). We also must check that on
> > > > > > error we only set XLogCtl->InitializeReserved to the past, because
> > > > > > there could be multiple concurrent failures. Also we need to
> > > > > > broadcast XLogCtl->InitializedUpToCondVar to wake up waiters.
> > > > >
> > > > > The single place where AdvanceXLInsertBuffer is called outside of critical
> > > > > section is in XLogBackgroundFlush. All other call stacks will issue server
> > > > > restart if XLogWrite will raise error inside of AdvanceXLInsertBuffer.
> > > > >
> > > > > XLogBackgroundFlush explicitely avoids writing buffers by passing
> > > > > opportunistic=true parameter.
> > > > >
> > > > > Therefore, error in XLogWrite will not cause hang in AdvanceXLInsertBuffer
> > > > > since server will shutdown/restart.
> > > > >
> > > > > Perhaps, we just need to insert `Assert(CritSectionCount > 0);` before call
> > > > > to XLogWrite here?
> > > >
> > > > You're correct. I just reflected this in the next revision of the patch.
> > >
> > > I've looked at the patchset v6.
> >
> > Oh, sorry, I really did wrong. I've done git format-patch for wrong
> > local branch for v5 and v6. Patches I've sent for v5 and v6 are
> > actually the same as my v1. This is really pity. Please, find the
> > right version of patchset attached.
>
> I've rechecked v7. In v6 a proposal from [1] was not reflected. Now it
> landed in v7.
>
> Other changes are not regarding code behavior. The things from my
> previous review that still could apply to v7:
>
> For 0001:
>
> Comment change proposed:
> "lock-free with cooperation with" -> "lock-free accompanied by changes
> to..." (maybe other variant)

Good catch. I've rephrased this comment even more.

> I propose a new define:
> #define FirstValidXLogRecPtr 1
> While FirstValidXLogRecPtr = InvalidXLogRecPtr + 1 is true in the code
> that has no semantical meaning and it's better to avoid using direct
> arithmetics to relate meaning of FirstValidXLogRecPtr from
> InvalidXLogRecPtr.

Makes sense, but I'm not sure if this change is required at all. I've
reverted this to the state of master, and everything seems to work.

> For 0002 both comments proposals from my message applied to v6 apply
> to v7 as well

Thank you for pointing. For now, I'm concentrated on improvements on
0001. Probably Yura could work on your notes to 0002.

------
Regards,
Alexander Korotkov
Supabase

Attachment Content-Type Size
v8-0001-Get-rid-of-WALBufMappingLock.patch application/octet-stream 14.2 KB
v8-0002-several-attempts-to-lock-WALInsertLocks.patch application/octet-stream 2.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2025-02-14 10:35:52 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Previous Message Amit Kapila 2025-02-14 10:06:04 Re: Add an option to skip loading missing publication to avoid logical replication failure