From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
Cc: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, "Zhou, Zhiguo" <zhiguo(dot)zhou(at)intel(dot)com> |
Subject: | Re: Get rid of WALBufMappingLock |
Date: | 2025-02-15 09:25:00 |
Message-ID: | CAPpHfds==kM3YJoTX64rV-CVsan8ARk3w32tgdCZ69BzD01i_g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi!
On Fri, Feb 14, 2025 at 4:11 PM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
> 14.02.2025 17:09, Yura Sokolov пишет:
> > 14.02.2025 13:24, Alexander Korotkov пишет:
> >> On Fri, Feb 14, 2025 at 11:45 AM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote:
> >>> On Fri, 14 Feb 2025 at 00:59, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> >>>> On Thu, Feb 13, 2025 at 6:39 PM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote:
> >>>>> On Thu, 13 Feb 2025 at 14:08, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> >>>>>>
> >>>>>> On Thu, Feb 13, 2025 at 11:45 AM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
> >>>>>>> 13.02.2025 12:34, Alexander Korotkov пишет:
> >>>>>>>> On Wed, Feb 12, 2025 at 8:16 PM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
> >>>>>>>>> 08.02.2025 13:07, Alexander Korotkov пишет:
> >>>>>>>>>> On Fri, Feb 7, 2025 at 1:39 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> >>>>>>>>>>> Good, thank you. I think 0001 patch is generally good, but needs some
> >>>>>>>>>>> further polishing, e.g. more comments explaining how does it work.
> >>>>>>>>>
> >>>>>>>>> I tried to add more comments. I'm not good at, so recommendations are welcome.
> >>>>>>>>>
> >>>>>>>>>> Two things comes to my mind worth rechecking about 0001.
> >>>>>>>>>> 1) Are XLogCtl->InitializeReserved, XLogCtl->InitializedUpTo and
> >>>>>>>>>> XLogCtl->xlblocks always page-aligned? Because algorithm seems to be
> >>>>>>>>>> sensitive to that. If so, I would propose to explicitly comment that
> >>>>>>>>>> and add corresponding asserts.
> >>>>>>>>>
> >>>>>>>>> They're certainly page aligned, since they are page borders.
> >>>>>>>>> I added assert on alignment of InitializeReserved for the sanity.
> >>>>>>>>>
> >>>>>>>>>> 2) Check if there are concurrency issues between
> >>>>>>>>>> AdvanceXLInsertBuffer() and switching to the new WAL file.
> >>>>>>>>>
> >>>>>>>>> There are no issues:
> >>>>>>>>> 1. CopyXLogRecordToWAL for isLogSwitch follows same protocol, ie uses
> >>>>>>>>> GetXLogBuffer to zero-out WAL page.
> >>>>>>>>> 2. WALINSERT_SPECIAL_SWITCH forces exclusive lock on all insertion locks,
> >>>>>>>>> so switching wal is not concurrent. (Although, there is no need in this
> >>>>>>>>> exclusiveness, imho.)
> >>>>>>>>
> >>>>>>>> Good, thank you. I've also revised commit message and comments.
> >>>>>>>>
> >>>>>>>> But I see another issue with this patch. In the worst case, we do
> >>>>>>>> XLogWrite() by ourselves, and it could potentially could error out.
> >>>>>>>> Without patch, that would cause WALBufMappingLock be released and
> >>>>>>>> XLogCtl->InitializedUpTo not advanced. With the patch, that would
> >>>>>>>> cause other processes infinitely waiting till we finish the
> >>>>>>>> initialization.
> >>>>>>>>
> >>>>>>>> Possible solution would be to save position of the page to be
> >>>>>>>> initialized, and set it back to XLogCtl->InitializeReserved on error
> >>>>>>>> (everywhere we do LWLockReleaseAll()). We also must check that on
> >>>>>>>> error we only set XLogCtl->InitializeReserved to the past, because
> >>>>>>>> there could be multiple concurrent failures. Also we need to
> >>>>>>>> broadcast XLogCtl->InitializedUpToCondVar to wake up waiters.
> >>>>>>>
> >>>>>>> The single place where AdvanceXLInsertBuffer is called outside of critical
> >>>>>>> section is in XLogBackgroundFlush. All other call stacks will issue server
> >>>>>>> restart if XLogWrite will raise error inside of AdvanceXLInsertBuffer.
> >>>>>>>
> >>>>>>> XLogBackgroundFlush explicitely avoids writing buffers by passing
> >>>>>>> opportunistic=true parameter.
> >>>>>>>
> >>>>>>> Therefore, error in XLogWrite will not cause hang in AdvanceXLInsertBuffer
> >>>>>>> since server will shutdown/restart.
> >>>>>>>
> >>>>>>> Perhaps, we just need to insert `Assert(CritSectionCount > 0);` before call
> >>>>>>> to XLogWrite here?
> >>>>>>
> >>>>>> You're correct. I just reflected this in the next revision of the patch.
> >>>>>
> >>>>> I've looked at the patchset v6.
> >>>>
> >>>> Oh, sorry, I really did wrong. I've done git format-patch for wrong
> >>>> local branch for v5 and v6. Patches I've sent for v5 and v6 are
> >>>> actually the same as my v1. This is really pity. Please, find the
> >>>> right version of patchset attached.
> >>>
> >>> I've rechecked v7. In v6 a proposal from [1] was not reflected. Now it
> >>> landed in v7.
> >>>
> >>> Other changes are not regarding code behavior. The things from my
> >>> previous review that still could apply to v7:
> >>>
> >>> For 0001:
> >>>
> >>> Comment change proposed:
> >>> "lock-free with cooperation with" -> "lock-free accompanied by changes
> >>> to..." (maybe other variant)
> >>
> >> Good catch. I've rephrased this comment even more.
> >>
> >>> I propose a new define:
> >>> #define FirstValidXLogRecPtr 1
> >>> While FirstValidXLogRecPtr = InvalidXLogRecPtr + 1 is true in the code
> >>> that has no semantical meaning and it's better to avoid using direct
> >>> arithmetics to relate meaning of FirstValidXLogRecPtr from
> >>> InvalidXLogRecPtr.
> >>
> >> Makes sense, but I'm not sure if this change is required at all. I've
> >> reverted this to the state of master, and everything seems to work.
> >>
> >>> For 0002 both comments proposals from my message applied to v6 apply
> >>> to v7 as well
> >>
> >> Thank you for pointing. For now, I'm concentrated on improvements on
> >> 0001. Probably Yura could work on your notes to 0002.
> >
> > I wrote good commit message for 0002 with calculated probabilities and
> > simple Ruby program which calculates them to explain choice of 2
> > conditional attempts. (At least I hope the message is good). And added
> > simple comment before `int attempts = 2;`
> >
> > Also I simplified 0002 a bit to look a bit prettier (ie without goto), and
> > added static assert on NUM_XLOGINSERT_LOCKS being power of 2.
> >
> > (0001 patch is same as for v8)
>
> Oops, forgot to add StaticAssert into v9-0002.
Thank you. I'm planning to push 0001 if there is no objections. And
I'm planning to do more review/revision of 0002.
------
Regards,
Alexander Korotkov
Supabase
From | Date | Subject | |
---|---|---|---|
Next Message | Andrei Lepikhov | 2025-02-15 09:29:41 | Re: Proposal - Allow extensions to set a Plan Identifier |
Previous Message | Alexander Korotkov | 2025-02-15 09:19:42 | Re: Removing unneeded self joins |