Re: MarkBufferDirty Assert held LW_EXCLUSIVE lock fail when ginFinishSplit

From: feichanghong <feichanghong(at)qq(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: MarkBufferDirty Assert held LW_EXCLUSIVE lock fail when ginFinishSplit
Date: 2024-01-23 03:39:00
Message-ID: tencent_8FBD28E3B713E1E3C6C4870305C14EA9E305@qq.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


> On Jan 23, 2024, at 04:47, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> On 22/01/2024 18:21, feichanghong wrote:
>>> From a performance point of view, this doesn't matter. Incomplete split are extremely rare. For convenience, though, I added a new function specifically for handling these "leftover" incomplete splits as opposed to finishing a split that you just made, which performs the lock-upgrade. See attached. I think that helps with readability, and makes it less likely that we'll forget the lock-upgrade in the future if the insertion code is refactored.
>> I think that the lock-upgrade in the ginFinishOldSplit function is unsafe
>> because it violates the requirement of the ginStepRight function that
>> "The next page is locked first, before releasing the current page.”
>
> Good point.
>
> I started to work on a more invasive patch that would move the ginFinishOldSplit() calls to ginTraverseLock() and ginStepRight(), doing the interlocking correctly. That makes life easier for the callers; they don't need to deal with incomplete-splits anymore.
>
> But then I read the Page deletion section in the README and understood that my earlier patch is safe, after all. The lock-coupling in ginStepRight() is only needed for searches, not for inserts. There is another mechanism that prevents concurrent page deletions during an insert: VACUUM holds a cleanup-lock on the root page.
>
> Does that make sense, or am I missing something? Here's the same patch as before, with some extra comments to explain why it's safe.

From my understanding, you are right. The README includes the following
explanation:
ginScanToDelete() traverses the whole tree in depth-first manner. It starts
from the full cleanup lock on the tree root. This lock prevents all the
concurrent insertions into this tree while we're deleting pages. However,
there are still might be some in-progress readers, who traversed root before
we locked it.

For insert, ginFindLeafPage maintains a pin on the relevant pages along the
path, naturally retaining the pin on the root page as well. Before calling
ginScanToDelete to delete a page, it secures an exclusive lock on the root page
and no other backend holds a pin on the root page through LockBufferForCleanup.

Best Regards,
Fei Changhong
Alibaba Cloud Computing Ltd.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Rowley 2024-01-23 05:10:52 Re: Removing const-false IS NULL quals and redundant IS NOT NULL quals
Previous Message Tender Wang 2024-01-23 03:19:57 Re: BUG #18297: Error when adding a column to a parent table with complex inheritance