Re: Get rid of WALBufMappingLock

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, "Zhou, Zhiguo" <zhiguo(dot)zhou(at)intel(dot)com>
Subject: Re: Get rid of WALBufMappingLock
Date: 2025-02-07 11:02:48
Message-ID: cefdf954-e585-4159-b3ff-45c4f95db001@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

07.02.2025 01:26, Alexander Korotkov пишет:
> Hi!
>
> On Sun, Jan 19, 2025 at 2:11 AM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
>>
>> During discussion of Increasing NUM_XLOGINSERT_LOCKS [1], Andres Freund
>> used benchmark which creates WAL records very intensively. While I this
>> it is not completely fair (1MB log records are really rare), it pushed
>> me to analyze write-side waiting of XLog machinery.
>>
>> First I tried to optimize WaitXLogInsertionsToFinish, but without great
>> success (yet).
>>
>> While profiling, I found a lot of time is spend in the memory clearing
>> under global WALBufMappingLock:
>>
>> MemSet((char *) NewPage, 0, XLOG_BLCKSZ);
>>
>> It is obvious scalability bottleneck.
>>
>> So "challenge was accepted".
>>
>> Certainly, backend should initialize pages without exclusive lock. But
>> which way to ensure pages were initialized? In other words, how to
>> ensure XLogCtl->InitializedUpTo is correct.
>>
>> I've tried to play around WALBufMappingLock with holding it for a short
>> time and spinning on XLogCtl->xlblocks[nextidx]. But in the end I found
>> WALBufMappingLock is useless at all.
>>
>> Instead of holding lock, it is better to allow backends to cooperate:
>> - I bound ConditionVariable to each xlblocks entry,
>> - every backend now checks every required block pointed by
>> InitializedUpto was successfully initialized or sleeps on its condvar,
>> - when backend sure block is initialized, it tries to update
>> InitializedUpTo with conditional variable.
>
> Looks reasonable for me, but having ConditionVariable per xlog buffer
> seems overkill for me. Find an attached revision, where I've
> implemented advancing InitializedUpTo without ConditionVariable.
> After initialization of each buffer there is attempt to do CAS for
> InitializedUpTo in a loop. So, multiple processes will try to advance
> InitializedUpTo, they could hijack initiative from each other, but
> there is always a leader which will finish the work.
>
> There is only one ConditionVariable to wait for InitializedUpTo being advanced.
>
> I didn't benchmark my version, just checked that tests passed.

Good day, Alexander.

I've got mixed but quite close result for both approaches (single or many
ConditionVariable) on the notebook. Since I have no access to larger
machine, I can't prove "many" is way better (or discover it worse).

Given patch after cleanup looks a bit smaller and clearer, I agree to keep
just single condition variable.

Cleaned version is attached.

I've changed condition for broadcast a bit ("less" instead "not equal"):
- buffer's border may already go into future,
- and then other backend will reach not yet initialized buffer and will
broadcast.

-------
regards
Yura Sokolov aka funny-falcon

Attachment Content-Type Size
v2-0001-Get-rid-of-WALBufMappingLock.patch text/x-patch 12.8 KB
v2-0002-several-attempts-to-lock-WALInsertLocks.patch text/x-patch 3.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2025-02-07 11:14:15 Re: Support NOT VALID / VALIDATE constraint options for named NOT NULL constraints
Previous Message Aleksander Alekseev 2025-02-07 10:48:16 Re: new commitfest transition guidance