From: | Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, "Zhou, Zhiguo" <zhiguo(dot)zhou(at)intel(dot)com> |
Subject: | Re: Get rid of WALBufMappingLock |
Date: | 2025-02-07 11:02:48 |
Message-ID: | cefdf954-e585-4159-b3ff-45c4f95db001@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
07.02.2025 01:26, Alexander Korotkov пишет:
> Hi!
>
> On Sun, Jan 19, 2025 at 2:11 AM Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
>>
>> During discussion of Increasing NUM_XLOGINSERT_LOCKS [1], Andres Freund
>> used benchmark which creates WAL records very intensively. While I this
>> it is not completely fair (1MB log records are really rare), it pushed
>> me to analyze write-side waiting of XLog machinery.
>>
>> First I tried to optimize WaitXLogInsertionsToFinish, but without great
>> success (yet).
>>
>> While profiling, I found a lot of time is spend in the memory clearing
>> under global WALBufMappingLock:
>>
>> MemSet((char *) NewPage, 0, XLOG_BLCKSZ);
>>
>> It is obvious scalability bottleneck.
>>
>> So "challenge was accepted".
>>
>> Certainly, backend should initialize pages without exclusive lock. But
>> which way to ensure pages were initialized? In other words, how to
>> ensure XLogCtl->InitializedUpTo is correct.
>>
>> I've tried to play around WALBufMappingLock with holding it for a short
>> time and spinning on XLogCtl->xlblocks[nextidx]. But in the end I found
>> WALBufMappingLock is useless at all.
>>
>> Instead of holding lock, it is better to allow backends to cooperate:
>> - I bound ConditionVariable to each xlblocks entry,
>> - every backend now checks every required block pointed by
>> InitializedUpto was successfully initialized or sleeps on its condvar,
>> - when backend sure block is initialized, it tries to update
>> InitializedUpTo with conditional variable.
>
> Looks reasonable for me, but having ConditionVariable per xlog buffer
> seems overkill for me. Find an attached revision, where I've
> implemented advancing InitializedUpTo without ConditionVariable.
> After initialization of each buffer there is attempt to do CAS for
> InitializedUpTo in a loop. So, multiple processes will try to advance
> InitializedUpTo, they could hijack initiative from each other, but
> there is always a leader which will finish the work.
>
> There is only one ConditionVariable to wait for InitializedUpTo being advanced.
>
> I didn't benchmark my version, just checked that tests passed.
Good day, Alexander.
I've got mixed but quite close result for both approaches (single or many
ConditionVariable) on the notebook. Since I have no access to larger
machine, I can't prove "many" is way better (or discover it worse).
Given patch after cleanup looks a bit smaller and clearer, I agree to keep
just single condition variable.
Cleaned version is attached.
I've changed condition for broadcast a bit ("less" instead "not equal"):
- buffer's border may already go into future,
- and then other backend will reach not yet initialized buffer and will
broadcast.
-------
regards
Yura Sokolov aka funny-falcon
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Get-rid-of-WALBufMappingLock.patch | text/x-patch | 12.8 KB |
v2-0002-several-attempts-to-lock-WALInsertLocks.patch | text/x-patch | 3.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2025-02-07 11:14:15 | Re: Support NOT VALID / VALIDATE constraint options for named NOT NULL constraints |
Previous Message | Aleksander Alekseev | 2025-02-07 10:48:16 | Re: new commitfest transition guidance |