From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Zhou, Zhiguo" <zhiguo(dot)zhou(at)intel(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
Subject: Re: Increase NUM_XLOGINSERT_LOCKS
Date: 2025-01-18 11:53:05
Message-ID: bd9a3b67-d805-4e0a-a167-5b6241bf6adf@postgrespro.ru
Lists: pgsql-hackers
Since it seems Andres missed my request to send a copy of his answer,
here it is:
On 2025-01-16 18:55:47 +0300, Yura Sokolov wrote:
> 16.01.2025 18:36, Andres Freund wrote:
>> Hi,
>>
>> On 2025-01-16 16:52:46 +0300, Yura Sokolov wrote:
>>> Good day, hackers.
>>>
>>> Zhiguo Zhou proposed to transform xlog reservation to a lock-free
>>> algorithm to increase NUM_XLOGINSERT_LOCKS on very huge (480 vCPU)
>>> servers. [1]
>>>
>>> While I believe lock-free reservation makes sense on huge servers,
>>> it is hard to measure on small servers and personal
>>> computers/notebooks.
>>>
>>> But increasing NUM_XLOGINSERT_LOCKS has a measurable performance
>>> gain (in a synthetic test) even on my work notebook:
>>>
>>> Ryzen-5825U (8 cores, 16 threads) limited to 2GHz, Ubuntu 24.04
>>
>> I've experimented with this in the past.
>>
>>
>> Unfortunately increasing it substantially can make the contention on the
>> spinlock *substantially* worse.
>>
>> c=80 && psql -c checkpoint -c 'select pg_switch_wal()' && pgbench -n
>> -M prepared -c$c -j$c -f <(echo "SELECT pg_logical_emit_message(true,
>> 'test', repeat('0', 1024*1024));";) -P1 -T15
>>
>> On a 2x Xeon Gold 5215, with max_wal_size = 150GB and the workload
>> run a few times to ensure WAL is already allocated.
>>
>> With
>> NUM_XLOGINSERT_LOCKS = 8: 1459 tps
>> NUM_XLOGINSERT_LOCKS = 80: 2163 tps
>
> So, even in your test you have +50% gain from increasing
> NUM_XLOGINSERT_LOCKS.
>
> (And that is why I'm keen on a smaller increase, like up to 64, not 128.)
Oops, I swapped the results around when reformatting them, sorry! It's
the opposite way, i.e. increasing the locks hurts.
Here's that issue fixed, plus a few more NUM_XLOGINSERT_LOCKS values. This is a
slightly different disk (the other seems to have gone the way of the dodo),
so the results aren't expected to be exactly the same.
NUM_XLOGINSERT_LOCKS   TPS
  1                   2583
  2                   2524
  4                   2711
  8                   2788
 16                   1938
 32                   1834
 64                   1865
128                   1543
>>
>> The main reason is that the increase in insert locks puts a lot more
>> pressure on the spinlock.
>
> That is addressed by Zhiguo Zhou and me in the other thread [1]. But
> increasing NUM_XLOGINSERT_LOCKS gives benefits right now (at least on
> smaller installations), and "lock-free reservation" should be measured
> against it.
I know that there's that thread, I just don't see how we can increase
NUM_XLOGINSERT_LOCKS due to the regressions it can cause.
>> Secondarily it's also that we spend more time iterating
>> through the insert locks when waiting, and that causes a lot of
>> cacheline ping-pong.
>
> Waiting is done with LWLockWaitForVar, and there is no wait if
> `insertingAt` is in the future. It looks very efficient in the master
> branch code.
But LWLockWaitForVar is called from WaitXLogInsertionsToFinish, which just
iterates over all locks.
Greetings,
Andres Freund