From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Andres Freund <andres(at)anarazel(dot)de>, "Zhou, Zhiguo" <zhiguo(dot)zhou(at)intel(dot)com>
Subject: Get rid of WALBufMappingLock
Date: 2025-01-19 00:11:32
Message-ID: 39b39e7a-41b4-4f34-b3f5-db735e74a723@postgrespro.ru
Lists: pgsql-hackers
Good day, hackers.
During the discussion of Increasing NUM_XLOGINSERT_LOCKS [1], Andres Freund
used a benchmark which creates WAL records very intensively. While I think
it is not completely fair (1MB log records are really rare), it pushed
me to analyze the write-side waiting of the XLog machinery.
First I tried to optimize WaitXLogInsertionsToFinish, but without great
success (yet).
While profiling, I found that a lot of time is spent in memory clearing
under the global WALBufMappingLock:

MemSet((char *) NewPage, 0, XLOG_BLCKSZ);

It is an obvious scalability bottleneck.
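For context, here is a heavily simplified sketch of the current path in
AdvanceXLInsertBuffer() (xlog.c); write-out of dirty buffers, header setup
and error handling are elided, but the zeroing really does happen while
the lock is held exclusively:

/* Simplified from AdvanceXLInsertBuffer(); many details omitted. */
LWLockAcquire(WALBufMappingLock, LW_EXCLUSIVE);
while (XLogCtl->InitializedUpTo < upto)
{
    int         nextidx = XLogRecPtrToBufIdx(XLogCtl->InitializedUpTo);
    char       *NewPage = XLogCtl->pages + nextidx * (Size) XLOG_BLCKSZ;

    /* every backend needing a fresh page serializes on this zeroing */
    MemSet((char *) NewPage, 0, XLOG_BLCKSZ);

    /* ... fill page header, publish xlblocks[nextidx] ... */
    XLogCtl->InitializedUpTo += XLOG_BLCKSZ;
}
LWLockRelease(WALBufMappingLock);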
So "challenge was accepted".
Certainly, backends should initialize pages without the exclusive lock. But
then how do we ensure pages were initialized? In other words, how do we
ensure XLogCtl->InitializedUpTo is correct?
I tried to play around with WALBufMappingLock, holding it only for a short
time and spinning on XLogCtl->xlblocks[nextidx]. But in the end I found
WALBufMappingLock is not needed at all.
Instead of holding a lock, it is better to let backends cooperate (a sketch
follows the list):
- I bound a ConditionVariable to each xlblocks entry;
- every backend now checks that each required block below InitializedUpTo
was successfully initialized, or sleeps on that block's condvar;
- when a backend is sure a block is initialized, it tries to advance
InitializedUpTo and wakes waiters via the condition variable.
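To illustrate the idea, a minimal sketch (not the patch itself; the entry
layout, function names, and the wait event below are mine, and the real
code also has to handle write-out of still-dirty buffers):

#include "port/atomics.h"
#include "storage/condition_variable.h"
#include "utils/wait_event.h"

/* Hypothetical per-buffer entry; the real xlblocks layout may differ. */
typedef struct XLBufferEntry
{
    pg_atomic_uint64 bound;     /* end LSN once this page is initialized */
    ConditionVariable initcv;   /* waiters for this page */
} XLBufferEntry;

/* Wait until the buffer holds the initialized page ending at end_ptr. */
static void
WaitPageInitialized(XLBufferEntry *entry, uint64 end_ptr)
{
    if (pg_atomic_read_u64(&entry->bound) == end_ptr)
        return;                 /* common fast path: page already ready */

    ConditionVariablePrepareToSleep(&entry->initcv);
    while (pg_atomic_read_u64(&entry->bound) != end_ptr)
        ConditionVariableSleep(&entry->initcv, PG_WAIT_EXTENSION); /* stand-in event */
    ConditionVariableCancelSleep();
}

/* After zeroing the page and filling its header with no lock held,
 * publish it and try to advance the global watermark. */
static void
PublishPage(XLBufferEntry *entry, pg_atomic_uint64 *initialized_upto,
            uint64 start_ptr, uint64 end_ptr)
{
    uint64      expected = start_ptr;

    pg_atomic_write_u64(&entry->bound, end_ptr);
    ConditionVariableBroadcast(&entry->initcv);

    /*
     * The CAS succeeds only when every earlier page is already published;
     * otherwise some other backend will advance past this page for us.
     * (The real code would loop to advance past consecutive ready pages.)
     */
    pg_atomic_compare_exchange_u64(initialized_upto, &expected, end_ptr);
}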
Andres's benchmark looks like:
c=100 && \
install/bin/psql -c checkpoint -c 'select pg_switch_wal()' postgres && \
install/bin/pgbench -n -M prepared -c$c -j$c \
  -f <(echo "SELECT pg_logical_emit_message(true, 'test', repeat('0', 1024*1024));") \
  -P1 -T45 postgres
So, it generates 1MB records as fast as possible for 45 seconds.
The test machine is a Ryzen 5825U (8c/16t) limited to 2GHz.
Config:
max_connections = 1000
shared_buffers = 1024MB
fsync = off
wal_sync_method = fdatasync
full_page_writes = off
wal_buffers = 1024MB
checkpoint_timeout = 1d
Results are: "average for 45 sec" /"1 second max outlier"
Results for master @ d3d098316913 :
25 clients: 2908 /3230
50 clients: 2759 /3130
100 clients: 2641 /2933
200 clients: 2419 /2707
400 clients: 1928 /2377
800 clients: 1689 /2266
With v0-0001-Get-rid-of-WALBufMappingLock.patch:
25 clients: 3103 /3583
50 clients: 3183 /3706
100 clients: 3106 /3559
200 clients: 2902 /3427
400 clients: 2303 /2717
800 clients: 1925 /2329
Combined with v0-0002-several-attempts-to-lock-WALInsertLocks.patch
(no WALBufMappingLock + several attempts on WALInsertLocks):
25 clients: 3518 /3750
50 clients: 3355 /3548
100 clients: 3226 /3460
200 clients: 3092 /3299
400 clients: 2575 /2801
800 clients: 1946 /2341
These results are with an untouched NUM_XLOGINSERT_LOCKS == 8.
[1] http://postgr.es/m/flat/3b11fdc2-9793-403d-b3d4-67ff9a00d447%40postgrespro.ru
PS.
Increasing NUM_XLOGINSERT_LOCKS to 64 gives:
25 clients: 3457 /3624
50 clients: 3215 /3500
100 clients: 2750 /3000
200 clients: 2535 /2729
400 clients: 2163 /2400
800 clients: 1700 /2060
While the same increase on unpatched master gives:
25 clients: 2645 /2953
50 clients: 2562 /2968
100 clients: 2364 /2756
200 clients: 2266 /2564
400 clients: 1868 /2228
800 clients: 1527 /2133
So, the patched version with increased NUM_XLOGINSERT_LOCKS looks no worse
than the unpatched version without increasing the number of locks.
-------
regards
Yura Sokolov aka funny-falcon
Attachments:
v0-0001-Get-rid-of-WALBufMappingLock.patch (text/x-patch, 17.0 KB)
v0-0002-several-attempts-to-lock-WALInsertLocks.patch (text/x-patch, 3.0 KB)