From: | "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Jeff Davis <pgsql(at)j-davis(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Subject: | Re: walsender performance regression due to logical decoding on standby changes |
Date: | 2023-05-10 09:52:35 |
Message-ID: | ddf14508-282d-b8a8-0a31-764ffccbeb04@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 5/10/23 8:36 AM, Bharath Rupireddy wrote:
> On Wed, May 10, 2023 at 12:33 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>> Unfortunately I have found the following commit to have caused a performance
>> regression:
>>
>> commit e101dfac3a53c20bfbf1ca85d30a368c2954facf
>>
>> The problem is that, on a standby, after the change - as needed to for the
>> approach to work - the call to WalSndWakeup() in ApplyWalRecord() happens for
>> every record, instead of only happening when the timeline is changed (or WAL
>> is flushed or ...).
>>
>> WalSndWakeup() iterates over all walsender slots, regardless of whether in
>> use. For each of the walsender slots it acquires a spinlock.
>>
>> When replaying a lot of small-ish WAL records I found the startup process to
>> spend the majority of the time in WalSndWakeup(). I've not measured it very
>> precisely yet, but the overhead is significant (~35% slowdown), even with the
>> default max_wal_senders. If that's increased substantially, it obviously gets
>> worse.
>
Thanks Andres for the call out! I do agree that this is a concern.
>> The only saving grace is that this is not an issue on the primary.
>
> Yeah.
+1
>
>> My current guess is that mis-using a condition variable is the best bet. I
>> think it should work to use ConditionVariablePrepareToSleep() before a
>> WalSndWait(), and then ConditionVariableCancelSleep(). I.e. to never use
>> ConditionVariableSleep(). The latch set from ConditionVariableBroadcast()
>> would still cause the necessary wakeup.
Yeah, I think that "mis-using" a condition variable is a valid option. Unless I'm missing
something, I don't think there is anything wrong with using a CV that way (aka not using
ConditionVariableTimedSleep() or ConditionVariableSleep() in this particular case).
>
> How about something like the attached? Recovery and subscription tests
> don't complain with the patch.
Thanks Bharath for looking at it!
I launched a full Cirrus CI test with it but it failed on one environment (did not look in details,
just sharing this here): https://cirrus-ci.com/task/6570140767092736
Also I have a few comments:
@@ -1958,7 +1959,7 @@ ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *repl
* ------
*/
if (AllowCascadeReplication())
- WalSndWakeup(switchedTLI, true);
+ ConditionVariableBroadcast(&WalSndCtl->cv);
I think the comment above this change needs to be updated.
@@ -3368,9 +3370,13 @@ WalSndWait(uint32 socket_events, long timeout, uint32 wait_event)
WaitEvent event;
ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetSocketPos, socket_events, NULL);
+
+ ConditionVariablePrepareToSleep(&WalSndCtl->cv);
if (WaitEventSetWait(FeBeWaitSet, timeout, &event, 1, wait_event) == 1 &&
(event.events & WL_POSTMASTER_DEATH))
proc_exit(1);
+
+ ConditionVariableCancelSleep();
May be worth to update the comment above WalSndWait() too? (to explain why a CV handling is part of it).
@@ -108,6 +109,8 @@ typedef struct
*/
bool sync_standbys_defined;
+ ConditionVariable cv;
Worth to give it a more meaning full name? (to give a clue as when it is used)
I think we also need to update the comment above WalSndWakeup():
"
* For cascading replication we need to wake up physical walsenders separately
* from logical walsenders (see the comment before calling WalSndWakeup() in
* ApplyWalRecord() for more details).
"
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Zhijie Hou (Fujitsu) | 2023-05-10 10:11:26 | RE: walsender performance regression due to logical decoding on standby changes |
Previous Message | Peter Eisentraut | 2023-05-10 09:50:14 | smgrzeroextend clarification |