RE: walsender performance regression due to logical decoding on standby changes

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "bharath(dot)rupireddyforpostgres(at)gmail(dot)com" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "sawada(dot)mshk(at)gmail(dot)com" <sawada(dot)mshk(at)gmail(dot)com>, "thomas(dot)munro(at)gmail(dot)com" <thomas(dot)munro(at)gmail(dot)com>, "bertranddrouvot(dot)pg(at)gmail(dot)com" <bertranddrouvot(dot)pg(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "pgsql(at)j-davis(dot)com" <pgsql(at)j-davis(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>
Subject: RE: walsender performance regression due to logical decoding on standby changes
Date: 2023-05-19 07:36:21
Message-ID: OS0PR01MB57167C4B426626C6432A6054947C9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, May 19, 2023 11:08 AM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Thu, 18 May 2023 20:11:11 +0530, Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote in
> > > > + ConditionVariableInit(&WalSndCtl->physicalWALSndCV);
> > > > + ConditionVariableInit(&WalSndCtl->logicalWALSndCV);
> > >
> > > It's not obvious to me that it's worth having two CVs, because it's more
> > > expensive to find no waiters in two CVs than to find no waiters in one CV.
> >
> > I disagree. In the tight per-WAL record recovery loop, WalSndWakeup
> > wakes up logical walsenders for every WAL record, but it wakes up
> > physical walsenders only if the applied WAL record causes a TLI
> > switch. Therefore, the extra cost of spinlock acquire-release for per
> > WAL record applies only for logical walsenders. On the other hand, if
> > we were to use a single CV, we would be unnecessarily waking up (if at
> > all they are sleeping) physical walsenders for every WAL record -
> > which is costly IMO.
>
> As I was reading this, I start thinking that one reason for the
> regression could be to exccessive frequency of wakeups during logical
> replication. In physical replication, we make sure to avoid exccessive
> wakeups when the stream is tightly packed. I'm just wondering why
> logical replication doesn't (or can't) do the same thing, IMHO.

I thought(from the e101dfa's commit message) physical walsenders can't send
data until it's been flushed, so it only gets wakeup at the time of flush or TLI
switch. For logical walsender, it can start to decode changes after the change
is applied, and to avoid the case if the walsender is asleep, and there's work
to be done, it wakeup logical walsender when applying each record.

Or maybe you mean to wakeup(ConditionVariableBroadcast) walsender after
applying some wal records like[1], But it seems it may delay the wakeup of
walsender a bit(the walsender may be asleep before reaching the threshold).

[1]
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1833,6 +1833,7 @@ ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *repl
{
ErrorContextCallback errcallback;
bool switchedTLI = false;
+ static int nreplay = 0;

/* Setup error traceback support for ereport() */
errcallback.callback = rm_redo_error_callback;
@@ -1957,8 +1958,12 @@ ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *repl
* be created otherwise)
* ------
*/
- if (AllowCascadeReplication())
+ if (AllowCascadeReplication() &&
+ nreplay++ == 100)
+ {
WalSndWakeup(switchedTLI, true);
+ nreplay = 0;
+ }

Best Regards,
Hou zj

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Hans Buschmann 2023-05-19 07:42:12 PG 16 draft release notes ready
Previous Message Michael Paquier 2023-05-19 06:54:42 Re: WAL Insertion Lock Improvements