From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | abbas(dot)butt(at)enterprisedb(dot)com, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, zahid(dot)iqbal(at)enterprisedb(dot)com |
Subject: | Re: Logical replication keepalive flood |
Date: | 2021-06-07 10:13:39 |
Message-ID: | CAA4eK1+6vYODWH4HHv+DODD=fACvnAosu0ivnP1w-wLWF=FqEw@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jun 7, 2021 at 12:54 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Sat, 5 Jun 2021 16:08:00 +0500, Abbas Butt <abbas(dot)butt(at)enterprisedb(dot)com> wrote in
> > Hi,
> > I have observed the following behavior with PostgreSQL 13.3.
> >
> > The WAL sender process sends approximately 500 keepalive messages per
> > second to pg_recvlogical.
> > These keepalive messages are totally unnecessary.
> > Keepalives should be sent only if there is no network traffic and a certain
> > time (half of wal_sender_timeout) passes.
> > These keepalive messages not only choke the network but also impact the
> > performance of the receiver,
> > because the receiver has to process the received message and then decide
> > whether to reply to it or not.
> > The receiver remains busy doing this activity 500 times a second.
>
> I can reproduce the problem.
>
> > On investigation it is revealed that the following code fragment in
> > function WalSndWaitForWal in file walsender.c is responsible for sending
> > these frequent keepalives:
> >
> > if (MyWalSnd->flush < sentPtr &&
> >     MyWalSnd->write < sentPtr &&
> >     !waiting_for_ping_response)
> >     WalSndKeepalive(false);
>
> The immediate cause is that pg_recvlogical doesn't send a reply before
> sleeping. Currently it sends replies at 10-second intervals.
>
Yeah, but one can use the -s option to send it at shorter intervals.

> So the attached first patch stops the flood.
>
I am not sure that sending feedback every time before sleeping is a good
idea; it might lead to sending more messages than necessary. Can we try
a one-second interval with the -s option to see how it behaves? For
comparison, the similar logic in worker.c uses wal_receiver_timeout to
decide when to send such an update message rather than sending it every
time before sleep.
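
To make the comparison concrete, here is a rough sketch of the two policies. It is not PostgreSQL code: feedback_due() and count_feedback() are made-up names, and the 2 ms wakeup period is only an assumption chosen to match the reported ~500 wakeups per second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a PostgreSQL function): returns true once
 * interval_ms has elapsed since the last feedback message was sent.
 */
static bool
feedback_due(int64_t now_ms, int64_t last_sent_ms, int64_t interval_ms)
{
    assert(interval_ms > 0);
    return now_ms - last_sent_ms >= interval_ms;
}

/*
 * Count the feedback messages a receiver would send over duration_ms if it
 * wakes up every wake_ms. With every_wakeup set, feedback is sent before
 * each sleep; otherwise it is sent only when the interval has elapsed.
 */
static int
count_feedback(int64_t duration_ms, int64_t wake_ms,
               int64_t interval_ms, bool every_wakeup)
{
    int64_t last_sent_ms = 0;
    int     sent = 0;

    for (int64_t now_ms = wake_ms; now_ms <= duration_ms; now_ms += wake_ms)
    {
        if (every_wakeup || feedback_due(now_ms, last_sent_ms, interval_ms))
        {
            sent++;
            last_sent_ms = now_ms;
        }
    }
    return sent;
}
```

Under those assumptions, count_feedback(10000, 2, 1000, true) gives 5000 messages in ten seconds, while count_feedback(10000, 2, 1000, false) gives 10, which is why I would like to see how a one-second -s setting behaves first.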
> That said, I don't think it is intended that the logical walsender
> sends keep-alive packets at such a high frequency. It happens
> because the walsender doesn't actually wait at all: it waits on
> WL_SOCKET_WRITEABLE, and the keep-alive packet inserted just before
> is always pending.
>
> So, as in the attached second patch, we should try to flush out the
> keep-alive packets if possible before checking pq_is_send_pending().
>
/* Send keepalive if the time has come */
WalSndKeepaliveIfNecessary();
+ /* We may have queued a keep alive packet. flush it before sleeping. */
+ pq_flush_if_writable();

We already call pq_flush_if_writable() from WalSndKeepaliveIfNecessary()
after sending the keep-alive message, so I am not sure how this helps.

--
With Regards,
Amit Kapila.