From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | abbas(dot)butt(at)enterprisedb(dot)com |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, zahid(dot)iqbal(at)enterprisedb(dot)com |
Subject: | Re: Logical replication keepalive flood |
Date: | 2021-06-07 07:23:53 |
Message-ID: | 20210607.162353.1202919828973013934.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Sat, 5 Jun 2021 16:08:00 +0500, Abbas Butt <abbas(dot)butt(at)enterprisedb(dot)com> wrote in
> Hi,
> I have observed the following behavior with PostgreSQL 13.3.
>
> The WAL sender process sends approximately 500 keepalive messages per
> second to pg_recvlogical.
> These keepalive messages are totally un-necessary.
> Keepalives should be sent only if there is no network traffic and a certain
> time (half of wal_sender_timeout) passes.
> These keepalive messages not only choke the network but also impact the
> performance of the receiver,
> because the receiver has to process the received message and then decide
> whether to reply to it or not.
> The receiver remains busy doing this activity 500 times a second.
I can reproduce the problem.
> On investigation it is revealed that the following code fragment in
> function WalSndWaitForWal in file walsender.c is responsible for sending
> these frequent keepalives:
>
> if (MyWalSnd->flush < sentPtr &&
> MyWalSnd->write < sentPtr &&
> !waiting_for_ping_response)
> WalSndKeepalive(false);
The immediate cause is that pg_recvlogical doesn't send a reply before
sleeping. Currently it sends replies only at 10-second intervals.
So the first attached patch stops the flood.
That said, I don't think it is intended that the logical walsender
sends keep-alive packets at such a high frequency. It happens
because the walsender actually doesn't wait at all: the keep-alive
packet queued just before is always still pending, so it also waits on
WL_SOCKET_WRITEABLE and the wait returns at once.
So, as in the second attached patch, we should try to flush out the
keep-alive packets if possible before checking pq_is_send_pending().
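To illustrate the busy loop described above, here is a toy model in C.
It is not PostgreSQL code and not the content of either patch:
ToySender, toy_wait_loop, and the always-writable socket are
assumptions made only for illustration. The model counts how much
simulated time the loop spends sleeping, with and without a flush
before the pending check.

```c
#include <stdbool.h>

/* Toy model of the wait loop; none of these names are PostgreSQL APIs. */
typedef struct ToySender
{
    bool send_pending;  /* a queued keepalive has not been flushed yet */
    long keepalives;    /* keepalives queued so far */
    long elapsed_ms;    /* simulated time spent actually sleeping */
} ToySender;

/*
 * Run `iterations` passes of the loop. `flush_first` models trying to
 * flush queued output before deciding which events to wait for.
 * Assumption: the toy socket is always writable, so a flush always
 * succeeds and a wait for "socket writable" returns immediately.
 */
ToySender
toy_wait_loop(long iterations, long sleep_ms, bool flush_first)
{
    ToySender s = {false, 0, 0};

    for (long i = 0; i < iterations; i++)
    {
        /* The receiver has not confirmed yet: queue a keepalive. */
        s.send_pending = true;
        s.keepalives++;

        /* Flush before the pending check, if requested. */
        if (flush_first)
            s.send_pending = false;

        /*
         * With output still pending, the loop also waits for "socket
         * writable" and wakes up at once, so no time passes. Only
         * with nothing pending does it sleep for the full interval.
         */
        if (!s.send_pending)
            s.elapsed_ms += sleep_ms;
    }
    return s;
}
```

In this model, toy_wait_loop(500, 1000, false) queues 500 keepalives
with no simulated time spent sleeping, while
toy_wait_loop(500, 1000, true) sleeps for the full interval on every
pass, so the same 500 keepalives are spread over 500 intervals.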
Either one alone can "fix" the issue, but I think each of them is
reasonable by itself.
Any thoughts, suggestions and/or opinions?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
pg_recvlogical_send_reply_before_sleep.patch | text/x-patch | 465 bytes |
walsender_flush_keepalive_packet_before_sleep.patch | text/x-patch | 590 bytes |