Author: Noah Misch Commit: Noah Misch In caught-up logical walsender, sleep only in WalSndWaitForWal(). Before sleeping, WalSndWaitForWal() sends a keepalive if MyWalSnd->write < sentPtr. When the latest physical LSN yields no logical replication messages (a common case), that keepalive elicits a reply. Processing the reply updates pg_stat_replication.replay_lsn. WalSndLoop() lacks that; when WalSndLoop() slept, replay_lsn advancement could stall until wal_receiver_status_interval elapsed. This sometimes stalled src/test/subscription/t/001_rep_changes.pl for up to 10s. Reviewed by FIXME. Discussion: https://postgr.es/m/FIXME diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 9e56115..a880195 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -1428,8 +1428,10 @@ WalSndWaitForWal(XLogRecPtr loc) /* * We only send regular messages to the client for full decoded * transactions, but a synchronous replication and walsender shutdown - * possibly are waiting for a later location. So we send pings - * containing the flush location every now and then. + * possibly are waiting for a later location. So, before sleeping, we + * send a ping containing the flush location. If the receiver is + * otherwise idle, this keepalive will trigger a reply. Processing the + * reply will update these MyWalSnd locations. */ if (MyWalSnd->flush < sentPtr && MyWalSnd->write < sentPtr && @@ -2314,14 +2316,14 @@ WalSndLoop(WalSndSendDataCallback send_data) WalSndKeepaliveIfNecessary(); /* - * We don't block if not caught up, unless there is unsent data - * pending in which case we'd better block until the socket is - * write-ready. This test is only needed for the case where the - * send_data callback handled a subset of the available data but then - * pq_flush_if_writable flushed it all --- we should immediately try - * to send more. + * Block if we have unsent data. XXX For logical replication, let + * WalSndWaitForWal(), handle any other blocking; idle receivers need + * its additional actions. For physical replication, also block if + * caught up; its send_data does not block. */ - if ((WalSndCaughtUp && !streamingDoneSending) || pq_is_send_pending()) + if ((WalSndCaughtUp && send_data != XLogSendLogical && + !streamingDoneSending) || + pq_is_send_pending()) { long sleeptime; int wakeEvents;