From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | ashutosh(dot)bapat(at)2ndquadrant(dot)com |
Subject: | walsender waiting_for_ping spuriously set |
Date: | 2020-08-06 22:55:58 |
Message-ID: | 20200806225558.GA22401@alvherre.pgsql |
Lists: | pgsql-hackers |
Ashutosh Bapat noticed that WalSndWaitForWal() is setting
waiting_for_ping_response after sending a keepalive that does *not*
request a reply. The consequence is that other callers that do
require a reply end up not sending a keepalive, because they think one
was already sent. So the whole thing gets stuck.
He found that commit 41d5f8ad734 failed to remove the setting of
waiting_for_ping_response after changing the "request" parameter of the
WalSndKeepalive call from true to false; that seems to have been an omission
and it breaks the algorithm. Thread at [1].
The simplest fix is just to remove the line that sets
waiting_for_ping_response, but I think it is less error-prone to have
WalSndKeepalive set the flag itself, instead of expecting its callers to
do it (and know when to). Patch attached. Also rewords some related
commentary.
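
To illustrate the idea only (this is not the attached patch), here is a
minimal standalone sketch in which the keepalive routine owns the flag.
The names keepalive_sketch, ping_if_needed_sketch and the two counters are
hypothetical stand-ins; only waiting_for_ping_response is a name taken
from the discussion above, and the real walsender code does considerably
more than this.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for walsender state; not the real PostgreSQL code. */
static bool waiting_for_ping_response = false;
static int  keepalives_sent = 0;
static int  replies_requested = 0;

/*
 * Sketch of a keepalive routine that sets the flag itself, and only
 * when a reply was actually requested.
 */
static void
keepalive_sketch(bool request_reply)
{
    keepalives_sent++;          /* stands in for sending the message */

    if (request_reply)
    {
        replies_requested++;
        waiting_for_ping_response = true;
    }
}

/*
 * Sketch of a caller that needs a reply: it pings only if no reply is
 * already outstanding.
 */
static void
ping_if_needed_sketch(void)
{
    if (!waiting_for_ping_response)
        keepalive_sketch(true);
}
```

With this arrangement, a call such as keepalive_sketch(false) leaves the
flag untouched, so a later ping_if_needed_sketch() still sends its
reply-requesting keepalive instead of assuming one is already pending.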
[1] https://postgr.es/m/flat/BLU436-SMTP25712B7EF9FC2ADEB87C522DC040(at)phx(dot)gbl
--
Álvaro Herrera Valdivia, Chile
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-waiting_for_ping-in-walsender.patch | text/x-diff | 3.1 KB |