From: | Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com> |
---|---|
To: | pgsql-admin(at)lists(dot)postgresql(dot)org |
Subject: | Re: PostgreSQL 10.5 : Logical replication timeout results in PANIC in pg_wal "No space left on device" |
Date: | 2018-11-21 06:03:57 |
Message-ID: | b34279c0-14ea-bf3b-61a4-109792059e09@matrix.gatewaynet.com |
Lists: | pgsql-admin |
On 20/11/18 10:48 PM, Rui DeSousa wrote:
>
>
>> On Nov 20, 2018, at 3:34 PM, Achilleas Mantzios
>> <achill(at)matrix(dot)gatewaynet(dot)com> wrote:
>>
>> Hey, I was reading the docs, it seems it means :
>>
>> net.ipv4.tcp_keepalive_time + net.ipv4.tcp_keepalive_intvl *
>> net.ipv4.tcp_keepalive_probes = 2hrs 11 Mins 15 Secs, rather than 18 Hrs
>
> Yeah, that’s correct. I wonder why it didn’t terminate.
Most probably because another clone had been created (cloud migration
magic). That's my theory, albeit not confirmed by the provider. The
logical worker (walreceiver) was still alive and happy even after the
primary crashed. I have the logs from the other standby, and it
immediately detected the problem (PANIC on the primary) and retried.
There is no firewall dropping packets: in every test I did, the logical
bgworker detected any problem *instantly* and retried after 5 secs,
which is the default.
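
For reference, the 2hrs 11 mins 15 secs figure quoted above can be checked with a small Python sketch. It assumes the stock Linux defaults (tcp_keepalive_time = 7200, tcp_keepalive_intvl = 75, tcp_keepalive_probes = 9), which were not stated in this thread, and the function name is made up for illustration only.

```python
from datetime import timedelta


def keepalive_detection_seconds(time_s: int, intvl_s: int, probes: int) -> int:
    """Worst-case seconds before TCP keepalive declares an idle peer dead.

    The connection sits idle for `time_s`, then `probes` unanswered probes
    are sent `intvl_s` seconds apart before the kernel gives up.
    """
    return time_s + intvl_s * probes


# Assumed values: the usual Linux defaults, not taken from the affected host.
LINUX_DEFAULTS = {"time_s": 7200, "intvl_s": 75, "probes": 9}

total = keepalive_detection_seconds(**LINUX_DEFAULTS)
print(total, "seconds =", timedelta(seconds=total))  # 7875 seconds = 2:11:15
```

On a real host the three values can be read with `sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl net.ipv4.tcp_keepalive_probes` and plugged in instead of the assumed defaults.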