From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Accept recovery conflict interrupt on blocked writing |
Date: | 2025-01-17 18:11:13 |
Message-ID: | xob3ehc6nb4xgqn7evb5gu2ptc6kd656afzufmexgawp6agjpc@stfdkc5awqqb |
Lists: | pgsql-hackers |
Hi,
On 2025-01-17 13:03:35 -0500, Andres Freund wrote:
> I don't see anything implementing the promotion of ERRORs to FATAL? You're
> preventing the error message being sent to the client, but I don't think that
> causes the connection to be terminated. The pre-existing code doesn't have
> that problem, because it's only active when ProcDiePending is already set.
>
> In fact, the test your patch added goes through
> ProcessRecoveryConflictInterrupts() multiple times:
>
> 2025-01-17 12:52:47.842 EST [3376411] LOG: recovery still waiting after 20.709 ms: recovery conflict on buffer pin
> 2025-01-17 12:52:47.842 EST [3376411] CONTEXT: WAL redo at 0/3462288 for Heap2/PRUNE_VACUUM_SCAN: , isCatalogRel: F, nplans: 0, nredirected: 0, ndead: 0, nun>
> 3376451: recovery conflict interrupt while blocked
> 3376451: recovery conflict processing done
> write(8192) = -1: 11/Resource temporarily unavailable
> 3376451: recovery conflict interrupt while blocked
> 3376451: recovery conflict processing done
> ...
> write(8192) = -1: 11/Resource temporarily unavailable
> 3376451: recovery conflict interrupt while blocked
> 3376451: recovery conflict processing done
> write(8192) = -1: 11/Resource temporarily unavailable
> 3376451: recovery conflict interrupt while blocked
> 2025-01-17 12:52:48.072 EST [3376451] 031_recovery_conflict.pl ERROR: canceling statement due to conflict with recovery
> 2025-01-17 12:52:48.072 EST [3376451] 031_recovery_conflict.pl DETAIL: User was holding shared buffer pin for too long.
> 2025-01-17 12:52:48.072 EST [3376451] 031_recovery_conflict.pl STATEMENT:
> BEGIN;
> DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM test_recovery_conflict_table1;
> FETCH FORWARD FROM test_recovery_conflict_cursor;
> SELECT generate_series(1, 100000);
>
> backend 3376451> 2025-01-17 12:52:48.072 EST [3376411] LOG: recovery finished waiting after 250.681 ms: recovery conflict on buffer pin
>
> I don't actually know why the conflict ends up being resolved after a bunch of
> retries.
It's because the test sets deadlock_timeout lower than
max_standby_streaming_delay.
> Note also the "backend>" (to which I added the PID to identify it) which gets
> emitted. Just setting whereToSendOutput = DestNone has side effects when not
> actually in a process exit status...
I think the only reason the patch works at all is that we end up in
InteractiveBackend()'s EOF handling, because InteractiveBackend() reads from
stdin. Stdin is closed here, but I don't think we have any guarantee that it
isn't something that can be read from.
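To make that control flow concrete, here is a toy model in Python (not the backend's C code). It assumes only what is described above: once the output destination is no longer the client, the command loop reads from stdin, and EOF on stdin is what ends the loop. The names `Dest`, `read_command` and `run_backend` are hypothetical and exist only for this sketch.

```python
import io
from enum import Enum


class Dest(Enum):
    """Toy stand-in for the output destination (hypothetical names)."""
    NONE = "none"
    REMOTE = "remote"


def read_command(dest, socket_stream, stdin_stream):
    """Return the next command line, or None on EOF.

    The input source is chosen from the output destination, which is the
    side effect under discussion: changing where output goes also changes
    where commands are read from.
    """
    if dest is Dest.REMOTE:
        source = socket_stream
    else:
        # Not a remote destination: take the interactive path, which
        # reads from stdin instead of the client socket.
        source = stdin_stream
    line = source.readline()
    return line.rstrip("\n") if line else None


def run_backend(dest, socket_stream, stdin_stream):
    """Consume commands until EOF and return the commands that were seen."""
    seen = []
    while True:
        cmd = read_command(dest, socket_stream, stdin_stream)
        if cmd is None:  # EOF: the loop ends and the toy backend exits
            break
        seen.append(cmd)
    return seen


if __name__ == "__main__":
    client = io.StringIO("SELECT 1;\n")
    # Closed (empty) stdin: the loop ends at once, the client is never read.
    print(run_backend(Dest.NONE, client, io.StringIO("")))
    # Readable stdin: the loop keeps going on whatever stdin provides.
    print(run_backend(Dest.NONE, client, io.StringIO("stray input\n")))
```

With an empty stdin the loop exits immediately, which is the case the patch happens to rely on. With a readable stdin the same loop keeps consuming input instead of terminating, which is the missing guarantee.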
Greetings,
Andres Freund