From: | Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Accept recovery conflict interrupt on blocked writing |
Date: | 2025-01-21 17:20:22 |
Message-ID: | CAO6_Xqq9E1kwJ9nYnaA_0Y+1b-_fL8vOy=cxNPddjRN6W8a6Pg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Thanks for the detailed explanations, I've definitely misinterpreted
how interrupts and errors were handled.
On Fri, Jan 17, 2025 at 7:03 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Might be worth using it it in src/test/postmaster/t/002_start_stop.pl? That
> has e.g. code to send a startup message.
I've changed pgproto to provide the necessary helpers to be used in
001_connection_limits.pl and 002_start_stop.pl.
> > Playing around with this it's unfortunately worse - we very commonly get to
> > ProcessClientWriteInterrupt() with InterruptHoldoffCount > 0. Including just
> > after processing recovery conflict interrupts. The reason for that is that if
> > we do throw an error the first thing we do is to hold interrupts
> > (c.f. sigsetjmp() in PostgresMain()) and then we call EmitErrorReport().
> >
> > Unfortunately I suspect that this means any protection we'd get from this
> > version of the patch is rather incomplete - if the recovery conflict is
> > triggered while not blocked, we'll ERROR out and report the message with
> > interrupts held. That cycle of recovery conflict signalling wouldn't ever be
> > resolved in such a case, I think. Of course we do re-signal recovery conflict
> > interrupts rather aggressively - but I'm not sure that's going to help
> > reliably.
>
> This unfortunately indeed is true. If I, instead of the generate_series(),
> add the following to the pgproto query:
> DO \$\$BEGIN RAISE 'endless scream: %', repeat('a', 1000000); END;\$\$;
>
> the conflict doesn't get handled.
A possible approach to handle this case could be to stop retrying and
return a write error if there's a recovery conflict on a blocked
write, with ProcessClientWriteInterrupt deciding whether a write is
retryable or not. WIthin ProcessClientWriteInterrupt, if we conflict
with the recovery on a blocked write and interrupts can be processed,
then we can call ProcessRecoveryConflictInterrupts with
ExitOnAnyError. Otherwise, if the interrupts can't be processed, then
we make the write not retryable, leading to a write error which will
close the socket with a "connection to client lost".
This requires a new CheckBlockedWriteConflictInterrupts function that
validates whether the process conflicts or not without processing the
conflict as we don't want to abort a writing if the conflict could be
ignored.
The code is a bit rough as there's duplicated logic checks between
CheckBlockedWriteConflictInterrupts and
ProcessRecoveryConflictInterrupt that could very probably be
simplified, but I wanted to validate the approach first. It would also
be probably worth adding more tests to cover all conflicts while
there's a blocked write, though I did add the RAISE case in the tests.
Regards,
Anthonin Bonnefoy
Attachment | Content-Type | Size |
---|---|---|
v04-0002-Accept-recovery-conflict-interrupt-on-blocked-wr.patch | application/octet-stream | 17.9 KB |
v04-0001-Add-PgProto-test-module-to-send-message-on-a-raw.patch | application/octet-stream | 11.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Marcos Pegoraro | 2025-01-21 17:31:59 | Re: Year of first commit |
Previous Message | Tom Lane | 2025-01-21 17:18:11 | Re: How to deinitialize a connection for background worker |