Re: Accept recovery conflict interrupt on blocked writing

From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Accept recovery conflict interrupt on blocked writing
Date: 2025-01-21 17:20:22
Message-ID: CAO6_Xqq9E1kwJ9nYnaA_0Y+1b-_fL8vOy=cxNPddjRN6W8a6Pg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for the detailed explanations, I've definitely misinterpreted
how interrupts and errors were handled.

On Fri, Jan 17, 2025 at 7:03 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Might be worth using it it in src/test/postmaster/t/002_start_stop.pl? That
> has e.g. code to send a startup message.

I've changed pgproto to provide the necessary helpers to be used in
001_connection_limits.pl and 002_start_stop.pl.

> > Playing around with this it's unfortunately worse - we very commonly get to
> > ProcessClientWriteInterrupt() with InterruptHoldoffCount > 0. Including just
> > after processing recovery conflict interrupts. The reason for that is that if
> > we do throw an error the first thing we do is to hold interrupts
> > (c.f. sigsetjmp() in PostgresMain()) and then we call EmitErrorReport().
> >
> > Unfortunately I suspect that this means any protection we'd get from this
> > version of the patch is rather incomplete - if the recovery conflict is
> > triggered while not blocked, we'll ERROR out and report the message with
> > interrupts held. That cycle of recovery conflict signalling wouldn't ever be
> > resolved in such a case, I think. Of course we do re-signal recovery conflict
> > interrupts rather aggressively - but I'm not sure that's going to help
> > reliably.
>
> This unfortunately indeed is true. If I, instead of the generate_series(),
> add the following to the pgproto query:
> DO \$\$BEGIN RAISE 'endless scream: %', repeat('a', 1000000); END;\$\$;
>
> the conflict doesn't get handled.

A possible approach to handle this case could be to stop retrying and
return a write error if there's a recovery conflict on a blocked
write, with ProcessClientWriteInterrupt deciding whether a write is
retryable or not. WIthin ProcessClientWriteInterrupt, if we conflict
with the recovery on a blocked write and interrupts can be processed,
then we can call ProcessRecoveryConflictInterrupts with
ExitOnAnyError. Otherwise, if the interrupts can't be processed, then
we make the write not retryable, leading to a write error which will
close the socket with a "connection to client lost".

This requires a new CheckBlockedWriteConflictInterrupts function that
validates whether the process conflicts or not without processing the
conflict as we don't want to abort a writing if the conflict could be
ignored.

The code is a bit rough as there's duplicated logic checks between
CheckBlockedWriteConflictInterrupts and
ProcessRecoveryConflictInterrupt that could very probably be
simplified, but I wanted to validate the approach first. It would also
be probably worth adding more tests to cover all conflicts while
there's a blocked write, though I did add the RAISE case in the tests.

Regards,
Anthonin Bonnefoy

Attachment Content-Type Size
v04-0002-Accept-recovery-conflict-interrupt-on-blocked-wr.patch application/octet-stream 17.9 KB
v04-0001-Add-PgProto-test-module-to-send-message-on-a-raw.patch application/octet-stream 11.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Marcos Pegoraro 2025-01-21 17:31:59 Re: Year of first commit
Previous Message Tom Lane 2025-01-21 17:18:11 Re: How to deinitialize a connection for background worker