Re: pgsql: Implement pg_wal_replay_wait() stored procedure

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Alexander Korotkov <akorotkov(at)postgresql(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Implement pg_wal_replay_wait() stored procedure
Date: 2024-09-03 13:07:11
Message-ID: CAPpHfduN68AzUHvzzPG80qwa-27QXjd820tzcjoUk2Tc6_O=5A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

Hi, Michael!

On Mon, Sep 2, 2024 at 3:25 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Sep 02, 2024 at 02:55:50AM +0300, Alexander Korotkov wrote:
> > Could you, please, check the attached patch?
>
> The patch moving the code looks correct at quick glance.

Thank you for your feedback.

> Now, I've been staring at this line, wondering why this is required
> while WaitForLSNReplay() does not return any status:
> + (void) WaitForLSNReplay(target_lsn, timeout);
>
> And this makes me question whether you have the right semantics
> regarding the SQL function itself. Could it be more useful for
> WaitForLSNReplay() to report an enum state that tells you the reason
> why a wait may not have happened with a text or a tuple returned by
> the function? There are quite a few reasons why a wait may or may not
> happen:
> - Recovery has ended, target LSN has been reached.
> - We're not in recovery anymore. Is an error the most useful thing
> here actually?
> - Timeout may have been reached, where an error is also generated.
> ERRCODE_QUERY_CANCELED is not a correct error state.
> - Perhaps more of these in the future?
>
> My point is: this is a feature that can be used for monitoring as far
> as I know, still it does not stick as a feature that could be most
> useful for such applications. Wouldn't it be more useful for client
> applications to deal with a state returned by the SQL function rather
> than having to parse error strings to know what happened? What is
> returned by pg_wal_replay_wal() reflects on the design of
> WaitForLSNReplay() itself.

By design, pg_wal_replay_wal() should be followed up with read-only
query to standby. The read-only query gets guarantee that it's
executed after the replay of LSN specified as pg_wal_replay_wal()
design. Returning the status from pg_wal_replay_wal() would require
additional cycle of its processing. But one of the main objectives of
this feature was reducing roundtrips during waiting for LSN.

On the other hand, I see that returning status could make sense for
certain use cases. I think I could write two patches to provide that.
1. Make WaitForLSNReplay() return status, and make pg_wal_replay_wal()
be responsible for throwing all the errors.
2. New procedure pg_wal_replay_wal_status() (or some better name?),
which returns status to the user instead of throwing errors.

If no objections, I will push the patch moving code then go ahead
writing the two patches above.

------
Regards,
Alexander Korotkov
Supabase

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Alexander Korotkov 2024-09-03 13:45:00 Re: pgsql: Implement pg_wal_replay_wait() stored procedure
Previous Message Daniel Gustafsson 2024-09-03 11:36:50 Re: pgsql: Remove support for OpenSSL older than 1.1.0