Re: pgsql: Implement pg_wal_replay_wait() stored procedure

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Alexander Korotkov <akorotkov(at)postgresql(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Implement pg_wal_replay_wait() stored procedure
Date: 2024-09-29 10:40:12
Message-ID: CAPpHfdsg-hORcMKPe=3p0Hrf7kJ=jLzk2CW_+2Cz8XPo5w9MKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Hi!

Thank you for your review.

On Fri, Sep 27, 2024 at 7:35 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Fri, Sep 20, 2024 at 03:00:20PM +0300, Alexander Korotkov wrote:
> > Please, check the attached patchset for implementation of proposed approach.
>
> 0001 looks like it requires an indentation in its .h diffs.
>
> +typedef enum
> +{
> + WaitLSNResultSuccess, /* Target LSN is reached */
> + WaitLSNResultTimeout, /* Timeout occured */
>
> Perhaps use WAIT_LSN_RESULT_SUCCESS, etc, rather than camel casing.
>
> + * Results statuses for WaitForLSNReplay().
>
> Too much plural here.
>
> What you are doing with WaitForLSNReplay() sounds kind of right.
>
> rc = WaitLatch(MyLatch, wake_events, delay_ms,
> WAIT_EVENT_WAIT_FOR_WAL_REPLAY);

Thank you, fixed in the attached patchset.

> Question about this existing bit in waitlsn.c. Shouldn't this issue a
> FATAL if rc reports a WL_POSTMASTER_DEATH? Or am I missing an
> intention here. That was already the case before this patch set.

Fixed in the separate patch.

> pg_wal_replay_wait() is new to v18, meaning that we still have some
> time to agree on its final shape for this release cycle. This
> discussion shares similarities with the recent exercise done in
> f593c5517d14, and I think that we should be more consistent between
> both to not repeat the same mistakes as in the past, even if this case
> if more complex because we have more reasons behind why a wait could
> not have happened.
>
> I would suggest to keep things simple and have one single function
> rather than introduce two more pg_proc entries with slight differences
> in their error reporting, making the original function return a text
> about the reason of the failure when waiting (be it a timeout,
> success, promotion or outside recovery).

I also like to keep things simple. Keeping this as a single function
is not possible due to the reasons I described in [1]. However, it's
possible to fit into one stored procedure. I made 'no_error' as an
argument for the pg_wal_replay_wait() procedure. Done so in the
attached patchset.

> FWIW, I am confused regarding the need for WaitLSNResultNotInRecovery
> and WaitLSNResultPromotedConcurrently in the error states. On a code
> basis, they check the same thing: RecoveryInProgress(). I'd suggest
> to cut one of them.

Agreed. I initially intended to distinguish situations when the user
mistakenly calls pg_wal_replay_wait() on replication leader and when
concurrent promotion happens. However, given that the promotion could
happen after the user issued the query and before waiting starts, it
doesn't make much sense.

> This also points to the fact that
> WaitForLSNReplay() has some duplication that's not required. We could
> have less GetXLogReplayRecPtr() calls and do all the checks in the for
> loop. The current structure of the code in waitlsn.c is more complex
> than it could be.

Not sure about this. I'd like to keep the fast-path check before we
call addLSNWaiter().

Links.
1. https://www.postgresql.org/message-id/CAPpHfdukVbJZntibZZ4HM7p92zN-QmAtD1%2BsAALRTFCsvpAq7A%40mail.gmail.com

------
Regards,
Alexander Korotkov
Supabase

Attachment Content-Type Size
v3-0002-Refactor-WaitForLSNReplay-to-return-the-result-of.patch application/octet-stream 5.7 KB
v3-0003-Add-no_error-argument-to-pg_wal_replay_wait.patch application/octet-stream 5.3 KB
v3-0001-Make-WaitForLSNReplay-issue-FATAL-on-postmaster-d.patch application/octet-stream 1.5 KB

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2024-09-29 17:40:09 pgsql: In passwordFromFile, don't leak the open file after stat failure
Previous Message Noah Misch 2024-09-27 22:32:14 pgsql: Avoid 037_invalid_database.pl hang under debug_discard_caches.

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-09-29 13:03:29 Re: First draft of PG 17 release notes
Previous Message Dave Cramer 2024-09-29 10:31:40 Re: [PATCH] Add native windows on arm64 support