Re: pgsql: Implement pg_wal_replay_wait() stored procedure

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Implement pg_wal_replay_wait() stored procedure
Date: 2024-10-25 11:56:36
Message-ID: CAPpHfdu+9NnT8Mx8gpYVXy7-BHANDVdrD6qRwt=-1=m1_pWMxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Hi, Heikki!

On Fri, Oct 25, 2024 at 9:06 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> If you call this procedure on a stand-alone server, you get:
>
> postgres=# call pg_wal_replay_wait('1234/0');
> ERROR: recovery is not in progress
> DETAIL: Recovery ended before replaying target LSN 1234/0; last replay
> LSN 0/0.
>
> The DETAIL seems a bit misleading. Recovery never ended, because it
> never started in the first place. Last replay LSN is indeed 0/0, but
> that's not helpful.
>
> If a standby server has been promoted and you pass an LSN that's earlier
> than the last replay LSN, it returns successfully. That makes sense I
> guess; if you connect to a standby and wait for it to replay a commit
> that you made in the primary, and the standby gets promoted, that seems
> correct. But it's a little inconsistent: If the standby crashes
> immediately after promotion, and you call pg_wal_replay_wait() after
> recovery, it returns success. However, if you shut down the promoted
> server and restart it, then last replay LSN is 0/0, and the call will
> fail because no recovery happened.
>
> What is the use case for the 'no_error' argument? Why would you ever
> want to pass no_error=true ? The separate pg_wal_replay_wait_status()
> function feels weird to me. Also it surely shouldn't be marked IMMUTABLE
> nor parallel safe.
>
> This would benefit from more documentation, explaining how you would use
> this in practice. I believe the use case is that you want "read your
> writes" consistency between a primary and a standby. You commit a
> transaction in the primary, and you then want to run read-only queries
> in a standby, and you want to make sure that you see your own commit,
> but you're ok with slightly delayed results otherwise. It would be good
> to add a chapter somewhere in the docs to show how to do that in
> practice with these functions.

Thank you for your feedback!

I do agree that error reporting for "not in recovery" case needs to be
improved, as well, as the documentation.

I see that pg_wal_replay_wait_status() might look weird, but it seems
to me like the best of feasible solutions. Given that
pg_wal_replay_wait() procedure can't work concurrently to a query
involving pg_wal_replay_wait_status() function, I think
pg_wal_replay_wait_status() should be stable and parallel safe.

This is the brief answer. I will be able to come back with more
details on Monday.

------
Regards,
Alexander Korotkov
Supabase

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-10-25 12:14:51 pgsql: doc: Fix typo in pg_restore_*_stats function documentation
Previous Message Alexander Korotkov 2024-10-25 10:12:22 pgsql: Fix concurrrently in typcache_rel_type_cache.sql

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey M. Borodin 2024-10-25 12:01:21 Re: Using read_stream in index vacuum
Previous Message Daniel Gustafsson 2024-10-25 11:55:57 Re: pg_upgrade check for invalid databases