Re: pgsql: Implement pg_wal_replay_wait() stored procedure

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Implement pg_wal_replay_wait() stored procedure
Date: 2024-11-03 20:54:33
Message-ID: CAPpHfdvRmTzGJw5rQdSMkTxUPZkjwtbQ=LJE2u9Jqh9gFXHpmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Hi, Heikki!

On Mon, Oct 28, 2024 at 9:42 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com>
wrote:
>
> On Mon, Oct 28, 2024 at 11:36 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
wrote:
> >
> > On 25/10/2024 14:56, Alexander Korotkov wrote:
> > > I see that pg_wal_replay_wait_status() might look weird, but it seems
> > > to me like the best of feasible solutions.
> >
> > I haven't written many procedures, but our docs say:
> >
> > > Procedures do not return a function value; hence CREATE PROCEDURE
> > lacks a RETURNS clause. However, procedures can instead return data to
> > their callers via output parameters.
> >
> > Did you consider using an output parameter?
>
> Yes I did consider them and found two issues.
> 1) You still need to pass something to them. And that couldn't be
> default values. That's a bit awkward.
> 2) Usage of them causes extra snapshot to be held.
> I'll recheck if it's possible to workaround any of these two.

I've rechecked the output parameters for stored procedures. And I think
the behavior I previously discovered is an anomaly.

CREATE PROCEDURE test_proc(a integer, out b integer)
LANGUAGE plpgsql
AS $$
BEGIN
b := a;
END;
$$;

# call test_proc(1);
ERROR: procedure test_proc(integer) does not exist
LINE 1: call test_proc(1);
^
HINT: No procedure matches the given name and argument types. You might
need to add explicit type casts.

# call test_proc(1,2);
b
---
1
(1 row)

Looks weird that we have to pass in some (ignored?) values for output
parameters. In contrast, functions don't require this.

CREATE FUNCTION test_func(a integer, out b integer)
LANGUAGE plpgsql
AS $$
BEGIN
b := a;
END;
$$;

# select test_func(1);
test_func
-----------
1
(1 row)

This makes me think we have an issue with stored procedures here. I'll try
to investigate it further.

Also when we have output parameters, PortalRun() have portal->strategy ==
PORTAL_UTIL_SELECT (it's PORTAL_MULTI_QUERY without them). This eventually
makes PortalRunUtility() to do RegisterSnapshot(). This is not clear
whether we can release this snapshot without a stored procedure without the
consequences.

> > > Given that
> > > pg_wal_replay_wait() procedure can't work concurrently to a query
> > > involving pg_wal_replay_wait_status() function, I think
> > > pg_wal_replay_wait_status() should be stable and parallel safe.
> >
> > If you call pg_wal_replay_wait() in the backend process, and
> > pg_wal_replay_wait_status() in a parallel worker process, it won't
> > return the result of the wait. Probably not what you'd expect. So I'd
> > argue that it should be parallel unsafe.
>
> Oh, sorry. You're absolutely correct. That should be parallel unsafe.
>
> > > This is the brief answer. I will be able to come back with more
> > > details on Monday.
> >
> > Thanks. A few more minor issues I spotted while playing with this:
> >
> > - If you pass a very high value as the timeout, e.g. INT_MAX-1, it wraps
> > around and doesn't wait at all
> > - You can pass NULLs as arguments. That should probably not be allowed,
> > or we need to document what it means.
> >
> > This is disappointing:
> >
> > > postgres=# set default_transaction_isolation ='repeatable read';
> > > SET
> > > postgres=# call pg_wal_replay_wait('0/55DA24F');
> > > ERROR: pg_wal_replay_wait() must be only called without an active or
registered snapshot
> > > DETAIL: Make sure pg_wal_replay_wait() isn't called within a
transaction with an isolation level higher than READ COMMITTED, another
procedure, or a function.
> >
> > Is there any way we could make that work? Otherwise, the feature just
> > basically doesn't work if you use repeatable read.
>
> Thank you for catching this. The last one is really disappointing.
> I'm exploring on what could be done there.

If we set repeatable read as the default transaction isolation level, then
the catalog snapshot used to find pg_wal_replay_wait() procedure got saved
as the transaction snapshot. As I get the correct way to release that
snapshot is to finish current transaction and start another one.
Implemented in the attached patch.

------
Regards,
Alexander Korotkov
Supabase

Attachment Content-Type Size
v1-0001-Teach-pg_wal_replay_wait-to-handle-REPEATABLE-REA.patch application/x-patch 3.6 KB

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Alexander Korotkov 2024-11-03 21:03:38 Re: pgsql: Implement pg_wal_replay_wait() stored procedure
Previous Message Michael Paquier 2024-11-03 10:52:12 pgsql: Add missing newlines at the end of two SQL files

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2024-11-03 21:03:38 Re: pgsql: Implement pg_wal_replay_wait() stored procedure
Previous Message Daniel Gustafsson 2024-11-03 19:36:28 Re: Time to add a Git .mailmap?