Quick Links

Re: [HACKERS] make async slave to wait for lsn to be replayed

From:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To:	Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc:	Kartyshov Ivan <i(dot)kartyshov(at)postgrespro(dot)ru>, Euler Taveira <euler(at)eulerto(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: [HACKERS] make async slave to wait for lsn to be replayed
Date:	2024-04-01 00:24:35
Message-ID:	CAPpHfdsV856ca8V=G8DFEViDP7v1hMtDUgOEdBE5dV45a3NKKw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Bharath,

Thank you for your feedback.

On Sun, Mar 31, 2024 at 8:44 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> On Sun, Mar 31, 2024 at 7:41 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> Thanks for the patch. I have a few comments on the v16 patch.
>
> 1. Is there any specific reason for pg_wal_replay_wait() being a
> procedure rather than a function? I haven't read the full thread, but
> I vaguely noticed the discussion on the new wait mechanism holding up
> a snapshot or some other resource. Is that the reason to use a stored
> procedure over a function? If yes, can we specify it somewhere in the
> commit message and just before the procedure definition in
> system_functions.sql?

Surely, there is a reason. Function should be executed in a snapshot,
which can prevent WAL records from being replayed. See [1] for a
particular test scenario. In a procedure we may enforce non-atomic
context and release the snapshot.

I've mentioned that in the commit message and in the procedure code.
I don't think system_functions.sql is the place for this type of
comment. We only use system_functions.sql to push the default values.

> 2. Is the pg_wal_replay_wait first procedure that postgres provides
> out of the box?

Yes, it appears first. I see nothing wrong about that.

> 3. Defining a procedure for the first time in system_functions.sql
> which is supposed to be for functions seems a bit unusual to me.

From the scope of DDL and system catalogue procedure is just another
kind of function (prokind == 'p'). So, I don't feel wrong about that.

> 4.
> +
> + endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), timeout);
> +
>
> + if (timeout > 0)
> + {
> + delay_ms = (endtime - GetCurrentTimestamp()) / 1000;
> + latch_events |= WL_TIMEOUT;
> + if (delay_ms <= 0)
> + break;
> + }
>
> Why is endtime calculated even for timeout <= 0 only to just skip it
> later? Can't we just do a fastpath exit if timeout = 0 and targetLSN <

OK, fixed.

> 5.
> Parameter
> + <parameter>timeout</parameter> is the time in milliseconds to wait
> + for the <parameter>target_lsn</parameter>
> + replay. When <parameter>timeout</parameter> value equals to zero no
> + timeout is applied.
> + replay. When <parameter>timeout</parameter> value equals to zero no
> + timeout is applied.
>
> It turns out to be "When timeout value equals to zero no timeout is
> applied." I guess, we can just say something like the following which
> I picked up from pg_terminate_backend timeout parameter description.
>
> <para>
> If <parameter>timeout</parameter> is not specified or zero, this
> function returns if the WAL upto
> <literal>target_lsn</literal> is replayed.
> If the <parameter>timeout</parameter> is specified (in
> milliseconds) and greater than zero, the function waits until the
> server actually replays the WAL upto <literal>target_lsn</literal> or
> until the given time has passed. On timeout, an error is emitted.
> </para></entry>

Applied as you suggested with some edits from me.

> 6.
> + ereport(ERROR,
> + (errcode(ERRCODE_QUERY_CANCELED),
> + errmsg("canceling waiting for LSN due to timeout")));
>
> We can be a bit more informative here and say targetLSN and currentLSN
> something like - "timed out while waiting for target LSN %X/%X to be
> replayed; current LSN %X/%X"?

Done this way. Adjusted other ereport()'s as well.

> 7.
> + if (context->atomic)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("pg_wal_replay_wait() must be only called in
> non-atomic context")));
> +
>
> Can we say what a "non-atomic context '' is in a user understandable
> way like explicit txn or whatever that might be? "non-atomic context'
> might not sound great to the end -user.

Added errdetail() to this ereport().

> 8.
> + the <literal>movie</literal> table and get the <acronym>lsn</acronym> after
> + changes just made. This example uses
> <function>pg_current_wal_insert_lsn</function>
> + to get the <acronym>lsn</acronym> given that
> <varname>synchronous_commit</varname>
> + could be set to <literal>off</literal>.
>
> Can we just mention that run pg_current_wal_insert_lsn on the primary?

The mention is added.

> 9. To me the following query blocks even though I didn't mention timeout.
> CALL pg_wal_replay_wait('0/fffffff');

If your primary server is freshly initialized, you need to do quite
data modifications to reach this LSN.

> 10. Can't we do some input validation on the timeout parameter and
> emit an error for negative values just like pg_terminate_backend?
> CALL pg_wal_replay_wait('0/ffffff', -100);

Reasonable, added.

> 11.
> +
> + if (timeout > 0)
> + {
> + delay_ms = (endtime - GetCurrentTimestamp()) / 1000;
> + latch_events |= WL_TIMEOUT;
> + if (delay_ms <= 0)
> + break;
> + }
> +
>
> Can we avoid calling GetCurrentTimestamp in a for loop which can be
> costly at times especially when pg_wal_replay_wait is called with
> larger timeouts on multiple backends? Can't we reuse
> pg_terminate_backend's timeout logic in
> pg_wait_until_termination, perhaps reducing waittime to 1msec or so?

Normally there shouldn't be many loops. It only happens on spurious
wakeups. For instance, some process was going to set our latch before
and for another reason, but due to kernel scheduling it does only now.
So, normally there is only one wakeup. pg_wait_until_termination()
may sacrifice timeout accuracy due to possible spurious wakeups and
time spent outside of WaitLatch(). I don't feel reasonable to repeat
this login in WaitForLSN() especially given that we don't need
frequent wakeups here.

> 12. Why should we let every user run pg_wal_replay_wait procedure?
> Can't we revoke execute from the public in system_functions.sql so
> that one can decide who to run this function? Per comment #11, one can
> easily cause a lot of activity by running this function on hundreds of
> sessions.

Generally, if a user can make many connections, then this user can
make them busy and can consume resources. Given my explanation above,
pg_wal_replay_wait() even wouldn't make the connection busy, it would
just wait on the latch. I don't see why pg_wal_replay_wait() could do
more harm than pg_sleep(). So, I would leave pg_wal_replay_wait()
public.

Links.
1. https://www.postgresql.org/message-id/CAPpHfdtiGgn0iS1KbW2HTam-1%2BoK%2BvhXZDAcnX9hKaA7Oe%3DF-A%40mail.gmail.com

------
Regards,
Alexander Korotkov

In response to

Re: [HACKERS] make async slave to wait for lsn to be replayed at 2024-03-31 05:44:20 from Bharath Rupireddy

Responses

Re: [HACKERS] make async slave to wait for lsn to be replayed at 2024-04-01 02:25:28 from Bharath Rupireddy

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2024-04-01 00:47:32	Re: Statistics Import and Export
Previous Message	Corey Huinker	2024-04-01 00:10:19	Re: Statistics Import and Export