Re: [HACKERS] make async slave to wait for lsn to be replayed

From: Kartyshov Ivan <i(dot)kartyshov(at)postgrespro(dot)ru>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: aekorotkov(at)gmail(dot)com, hlinnaka(at)iki(dot)fi, alvherre(at)alvh(dot)no-ip(dot)org, pashkin(dot)elfe(at)gmail(dot)com, bharath(dot)rupireddyforpostgres(at)gmail(dot)com, euler(at)eulerto(dot)com, thomas(dot)munro(at)gmail(dot)com, peter(at)eisentraut(dot)org, amit(dot)kapila16(at)gmail(dot)com, dilipbalaut(at)gmail(dot)com, smithpb2250(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed
Date: 2024-07-09 09:51:23
Message-ID: 8d0286f9f0c2f9fc29794dda6b5f4600@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for your interest in the patch.

On 2024-06-20 11:30, Kyotaro Horiguchi wrote:
> Hi, I looked through the patch and have some comments.
>
>
> ====== L68:
> + <title>Recovery Procedures</title>
>
> It looks somewhat confusing and appears as if the section is intended
> to explain how to perform recovery. Since this is the first built-in
> procedure, I'm not sure how should this section be written. However,
> the section immediately above is named "Recovery Control Functions",
> so "Reocvery Synchronization Functions" would align better with the
> naming of the upper section. (I don't believe we need to be so strcit
> about the distinction between functions and procedures here.)
>
> It looks strange that the procedure signature includes the return type.

Good point, change
Recovery Procedures -> Recovery Synchronization Procedures

> ====== L93:
> + If <parameter>timeout</parameter> is not specified or zero,
> this
> + procedure returns once WAL is replayed upto
> + <literal>target_lsn</literal>.
> + If the <parameter>timeout</parameter> is specified (in
> + milliseconds) and greater than zero, the procedure waits until
> the
> + server actually replays the WAL upto
> <literal>target_lsn</literal> or
> + until the given time has passed. On timeout, an error is
> emitted.
>
> The first sentence should mention the main functionality. Following
> precedents, it might be better to use something like "Waits until
> recovery surpasses the specified LSN. If no timeout is specified or it
> is set to zero, this procedure waits indefinitely for the LSN. If the
> timeout is specified (in milliseconds) and is greater than zero, the
> procedure waits until the LSN is reached or the specified time has
> elapsed. On timeout, or if the server is promoted before the LSN is
> reached, an error is emitted."
>
> The detailed explanation that follows the above seems somewhat too
> verbose to me, as other functions don't have such detailed examples.

Please offer your description. I think it would be better.

> ====== L484
> /*
> + * Set latches for processes, whose waited LSNs are already replayed.
> This
> + * involves spinlocks. So, we shouldn't do this under a spinlock.
> + */
>
> Here, I'm not quite sure what specifically spinlock (or mutex?) is
> referring to. However, more importantly, shouldn't we explain that it
> is okay not to use any locks at all, rather than mentioning that
> spinlocks should not be used here? I found a similar case around
> freelist.c:238, which is written as follows.
>
>> * Not acquiring ProcArrayLock here which is slightly icky. It's
>> * actually fine because procLatch isn't ever freed, so we just can
>> * potentially set the wrong process' (or no process') latch.
>> */
>> SetLatch(&ProcGlobal->allProcs[bgwprocno].procLatch);

???

> ===== L518
> +void
> +WaitForLSN(XLogRecPtr targetLSN, int64 timeout)
>
> This function is only called within the same module. I'm not sure if
> we need to expose it. I we do, the name should probably be more
> specific. I'm not quite sure if the division of functionality between
> this function and its only caller function is appropriate. As a
> possible refactoring, we could have WaitForLSN() just return the
> result as [reached, timedout, promoted] and delegate prerequisition
> checks and error reporting to the SQL function.

waitLSN -> waitLSNStates
No, waitLSNStates is not the best name, because waitLSNState is a state,
and waitLSN is not the array of waitLSNStates. We can think about
another name, what you think?

> ===== L524
> + /* Shouldn't be called when shmem isn't initialized */
> + Assert(waitLSN);
>
> Seeing this assertion, I feel that the name "waitLSN" is a bit
> obscure. How about renaming it to "waitLSNStates"?

> ===== L527
> + /* Should be only called by a backend */
> + Assert(MyBackendType == B_BACKEND && MyProcNumber <= MaxBackends);
>
> This is somewhat excessive, causing a server crash when ending with an
> error would suffice. By the way, if I execute "CALL
> pg_wal_replay_wait('0/0')" on a logical wansender, the server crashes.
> The condition doesn't seem appropriate.

Can you give more information on your server crashes, so I could repeat
them.

> ===== L565
> + if (timeout > 0)
> + {
> + delay_ms = (endtime - GetCurrentTimestamp()) / 1000;
> + latch_events |= WL_TIMEOUT;
> + if (delay_ms <= 0)
> + break;
>
> "timeout" is immutable in the function. Therefore, we can calculate
> "latch_events" before entering the loop. By the way, the name
> 'latch_events' seems a bit off. Latch is a kind of event the function
> can wait for. Therefore, something like wait_events might be more
> appropriate.

"wait_event" - it can't be, because in latch declaration, this events
responsible for wake up and not for wait
int WaitLatch(Latch *latch, int wakeEvents, long timeout, uint32
wait_event_info)

> ==== L578
> + if (rc & WL_LATCH_SET)
> + ResetLatch(MyLatch);
>
> I think we usually reset latches unconditionally after returning from
> WaitLatch(), even when waiting for timeouts.

No, it depends on you logic, when you have several wake_events and you
want to choose what event ignited your latch.
Check applyparallelworker.c:813

> ===== L798
> + * A process number, same as the index of this item in
> waitLSN->procInfos.
> + * Stored for convenience.
> + */
> + int procnum;
>
> It is described as "(just) for convenience". However, it is referenced
> by Startup to fetch the PGPROC entry for the waiter, which necessary
> for Startup. That aside, why don't we hold (the pointer to) procLatch
> instead of procnum? It makes things simpler and I believe it is our
> standard practice.

Can you deeper explane what you meen and give the example?

> ===== L809
> + /* A flag indicating that this item is added to waitLSN->waitersHeap
> */
> + bool inHeap;
>
> The name "inHeap" seems too leteral and might be hard to understand in
> most occurances. How about using "waiting" instead?

No, inHeap leteral mean indeed inHeap. Check the comment.
Please suggest a more suitable one.

> ===== L940
> +# Check that new data is visible after calling pg_wal_replay_wait()
>
> On the other hand, the comment for the check for this test states that
>
>> +# Make sure the current LSN on standby and is the same as primary's
>> LSN +ok($output eq 30, "standby reached the same LSN as primary");
>
> I think the first comment and the second should be consistent.

Thanks, I'll rephrase this comment

>> Oh, I forgot some notes about 044_wal_replay_wait_injection_test.pl.
>>
>> 1. It's not clear why this test needs node_standby2 at all. It seems
>> useless.
>
> I agree with you. What we would need is a second *waiter client*
> connecting to the same stanby rather than a second standby. I feel
> like having a test where the first waiter is removed while multiple
> waiters are waiting, as well as a test where promotion occurs under
> the same circumstances.

Can you give more information about this cases step by step, and what
means "remove" and "promotion".

>> 2. The target LSN is set to pg_current_wal_insert_lsn() + 10000. This
>> location seems to be unachievable in this test. So, it's not clear
>> what race condition this test could potentially detect.
>> 3. I think it would make sense to check for the race condition
>> reported by Heikki. That is to insert the injection point at the
>> beginning of WaitLSNSetLatches().
>
> I think the race condition you mentioned refers to the inconsistency
> between the inHeap flag and the pairing heap caused by a race
> condition between timeout and wakeup (or perhaps other combinations?
> I'm not sure which version of the patch the mentioned race condition
> refers to). However, I imagine it is difficult to reliably reproduce
> this condition. In that regard, in the latest patch, the coherence
> between the inHeap flag and the pairing heap is protected by LWLock,
> so I believe we no longer need that test.

No, Alexandre means that Heikki point on race condition just before
LWLock. But injection point we can inject and stepin on backend, and
WaitLSNSetLatches is used from Recovery process. But I have trouble
to wakeup injection point on server.

--
Ivan Kartyshov
Postgres Professional: www.postgrespro.com

Attachment Content-Type Size
v20-Implement-pg_wal_replay_wait-store.patch text/x-diff 39.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2024-07-09 09:53:39 pg_verifybackup: TAR format backup verification
Previous Message shveta malik 2024-07-09 09:39:35 Re: Conflict Detection and Resolution