From: | "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, fabriziomello(at)gmail(dot)com, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Minimal logical decoding on standbys |
Date: | 2023-02-06 11:36:56 |
Message-ID: | 8176a801-5bd8-a944-52dc-98a68808505e@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 1/31/23 12:50 PM, Drouvot, Bertrand wrote:
> Hi,
>
> On 1/6/23 4:40 AM, Andres Freund wrote:
>> Hi,
>> On 2023-01-05 16:15:39 -0500, Robert Haas wrote:
>>> On Tue, Jan 3, 2023 at 2:42 AM Drouvot, Bertrand
>>> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>> 0006:
>>
>>> diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
>>> index bc3c3eb3e7..98c96eb864 100644
>>> --- a/src/backend/access/transam/xlogrecovery.c
>>> +++ b/src/backend/access/transam/xlogrecovery.c
>>> @@ -358,6 +358,9 @@ typedef struct XLogRecoveryCtlData
>>> RecoveryPauseState recoveryPauseState;
>>> ConditionVariable recoveryNotPausedCV;
>>>
>>> + /* Replay state (see getReplayedCV() for more explanation) */
>>> + ConditionVariable replayedCV;
>>> +
>>> slock_t info_lck; /* locks shared variables shown above */
>>> } XLogRecoveryCtlData;
>>>
>>
>> getReplayedCV() doesn't seem to fit into any of the naming scheems in use for
>> xlogrecovery.h.
>
> Changed to check_for_replay() in V46 attached.
>
>>> - * Sleep until something happens or we time out. Also wait for the
>>> - * socket becoming writable, if there's still pending output.
>>> + * When not in recovery, sleep until something happens or we time out.
>>> + * Also wait for the socket becoming writable, if there's still pending output.
>>
>> Hm. Is there a problem with not handling the becoming-writable case in the
>> in-recovery case?
>>
>>
>
> Yes, when not in recovery we'd wait for the timeout to occur in ConditionVariableTimedSleep()
> (as the CV is broadcasted only in ApplyWalRecord()).
>
>>> + else
>>> + /*
>>> + * We are in the logical decoding on standby case.
>>> + * We are waiting for the startup process to replay wal record(s) using
>>> + * a timeout in case we are requested to stop.
>>> + */
>>> + {
>>
>> I don't think pgindent will like that formatting....
>
> Oops, fixed.
>
>>
>>
>>> + ConditionVariablePrepareToSleep(replayedCV);
>>> + ConditionVariableTimedSleep(replayedCV, 1000,
>>> + WAIT_EVENT_WAL_SENDER_WAIT_REPLAY);
>>> + }
>>
>> I think this is racy, see ConditionVariablePrepareToSleep()'s comment:
>>
>> * Caution: "before entering the loop" means you *must* test the exit
>> * condition between calling ConditionVariablePrepareToSleep and calling
>> * ConditionVariableSleep. If that is inconvenient, omit calling
>> * ConditionVariablePrepareToSleep.
>>
>> Basically, the ConditionVariablePrepareToSleep() should be before the loop
>> body.
>>
>
> I missed it, thanks! Moved it before the loop body.
>
>>
>> I don't think the fixed timeout here makes sense. For one, we need to wake up
>> based on WalSndComputeSleeptime(), otherwise we're ignoring wal_sender_timeout
>> (which can be quite small).
>
> Good point. Making use of WalSndComputeSleeptime() instead in V46.
>
>> It's also just way too frequent - we're trying to
>> avoid constantly waking up unnecessarily.
>>
>>
>> Perhaps we could deal with the pq_is_send_pending() issue by having a version
>> of ConditionVariableTimedSleep() that accepts a WaitEventSet?
>>
>
> What issue do you see?
> The one that I see with V46 (keeping the in/not recovery branches) is that one may need to wait
> for wal_sender_timeout to see changes that occurred right after the promotion.
>
> Regards,
>
Attaching a tiny rebase (V47) due to f9bc34fcb6.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v47-0006-Doc-changes-describing-details-about-logical-dec.patch | text/plain | 2.1 KB |
v47-0005-New-TAP-test-for-logical-decoding-on-standby.patch | text/plain | 26.6 KB |
v47-0004-Fixing-Walsender-corner-case-with-logical-decodi.patch | text/plain | 7.7 KB |
v47-0003-Allow-logical-decoding-on-standby.patch | text/plain | 11.8 KB |
v47-0002-Handle-logical-slot-conflicts-on-standby.patch | text/plain | 37.0 KB |
v47-0001-Add-info-in-WAL-records-in-preparation-for-logic.patch | text/plain | 76.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2023-02-06 11:51:12 | Re: Time delayed LR (WAS Re: logical replication restrictions) |
Previous Message | vignesh C | 2023-02-06 11:32:31 | Re: Support logical replication of DDLs |