Quick Links

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

From:	Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, ashwinstar(at)gmail(dot)com
Subject:	Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
Date:	2021-08-14 00:59:21
Message-ID:	CAE-ML+8mB1R2_8ZVOFYROLFd=SdQXDtr-6aSxghUPkH+sUgkbw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hey Michael,

Really appreciate the review!

On Wed, Aug 11, 2021 at 12:40 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> Agreed that the current behavior is confusing. As you are using the
> commit timestamp for the comparison, this is right. One small-ish
> comment I have about the code is that we should mention
> recovery_min_apply_delay for HandleStartupProcInterrupts(), and not
> only the trigger file location.

Cool, updated.

> +# First, set the delay for the next commit to some obscenely high value.
> It has no need to be obscenely high, just high enough to give the time
> to slow machines to catch the wait event lookup done. So this could
> use better words to explain this choice.

Sounds good. Done.

> Anyway, it seems to me that something is incorrect in this new test
> (manual tests pass here, the TAP test is off). One thing we need to
> make sure for any new test added is that it correctly breaks if the
> fix proposed is *not* in place. And as far as I can see, the test
> passes even if the recalculation of delayUntil is done within the loop
> that reloads the configuration. The thing may be a bit tricky to make
> reliable as the parameter reloading may cause wait_event to not be
> RecoveryApplyDelay, so we should have at least a check based on a scan
> of pg_stat_replication.replay_lsn on the primary. Perhaps you have
> an other idea?

Hmm, please see attached v4 which just shifts the test to the middle,
like v1. When I run the test without the code change, the test hangs
as expected in:

# Now the standby should catch up.
$node_primary->wait_for_catchup('standby', 'replay');

and passes with the code change, as expected. I can't explain why the
test doesn't freeze up in v3 in wait_for_catchup() at the end.

As for checking on the wait event, since we only signal the standby
after performing the check, that should be fine. Nothing else would be
sending a SIGHUP before the check. Is that assumption correct?

> When using wait_for_catchup(), I would recommend to list "replay" for
> this test, even if that's the default mode, to make clear what the
> intention is.

Done.

Regards,
Soumyadeep (VMware)

Attachment	Content-Type	Size
v4-0001-Refresh-delayUntil-in-recovery-delay-loop.patch	text/x-patch	5.8 KB

In response to

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay at 2021-08-11 07:40:39 from Michael Paquier

Responses

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay at 2021-08-16 03:16:15 from Michael Paquier

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Peter Eisentraut	2021-08-14 08:03:01	Re: Default to TIMESTAMP WITH TIME ZONE?
Previous Message	David Zhang	2021-08-13 23:33:22	Re: [UNVERIFIED SENDER] Re: [bug] Logical Decoding of relation rewrite with toast does not reset toast_hash