Quick Links

Re: Time-Delayed Standbys

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Fabrízio Mello <fabriziomello(at)gmail(dot)com>
Cc:	Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Time-Delayed Standbys
Date:	2013-12-03 19:33:16
Message-ID:	CA+U5nML64Y7Nf0mpQirk-tOWrV+ZQ3y8DdRgk2mP+e3kGT16Mw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 18 October 2013 19:03, Fabrízio de Royes Mello
<fabriziomello(at)gmail(dot)com> wrote:

> The attached patch is a continuation of Robert's work [1].

Reviewing v2...

> I made some changes:
> - use of Latches instead of pg_usleep, so we don't have to wakeup regularly.

> - call HandleStartupProcInterrupts() before CheckForStandbyTrigger() because
> might change the trigger file's location

> - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
> XLOG_XACT_COMMIT_COMPACT checks

Why just those? Why not aborts and restore points also?

> - don't care about clockdrift because it's an admin problem.

Few minor points on things

* The code with comment "Clear any previous recovery delay time" is in
wrong place, move down or remove completely. Setting the delay to zero
doesn't prevent calling recoveryDelay(), so that logic looks wrong
anyway.

* The loop exit in recoveryDelay() is inelegant, should break if <= 0

* There's a spelling mistake in sample

* The patch has whitespace in one place

and one important point...

* The delay loop happens AFTER we check for a pause. Which means if
the user notices a problem on a commit, then hits pause button on the
standby, the pause will have no effect and the next commit will be
applied anyway. Maybe just one commit, but its an off by one error
that removes the benefit of the patch. So I think we need to test this
again after we finish delaying

if (xlogctl->recoveryPause)
recoveryPausesHere();

We need to explain in the docs that this is intended only for use in a
live streaming deployment. It will have little or no meaning in a
PITR.

I think recovery_time_delay should be called
<something>_apply_delay
to highlight the point that it is the apply of records that is
delayed, not the receipt. And hence the need to document that sync rep
is NOT slowed down by setting this value.

And to make the name consistent with other parameters, I suggest
min_standby_apply_delay

We also need to document caveats about the patch, in that it only
delays on timestamped WAL records and other records may be applied
sooner than the delay in some circumstances, so it is not a way to
avoid all cancellations.

We also need to document the behaviour of the patch is to apply all
data received as quickly as possible once triggered, so the specified
delay does not slow down promoting the server to a master. That might
also be seen as a negative behaviour, since promoting the master
effectively sets recovery_time_delay to zero.

I will handle the additional documentation, if you can update the
patch with the main review comments. Thanks.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Time-Delayed Standbys at 2013-10-18 18:03:16 from Fabrízio de Royes Mello

Responses

Re: Time-Delayed Standbys at 2013-12-04 14:01:59 from Andres Freund
Re: Time-Delayed Standbys at 2013-12-06 04:07:15 from Fabrízio de Royes Mello

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2013-12-03 19:34:10	Re: 9.2.1 & index-only scans : abnormal heap fetches after VACUUM FULL
Previous Message	Tom Lane	2013-12-03 19:31:17	Re: Extension Templates S03E11