From: | Eric Radman <ericshane(at)eradman(dot)com> |
---|---|
To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com> |
Subject: | Re: [PATCH] Add recovery_min_apply_delay_reconnect recovery option |
Date: | 2017-10-17 13:40:10 |
Message-ID: | 20171017134009.GA52403@vm2.eradman.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 17, 2017 at 12:34:17PM +0900, Michael Paquier wrote:
> On Tue, Oct 17, 2017 at 12:51 AM, Eric Radman <ericshane(at)eradman(dot)com> wrote:
> > This administrative compromise is necessary because the WalReceiver is
> > not resumed after a network interruption until all records are read,
> > verified, and applied from the archive on disk.
>
> Taking a step back here... recoveryApplyDelay() uses
> XLogCtl->recoveryWakeupLatch which gets set if the WAL receiver has
> received new WAL, or if the WAL receiver shuts down properly.
I thought I had observed cases where the WalReceiver was shut down
without causing XLogCtl->recoveryWakeupLatch to return. If I'm wrong
about this then there's no reason to spin every n seconds.
> the WAL receiver gets down for whatever reason during the loop of
> recoveryApplyDelay(), the startup process waits for a record to be
> applied maybe for a long time, and as there is no WAL receiver we
> actually don't receive any new WAL records.
...
> indeed a waste to not have a WAL receiver online while waiting for a
> delay to be applied.
Exactly!
> If there is a flacky network between the primary and a standby, you
> may end up with a standby way behind its primary, and that could
> penalize a primary clean shutdown as the primary waits for the
> shutdown checkpoint record to be flushed on the standby.
This is another artifact that the database administrator would not
anticipate.
> I think that your way to deal with the problem is messy though. If you
> think about it, no parameters are actually needed. What you should try
> to achieve is to make recoveryApplyDelay() smarter, by making the wait
> to forcibly stop if you detect a failure by getting out of the redo
> routine, and then force again the record to be read again. This way,
> the startup process would try to start again a new WAL receiver if it
> thinks that the source it should read WAL from is a stream. That may
> turn to be a patch more complicated than you think though.
One of my earlier attempts was to break from the redo loop and try
reading the next record. This was too simple because it only starts the
WAL receiver if there is nothing more to be read from the archive.
Which record are you suggesting should be forcibly "read again"? The
record identified by XLogCtl->replayEndRecPtr or
XLogCtl->lastReplayedEndRecPtr? I'll look more carefully at such an
approach.
> Your patch also breaks actually the use case of standbys doing
> recovery using only archives and no streaming. In this case
> WalRcvStreaming returns false, and recovery_min_apply_delay_reconnect
> would be used unconditionally, so you would break a lot of
> applications silently.
Excellent point--I had not thought of how this would interact with a
standby that used only archives.
All useful feedback, thank you for the thorough review!
--
Eric Radman | http://eradman.com
From | Date | Subject | |
---|---|---|---|
Next Message | Dilip Kumar | 2017-10-17 13:43:36 | Re: Partition-wise aggregation/grouping |
Previous Message | Tom Lane | 2017-10-17 13:34:24 | Re: SIGSEGV in BRIN autosummarize |