On Fri, Jun 2, 2017, at 11:51 AM, Alexander Kukushkin wrote:
> Hello hackers,
> There is one strange and awful thing I don't understand about
> restore_command: it is always called for every single WAL
> segment postgres wants to apply (even if that segment already exists
> in pg_xlog) until the replica starts streaming from the master.

The real problem behind this question is that we were unable to bring a
former master, demoted after a crash, back online: the WAL segments
required to bring it to a consistent state had not been archived while
it was still a master, and local segments in pg_xlog are ignored when a
restore_command is defined. The other replicas were not good candidates
for promotion either, as they were well behind the master (the last N
WAL segments had not been archived, and streaming replication lagged by
a few seconds).

Is this the correct list for such questions, or would it be more
appropriate to ask elsewhere (e.g. pgsql-bugs)?

>
> If there is no restore_command in recovery.conf, it works
> perfectly, i.e. postgres replays the existing WAL segments and at some
> point connects to the master and starts streaming from it.
>
> When restore_command is there, starting a replica can become a real
> problem, especially if restore_command is slow.
>
> Is it possible to change this behavior somehow? That is, look into
> pg_xlog first, and call restore_command only if the file is missing
> or "corrupted".
>
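
To illustrate the behavior asked for above, a restore_command wrapper
could look in pg_xlog first and go to the archive only as a fallback.
The following is a minimal sketch in Python, not something tested
against a real cluster. The names restore_segment, fetch_with_command
and main, the archive path, and the script path are all hypothetical,
and comparing the file size with the default 16 MB segment size is only
an assumed stand-in for a real "corrupted" check.

```python
import os
import shutil
import subprocess
import sys

# Default WAL segment size; an assumption, adjust for custom builds.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def restore_segment(wal_name, dest_path, pg_xlog_dir, fetch_from_archive,
                    segment_size=WAL_SEGMENT_SIZE):
    """Copy wal_name to dest_path, preferring a local copy in pg_xlog_dir.

    Returns "local" or "archive" on success, None if neither source worked.
    """
    local_path = os.path.join(pg_xlog_dir, wal_name)
    # Crude sanity check only: a full-size file may still hold stale data
    # (for example a recycled segment), which this does not detect.
    if os.path.isfile(local_path) and os.path.getsize(local_path) == segment_size:
        shutil.copyfile(local_path, dest_path)
        return "local"
    # Anything else (missing, short, or non-segment files) goes to the archive.
    if fetch_from_archive(wal_name, dest_path):
        return "archive"
    return None


def fetch_with_command(wal_name, dest_path):
    # Placeholder archive fetch; /path/to/archive is hypothetical.
    source = os.path.join("/path/to/archive", wal_name)
    return subprocess.call(["cp", source, dest_path]) == 0


def main(wal_name, dest_path):
    # restore_command runs with the data directory as its working
    # directory, so a relative "pg_xlog" is used here.
    source = restore_segment(wal_name, dest_path, "pg_xlog", fetch_with_command)
    return 0 if source else 1


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

It would be wired in as something like
restore_command = 'python /path/to/wrapper.py %f %p', where the script
path is again a placeholder. Whether bypassing the archive this way is
safe in every case is exactly the open question.
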
> Regards,
> ---
> Alexander Kukushkin
Sincerely,
Alex