Hi Andrey,
Thank you for your feedback!
> I think here you can just specify target timeline for the standby instance_1 and it will continue recovery from instance_2.
Most likely yes, but that looks more like a workaround than a fix.
> Persisting recovery signal file for some _timeout_ seems super dangerous to me. In distributed systems every extra _timeout_ is a source of complexity, uncertainty and despair.
The approach is not about persisting the signal files for some timeout.
Currently the files are removed in StartupXLOG() before
writeTimeLineHistory() and PerformRecoveryXLogAction() are called. The
suggestion is only to move the file removal to after
PerformRecoveryXLogAction(), still inside StartupXLOG(), so no timeout
is involved.
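To illustrate the ordering, here is a toy C sketch. It is not PostgreSQL source: the enum, the function signal_file_survives() and its parameters are hypothetical, and it assumes the concern is a failure somewhere between the file removal and the end of those two calls.

```c
#include <stdbool.h>

/* Hypothetical failure points for this toy model (not PostgreSQL code). */
typedef enum {
    FAIL_NONE,
    FAIL_IN_TIMELINE_HISTORY,  /* stands in for writeTimeLineHistory() */
    FAIL_IN_XLOG_ACTION        /* stands in for PerformRecoveryXLogAction() */
} fail_point;

/*
 * Returns true if the signal file still exists when the simulated
 * startup sequence stops, either by failing or by completing.
 */
bool signal_file_survives(bool remove_after_xlog_action, fail_point fail)
{
    bool signal_file_present = true;

    if (!remove_after_xlog_action)
        signal_file_present = false;      /* current order: removed first */

    if (fail == FAIL_IN_TIMELINE_HISTORY)
        return signal_file_present;

    if (fail == FAIL_IN_XLOG_ACTION)
        return signal_file_present;

    if (remove_after_xlog_action)
        signal_file_present = false;      /* suggested order: removed last */

    return signal_file_present;
}
```

In this model the file is already gone at either failure point with the current order, while with the suggested order it is still present there. In both cases it is removed once the sequence completes.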
Best regards,
Roman Eskin