Re: Timeline issue if StartupXLOG() is interrupted right before end-of-recovery record is done

From: Roman Eskin <r(dot)eskin(at)arenadata(dot)io>
To: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Timeline issue if StartupXLOG() is interrupted right before end-of-recovery record is done
Date: 2025-01-21 11:47:19
Message-ID: 15fc1d1c-f0c2-4489-9611-b0262c14cfdc@arenadata.io
Lists: pgsql-hackers

Hi Andrey,

Thank you for your feedback!

> I think here you can just specify target timeline for the standby instance_1 and it will continue recovery from instance_2.

Most likely yes, but it still looks more like a workaround than a fix.

> Persisting recovery signal file for some _timeout_ seems super dangerous to me. In distributed systems every extra _timeout_ is a source of complexity, uncertainty and despair.

The approach is not about persisting the signal files for some timeout.
Currently the files are removed in StartupXLOG() before
writeTimeLineHistory() and PerformRecoveryXLogAction() are called. The
suggestion is to move the file removal to after PerformRecoveryXLogAction(),
still inside StartupXLOG().
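
To illustrate the ordering, here is a rough sketch as a toy model in C. It is
not the actual xlog.c code: the types and the functions
simulate_end_of_recovery() and restart_resumes_recovery() are hypothetical
names invented for this example, and the crash point is assumed to sit between
writeTimeLineHistory() and PerformRecoveryXLogAction(). It only shows which
on-disk state is left behind in each ordering.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef enum { REMOVE_BEFORE_EOR, REMOVE_AFTER_EOR } RemovalOrder;
typedef enum { NO_CRASH, CRASH_BEFORE_EOR } CrashPoint;

typedef struct
{
    bool signal_file_present;   /* stands in for standby.signal / recovery.signal */
    bool history_file_written;  /* stands in for writeTimeLineHistory() having run */
    bool eor_record_written;    /* stands in for PerformRecoveryXLogAction() having run */
} ClusterState;

/* Simulate the tail of startup with the chosen ordering and crash point. */
ClusterState
simulate_end_of_recovery(RemovalOrder order, CrashPoint crash)
{
    ClusterState s = { true, false, false };

    /* Current behaviour: signal files are removed first. */
    if (order == REMOVE_BEFORE_EOR)
        s.signal_file_present = false;

    /* The new timeline history file is written. */
    s.history_file_written = true;

    /* Assumed interruption right before the end-of-recovery record. */
    if (crash == CRASH_BEFORE_EOR)
        return s;

    /* The end-of-recovery record is written. */
    s.eor_record_written = true;

    /* Suggested behaviour: signal files are removed only now. */
    if (order == REMOVE_AFTER_EOR)
        s.signal_file_present = false;

    return s;
}

/* In this model, a restart re-enters recovery only if a signal file is left. */
bool
restart_resumes_recovery(const ClusterState *s)
{
    return s->signal_file_present;
}
```

In this model, with REMOVE_BEFORE_EOR and a crash before the end-of-recovery
record, the history file exists but neither the record nor the signal file
does, so the restart does not resume recovery. With REMOVE_AFTER_EOR the
signal file is still present after the same crash, and both orderings end in
the same state when there is no crash.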

Best regards,
Roman Eskin
