From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | a(dot)lubennikova(at)postgrespro(dot)ru |
Cc: | robertmhaas(at)gmail(dot)com, michael(at)paquier(dot)xyz, sk(at)zsrv(dot)org, andres(at)anarazel(dot)de, peter(dot)eisentraut(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Allow some recovery parameters to be changed with reload |
Date: | 2020-10-22 02:00:26 |
Message-ID: | 20201022.110026.524650499365127827.horikyota.ntt@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
At Thu, 22 Oct 2020 01:59:07 +0300, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> wrote in
> On 10.08.2020 23:20, Robert Haas wrote:
> > On Sun, Aug 9, 2020 at 1:21 AM Michael Paquier <michael(at)paquier(dot)xyz>
> > wrote:
> >> Sorry for the late reply. I have been looking at that stuff again,
> >> and restore_command can be called in the context of a WAL sender
> >> process within the page_read callback of logical decoding via
> >> XLogReadDetermineTimeline(), as readTimeLineHistory() could look for a
> >> timeline history file. So restore_command is not used only in the
> >> startup process.
> > Hmm, interesting. But, does that make this change wrong, apart from
> > the comments? Like, in the case of primary_conninfo, maybe some
> > confusion could result if the startup process decided whether to ask
> > for a WAL receiver based on thinking primary_conninfo being set, while
> > that process thought that it wasn't actually set after all, as
> > previously discussed in
> > http://postgr.es/m/CA+TgmoZVmJX1+QTWw2tSnPHrnkwKZxC3ZsRynFB-Fpzm1Oxuhw@mail.gmail.com
> > ... but what's the corresponding hazard here, exactly? It doesn't seem
> > that there's any way in which the decision one process makes affects
> > the decision the other process makes. There's still a race condition:
> > it's possible for a walsender
> Did you mean walreceiver here?
It's logical walsender. restore_command is used within
logical_read_xlog_page() via XLogReadDetermineTimeline().
> > to use the old restore_command after the
> > startup process had already used the new one, or the other way around.
> > However, it doesn't seem like that should confuse anything inside the
> > server, and therefore I'm not sure we need to code around it.
> I came up with following scenario. Let's say we have xlog files 1,2,3
> in dir1 and files 4,5 in dir2. If startup process had only handled
> files 1 and 2, before we switched restore_command from reading dir1 to
> reading dir2, it will fail to find next file. IIUC, it will assume
> that recovery is done, start server and walreceiver. The walreceiver
> will fail as well. I don't know, how realistic is this case, though.
That operation is somewhat bogus, if the server is not in standby
mode. In standby mode, startup waits for the next segment safely.
> In general,. this feature looks useful and consistent with previous
> changes, so I am interested in pushing it forward.
Agreed. The feature seems to work fine as far as we don't make a
change of restore_command that moves to another history. Otherwise
recovery doesn't work correctly regaredless whether it is PGC_SIGHUP
or not.
> Sergey, could you please attach this thread to the upcoming CF, if
> you're going to continue working on it.
>
> A few more questions:
> - RestoreArchivedFile() is also used by pg_rewind. I don't see any
> - particular problem with it, just want to remind that we should test it
> - too.
> - How will it interact with possible future optimizations of archive
> - restore? For example, WAL prefetch [1].
>
> [1]
> https://www.postgresql.org/message-id/flat/601EE1F5-0B78-47E1-9AAE-C15F74A1C21D(at)postgrespro(dot)ru
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | k.jamison@fujitsu.com | 2020-10-22 02:06:43 | RE: [Patch] Optimize dropping of relation buffers using dlist |
Previous Message | Michael Paquier | 2020-10-22 01:53:53 | Re: PostgresNode::backup uses spread checkpoint? |