From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: After switching primary server while using replication slot. |
Date: | 2014-08-22 14:29:12 |
Message-ID: | 20140822142912.GQ17406@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2014-08-20 13:14:30 -0400, Robert Haas wrote:
> On Tue, Aug 19, 2014 at 6:25 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > On Mon, Aug 18, 2014 at 11:16 PM, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >> Hi all,
> >> After switching the primary server while using replication slots, the
> >> standby server will not be able to connect to the new primary server.
> >> Imagine this situation: the primary server has two ASYNC standby
> >> servers, each using its own replication slot.
> >> One standby (A) applies WAL without problems, but the other
> >> standby (B) has stopped after connecting to the primary server
> >> (or WAL sending to it is badly delayed).
> >>
> >> In this situation, standby (B) has not received WAL segment files
> >> while it was stopped.
> >> The primary server cannot remove WAL segments that have not been
> >> received by all standbys.
> >> Therefore the primary server has to keep the WAL segment files that
> >> have not been received by all standbys.
> >> But standby (A) can do checkpoints itself, and can then
> >> recycle WAL segments.
> >> So the number of WAL segments on each server differs.
> >> (Standby (A) has fewer WAL files than the primary server.)
> >> After the primary server crashes and standby (A) is promoted to primary,
> >> we can try to connect standby (B) to standby (A) as a new standby server.
> >> But that will fail because standby (A) might not have the WAL
> >> segment files that standby (B) requires.
> >
> > This sounds like a valid concern.
> >
> >> To resolve this situation, I think that we should make the master server
> >> notify all standby servers about the removal of WAL segments.
> >> The standby servers would then recycle WAL segment files based on that information.
I think that'll end up being really horrible, at least if done in an
obligatory fashion. In a cascaded setup it's really sensible to only
retain WAL on the intermediate nodes. Consider e.g. a setup - rather
common these days actually - where there's a master somewhere and then a
cascading standby on each continent feeding off to further nodes on that
continent. You don't want to retain WAL on each continent (or on the
primary) just because one node somewhere is down for maintenance.
If you really want something like this we should probably add the
infrastructure for one standby to maintain a replication slot on another
standby server. So, if you have a setup like:
      A
     / \
    /   \
   B     C
  / \   / \
 .. .. .. ..
B and C can coordinate that they keep enough WAL for each other. You can
actually easily write an external tool for that today. Just create a
replication slot on B for C and the other way round and have a tool
update them once a minute or so.
I'm not sure if we want that builtin.
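To make the external-tool idea concrete, here is a minimal sketch of the bookkeeping such a tool might do on each run. It is an illustration under stated assumptions, not something posted in this thread: the helper names (`parse_lsn`, `format_lsn`, `slot_target`, `advance_sql`) and the slot name are hypothetical, and the generated statement assumes a server that provides `pg_replication_slot_advance()`, which only exists in releases newer than the ones discussed here. The tool is assumed to have already read each node's replay position and the slot's current `restart_lsn`.

```python
from typing import Optional


def parse_lsn(text: str) -> int:
    """Convert an LSN such as '16/B374D848' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back into the 'X/Y' LSN notation."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def slot_target(peer_replay: str, own_replay: str, slot_restart: str) -> Optional[str]:
    """Return the LSN the peer's slot should move to, or None to leave it alone.

    The slot must not move past what the peer has replayed (the peer may
    still need that WAL), nor past what this node has replayed itself.
    """
    target = min(parse_lsn(peer_replay), parse_lsn(own_replay))
    if target <= parse_lsn(slot_restart):
        return None  # slots only ever move forward
    return format_lsn(target)


def advance_sql(slot_name: str, target_lsn: str) -> str:
    """Build the statement to run on the node that owns the slot."""
    # Assumption: the server provides pg_replication_slot_advance().
    return f"SELECT pg_replication_slot_advance('{slot_name}', '{target_lsn}');"
```

Run once a minute or so, the tool would call `slot_target()` for the slot that B holds on behalf of C, then again with the roles swapped, and execute the statement from `advance_sql()` on the owning node whenever a target is returned. How the positions are fetched and how the statement is executed (driver, scheduling, error handling) is deliberately left out.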
> >> Thoughts?
> >
> > How does the server recycle WAL files after it's promoted from the
> > standby to master?
> > It does that as it likes? If yes, your approach would not be enough.
> >
> > The approach prevents unexpected removal of WAL files while the standby
> > is running. But after the standby is promoted to master, it might recycle
> > needed WAL files immediately. So another standby may still fail to retrieve
> > the required WAL file after the promotion.
> >
> > ISTM that, in order to address this, we might need to log all the replication
> > slot activities and replicate them to the standby. I'm not sure if this
> > breaks the design of replication slot at all, though.
Yes, that'd break it. You can't WAL log anything on a standby, and
replication slots can be modified on standbys.
> I believe that the reason why replication slots are not currently
> replicated is because we had the idea that the standby could have
> slots that don't exist on the master, for cascading replication. I'm
> not sure that works yet, but I think Andres definitely had it in mind
> in the original design.
That works. And it's absolutely required for adding logical decoding on
standbys (I've a prototype patch for it...).
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services