From: | Magnus Hagander <magnus(at)hagander(dot)net> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Streaming Replication patch for CommitFest 2009-09 |
Date: | 2009-09-17 08:46:48 |
Message-ID: | 9837222c0909170146g7721af7fte033c4a08349f407@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Sep 17, 2009 at 10:08, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Fujii Masao wrote:
>> On Tue, Sep 15, 2009 at 7:53 PM, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> After playing with this a little bit, I think we need logic in the slave
>>> to reconnect to the master if the connection is broken for some reason,
>>> or can't be established in the first place. At the moment, that is
>>> considered as the end of recovery, and the slave starts up. You have the
>>> trigger file mechanism to stop that, but it only gives you a chance to
>>> manually kill and restart the slave before it chooses a new timeline and
>>> starts up, it doesn't reconnect automatically.
>>
>> I was thinking of automatic reconnection as a TODO item for a later CF.
>> The infrastructure for it has already been introduced in the current
>> patch. Please see the macro MAX_WALRCV_RETRIES (backend/
>> postmaster/walreceiver.c). This is the maximum number of times to retry
>> walreceiver. In the current version it is a fixed value, but we can make
>> it user-configurable (a recovery.conf parameter would be suitable, I think).
>
> Ah, I see.
>
> Robert Haas suggested a while ago that walreceiver could be a
> stand-alone utility, not requiring postmaster at all. That would allow
> you to set up streaming replication as another way to implement WAL
> archiving. Looking at how the processes interact, there really isn't
> much communication between walreceiver and the rest of the system, so
> that sounds pretty attractive.

Yes, that would be very, very useful.

> Walreceiver is really a slave to the startup process. The startup
> process decides when it's launched, and it's the startup process that
> then waits for it to advance. But the way it's set up at the moment, the
> startup process needs to ask the postmaster to start it up, and it
> doesn't look very robust to me. For example, if launching walreceiver
> fails for some reason, startup process will just hang waiting for it.
>
> I'm thinking that walreceiver should be a stand-alone program that the
> startup process launches, similar to how it invokes restore_command in
> PITR recovery. Instead of using system(), though, it would use
> fork+exec, and a pipe to communicate.

Not having looked at all into the details, that sounds like a nice
improvement :-)

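
For readers unfamiliar with the fork+exec-plus-pipe pattern Heikki proposes above, here is a minimal sketch in C. It is not code from the patch or from PostgreSQL: the function name run_child_with_pipe and its interface are hypothetical, chosen only for illustration. It assumes a POSIX system and shows a parent launching a helper program, reading the helper's standard output through a pipe, and learning from the exit status whether the launch failed.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): run argv[0] via fork+exec,
 * copy up to buflen-1 bytes of the child's stdout into buf, and return
 * the child's exit status. Returns -1 on failure or abnormal exit.
 */
static int
run_child_with_pipe(char *const argv[], char *buf, size_t buflen)
{
    int     fds[2];
    pid_t   pid;
    size_t  used = 0;
    ssize_t n;
    int     status;

    if (buflen == 0 || pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: send stdout into the pipe, then replace the process image. */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO)
        {
            if (dup2(fds[1], STDOUT_FILENO) < 0)
                _exit(127);
            close(fds[1]);
        }
        execvp(argv[0], argv);
        _exit(127);             /* exec failed; the parent sees status 127 */
    }

    /* Parent: close the write end so read() sees EOF when the child exits. */
    close(fds[1]);
    while (used < buflen - 1 &&
           (n = read(fds[0], buf + used, buflen - 1 - used)) > 0)
        used += (size_t) n;
    buf[used] = '\0';
    close(fds[0]);

    /* Reap the child so a failed launch is reported instead of hanging. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with an argv such as {"echo", "hello", NULL}, the function should fill buf with the child's output and return 0. A command that cannot be executed comes back as status 127, so the caller learns about the failed launch rather than waiting indefinitely, which is the robustness concern raised in the quoted text.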
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/