From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: streaming replication breaks horribly if master crashes
Date: 2010-06-17 05:57:55
Message-ID: AANLkTinI949X-OWATDntssWPnRVO5JxkLbdSCpvDl-e6@mail.gmail.com
Lists: pgsql-hackers
On Thu, Jun 17, 2010 at 5:26 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Jun 16, 2010 at 4:14 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> The first problem I noticed is that the slave never seems to realize
>>> that the master has gone away. Every time I crashed the master, I had
>>> to kill the wal receiver process on the slave to get it to reconnect;
>>> otherwise it just sat there waiting, either forever or at least for
>>> longer than I was willing to wait.
>>
>> Yes, I've noticed this. That was the reason for forcing walreceiver to
>> shut down on a restart per prior discussion and patches. This needs to
>> be on the open items list ... possibly it'll be fixed by Simon's
>> keepalive patch? Or is it just a tcp_keepalive issue?
>
> I think a TCP keepalive might be enough, but I have not tried to code
> or test it.
The "keepalive on libpq" patch would help.
https://commitfest.postgresql.org/action/patch_view?id=281
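
For illustration only (this is not the patch itself, and I'm not reproducing the libpq parameter names it adds), enabling TCP keepalive on a socket comes down to something like the following Linux-specific sketch; the helper name enable_tcp_keepalive() is mine:

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

static int
enable_tcp_keepalive(int sockfd, int idle_secs, int interval_secs, int count)
{
    int on = 1;

    /* Turn keepalive probing on for this connection. */
    if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;

    /* Seconds of idleness before the first probe is sent. */
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;

    /* Seconds between subsequent probes. */
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof(interval_secs)) < 0)
        return -1;

    /* Unanswered probes before the connection is declared dead. */
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT,
                   &count, sizeof(count)) < 0)
        return -1;

    return 0;
}

With settings like that, the walreceiver's kernel would notice within a bounded time that the master has vanished, instead of waiting forever on a dead connection.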
>>> and this just
>>> makes it more likely. After the most recent crash, the master thought
>>> pg_current_xlog_location() was 1/86CD4000; the slave thought
>>> pg_last_xlog_receive_location() was 1/8733C000. After reconnecting to
>>> the master, the slave then thought that
>>> pg_last_xlog_receive_location() was 1/87000000.
>>
>> So, *in this case*, detecting out-of-sequence xlogs (and PANICing) would
>> have actually prevented the slave from being corrupted.
>>
>> My question, though, is detecting out-of-sequence xlogs *enough*? Are
>> there any crash conditions on the master which would cause the master to
>> reuse the same locations for different records, for example? I don't
>> think so, but I'd like to be certain.
>
> The real problem here is that we're sending records to the slave which
> might cease to exist on the master if it unexpectedly reboots. I
> believe that what we need to do is make sure that the master only
> sends WAL it has already fsync'd (Tom suggested on another thread that
> this might be necessary, and I think it's now clear that it is 100%
> necessary).
The attached patch changes walsender so that it always sends WAL up to
LogwrtResult.Flush instead of LogwrtResult.Write.
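
Just to illustrate the idea (this is a simplified standalone sketch, not the walsender code; WalPtr, get_written_ptr, get_flushed_ptr and stream_wal_up_to are placeholder names standing in for XLogRecPtr, LogwrtResult.Write, LogwrtResult.Flush and the actual send loop, and the positions are dummy values):

#include <stdio.h>

typedef unsigned long long WalPtr;      /* simplified stand-in for XLogRecPtr */

/* Dummy positions for the sake of a runnable example. */
static WalPtr get_written_ptr(void) { return 0x187340000ULL; } /* LogwrtResult.Write */
static WalPtr get_flushed_ptr(void) { return 0x186CD4000ULL; } /* LogwrtResult.Flush */

static void
stream_wal_up_to(WalPtr endptr)
{
    printf("sending WAL up to %llX\n", endptr);
}

int
main(void)
{
    /*
     * Before the patch: WAL was streamed up to the write pointer, i.e.
     * including records that could still vanish if the master rebooted
     * before fsync:
     *
     *     stream_wal_up_to(get_written_ptr());
     *
     * After the patch: only WAL already flushed to disk on the master is
     * ever sent, so a crashed-and-restarted master can never end up
     * behind its standby.
     */
    stream_wal_up_to(get_flushed_ptr());
    return 0;
}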
> But I'm not sure how this will play with fsync=off - if
> we never fsync, then we can't ever really send any WAL without risking
> this failure mode. Similarly with synchronous_commit=off, I believe
> that the next checkpoint will still fsync WAL, but the lag might be
> long.
First of all, in the fsync=off case we should not restart the master after a
crash at all, because doing so can corrupt the master database itself.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachment | Content-Type | Size
---|---|---
send_after_fsync_v1.patch | application/octet-stream | 3.0 KB