From: | Magnus Hagander <magnus(at)hagander(dot)net> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: streaming replication breaks horribly if master crashes |
Date: | 2010-06-16 20:32:45 |
Message-ID: | AANLkTil01eBZVtqOWQqp2ZjAd1-JpY5l9PW3Lwn5P96o@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Jun 16, 2010 at 22:26, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> and this just
>>> makes it more likely. After the most recent crash, the master thought
>>> pg_current_xlog_location() was 1/86CD4000; the slave thought
>>> pg_last_xlog_receive_location() was 1/8733C000. After reconnecting to
>>> the master, the slave then thought that
>>> pg_last_xlog_receive_location() was 1/87000000.
>>
>> So, *in this case*, detecting out-of-sequence xlogs (and PANICing) would
>> have actually prevented the slave from being corrupted.
>>
>> My question, though, is detecting out-of-sequence xlogs *enough*? Are
>> there any crash conditions on the master which would cause the master to
>> reuse the same locations for different records, for example? I don't
>> think so, but I'd like to be certain.
>
> The real problem here is that we're sending records to the slave which
> might cease to exist on the master if it unexpectedly reboots. I
> believe that what we need to do is make sure that the master only
> sends WAL it has already fsync'd (Tom suggested on another thread that
> this might be necessary, and I think it's now clear that it is 100%
> necessary). But I'm not sure how this will play with fsync=off - if
> we never fsync, then we can't ever really send any WAL without risking
Well, given how late we are in the cycle, at this point we can just
disallow streaming replication with fsync=off if we can't think of an
easy fix, and then design a "proper fix" for 9.1.
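
To make the two ideas in this exchange concrete, here is a minimal sketch in Python. It is not the walsender code and not a proposed patch. The function names (parse_lsn, standby_is_ahead, sendable_upto) and the fsync_enabled flag are made up for illustration. It assumes an xlog location of the form "1/86CD4000" can be ordered by treating the two hex halves as the high and low 32 bits of one integer. That is only relied on for ordering here, not for computing byte distances.

```python
def parse_lsn(text: str) -> int:
    """Turn an xlog location such as '1/86CD4000' into one comparable integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def standby_is_ahead(master_current: str, standby_received: str) -> bool:
    """True if the standby claims to have WAL beyond the master's current location."""
    return parse_lsn(standby_received) > parse_lsn(master_current)


def sendable_upto(written: int, flushed: int, fsync_enabled: bool) -> int:
    """Highest position that is safe to stream under the policy discussed above.

    Hypothetical policy: never send past what has been fsync'd, and refuse to
    stream at all with fsync=off, because nothing is ever known to be durable.
    """
    if not fsync_enabled:
        raise RuntimeError("streaming replication refused: fsync=off")
    return min(written, flushed)


# Locations reported earlier in the thread, after the crash.
print(standby_is_ahead("1/86CD4000", "1/8733C000"))  # True: standby is ahead of the master

# If only WAL up to the flushed position had been sent, the standby could
# not have received anything past it.
written = parse_lsn("1/8733C000")
flushed = parse_lsn("1/86CD4000")
print(sendable_upto(written, flushed, fsync_enabled=True) == flushed)  # True
```

The first check is the "detect out-of-sequence xlogs" idea: it only notices the problem after the fact. The second is the stricter rule of never sending unflushed WAL, with the fsync=off case rejected outright as the stopgap suggested here.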
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/