From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Ants Aasma <ants(at)cybertec(dot)at>, sthomas(at)optionshouse(dot)com, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Samrat Revagade <revagade(dot)samrat(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com> |
Subject: | Re: Inconsistent DB data in Streaming Replication |
Date: | 2013-04-11 17:18:42 |
Message-ID: | CAHGQGwED7g9PBFN9N_4AHONC19_fR5B+JcKjUxET9TG=h=M3=g@mail.gmail.com |
Lists: | pgsql-hackers |

On Thu, Apr 11, 2013 at 2:42 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Ants Aasma <ants(at)cybertec(dot)at> writes:
>> We already rely on WAL-before-data to ensure correct recovery. What is
>> proposed here is to slightly redefine it to require WAL to be
>> replicated before it is considered to be flushed. This ensures that no
>> data page on disk differs from the WAL that the slave has. The
>> machinery to do this is already mostly there, we already wait for WAL
>> flushes and we know the write location on the slave. The second
>> requirement is that we never start up as master and we don't trust any
>> local WAL. This is actually how pacemaker clusters work, you would
>> only need to amend the RA to wipe the WAL and configure postgresql
>> with restart_after_crash = false.
>
>> It would be very helpful in restoring HA capability after failover if
>> we wouldn't have to read through the whole database after a VM goes
>> down and is migrated with the shared disk onto a new host.
>
> The problem with this is it's making an idealistic assumption that a
> crashed master didn't do anything wrong or lose/corrupt any data during
> its crash. As soon as you realize that's an unsafe assumption, the
> whole thing becomes worthless to you.

Crash recovery relies on the same assumption. If that assumption is really
unsafe, we should stop supporting crash recovery. But I don't think it is
always unsafe.

Regards,
--
Fujii Masao