Re: Inconsistent DB data in Streaming Replication

From: Ants Aasma <ants(at)cybertec(dot)at>
To: sthomas(at)optionshouse(dot)com
Cc: Samrat Revagade <revagade(dot)samrat(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistent DB data in Streaming Replication
Date: 2013-04-08 16:26:33
Message-ID: CA+CSw_v99xhD-Om=wrTu8PRzsXE7W+CVLgrUPgsdnoQkHzdZ=g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 8, 2013 at 6:50 PM, Shaun Thomas <sthomas(at)optionshouse(dot)com> wrote:
> On 04/08/2013 05:34 AM, Samrat Revagade wrote:
>
>> One solution to avoid this situation is have the master send WAL
>> records to standby and wait for ACK from standby committing WAL files
>> to disk and only after that commit data page related to this
>> transaction on master.
>
>
> Isn't this basically what synchronous replication does in PG 9.1+?

Not exactly. Sync-rep ensures that commit success is not sent to the
client before a synchronous replica acks the commit record. What
Samrat is proposing here is that WAL is not flushed to the OS before
it is acked by a synchronous replica, so that recovery won't go past
the timeline change made in failover, which would otherwise make it
necessary to take a new base backup to resync with the new master. I
seem to remember this
being discussed when sync rep was committed. I don't recall if the
idea was discarded only on performance grounds or whether there were
other issues too.
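
To make the difference concrete, here is a toy model (not PostgreSQL
code; the function names and the use of plain integers as LSNs are
made up for illustration, and it assumes the promoted replica switches
timelines exactly at the last LSN it acked):

```python
def local_wal_past_switch(master_flush_lsn: int, switch_lsn: int) -> bool:
    """True if the old master holds flushed WAL beyond the timeline switch point."""
    return master_flush_lsn > switch_lsn


def capped_flush_lsn(wal_written_lsn: int, replica_ack_lsn: int) -> int:
    """Proposed behaviour: never flush locally past what the sync replica acked."""
    return min(wal_written_lsn, replica_ack_lsn)


# The replica has acked WAL up to 1000 when the master crashes.
replica_ack = 1000

# Plain sync rep: the master may already have flushed up to 1200 locally;
# only the commit confirmation to the client waits for the ack.
sync_rep_flush = 1200

# Proposal: the local flush is held back to the acked position.
proposed_flush = capped_flush_lsn(1200, replica_ack)

print(local_wal_past_switch(sync_rep_flush, replica_ack))
print(local_wal_past_switch(proposed_flush, replica_ack))
```

In the first case the old master has local WAL that the new master
never saw, which is the situation that forces a new base backup.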

Thinking about it now, the requirement is that after crash and
failover to a sync replica we should be able to reuse the datadir to
replicate from the new master without consistency problems. We should be able
to achieve that by ensuring that we don't write out pages until we
have received an ack from the sync replica and that we check for
possible timeline switches before recovering local WAL. For the first,
it seems to me that it should be enough to rework the updating of
XLogCtl->LogwrtResult.Flush so it accounts for the sync replica. For
the second part, I think Heikki's work on enabling timeline switches
over streaming connections already ensures this (I haven't checked it
out in detail), but if not, it shouldn't be too hard to add.
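
A rough sketch of the first part, as I understand the idea (this is a
simplified model, not the actual xlog.c logic; WalState and its fields
are hypothetical names, and LSNs are plain integers). It relies on the
usual WAL-before-data rule: a data page may only be written out once
WAL up to that page's LSN counts as flushed. If the flush position
that rule consults is capped by the sync replica's ack, pages can't
get ahead of the replica:

```python
from dataclasses import dataclass


@dataclass
class WalState:
    local_flush_lsn: int   # WAL durably flushed on the master (hypothetical field)
    replica_ack_lsn: int   # highest LSN acked by the sync replica (hypothetical field)

    def effective_flush_lsn(self) -> int:
        # The flush position "accounts for the sync replica" by taking the
        # minimum of what is flushed locally and what the replica has acked.
        return min(self.local_flush_lsn, self.replica_ack_lsn)

    def may_write_page(self, page_lsn: int) -> bool:
        # WAL-before-data: a page is writable only if its LSN is covered
        # by the effective flush position.
        return page_lsn <= self.effective_flush_lsn()


state = WalState(local_flush_lsn=1200, replica_ack_lsn=1000)
print(state.may_write_page(950))    # covered by the replica's ack
print(state.may_write_page(1100))   # flushed locally, but not yet acked

state.replica_ack_lsn = 1200        # the ack arrives
print(state.may_write_page(1100))
```

The obvious cost is that page writes (and so checkpoints) now stall on
replica latency, which may be the performance concern mentioned above.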

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
