Re: Selective logical replication

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: konstantin knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: Selective logical replication
Date: 2015-11-20 14:20:40
Message-ID: CAMsr+YFbmvDNb7zNK3zJOTohEsWNitzOLTTbo2q97TgFEXtwmQ@mail.gmail.com
Lists: pgsql-hackers

On 20 November 2015 at 22:03, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> On 19 November 2015 at 16:48, konstantin knizhnik <
> k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>
>> Hi,
>>
>> I want to use logical replication for implementing multimaster (so all
>> nodes are both sending and receiving changes).
>>
>
> Like http://bdr-project.org/ ?
>
>
>> But there is one "stupid" problem: how to prevent infinite recursion and
>> not to rereplicate replicated data.
>>
>
> You need replication origins, so you can tell where a change came from and
> in a mesh topology only send locally-originated tuples to peers.
>
> In a circular forwarding topology you instead filter out a peer's changes
> when they come back around the circle to you and forward everything else.
>
> That's exactly what the replication origins feature in 9.5 is for. It lets
> you associate a tuple with the node it came from. When doing logical
> decoding you can make decisions about whether to forward it based on your
> knowledge of the tuple's origin and the peer node(s).
>
> This is trivial to implement on top of the pglogical output plugin -
> we already have the hooks for origin filtering in place. All you have to do
> is pass information about which nodes you want to filter out or retain.
>
>
Oh, see also the submission for 9.6.

http://www.postgresql.org/message-id/CAMsr+YGc6AYKjsCj0Zfz=X4Aczonq1SfQx9C=hUYUN4j2pKwHA@mail.gmail.com

The pglogical downstream can trivially support simple multimaster
replication. So can any other client that consumes the pglogical_output
plugin's change stream because it's designed with pluggable hooks you can
use to add your own knowledge of topology, node filtering, etc.
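
To make the two filtering rules concrete, here is a minimal sketch in Python.
It is a toy simulation, not the pglogical_output hook API: Change,
forward_in_mesh, forward_in_ring and the replicate_* helpers are hypothetical
names made up for illustration, and node identity is just a string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    origin: str   # node where the change was first committed (assumed label)
    payload: str


def forward_in_mesh(change, local_node):
    # Mesh: every node is connected to every other node, so a node
    # only sends changes that originated locally.
    return change.origin == local_node


def forward_in_ring(change, peer_node):
    # Ring: forward everything except changes that originated on the
    # peer we are about to send to (they have come full circle).
    return change.origin != peer_node


def replicate_mesh(nodes, origin, payload):
    applied = {n: [] for n in nodes}
    change = Change(origin, payload)
    applied[origin].append(payload)
    for sender in nodes:
        if forward_in_mesh(change, sender):
            for peer in nodes:
                if peer != sender:
                    applied[peer].append(payload)
    return applied


def replicate_ring(nodes, origin, payload):
    applied = {n: [] for n in nodes}
    change = Change(origin, payload)
    applied[origin].append(payload)
    i = nodes.index(origin)
    while True:
        peer = nodes[(i + 1) % len(nodes)]
        if not forward_in_ring(change, peer):
            break  # the change is back at its origin, stop forwarding
        applied[peer].append(payload)
        i += 1
    return applied
```

In both topologies each node ends up applying the change exactly once, which
is the whole point of filtering on origin.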

That said: Multimaster is a lot more complicated than just avoiding
circular replication. That's only one of very many complex issues. Dealing
with schema changes is way harder than you think because of the
queued-but-not-yet-replicated changes in an asynchronous system. That's why
we have that horrid global DDL lock in BDR (though it's being improved to
be a bit less extreme).

Conflict handling is hard, especially when foreign keys, multiple
constraints, etc. get involved. There are tons and tons of hairy corner
cases.

Handling full table rewrites is a problem because the way Pg performs them
internally is currently hard for logical decoding to handle. You don't have
a good mapping from the pg_temp table being written to during the rewrite
to the final target table. It also means you lose the original replication
origin and commit timestamp info. So you need a way to make sure there are
no outstanding writes to replicate on that table when you do the full
rewrite, like a mini global checkpoint for just one table.
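
Roughly, the check I have in mind looks like the sketch below. This is only
an illustration under assumptions: it presumes new writes to the table are
already blocked on every node, it uses plain integers in place of real LSNs,
and safe_to_rewrite and both dictionaries are hypothetical names, not
anything that exists in Pg or BDR.

```python
def safe_to_rewrite(last_write_lsn, confirmed_lsn):
    """Return True when no node has unreplicated writes to the table.

    last_write_lsn: {node: position of that node's last write to the
                     table, 0 if it never wrote to it}
    confirmed_lsn:  {(sender, receiver): highest position of the sender's
                     stream that the receiver has confirmed applying}
    """
    for sender, lsn in last_write_lsn.items():
        for receiver in last_write_lsn:
            if receiver == sender:
                continue
            # A missing entry means the receiver has confirmed nothing yet.
            if confirmed_lsn.get((sender, receiver), 0) < lsn:
                return False
    return True
```

Only once this returns True for the table would it be safe to run the
rewrite, since nothing with the old origin and timestamp info is still in
flight.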

There are lots of things that can break an asynchronous multimaster
replication setup, often in subtle ways, that won't break standalone Pg.
Want to create an exclusion constraint? Think again. You can't do that
safely. Not without just discarding changed tuples that violate the
constraint and introducing divergence between node states, anyway.
Secondary unique indexes, FK constraints, etc, can all be challenging. A
naïve multimaster system is easy to deadlock with simple foreign key use.
There are still cases BDR doesn't handle.
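
A toy example of the divergence problem, as a sketch only: two nodes each
accept a row that is valid locally, then discard the other node's conflicting
row on apply. Node, local_insert, apply_remote and exchange are hypothetical
names, and a dict key stands in for a unique or exclusion constraint.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.rows = {}     # constrained key -> value
        self.outbox = []   # locally originated changes awaiting replication

    def local_insert(self, key, value):
        # The local constraint check passes because the peer's row
        # has not arrived yet.
        if key in self.rows:
            raise ValueError("constraint violation on " + key)
        self.rows[key] = value
        self.outbox.append((key, value))

    def apply_remote(self, key, value):
        # Naive policy: drop an incoming row that violates the constraint.
        if key in self.rows:
            return False
        self.rows[key] = value
        return True


def exchange(a, b):
    # Asynchronous replication: both nodes ship their queued changes.
    for key, value in a.outbox:
        b.apply_remote(key, value)
    for key, value in b.outbox:
        a.apply_remote(key, value)
    a.outbox.clear()
    b.outbox.clear()


a, b = Node("a"), Node("b")
a.local_insert("room-1", "alice")
b.local_insert("room-1", "bob")
exchange(a, b)
# a.rows and b.rows now disagree about who holds "room-1".
```

Neither node ever reports an error, yet they no longer hold the same data.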

I'd really, really love more involvement from others in doing multi-master
replication with PostgreSQL using logical decoding, replication origins,
etc. We've got some amazing infrastructure now, and are in a position to
deliver some really cool technology if there's more uptake and interest. So
I'd love to work with you to whatever extent it's possible. Let's try to
share effort rather than reinvent. Feel free to get in touch off-list if
that's better for you at the current stage of your investigations.

(BTW, I'm quite interested in what's being discussed with distributed
locking and transactions, since that'd help solve some of the issues with
DDL in logical replication. Having a good way to get a global exclusive
lock before a table rewrite, for example.)

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
