Re: Design for In-Core Logical Replication

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Design for In-Core Logical Replication
Date: 2016-07-20 17:07:28
Message-ID: CANP8+jLBJ+7+5E4LyKo-AXTQaxc8c+xmPoT+WJmE=R9PwzM9ow@mail.gmail.com
Lists: pgsql-hackers

On 20 July 2016 at 16:39, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:

>> Logical Replication is a method of replicating data objects and their
>> changes, based upon their Primary Keys (or Replication Identity). We
>>
>
> Do we want a limitation based on Primary Key, or would it be possible to
> use just UNIQUE or is that covered under Replication Identity?

That is covered by replication identity.

>> <para>
>> Logical Replication uses a Publish and Subscribe model with one or
>> more Subscribers subscribing to one or more Publications on a
>> Provider node. Subscribers pull data from the Publications they
>> subscribe to and may subsequently re-publish data to allow
>> cascading replication or more complex configurations.
>>
>
> Is that somehow different than Origin/Subscriber or Master/Slave? If not,
> why are we using yet more terms?

Thanks for asking. This is an important question, and we have a chance to
get it right before we go too far down the road of implementation.

Issue: We need a noun for CREATE "SOMETHING" (or pg_create_$NOUN). So what
noun should we use? The SQL Standard gives us no guidance here.

I'll explain my thinking, so we can discuss the terms I've recommended,
which can be summarized as:
A Provider node has one or more Databases, each of which can publish its
data in zero, one or more PUBLICATIONs. A Subscribing node can receive data
in the form of zero, one or more SUBSCRIBERs, where each SUBSCRIBER may
bring together data from one or more PUBLICATIONs.

Here's why...

Master/Slave is not appropriate, since both sending and receiving nodes are
Masters.

Origin/Subscriber is used by Slony. The term "Replication Origin" is
already used in PG9.5 for something related, but not identical.
Provider/Subscriber is used by Londiste.
Bucardo seems to use Master/Slave, according to its FAQ.

The Noun we are discussing is something that a single Database can have
more than one of, so those terms aren't quite appropriate.

pglogical uses Provider/Subscriber and Replication Sets, so I started with
the thought that we might want CREATE REPLICATION SET or
pg_create_replication_set(). After some time considering this, ISTM that
the term "replication set" may not be that useful since we foresee a future
where data is actually filtered and transformed and the feature set extends
well beyond what we have with Slony, so I began looking for a term that was
general and obvious (POLA).

After some thought, I realised that we are describing this as "Publish &
Subscribe", so it makes a lot of sense to just use the terms Publication &
Subscription. Those phrases are commonly used by SQL Server, Sybase, Oracle,
Redis, RabbitMQ etc., which is a pretty big set.
It's also a commonly used Enterprise Integration Design pattern:
https://en.wikipedia.org/wiki/Publish–subscribe_pattern
I note especially that Publish/Subscribe does not imply any particular
topology (a mistake I made earlier when I called this stuff BDR, which
confused everybody when we tried to talk about a subset of that
functionality called UDR).
http://www.slideshare.net/ishraqabd/publish-subscribe-model-overview-13368808

So that brings us to...
A Provider node has one or more Databases, each of which can publish its
data in zero, one or more PUBLICATIONs. A Subscribing node can receive data
in the form of zero, one or more SUBSCRIBERs, where each SUBSCRIBER may
bring together data from one or more PUBLICATIONs.
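
As a rough illustration of the cardinalities in that summary, here is a
minimal Python sketch. The class names, the `tables` field and all example
names are assumptions made purely for illustration; none of this is proposed
syntax or an existing API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Publication:
    name: str
    # Hypothetical: which data objects the publication carries.
    tables: List[str] = field(default_factory=list)


@dataclass
class Database:
    name: str
    # A Database publishes its data in zero, one or more PUBLICATIONs.
    publications: List[Publication] = field(default_factory=list)


@dataclass
class ProviderNode:
    databases: List[Database]

    def __post_init__(self) -> None:
        # A Provider node has one or more Databases.
        if not self.databases:
            raise ValueError("a Provider node needs at least one Database")


@dataclass
class Subscriber:
    name: str
    publications: List[Publication]

    def __post_init__(self) -> None:
        # Each SUBSCRIBER brings together one or more PUBLICATIONs.
        if not self.publications:
            raise ValueError("a SUBSCRIBER needs at least one PUBLICATION")

    def tables(self) -> List[str]:
        # Union of everything the subscribed publications carry.
        return sorted({t for p in self.publications for t in p.tables})


orders = Publication("orders_pub", ["orders", "order_lines"])
customers = Publication("customers_pub", ["customers"])

# One database with two publications, another with none.
provider = ProviderNode([
    Database("sales", [orders, customers]),
    Database("scratch"),
])

# A single SUBSCRIBER combining two PUBLICATIONs.
reporting = Subscriber("reporting", [orders, customers])
print(reporting.tables())
```

Nothing in the sketch says which node is a "master", which is the point of
the terminology: a node could hold both Publications and SUBSCRIBERs.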

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
