From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, vignesh C <vignesh21(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Multi-Master Logical Replication |
Date: | 2022-05-14 06:50:05 |
Message-ID: | CAA4eK1KnPo0QQ83hoBVrnr4kbJs0cyXdVtbTZ-sYgsfpNMw0OQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, May 14, 2022 at 12:33 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> Uh, without these features, what workload would this help with?
>
To allow replication among multiple nodes when some of the nodes may
have pre-existing data. This work plans to provide simple APIs to
achieve that. Now, let me try to explain the difficulties users can
face with the existing interface. It is simple to set up replication
among various nodes when they don't have any pre-existing data but
even in that case if the user operates on the same table at multiple
nodes, the replication will lead to an infinite loop and won't
proceed. The example in email [1] demonstrates that and the patch in
that thread attempts to solve it. I have mentioned that problem
because this work will need that patch.
Now, let's take a simple case where two nodes have the same table
which has some pre-existing data:
Node-1:
Table t1 (c1 int) has data
1, 2, 3, 4
Node-2:
Table t1 (c1 int) has data
5, 6, 7, 8
If we have to set up replication among the above two nodes using
existing interfaces, it could be very tricky. Say user performs
operations like below:
Node-1
#Publication for t1
Create Publication pub1 For Table t1;
Node-2
#Publication for t1,
Create Publication pub1_2 For Table t1;
Node-1:
Create Subscription sub1 Connection '<node-2 details>' Publication pub1_2;
Node-2:
Create Subscription sub1_2 Connection '<node-1 details>' Publication pub1;
After this the data will be something like this:
Node-1:
1, 2, 3, 4, 5, 6, 7, 8
Node-2:
1, 2, 3, 4, 5, 6, 7, 8, 5, 6, 7, 8
So, you can see that data on Node-2 (5, 6, 7, 8) is duplicated. In
case, table t1 has a unique key, it will lead to a unique key
violation and replication won't proceed. Here, I have assumed that we
already have functionality for the patch in email [1], otherwise,
replication will be an infinite loop replicating the above data again
and again. Now one way to achieve this could be that we can ask users
to stop all operations on both nodes before starting replication
between those and take data dumps of tables from each node they want
to replicate and restore them to other nodes. Then use the above
commands to set up replication and allow to start operations on those
nodes. The other possibility for users could be as below. Assume, we
have already created publications as in the above example, and then:
Node-2:
Create Subscription sub1_2 Connection '<node-1 details>' Publication pub1;
#Wait for the initial sync of table t1 to finish. Users can ensure
that by checking 'srsubstate' in pg_subscription_rel.
Node-1:
Begin;
# Disallow truncates to be published and then truncate the table
Alter Publication pub1 Set (publish = 'insert, update, delete');
Truncate t1;
Create Subscription sub1 Connection '<node-2 details>' Publication pub1_2;
Alter Publication pub1 Set (publish = 'insert, update, delete, truncate');
Commit;
This will become more complicated when more than two nodes are
involved, see the example provided for the three nodes case [2]. Can
you think of some other simpler way to achieve the same? If not, I
don't think the current way is ideal and even users won't prefer that.
I am not telling that the APIs proposed in this thread is the only or
best way to achieve the desired purpose but I think we should do
something to allow users to easily set up replication among multiple
nodes.
[1] - https://www.postgresql.org/message-id/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9%3D9orqubhjcQ%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CALDaNm3aD3nZ0HWXA8V435AGMvORyR5-mq2FzqQdKQ8CPomB5Q%40mail.gmail.com
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Noah Misch | 2022-05-14 08:27:28 | Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set |
Previous Message | Erik Rijkers | 2022-05-14 06:17:05 | Re: JSON Functions and Operators Docs for v15 |