RE: Handle infinite recursion in logical replication setup

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: RE: Handle infinite recursion in logical replication setup
Date: 2022-08-17 03:18:08
Message-ID: OS0PR01MB5716C0112EFA1C2328A84C8F946A9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Tuesday, August 2, 2022 8:00 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Tue, Jul 26, 2022 at 9:07 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Jul 26, 2022 at 7:13 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
> > >
> > > Thanks for the example. I agree that it is fairly simple to reproduce.
> > >
> > > I understand that "copy_data = force" is meant to protect a user
> > > from hurting themself. I'm not convinced that this is the best way to do so.
> > >
> > > For example, today I can subscribe to multiple publications that
> > > write to the same table. If I have a primary key on that table, and
> > > two of the subscriptions try to write an identical ID, we conflict.
> > > We don't have any special flags or modes to guard against that
> > > happening, though we do have documentation on conflicts and managing
> > > them.
> > >
> > > AFAICT the same issue with "copy_data" exists in the above scenario
> > > too, even without the "origin" attribute.
> > >
> >
> > That's true, but in that scenario there is no parameter like origin =
> > NONE which indicates that constraint violations or duplicate data
> > problems won't occur due to replication. In the current case, I think
> > the situation is different because the user has specifically asked not
> > to replicate any remote data by specifying origin = NONE, and that
> > should be dealt with differently. Note that current users and their
> > setups won't see any difference/change unless they specify the new
> > parameter origin as NONE.
> >
>
> Let me try to summarize the discussion so that it is easier for others to
> follow. The work in this thread is to avoid loops and duplicate data in
> logical replication when operations happen on the same table on multiple
> nodes. Email [1] explains, with an example, how such a logical replication
> setup can lead to duplicate or inconsistent data.
>
> The idea to solve this problem is to not replicate data that is not
> generated locally, which we can normally identify based on the origin of
> the data in the WAL. Commit 366283961a achieves that for replication, but
> the problem can still happen during the initial sync, which is performed
> internally via copy, because we can't differentiate the data in the heap
> based on origin. So, we decided to prohibit the subscription operations
> that can perform an initial sync (e.g. CREATE SUBSCRIPTION, ALTER
> SUBSCRIPTION ... REFRESH) when we detect that the publisher has itself
> subscribed to the same table from some other publisher.
>
> To prohibit the subscription operations, the currently proposed patch
> throws an error. It also provides a new copy_data option 'force' under
> which the user will still be able to perform the operation. This could be
> useful when the user intentionally wants to replicate the initial data
> even if it contains data from multiple nodes (for example, when in a
> multi-node setup, one decides to get the initial data from just one node
> and then allow replication of data to proceed from each of the respective
> nodes).
>
> The other alternative discussed was to just give a warning for
> subscription operations and probably document the steps for users to
> avoid the problem. But the problem with that is that once users see this
> warning, they won't be able to do anything except recreate the setup, so
> why not give an error in the first place?
>
> Thoughts?

Thanks for the summary.

I think it's fine to report an ERROR so that the user uses the copy_data
option more carefully and avoids duplicate copies.
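(To make sure I understand the proposed behavior, here is a rough SQL sketch
of it. The subscription, publication, and connection names are made up, and I
am assuming the syntax of the proposed origin = none and copy_data = force
options.)

CREATE SUBSCRIPTION sub_ab
    CONNECTION 'host=nodeA dbname=postgres'
    PUBLICATION pub_a
    WITH (origin = none, copy_data = on);
-- proposed behavior: ERROR, because the publisher (nodeA) has itself
-- subscribed to the same table from another node, so its initial copy could
-- contain non-local data.

CREATE SUBSCRIPTION sub_ab
    CONNECTION 'host=nodeA dbname=postgres'
    PUBLICATION pub_a
    WITH (origin = none, copy_data = force);
-- the user explicitly accepts that the initial copy may include data that
-- originated on other nodes.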

But, like Sawada-san, I am concerned that the user could receive an ERROR in
some unexpected cases.

For example, suppose I want to build a bi-directional setup between two nodes:

Node A: TABLE test (has actual data)
Node B: TABLE test (empty)

Step 1:
CREATE PUBLICATION on both Node A and B.

Step 2:
CREATE SUBSCRIPTION on Node A with (copy_data = on)
-- this is fine as there is no data on Node B

Step 3:
CREATE SUBSCRIPTION on Node B with (copy_data = on)
-- this should also be fine, as the user needs to copy the data from Node A
-- to Node B, but we still report an ERROR for this case (see the SQL sketch
-- below).
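
To make the above steps concrete, here is a rough SQL sketch (the table,
publication, subscription, and connection names are made up, and I am
assuming origin = none is specified as discussed in this thread):

-- Step 1: on both Node A and Node B
CREATE PUBLICATION pub_test FOR TABLE test;

-- Step 2: on Node A, subscribe to Node B (Node B's table is empty, so the
-- initial copy is harmless)
CREATE SUBSCRIPTION sub_a_from_b
    CONNECTION 'host=nodeB dbname=postgres'
    PUBLICATION pub_test
    WITH (origin = none, copy_data = on);

-- Step 3: on Node B, subscribe to Node A (the user genuinely needs the
-- initial data from Node A here, but the proposed patch reports an ERROR
-- because Node A now has a subscription on the same table)
CREATE SUBSCRIPTION sub_b_from_a
    CONNECTION 'host=nodeA dbname=postgres'
    PUBLICATION pub_test
    WITH (origin = none, copy_data = on);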

It looks a bit strict to report an ERROR in this case, and it does not seem
easy to avoid. So, personally, I think it might be better to document the
correct steps to build bi-directional replication, and probably also to
document the steps to recover if the user accidentally performed a duplicate
initial copy, if that is not already documented.

In addition, we could also LOG some additional information about the ORIGIN
and the initial copy, which might help the user analyze the situation if
needed.

----- Some other thoughts about the duplicate initial copy problem.

Actually, I feel a better way to address the possible duplicate copy problem
would be to provide a command like "bi_setup" which could help the user build
bi-directional replication on all nodes and handle the initial copy
automatically. But that might be going too far.

Another naive idea I once considered is that maybe we could add a publication
option like data_source_in_bi_group. If data_source_in_bi_group is true, we
silently let the operation succeed when this publication is subscribed with
(origin = 'NONE', copy_data = on). If data_source_in_bi_group is false, we
report an ERROR when this publication is subscribed with (origin = 'NONE',
copy_data = on). The slight difference in this approach is that we don't do
any additional checks on the publisher and subscriber; we trust the user and
give them the responsibility to choose the node with the actual data. Having
said that, this does not seem like a good approach, as it doesn't solve the
actual problem: it can only handle the case where origin = 'NONE' and cannot
handle the future case when we support origin = 'NODE1'.
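
Written out as hypothetical SQL, just so the idea above is clear (the option
does not exist and the syntax is made up):

-- hypothetical option marking the publication on the node that holds the
-- actual data in the bi-directional group
CREATE PUBLICATION pub_test FOR TABLE test
    WITH (data_source_in_bi_group = true);

-- subscribing to this publication with (origin = 'NONE', copy_data = on)
-- would then silently succeed, while the same subscription against a
-- publication created with data_source_in_bi_group = false would report an
-- ERROR.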

So, I personally feel that adding documentation and a LOG message might be
sufficient at this stage.

Best regards,
Hou zj
