Re: Handle infinite recursion in logical replication setup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-07-25 04:20:58
Message-ID: CAA4eK1JQ2W-ZS4DYG18LE-xOgtb+eSM_9f1_+-pnNEJUGHizbA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jul 24, 2022 at 10:21 PM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
>
> On 7/22/22 12:47 AM, Amit Kapila wrote:
> > On Fri, Jul 22, 2022 at 1:39 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
>
> >> 1. I'm concerned by calling this "Bidirectional replication" in the docs
> >> that we are overstating the current capabilities. I think this is
> >> accentuated int he opening paragraph:
> >>
> >> ==snip==
> >> Bidirectional replication is useful for creating a multi-master database
> >> environment for replicating read/write operations performed by any of the
> >> member nodes.
> >> ==snip==
> >>
> >> For one, we're not replicating reads, we're replicating writes. Amongst
> >> the writes, at this point we're only replicating DML. A reader could
> >> think that deploying can work for a full bidirectional solution.
> >>
> >> (Even if we're aspirationally calling this section "Bidirectional
> >> replication", that does make it sound like we're limited to two nodes,
> >> when we can support more than two).
> >>
> >
> > Right, I think the system can support N-Way replication.
>
> I did some more testing of the feature, i.e. doing 3-node and 4-node
> tests. While logical replication today can handle replicating between
> multiple nodes (N-way), the "origin = none" does require setting up
> subscribers between each of the nodes.
>
> For example, if I have 4 nodes A, B, C, D and I want to replicate the
> same table between all of them, I need to set up subscribers between all
> of them (A<=>B, A<=>C, A<=>D, B<=>C, B<=>D, C<=>D). However, each node
> can replicate between each other in a way that's convenient (vs. having
> to do something funky with partitions) so this is still a big step forward.
>
> This is a long way of saying that I do think it's fair to say we support
> "N-way" replication so long as you are set up in a mesh (vs. a ring,
> e.g. A=>B=>C=>D=>A).
>

Sorry, but I don't get your point of mesh vs. ring? I think with some
care users can set up replication even in a ring topology.

> > Among the above "Replicating changes between primaries" sounds good to
> > me or simply "Replication between primaries". As this is a sub-section
> > on the Logical Replication page, I feel it is okay to not use Logical
> > in the title.
>
> Agreed, I think that's fine.
>
> >> At a minimum, I think we should reference the documentation we have in
> >> the logical replication section on conflicts. We may also want to advise
> >> that a user is responsible for designing their schemas in a way to
> >> minimize the risk of conflicts.
> >>
> >
> > This sounds reasonable to me.
> >
> > One more point about docs, it appears to be added as the last
> > sub-section on the Logical Replication page. Is there a reason for
> > doing so? I feel this should be third sub-section after describing
> > Publication and Subscription.
>
> When I first reviewed, I had not built the docs. Did so on this pass.
>
> I agree with the positioning argument, i.e. it should go after
> "Subscription" in the table of contents -- but it makes me question a
> couple of things:
>
> 1. The general ordering of the docs
> 2. How we describe that section (more on that in a sec)
> 3. If "row filters" should be part of "subscription" instead of its own
> section.
>

I don't think it is a good idea to keep "row filters" as a part of
subscription because we define those at publisher but there are
certain things like initial sync or combining of row filters that are
related to subscriptions. So, probably having it in a separate
sub-section seems okay to me. I have also thought about keeping it as
a part of Publication or Subscription but left it due to the reasons
mentioned.

> If you look at the current order, "Quick setup" is the last section; one
> would think the "quick" portion goes first :) Given a lot of this is for
> the current docs, I may start a separate discussion on -docs for this part.
>
> For the time being, I agree it should be moved to the section after
> "Subscription".
>

Okay, thanks!

> I think what this section describes is "Configuring Replication Between
> Nodes" as it covers a few different scenarios.
>
> I do think we need to iterate on these docs -- the examples with the
> commands are generally OK and easy to follow, but a few things I noticed:
>
> 1. The general description of the section needs work. We may want to
> refine the description of the use cases, and in the warning, link to
> instructions on how to take backups.
> 2. We put the "case not supported" in the middle, not at the end.
> 3. The "generic steps for adding a new node..." section uses a
> convention for steps that is not found in the docs. We also don't
> provide an example for this section, and this is the most complicated
> scenario to set up.
>

I agree with all these points, especially your point related to a
complicated setup. This won't be easy for users without a very clear
description and examples. However, after this work, it will at least
be possible for users to set up an N-Way replication with some
restrictions and care. BTW, while working on this we have noticed
that it is normally a lot of work for users to set up N-way
replication among 3, 4, or more nodes and that is why there is another
proposal to make that set up easier by providing a simpler APIs which
will internally do similar to all the manual steps the 0002 patch is
trying to describe. See [1] (please don't be confused with the thread
title but the real intent is to allow users to provide an easier way
to set up an N-Way replication by having all current restrictions of
Logical Replication like it doesn't support DDL replication)

> I may be able to propose some suggestions in a few days.
>

Okay, thanks! The plan is to get the 0001 patch of Vignesh. Then work
on docs to describe how users can set up replication among primary
nodes. After that, if we get consensus on providing simpler APIs for
setting up replication among primary nodes as is being discussed in
the thread [1], then work on it.

> > BTW, do you have any opinion on the idea of the first remaining patch
> > where we accomplish two things: a) Checks and throws an error if
> > 'copy_data = on' and 'origin = none' but the publication tables were
> > also replicated from other publishers. b) Adds 'force' value for
> > copy_data parameter to allow copying in such a case. The primary
> > reason for this patch is to avoid loops or duplicate data in the
> > initial phase. We can't skip copying based on origin as we can do
> > while replicating changes from WAL. So, we detect that the publisher
> > already has data from some other node and doesn't allow replication
> > unless the user uses the 'force' option for copy_data.
>
> In general, I agree with the patch; but I'm not sure why we are calling
> "copy_data = force" in this case and how it varies from "on". I
> understand the goal is to prevent the infinite loop, but is there some
> technical restriction why we can't set "origin = none, copy_data = on"
> and have this work (and apologies if I missed that upthread)?
>

The technical restriction is that we want to throw an error when users
set "origin = none, copy_data = on" and we find that the publication
tables were also replicated from other publishers. Now, we can give
WARNING to users in this case but users won't have any way to avoid
duplicate data which can lead to constraint violations. So, we decided
to throw ERROR and allow users to perform it with a "force" option.

> The other concern I'll note is that we're changing a boolean parameter
> to an enum and I want to be sensitive to folks who are already using
> "copy_data" to be sure we don't break them.
>

Okay, but AFAIU, the patch still allows users to specify on/off, so
won't that be sufficient?

[1] - https://www.postgresql.org/message-id/CAHut%2BPuwRAoWY9pz%3DEubps3ooQCOBFiYPU9Yi%3DVB-U%2ByORU7OA%40mail.gmail.com

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhang Mingli 2022-07-25 04:28:23 Re: optimize lookups in snapshot [sub]xip arrays
Previous Message jian he 2022-07-25 04:18:12 COPY FROM FORMAT CSV FORCE_NULL(*) ?