Re: Handle infinite recursion in logical replication setup

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2022-09-01 00:19:19
Message-ID: CAHut+Pt1vsUujQ57OBgxc9vJ1aKxisiTzhTxSciNcZFsa-wxoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 31, 2022 at 8:36 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Aug 31, 2022 at 11:35 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Here are my review comments for v43-0001.
> >
...
> > ======
> >
> > 2. doc/src/sgml/ref/create_subscription.sgml
> >
> > + <para>
> > + If the subscription is created with <literal>origin = none</literal> and
> > + <literal>copy_data = true</literal>, it will check if the publisher has
> > + subscribed to the same table from other publishers and, if so, log a
> > + warning to notify the user to check the publisher tables. Before continuing
> > + with other operations the user should check that publisher tables did not
> > + have data with different origins to prevent data inconsistency issues on the
> > + subscriber.
> > + </para>
> >
> > I am not sure about that wording saying "to prevent data inconsistency
> > issues" because I thought maybe is already too late because any
> > unwanted origin data is already copied during the initial sync. I
> > think the user can do it try to clean up any problems caused before
> > things become even worse (??)
> >
>
> Oh no, it is not too late in all cases. The problem can only occur if
> the user will try to subscribe from all the different publications
> with copy_data = true. We are anyway trying to clarify in the second
> patch the right way to accomplish this. So, I am not sure what better
> we can do here. The only bulletproof way is to provide some APIs where
> users don't need to bother about all these cases, basically something
> similar to what Kuroda-San is working on in the thread [1].
>

The point of my review comment was only about the wording of the note
- specifically, you cannot "prevent" something (e,g, data
inconsistency) if it has already happened.

Maybe modify the wording like below?

BEFORE
Before continuing with other operations the user should check that
publisher tables did not have data with different origins to prevent
data inconsistency issues on the subscriber.

AFTER
If a publisher table with data from different origins was found to
have been copied to the subscriber, then some corrective action may be
necessary before continuing with other operations.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-09-01 00:23:41 Re: Reducing the chunk header sizes on all memory context types
Previous Message Justin Pryzby 2022-09-01 00:15:59 Re: PostgreSQL 15 release announcement draft