Re: Optionally automatically disable logical replication subscriptions on error

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, "Smith, Peter" <peters(at)fast(dot)au(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Optionally automatically disable logical replication subscriptions on error
Date: 2021-06-21 02:53:00
Message-ID: CAA4eK1KHnxT1Qm2MfiUa+0PAdG8h+KJm3XumpU+yzU=Y2gjJqg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 19, 2021 at 8:14 PM Mark Dilger
<mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>
> > On Jun 19, 2021, at 3:17 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > Also, why not also log the xid of the failed
> > transaction?
>
> We could also do that. Reading [1], it seems you are overly focused on user-facing xids. The errdetail in the examples I've been using for testing, and the one mentioned in [1], contain information about the conflicting data. I think users are more likely to understand that a particular primary key value cannot be replicated because it is not unique than to understand that a particular xid cannot be replicated. (Likewise for permissions errors.) For example:
>
> 2021-06-18 16:25:20.139 PDT [56926] ERROR: duplicate key value violates unique constraint "s1_tbl_unique"
> 2021-06-18 16:25:20.139 PDT [56926] DETAIL: Key (i)=(1) already exists.
> 2021-06-18 16:25:20.139 PDT [56926] CONTEXT: COPY tbl, line 2
>
> This tells the user what they need to clean up before they can continue. Telling them which xid tried to apply the change, but not the change itself or the conflict itself, seems rather unhelpful. So at best, the xid is an additional piece of information. I'd rather have both the ERROR and DETAIL fields above and not the xid than have the xid and lack one of those two fields. Even so, I have not yet included the DETAIL field because I didn't want to bloat the catalog.
>

I never said that we don't need the error information. I think we need
the xid along with the other details.

> For the problem in [1], having the xid is more important than it is in my patch, because the user is expected in [1] to use the xid as a handle. But that seems like an odd interface to me. Imagine that a transaction on the publisher side inserted a batch of data, and only a subset of that data conflicts on the subscriber side. What advantage is there in skipping the entire transaction? Wouldn't the user rather skip just the problematic rows?
>

I think skipping some changes but not others can leave the final
transaction data inconsistent. Say a transaction inserts a row and
then updates or deletes that same row. If we skip only the failing
insert, we might silently skip the later update/delete as well, unless
the same row is already present on the subscriber. I think skipping
the entire transaction based on user instruction would be safer than
skipping only the changes that lead to an error.
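
To make the concern concrete, here is a minimal toy sketch in Python.
It is not the apply worker's code. The table is modelled as a dict
(id -> code) with an assumed unique constraint on "code", and the
function names are hypothetical. It assumes an update or delete of a
missing row is ignored without error, as described above.

```python
class UniqueViolation(Exception):
    """Raised when a change would duplicate an id or a unique code."""


def apply_change(table, change):
    """Apply one change to table (dict: id -> code).

    Returns False when an update/delete finds no row, which this toy
    model treats as a silent no-op rather than an error.
    """
    op, row_id, code = change
    if op == "insert":
        if row_id in table or code in table.values():
            raise UniqueViolation(f"({row_id}, {code!r}) already exists")
        table[row_id] = code
        return True
    if op == "update":
        if row_id not in table:
            return False  # row missing: change is silently skipped
        if any(c == code for k, c in table.items() if k != row_id):
            raise UniqueViolation(f"code {code!r} already exists")
        table[row_id] = code
        return True
    if op == "delete":
        return table.pop(row_id, None) is not None
    raise ValueError(f"unknown operation: {op}")


def apply_skipping_failed_changes(table, txn):
    """Hypothetical policy: skip only the changes that raise an error."""
    result = dict(table)
    for change in txn:
        try:
            apply_change(result, change)
        except UniqueViolation:
            continue
    return result


def apply_skipping_whole_transaction(table, txn):
    """Hypothetical policy: any error discards the entire transaction."""
    result = dict(table)
    try:
        for change in txn:
            apply_change(result, change)
    except UniqueViolation:
        return dict(table)
    return result


# Subscriber already holds a row whose code collides with the insert.
subscriber = {10: "X"}
# On an empty publisher this transaction ends as {1: "Y", 2: "Z"}.
txn = [("insert", 1, "X"), ("update", 1, "Y"), ("insert", 2, "Z")]

partial = apply_skipping_failed_changes(subscriber, txn)
atomic = apply_skipping_whole_transaction(subscriber, txn)
print(partial)  # {10: 'X', 2: 'Z'}
print(atomic)   # {10: 'X'}
```

With per-change skipping, row 2 arrives but row 1 is lost entirely,
even though its final value "Y" would not have conflicted. Skipping
the whole transaction leaves the subscriber unchanged, so the
transaction is applied either completely or not at all.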

--
With Regards,
Amit Kapila.
