Re: Conflict detection and logging in logical replication

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Conflict detection and logging in logical replication
Date: 2024-08-19 06:23:50
Message-ID: CAJpy0uAbQbStSyBG1w9aXH93p6nkXQVhBbH9czyyW3v1+DFVvw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 19, 2024 at 11:37 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Aug 19, 2024 at 9:08 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Sun, Aug 18, 2024 at 2:27 PM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > Attach the V16 patch which addressed the comments we agreed on.
> > > I will add a doc patch to explain the log format after the 0001 is RFC.
> > >
> >
> > Thank You for addressing comments. Please see this scenario:
> >
> > create table tab1(pk int primary key, val1 int unique, val2 int);
> >
> > pub: insert into tab1 values(1,1,1);
> > sub: insert into tab1 values(2,2,3);
> > pub: update tab1 set val1=2 where pk=1;
> >
> > Wrong 'replica identity' column logged? shouldn't it be pk?
> >
> > ERROR: conflict detected on relation "public.tab1": conflict=update_exists
> > DETAIL: Key already exists in unique index "tab1_val1_key", modified
> > locally in transaction 801 at 2024-08-19 08:50:47.974815+05:30.
> > Key (val1)=(2); existing local tuple (2, 2, 3); remote tuple (1, 2,
> > 1); replica identity (val1)=(1).
> >
>
> The docs say that by default replica identity is primary_key [1] (see
> REPLICA IDENTITY),

Yes, I agree. But the point of logging it here was also to know the
value of the RI, which identifies the row being updated rather than
the row it conflicts with. The value is logged correctly.

> [2] (see pg_class.relreplident). So, using the same
> format to display PK seems reasonable. I don't think adding additional
> code to distinguish these two cases in the LOG message is worth it.

I don't see any additional code added for this case, other than the
existing logic being reused for update_exists.

> We
> can always change such things later if that is what users and or
> others prefer.
>

Sure. If fixing this issue (where we report the wrong column name)
needs additional logic, then I am okay with skipping it for the time
being. We can address it later if/when needed.
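
For illustration, here is a small standalone sketch of the labeling being
asked for in the scenario above. It is a toy model, not PostgreSQL code and
not taken from the patch: the function names, arguments and dictionaries are
all hypothetical. It assumes the conflicting key is labeled with the columns
of the violated unique index and the replica identity with its own columns
(here the primary key).

```python
# Toy model only: none of these names come from PostgreSQL or the patch.

def describe_key(columns, row):
    """Render index columns and their values as '(col, ...)=(val, ...)'."""
    names = ", ".join(columns)
    values = ", ".join(str(row[col]) for col in columns)
    return f"({names})=({values})"


def update_exists_detail(conflict_index_cols, replica_identity_cols,
                         local_row, remote_new_row, remote_old_key):
    # The conflicting key is labeled with the unique index that was violated.
    key = describe_key(conflict_index_cols, remote_new_row)
    # The replica identity is labeled with its own columns (assumed: the PK).
    identity = describe_key(replica_identity_cols, remote_old_key)
    return (f"Key {key}; existing local tuple {tuple(local_row.values())}; "
            f"remote tuple {tuple(remote_new_row.values())}; "
            f"replica identity {identity}.")


# Values taken from the tab1 scenario in this mail.
detail = update_exists_detail(
    conflict_index_cols=("val1",),
    replica_identity_cols=("pk",),
    local_row={"pk": 2, "val1": 2, "val2": 3},
    remote_new_row={"pk": 1, "val1": 2, "val2": 1},
    remote_old_key={"pk": 1},
)
print(detail)
```

With the scenario's values, the last part of the string reads
"replica identity (pk)=(1)", where the logged message shows
"replica identity (val1)=(1)".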

thanks
Shveta
