Re: Conflict detection and logging in logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: Conflict detection and logging in logical replication
Date: 2024-08-16 06:49:14
Message-ID: CAA4eK1LHb_g-08WfhhW1XGNvmqopxU8Aecb2Z5r8ukjwOO8MSA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 16, 2024 at 11:48 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Fri, Aug 16, 2024 at 10:46 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > 3)
> > For update_exists, we dump:
> > Key (a, b)=(2, 1)
> >
> > For delete_missing, update_missing, update_differ, we dump:
> > Replica identity (a, b)=(2, 1).
> >
> > For update_exists as well, shouldn't we dump 'Replica identity'? Only
> > in the insert case should it be referred to as 'Key'.
> >
>
> On rethinking, is it because in the update_exists case the 'Key' dumped
> is not the one used to search for the row to be updated? Instead it is
> the one used to search for the conflicting row. Unlike update_differ,
> the row to be updated and the conflicting row are different in the
> update_exists case. I earlier thought that the 'Key' and 'Existing local
> tuple' dumped always belong to the row currently being
> updated/deleted/inserted. But for update_exists, that is not the
> case. We are dumping 'Existing local tuple' and 'Key' for the row
> that is conflicting and not the one being updated. Example:
>
> ERROR: conflict detected on relation "public.tab_1": conflict=update_exists
> Key (a, b)=(2, 1); existing local tuple (2, 1); remote tuple (2, 1).
>
> Operations performed were:
> Pub: insert into tab values (1,1);
> Sub: insert into tab values (2,1);
> Pub: update tab set a=2 where a=1;
>
> Here Key and local tuple are both (2, 1) instead of (1, 1), while the
> replica identity value (used to search for the original row) is (1, 1).
>
> It may be slightly confusing, or at least tricky to understand, when
> compared to the logs of other conflicts. But I am not sure what better
> we can do here.
>

The update_exists conflict behaves more like insert_exists, as we detect
it only while inserting into the index. It is also not clear to me
whether we can do better than to clarify this in the docs.
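
To make the distinction concrete, here is a minimal sketch in Python of
the scenario quoted above. It is a toy model, not PostgreSQL code: the
function apply_remote_update, the dictionary field names, and the use of
a set of (a, b) tuples to stand in for a table with a unique index on
(a, b) are all assumptions made for illustration. It only shows that the
replica identity locates the row to be updated, while the reported Key
comes from the row found when the new tuple is inserted into the index.

```python
# Toy model of a subscriber table with a unique index on (a, b).
# This is not PostgreSQL code; every name here is hypothetical.

def apply_remote_update(table, replica_identity, new_tuple):
    """Apply a replicated UPDATE to `table`, a set of (a, b) tuples.

    Returns a (conflict, detail) pair; conflict is None on success.
    """
    # Step 1: locate the row to update using the replica identity,
    # that is, the old key values sent by the publisher.
    if replica_identity not in table:
        return "update_missing", {"replica_identity": replica_identity}

    # Step 2: the new tuple is checked against the unique index when it
    # is inserted, the same point at which an INSERT finds a duplicate.
    if new_tuple != replica_identity and new_tuple in table:
        return "update_exists", {
            "key": new_tuple,                   # key of the conflicting row
            "existing_local_tuple": new_tuple,  # the conflicting row itself
            "remote_tuple": new_tuple,          # the tuple being applied
        }

    # No conflict: replace the old row with the new one.
    table.remove(replica_identity)
    table.add(new_tuple)
    return None, {}


# Subscriber state after "Pub: insert (1,1)" has been replicated and
# "Sub: insert (2,1)" has been executed locally.
subscriber = {(1, 1), (2, 1)}

# "Pub: update tab set a=2 where a=1" arrives as old=(1,1), new=(2,1).
conflict, detail = apply_remote_update(subscriber, (1, 1), (2, 1))
print(conflict, detail)
```

In this model the row being updated is (1, 1), but every value in the
reported detail is (2, 1), which matches the log line in the example.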

> --------------------
>
> One more comment:
>
> 5)
> For insert/update_exists, the sequence is:
> Key .. ; existing local tuple .. ; remote tuple ...
>
> For the rest of the conflicts, the sequence is:
> Existing local tuple .. ; remote tuple .. ; replica identity ..
>
> Is it intentional? Should 'Key' or 'Replica Identity' come first in
> all conflicts?
>

This is worth considering, but Replica Identity signifies the old tuple
values, which is probably why it is kept at the end. But let's see what
Hou-San or others think about this.

--
With Regards,
Amit Kapila.
