Re: Conflict detection and logging in logical replication

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Conflict detection and logging in logical replication
Date: 2024-07-19 08:36:34
Message-ID: CAJpy0uDGJXdVCGoaRHP-5G0pL0zhuZaRJSqxOxs=CNsSwc+SJQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 18, 2024 at 7:52 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Attach the V5 patch set which changed the following.

Thanks for the patch. Tested that previous reported issues are fixed.
Please have a look at below scenario, I was expecting it to raise
'update_differ' but it raised both 'update_differ' and 'delete_differ'
together:

-------------------------
Pub:
create table tab (a int not null, b int primary key);
create publication pub1 for table tab;

Sub (partitioned table):
create table tab (a int not null, b int primary key) partition by
range (b);
create table tab_1 partition of tab for values from (minvalue) to
(100);
create table tab_2 partition of tab for values from (100) to
(maxvalue);
create subscription sub1 connection '.......' publication pub1 WITH
(detect_conflict=true);

Pub - insert into tab values (1,1);
Sub - update tab set b=1000 where a=1;
Pub - update tab set b=1000 where a=1; -->update_missing detected
correctly as b=1 will not be found on sub.
Pub - update tab set b=1 where b=1000; -->update_differ expected, but
it gives both update_differ and delete_differ.
-------------------------

Few trivial comments:

1)
Commit msg:
For insert_exists conflict, the log can include origin and commit
timestamp details of the conflicting key with track_commit_timestamp
enabled.

--Please add update_exists as well.

2)
execReplication.c:
Return false if there is no conflict and *conflictslot is set to NULL.

--This gives a feeling that this function will return false if both
the conditions are true. But instead first one is the condition, while
the second is action. Better to rephrase to:

Returns false if there is no conflict. Sets *conflictslot to NULL in
such a case.
Or
Sets *conflictslot to NULL and returns false in case of no conflict.

3)
FindConflictTuple() shares some code parts with
RelationFindReplTupleByIndex() and RelationFindReplTupleSeq() for
checking status in 'res'. Is it worth making a function to be used in
all three.

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2024-07-19 09:41:38 Re: July Commitfest: Entries Needing Review
Previous Message Dean Rasheed 2024-07-19 08:30:07 Re: Removing unneeded self joins