Re: Conflict detection and logging in logical replication

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Conflict detection and logging in logical replication
Date: 2024-07-11 05:03:02
Message-ID: CAJpy0uBP8ubVix9vHhJiHOt37mbc2fsET6AdiRHq_nH_TGP1mA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 11, 2024 at 7:47 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, July 10, 2024 5:39 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Wed, Jul 3, 2024 at 8:31 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
> > wrote:
> > >
> > > On Wednesday, June 26, 2024 10:58 AM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > > >
> > >
> > > Hi,
> > >
> > > As suggested by Sawada-san in another thread[1].
> > >
> > > I am attaching the V4 patch set which tracks the delete_differ
> > > conflict in logical replication.
> > >
> > > delete_differ means that the replicated DELETE is deleting a row that
> > > was modified by a different origin.
> > >
> >
> > Thanks for the patch. I am still in process of review but please find few
> > comments:
>
> Thanks for the comments!
>
> > 1) When I try to *insert* primary/unique key on pub, which already exists on
> > sub, conflict gets detected. But when I try to *update* primary/unique key to a
> > value on pub which already exists on sub, conflict is not detected. I get the
> > error:
> >
> > 2024-07-10 14:21:09.976 IST [647678] ERROR: duplicate key value violates
> > unique constraint "t1_pkey"
> > 2024-07-10 14:21:09.976 IST [647678] DETAIL: Key (pk)=(4) already exists.
>
> Yes, I think the detection of this conflict is not added with the
> intention to control the size of the patch in the first version.
>
> >
> > This is because such conflict detection needs detection of constraint violation
> > using the *new value* rather than *existing* value during UPDATE. INSERT
> > conflict detection takes care of this case i.e. the columns of incoming row are
> > considered as new values and it tries to see if all unique indexes are okay to
> > digest such new values (all incoming columns) but update's logic is different.
> > It searches based on oldTuple *only* and thus above detection is missing.
>
> I think the logic is the same if we want to detect the unique violation
> for UDPATE, we need to check if the new value of the UPDATE violates any
> unique constraints same as the detection of insert_exists (e.g. check
> the conflict around ExecInsertIndexTuples())
>
> >
> > Shall we support such detection? If not, is it worth docuementing?
>
> I am personally OK to support this detection.

+1. I think it should not be a complex or too big change.

> And
> I think it's already documented that we only detect unique violation for
> insert which mean update conflict is not detected.
>
> > 2)
> > Another case which might confuse user:
> >
> > CREATE TABLE t1 (pk integer primary key, val1 integer, val2 integer);
> >
> > On PUB: insert into t1 values(1,10,10); insert into t1 values(2,20,20);
> >
> > On SUB: update t1 set pk=3 where pk=2;
> >
> > Data on PUB: {1,10,10}, {2,20,20}
> > Data on SUB: {1,10,10}, {3,20,20}
> >
> > Now on PUB: update t1 set val1=200 where val1=20;
> >
> > On Sub, I get this:
> > 2024-07-10 14:44:00.160 IST [648287] LOG: conflict update_missing detected
> > on relation "public.t1"
> > 2024-07-10 14:44:00.160 IST [648287] DETAIL: Did not find the row to be
> > updated.
> > 2024-07-10 14:44:00.160 IST [648287] CONTEXT: processing remote data for
> > replication origin "pg_16389" during message type "UPDATE" for replication
> > target relation "public.t1" in transaction 760, finished at 0/156D658
> >
> > To user, it could be quite confusing, as val1=20 exists on sub but still he gets
> > update_missing conflict and the 'DETAIL' is not sufficient to give the clarity. I
> > think on HEAD as well (have not tested), we will get same behavior i.e. update
> > will be ignored as we make search based on RI (pk in this case). So we are not
> > worsening the situation, but now since we are detecting conflict, is it possible
> > to give better details in 'DETAIL' section indicating what is actually missing?
>
> I think It's doable to report the row value that cannot be found in the local
> relation, but the concern is the potential risk of exposing some
> sensitive data in the log. This may be OK, as we are already reporting the
> key value for constraints violation, so if others also agree, we can add
> the row value in the DETAIL as well.

Okay, let's see what others say. JFYI, the same situation holds valid
for delete_missing case.

I have one concern about how we deal with conflicts. As for
insert_exists, we keep on erroring out while raising conflict, until
it is manually resolved:
ERROR: conflict insert_exists detected

But for other cases, we just log conflict and either skip or apply the
operation. I
LOG: conflict update_differ detected
DETAIL: Updating a row that was modified by a different origin

I know that it is no different than HEAD. But now since we are logging
conflicts explicitly, we should call out default behavior on each
conflict. I see some incomplete and scattered info in '31.5.
Conflicts' section saying that:
"When replicating UPDATE or DELETE operations, missing data will not
produce a conflict and such operations will simply be skipped."
(lets say it as pt a)

Also some more info in a later section saying (pt b):
:A conflict will produce an error and will stop the replication; it
must be resolved manually by the user."

My suggestions:
1) in point a above, shall we have:
missing data or differing data (i.e. somehow reword to accommodate
update_differ and delete_differ cases)

2) Now since we have a section explaining conflicts detected and
logged with detect_conflict=true, shall we mention default behaviour
with each?

insert_exists: error will be raised until resolved manually.
update_differ: update will be applied
update_missing: update will be skipped
delete_missing: delete will be skipped
delete_differ: delete will be applied.

thanks
Shveta

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message cca5507 2024-07-11 06:08:32 Redundant syscache access in get_rel_sync_entry()
Previous Message Michael Paquier 2024-07-11 04:58:19 Re: relfilenode statistics