RE: Conflict detection and logging in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: RE: Conflict detection and logging in logical replication
Date: 2024-08-09 06:59:29
Message-ID: OS3PR01MB5718F4CC16B5549F6D15775194BA2@OS3PR01MB5718.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, August 7, 2024 1:24 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Aug 6, 2024 at 1:45 PM Zhijie Hou (Fujitsu)
> > Thanks for the idea! I thought about few styles based on the suggested
> format,
> > what do you think about the following ?
> >
> > ---
> > Version 1
> > ---
> > LOG: CONFLICT: insert_exists; DESCRIPTION: remote INSERT violates
> unique constraint "uniqueindex" on relation "public.test".
> > DETAIL: Existing local tuple (a, b, c) = (2, 3, 4)
> xid=123,origin="pub",timestamp=xxx; remote tuple (a, b, c) = (2, 4, 5).
> >
>
> Won't this case be ERROR? If so, the error message format like the
> above appears odd to me because in some cases, the user may want to
> add some filter based on the error message though that is not ideal.
> Also, the primary error message starts with a small case letter and
> should be short.
>
> > LOG: CONFLICT: update_differ; DESCRIPTION: updating a row with key (a,
> b) = (2, 4) on relation "public.test" was modified by a different source.
> > DETAIL: Existing local tuple (a, b, c) = (2, 3, 4)
> xid=123,origin="pub",timestamp=xxx; remote tuple (a, b, c) = (2, 4, 5).
> >
> > LOG: CONFLICT: update_missing; DESCRIPTION: did not find the row with
> key (a, b) = (2, 4) on "public.test" to update.
> > DETAIL: remote tuple (a, b, c) = (2, 4, 5).
> >
> > ---
> > Version 2
> > It moves most the details to the DETAIL line compared to version 1.
> > ---
> > LOG: CONFLICT: insert_exists on relation "public.test".
> > DETAIL: Key (a)=(1) already exists in unique index "uniqueindex", which
> was modified by origin "pub" in transaction 123 at 2024xxx;
> > Existing local tuple (a, b, c) = (1, 3, 4), remote tuple (a, b, c)
> = (1, 4, 5).
> >
> > LOG: CONFLICT: update_differ on relation "public.test".
> > DETAIL: Updating a row with key (a, b) = (2, 4) that was modified by a
> different origin "pub" in transaction 123 at 2024xxx;
> > Existing local tuple (a, b, c) = (2, 3, 4); remote tuple (a, b, c)
> = (2, 4, 5).
> >
> > LOG: CONFLICT: update_missing on relation "public.test".
> > DETAIL: Did not find the row with key (a, b) = (2, 4) to update;
> > Remote tuple (a, b, c) = (2, 4, 5).
> >
>
> I think we can combine sentences with full stop.
>
> ...
> > ---
> > Version 3
> > It is similar to the style in the current patch, I only added the key value for
> > differ and missing conflicts without outputting the complete
> > remote/local tuple value.
> > ---
> > LOG: conflict insert_exists detected on relation "public.test".
> > DETAIL: Key (a)=(1) already exists in unique index "uniqueindex", which
> was modified by origin "pub" in transaction 123 at 2024xxx.
> >
>
> For ERROR messages this appears suitable.
>
> Considering all the above points, I propose yet another version:
>
> LOG: conflict detected for relation "public.test": conflict=insert_exists
> DETAIL: Key (a)=(1) already exists in unique index "uniqueindex",
> which was modified by the origin "pub" in transaction 123 at 2024xxx.
> Existing local tuple (a, b, c) = (1, 3, 4), remote tuple (a, b, c) =
> (1, 4, 5).
>
> LOG: conflict detected for relation "public.test": conflict=update_differ
> DETAIL: Updating a row with key (a, b) = (2, 4) that was modified by a
> different origin "pub" in transaction 123 at 2024xxx. Existing local
> tuple (a, b, c) = (2, 3, 4); remote tuple (a, b, c) = (2, 4, 5).
>
> LOG: conflict detected for relation "public.test": conflict=update_missing
> DETAIL: Could not find the row with key (a, b) = (2, 4) to update.
> Remote tuple (a, b, c) = (2, 4, 5).

Here is the V12 patch that improved the log format as discussed. I also fixed a
bug in previous version where it reported the wrong column value in the DETAIL
message.

In the latest patch, the DETAIL line comprises two parts: 1. Explanation of the
conflict type, including the tuple value used to search the existing local
tuple provided for update or deletion, or the tuple value causing the unique
constraint violation. 2. Display of the complete existing local tuple and the
remote tuple, if any.

I also addressed Shveta's comments and tried to merge Kuroda-san's changes[2] to
the new codes.

And the 0002(new sub option) patch is removed as discussed. The 0003(stats
collection) patch is also removed temporarily, we can bring it back After
finishing the 0001 work.

[1] https://www.postgresql.org/message-id/CAJpy0uAjJci%2BOtm4ANU0__-2qqhH2cALp8hQw5pBjNZyREF7rg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/TYAPR01MB5692224DB472AA3FA58E1D1AF5B82%40TYAPR01MB5692.jpnprd01.prod.outlook.com

Best Regards,
Hou zj

Attachment Content-Type Size
v12-0001-Detect-and-log-conflicts-in-logical-replication.patch application/octet-stream 68.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-08-09 07:10:08 Re: Logical Replication of sequences
Previous Message shveta malik 2024-08-09 06:43:00 Re: Logical Replication of sequences