Quick Links

Re: Conflict detection for multiple_unique_conflicts in logical replication

From:	Peter Smith <smithpb2250(at)gmail(dot)com>
To:	Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Subject:	Re: Conflict detection for multiple_unique_conflicts in logical replication
Date:	2025-03-12 01:02:35
Message-ID:	CAHut+PseZJ4E=fRG2nh5wqhVLq2Fx82qWFvzjFu7OOgShmtRHw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Nisha,

Thanks for addressing some of my v1 comments. I confirmed they are all
ok. But, I haven't reviewed the v2 again because I still had doubts
about the "stats" question and am waiting to see how that pans out.
Meanwhile, I had a couple more replies below.

On Tue, Feb 25, 2025 at 8:37 PM Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
>
> On Mon, Feb 24, 2025 at 7:39 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Nisha.
> >
> > Some review comments for patch v1-0001
> >
> > ======
> > GENERAL
> >
> > 1.
> > This may be a basic/naive question, but it is unclear to me why we
> > care about the stats of confl_multiple_unique_conflicts?
> >
> > I can understand it would be useful to get multiple conflicts logged
> > at the same time so the user doesn't have to stumble across them one
> > by one when fixing them, but as far as the stats go, why do we care
> > about this stat? Also, since you don't distinguish between multiple
> > insert conflicts versus multiple update conflicts the stat usefulness
> > seemed even more dubious.
> >
> > (because of this question, I skipped reviewing some parts of this
> > patch related to stats)
> >
>
> IMO, tracking multiple_unique_conflicts, like other conflict stats,
> helps users understand their workload better, especially since in
> these cases, stats aren't gathered under either insert_exists or
> update_exists.

When you say "stats aren't gathered" isn't that just your
implementation choice? e.g. If an INSERT fails because it
simultaneously violates multiple (say 5) different constraints I
assumed that could be implemented as conf_insert_exists += 1 or
conf_insert_exists += 5. Isn't it just a matter of documenting the
behaviour?

> I'll wait for others' opinions too on the need for the stats in this case.
>

OK. I'll revisit this thread after I learn the outcome.

> > ~~~
> >
> > 12.
> > +ok( $logfile =~
> > + qr/conflict detected on relation \"public.conf_tab\":
> > conflict=multiple_unique_conflicts*\n.*DETAIL:.* The remote tuple
> > violates multiple unique constraints on the local table./,
> > + 'multiple_unique_conflicts detected during update');
> >
> > (same as #12)
> >
> > Won't it be more useful to also log the column name causing the
> > conflict? Otherwise, if there are 100s of unique columns and just 2 of
> > them are conflicts how will the user know what to look for when fixing
> > it?
> >
>
> The conflicting column details are logged. In the test case, only the
> header line of the DETAIL message is compared to keep it simple.
> For example, the full LOG message will look like -
>
> ERROR: conflict detected on relation "public.conf_tab":
> conflict=multiple_unique_conflicts
> DETAIL: The remote tuple violates multiple unique constraints on the
> local table.
> Key already exists in unique index "conf_tab_pkey", modified
> locally in transaction 757 at 2025-02-25 14:00:56.955403+05:30.
> Key (a)=(2); existing local tuple (2, 2, 2); remote tuple (2, 3, 4).
> Key already exists in unique index "conf_tab_b_key", modified
> locally in transaction 758 at 2025-02-25 14:00:56.957092+05:30.
> Key (b)=(3); existing local tuple (3, 3, 3); remote tuple (2, 3, 4).
> Key already exists in unique index "conf_tab_c_key", modified
> locally in transaction 759 at 2025-02-25 14:00:56.957337+05:30.
> Key (c)=(4); existing local tuple (4, 4, 4); remote tuple (2, 3, 4).

OK, but I think the "keep it simple" approach hides all the
interesting conflict details, which means you can't even tell if the
log emits all the expected conflicts or not. Changing the test regex
to match all those local/remove tuple details in the log would be more
useful.

BTW, when I looked at the
'tmp_check/log/035_multiple_unique_conflicts_subscriber.log' I didn't
understand why there are there 3 separate
"conflict=multiple_unique_conflicts" ERRORs in the log. Since, AFAICT
in the TAP file, there are only 2 failure test cases (e.g. due to
INSERT, and due to UPDATE). Can you explain the difference?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Re: Conflict detection for multiple_unique_conflicts in logical replication at 2025-02-25 09:37:10 from Nisha Moond

Responses

Re: Conflict detection for multiple_unique_conflicts in logical replication at 2025-03-12 09:45:48 from Nisha Moond

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Mihail Nikalayeu	2025-03-12 01:05:50	Re: [BUG?] check_exclusion_or_unique_constraint false negative
Previous Message	Tender Wang	2025-03-12 01:00:22	Re: Question about duplicate JSONTYPE_JSON check