Re: Conflict detection for multiple_unique_conflicts in logical replication

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Subject: Re: Conflict detection for multiple_unique_conflicts in logical replication
Date: 2025-03-11 05:40:12
Message-ID: CAFiTN-vxAo543mMG_xNr4rwPbydRN0T4UsfExGeTDjmX1Rq0Ww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 20, 2025 at 5:01 PM Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
>
> Hi Hackers,
> (CCing people involved in related discussions)
>
> I am starting this thread to propose a new conflict detection type
> "multiple_unique_conflicts" that identifies when an incoming row
> during logical replication violates more than one UNIQUE constraint.
> If multiple constraints (such as the primary key/replica identity
> along with additional unique indexes) are violated for the same
> incoming tuple, the apply_worker will trigger a dedicated
> multiple_unique_conflicts conflict.
>
> ----
> ISSUE:
> ----
> Currently, when the apply_worker encounters an 'insert_exists'
> conflict (where the incoming row conflicts with an existing row based
> on the replica identity), it logs the conflict and errors out
> immediately. If the user tries to resolve this manually by deleting
> the conflicting row, the apply worker may fail again due to another
> unique constraint violation on a different column. This forces users
> to fix conflicts one by one, making resolution tedious and
> inefficient.
>
> Example:
> Schema:
> CREATE TABLE tab1 (col1 integer PRIMARY KEY, col2 integer UNIQUE, col3
> integer UNIQUE);
> - col1 is Replica Identity.
>
> Data:
> - on pub: (1, 11, 111)
> - on sub: 3 additional local Inserts: (2, 22, 222); (3, 33, 333); (4, 44, 444)
> - Concurrently on pub, new insert: (2, 33, 444)
>
> When the new incoming tuple (2, 33, 444) is applied on the subscriber:
> - The apply worker first detects an 'insert_exists' conflict on col1
> (primary key) and errors out.
> - If the user deletes the conflicting row (key col1=2) : (2, 22,
> 222), the apply worker fails again because col2=33 violates another
> unique constraint for tuple (3, 33, 333);
> - If the user deletes col2=33, the apply worker fails yet again due
> to tuple (4, 44, 444) because col3=444 is also unique.
> Conflicts on both col2 and col3 (which are independent of each other)
> are an example of a 'Multiple Unique Constraints' violation.
>
> In such cases, users are forced to resolve conflicts one by one,
> making the process slow and error-prone.
>
> ---
> SOLUTION:
> ---
> During an INSERT or UPDATE conflict check, instead of stopping at the
> first encountered conflict, the apply_worker will now check all unique
> indexes before reporting a conflict. If multiple unique key violations
> are found, it will report a 'multiple_unique_conflicts' conflict,
> listing all conflicting tuples in the logs. If only a single key
> conflict is detected, the existing 'insert_exists' conflict will be
> raised as it is now.

I think it makes sense to report all the unique key conflicts for a
single row at once, rather than stopping after the first one. However,
I don't understand the need to create a new conflict type. Can't we
report multiple conflicts for the row using the existing conflict
type, if different columns in the incoming row conflict with different
existing rows?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Junwang Zhao 2025-03-11 06:06:41 Re: SQL Property Graph Queries (SQL/PGQ)
Previous Message Naga Appani 2025-03-11 05:15:28 Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring