RE: Conflict detection and logging in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: RE: Conflict detection and logging in logical replication
Date: 2024-06-24 02:09:27
Message-ID: OS0PR01MB57161006B8F2779F2C97318194D42@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Friday, June 21, 2024 3:47 PM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> - The detail of the conflict detection
>
> We add a new parameter, detect_conflict, for the CREATE SUBSCRIPTION and
> ALTER SUBSCRIPTION commands. This parameter decides whether the
> subscription performs conflict detection. By default, conflict detection
> is off for a subscription.
>
> When conflict detection is enabled, additional logging is triggered in the
> following conflict scenarios:
> insert_exists: Inserting a row that violates a NOT DEFERRABLE unique
> constraint.
> update_differ: Updating a row that was previously modified by another origin.
> update_missing: The tuple to be updated is missing.
> delete_missing: The tuple to be deleted is missing.
>
> For an insert_exists conflict, the log can include the origin and commit
> timestamp of the conflicting key when track_commit_timestamp is enabled.
> An update_differ conflict can be detected only when
> track_commit_timestamp is enabled.
>
> Regarding insert_exists conflicts, the current design is to pass
> noDupErr=true in ExecInsertIndexTuples() to prevent immediate error
> handling on duplicate key violation. After calling
> ExecInsertIndexTuples(), if there was any potential conflict in the
> unique indexes, we report an ERROR for the insert_exists conflict along
> with additional information (origin, commit timestamp, key value) for the
> conflicting row. An alternative is to pre-check for a duplicate key
> violation before applying the INSERT operation, but this could add
> overhead to every INSERT even in the absence of conflicts.
> We welcome any alternative viewpoints on this matter.
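
To make the four conflict types above concrete, here is a rough sketch of
how they could be told apart. It is a toy model in Python, not code from
the patch: classify(), LocalRow and their parameters are hypothetical names
chosen for illustration, and "another origin" is assumed to mean an origin
different from that of the incoming change.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Conflict(Enum):
    INSERT_EXISTS = "insert_exists"
    UPDATE_DIFFER = "update_differ"
    UPDATE_MISSING = "update_missing"
    DELETE_MISSING = "delete_missing"


@dataclass
class LocalRow:
    # Origin that last modified the local row (hypothetical field).
    origin: str


def classify(op: str, local_row: Optional[LocalRow], remote_origin: str,
             track_commit_timestamp: bool) -> Optional[Conflict]:
    """Return the conflict type for one remote change, or None.

    For INSERT, local_row stands for an existing row with the same
    unique key; for UPDATE/DELETE, it is the row the change targets.
    """
    if op not in ("INSERT", "UPDATE", "DELETE"):
        raise ValueError(f"unsupported operation: {op}")
    if op == "INSERT":
        return Conflict.INSERT_EXISTS if local_row is not None else None
    if local_row is None:
        return (Conflict.UPDATE_MISSING if op == "UPDATE"
                else Conflict.DELETE_MISSING)
    # Origin information is only available with track_commit_timestamp,
    # so update_differ cannot be detected without it.
    if (op == "UPDATE" and track_commit_timestamp
            and local_row.origin != remote_origin):
        return Conflict.UPDATE_DIFFER
    return None
```

With track_commit_timestamp set to False in this model, an UPDATE on a row
last modified by a different origin is applied without being classified,
which mirrors the restriction stated for update_differ.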

While testing the patch, I noticed a bug: when reporting the conflict
after calling ExecInsertIndexTuples(), we might find the tuple that we
just inserted and report it. (We should report a conflict only if there
are other conflicting tuples that were not inserted by us.) Here is a new
patch that fixes this and also fixes a compile warning reported by CFbot.
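
The self-match problem can be pictured with a small toy model. This is only
a sketch in Python, not the patch's C code: the dictionary stands in for a
unique index, and IndexEntry, insert_no_dup_err() and find_conflict() are
hypothetical names. The point it illustrates is that the post-insert lookup
has to skip the entry we just added ourselves.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class IndexEntry:
    tid: int        # stand-in for a tuple identifier
    origin: str     # origin that wrote the tuple
    commit_ts: str  # commit timestamp, kept as text for simplicity


Index = Dict[str, List[IndexEntry]]


def insert_no_dup_err(index: Index, key: str, entry: IndexEntry) -> bool:
    """Add the entry without raising on a duplicate key.

    Returns True if the key was already present, i.e. there is a
    potential conflict that the caller should look into.
    """
    potential_conflict = bool(index.get(key))
    index.setdefault(key, []).append(entry)
    return potential_conflict


def find_conflict(index: Index, key: str, own_tid: int) -> Optional[IndexEntry]:
    """Return a conflicting entry for key, ignoring our own tuple."""
    for entry in index.get(key, []):
        if entry.tid != own_tid:
            return entry
    return None
```

Without the own_tid check, a lookup on the key right after the insert would
always find something, so every insert would look like a conflict with
itself.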

Best Regards,
Hou zj

Attachments:
v2-0002-Collect-statistics-about-conflicts-in-logical-rep.patch (application/octet-stream, 19.1 KB)
v2-0001-Detect-and-log-conflicts-in-logical-replication.patch (application/octet-stream, 89.1 KB)
