From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
Subject: | Re: Conflict Detection and Resolution |
Date: | 2024-06-13 06:11:21 |
Message-ID: | CAD21AoAa6JzqhXY02uNUPb-aTozu2RY9nMdD1=TUh+FpskkYtw@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Jun 5, 2024 at 3:32 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Hi,
>
> This time at PGconf.dev[1], we had some discussions regarding this
> project. The proposed approach is to split the work into two main
> components. The first part focuses on conflict detection, which aims to
> identify and report conflicts in logical replication. This feature will
> enable users to monitor the unexpected conflicts that may occur. The
> second part involves the actual conflict resolution. Here, we will provide
> built-in resolutions for each conflict and allow user to choose which
> resolution will be used for which conflict(as described in the initial
> email of this thread).
I agree with this direction that we focus on conflict detection (and
logging) first and then develop conflict resolution on top of that.
>
> Of course, we are open to alternative ideas and suggestions, and the
> strategy above can be changed based on ongoing discussions and feedback
> received.
>
> Here is the patch of the first part work, which adds a new parameter
> detect_conflict for CREATE and ALTER subscription commands. This new
> parameter will decide if subscription will go for conflict detection. By
> default, conflict detection will be off for a subscription.
>
> When conflict detection is enabled, additional logging is triggered in the
> following conflict scenarios:
>
> * updating a row that was previously modified by another origin.
> * The tuple to be updated is not found.
> * The tuple to be deleted is not found.
>
> While there exist other conflict types in logical replication, such as an
> incoming insert conflicting with an existing row due to a primary key or
> unique index, these cases already result in constraint violation errors.
What does detect_conflict being true actually mean to users? I
understand that enabling detect_conflict could introduce some
overhead to detect conflicts. But even if detect_conflict is false,
we still detect some conflicts, such as concurrent inserts with the
same key. Once we introduce the complete conflict detection feature,
I'm not sure there is a case where a user wants to detect only some
particular types of conflict.
> Therefore, additional conflict detection for these cases is currently
> omitted to minimize potential overhead. However, the pre-detection for
> conflict in these error cases is still essential to support automatic
> conflict resolution in the future.
I feel that we should log all types of conflict in a uniform way. For
example, with detect_conflict being true, the update_differ conflict
is reported as "conflict %s detected on relation "%s"", whereas
concurrent inserts with the same key are reported as "duplicate key
value violates unique constraint "%s"", which could confuse users.
Ideally, I think we should log such conflict details (table name,
column name, conflict type, etc.) somewhere (e.g. a table or the
server logs) so that users can resolve them manually.
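
To illustrate what reporting in a uniform way could look like, here is
a minimal sketch in Python. It is purely illustrative and not
PostgreSQL code. Only update_differ and the "conflict %s detected on
relation "%s"" message shape come from the discussion above; the other
conflict type names, the ConflictRecord fields, and format_conflict()
are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ConflictType(Enum):
    UPDATE_DIFFER = "update_differ"    # name taken from the discussion
    UPDATE_MISSING = "update_missing"  # hypothetical: tuple to update not found
    DELETE_MISSING = "delete_missing"  # hypothetical: tuple to delete not found
    INSERT_EXISTS = "insert_exists"    # hypothetical: insert hits an existing key


@dataclass(frozen=True)
class ConflictRecord:
    """One detected conflict, carrying the details a user needs to resolve it."""
    conflict_type: ConflictType
    relation: str
    column: Optional[str] = None
    detail: Optional[str] = None


def format_conflict(rec: ConflictRecord) -> str:
    """Render every conflict type with the same message shape."""
    msg = f'conflict {rec.conflict_type.value} detected on relation "{rec.relation}"'
    extras = []
    if rec.column is not None:
        extras.append(f'column "{rec.column}"')
    if rec.detail is not None:
        extras.append(rec.detail)
    if extras:
        msg += " (" + ", ".join(extras) + ")"
    return msg


if __name__ == "__main__":
    print(format_conflict(ConflictRecord(ConflictType.UPDATE_DIFFER, "public.t")))
    print(format_conflict(
        ConflictRecord(ConflictType.INSERT_EXISTS, "public.t", column="id")))
```

With one record type like this, a duplicate-key insert and an
update_differ conflict would share the same message prefix, and the
same record could be written either to the server log or to a table.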
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com