From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
Subject: | RE: Conflict Detection and Resolution |
Date: | 2024-06-18 02:14:16 |
Message-ID: | OS0PR01MB571681062FA5D7306333F7C694CE2@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
On Thursday, June 13, 2024 2:11 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
Hi,
> On Wed, Jun 5, 2024 at 3:32 PM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
> wrote:
> >
> > This time at PGconf.dev[1], we had some discussions regarding this
> > project. The proposed approach is to split the work into two main
> > components. The first part focuses on conflict detection, which aims
> > to identify and report conflicts in logical replication. This feature
> > will enable users to monitor the unexpected conflicts that may occur.
> > The second part involves the actual conflict resolution. Here, we will
> > provide built-in resolutions for each conflict and allow users to
> > choose which resolution will be used for which conflict (as described
> > in the initial email of this thread).
>
> I agree with this direction that we focus on conflict detection (and
> logging) first and then develop conflict resolution on top of that.
Thanks for your reply!
>
> >
> > Of course, we are open to alternative ideas and suggestions, and the
> > strategy above can be changed based on ongoing discussions and
> > feedback received.
> >
> > Here is the patch for the first part of the work, which adds a new parameter
> > detect_conflict for the CREATE and ALTER SUBSCRIPTION commands. This new
> > parameter decides whether the subscription performs conflict detection.
> > By default, conflict detection will be off for a subscription.
> >
> > When conflict detection is enabled, additional logging is triggered in
> > the following conflict scenarios:
> >
> > * The tuple to be updated was previously modified by another origin.
> > * The tuple to be updated is not found.
> > * The tuple to be deleted is not found.
> >
> > While there exist other conflict types in logical replication, such as
> > an incoming insert conflicting with an existing row due to a primary
> > key or unique index, these cases already result in constraint violation errors.
>
> What does detect_conflict being true actually mean to users? I understand that
> detect_conflict being true could introduce some overhead to detect conflicts.
> But in terms of conflict detection, even if detect_conflict is false, we detect
> some conflicts such as concurrent inserts with the same key. Once we
> introduce the complete conflict detection feature, I'm not sure there is a case
> where a user wants to detect only some particular types of conflict.
>
> > Therefore, additional conflict detection for these cases is currently
> > omitted to minimize potential overhead. However, the pre-detection for
> > conflict in these error cases is still essential to support automatic
> > conflict resolution in the future.
>
> I feel that we should log all types of conflict in a uniform way. For example,
> with detect_conflict being true, the update_differ conflict is reported as
> "conflict %s detected on relation "%s"", whereas concurrent inserts with the
> same key is reported as "duplicate key value violates unique constraint "%s"",
> which could confuse users.
Do you mean it's OK to add a pre-check before applying the INSERT, which would
verify whether the remote tuple violates any unique constraints and, if it does,
log a conflict message? I thought about this but was slightly
worried about the extra cost it would bring. OTOH, if we think it's acceptable,
we could do that, since the cost is incurred only when detect_conflict is enabled.
I also thought of logging such a conflict message in PG_CATCH(), but I think we
lack some necessary info (relation, index name, column name) in the catch block.
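To make the pre-check idea concrete, here is a rough toy model in Python, not code from the patch. ToyTable, apply_remote_insert, and the conflict name insert_exists are all hypothetical names used only for illustration, and skipping the row after logging is an arbitrary choice for this sketch. The point is just to show that, with detect_conflict enabled, the unique-index lookup happens once in the pre-check (where relation and index name are still known) and again inside the insert itself, which is the extra cost mentioned above.

```python
from dataclasses import dataclass, field


class UniqueViolation(Exception):
    """Stands in for the constraint-violation error raised by a plain insert."""


@dataclass
class ToyTable:
    name: str
    # Maps a unique index name to the tuple of column names it covers.
    unique_indexes: dict
    rows: list = field(default_factory=list)

    def find_conflict(self, new_row):
        """Return (index_name, existing_row) for the first violated index, else None."""
        for index_name, columns in self.unique_indexes.items():
            key = tuple(new_row[c] for c in columns)
            for row in self.rows:
                if tuple(row[c] for c in columns) == key:
                    return index_name, row
        return None

    def insert(self, new_row):
        """Insert the row, raising UniqueViolation on a duplicate key."""
        conflict = self.find_conflict(new_row)
        if conflict is not None:
            raise UniqueViolation(
                f'duplicate key value violates unique constraint "{conflict[0]}"'
            )
        self.rows.append(dict(new_row))


def apply_remote_insert(table, new_row, detect_conflict, log):
    """Apply a remote INSERT; return True if the row was stored.

    With detect_conflict on, a pre-check runs first and logs a conflict
    message in the uniform format. Skipping the row afterwards is an
    assumption of this sketch, not something decided in the thread.
    """
    if detect_conflict:
        conflict = table.find_conflict(new_row)  # extra lookup: the added cost
        if conflict is not None:
            index_name, _existing = conflict
            log.append(
                f'conflict insert_exists detected on relation "{table.name}" '
                f'(index "{index_name}")'
            )
            return False
    # Without the pre-check, a duplicate surfaces only as a constraint error.
    table.insert(new_row)
    return True
```

With detect_conflict set to False, the same duplicate insert raises UniqueViolation carrying only the "duplicate key value violates unique constraint" text, which mirrors the inconsistency in reporting that was pointed out.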
Best Regards,
Hou zj