Re: Conflict Detection and Resolution

From: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Conflict Detection and Resolution
Date: 2024-08-27 08:36:32
Message-ID: CABdArM43epV60iedut8Ee7CoM1nYq+AM0ZCejb_0ne3YV2JobQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 23, 2024 at 10:39 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Thu, Aug 22, 2024 at 3:44 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> >
> > For clock-skew and timestamp based resolution, if needed, I will post
> > another email for the design items where suggestions are needed.
> >
>
> Please find issues which need some thoughts and approval for
> time-based resolution and clock-skew.
>
> 1)
> Time based conflict resolution and two phase transactions:
>
> Time based conflict resolution (last_update_wins) is the one
> resolution which will not result in data-divergence considering
> clock-skew is taken care of. But when it comes to two-phase
> transactions, it might not be the case. For two-phase transaction, we
> do not have commit timestamp when the changes are being applied. Thus
> for time-based comparison, initially it was decided to user prepare
> timestamp but it may result in data-divergence. Please see the
> example at [1].
>
> Example at [1] is a tricky situation, and thus in the initial draft,
> we decided to restrict usage of 2pc and CDR together. The plan is:
>
> a) During Create subscription, if the user has given last_update_wins
> resolver for any conflict_type and 'two_phase' is also enabled, we
> ERROR out.
> b) During Alter subscription, if the user tries to update resolver to
> 'last_update_wins' but 'two_phase' is enabled, we error out.
>
> Another solution could be to save both prepare_ts and commit_ts. And
> when any txn comes for conflict resolution, we first check if
> prepare_ts is available, use that else use commit_ts. Availability of
> prepare_ts would indicate it was a prepared txn and thus even if it is
> committed, we should use prepare_ts for comparison for consistency.
> This will have some overhead of storing prepare_ts along with
> commit_ts. But if the number of prepared txns are reasonably small,
> this overhead should be less.
>
> We currently plan to go with restricting 2pc and last_update_wins
> together, unless others have different opinions.
>
> ~~
>
> 2)
> parallel apply worker and conflict-resolution:
> As discussed in [2] (see last paragraph in [2]), for streaming of
> in-progress transactions by parallel worker, we do not have
> commit-timestamp with each change and thus it makes sense to disable
> parallel apply worker with CDR. The plan is to not start parallel
> apply worker if 'last_update_wins' is configured for any
> conflict_type.
>
> ~~
>
> 3)
> parallel apply worker and clock skew management:
> Regarding clock-skew management as discussed in [3], we will wait for
> the local clock to come within tolerable range during 'begin' rather
> than before 'commit'. And this wait needs commit-timestamp in the
> beginning, thus we plan to restrict starting pa-worker even when
> clock-skew related GUCs are configured.
>
> Earlier we had restricted both 2pc and parallel worker worker start
> when detect_conflict was enabled, but now since detect_conflict
> parameter is removed, we will change the implementation to restrict
> all 3 above cases when last_update_wins is configured. When the
> changes are done, we will post the patch.
>
> ~~
>
> 4)
> <not related to timestamp and clock skew>
> Earlier when 'detect_conflict' was enabled, we were giving WARNING if
> 'track_commit_timestamp' was not enabled. This was during CREATE and
> ALTER subscription. Now with this parameter removed, this WARNING has
> also been removed. But I think we need to bring back this WARNING.
> Currently default resolvers set may work without
> 'track_commit_timestamp' but when user gives CONFLICT RESOLVER in
> create-sub or alter-sub explicitly making them configured to
> non-default values (or say any values, does not matter if few are
> defaults), we may still emit this warning to alert user:
>
> 2024-07-26 09:14:03.152 IST [195415] WARNING: conflict detection
> could be incomplete due to disabled track_commit_timestamp
> 2024-07-26 09:14:03.152 IST [195415] DETAIL: Conflicts update_differ
> and delete_differ cannot be detected, and the origin and commit
> timestamp for the local row will not be logged.
>
> Thoughts?
>
> If we emit this WARNING during each resolution, then it may flood our
> log files, thus it seems better to emit it during create or alter
> subscription instead of during resolution.
>
>
Done.
v10 has implemented the suggested warning when a user gives CONFLICT
RESOLVER in create-sub or alter-sub explicitly.

Thanks,
Nisha

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2024-08-27 08:43:45 Re: New function normal_rand_array function to contrib/tablefunc.
Previous Message Nisha Moond 2024-08-27 08:32:08 Re: Conflict Detection and Resolution