Re: Conflict Detection and Resolution

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Subject: Re: Conflict Detection and Resolution
Date: 2024-07-03 10:42:12
Message-ID: CAFiTN-uoJnc-CAnQEn-p+UYP=V60yeryEkfr_ECf0hW9rXvrhA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 3, 2024 at 4:02 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Wed, Jul 3, 2024 at 11:29 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Wed, Jul 3, 2024 at 11:00 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > > Yes, I also think it should be independent of CDR. IMHO, it should be
> > > > based on the user-configured maximum clock skew tolerance and can be
> > > > independent of CDR.
> > >
> > > +1
> > >
> > > > IIUC we would make the remote apply wait just
> > > > before committing if the remote commit timestamp is ahead of the local
> > > > clock by more than the maximum clock skew tolerance, is that correct?
> > >
> > > +1 on condition to wait.
> > >
> > > But I think we should make apply worker wait during begin
> > > (apply_handle_begin) instead of commit. It makes more sense to delay
> > > the entire operation to manage clock-skew rather than the commit
> > > alone. And only then can CDR's timestamp-based resolutions, which
> > > happen well before the commit stage, benefit from this. Thoughts?
> >
> > But do we really need to wait at apply_handle_begin()? I mean, if we
> > already know the commit_ts then we can perform the conflict resolution,
> > no?
>
> I would like to highlight one point here: the resultant data may
> differ depending on the stage (begin or commit) at which we decide
> to wait. Example:
>
> --max_clock_skew set to 0 i.e. no tolerance for clock skew.
> --Remote Update with commit_timestamp = 10:20 AM.
> --Local clock (which is, say, 5 min behind) shows 10:15 AM.
>
> Case 1: Wait during Begin:
> When the remote update arrives at the local node, the apply worker waits
> till the local clock hits 'remote's commit_ts - max_clock_skew', i.e. till
> 10:20 AM. In the meantime (during the apply worker's wait period), if
> some local update on the same row happens at, say, 10:18 AM (local
> clock), that will be applied first. Now when the apply worker's wait is
> over, it will detect an 'update_differ' conflict and, as per
> 'last_update_wins', the remote tuple will win as 10:20 is later than
> 10:18.
>
> Case 2: Wait during Commit:
> When the remote update arrives at the local node, it finds no conflict and
> goes for commit. But before commit, it waits till the local clock hits
> 10:20 AM. In the meantime (during the apply worker's wait period), if
> some local update tries to update the same row at, say, 10:18, it
> has to wait (due to locks taken by the remote update on that row) and
> the remote tuple will get committed first with a commit timestamp of 10:20.
> Then the local update will proceed and overwrite the remote tuple.
>
> So in case 1 the remote tuple is the final change, while in case 2 the
> local tuple is the final change.

Got it. But which case is correct? I think both are. In case 1 the
local commit's commit_ts is 10:18 and the remote commit's commit_ts is
10:20, so the remote apply wins. In case 2 the remote commit's commit_ts
is 10:20, whereas the local commit's commit_ts must be 10:20 + delta
(because it waited for the remote transaction to commit), so the local
update wins.
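
To make the comparison concrete, here is a minimal sketch of both cases
using the numbers from the example. It is only a toy model, not
apply-worker code: times are minutes past 10:00, `delta` is an assumed
lock-wait delay, and all function names are hypothetical.

```python
# Toy model of the two wait points. Times are minutes past 10:00.
# All names are illustrative; none of them are PostgreSQL functions.

REMOTE_COMMIT_TS = 20   # remote update committed at 10:20 (remote clock)
LOCAL_UPDATE_AT = 18    # local update issued at 10:18 (local clock)


def last_update_wins(local_ts, remote_ts):
    """Timestamp-based resolution: the later commit_ts survives."""
    return "remote" if remote_ts > local_ts else "local"


def wait_at_begin():
    # The apply worker sleeps until 10:20 before touching the row, so the
    # local update is free to commit first with commit_ts = 10:18.
    local_commit_ts = LOCAL_UPDATE_AT
    # After the wait, the worker detects the conflict and resolves it.
    return last_update_wins(local_commit_ts, REMOTE_COMMIT_TS)


def wait_at_commit(delta=1):
    # The apply worker updates the row (holding the row lock) and sleeps
    # until 10:20 before committing. The local update blocks on that lock
    # and can only commit afterwards, at 10:20 + delta (delta > 0 assumed).
    remote_commit_ts = REMOTE_COMMIT_TS
    local_commit_ts = remote_commit_ts + delta
    # No resolution step is modeled here: the later commit overwrites.
    return "local" if local_commit_ts > remote_commit_ts else "remote"
```

With these numbers wait_at_begin() gives "remote" and wait_at_commit()
gives "local"; in both cases the surviving tuple is the one with the
higher commit_ts, which is why I think both outcomes are consistent.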

Now, which is better? In case 1 we have to make the remote apply
wait at the begin stage without knowing what the local clock will be
when it actually comes to commit. If we choose case 2, it may well
happen that by the time the remote transaction finishes applying, the
local clock is already past 10:20 and we do not need to wait at all.
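
A rough sketch of that arithmetic, assuming the wait condition discussed
upthread (sleep until the local clock reaches remote commit_ts minus
max_clock_skew). The helper names and the `apply_duration` parameter are
made up for illustration; times are again minutes past 10:00.

```python
# Hypothetical helpers comparing how long the apply worker would sleep
# at each wait point. Not PostgreSQL code.

def required_wait(remote_commit_ts, local_now, max_clock_skew=0):
    """Time to sleep until local clock >= remote_commit_ts - max_clock_skew."""
    return max(0, remote_commit_ts - max_clock_skew - local_now)


def wait_if_at_begin(remote_commit_ts, arrival, max_clock_skew=0):
    # The decision is made when the transaction arrives, before any
    # changes have been applied.
    return required_wait(remote_commit_ts, arrival, max_clock_skew)


def wait_if_at_commit(remote_commit_ts, arrival, apply_duration,
                      max_clock_skew=0):
    # The decision is made after the changes are applied, so the time
    # spent applying counts against the skew.
    local_now = arrival + apply_duration
    return required_wait(remote_commit_ts, local_now, max_clock_skew)
```

For a transaction arriving at 10:15 with commit_ts 10:20,
wait_if_at_begin(20, 15) is 5 minutes regardless of transaction size,
while wait_if_at_commit(20, 15, 6) is 0 if applying takes 6 minutes.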

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
