Re: Conflict Detection and Resolution

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Conflict Detection and Resolution
Date: 2024-07-03 11:38:21
Message-ID: CAJpy0uASY-8rWQKUXN_rbrm51fRYAzaPGci2e0SHmNjvvD606w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 3, 2024 at 4:12 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Wed, Jul 3, 2024 at 4:02 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Wed, Jul 3, 2024 at 11:29 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Jul 3, 2024 at 11:00 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > > >
> > > > > Yes, I also think it should be independent of CDR. IMHO, it should be
> > > > > based on the user-configured maximum clock skew tolerance and can be
> > > > > independent of CDR.
> > > >
> > > > +1
> > > >
> > > > > IIUC we would make the remote apply wait just
> > > > > before committing if the remote commit timestamp is ahead of the local
> > > > > clock by more than the maximum clock skew tolerance, is that correct?
> > > >
> > > > +1 on condition to wait.
> > > >
> > > > But I think we should make the apply worker wait during begin
> > > > (apply_handle_begin) instead of commit. It makes more sense to delay
> > > > the entire operation to manage clock skew rather than the commit
> > > > alone. Only then can CDR's timestamp-based resolution, which happens
> > > > well before the commit stage, benefit from this. Thoughts?
> > >
> > > But do we really need to wait at apply_handle_begin()? I mean, if we
> > > already know the commit_ts, then we can perform the conflict
> > > resolution, no?
> >
> > I would like to highlight one point here: the resulting data may
> > differ depending on the stage (begin or commit) at which we decide
> > to wait. Example:
> >
> > --max_clock_skew set to 0 i.e. no tolerance for clock skew.
> > --Remote Update with commit_timestamp = 10.20AM.
> > --Local clock (which is say 5 min behind) shows = 10.15AM.
> >
> > Case 1: Wait during Begin:
> > When the remote update arrives at the local node, the apply worker
> > waits till the local clock hits 'remote's commit_ts - max_clock_skew',
> > i.e. till 10.20 AM. In the meantime (during the apply worker's wait
> > period), if some local update on the same row happens at, say,
> > 10.18 AM (local clock), it will be applied first. When the apply
> > worker's wait is over, it will detect an 'update_differ' conflict, and
> > as per 'last_update_wins' the remote tuple will win, since 10.20 is
> > later than 10.18.
> >
> > Case 2: Wait during Commit:
> > When the remote update arrives at the local node, it finds no conflict
> > and goes for commit. But before committing, it waits till the local
> > clock hits 10.20 AM. In the meantime (during the apply worker's wait
> > period), if some local update tries to update the same row at, say,
> > 10.18, it has to wait (due to the locks taken by the remote update on
> > that row), and the remote tuple will get committed first with a commit
> > timestamp of 10.20. Then the local update will proceed and overwrite
> > the remote tuple.
> >
> > So in case1, remote tuple is the final change while in case2, local
> > tuple is the final change.
>
> Got it, but which case is correct? I think both are. In case 1, the
> local commit's commit_ts is 10:18 and the remote commit's commit_ts is
> 10:20, so the remote apply wins. In case 2, the remote commit's
> commit_ts is 10:20, whereas the local commit's commit_ts must be
> 10:20 + delta (because it waited for the remote transaction to commit).
>
> Now, which is better? In case 1 we have to make the remote apply wait
> at the begin stage without knowing what the local clock will be when
> it actually comes to commit. If we choose case 2, it may so happen
> that by the time the remote transaction finishes applying, the local
> clock is already beyond 10:20 and we do not even need to wait?

Yes, I agree that the wait time could be somewhat shorter in case 2.
But waiting during commit will make user operations on the same row
wait, without the user having any clue about the concurrent blocking
operation. I am not sure if that will be acceptable.
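
To make the two cases concrete, here is a rough sketch in Python. It is
purely illustrative and is not apply-worker code: the function names, the
minutes-since-midnight timestamps, the tie-break in favor of the local
tuple, and the one-minute 'delta' added after the lock wait are all
assumptions made for the example.

```python
def hhmm(hour, minute):
    """Represent a wall-clock time as minutes since midnight."""
    return hour * 60 + minute


def wait_needed(remote_commit_ts, local_clock, max_clock_skew=0):
    """Minutes until the local clock reaches remote_commit_ts - max_clock_skew."""
    return max(0, remote_commit_ts - max_clock_skew - local_clock)


def wait_at_begin(remote_commit_ts, local_update_ts):
    """Case 1: the apply worker waits before touching the row.

    A local update made during the wait commits first with its own
    timestamp. The apply worker then sees a differing row and a
    last-update-wins comparison picks the later timestamp
    (a tie going to the local tuple is an assumption of this sketch).
    """
    return "remote" if remote_commit_ts > local_update_ts else "local"


def wait_at_commit(remote_commit_ts, local_update_ts, delta=1):
    """Case 2: the apply worker updates the row, then waits before commit.

    The local update blocks on the row lock, so it can only commit after
    the remote transaction does (modeled as remote_commit_ts + delta),
    and it then overwrites the remote tuple.
    """
    local_commit_ts = max(local_update_ts, remote_commit_ts + delta)
    return "local", local_commit_ts


remote_ts = hhmm(10, 20)
print(wait_needed(remote_ts, hhmm(10, 15)))     # wait in minutes
print(wait_at_begin(remote_ts, hhmm(10, 18)))   # final tuple in case 1
print(wait_at_commit(remote_ts, hhmm(10, 18)))  # final tuple in case 2
```

With the numbers from the example (remote commit at 10.20, local clock at
10.15, local update at 10.18), the sketch reports a 5-minute wait, the
remote tuple as the final change in case 1, and the local tuple as the
final change in case 2.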

thanks
Shveta
