Re: Conflict Detection and Resolution

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Conflict Detection and Resolution
Date: 2024-07-02 09:09:49
Message-ID: CAJpy0uDr+W8a1HbPPkSgmjZKnRD3jN-gJCSMNG-DFEcoWJyZFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 19, 2024 at 1:52 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Jun 18, 2024 at 3:29 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > On Tue, Jun 18, 2024 at 11:34 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > I tried to work out a few scenarios with this, where the apply worker
> > will wait until its local clock hits 'remote_commit_tts - max_skew
> > permitted'. Please have a look.
> >
> > Let's say, we have a GUC to configure max_clock_skew permitted.
> > Resolver is last_update_wins in both cases.
> > ----------------
> > 1) Case 1: max_clock_skew set to 0 i.e. no tolerance for clock skew.
> >
> > Remote Update with commit_timestamp = 10.20AM.
> > Local clock (which is say 5 min behind) shows = 10.15AM.
> >
> > When remote update arrives at local node, we see that skew is greater
> > than max_clock_skew and thus apply worker waits till local clock hits
> > 'remote's commit_tts - max_clock_skew' i.e. till 10.20 AM. Once the
> > local clock hits 10.20 AM, the worker applies the remote change with
> > commit_tts of 10.20AM. In the meantime (during wait period of apply
> > worker)) if some local update on same row has happened at say 10.18am,
> > that will applied first, which will be later overwritten by above
> > remote change of 10.20AM as remote-change's timestamp appear more
> > latest, even though it has happened earlier than local change.
>
> For the sake of simplicity let's call the change that happened at
> 10:20 AM change-1 and the change that happened at 10:15 as change-2
> and assume we are talking about the synchronous commit only.
>
> I think now from an application perspective the change-1 wouldn't have
> caused the change-2 because we delayed applying change-2 on the local
> node which would have delayed the confirmation of the change-1 to the
> application that means we have got the change-2 on the local node
> without the confirmation of change-1 hence change-2 has no causal
> dependency on the change-1. So it's fine that we perform change-1
> before change-2 and the timestamp will also show the same at any other
> node if they receive these 2 changes.
>
> The goal is to ensure that if we define the order where change-2
> happens before change-1, this same order should be visible on all
> other nodes. This will hold true because the commit timestamp of
> change-2 is earlier than that of change-1.
>
> > 2) Case 2: max_clock_skew is set to 2min.
> >
> > Remote Update with commit_timestamp=10.20AM
> > Local clock (which is say 5 min behind) = 10.15AM.
> >
> > Now apply worker will notice skew greater than 2min and thus will wait
> > till local clock hits 'remote's commit_tts - max_clock_skew' i.e.
> > 10.18 and will apply the change with commit_tts of 10.20 ( as we
> > always save the origin's commit timestamp into local commit_tts, see
> > RecordTransactionCommit->TransactionTreeSetCommitTsData). Now lets say
> > another local update is triggered at 10.19am, it will be applied
> > locally but it will be ignored on remote node. On the remote node ,
> > the existing change with a timestamp of 10.20 am will win resulting in
> > data divergence.
>
> Let's call the 10:20 AM change as a change-1 and the change that
> happened at 10:19 as change-2
>
> IIUC, although we apply the change-1 at 10:18 AM the commit_ts of that
> commit_ts of that change is 10:20, and the same will be visible to all
> other nodes. So in conflict resolution still the change-1 happened
> after the change-2 because change-2's commit_ts is 10:19 AM. Now
> there could be a problem with the causal order because we applied the
> change-1 at 10:18 AM so the application might have gotten confirmation
> at 10:18 AM and the change-2 of the local node may be triggered as a
> result of confirmation of the change-1 that means now change-2 has a
> causal dependency on the change-1 but commit_ts shows change-2
> happened before the change-1 on all the nodes.
>
> So, is this acceptable? I think yes because the user has configured a
> maximum clock skew of 2 minutes, which means the detected order might
> not always align with the causal order for transactions occurring
> within that time frame. Generally, the ideal configuration for
> max_clock_skew should be in multiple of the network round trip time.
> Assuming this configuration, we wouldn’t encounter this problem
> because for change-2 to be caused by change-1, the client would need
> to get confirmation of change-1 and then trigger change-2, which would
> take at least 2-3 network round trips.

As we agreed, the subscriber should wait before applying an operation
if the commit timestamp of the currently replayed transaction is in
the future and the difference exceeds the maximum clock skew. This
raises the question: should the subscriber wait only for insert,
update, and delete operations when timestamp-based resolution methods
are set, or should it wait regardless of the type of remote operation,
the presence or absence of conflicts, and the resolvers configured?
I believe the latter approach is the way to go i.e. this should be
independent of CDR, though needed by CDR for better timestamp based
resolutions. Thoughts?

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2024-07-02 09:22:44 Re: [PATCH] Handle SK_SEARCHNULL and SK_SEARCHNOTNULL in HeapKeyTest
Previous Message Bertrand Drouvot 2024-07-02 09:05:40 Re: Restart pg_usleep when interrupted