Re: Commit Timestamp and LSN Inversion issue

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, tomas(at)vondra(dot)me
Subject: Re: Commit Timestamp and LSN Inversion issue
Date: 2024-09-04 11:34:05
Message-ID: CAA4eK1LOmtMpmJAxsmN0yXJb-Q9vi6KThDYXdpVGL_WZW5ZgvQ@mail.gmail.com

On Wed, Sep 4, 2024 at 2:05 PM Aleksander Alekseev
<aleksander(at)timescale(dot)com> wrote:
>
> > While discussing Logical Replication's Conflict Detection and
> > Resolution (CDR) design in [1] , it came to our notice that the
> > commit LSN and timestamp may not correlate perfectly i.e. commits may
> > happen with LSN1 < LSN2 but with Ts1 > Ts2. This issue may arise
> > because, during the commit process, the timestamp (xactStopTimestamp)
> > is captured slightly earlier than when space is reserved in the WAL.
> > [...]
> > There was a suggestion in [3] to acquire the timestamp while reserving
> > the space (because that happens in LSN order). The clock would need to
> > be monotonic (easy enough with CLOCK_MONOTONIC), but also cheap. The
> > main reason it's being done outside the critical section is that
> > gettimeofday() may be quite expensive. There's a concept of hybrid
> > clock, combining "time" and logical counter, which might be useful
> > independently of CDR.
>
> I don't think you can rely on a system clock for conflict resolution.
> In a corner case a DBA can move the clock forward or backward between
> recordings of Ts1 and Ts2. On top of that there is no guarantee that
> 2+ servers have synchronised clocks. It seems to me that what you are
> proposing will just hide the problem instead of solving it in the
> general case.
>

It is possible that we can't rely on the system clock for conflict
resolution, but that is not the specific point of this thread. As the
subject says, the problem is "Commit Timestamp and LSN Inversion
issue". The LSN and the timestamp for a commit are not generated
atomically, so two different transactions can end up with their LSNs
in one order and their timestamps in the other.
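
To make the interleaving concrete, here is a minimal sketch of a toy
model. It is not PostgreSQL code: ToyServer, Commit, and every method
name are hypothetical, and a logical counter stands in for the system
clock. It only shows that reading the timestamp and reserving WAL space
as two separate steps allows the inversion, while doing both in one
step (the suggestion quoted above) does not.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Commit:
    name: str
    timestamp: int = -1  # stand-in for xactStopTimestamp
    lsn: int = -1        # stand-in for the commit record's WAL position


class ToyServer:
    """Hypothetical model: a logical clock plus a WAL insert position."""

    def __init__(self):
        self._clock = count(1)  # strictly increasing "time"
        self._next_lsn = 0

    def capture_timestamp(self, c):
        c.timestamp = next(self._clock)

    def reserve_wal(self, c, size=100):
        c.lsn = self._next_lsn
        self._next_lsn += size

    def reserve_wal_with_timestamp(self, c, size=100):
        # Assumed alternative: read the clock in the same step as the
        # reservation, so both values follow reservation order.
        self.reserve_wal(c, size)
        self.capture_timestamp(c)


def inverted(a, b):
    """True if LSN order and timestamp order disagree."""
    return (a.lsn < b.lsn) != (a.timestamp < b.timestamp)


def run_two_step():
    s = ToyServer()
    t1, t2 = Commit("T1"), Commit("T2")
    s.capture_timestamp(t1)  # T1 reads the clock first ...
    s.capture_timestamp(t2)
    s.reserve_wal(t2)        # ... but T2 reserves WAL space first
    s.reserve_wal(t1)
    return t1, t2


def run_single_step():
    s = ToyServer()
    t1, t2 = Commit("T1"), Commit("T2")
    s.reserve_wal_with_timestamp(t2)
    s.reserve_wal_with_timestamp(t1)
    return t1, t2


if __name__ == "__main__":
    print("two-step inverted:   ", inverted(*run_two_step()))
    print("single-step inverted:", inverted(*run_single_step()))
```

The sketch says nothing about cost; whether reading the clock inside
the reservation step is cheap enough is the open question.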

Your point, as far as I understand it, is that a strategy like
"last_write_wins", which relies on the system clock, is not a good
idea in the first place, so any problem with LSN->Timestamp ordering
won't matter to us. We can discuss whether "last_write_wins" is a poor
strategy, but for the sake of this thread, let's assume that users who
enable the resolution feature ("last_write_wins") ensure that their
clocks are synced, and that users who can't ensure that won't enable
it. Under that assumption, let's see if we can think of any problem
w.r.t. the current code.

--
With Regards,
Amit Kapila.
