RE: Commit Timestamp and LSN Inversion issue

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "tomas(at)vondra(dot)me" <tomas(at)vondra(dot)me>, Jan Wieck <jan(at)wi3ck(dot)info>
Subject: RE: Commit Timestamp and LSN Inversion issue
Date: 2024-11-08 09:08:55
Message-ID: OS0PR01MB57162A227EC357482FEB4470945D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, November 8, 2024 2:20 AM Andres Freund <andres(at)anarazel(dot)de> wrote:

Hi,

> On 2024-11-05 08:58:36 -0500, Jan Wieck wrote:
> > The attached solution is minimally invasive because it doesn't move
> > the timestamp generation (clock_gettime() call) into the critical
> > section of
> > ReserveXLogInsertLocation() that is protected by a spinlock. Instead
> > it keeps track of the last commit-ts written to WAL in shared memory
> > and simply bumps that by one microsecond if the next one is below or
> > equal. There is one extra condition in that code section plus a
> > function call by pointer for every WAL record. In the unlikely case of
> > encountering a stalled or backwards running commit-ts, the expensive
> > part of recalculating the CRC of the altered commit WAL-record is done
> > later, after releasing the spinlock. I have not been able to measure
> > any performance impact on a machine with 2x Xeon-Silver (32 HT cores).
>
> I think it's *completely* unacceptable to call a hook inside
> ReserveXLogInsertLocation, with the spinlock held, no less. That's the most
> contended section of code in postgres.

I understand your concern and appreciate the feedback. I've made some
adjustments to the patch by directly placing the code to adjust the commit
timestamp within the spinlock, aiming to keep it as efficient as possible. The
changes have resulted in just a few extra lines. Would this approach be
acceptable to you ?

Additionally, we're also doing performance tests for it and will share the
results once they're available.

Best Regards,
Hou zj

Attachment Content-Type Size
v2-0001-POC-Make-commit-timestamp-monotonic-increase.patch application/octet-stream 9.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-11-08 09:21:07 Re: [PoC] Federated Authn/z with OAUTHBEARER
Previous Message Dmitry Dolgov 2024-11-08 08:54:09 Re: Identify huge pages accessibility using madvise